J.Pietschmann wrote:

Maybe I'm wrong in trying to do so, but I'd like to handle both formatting objects in the same way.

If page numbers can be resolved to strings early, it should be
done. All the hassle for space readjusting, and perhaps reflowing
content, should be reserved for forward references, if only for
performance reasons.

Sorry, my last message was not very clear (and / or I misunderstood your comments).

The point is that the "real" page numbers are not known until the addAreas() phase, when pages are actually created.

The Knuth-style page breaking algorithm gets a representation of a whole page-sequence (or part of it, if there are break conditions) and then computes all the page breaks at once. So the fo:page-numbers contained in that page-sequence cannot know on which page they will be placed, and the line breaking is necessarily performed using elements whose width may be just a guess.

What I meant when I said that both page-number and page-number-citation should be handled in the same way was this: during the line breaking their real value is equally unknown.

Well, to be more precise, the value of a page-number is *always* unknown during line breaking, while a page-number-citation could refer to an object in a previous page-sequence, so its value could be known. In this case the method PNCLM.get() already returns a TextArea with the "real" value and its ipd (maybe you were referring to this? This won't be changed at all).

[from the other message]
- sometimes, when a particularly elegant output is needed, it would really be desirable to have a two-step algorithm, with line-breaking performed again once the actual width of each object is known.

Well, it's not for particularly elegant output, it's for the
case of having multiple page number citations which point
to five digit page numbers in the same line. Real life examples
include references to page numbers in roman number format, which
easily get into the six character range, and enumerating
references in book indices, where the problem may be amplified
as an index is usually set in several narrow columns.

Great examples, I did not think of them!

I imagine that, should the index be in a page-sequence preceding the ones with the content, the line breaking of it could be really ugly, due to the provisional width of the references.

This example is really interesting: here, re-flowing the index pages would not produce a better output if it were performed before the page-sequences with the content have been broken. And the re-flowing could be avoided altogether simply by deferring the breaking of the index page-sequence, so that its first breaking can already work with the real values of all page-number-citations.

If we see each page-sequence as a node, and a page-number-citation as a directed edge from one node (the target page-sequence) to another one (the page-sequence containing the page-number-citation), this is a well-known problem: the topological sorting of a graph. If the graph is acyclic, then there is an ordering of its nodes such that for each edge going from a node A to a node B, A precedes B in the ordering; i.e., the page-sequences could be ordered so that each one is flowed only when the values of all its page-number-citations are already known.
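
Just to make the idea concrete, here is a minimal sketch of that ordering in Python (not FOP code, and not a proposal for the actual implementation). The function name flow_order and its two parameters are hypothetical, and the sketch assumes that citations pointing inside their own page-sequence are simply ignored, since they cannot be resolved before breaking in any case.

```python
from collections import defaultdict, deque


def flow_order(page_sequences, citations):
    """Return an order in which page-sequences could be broken so that every
    page-number-citation to another page-sequence is already resolved.

    page_sequences: list of page-sequence ids, in document order (hypothetical).
    citations: iterable of (citing_ps, target_ps) pairs (hypothetical).
    Returns None if the citations form a cycle, i.e. no such order exists.
    """
    successors = defaultdict(set)            # target -> page-sequences citing it
    indegree = {ps: 0 for ps in page_sequences}

    for citing, target in citations:
        if citing == target:
            continue                         # assumption: same-sequence citations are ignored
        if citing not in successors[target]:
            successors[target].add(citing)   # edge: target -> citing
            indegree[citing] += 1

    # Kahn's algorithm: start from page-sequences with no unresolved citations.
    ready = deque(ps for ps in page_sequences if indegree[ps] == 0)
    order = []
    while ready:
        ps = ready.popleft()
        order.append(ps)
        # Visit dependants in document order to keep the result deterministic.
        for nxt in sorted(successors[ps], key=page_sequences.index):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Some node never reached indegree 0: there is a cycle of citations.
    return order if len(order) == len(page_sequences) else None
```

For an index placed before two chapters it cites, flow_order(["index", "chapter1", "chapter2"], [("index", "chapter1"), ("index", "chapter2")]) puts the index last. When two page-sequences cite each other the function returns None, and a provisional width plus a possible second line-breaking pass would still be needed.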

Very interesting indeed ... as soon as I finish working on the line-adjusting I'll spend some more thought on this ...

(sorry for the long message!)

Regards
    Luca

