Victor Mote wrote:
Victor Mote wrote:

Oleg Tkachenko wrote:

I think we should separate fo tree itself from the process of its building.
fo tree structure is required and I agree with Keiron - it's not a DOM, it's
just tree representation and I cherish the idea to make it an effectively
small structure like saxon's internal tree. But any interim buffers should be
avoided as much as it's possible (well, Piter's buffer seems not to be a
burden).

This is probably a philosophical difference. It seems to me that the area
tree is built on the foundation of the fo tree, and that if we only get a
brief glimpse of the fo tree as it goes by, not only does our foundation
disappear, but we end up putting all of that weight into the
which tends to make the whole thing collapse.


After thinking about this a bit more, I think I confused this issue. I think
what you were saying is that the existing FOP FO tree /is/ the lightweight
data structure that you like. I see your point, and yes I agree, there is no
need to replace it with something heavier. My train of thought was in a
different direction -- i.e., how to get that structure written to disk when
necessary so that it doesn't all have to be in memory. I (think I) also had
a wrong conception of how long the FO tree data persisted. My apologies for
the confusion.

I will comment at greater length, later, on the issues you have raised, but I want to make some comments on the tree structures here.

Most people coming to FOP get confused by the fact that SAX is used for parsing. They think in terms of a SAX/DOM dichotomy, and assume that, because we are using SAX, we have nothing like a DOM. In fact, the FO tree is our DOM, or the first stage of our DOM. In the beginning... the FO tree was always there while the area tree was being built, but Mark Lillywhite did some hacking to restrict the tree to the currently active page sequence.

As you point out, the FO tree provides the semantics of the layout. The Area tree is an internal representation of the series of marks on the page. If re-flowing is called for, the information from the FO tree is, once again, required. In my opinion, that means that the FO tree has to be cached. To be more precise, the FO tree has to be able to be cached. I envisage the layout engine feeding instructions back to the FO tree concerning subtrees; basically, delete subtree or cache subtree. The layout engine knows whether the layout of a particular page or page sequence is firm or rubbery, and can instruct the FO tree accordingly.
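
To make the delete-or-cache idea concrete, here is a minimal sketch in Java. Everything in it is hypothetical: the names Retention, FoNodeSketch and FoTreeSketch are mine for illustration and are not FOP's actual classes, and "caching" is reduced to setting a flag where a real implementation might serialise the subtree to disk.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical instructions the layout engine could feed back to the FO tree. */
enum Retention { DELETE, CACHE }

/** Hypothetical FO tree node; not FOP's real node class. */
class FoNodeSketch {
    final String name;
    final List<FoNodeSketch> children = new ArrayList<>();
    boolean cached = false;

    FoNodeSketch(String name) { this.name = name; }

    /** Appends a child and returns this node so calls can be chained. */
    FoNodeSketch add(FoNodeSketch child) {
        children.add(child);
        return this;
    }

    /** Number of nodes still held under this node, itself included. */
    int liveCount() {
        int n = 1;
        for (FoNodeSketch c : children) {
            n += c.liveCount();
        }
        return n;
    }
}

class FoTreeSketch {
    /** Firm layout: the subtree is no longer needed. Rubbery layout: keep it. */
    static Retention decide(boolean layoutIsFirm) {
        return layoutIsFirm ? Retention.DELETE : Retention.CACHE;
    }

    /** Applies the layout engine's decision to one child subtree of parent. */
    static void apply(FoNodeSketch parent, FoNodeSketch subtree, Retention r) {
        if (r == Retention.DELETE) {
            parent.children.remove(subtree); // drop the whole subtree
        } else {
            subtree.cached = true; // assumption: a real cache might spill to disk here
        }
    }
}
```

The point of the sketch is only that the decision lives in the layout engine (decide) while the consequence lives in the FO tree (apply).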

Such decisions would be made very carefully in the layout engine. Back in the mists of time, Arved noted that the page numbering problem could be minimised by allowing enough room for the page number worst case. That was a sensible restriction, but it implies a good guess about just what that worst case is going to be. To get that completely right, you need to lay it all out. In any case, if you have the ever-popular "Page x of y" in your static-content, you need to redo every page anyway. What a correct initial guess circumvents is the need to reflow every page, with all of its nightmarish implications.

This is a case for which the min/opt/max expressions of FOP were made.

1. Take a punt about last page number width.
2. Lay out the pages, using "optimum".
3. Get to the end, with all page numbers resolved.
4. Go back and reflow lines/paragraphs as necessary, using the full min/max range to avoid page under/overflow.

(N.B. This won't entirely remove the need for backup and reflow in other circumstances.)
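
A rough Java sketch of the checks behind those steps, under simplifying assumptions of my own: page number width is counted in digits, and a line is modelled as nothing more than a list of adjustable spaces. MinOptMaxSketch and PageNumberReflowSketch are hypothetical names, not FOP classes.

```java
import java.util.List;

/** Hypothetical min/opt/max triple, in arbitrary length units. */
class MinOptMaxSketch {
    final int min;
    final int opt;
    final int max;

    MinOptMaxSketch(int min, int opt, int max) {
        this.min = min;
        this.opt = opt;
        this.max = max;
    }
}

class PageNumberReflowSketch {
    /** Width of a page number, counted in decimal digits. */
    static int digits(int pageNumber) {
        return Integer.toString(pageNumber).length();
    }

    /** After the last page is known: did the punt reserve enough room? */
    static boolean guessHolds(int guessedDigits, int lastPage) {
        return digits(lastPage) <= guessedDigits;
    }

    /**
     * Can a line that was laid out at "optimum" absorb a width change of delta
     * without re-breaking? A positive delta means the resolved number is wider
     * than the guess, so the spaces must shrink towards min; a negative delta
     * means they must stretch towards max.
     */
    static boolean canAbsorb(List<MinOptMaxSketch> spaces, int delta) {
        int shrink = 0;
        int stretch = 0;
        for (MinOptMaxSketch s : spaces) {
            shrink += s.opt - s.min;
            stretch += s.max - s.opt;
        }
        return delta >= 0 ? delta <= shrink : -delta <= stretch;
    }
}
```

Only lines for which canAbsorb is false would need to be re-broken, and only those could push content across a page boundary.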

I should point out here that I perceive the need for a third tree - a layout tree. It parallels the layout managers, which themselves form a tree. This is still a vague idea for me, but the layout tree would be the work-in-progress on the area tree. It's necessary because much of the layout happens bottom-up, and at the bottom, layout is occurring which cannot go into the current page. Firstly, you don't want to throw away the layout work that you have already done. Secondly, after the page boundary slashes across the layout you have been engaged in, you want to be able to pick up all of the threads again at the beginning of the new page. The layout tree formalises this procedure. Read Jeffrey Kingston's Lout design document for some insight on this.

When I talk about the layout engine, I have in mind the process that builds the layout tree, and moves chunks as they are completed into the area tree.
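
As a very rough illustration of that hand-over, here is a Java sketch. It is deliberately crude: the layout tree is flattened to a queue of finished line heights, the area tree to a list of pages, and LayoutTreeSketch is a hypothetical name of mine. What it does show is that work which does not fit the current page is kept and picked up again at the start of the next one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class LayoutTreeSketch {
    /** Work in progress: heights of laid-out lines not yet placed on a page. */
    private final Deque<Integer> pending = new ArrayDeque<>();
    /** Stand-in for the area tree: one list of line heights per finished page. */
    final List<List<Integer>> areaTree = new ArrayList<>();
    private final int pageHeight;

    LayoutTreeSketch(int pageHeight) {
        this.pageHeight = pageHeight;
    }

    /** Bottom-up layout hands over one finished line. */
    void addLine(int height) {
        pending.addLast(height);
        if (pendingHeight() > pageHeight) {
            breakPage();
        }
    }

    int pendingCount() {
        return pending.size();
    }

    int pendingHeight() {
        int sum = 0;
        for (int h : pending) {
            sum += h;
        }
        return sum;
    }

    /** Moves what fits into the area tree; the overflow stays pending. */
    private void breakPage() {
        List<Integer> page = new ArrayList<>();
        int used = 0;
        while (!pending.isEmpty() && used + pending.peekFirst() <= pageHeight) {
            used += pending.peekFirst();
            page.add(pending.removeFirst());
        }
        if (page.isEmpty() && !pending.isEmpty()) {
            // A line taller than the page gets a page to itself (it overflows).
            page.add(pending.removeFirst());
        }
        areaTree.add(page);
    }

    /** End of input: flush whatever is still pending onto final pages. */
    void finish() {
        while (!pending.isEmpty()) {
            breakPage();
        }
    }
}
```

A real layout tree would of course keep the nesting of the layout managers rather than a flat queue; the sketch only covers the "don't throw the work away" part.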

"Lord, to whom shall we go?"
