Re: Thoughts on interaction between FOTree and layoutengine

Andreas L Delmelle Thu, 17 Jan 2008 15:09:48 -0800

On Jan 17, 2008, at 20:57, Simon Pepping wrote:

On Thu, Jan 17, 2008 at 12:27:11AM +0100, Andreas L Delmelle wrote:
Right now, the element list is constructed as the result of recursive calls
to getNextChildLM.getNextKnuthElements().
/The/ return list upon which the page breaker operates is the one that is
ultimately returned by the FlowLM.
Instead of that, I've been thinking in the direction of making it a data
structure that exists 'physically' separate from the LMs.
This structure, created and maintained by the PageSequenceLM, would be
passed down into an appendNextKnuthElementsTo() method.
The lower-level LMs can signal an interrupt to the ancestor LMs, based on
information they get through the LayoutContext -- forced breaks being the
most prominent.
The FlowLM, instead of simply continuing the loop, could give control back
to the PageSequenceLM, which can run the page breaker over the list up to
that point.
I would rather pass a reference to the page breaker in the
getNextKnuthElements call. Each LM can then append Knuth elements in a
callback to the pagebreaker. At each such append callback, the page
breaker can decide to run the Knuth algorithm and ship pages. When
this callback finishes, the LM can continue. Running the Knuth
algorithm intermittently makes no sense in a total-fit algorithm.
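
To make the callback idea above concrete, here is a minimal sketch. None of these types exist in FOP: KnuthItem, ElementSink, ToyPageBreaker and ToyLayoutManager are hypothetical names, and the "run the algorithm" step is reduced to counting how often a forced break would trigger it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a Knuth element; not FOP's own element class. */
final class KnuthItem {
    final int width;
    final boolean forcedBreak;

    KnuthItem(int width, boolean forcedBreak) {
        this.width = width;
        this.forcedBreak = forcedBreak;
    }
}

/** Hypothetical callback that the page breaker would expose to the LMs. */
interface ElementSink {
    void append(KnuthItem item);
}

/** Toy breaker: collects elements and "ships" whenever a forced break arrives. */
final class ToyPageBreaker implements ElementSink {
    private final List<KnuthItem> pending = new ArrayList<>();
    private int shippedRuns = 0;

    @Override
    public void append(KnuthItem item) {
        pending.add(item);
        if (item.forcedBreak) {
            // A real breaker would run the Knuth algorithm over 'pending' here
            // and ship the resulting pages before returning to the LM.
            shippedRuns++;
            pending.clear();
        }
    }

    int shippedRuns() {
        return shippedRuns;
    }

    int pendingSize() {
        return pending.size();
    }
}

/** Toy LM: pushes its elements into the sink instead of returning a list. */
final class ToyLayoutManager {
    private final List<KnuthItem> own;

    ToyLayoutManager(List<KnuthItem> own) {
        this.own = own;
    }

    void appendNextKnuthElementsTo(ElementSink sink) {
        for (KnuthItem item : own) {
            sink.append(item); // control returns here after each callback
        }
    }
}
```

In this sketch the LM never sees the list itself; the breaker alone decides when enough content has arrived to do any work.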

Right. Running the algorithm intermittently may make no sense /in/ a total-fit algorithm, the implementation of which currently takes for granted the fact that a given sequence S will always be complete, so we can immediately move on from end-of-layout to building the area tree for the page-sequence. Suppose, however, that this will no longer be guaranteed.

Also, I would see the basic strategy evolve in such a way that we have some check on the list's eventual size to determine whether to use best-fit or total-fit. Implementing this logic inside the LMs or the breaking-algorithm seems out-of-place. As Jeremias mentioned, we would have to have some mechanism for limiting memory consumption. Keeping total-fit as the default strategy for the layout-engine is fine by me, as long as we also can switch to best-fit at the appropriate point. This point is unrelated to anything layout-specific, so I was thinking that a separate data structure would help make the implementation of such a mechanism much cleaner. If this check becomes part of each childLM's getNextElements(), this might turn out to become a pain to maintain... If we implement it as part of the add() and remove() of that hypothetical structure, it remains nicely separated from both the LM-logic and the breaking-algorithm. Two places where it does not belong.
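
A rough sketch of what I mean by putting the check in add() and remove(). Everything here is hypothetical (BoundedElementList, BreakingStrategy, the maxTotalFitSize threshold), and I am assuming the switch to best-fit is one-way, which is a design choice rather than a given.

```java
import java.util.ArrayList;
import java.util.List;

/** The two strategies under discussion. */
enum BreakingStrategy { TOTAL_FIT, BEST_FIT }

/**
 * Hypothetical element list owned by the PageSequenceLM. The size check
 * lives here, so neither the LMs nor the breaking algorithm need to know it.
 */
final class BoundedElementList<E> {
    private final List<E> elements = new ArrayList<>();
    private final int maxTotalFitSize; // assumed threshold, e.g. from configuration
    private BreakingStrategy strategy = BreakingStrategy.TOTAL_FIT;

    BoundedElementList(int maxTotalFitSize) {
        this.maxTotalFitSize = maxTotalFitSize;
    }

    void add(E element) {
        elements.add(element);
        if (elements.size() > maxTotalFitSize) {
            // Too much content to keep total-fit affordable: fall back.
            strategy = BreakingStrategy.BEST_FIT;
        }
    }

    boolean remove(E element) {
        // Assumption: once we have fallen back to best-fit we stay there,
        // even if the list shrinks below the threshold again.
        return elements.remove(element);
    }

    BreakingStrategy strategy() {
        return strategy;
    }

    int size() {
        return elements.size();
    }
}
```

The PageSequenceLM would then only have to ask the list for strategy() before handing it to the page breaker.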

We would also need to be able to determine the likelihood of the computed
page-breaks changing due to additional content, if the FlowLM still has
childLMs coming up.
The Knuth algorithm does not determine actual pagebreaks but feasible
pagebreaks. Feasible pagebreaks are only determined by the preceding
Knuth sequence. Following Knuth elements can contribute more feasible
pagebreaks, but they can not change already determined feasible
pagebreaks.

Only at the appropriate time is the best feasible
pagebreak selected, and are the actual pagebreaks determined. The
appropriate time is the end of the sequence in a total-fit
strategy. In the best-fit strategy it is the point where no further
feasible pagebreaks for the current page can be contributed.
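
The claim that feasible pagebreaks depend only on the preceding sequence can be illustrated with a much simplified model. This is not the Knuth algorithm as implemented in FOP: the sketch assumes fixed line heights with no stretch or shrink, and treats a page as acceptable when its content height lies between a hypothetical minPage and maxPage.

```java
/** Simplified feasibility check; FeasibleBreaks is a hypothetical helper. */
final class FeasibleBreaks {

    /**
     * Returns an array where feasible[i] is true if a page may end after the
     * first i lines, given that every earlier page also ended at a feasible
     * break. Heights are assumed to be non-negative.
     */
    static boolean[] compute(int[] heights, int minPage, int maxPage) {
        int n = heights.length;
        boolean[] feasible = new boolean[n + 1];
        feasible[0] = true; // the start of the sequence
        for (int end = 1; end <= n; end++) {
            int sum = 0;
            // Walk backwards over candidate page starts; only earlier
            // elements are ever inspected.
            for (int start = end - 1; start >= 0; start--) {
                sum += heights[start];
                if (sum > maxPage) {
                    break; // the page would overflow, no earlier start can fit
                }
                if (sum >= minPage && feasible[start]) {
                    feasible[end] = true;
                    break;
                }
            }
        }
        return feasible;
    }
}
```

Because the inner loop only looks at positions before `end`, running this over S and then over S plus extra elements gives the same answers for every position inside S; the extra elements can only add entries at the tail.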


Hmm... my bad, I think I expressed it incorrectly.

My interpretation is that, given incomplete sequences and intermittent runs, the score/demerits for each feasible break (its 'feasibility') determined by a prior run could still change due to additional childLMs having contributed more elements in the next.

There is no strict "point" where we switch from determining feasible breaks to computing actual ones: the feasible breaks become 'more' actual with each progression.


Basically, it's about questions like:

What happens if we give the algorithm a sequence S, and subsequently a different sequence S' which actually is S plus some more elements added to the tail?
How are the resulting feasible breaks influenced if the total-fit algorithm is run first over S, then over S'?
Is there a basis for saying: "The feasible break at position P in S is more/less likely to be considered as an actual break, when you compare the results for the same sequence, with some more content added, given the same context."

I have always somehow assumed there to be a threshold in the most common cases. In the sense that certain (maybe in fact even the bulk of) feasible breaks can already be excluded quite early, even if the sequence consists of millions of elements. Simply because they would, in intermittent runs, repeatedly be considered to be 'bad' breaks, or because they become 'worse' than the previous run. Something like this happens already, in a way, I think, albeit very deeply embedded inside the algorithm, and a bit late too... Once we start determining actual breaks, the further we progress in the page-sequence, the more layout-possibilities that need to be excluded. We simply cannot consider all of them, including the ones that would lead to unreasonable results.

Yet another way to look at it: the result of a total-fit strategy is the best-fit for the entire page-sequence, which can be viewed as the result of multiple total-fits for subsets of the page-sequence, especially where forced page-breaks are involved. If we place a forced break-after="page" on the first block in a document, the total-fit result for the first page can in general be determined long before knowing anything else about the document.
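
As a small illustration of the forced-break case, here is a sketch that cuts a sequence into parts that could each be handed to total-fit on their own. FlowItem and ForcedBreakSplitter are hypothetical names, and the sketch assumes a forced break is simply a flag on the element after which the page must end.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical element: a height plus a flag for a forced break after it. */
final class FlowItem {
    final int height;
    final boolean forcedBreakAfter;

    FlowItem(int height, boolean forcedBreakAfter) {
        this.height = height;
        this.forcedBreakAfter = forcedBreakAfter;
    }
}

final class ForcedBreakSplitter {

    /**
     * Splits the sequence after every forced break. Each part could be laid
     * out as soon as it is complete, without waiting for the parts after it.
     */
    static List<List<FlowItem>> split(List<FlowItem> sequence) {
        List<List<FlowItem>> parts = new ArrayList<>();
        List<FlowItem> current = new ArrayList<>();
        for (FlowItem item : sequence) {
            current.add(item);
            if (item.forcedBreakAfter) {
                parts.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            parts.add(current); // trailing content without a forced break
        }
        return parts;
    }
}
```

With a forced break after the first block, the first part contains just that block, so its pages are known before any of the remaining content has been seen.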



Cheers

Andreas
