Andreas L Delmelle wrote:
> On Jan 17, 2008, at 20:57, Simon Pepping wrote:
> 
>> On Thu, Jan 17, 2008 at 12:27:11AM +0100, Andreas L Delmelle wrote:
>>>
>>> The lower-level LMs can signal an interrupt to the ancestor LMs,
>>> based on
>>> information they get through the LayoutContext --forced breaks being the
>>> most prominent.
>>> The FlowLM, instead of simply continuing the loop, could give control
>>> back
>>> to the PageSequenceLM, which can run the page breaker over the list
>>> up to
>>> that point.
>>
>> I would rather pass a reference to the page breaker in the
>> getNextKnuthElements call. Each LM can then append Knuth elements in a
>> callback to the pagebreaker. At each such append callback, the page
>> breaker can decide to run the Knuth algorithm and ship pages. When
>> this callback finishes, the LM can continue.

Sounds good.


<snip/>
> Also, I would see the basic strategy evolve in such a way that we have
> some check on the list's eventual size to determine whether to use
> best-fit or total-fit. Implementing this logic inside the LMs or the

I don’t agree. If I’m layouting a scientific book, whatever its size 
I want total-fit, and I’m ready to buy as many memory sticks as needed 
to make that possible.
If I’m printing out database records I don’t care about typesetting 
quality, I want it as fast as possible so I’ll switch to best-fit. And 
that while both documents may eventually have the same number of pages.


<snip/>
>>> We would also need to be able to determine the likelihood of the
>>> computed
>>> page-breaks changing due to additional content, if the FlowLM stil has
>>> childLMs coming up.
>>
>> The Knuth algorithm does not determine actual pagebreaks but feasible
>> pagebreaks. Feasible pagebreaks are only determined by the preceding
>> Knuth sequence. Following Knuth elements can contribute more feasible
>> pagebreaks, but they can not change already determined feasible
>> pagebreaks.
> 
>> Only at the appropriate time is the best feasible
>> pagebreak selected, and are the actual pagebreaks determined. The
>> appropriate time is the end of the sequence in a total-fit
>> strategy. In the best-fit strategy it is the point where no further
>> feasible pagebreaks for the current page can be contributed.
>>
> 
> Hmm... my bad, I think I expressed it incorrectly.
> My interpretation is that, given incomplete sequences and intermittent
> runs, the score/demerits for each feasible break (its 'feasibility')
> determined by a prior run could still change due to additional childLMs
> having contributed more elements in the next.

No. Once the demerits of a feasible break are computed they won’t 
change, whatever comes afterwards. That’s all the idea of dynamic 
programming.


> There is no strict "point" where we switch from determining feasible
> breaks to computing actual ones: the feasible breaks become 'more'
> actual with each progression.
> 
> Basically, it's about questions like:
> What happens if we give the algorithm a sequence S, and subsequently a
> different sequence S' which actually is S plus some more elements added
> to the tail?

The same sequence of feasible breaks for the common part, and 
potentially a totally different sequence of final chosen breaks. 


> How are the resulting feasible breaks influenced if the total-fit
> algorithm is run first over S, then over S'?
> Is there a basis for saying: "The feasible break at position P in S is
> more/less likely to be considered as an actual break, when you compare
> the results for the same sequence, with some more content added, given
> the same context."

> I have always somehow assumed there to be a threshold in the most common
> cases. In the sense that certain (maybe in fact even the bulk of)
> feasible breaks can already be excluded quite early, even if the
> sequence consists of millions of elements. Simply because they would, in
> intermittent runs, repeatedly be considered to be 'bad' breaks, or
> because they become 'worse' than the previous run.

That’s not possible. The last page of a sequence of 300 pages may 
influence the choice of the very first page break.



I agree with Jeremias that it is too soon to speak about 
multi-threading. Well, to start implementing anything at least, we can 
always speak ;-)
The layout code is already very complicated and would need some 
non-trivial cleanup and refactoring first. Besides, several FO elements 
aren’t implemented yet and I think it’s more urgent.

I also remember Chris’ earlier comment about the fact that with 2 other 
guys he spent 3 months fixing the bugs of a multi-threaded application. 
Layout is complex enough.

The only way I think multi-threading could occur is by reflecting (and 
enforcing) the overall software architecture. Basically I see the whole 
process as a series of specialised tasks that could be piped together, 
pretty much like the Unix way:
    XML parsing | FO tree building | Layout | Area tree | Rendering
Each task could then be put in a separate thread.

It’s important to keep layout as a single sequential process IMO, as 
page-sequences aren’t totally independent: think of forward page 
references, but also, on a longer term, the implementation of total-fit 
over the whole document, and not only a page-sequence. One might choose 
the (n + 1) pages version of a previous page sequence instead of the 
n pages one, even if it’s slightly less desirable, because this page 
shift will help a further page sequence to be layout much better.
This is a very far future but I don’t want it to be compromised.


Vincent

Reply via email to