Andreas L Delmelle wrote:
> On Jan 17, 2008, at 20:57, Simon Pepping wrote:
>
>> On Thu, Jan 17, 2008 at 12:27:11AM +0100, Andreas L Delmelle wrote:
>>>
>>> The lower-level LMs can signal an interrupt to the ancestor LMs,
>>> based on information they get through the LayoutContext -- forced
>>> breaks being the most prominent.
>>> The FlowLM, instead of simply continuing the loop, could give
>>> control back to the PageSequenceLM, which can run the page breaker
>>> over the list up to that point.
>>
>> I would rather pass a reference to the page breaker in the
>> getNextKnuthElements call. Each LM can then append Knuth elements in
>> a callback to the pagebreaker. At each such append callback, the page
>> breaker can decide to run the Knuth algorithm and ship pages. When
>> this callback finishes, the LM can continue.
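For what it's worth, the callback design Simon describes might look
roughly like this. This is a toy sketch, not actual FOP API: the class
and method names (PageBreaker, append, flowLM) are hypothetical, and
"shipping a page" is reduced to counting accumulated widths.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of the proposed callback design: the page breaker is passed
// down, and each layout manager appends elements through a callback. The
// breaker may run the breaking algorithm at any append. All names are
// illustrative, not FOP's real classes.
public class CallbackSketch {

    /** Stand-in for a Knuth element (box/glue/penalty). */
    static class KnuthElement {
        final int width;
        KnuthElement(int width) { this.width = width; }
    }

    /** Receives elements incrementally; may ship pages at each append. */
    static class PageBreaker {
        private final List<KnuthElement> sequence = new ArrayList<>();
        private final int pageCapacity;
        private int pending = 0;
        int pagesShipped = 0;

        PageBreaker(int pageCapacity) { this.pageCapacity = pageCapacity; }

        /** Callback invoked by layout managers for each produced element.
         *  A real breaker would run the Knuth algorithm over `sequence`;
         *  here we just ship a page whenever one fills up (best-fit-like). */
        void append(KnuthElement e) {
            sequence.add(e);
            pending += e.width;
            while (pending >= pageCapacity) {
                pending -= pageCapacity;
                pagesShipped++;
            }
        }
    }

    /** A trivial "FlowLM" that feeds elements to the breaker. */
    static void flowLM(PageBreaker breaker, int elementCount) {
        for (int i = 0; i < elementCount; i++) {
            breaker.append(new KnuthElement(10));
            // control returns here after each append, as in the proposal
        }
    }

    public static void main(String[] args) {
        PageBreaker breaker = new PageBreaker(100);
        flowLM(breaker, 35);  // 35 elements of width 10 = 350 units
        System.out.println(breaker.pagesShipped);  // prints 3
    }
}
```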
Sounds good.

<snip/>

> Also, I would see the basic strategy evolve in such a way that we have
> some check on the list's eventual size to determine whether to use
> best-fit or total-fit. Implementing this logic inside the LMs or the

I don’t agree. If I’m laying out a scientific book, whatever its size I
want total-fit, and I’m ready to buy as many memory sticks as needed to
make that possible. If I’m printing out database records I don’t care
about typesetting quality; I want it as fast as possible, so I’ll
switch to best-fit. And that while both documents may end up with the
same number of pages.

<snip/>

>>> We would also need to be able to determine the likelihood of the
>>> computed page-breaks changing due to additional content, if the
>>> FlowLM still has childLMs coming up.
>>
>> The Knuth algorithm does not determine actual page breaks but
>> feasible page breaks. Feasible page breaks are only determined by the
>> preceding Knuth sequence. Following Knuth elements can contribute
>> more feasible page breaks, but they cannot change already determined
>> feasible page breaks.
>>
>> Only at the appropriate time is the best feasible page break
>> selected, and are the actual page breaks determined. The appropriate
>> time is the end of the sequence in a total-fit strategy. In the
>> best-fit strategy it is the point where no further feasible page
>> breaks for the current page can be contributed.
>
> Hmm... my bad, I think I expressed it incorrectly.
> My interpretation is that, given incomplete sequences and intermittent
> runs, the score/demerits for each feasible break (its 'feasibility')
> determined by a prior run could still change due to additional
> childLMs having contributed more elements in the next.

No. Once the demerits of a feasible break are computed they won’t
change, whatever comes afterwards. That’s the whole idea of dynamic
programming.
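The dynamic-programming property can be checked with a toy Knuth-style
breaker. This is a deliberately simplified model (squared badness, no
stretch/shrink, hypothetical names, nothing from FOP): the minimal
demerits computed for each break in a common prefix are identical
whether we feed the algorithm S or the longer S'.

```java
// Toy illustration of the dynamic-programming property: the demerits of
// a feasible break depend only on the elements before it, so extending
// the sequence never changes already-computed demerits.
public class DemeritsSketch {

    /** For each break position i (after i elements), the minimal total
     *  demerits to reach it. Badness is (pageWidth - capacity)^2; pages
     *  wider than 2 * capacity are considered infeasible. */
    static int[] minDemerits(int[] widths, int capacity) {
        int n = widths.length;
        int[] best = new int[n + 1];  // best[0] = 0: start of sequence
        for (int i = 1; i <= n; i++) {
            best[i] = Integer.MAX_VALUE;
            int w = 0;
            for (int j = i; j >= 1; j--) {  // previous break after j-1
                w += widths[j - 1];
                if (w > capacity * 2) break;  // page hopelessly overfull
                int badness = (w - capacity) * (w - capacity);
                if (best[j - 1] != Integer.MAX_VALUE
                        && best[j - 1] + badness < best[i]) {
                    best[i] = best[j - 1] + badness;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] s      = {30, 40, 50, 20, 60};
        int[] sPrime = {30, 40, 50, 20, 60, 70, 10};  // S plus a tail
        int[] a = minDemerits(s, 100);
        int[] b = minDemerits(sPrime, 100);
        boolean same = true;
        for (int i = 0; i <= s.length; i++) {
            same &= (a[i] == b[i]);
        }
        System.out.println("prefix demerits unchanged: " + same);  // true
    }
}
```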
> There is no strict "point" where we switch from determining feasible
> breaks to computing actual ones: the feasible breaks become 'more'
> actual with each progression.
>
> Basically, it's about questions like:
> What happens if we give the algorithm a sequence S, and subsequently a
> different sequence S' which actually is S plus some more elements
> added to the tail?

The same sequence of feasible breaks for the common part, and
potentially a totally different sequence of final chosen breaks.

> How are the resulting feasible breaks influenced if the total-fit
> algorithm is run first over S, then over S'?
> Is there a basis for saying: "The feasible break at position P in S is
> more/less likely to be considered as an actual break, when you compare
> the results for the same sequence, with some more content added, given
> the same context."
> I have always somehow assumed there to be a threshold in the most
> common cases. In the sense that certain (maybe in fact even the bulk
> of) feasible breaks can already be excluded quite early, even if the
> sequence consists of millions of elements. Simply because they would,
> in intermittent runs, repeatedly be considered to be 'bad' breaks, or
> because they become 'worse' than the previous run.

That’s not possible. The last page of a sequence of 300 pages may
influence the choice of the very first page break.

I agree with Jeremias that it is too soon to speak about
multi-threading. Well, too soon to start implementing anything, at
least; we can always speak ;-) The layout code is already very
complicated and would need some non-trivial cleanup and refactoring
first. Besides, several FO elements aren’t implemented yet, and I think
that is more urgent. I also remember Chris’ earlier comment that he and
two other guys spent three months fixing the bugs of a multi-threaded
application. Layout is complex enough.
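The "later content influences the first break" effect shows up even in
a tiny example. With page capacity 100 and a toy squared-badness
measure (hypothetical code, not FOP's breaker), the sequence {60, 70}
is best laid out as a single page; appending one element of width 50
makes a break after the first element optimal. The break's own demerits
never changed; only the backtracked choice of final breaks did.

```java
import java.util.ArrayList;
import java.util.List;

// Toy total-fit: the same DP as before, but with predecessor pointers so
// we can walk back from the end and recover the chosen breaks. Extending
// the sequence can change which early breaks end up on the optimal path,
// even though every break's own demerits are unchanged.
public class ChosenBreaksSketch {

    /** Returns the chosen break positions (1-based, including the final
     *  break at the end of the sequence) for a total-fit layout. */
    static List<Integer> chosenBreaks(int[] widths, int capacity) {
        int n = widths.length;
        int[] best = new int[n + 1];
        int[] prev = new int[n + 1];
        for (int i = 1; i <= n; i++) {
            best[i] = Integer.MAX_VALUE;
            int w = 0;
            for (int j = i; j >= 1; j--) {
                w += widths[j - 1];
                if (w > capacity * 2) break;  // page infeasible
                int badness = (w - capacity) * (w - capacity);
                if (best[j - 1] != Integer.MAX_VALUE
                        && best[j - 1] + badness < best[i]) {
                    best[i] = best[j - 1] + badness;
                    prev[i] = j - 1;  // remember the chosen predecessor
                }
            }
        }
        List<Integer> breaks = new ArrayList<>();
        for (int i = n; i > 0; i = prev[i]) {
            breaks.add(0, i);  // walk back from the end of the sequence
        }
        return breaks;
    }

    public static void main(String[] args) {
        // One page is best for {60, 70}: badness 900 beats 1600 + 900.
        System.out.println(chosenBreaks(new int[]{60, 70}, 100));     // [2]
        // Adding width 50 makes (60)(120) = 1600 + 400 the optimum, so a
        // break after element 1 is now chosen.
        System.out.println(chosenBreaks(new int[]{60, 70, 50}, 100)); // [1, 3]
    }
}
```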
The only way I think multi-threading could occur is by reflecting (and
enforcing) the overall software architecture. Basically I see the whole
process as a series of specialised tasks that could be piped together,
pretty much in the Unix way:

  XML parsing | FO tree building | Layout | Area tree | Rendering

Each task could then be put in a separate thread.

It’s important to keep layout as a single sequential process IMO, as
page-sequences aren’t totally independent: think of forward page
references, but also, in the longer term, the implementation of
total-fit over the whole document and not only over a page-sequence.
One might choose the (n + 1)-page version of a previous page-sequence
instead of the n-page one, even if it’s slightly less desirable,
because this page shift will help a later page-sequence to be laid out
much better. This is a very distant future, but I don’t want it to be
compromised.

Vincent
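P.S. The piped architecture could be sketched with one thread per stage
connected by bounded queues. This is a toy sketch under my own
assumptions: the stage names are illustrative, the "work items" are just
strings, and the FIFO queues keep each stage sequential so items (say,
page-sequences) are processed in document order.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of a pipelined architecture: each stage runs in its own thread
// and passes work to the next over a bounded queue, like a Unix pipe.
public class PipelineSketch {
    private static final String EOF = "<eof>";  // end-of-stream sentinel

    /** Starts a stage thread that reads from `in`, "processes" each item
     *  by tagging it with the stage name, and writes to `out`. */
    static Thread stage(String name, BlockingQueue<String> in,
                        BlockingQueue<String> out) {
        Thread t = new Thread(() -> {
            try {
                for (String item = in.take(); !item.equals(EOF);
                         item = in.take()) {
                    out.put(item + " -> " + name);
                }
                out.put(EOF);  // propagate end-of-stream downstream
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<String> q1 = new ArrayBlockingQueue<>(4);
        BlockingQueue<String> q2 = new ArrayBlockingQueue<>(4);
        BlockingQueue<String> q3 = new ArrayBlockingQueue<>(4);
        // Two of the stages from the pipe above; layout stays a single
        // sequential thread, so page-sequences keep their order.
        stage("fo-tree", q1, q2);
        stage("layout", q2, q3);

        q1.put("page-sequence-1");
        q1.put("page-sequence-2");
        q1.put(EOF);

        for (String item = q3.take(); !item.equals(EOF); item = q3.take()) {
            System.out.println(item);
        }
        // prints:
        // page-sequence-1 -> fo-tree -> layout
        // page-sequence-2 -> fo-tree -> layout
    }
}
```

The bounded queues also give natural back-pressure: a fast parser blocks
rather than flooding memory when layout is the slow stage.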