Re: Thoughts on interaction between FOTree and layoutengine
Andreas L Delmelle wrote: Yet another way to look at it: the result of a total-fit strategy is the best fit for the entire page-sequence, which can be viewed as the result of multiple total-fits for subsets of the page-sequence, especially where forced page-breaks are involved. If we place a forced break-after=page on the first block in a document, the total-fit result for the first page can in general be determined long before knowing anything else about the document. Though my knowledge of FOP internals is very limited, I think forced page-breaks are a good point at which to finish the layout of the preceding pages and release flow content. For context data and static content there is no chance to do so.
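The "safe flush point" idea above could be sketched as a scan over the penalties of an element list. This is a hypothetical sketch, not actual FOP API; the FORCED marker stands in for whatever value the engine uses for a forced-break penalty:

```java
/**
 * Sketch: a forced page-break splits the element list into independent
 * parts, so everything up to and including the first forced break can be
 * laid out and shipped without waiting for the rest of the document.
 * The FORCED constant is a hypothetical marker, not a FOP constant.
 */
public class ForcedBreakScanner {

    public static final int FORCED = Integer.MIN_VALUE;

    /**
     * Returns the index just after the first forced break, or -1 if the
     * list contains none (i.e. nothing can be flushed early yet).
     */
    public static int flushPoint(int[] penalties) {
        for (int i = 0; i < penalties.length; i++) {
            if (penalties[i] == FORCED) {
                return i + 1;
            }
        }
        return -1;
    }
}
```

Everything before the returned index could then be handed to the page breaker while parsing continues.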
Re: Thoughts on interaction between FOTree and layoutengine
On 18.01.2008 00:09:14 Andreas L Delmelle wrote: On Jan 17, 2008, at 20:57, Simon Pepping wrote: On Thu, Jan 17, 2008 at 12:27:11AM +0100, Andreas L Delmelle wrote: Right now, the element list is constructed as the result of recursive calls to getNextChildLM.getNextKnuthElements(). /The/ return list upon which the page breaker operates is the one that is ultimately returned by the FlowLM. Instead of that, I've been thinking in the direction of making it a data structure that exists 'physically' separate from the LMs. This structure, created and maintained by the PageSequenceLM, would be passed down into an appendNextKnuthElementsTo() method. The lower-level LMs can signal an interrupt to the ancestor LMs, based on information they get through the LayoutContext --forced breaks being the most prominent. The FlowLM, instead of simply continuing the loop, could give control back to the PageSequenceLM, which can run the page breaker over the list up to that point. I would rather pass a reference to the page breaker in the getNextKnuthElements call. Each LM can then append Knuth elements in a callback to the pagebreaker. At each such append callback, the page breaker can decide to run the Knuth algorithm and ship pages. When this callback finishes, the LM can continue. Running the Knuth algorithm intermittently makes no sense in a total-fit algorithm. Right. Running the algorithm intermittently may make no sense /in/ a total-fit algorithm, the implementation of which currently takes for granted the fact that a given sequence S will always be complete, so we can immediately move on from end-of-layout to building the area tree for the page-sequence. Suppose, however, that this will no longer be guaranteed. Also, I would see the basic strategy evolve in such a way that we have some check on the list's eventual size to determine whether to use best-fit or total-fit. Implementing this logic inside the LMs or the breaking-algorithm seems out-of-place. 
As Jeremias mentioned, we would have to have some mechanism for limiting memory consumption. Keeping total-fit as the default strategy for the layout-engine is fine by me, as long as we can also switch to best-fit at the appropriate point. This point is unrelated to anything layout-specific, so I was thinking that a separate data structure would help make the implementation of such a mechanism much cleaner. If this check becomes part of each childLM's getNextElements(), it might turn out to become a pain to maintain... If we implement it as part of the add() and remove() of that hypothetical structure, it remains nicely separated from both the LM logic and the breaking algorithm: two places where it does not belong.

Please note that total-fit is not possible if the available IPD changes from page to page, as this requires regeneration of part of the element list(s) after a page-break decision. So part of the check whether total-fit can be used must be the inspection of the layout-master-set. This adds further complexity. The layout-master-set creates further complications anyway: page-position last/only processing is non-trivial, especially if column balancing and span changes come into play (one part where FOP is still incomplete today). BTW, has anyone checked how the XSL 1.1 flow-map functionality impacts all this? I haven't had the chance. snip/ Jeremias Maerki
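A minimal sketch of that hypothetical structure, with the size check living in add() rather than in the LMs or the breaking algorithm. The class name and the simple threshold policy are my own invention for illustration, not FOP code:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical element list that watches its own size and signals when
 * the layout engine should fall back from total-fit to best-fit.
 * The check lives in one place instead of in every childLM's
 * getNextElements() or inside the breaking algorithm.
 */
public class MonitoredElementList<E> {

    private final List<E> elements = new ArrayList<>();
    private final int bestFitThreshold;
    private boolean bestFitRecommended = false;

    public MonitoredElementList(int bestFitThreshold) {
        this.bestFitThreshold = bestFitThreshold;
    }

    public void add(E element) {
        elements.add(element);
        // Once the list grows past the threshold, recommend switching
        // strategies; a real implementation might weigh actual memory use.
        if (elements.size() > bestFitThreshold) {
            bestFitRecommended = true;
        }
    }

    public boolean isBestFitRecommended() {
        return bestFitRecommended;
    }

    public int size() {
        return elements.size();
    }
}
```

The PageSequenceLM could poll isBestFitRecommended() between intermittent breaker runs and switch strategy without the LMs or the algorithm knowing anything about memory limits.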
Re: Thoughts on interaction between FOTree and layoutengine
Andreas L Delmelle wrote: On Jan 17, 2008, at 20:57, Simon Pepping wrote: On Thu, Jan 17, 2008 at 12:27:11AM +0100, Andreas L Delmelle wrote: The lower-level LMs can signal an interrupt to the ancestor LMs, based on information they get through the LayoutContext --forced breaks being the most prominent. The FlowLM, instead of simply continuing the loop, could give control back to the PageSequenceLM, which can run the page breaker over the list up to that point. I would rather pass a reference to the page breaker in the getNextKnuthElements call. Each LM can then append Knuth elements in a callback to the pagebreaker. At each such append callback, the page breaker can decide to run the Knuth algorithm and ship pages. When this callback finishes, the LM can continue. Sounds good. snip/ Also, I would see the basic strategy evolve in such a way that we have some check on the list's eventual size to determine whether to use best-fit or total-fit. Implementing this logic inside the LMs or the breaking-algorithm seems out-of-place. I don't agree. If I'm laying out a scientific book, whatever its size I want total-fit, and I'm ready to buy as many memory sticks as needed to make that possible. If I'm printing out database records I don't care about typesetting quality; I want it as fast as possible, so I'll switch to best-fit. And that while both documents may eventually have the same number of pages. snip/ We would also need to be able to determine the likelihood of the computed page-breaks changing due to additional content, if the FlowLM still has childLMs coming up. The Knuth algorithm does not determine actual pagebreaks but feasible pagebreaks. Feasible pagebreaks are only determined by the preceding Knuth sequence. Following Knuth elements can contribute more feasible pagebreaks, but they cannot change already determined feasible pagebreaks. Only at the appropriate time is the best feasible pagebreak selected, and are the actual pagebreaks determined.
The appropriate time is the end of the sequence in a total-fit strategy. In the best-fit strategy it is the point where no further feasible pagebreaks for the current page can be contributed. Hmm... my bad, I think I expressed it incorrectly. My interpretation is that, given incomplete sequences and intermittent runs, the score/demerits for each feasible break (its 'feasibility') determined by a prior run could still change due to additional childLMs having contributed more elements in the next. No. Once the demerits of a feasible break are computed, they won't change, whatever comes afterwards. That's the whole idea of dynamic programming. There is no strict point where we switch from determining feasible breaks to computing actual ones: the feasible breaks become 'more' actual with each progression. Basically, it's about questions like: What happens if we give the algorithm a sequence S, and subsequently a different sequence S' which actually is S plus some more elements added to the tail? The same sequence of feasible breaks for the common part, and potentially a totally different sequence of final chosen breaks. How are the resulting feasible breaks influenced if the total-fit algorithm is run first over S, then over S'? Is there a basis for saying: the feasible break at position P in S is more/less likely to be considered as an actual break, when you compare the results for the same sequence, with some more content added, given the same context? I have always somehow assumed there to be a threshold in the most common cases. In the sense that certain (maybe in fact even the bulk of) feasible breaks can already be excluded quite early, even if the sequence consists of millions of elements. Simply because they would, in intermittent runs, repeatedly be considered to be 'bad' breaks, or because they become 'worse' than in the previous run. That's not possible. The last page of a sequence of 300 pages may influence the choice of the very first page break.
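The point that computed demerits never change can be illustrated with a stripped-down breaking DP. This is a sketch under simplifying assumptions (uniform page capacity, demerits = squared leftover space), not FOP's actual BreakingAlgorithm. Because each entry depends only on earlier elements, running it over S and then over S' = S plus a tail yields identical demerits for the common prefix, even though the finally chosen chain of breaks may differ:

```java
/**
 * Minimal dynamic-programming view of Knuth-style page breaking.
 * d[i] = total demerits of the best chain of feasible breaks ending
 * after element i-1; it depends only on elements before that point,
 * so later elements can add feasible breaks but never change d[i].
 */
public class SimpleBreaker {

    /**
     * Computes total demerits for a feasible break after each element,
     * given per-element heights and a fixed page capacity `cap`.
     * Double.POSITIVE_INFINITY marks "no feasible break here".
     */
    public static double[] feasibleDemerits(int[] heights, int cap) {
        int n = heights.length;
        double[] d = new double[n + 1];
        d[0] = 0; // virtual break before the first element
        for (int i = 1; i <= n; i++) {
            d[i] = Double.POSITIVE_INFINITY;
            int pageHeight = 0;
            // Try every predecessor break j such that elements j..i-1
            // fit on a single page.
            for (int j = i - 1; j >= 0; j--) {
                pageHeight += heights[j];
                if (pageHeight > cap) {
                    break; // page overfull; earlier j only makes it worse
                }
                if (d[j] == Double.POSITIVE_INFINITY) {
                    continue; // no feasible chain reaches j
                }
                double leftover = cap - pageHeight;
                double cost = d[j] + leftover * leftover;
                if (cost < d[i]) {
                    d[i] = cost;
                }
            }
        }
        return d;
    }
}
```

Running this over S and over S' and comparing the shared prefix of the result arrays demonstrates Vincent's and Simon's point directly.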
I agree with Jeremias that it is too soon to speak about multi-threading. Well, to start implementing anything at least, we can always speak ;-) The layout code is already very complicated and would need some non-trivial cleanup and refactoring first. Besides, several FO elements aren’t implemented yet and I think it’s more urgent. I also remember Chris’ earlier comment about the fact that with 2 other guys he spent 3 months fixing the bugs of a multi-threaded application. Layout is complex enough. The only way I think multi-threading could occur is by reflecting (and enforcing) the overall software architecture. Basically I see the whole process as a series of specialised tasks that could be piped together, pretty much like the Unix way: XML parsing | FO tree building | Layout | Area tree | Rendering Each task could then be put in a separate thread. It’s important to keep layout as a single sequential process IMO, as page-sequences aren’t
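The pipe architecture sketched above could look roughly like this. Stage names and work-item types are hypothetical, not FOP classes; the point is that a bounded queue between stages doubles as the memory-limiting mechanism Jeremias asked for, because a fast producer blocks instead of flooding the heap:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Sketch of the "Unix pipe" architecture: each stage (XML parsing,
 * FO tree building, layout, area tree, rendering) runs in its own
 * thread and hands opaque work items to the next stage over a
 * bounded queue. Bounded capacity gives natural back-pressure.
 */
public class PipelineSketch {

    /** Sentinel marking end-of-stream, propagated through all stages. */
    static final Object EOF = new Object();

    static Thread stage(String name, BlockingQueue<Object> in,
                        BlockingQueue<Object> out) {
        Thread t = new Thread(() -> {
            try {
                Object item;
                while ((item = in.take()) != EOF) {
                    out.put(item); // a real stage would transform the item
                }
                out.put(EOF);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.start();
        return t;
    }

    /** Demo: run one item through a two-stage pipeline and return it. */
    static Object demo() {
        try {
            BlockingQueue<Object> q1 = new ArrayBlockingQueue<>(2);
            BlockingQueue<Object> q2 = new ArrayBlockingQueue<>(2);
            BlockingQueue<Object> q3 = new ArrayBlockingQueue<>(2);
            stage("fo-tree-building", q1, q2);
            stage("layout", q2, q3);
            q1.put("page-sequence-1");
            q1.put(EOF);
            Object result = q3.take();
            q3.take(); // consume the propagated EOF
            return result;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Within this picture layout itself stays a single sequential stage, as Vincent argues; only the stages are concurrent with each other.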
Re: Thoughts on interaction between FOTree and layoutengine
On Fri, Jan 18, 2008 at 12:09:14AM +0100, Andreas L Delmelle wrote: I have always somehow assumed there to be a threshold in the most common cases. In the sense that certain (maybe in fact even the bulk of) feasible breaks can already be excluded quite early, even if the sequence consists of millions of elements. Simply because they would, in intermittent runs, repeatedly be considered to be 'bad' breaks, or because they become 'worse' than in the previous run. Something like this happens already, in a way, I think, albeit very deeply embedded inside the algorithm, and a bit late too... Once we start determining actual breaks, the further we progress in the page-sequence, the more layout possibilities need to be excluded. We simply cannot consider all of them, including the ones that would lead to unreasonable results. Study the book, Andreas. Donald Knuth has treated all this in a manner that is quite complete. As Vincent said, this is the principle of dynamic programming. Earlier feasible pagebreaks are kept 'alive' because later feasible pagebreaks keep a chain of references to them. Early feasible pagebreaks that are not part of a 'best' chain of pagebreaks up to one of the active nodes are not referenced anymore and therefore disappear from memory. Simon -- Simon Pepping home page: http://www.leverkruid.eu
Re: Thoughts on interaction between FOTree and layoutengine
On Fri, Jan 18, 2008 at 12:52:01PM +0100, Luca Furini wrote: I was wondering whether the promotion can also be performed each time the active node list contains just a single node ... Yes, that should be possible. Simon -- Simon Pepping home page: http://www.leverkruid.eu
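Luca's observation could be sketched like this: whenever the active-node list collapses to a single node, every future chain of breaks must pass through it, so all breaks up to that node can be promoted to actual breaks and the corresponding pages shipped early. Class and method names are hypothetical, not FOP's:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * Sketch of early promotion of feasible breaks: if exactly one active
 * node survives, the chain of breaks leading to it is final and the
 * pages it delimits can be laid out and released immediately.
 */
public class EarlyPromotion {

    public static class Node {
        final int position;   // index of the break in the element list
        final Node previous;  // best chain back toward the start

        public Node(int position, Node previous) {
            this.position = position;
            this.previous = previous;
        }
    }

    /**
     * If exactly one active node remains, returns the committed break
     * positions leading up to it (oldest first); otherwise null,
     * because several candidate chains are still alive.
     */
    public static List<Integer> promote(List<Node> activeNodes) {
        if (activeNodes.size() != 1) {
            return null;
        }
        Deque<Integer> breaks = new ArrayDeque<>();
        for (Node n = activeNodes.get(0); n != null; n = n.previous) {
            breaks.addFirst(n.position);
        }
        return new ArrayList<>(breaks);
    }
}
```

This also matches Simon's earlier remark about garbage collection: once promotion happens, nothing references the pruned sibling chains, so they disappear from memory.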
Re: Thoughts on interaction between FOTree and layoutengine
On Thu, Jan 17, 2008 at 12:27:11AM +0100, Andreas L Delmelle wrote: Right now, the element list is constructed as the result of recursive calls to getNextChildLM.getNextKnuthElements(). /The/ return list upon which the page breaker operates is the one that is ultimately returned by the FlowLM. Instead of that, I've been thinking in the direction of making it a data structure that exists 'physically' separate from the LMs. This structure, created and maintained by the PageSequenceLM, would be passed down into an appendNextKnuthElementsTo() method. The lower-level LMs can signal an interrupt to the ancestor LMs, based on information they get through the LayoutContext --forced breaks being the most prominent. The FlowLM, instead of simply continuing the loop, could give control back to the PageSequenceLM, which can run the page breaker over the list up to that point. I would rather pass a reference to the page breaker in the getNextKnuthElements call. Each LM can then append Knuth elements in a callback to the pagebreaker. At each such append callback, the page breaker can decide to run the Knuth algorithm and ship pages. When this callback finishes, the LM can continue. Running the Knuth algorithm intermittently makes no sense in a total-fit algorithm. We would also need to be able to determine the likelihood of the computed page-breaks changing due to additional content, if the FlowLM still has childLMs coming up. The Knuth algorithm does not determine actual pagebreaks but feasible pagebreaks. Feasible pagebreaks are only determined by the preceding Knuth sequence. Following Knuth elements can contribute more feasible pagebreaks, but they cannot change already determined feasible pagebreaks. Only at the appropriate time is the best feasible pagebreak selected, and are the actual pagebreaks determined. The appropriate time is the end of the sequence in a total-fit strategy.
In the best-fit strategy it is the point where no further feasible pagebreaks for the current page can be contributed. The Knuth algorithm now performs a loop over all Knuth elements. This loop can be made interruptible, provided the state is preserved. This property makes it possible to run the Knuth algorithm intermittently. Regards, Simon -- Simon Pepping home page: http://www.leverkruid.eu
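Simon's interruptible main loop could be sketched as follows: the breaker keeps its position as instance state, so a run can return to the caller (for example, to let an LM append more elements) and a later call picks up exactly where the previous one left off. Names are illustrative, not FOP's BreakingAlgorithm:

```java
import java.util.List;

/**
 * Sketch of an interruptible Knuth main loop: consume() processes
 * whatever elements are currently available and returns; calling it
 * again after more elements have been appended resumes from the
 * preserved index, so no element is processed twice.
 */
public class ResumableBreaker {

    private int nextIndex = 0; // preserved state between runs
    private int processed = 0;

    /** Process all elements not yet seen, then return control. */
    public void consume(List<Integer> elements) {
        while (nextIndex < elements.size()) {
            // A real breaker would update the active-node list here.
            processed += elements.get(nextIndex);
            nextIndex++;
        }
    }

    public int processedTotal() {
        return processed;
    }
}
```

The essential property is simply that all loop state lives in fields rather than locals, which is what "provided the state is preserved" amounts to.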
Re: Thoughts on interaction between FOTree and layoutengine
On Jan 17, 2008, at 20:57, Simon Pepping wrote: On Thu, Jan 17, 2008 at 12:27:11AM +0100, Andreas L Delmelle wrote: Right now, the element list is constructed as the result of recursive calls to getNextChildLM.getNextKnuthElements(). /The/ return list upon which the page breaker operates is the one that is ultimately returned by the FlowLM. Instead of that, I've been thinking in the direction of making it a data structure that exists 'physically' separate from the LMs. This structure, created and maintained by the PageSequenceLM, would be passed down into an appendNextKnuthElementsTo() method. The lower-level LMs can signal an interrupt to the ancestor LMs, based on information they get through the LayoutContext --forced breaks being the most prominent. The FlowLM, instead of simply continuing the loop, could give control back to the PageSequenceLM, which can run the page breaker over the list up to that point. I would rather pass a reference to the page breaker in the getNextKnuthElements call. Each LM can then append Knuth elements in a callback to the pagebreaker. At each such append callback, the page breaker can decide to run the Knuth algorithm and ship pages. When this callback finishes, the LM can continue. Running the Knuth algorithm intermittently makes no sense in a total-fit algorithm. Right. Running the algorithm intermittently may make no sense /in/ a total-fit algorithm, the implementation of which currently takes for granted the fact that a given sequence S will always be complete, so we can immediately move on from end-of-layout to building the area tree for the page-sequence. Suppose, however, that this will no longer be guaranteed. Also, I would see the basic strategy evolve in such a way that we have some check on the list's eventual size to determine whether to use best-fit or total-fit. Implementing this logic inside the LMs or the breaking-algorithm seems out-of-place. 
As Jeremias mentioned, we would have to have some mechanism for limiting memory consumption. Keeping total-fit as the default strategy for the layout-engine is fine by me, as long as we can also switch to best-fit at the appropriate point. This point is unrelated to anything layout-specific, so I was thinking that a separate data structure would help make the implementation of such a mechanism much cleaner. If this check becomes part of each childLM's getNextElements(), it might turn out to become a pain to maintain... If we implement it as part of the add() and remove() of that hypothetical structure, it remains nicely separated from both the LM logic and the breaking algorithm. Two places where it does not belong. We would also need to be able to determine the likelihood of the computed page-breaks changing due to additional content, if the FlowLM still has childLMs coming up. The Knuth algorithm does not determine actual pagebreaks but feasible pagebreaks. Feasible pagebreaks are only determined by the preceding Knuth sequence. Following Knuth elements can contribute more feasible pagebreaks, but they cannot change already determined feasible pagebreaks. Only at the appropriate time is the best feasible pagebreak selected, and are the actual pagebreaks determined. The appropriate time is the end of the sequence in a total-fit strategy. In the best-fit strategy it is the point where no further feasible pagebreaks for the current page can be contributed. Hmm... my bad, I think I expressed it incorrectly. My interpretation is that, given incomplete sequences and intermittent runs, the score/demerits for each feasible break (its 'feasibility') determined by a prior run could still change due to additional childLMs having contributed more elements in the next. There is no strict point where we switch from determining feasible breaks to computing actual ones: the feasible breaks become 'more' actual with each progression.
Basically, it's about questions like: What happens if we give the algorithm a sequence S, and subsequently a different sequence S' which actually is S plus some more elements added to the tail? How are the resulting feasible breaks influenced if the total-fit algorithm is run first over S, then over S'? Is there a basis for saying: The feasible break at position P in S is more/less likely to be considered as an actual break, when you compare the results for the same sequence, with some more content added, given the same context. I have always somehow assumed there to be a threshold in the most common cases. In the sense that certain (maybe in fact even the bulk of) feasible breaks can already be excluded quite early, even if the sequence consists of millions of elements. Simply because they would, in intermittent runs, repeatedly be considered to be 'bad' breaks, or because they become 'worse' than the previous run. Something like this happens already, in a way, I think, albeit
Re: Thoughts on interaction between FOTree and layoutengine
On Wed, Jan 16, 2008 at 01:20:36AM +0100, Andreas L Delmelle wrote: So, on top of that, I'm thinking of making b) less of a monolithic process. At the moment, we always wait for an endPageSequence() call on the AreaTreeHandler, which works fine for small to medium-sized page-sequences, but is definitely not scalable to larger ones consisting of a lot of FOs. I think we should take a look at implementing endFlow(), for instance, or startFlow(). At those points, we are already guaranteed to have at least the part of the FOTree that is necessary to perform some basic preliminary layout (the *ahem* Pagination and Layout FOs). At the moment our only page breaking strategy is the total-fit algorithm. It will by its design wait until all Knuth elements of the page sequence are available. Then it makes no difference if the layout engine is called at endPageSequence or more frequently. One of my goals is to enable another page breaking strategy, a best-fit algorithm per page. That would enable one to complete layout and area building for parts of a page sequence. But the current layout engine first collects all Knuth elements for the page sequence and then starts the page breaker. Making the LMs return their Knuth elements to the page breaker immediately is required to finish pages early. But that is another major refactoring effort. Simon -- Simon Pepping home page: http://www.leverkruid.eu
Thoughts on interaction between FOTree and layoutengine
Hi people, I know I've mentioned it a few times before, so here's another attempt at brainstorming about possible improvements in interaction between fotree and layoutengine. (So, obviously, what follows does not apply to the output formats that bypass the layoutengine, like RTF.) In short, it concerns two aspects of this interaction that, when revisited, could lead to a much faster and less memory-hungry FOP. At least, that's the idea. I have read some on Java performance tuning, but would hardly call myself an expert on the matter...

The first would be FOP-internal multi-threading, which is already possible at the higher level (and relatively easy to implement, IIC). Right now, we do roughly, single-threaded:

a) catch SAX events, and build a sort of FOP-DOM (the FOTree) until we reach the end of a page-sequence
b) perform layout for the entire page-sequence
c) create the area tree, and render it

After c) we go back to a), while that process could have easily continued by starting b) in a separate thread, at least for the first a). Hope you can follow... :/ As Chris recently mentioned, forcing this /would/ make FOP non-compliant with the J2EE Bean standard, so this could probably be made part of the configuration somehow (and, leaping to Adrian's recent post on fop-users@: this could then be specified in the XSLT).

At first glance, by itself, this would make FOP actually more memory-hungry than it currently is. Imagine a handful of 20-page sequences being processed concurrently, and consider the number of objects involved if we're talking about lots of small blocks, inlines, tables etc. So, on top of that, I'm thinking of making b) less of a monolithic process. At the moment, we always wait for an endPageSequence() call on the AreaTreeHandler, which works fine for small to medium-sized page-sequences, but is definitely not scalable to larger ones consisting of a lot of FOs. I think we should take a look at implementing endFlow(), for instance, or startFlow().
At those points, we are already guaranteed to have at least the part of the FOTree that is necessary to perform some basic preliminary layout (the *ahem* Pagination and Layout FOs). Other handlers that could turn out to be interesting to implement (but I'm guessing this to be interesting only for the flow, not for the static-content):

* endExternalGraphic() / endInstreamForeignObject()
* endInlineContainer() [... ;-) ...] / endBlockContainer()
* endInline() / endBlock()

In fact, there are a whole lot more. The idea is obviously not to have the PageSequenceLM run over the entire page-sequence multiple times, but to have the next childLM continue where the last one left off. If the area addition is started in yet another thread, I think it would even become possible to release/GC parts of the FOTree (have the LMs dereference their FO) long before we even reach the first endPageSequence() during parsing. I'm not entirely sure yet, but I have a vague feeling that Simon's Interleaved_Page_Line_Breaking branch will be quite beneficial in getting this right (or may even be the key to making the whole thing feasible in the first place). If anyone has specific ideas/thoughts in this area (or questions), I'm definitely interested in anything you see in the code that I may have missed. The goal is of course, as always: to have FOP format 20 copies of the Encyclopedia Britannica at the same time without exceeding the 64MB limit... ;-) Cheers Andreas
Re: Thoughts on interaction between FOTree and layoutengine
On 16.01.2008 01:20:36 Andreas L Delmelle wrote: Hi people, I know I've mentioned it a few times before, so here's another attempt at brainstorming about possible improvements in interaction between fotree and layoutengine. (So, obviously, what follows does not apply to the output formats that bypass the layoutengine, like RTF.) In short, it concerns two aspects of this interaction that, when revisited, could lead to a much faster and less memory-hungry FOP. At least, that's the idea. I have read some on Java performance tuning, but would hardly call myself an expert on the matter... The first would be FOP-internal multi-threading, which is already possible at the higher level (and relatively easy to implement, IIC). Right now, we do roughly, single-threaded:

a) catch SAX events, and build a sort of FOP-DOM (the FOTree) until we reach the end of a page-sequence
b) perform layout for the entire page-sequence
c) create the area tree, and render it

After c) we go back to a), while that process could have easily continued by starting b) in a separate thread, at least for the first a). Hope you can follow... :/ As Chris recently mentioned, forcing this /would/ make FOP non-compliant with the J2EE Bean standard, so this could probably be made part of the configuration somehow (and, leaping to Adrian's recent post on fop-users@: this could then be specified in the XSLT). At first glance, by itself, this would make FOP actually more memory-hungry than it currently is.

Not only at first glance. It's one major problem when you start multi-threading (more below).

Imagine a handful of 20-page sequences being processed concurrently, and consider the number of objects involved if we're talking about lots of small blocks, inlines, tables etc. So, on top of that, I'm thinking of making b) less of a monolithic process.
At the moment, we always wait for an endPageSequence() call on the AreaTreeHandler, which works fine for small to medium-sized page-sequences, but is definitely not scalable to larger ones consisting of a lot of FOs. I think we should take a look at implementing endFlow(), for instance, or startFlow(). At those points, we are already guaranteed to have at least the part of the FOTree that is necessary to perform some basic preliminary layout (the *ahem* Pagination and Layout FOs).

What's preliminary layout? I don't get it. startFlow/endFlow doesn't help at all. That only excludes the static content from the page-sequence. One flow could still be huge.

Other handlers that could turn out to be interesting to implement (but I'm guessing this to be interesting only for the flow, not for the static-content):

* endExternalGraphic() / endInstreamForeignObject()
* endInlineContainer() [... ;-) ...] / endBlockContainer()
* endInline() / endBlock()

In fact, there are a whole lot more. The idea is obviously not to have the PageSequenceLM run over the entire page-sequence multiple times, but to have the next childLM continue where the last one left off. If the area addition is started in yet another thread, I think it would even become possible to release/GC parts of the FOTree (have the LMs dereference their FO) long before we even reach the first endPageSequence() during parsing.

The key here would be to have mechanisms to limit memory consumption. If the FO tree is built up faster than the layout engine can consume it, you still haven't gained anything. Smells like a lot of thread synchronization and complexity if you do it the multi-threading way. Even single-threaded, the complexity would grow again because there will be more interaction between the different parts of FOP. Furthermore, you need to know exactly when you can release an FO tree or layout object, i.e. when you're absolutely sure that you won't need it anymore.
Currently, the first inline FO in a page-sequence is kept in memory even if the layout engine is already on page 234. I'm not entirely sure yet, but I have a vague feeling that Simon's Interleaved_Page_Line_Breaking branch will be quite beneficial in getting this right (or may even be the key to making the whole thing feasible in the first place). His work is the precondition for that to become possible in the first place. I'd like to do this one step at a time. Once we have restartable layout we can think about ways to limit the amount of objects we keep in memory at one point in time. If anyone has specific ideas/thoughts in this area (or questions), I'm definitely interested in anything you see in the code that I may have missed. The goal is of course, as always: to have FOP format 20 copies of the Encyclopedia Britannica at the same time without exceeding the 64MB limit... ;-) Cheers Andreas I appreciate you starting this discussion but I think it's slightly too soon. Jeremias Maerki