Re: FOP Memory issues (fwd from fop-users)

Andreas L Delmelle Mon, 08 Jan 2007 08:16:03 -0800

On Jan 5, 2007, at 16:20, Jeremias Maerki wrote:

Adding page breaks will not be enough, BTW. But you already noticed that. FOP can currently only release memory at the end of a page-sequence. So instead of creating page-breaks, try to restart a new page-sequence. The
memory usage should drop considerably.

If I remember correctly, that was precisely the problem, since Cliff's report consists of one giant table. It's supposed to look like one uninterrupted flow, so figuring out where the page-sequences should end is next to impossible... (or IOW: sorting that out kind of defeats the purpose of using a formatter to compute the page-breaks) :/

There's also a little class (CachedRenderPagesModel) which could
theoretically be used instead of the default RenderPagesModel. It allows temporarily off-loading rendered pages to disk if they can't be rendered
right away. But this is not actively tested and does not help with the
memory consumption of the FO tree, which probably accounts for the
largest part in your case.

The one way I see that FOP is ever going to get close to resolving the issue of arbitrarily sized page-sequences is if the overall processing is 'slightly' modified (quoted, since it seems like only a small change, but it would still be quite some work for one man).

The redesign was ultimately meant to modularize FOP. Now that the fo-tree and the layout engine have been successfully extracted into separate modules, it seems like it's time to revisit the way they work together. Currently, we have two monolithic modules performing their respective operations in sequential order. One module (layout) can't start until the other (fo-tree) has reached a critical boundary (FOEventHandler.endPageSequence()), and vice versa, the fo-tree can't continue until layout for a page-sequence has finished.

Very briefly put: the key would be to implement AreaTreeHandler.endBlock(). Use that event to start/resume the layout-loop (ideally this loop should run in a separate thread, so there would be real performance-boosts on MP-systems), and use endPageSequence() instead only to perform one finishing pass over the whole sequence.
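
To make the event flow above a bit more concrete, here is a minimal sketch in plain Java. It does not use any FOP classes: StreamingLayoutHandler, its queue and the string block ids are all hypothetical stand-ins, and "layout" is reduced to recording the order in which blocks were handled. It only illustrates the hand-off: endBlock() feeds a layout loop running in its own thread, and endPageSequence() merely waits for that loop to finish.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for illustration only; NOT FOP's AreaTreeHandler.
class StreamingLayoutHandler {
    // Sentinel marking the end of the page-sequence; compared by identity.
    private static final String END = new String("<end-of-page-sequence>");

    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final List<String> laidOut = new ArrayList<>();
    private final Thread layoutThread;

    StreamingLayoutHandler() {
        layoutThread = new Thread(this::layoutLoop, "layout-loop");
        layoutThread.start();
    }

    // Called by the (hypothetical) FO tree builder whenever a block is complete.
    void endBlock(String blockId) {
        pending.add(blockId);
    }

    // Signals the end of the sequence and waits for the layout loop to drain.
    List<String> endPageSequence() {
        pending.add(END);
        try {
            layoutThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while finishing layout", e);
        }
        return laidOut;
    }

    // Runs in the layout thread: handles blocks as they arrive.
    private void layoutLoop() {
        try {
            while (true) {
                String block = pending.take();
                if (block == END) {
                    return;
                }
                laidOut.add(block); // placeholder for real layout work
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Calling endBlock("b1"), endBlock("b2") and then endPageSequence() on a fresh handler returns the blocks in arrival order. In a real implementation the finishing pass would of course do more than join the thread.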

Such a change could bring us closer to enhancing FOP in other areas as well. Multiple endBlock() events each offer an opportunity for the PageSequenceLM to record available IPD changes, take into account footnotes/floats associated with a block etc.


Rough sketch:

At the very first endBlock() the parent FlowLM and PageSequenceLM are instantiated, and the first block-sequence is created. The breaker is run a first time, storing the resulting active nodes. Every next occurrence of the event, the ancestor LMs and a set of active nodes are already present, a sequence for the current block is added, and the breaker is run again... As such, the page-breaking algorithm would run incrementally, performing multiple passes over the same block-sequences.
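
A toy version of that incremental behaviour, as a sketch only: IncrementalBreaker is a hypothetical class, blocks are reduced to integer heights, and the breaker is a simple greedy first-fit rather than FOP's actual breaking algorithm. The "active nodes" shrink to a single piece of carried-over state (the currently open page), which is enough to show that each endBlock() resumes from stored state and that earlier pages are settled before the end of the sequence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy breaker; greedy first-fit, NOT FOP's page-breaking algorithm.
class IncrementalBreaker {
    private final int pageHeight;
    private final List<List<Integer>> finishedPages = new ArrayList<>();
    // State carried between runs (stands in for the stored active nodes).
    private List<Integer> openPage = new ArrayList<>();
    private int used = 0;

    IncrementalBreaker(int pageHeight) {
        this.pageHeight = pageHeight;
    }

    // Called once per completed block; resumes breaking from the stored state.
    void endBlock(int blockHeight) {
        if (used + blockHeight > pageHeight && !openPage.isEmpty()) {
            // The open page cannot take this block, so that page is settled.
            finishedPages.add(openPage);
            openPage = new ArrayList<>();
            used = 0;
        }
        openPage.add(blockHeight);
        used += blockHeight;
    }

    // Pages that are already settled and could be handed on early.
    int finishedPageCount() {
        return finishedPages.size();
    }

    // Finishing pass: close whatever is still open and return all pages.
    List<List<Integer>> endPageSequence() {
        if (!openPage.isEmpty()) {
            finishedPages.add(openPage);
            openPage = new ArrayList<>();
            used = 0;
        }
        return finishedPages;
    }
}
```

With a page height of 100 and blocks of height 40, 50, 30, 80 and 20, two pages are already settled before endPageSequence() is called, and the finishing pass closes the third. A total-fit breaker would have to keep several candidate states alive instead of one, which is where the real work lies.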

As you can see from the simplistic sketch, I'm still a bit unsure about the specifics, but if all goes well, in the most straightforward cases, some LMs can begin adding their areas long before the physical end-of-page-sequence is reached. If that also implies they can release the reference to their FO (and instruct the FOTree to release the reference as well via FONode.removeChild()), large parts of the FOTree can be garbage-collected much sooner than they are now. Think of the content of block-containers, non-marker parts of the static-content, table-headers/-footers. Even large text-blocks: note that the TextLM currently creates a copy of the corresponding FOText's char array, while the original happily occupies the same amount of memory.
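
The release step could look roughly like this. Again a sketch with hypothetical classes (SketchNode and SketchLayoutManager are not FOP's FONode or LayoutManager, and the "area" is just a string): once the LM has produced its areas, it asks the parent node to drop the child and clears its own references, so nothing in the tree or the LM keeps the node alive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node for illustration; NOT FOP's FONode.
class SketchNode {
    final String name;
    private final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name) {
        this.name = name;
    }

    SketchNode addChild(SketchNode child) {
        children.add(child);
        return child;
    }

    boolean removeChild(SketchNode child) {
        return children.remove(child);
    }

    int childCount() {
        return children.size();
    }
}

// Hypothetical layout manager that lets go of its node after adding areas.
class SketchLayoutManager {
    private SketchNode parent;
    private SketchNode fo;

    SketchLayoutManager(SketchNode parent, SketchNode fo) {
        this.parent = parent;
        this.fo = fo;
    }

    String addAreas() {
        if (fo == null) {
            throw new IllegalStateException("areas were already added");
        }
        String area = "area(" + fo.name + ")"; // placeholder for real area generation
        parent.removeChild(fo); // the tree drops its reference
        fo = null;              // the LM drops its own references
        parent = null;
        return area;
    }

    boolean holdsFo() {
        return fo != null;
    }
}
```

After addAreas() returns, the parent has one child fewer and the LM no longer holds the node, so the node becomes eligible for garbage collection as soon as no other references remain.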

The overall changes would be far from trivial though, AFAICT, but I'd love to see some more brainstorming in this direction. Biggest problem, IIC, is that AbstractBreaker.doLayout() currently performs everything in one go.




Cheers,

Andreas
