I've been making pretty big PDFs with a similar system and can share a few off-the-cuff comments.
It's obvious to me that the structure of your FO document - sequences, page layouts, flows, etc. - can make a significant difference in memory usage and speed. However, I don't have enough concrete conclusions about what exactly does what to offer useful advice at that level... perhaps others can help you there.

The big thing I noticed was page numbers. If you use them, and especially if you make references to them (i.e. building a table of contents), you'll see a significant speed and memory impact. A GC consequence of holding the references, perhaps.

Basically, you need to tune your fop script to give FOP the maximum possible heap size (i.e. fop.bat starts with "java -Xmx256M ..."). But if you make the heap too large, you'll discover that between Java and FOP the memory access patterns will thrash your machine's virtual memory once the heap exceeds the available RAM and you start to swap. Some observation of your machine's memory availability during normal use, plus some experimentation, should get you to the right number. Your 40+ minute rendering times are strongly suggestive of swap thrashing.

Practically speaking, you need to make your heap small enough that Java never swaps, and limit your recordset size on the front end so you never hit that memory ceiling. Between this and throwing hardware at the problem (multiple Xeons and 1GB+ RAM), we've made a go of it for 1000+ page documents. But of course every recordset+template combination is different, so that page count isn't necessarily meaningful for you at all.

One thing that I haven't tried yet but am very curious to experiment with is FOP + the IBM JVM. Conventional wisdom has it that the IBM VM is significantly superior to Sun's VM in both CPU and RAM efficiency. If I manage to get to this, I'll post my results. If anyone else has, or happens to get to it first, I'd love to hear what happens. I mean, after your "write-once-porting-is-slightly-less-painful" experience.
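To make the heap tuning concrete: here's roughly what the edited launch line inside fop.sh/fop.bat might look like. The exact script contents, classpath variable, and main class name vary by FOP version and install, so treat this as an illustration, not a drop-in.

```shell
# Illustrative only: script layout and class names depend on your FOP version.
# Raise the heap as high as you can without exceeding physical RAM -
# once the JVM heap spills into swap, render times fall off a cliff.
java -Xmx512M -cp "$LOCALCLASSPATH" org.apache.fop.apps.Fop report.fo report.pdf
```

Start below your free RAM (observed during normal load), and only raise -Xmx while the machine stays out of swap.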
On Tue, 12 Mar 2002, David Le Strat wrote:

> All,
>
> I am currently working on a project where we are dynamically creating PDF
> documents based on a user input. When a user selects a specific period of
> time, we pull out the matching records from the database, convert the
> dataset to XML and render a PDF report based on that dataset. Now,
> everything works fine when we are manipulating up to 200 records (we get the
> result in 1 or 2 minutes). However some reports manipulate 7000 or 8000
> records and in these particular instances, the performance degrades fairly
> significantly (no report was rendered after 40 minutes).
>
> Does any of you have any idea/input on how to improve performance using FOP
> in such cases and what type of performance we should expect for the above
> examples?
>
> Thank you for your help.
>
> David Le Strat.
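P.S. The "limit your recordset size on the front end" part of my advice could be sketched like this - split the query results into bounded batches and render each batch separately (class and method names here are my own invention, not anything in the FOP API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: cap how many records go into a single FOP render
// so the working set stays under the heap ceiling you tuned with -Xmx.
public class BatchSplitter {

    // Split `records` into consecutive sublists of at most `batchSize` elements.
    static <T> List<List<T>> split(List<T> records, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            batches.add(records.subList(i, Math.min(i + batchSize, records.size())));
        }
        return batches;
    }
}
```

You'd then generate one XML/FO chunk per batch (or page through the query itself) instead of feeding all 7000-8000 records into a single transform.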