On Apr 8, 2008, at 15:53, egreene wrote:

We are having memory/timeout issues on our production Websphere box using FOP as an alternative (in the future, the only) process for creating PDFs from XML data. The application seemed to have passed load testing in staging, but in production the application's memory had to be increased from 0.5GB to 2GB.


FOP 0.94?

Lately, a number of memory-related issues have been reported on IBM JVMs. Maybe that is the culprit here as well. Was the test environment identical to the production setup?

Which version of the IBM JVM is used in your production environment?
I have received confirmation in an off-list discussion that one and the same document, rendered with FOP 0.94:
- ran fine on Sun JVM 1.4.2
- caused GC-related issues with IBM JVM 1.4.2 SR7 and SR10
- ran fine on IBM JVM 1.5

Note: 'fine' is very relative, as the document was using 1GB+ of heap (for trunk/0.95beta 'only' 768MB was needed). The document was a single page-sequence containing a table spanning 400+ pages: 7000+ rows by 10+ columns...

The process works fine in batch for standard directories at night...

Most likely because you have more control over the number of concurrent jobs (maybe only one?)

... but we also have the ability for the user to customize directories, and this is done in real time.

Where you typically have multiple of those jobs running concurrently.

Imagine one single document using up 0.5GB; multiple of those jobs performed concurrently can then easily lead to 2GB+ of heap. From my past experiments, I can confirm that FOP does not use twice as much memory for two concurrent renderings, but still obviously significantly more than for only one thread...

Also, running FOP in multiple threads concurrently does not always pay off in terms of processing speed. It works fine if the documents are relatively small, but a few concurrent renderings of the document I described above will definitely bring almost any server to its knees. FOP runs significantly slower once the number of concurrent processes exceeds the number of CPUs. In the case of a very large number of concurrent requests, you're sometimes better off introducing some sort of buffer and dealing with the requests in the background, say a maximum of four at a time. The total runtime and the amount of required resources will be drastically reduced. If you run 16 jobs, 4 at a time, the entire batch will be done quicker than if you blindly launched all 16 in one go, unless you back that up by buying more CPUs and RAM.
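The "16 jobs, 4 at a time" approach can be sketched with a fixed-size thread pool. This is only an illustration, not FOP code: RenderAll and the Thread.sleep() stand in for whatever actually drives a FopFactory/Transformer run per document.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedRendering {
    // Track how many jobs run at once, to show the pool really caps concurrency.
    private static final AtomicInteger active = new AtomicInteger();
    private static final AtomicInteger peak = new AtomicInteger();

    /** Runs 'jobs' renderings, at most 'maxConcurrent' at a time; returns the peak concurrency seen. */
    public static int renderAll(int jobs, int maxConcurrent) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        for (int i = 0; i < jobs; i++) {
            pool.submit(() -> {
                int now = active.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(50); // stand-in for one (expensive) FOP rendering
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    active.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak concurrent jobs: " + renderAll(16, 4));
    }
}
```

Jobs beyond the pool size simply wait in the executor's internal queue, so heap usage stays bounded by the four in-flight renderings rather than by the total number of requests.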

The application generates PDFs from XML files that range from 100Kb to
2.7MB. The XSLs that create the directories average around 400Kb-500Kb.
The setup of the FO files is as follows (on average):

The /size/ of the XSL is of little importance, but 2.7MB of XML, depending on the /structure/ of the XSLT, can easily lead to a virtual FO file that is 10 times the volume (27MB).

24 page sequences (w/ even and odd sequence masters on sequences other than
blank and cover pages):

- 2 sequences are table of contents
- 1 sequence is a back of the book index
- 2-3 static content sequences
- Other sequences are dynamic and include multi-column data

I have done all the suggestions to improve performance on the Apache site as far as multiple sequences and so forth. As you can see, I cannot avoid forward references because of the table of contents (English and Spanish). I also went through and reduced the amount of redundant code (font information and other attributes that the XSL designer added).

If you have followed all the suggestions so far, and you simply cannot do without forward references, I see no other option than to somehow limit the number of requests that can be processed concurrently in your webapp (by introducing a level of indirection and a sort of waiting queue...)
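In a webapp, that kind of gate can be as simple as a fair semaphore wrapped around the rendering call. A minimal sketch, assuming request handlers call something like the hypothetical render() below; excess requests block in FIFO order instead of each grabbing heap at the same time:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

public class RenderGate {
    private final Semaphore permits;

    public RenderGate(int maxConcurrent) {
        // 'true' makes the semaphore fair: waiting requests form a FIFO queue.
        this.permits = new Semaphore(maxConcurrent, true);
    }

    /** Runs the job once a permit is free; at most maxConcurrent jobs run at a time. */
    public <T> T render(Callable<T> job) throws Exception {
        permits.acquire(); // blocks while maxConcurrent renderings are already running
        try {
            return job.call();
        } finally {
            permits.release();
        }
    }
}
```

A shared RenderGate instance (e.g. one per webapp) is all the "level of indirection" needed; if blocking the request thread is unacceptable, the same idea can be moved behind an executor queue that notifies the user when the PDF is ready.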


HTH!

Cheers

Andreas

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
