On Apr 8, 2008, at 15:53, egreene wrote:
> We are having memory/timeout issues on our production WebSphere box
> using FOP as an alternative (in the future, the only) process for
> creating PDFs from XML data. The application seemed to have passed
> load testing in stage, but in production the application's memory
> had to be increased from 0.5GB to 2GB.
FOP 0.94?
Lately, a number of memory issues have been reported on IBM JVMs.
Maybe that is the culprit here as well. Was the test environment
identical to the production setup?
Which version of the IBM JVM is used in your production environment?
I have received confirmation in an off-list discussion that one and
the same document, rendered with FOP 0.94:
- ran fine on Sun JVM 1.4.2
- caused GC-related issues with IBM JVM 1.4.2 SR7 and SR10
- ran fine on IBM JVM 1.5
Note: 'fine' is very relative, as the document was using 1GB+ of
heap (for trunk/0.95beta, 'only' 768MB was needed).
The document was a single page-sequence containing a 400+ page
table: 7000+ rows times 10+ columns...
> The process works fine in batch for standard directories at night...
Most likely because you have more control over the number of
concurrent jobs (maybe only one?)
> ... but we also have the ability for the user to customize
> directories, and this is done in real time.
Where you typically have multiple of those jobs running concurrently.
Imagine one single document using up 0.5GB: multiple of those jobs
performed concurrently can easily lead to 2GB+ of heap.
From my past experiments, I can confirm that FOP does not use twice
as much memory for two concurrent renderings, but it still obviously
uses significantly more than a single thread...
Also, running FOP in multiple threads concurrently does not always
pay off in terms of processing speed. It works fine if the documents
are relatively small, but a few concurrent renderings of the document
I described above will definitely bring almost any server to its knees.
FOP runs significantly slower once the amount of concurrent processes
exceeds the number of CPUs. In cases of a very large number of
concurrent requests, you're sometimes better off introducing some
sort of buffer and dealing with the requests in the background, say
a maximum of four at a time. The total runtime and amount of required
resources will be drastically reduced: if you run 16 jobs, 4 at a
time, the entire batch will be done quicker than if you blindly
launched all 16 in one go, unless you back that up by buying more
CPUs and RAM.
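The "16 jobs, 4 at a time" idea above can be sketched with a standard java.util.concurrent fixed thread pool (available since Java 5). This is only an illustration, not FOP code: the `renderAll` method and the `Thread.sleep` stand-in for one XSLT+FOP rendering are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThrottledRendering {

    // Submit 'jobs' rendering tasks, but let at most 'maxConcurrent'
    // run at the same time; the pool queues the rest.
    static List<String> renderAll(int jobs, int maxConcurrent) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        List<Future<String>> futures = new ArrayList<>();
        for (int i = 0; i < jobs; i++) {
            final int id = i;
            futures.add(pool.submit(() -> {
                Thread.sleep(50); // stand-in for one XSLT+FOP rendering
                return "job-" + id + ".pdf";
            }));
        }
        List<String> done = new ArrayList<>();
        for (Future<String> f : futures) {
            done.add(f.get()); // blocks until that job has finished
        }
        pool.shutdown();
        return done;
    }

    public static void main(String[] args) throws Exception {
        // 16 jobs, never more than 4 in flight at once
        System.out.println(renderAll(16, 4).size() + " PDFs rendered");
    }
}
```

With only 4 renderings in flight, peak heap stays roughly at 4x one document's footprint instead of 16x, which is exactly the trade-off described above.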
> The application generates PDFs from XML files that range from 100Kb
> to 2.7MB. The XSLs that create the directories average around
> 400Kb-500Kb.
> The setup of the FO files is as follows (on average):
The /size/ of the XSL is of little importance, but 2.7MB of XML,
depending on the /structure/ of the XSLT, can easily lead to a
virtual FO file that is 10 times the volume (27MB).
> 24 page sequences (w/ even and odd sequence masters on sequences
> other than blank and cover pages):
> - 2 sequences are tables of contents
> - 1 sequence is a back-of-the-book index
> - 2-3 static content sequences
> - Other sequences are dynamic and include multi-column data
> I have followed all the suggestions for improving performance on the
> Apache site as far as multiple sequences and so forth. As you can
> see, I cannot avoid forward references because of the table of
> contents (English and Spanish). I also went through and reduced the
> amount of redundant code (font information and other attributes that
> the XSL designer added).
If you have followed all the suggestions so far, and you simply
cannot do without forward references, I see no other option than to
somehow limit the number of requests that can be processed
concurrently in your webapp (by introducing a level of indirection
and a sort of waiting queue...)
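For the webapp side, one minimal sketch of such a gate (again hypothetical code, not your actual servlet) is a fair counting semaphore around the rendering call, so excess real-time requests simply wait their turn:

```java
import java.util.concurrent.Semaphore;

public class RenderGate {
    // Allow at most 4 renderings in flight; fairness=true serves
    // waiting requests in arrival order.
    private final Semaphore slots = new Semaphore(4, true);

    public String render(String directoryName) throws InterruptedException {
        slots.acquire();      // blocks while 4 renderings are running
        try {
            // Placeholder for the real XSLT + FOP pipeline that
            // writes the PDF for this customized directory.
            Thread.sleep(20);
            return directoryName + ".pdf";
        } finally {
            slots.release();  // always free the slot, even on failure
        }
    }

    public static void main(String[] args) throws Exception {
        RenderGate gate = new RenderGate();
        System.out.println(gate.render("custom-directory"));
    }
}
```

A blocking gate like this keeps the user's request open until a slot frees up; if that is unacceptable, the same semaphore with tryAcquire() lets you return a "busy, try later" response instead.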
HTH!
Cheers
Andreas
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]