From my recent investigations I can assure you that there is no FO rendering tool that is less memory intensive; the current version of FOP is one of the best choices in terms of memory usage.  I use FOP for report generation, with source XML documents ranging from 20KB to 1.2GB (yes, GB).  This is doable if you're willing to accept that performance is not going to be stellar, and you have to accept a lot of restrictions in XSL and XML development.
 
Just to head off the inevitable question: many of the reports are stored for future audit purposes, and the number of pages isn't an issue since the PDF is searchable.  The information needs to be stored in a non-(easily)-editable format, as at a point in time.  The transforms must go through the XSL:FO transformation because the end user is able to edit the stylesheets and alter the visual layout.
 
Here's some of my experience:
1. The first problem you run into is that none of the XML transformers can easily handle a 1.2GB XML file.  All of them seem to insist on loading the entire XML into memory, even when using a SAX or stream input source.  There are three solutions: a) use SAXON in preview mode, which allows you to process the document as you read it; b) write a disk-based XML DOM solution and extend the NodeSource class in SAXON (see the JDOM input source example); or c) create the FO directly instead of creating XML for transformation.
I chose option b, and created a disk-based XML store that allows DOM-style access to an XML document of unlimited size. The memory footprint is about 12MB regardless of the XML size, but you pay for it in performance.
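 
For reference, the overall wiring is the stock FOP 0.20.5 embedding pattern sketched below (file names are placeholders): the transformer's SAX output is pushed straight into FOP, so the FO never has to exist as a file or a DOM tree.  Note that this alone does not stop the XSLT processor from buffering the whole *source* document, which is exactly what options a/b/c above are about.

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;

import org.apache.avalon.framework.logger.ConsoleLogger;
import org.apache.avalon.framework.logger.Logger;
import org.apache.fop.apps.Driver;
import org.apache.fop.messaging.MessageHandler;

public class ReportPipeline {
    public static void main(String[] args) throws Exception {
        OutputStream out = new BufferedOutputStream(new FileOutputStream("report.pdf"));
        try {
            Logger logger = new ConsoleLogger(ConsoleLogger.LEVEL_WARN);
            MessageHandler.setScreenLogger(logger);

            Driver driver = new Driver();
            driver.setLogger(logger);
            driver.setRenderer(Driver.RENDER_PDF);
            driver.setOutputStream(out);

            // The transformer's SAX output feeds straight into FOP, so the
            // FO document is never written to disk or held as a DOM tree.
            // The XSLT processor may still buffer the source tree, though.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource("report.xsl"));
            transformer.transform(new StreamSource("report.xml"),
                    new SAXResult(driver.getContentHandler()));
        } finally {
            out.close();
        }
    }
}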
 
2. All of the current FO tools I looked at use memory to store the pre-rendered FO information.  The only product where I could overcome this was FOP, which, after the end of a page-sequence, renders it to disk and frees the memory.  To take advantage of this, I determine a logical group-by that occurs frequently enough to keep memory usage down, and break on it.  It results in a new page at the end of every group-by selection, but it works.  There are a lot of choices for the grouping key; you could, for example, break on a change in the first letter of an alpha-name sort (i.e. moving from A to B starts a new page).
 
By page sequence, the fo: tag referred to is '<fo:page-sequence>'.  Each time a '</fo:page-sequence>' is encountered, the in-memory page information is rendered.
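 
To make that concrete, here is a minimal, hypothetical sketch (not my production code) of option c combined with the group-by break: FO is emitted straight into FOP's ContentHandler as SAX events, with one fo:page-sequence per group.  It assumes the caller has already emitted startDocument(), fo:root and a layout-master-set containing a simple-page-master named "main".

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

public class GroupedFoEmitter {
    private static final String FO = "http://www.w3.org/1999/XSL/Format";
    private final ContentHandler fop; // e.g. driver.getContentHandler()

    public GroupedFoEmitter(ContentHandler fop) {
        this.fop = fop;
    }

    // Emits one complete fo:page-sequence for a single group of rows.
    public void writeGroup(Iterable<String> lines) throws SAXException {
        AttributesImpl seqAtts = new AttributesImpl();
        seqAtts.addAttribute("", "master-reference", "master-reference",
                "CDATA", "main");
        fop.startElement(FO, "page-sequence", "fo:page-sequence", seqAtts);

        AttributesImpl flowAtts = new AttributesImpl();
        flowAtts.addAttribute("", "flow-name", "flow-name",
                "CDATA", "xsl-region-body");
        fop.startElement(FO, "flow", "fo:flow", flowAtts);

        for (String line : lines) {
            fop.startElement(FO, "block", "fo:block", new AttributesImpl());
            fop.characters(line.toCharArray(), 0, line.length());
            fop.endElement(FO, "block", "fo:block");
        }

        fop.endElement(FO, "flow", "fo:flow");
        // Closing the page-sequence is the point at which FOP renders the
        // accumulated pages to the output stream and releases their memory.
        fop.endElement(FO, "page-sequence", "fo:page-sequence");
    }
}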
 
!!!This means that you can no longer use 'Page x of y' page numbering from within the FO (FOP has already rendered and released the earlier sequences before the total is known), so I use iText to post-process the report and add 'Page x of y' to the bottom of every page.
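 
The stamping step looks roughly like the sketch below, using the classic com.lowagie iText API; the file names and coordinates are placeholders (x=297 roughly centers on an A4 page).

import java.io.FileOutputStream;

import com.lowagie.text.pdf.BaseFont;
import com.lowagie.text.pdf.PdfContentByte;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfStamper;

public class PageNumberStamper {
    public static void main(String[] args) throws Exception {
        PdfReader reader = new PdfReader("report.pdf");
        PdfStamper stamper = new PdfStamper(reader,
                new FileOutputStream("report-numbered.pdf"));
        int total = reader.getNumberOfPages();
        BaseFont font = BaseFont.createFont(BaseFont.HELVETICA,
                BaseFont.WINANSI, BaseFont.NOT_EMBEDDED);
        for (int i = 1; i <= total; i++) {
            // Write over the existing page content, near the bottom edge.
            PdfContentByte over = stamper.getOverContent(i);
            over.beginText();
            over.setFontAndSize(font, 9);
            over.showTextAligned(PdfContentByte.ALIGN_CENTER,
                    "Page " + i + " of " + total, 297f, 20f, 0f);
            over.endText();
        }
        stamper.close();
    }
}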
 
To date, my largest transformation was a 1.1GB FO document, resulting in a 378,000-page report.  The total transform time, including XML -> FO and then FO -> PDF, was about 45 minutes on a P4 2.8GHz with 512MB RAM.  The JVM maximum heap was set with -Xmx128m, and peak memory usage during the transform was about 110MB (so under the maximum).
 
As a final note, there is a FOP alternate-design project underway that is addressing the memory usage, but it's not yet available.
 
Hope these suggestions help.
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Monday, June 14, 2004 2:45 AM
To: [EMAIL PROTECTED]
Subject: Help me from Outmemory issue

Hi all,

I am having out-of-memory issues when transforming my FO -> PDF using
fop-0.20.5rc2. I read that using multiple page sequences in the XSL, and
therefore in the FO, means that FOP will release some memory, but I don't see
how I can do this. My XML file is generated dynamically from a database, so I
don't know how big it will be. Are there any solutions that use multiple page
sequences, or could I change the XML structure? If not, is there another FO
renderer that isn't as memory intensive? Thanks in advance for any help. If I
haven't provided enough info, please ask and I can get back to you.

 

With regards

Bhaskar
