Hi fop-dev,

In one of my use cases, I create a PDF file of about 20,000 pages from the FOP intermediate format. I had imagined this as a streaming process (read a page from the IF, write it to the PDF, release the memory), with the exception of image caching. In reality, by analyzing a heap dump taken with the -XX:+HeapDumpOnOutOfMemoryError parameter on a production server, I found that o.a.f.r.p.PDFDocumentHandler keeps a reference for every page, to be used for bookmarks & outlines. In my case, the retained heap size of each page is about 150 KB. Multiplied by the number of pages, that is roughly 3 GB for a single document. Even worse, on my production server I have 10 threads creating 20k-page documents in parallel.
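To put rough numbers on this, here is a small back-of-the-envelope calculation. It only multiplies the figures quoted above (about 150 KB retained per page, 20,000 pages, 10 threads); the class and method names are made up for illustration, and it assumes all 10 documents are in flight at the same time.

```java
// Rough estimate only; the inputs are the approximate figures from the heap dump.
public class PageRefMemoryEstimate {

    /** Bytes retained by page references across all concurrent renderings. */
    static long retainedBytes(long pages, long bytesPerPage, int threads) {
        return pages * bytesPerPage * threads;
    }

    public static void main(String[] args) {
        long perPage = 150L * 1024;   // ~150 KB retained per page (approximate)
        long pages = 20000L;          // pages per document
        double gib = 1024.0 * 1024 * 1024;

        long oneDoc = retainedBytes(pages, perPage, 1);
        long tenDocs = retainedBytes(pages, perPage, 10);

        System.out.printf("one document:  %.2f GiB%n", oneDoc / gib);
        System.out.printf("ten documents: %.2f GiB%n", tenDocs / gib);
    }
}
```

With these inputs a single document comes to just under 3 GiB, and ten parallel documents to ten times that, which is far more than a typical heap setting allows.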
Attached is a patch against the latest trunk revision (1051938) that considerably reduces memory usage by keeping only a String (the PDF page reference) instead of the full org.apache.fop.pdf.PDFReference object. This is possible because the string is the only thing we ever need from that object.

Ideally, I would like to keep no page references at all when bookmarks & outlines are not used, or at least keep them only for the pages that are actually referenced. Is this possible? If so, do you have any hints? If further optimizations are not possible, or are too complex, I guess I will just open an issue and attach this patch.

I hope you agree with the addition of generics to the Map declaration and with the change from "new Integer()" to "Integer.valueOf()" (FindBugs performance warning).

Greetings,
Alexis Giotis

Index: src/java/org/apache/fop/render/pdf/PDFDocumentHandler.java
===================================================================
--- src/java/org/apache/fop/render/pdf/PDFDocumentHandler.java (revision 1051938)
+++ src/java/org/apache/fop/render/pdf/PDFDocumentHandler.java (working copy)
@@ -39,7 +39,6 @@
 import org.apache.fop.pdf.PDFAnnotList;
 import org.apache.fop.pdf.PDFDocument;
 import org.apache.fop.pdf.PDFPage;
-import org.apache.fop.pdf.PDFReference;
 import org.apache.fop.pdf.PDFResourceContext;
 import org.apache.fop.pdf.PDFResources;
 import org.apache.fop.render.extensions.prepress.PageBoundaries;
@@ -93,7 +92,7 @@
     protected PageReference currentPageRef;
 
     /** Used for bookmarks/outlines. */
-    protected Map pageReferences = new java.util.HashMap();
+    protected Map<Integer, PageReference> pageReferences = new java.util.HashMap<Integer, PageReference>();
 
     private final PDFDocumentNavigationHandler documentNavigationHandler
         = new PDFDocumentNavigationHandler(this);
@@ -238,7 +237,7 @@
         pdfUtil.generatePageLabel(index, name);
 
         currentPageRef = new PageReference(currentPage, size);
-        this.pageReferences.put(new Integer(index), currentPageRef);
+        this.pageReferences.put(Integer.valueOf(index), currentPageRef);
 
         this.generator = new PDFContentGenerator(this.pdfDoc, this.outputStream,
                 this.currentPage);
@@ -308,21 +307,22 @@
     }
 
     PageReference getPageReference(int pageIndex) {
-        return (PageReference)this.pageReferences.get(
-                new Integer(pageIndex));
+        return this.pageReferences.get(Integer.valueOf(pageIndex));
     }
 
     static final class PageReference {
 
-        private final PDFReference pageRef;
+        private final String pageRef;
         private final Dimension pageDimension;
 
         private PageReference(PDFPage page, Dimension dim) {
-            this.pageRef = page.makeReference();
+            // Avoid keeping references to PDFPage as memory usage is
+            // considerably increased when handling thousands of pages.
+            this.pageRef = page.makeReference().toString();
             this.pageDimension = new Dimension(dim);
         }
 
-        public PDFReference getPageRef() {
+        public String getPageRef() {
             return this.pageRef;
         }
 
Index: src/java/org/apache/fop/render/pdf/PDFDocumentNavigationHandler.java
===================================================================
--- src/java/org/apache/fop/render/pdf/PDFDocumentNavigationHandler.java (revision 1051938)
+++ src/java/org/apache/fop/render/pdf/PDFDocumentNavigationHandler.java (working copy)
@@ -189,7 +189,7 @@
                     p2d = new Point2D.Double(
                             action.getTargetLocation().x / 1000.0,
                             (pageRef.getPageDimension().height - action.getTargetLocation().y) / 1000.0);
-                    String pdfPageRef = pageRef.getPageRef().toString();
+                    String pdfPageRef = pageRef.getPageRef();
                     pdfGoTo.setPageReference(pdfPageRef);
                     pdfGoTo.setPosition(p2d);
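
For anyone who wants to see the idea of the patch in isolation, here is a minimal standalone sketch. It is not FOP code: HeavyPage and LightPageRef are hypothetical stand-ins for PDFPage and PageReference, the 150 KB payload merely simulates the retained state, and the A4 size in millipoints is just an example value. I am also assuming the string form is the usual PDF indirect reference, "<object number> <generation> R".

```java
import java.awt.Dimension;
import java.util.HashMap;
import java.util.Map;

public class LightweightPageRefSketch {

    /** Hypothetical stand-in for a heavyweight page object (not FOP's PDFPage). */
    static final class HeavyPage {
        private final int objectNumber;
        private final byte[] payload = new byte[150 * 1024]; // simulated retained state

        HeavyPage(int objectNumber) {
            this.objectNumber = objectNumber;
        }

        /** Indirect reference in PDF syntax, generation fixed at 0 for the sketch. */
        String referenceString() {
            return objectNumber + " 0 R";
        }
    }

    /** Keeps only what navigation needs: the reference text and the page size. */
    static final class LightPageRef {
        private final String pageRef;
        private final Dimension pageDimension;

        LightPageRef(HeavyPage page, Dimension dim) {
            this.pageRef = page.referenceString();   // copy the text, keep no link to the page
            this.pageDimension = new Dimension(dim); // defensive copy
        }

        String getPageRef() {
            return pageRef;
        }

        Dimension getPageDimension() {
            return new Dimension(pageDimension);
        }
    }

    /** Builds the per-page index; each HeavyPage becomes unreachable after its iteration. */
    static Map<Integer, LightPageRef> buildIndex(int pageCount) {
        Map<Integer, LightPageRef> refs = new HashMap<Integer, LightPageRef>();
        Dimension a4 = new Dimension(595000, 842000); // example size in millipoints
        for (int i = 0; i < pageCount; i++) {
            HeavyPage page = new HeavyPage(i + 1);
            refs.put(Integer.valueOf(i), new LightPageRef(page, a4));
        }
        return refs;
    }

    public static void main(String[] args) {
        Map<Integer, LightPageRef> refs = buildIndex(1000);
        System.out.println(refs.get(Integer.valueOf(0)).getPageRef()); // prints "1 0 R"
    }
}
```

The map still grows with the number of pages, but each entry now holds a short string and a Dimension rather than anything that can reach the page itself, so the pages can be collected as soon as they have been written.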