Try loading each source file with a scratch file, using PDDocument.load(String, RandomAccess): http://pdfbox.apache.org/apidocs/org/apache/pdfbox/pdmodel/PDDocument.html#load(java.lang.String,%20org.apache.pdfbox.io.RandomAccess)
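
A rough sketch of what I mean, assuming PDFBox 1.8.x on the classpath. The class name ScratchFileMerge, the merge() helper and the temp-file naming are made up for illustration. It also assumes that a source document can be closed as soon as it has been appended, so treat it as a starting point and not a tested recipe:

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.pdfbox.exceptions.COSVisitorException;
import org.apache.pdfbox.io.RandomAccessFile;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFMergerUtility;

// Hypothetical helper class, not part of PDFBox.
public class ScratchFileMerge {

    // Appends every source PDF to a new document and saves it to target.
    public static void merge(List<File> sources, File target)
            throws IOException, COSVisitorException {
        PDFMergerUtility merger = new PDFMergerUtility();
        PDDocument destination = new PDDocument();
        try {
            for (File source : sources) {
                appendWithScratchFile(merger, destination, source);
            }
            destination.save(target.getAbsolutePath());
        } finally {
            destination.close();
        }
    }

    // Loads one source through its own scratch file, appends it, then
    // releases the source and removes the scratch file.
    private static void appendWithScratchFile(PDFMergerUtility merger,
            PDDocument destination, File source) throws IOException {
        File tmp = File.createTempFile("pdfbox-scratch", ".tmp");
        // This is org.apache.pdfbox.io.RandomAccessFile, not the java.io one.
        RandomAccessFile scratch = new RandomAccessFile(tmp, "rw");
        try {
            PDDocument doc = PDDocument.load(source.getPath(), scratch);
            try {
                merger.appendDocument(destination, doc);
            } finally {
                doc.close();
            }
        } finally {
            scratch.close();
            tmp.delete();
        }
    }
}
```

The bookmarks can be added to the destination document inside merge(), before the save() call. Note that the scratch file only covers the source being loaded; the destination document is still built up in memory here.
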
This will help lessen the memory load.

On Mon, Jun 3, 2013 at 2:50 PM, mihaela olteanu <[email protected]> wrote:

> Hello,
>
> I have a use case where I need to merge a large number of small PDF
> documents (hundreds of thousands) into one PDF document.
> Currently I am using the method
> org.apache.pdfbox.util.PDFMergerUtility.appendDocument(destination, source)
> for all the source documents, rather than calling the mergeDocuments()
> method in the same class directly, because I also need to add some
> bookmarks. Finally I save the document.
>
> Is there a better way of doing this with a lower memory footprint? I tried
> importing each page from the source documents by using the method
> PDDocument.importPage(), but it still throws errors in version 1.8.2.
>
> When I call PDDocument.load(File), is the whole document loaded in memory?
> If so, it means that saving the generated PDF after merging a subset of
> documents and then reloading it would not decrease the memory use anyway ...
>
> Could somebody point me to the right way of doing this?
>
> Thanks,
> Mihaela

