Hi,

Unless I'm missing something, the comment "By default this directory will be
the value of the system property java.io.tmpdir" is wrong.

http://pdfbox.apache.org/apidocs/org/apache/pdfbox/pdfparser/PDFParser.html#setTempDirectory(java.io.File)

It looks like if setTempDirectory is never called, the document is
instantiated using the default COSDocument constructor, which uses
RandomAccessBuffer, i.e. everything is held in memory.

This explains why, after upgrading from 0.7.2 to 1.6, we started getting 'out
of heap space' errors when processing large PDFs. To fix this we now
explicitly set the temp dir, i.e.

parser.setTempDirectory(new File(System.getProperty("java.io.tmpdir")));
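
In case it is useful, here is a minimal sketch of how the directory could be
checked before it is handed to the parser, so that a missing or read-only
temp dir fails early rather than part-way through a large PDF. The class and
method names (TempDirResolver, resolveTempDir) are hypothetical and not part
of PDFBox; only java.io.File and the java.io.tmpdir property are used.

```java
import java.io.File;

class TempDirResolver {

    /**
     * Returns the directory named by the java.io.tmpdir system property,
     * or throws if it is unset, not a directory, or not writable.
     * (Hypothetical helper, not part of PDFBox.)
     */
    static File resolveTempDir() {
        String path = System.getProperty("java.io.tmpdir");
        if (path == null) {
            throw new IllegalStateException("java.io.tmpdir is not set");
        }
        File dir = new File(path);
        if (!dir.isDirectory() || !dir.canWrite()) {
            throw new IllegalStateException(
                    "Temp directory is not a writable directory: " + dir);
        }
        return dir;
    }

    public static void main(String[] args) {
        File tempDir = resolveTempDir();
        System.out.println("Scratch files would go to: " + tempDir.getAbsolutePath());
        // With PDFBox on the classpath, the result would then be passed to
        // the parser before parsing, as in the line above:
        //   parser.setTempDirectory(tempDir);
    }
}
```

The check itself does not change how PDFBox behaves; it only makes sure the
directory we pass to setTempDirectory is usable.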

 

It would be good to make this clear in the Javadoc, or perhaps change the
behaviour back to how it presumably used to be.

Thanks,

Martin

 

___________

Martin Wilson
[email protected]

 


 
