I have now confirmed that the issues my customer had depend on whether a temporary directory is used.

Not using a temporary directory increased the memory usage of a web application from 120 MB to 800 MB while manipulating a PDF of a mere 7 MB.

PDFBox is not used to render the PDF, nor are features like text extraction used. PDFBox is used to parse the PDF, including the page content streams, make some changes to the content, and save the result back to disk.

I don't think that this usage should trigger such an increase in memory. Usually it is the decoding of images into bitmaps that explains high memory usage; maybe this is done unnecessarily in 1.5.0 in the scenario above?

Regards

Michael

