ryuukei created PDFBOX-3852:
-------------------------------
Summary: Overlay a pdf file which is 750 pages ends up in
OutOfMemoryError
Key: PDFBOX-3852
URL: https://issues.apache.org/jira/browse/PDFBOX-3852
Project: PDFBox
Issue Type: Bug
Components: PDModel
Affects Versions: 2.0.6
Environment: Unbuntu, jetty
Reporter: ryuukei
Attachments: 750-pages.pdf, Overlay.patch
We found an issue and solution to fix it, you guys might would be interested to
have a look and see whether it is worth applying the attached patch to benefit
more pdfbox users. :-) And a bit more detail this error happens based on jetty
running time memory setting, and pdf file size.
* Application platform:
Unbuntu, jetty
* The test case to produce this issue:
Add simple overlay to all pages (in this case it is 750 pages). The
processPages function eats up the JVM memories while applying the overlay to
the file.
* sample code for using pdfbox overlay:
{code}
PDDocument document = PDDocument.load( pdf );
HashMap<Integer, String> overlayGuide = new HashMap();
for (int i = 0; i < pagenunber; i++)
{
// "watermarked.pdf" meat to be a file which contains watermarks on the page
overlayGuide.put(i+1, "watermarked.pdf");
}
Overlay overlay = new Overlay();
overlay.setInputPDF( document );
overlay.setOverlayPosition( Overlay.Position.FOREGROUND );
PDDocument overlayResult = overlay.overlay( overlayGuide );
{code}
* Error log:
{code}
INFO | jvm 1 | main | 2017/07/03 13:06:23 | java.lang.OutOfMemoryError:
Java heap space
STATUS | wrapper | main | 2017/07/03 13:06:23 | Filter trigger matched.
Restarting JVM.
INFO | jvm 1 | main | 2017/07/03 13:06:23 | at
org.apache.pdfbox.io.ScratchFile.<init>(ScratchFile.java:128)
INFO | jvm 1 | main | 2017/07/03 13:06:23 | at
org.apache.pdfbox.io.ScratchFile.getMainMemoryOnlyInstance(ScratchFile.java:143)
INFO | jvm 1 | main | 2017/07/03 13:06:23 | at
org.apache.pdfbox.cos.COSStream.<init>(COSStream.java:55)
INFO | jvm 1 | main | 2017/07/03 13:06:23 | at
org.apache.pdfbox.multipdf.Overlay.createStream(Overlay.java:***)
INFO | jvm 1 | main | 2017/07/03 13:06:23 | at
org.apache.pdfbox.multipdf.Overlay.processPages(Overlay.java:364)
INFO | jvm 1 | main | 2017/07/03 13:06:23 | at
org.apache.pdfbox.multipdf.Overlay.overlay(Overlay.java:128)
{code}
* Solution
Apply MemoryUsageSetting to Overlay, allows Overlay to use file as temp output.
* Update for the Overlay usage:
{code}
PDDocument document = PDDocument.load( pdf );
HashMap<Integer, String> overlayGuide = new HashMap();
for (int i = 0; i < pagenunber; i++)
{
overlayGuide.put(i+1, "watermarked.pdf");
}
Overlay overlay = new Overlay();
overlay.setInputPDF( document );
overlay.setOverlayPosition( Overlay.Position.FOREGROUND );
// set overlay to use temp file as out rather than memory
MemoryUsageSetting memoryUsageSetting = MemoryUsageSetting.setupTempFileOnly(
);
memoryUsageSetting.setTempDir( new File ( "someTempWorkingDir" ) );
overlay.setMemoryUsageSetting( memoryUsageSetting );
PDDocument overlayResult = overlay.overlay( overlayGuide );
{code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]