Yachun Miao created PDFBOX-3734:
-----------------------------------
Summary: Out of memory when converting a scanned PDF to an image
Key: PDFBOX-3734
URL: https://issues.apache.org/jira/browse/PDFBOX-3734
Project: PDFBox
Issue Type: Bug
Components: Rendering
Affects Versions: 2.0.5
Environment: win7 64bit, jdk 1.7 64bit
Reporter: Yachun Miao
I have a scanned PDF file of only 2.8 MB. When I try the PDF-to-image feature, I get
an OOM with -Xmx200m:
{color:red}
at java.awt.image.DataBufferByte.<init>(DataBufferByte.java:92)
at java.awt.image.ComponentSampleModel.createDataBuffer(ComponentSampleModel.java:415)
at sun.awt.image.ByteInterleavedRaster.<init>(ByteInterleavedRaster.java:89)
at sun.awt.image.ByteInterleavedRaster.createCompatibleWritableRaster(ByteInterleavedRaster.java:1281)
at sun.awt.image.ByteInterleavedRaster.createCompatibleWritableRaster(ByteInterleavedRaster.java:1292)
at org.apache.pdfbox.filter.DCTFilter.fromBGRtoRGB(DCTFilter.java:246)
at org.apache.pdfbox.filter.DCTFilter.decode(DCTFilter.java:171)
at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:69)
at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:162)
at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:235)
at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.<init>(PDImageXObject.java:124)
at org.apache.pdfbox.pdmodel.graphics.PDXObject.createXObject(PDXObject.java:70)
at org.apache.pdfbox.pdmodel.PDResources.getXObject(PDResources.java:409)
at org.apache.pdfbox.contentstream.operator.graphics.DrawObject.process(DrawObject.java:53)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:838)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:495)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:469)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:150)
at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:206)
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:145)
{color}
After I increased the JVM max heap size to 500 MB, it works.
I know PDF rendering is very difficult, but is there some way to avoid
consuming so much memory? It is a bit surprising that PDFBox uses 500 MB of
memory to handle one page of a scanned PDF (2.8 MB in total). The ratio is around
200 times.
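
To put the ratio in perspective, here is a rough back-of-the-envelope sketch of how large an uncompressed raster can get. The page size (A4) and scan resolution (600 dpi) are assumptions for illustration only; the actual dimensions of the image in the attached file are not known here. The class and method names are hypothetical and not part of PDFBox. The trace above shows DCTFilter.fromBGRtoRGB calling createCompatibleWritableRaster, which suggests a second full-size raster may be allocated while the first is still live, so the sketch also prints the cost of two copies.

```java
// Hypothetical helper, not part of PDFBox: estimates heap used by uncompressed rasters.
final class RasterMemoryEstimate {

    /** Bytes needed for one uncompressed raster with the given dimensions. */
    static long rasterBytes(int widthPx, int heightPx, int bytesPerPixel) {
        // Widen to long before multiplying so large images do not overflow int.
        return (long) widthPx * heightPx * bytesPerPixel;
    }

    /** Pixel count along one side for a length in inches at the given resolution. */
    static int pixelsFor(double inches, int dpi) {
        return (int) Math.round(inches * dpi);
    }

    public static void main(String[] args) {
        // Assumed scan: A4 (8.27 x 11.69 in) at 600 dpi, 3 bytes per pixel (RGB).
        int width = pixelsFor(8.27, 600);
        int height = pixelsFor(11.69, 600);
        long oneRaster = rasterBytes(width, height, 3);

        double mib = 1024.0 * 1024.0;
        System.out.printf("one raster:  %.1f MiB%n", oneRaster / mib);
        // Two rasters alive at once, e.g. source plus a converted copy.
        System.out.printf("two rasters: %.1f MiB%n", 2 * oneRaster / mib);
    }
}
```

Under these assumed numbers the decoded pixels alone are far larger than the compressed file, because JPEG compression does not survive decoding; the size of the PDF on disk says little about the heap needed to render it.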
For me, it would be OK to reduce the quality of the converted image (the
quality of the original image in the PDF is not good either :)). Tell me if
there is such a method; I will help try it.
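
As an illustration of the quality-for-memory trade-off asked about above, one general technique is to subsample a JPEG while decoding it, so that the full-resolution raster is never allocated. The sketch below uses only the JDK's javax.imageio classes, not any PDFBox API; whether PDFBox 2.0.5 offers an equivalent setting is not established here. The class name, method names, and the 400 x 200 test image are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical demo class, not part of PDFBox.
final class SubsampledDecode {

    /** Builds a small in-memory JPEG so the example needs no input file. */
    static byte[] sampleJpeg(int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(img, "jpg", out)) {
                throw new IllegalStateException("no JPEG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Decodes a JPEG keeping only every step-th pixel in each direction. */
    static BufferedImage decodeSubsampled(byte[] jpeg, int step) {
        ImageIO.setUseCache(false); // keep the stream in memory, no temp file
        ImageReader reader = ImageIO.getImageReadersByFormatName("jpeg").next();
        try (ImageInputStream in =
                 ImageIO.createImageInputStream(new ByteArrayInputStream(jpeg))) {
            reader.setInput(in);
            ImageReadParam param = reader.getDefaultReadParam();
            // Same period horizontally and vertically, starting at pixel (0, 0).
            param.setSourceSubsampling(step, step, 0, 0);
            return reader.read(0, param);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            reader.dispose();
        }
    }

    public static void main(String[] args) {
        BufferedImage small = decodeSubsampled(sampleJpeg(400, 200), 4);
        System.out.println(small.getWidth() + " x " + small.getHeight());
    }
}
```

With a subsampling step of 4 in each direction the decoded raster holds roughly one sixteenth of the pixels, at the cost of a visibly coarser image.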
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)