Nehal created PDFBOX-3474:
-----------------------------
Summary: OutOfMemoryError while converting PDF to image
Key: PDFBOX-3474
URL: https://issues.apache.org/jira/browse/PDFBOX-3474
Project: PDFBox
Issue Type: Bug
Components: PDModel
Affects Versions: 2.0.2
Environment: Development
Reporter: Nehal
Priority: Blocker
Our test PDF contains 4-5 pages with one image on each page. Our requirement is
to convert each page to an image. During load testing of this use case (about
10-15 threads), the JVM runs out of memory with the following error:
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.pdfbox.io.ScratchFileBuffer.addPage(ScratchFileBuffer.java:121)
    at org.apache.pdfbox.io.ScratchFileBuffer.ensureAvailableBytesInPage(ScratchFileBuffer.java:184)
    at org.apache.pdfbox.io.ScratchFileBuffer.write(ScratchFileBuffer.java:236)
    at org.apache.pdfbox.io.ScratchFileBuffer.write(ScratchFileBuffer.java:220)
    at org.apache.pdfbox.io.RandomAccessOutputStream.write(RandomAccessOutputStream.java:52)
    at org.apache.pdfbox.filter.DCTFilter.decode(DCTFilter.java:174)
    at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:69)
    at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:163)
    at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:235)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.createInputStream(PDImageXObject.java:565)
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.from8bit(SampledImageReader.java:233)
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.getRGBImage(SampledImageReader.java:138)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getImage(PDImageXObject.java:340)
    at org.apache.pdfbox.rendering.PageDrawer.drawImage(PageDrawer.java:793)
    at org.apache.pdfbox.contentstream.operator.graphics.DrawObject.process(DrawObject.java:62)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:815)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:472)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:446)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:149)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:189)
    at org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:208)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:139)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:94)
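For a rough sense of scale, the following sketch estimates how much heap the rendered pages alone could occupy. The 300 DPI setting, the page count (5), and the thread count (15) come from this report; the US Letter page size (8.5 x 11 in) and the 4 bytes per pixel for an uncompressed RGB raster are assumptions, since the report states neither. The class and method names are hypothetical.

```java
/** Back-of-the-envelope heap estimate for rendered pages (hypothetical helper). */
final class RenderMemoryEstimate {

    /** Bytes needed for one uncompressed raster of the given physical page size. */
    static long bytesPerPage(double widthInches, double heightInches, int dpi, int bytesPerPixel) {
        long widthPx = Math.round(widthInches * dpi);
        long heightPx = Math.round(heightInches * dpi);
        return widthPx * heightPx * bytesPerPixel;
    }

    /** Bytes retained if every thread keeps all of its rendered pages alive at once. */
    static long bytesRetained(long bytesPerPage, int pagesPerDocument, int threads) {
        return bytesPerPage * pagesPerDocument * threads;
    }

    public static void main(String[] args) {
        // Assumed page size: US Letter; assumed raster cost: 4 bytes per pixel.
        long perPage = bytesPerPage(8.5, 11.0, 300, 4);
        // Page and thread counts taken from the upper bounds given in the report.
        long total = bytesRetained(perPage, 5, 15);
        System.out.printf("per page: %d bytes%n", perPage);
        System.out.printf("15 threads x 5 pages: %d bytes%n", total);
    }
}
```

Under those assumptions a single page is about 33.7 MB (2550 x 3300 pixels x 4 bytes), and 15 threads each holding 5 pages would retain about 2.5 GB, before counting the decoded embedded images that appear in the trace.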
CODE:
private List<Image> convertPDFToImage(ByteArrayOutputStream pdf)
        throws AXOLServiceException, IOException {
    List<Image> images = new ArrayList<Image>();
    PDDocument document = PDDocument.load(pdf.toByteArray(), "", null, null,
            MemoryUsageSetting.setupMainMemoryOnly(5347737));
    PDFRenderer pdfRenderer = new PDFRenderer(document);
    long contentLength = 0;
    for (int page = 0; page < document.getNumberOfPages(); ++page) {
        BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);
        ByteArrayOutputStream tmp = new ByteArrayOutputStream();
        ImageIO.write(bim, "jpeg", tmp);
        tmp.flush();
        tmp.close();
        bim.flush();
        contentLength += tmp.size();
        if (contentLength > 5347737) {
            document.close();
            throw new RuntimeException("Exceeds maximum allowed size limit",
                    EAUDocumentService.EAU_ERR_CODE_EXCCEDS_SIZE_LIMIT, "");
        }
        images.add(bim);
    }
    document.close();
    return images;
}
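The method above measures the JPEG size of each page but still returns the uncompressed BufferedImage objects, so every rendered raster stays reachable until the caller drops the list. One possible way to shrink what is retained, offered as an assumption and not as a confirmed fix for this issue, is to keep only the encoded JPEG bytes and enforce the size limit on those. The sketch below uses JDK classes only; the class name, method names, and the use of IOException for the limit are hypothetical, and the rendering call is left to the caller.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

/** Collects pages as JPEG bytes so the uncompressed rasters can be garbage collected. */
final class JpegPageCollector {
    private final long maxTotalBytes;
    private final List<byte[]> pages = new ArrayList<>();
    private long totalBytes = 0;

    JpegPageCollector(long maxTotalBytes) {
        this.maxTotalBytes = maxTotalBytes;
    }

    /** Encodes one rendered page and keeps only the compressed bytes. */
    void add(BufferedImage page) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no suitable writer is found.
        if (!ImageIO.write(page, "jpeg", out)) {
            throw new IOException("No JPEG writer available for this image type");
        }
        page.flush(); // release any cached resources held by the raster
        byte[] encoded = out.toByteArray();
        if (totalBytes + encoded.length > maxTotalBytes) {
            throw new IOException("Exceeds maximum allowed size limit");
        }
        totalBytes += encoded.length;
        pages.add(encoded);
    }

    List<byte[]> pages() {
        return pages;
    }

    long totalBytes() {
        return totalBytes;
    }
}
```

A caller would pass each image returned by renderImageWithDPI to add() inside the page loop and let the BufferedImage reference go out of scope afterwards. Whether this is enough depends on heap size and concurrency, which this report does not specify.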