[
https://issues.apache.org/jira/browse/PDFBOX-1715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tilman Hausherr closed PDFBOX-1715.
-----------------------------------
Resolution: Cannot Reproduce
I'm closing this issue since you apparently didn't respond.
Btw -Xmx512m isn't much. Try desperate measures, e.g. -Xmx2g :-)
You can reopen anytime, if you have further information. Also try the current
version.
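
To check that a larger -Xmx setting is actually picked up by the JVM that runs the extraction, a minimal sketch like the following can be started with the same options. The class name HeapCheck and the helper toMegabytes are hypothetical and are not part of PDFBox; only standard java.lang.Runtime calls are used.

```java
class HeapCheck {
    // Converts a byte count to whole mebibytes (integer division).
    public static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the upper limit the JVM will try to use for the heap,
        // which is what -Xmx controls.
        System.out.println("Max heap:   " + toMegabytes(rt.maxMemory()) + " MB");
        // totalMemory() is the heap currently reserved; freeMemory() is the
        // unused part of that reservation.
        System.out.println("Total heap: " + toMegabytes(rt.totalMemory()) + " MB");
        System.out.println("Free heap:  " + toMegabytes(rt.freeMemory()) + " MB");
    }
}
```

Run it as, for example, java -Xmx2g HeapCheck. The reported maximum can come out somewhat below the requested value, depending on the JVM and garbage collector in use.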
> java.lang.OutOfMemoryError when extracting images
> -------------------------------------------------
>
> Key: PDFBOX-1715
> URL: https://issues.apache.org/jira/browse/PDFBOX-1715
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 1.8.1
> Environment: LSB Version:
> :core-4.0-amd64:core-4.0-ia32:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-ia32:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-ia32:printing-4.0-noarch
> Distributor ID: CentOS
> Description: CentOS release 4.7 (Final)
> Release: 4.7
> Codename: Final
> Java 1.6.0
> Reporter: sarathy
>
> We are trying to extract images from a PDF file. As part of that, we are
> converting each PDPage into an image using the PDPage.convertToImage method.
> It's a 52-page document.
> During the conversion we get the stack trace shown further down.
> Here are the steps:
> PDDocument document = PDDocument.load(inputStream);
> List<PDPage> pages = document.getDocumentCatalog().getAllPages();
> for (PDPage pdPage : pages) {
>     if (pdPage.getResources() != null && pdPage.getResources().getImages() != null) {
>         PageInfo page = new PageInfo(pdPage, true, rotation);
>         ...
>     }
> }
> In PageInfo, we are doing:
> BufferedImage bimage = page.convertToImage();
> After processing about 12 pages, it fails as follows:
> java.lang.OutOfMemoryError: Java heap space
> at org.apache.pdfbox.io.RandomAccessBuffer.expandBuffer(RandomAccessBuffer.java:263)
> at org.apache.pdfbox.io.RandomAccessBuffer.write(RandomAccessBuffer.java:222)
> at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:108)
> at java.io.OutputStream.write(OutputStream.java:75)
> at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:102)
> at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:295)
> at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:237)
> at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:172)
> at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:231)
> at org.apache.pdfbox.pdmodel.common.PDStream.getByteArray(PDStream.java:509)
> at org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap.getRGBImage(PDPixelMap.java:185)
> at org.apache.pdfbox.util.operator.pagedrawer.Invoke.process(Invoke.java:83)
> at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:554)
> at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
> at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
> at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
> at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:125)
> at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:781)
> at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:712)
> at oss.rcpt.PageInfo.<init>(PageInfo.java:328)
> at oss.utl.PDFImageSplitter.execute(PDFImageSplitter.java:217)
> at oss.utl.PDFUtilities.getImageCount(PDFUtilities.java:165)
> at cms.utl.PDFImageOperations.main(PDFImageOperations.java:157)
> When we run this from the command line:
> * if we set -Xms512m and -Xmx512m, it fails after 12 pages.
> * if we set -Xms1024m and -Xmx1024m, it fails after 27 pages.
> On the side, we are also getting "Colour key masking isn't supported" message
> for each image in the file.
--
This message was sent by Atlassian JIRA
(v6.2#6252)