Have you tried increasing Java's maximum heap size (the -Xmx option)?  I had
to do that for some large documents, at least with an earlier version of PDFBox.

http://www.devx.com/tips/Tip/5578
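
In case it helps, here is a rough sketch of how to check what heap limit the
JVM is actually running with. The class name HeapCheck is only a placeholder
of mine, not part of PDFBox; Runtime.maxMemory() is the standard JDK call.
I can't say for certain that a larger heap cures this particular error, but
it is cheap to try.

```java
// HeapCheck is a hypothetical helper name, not part of PDFBox.
class HeapCheck {
    /** Maximum heap the JVM will try to use, in MiB (follows the -Xmx setting). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Print the limit so you can confirm that -Xmx took effect.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Run it as "java -Xmx512m HeapCheck" and compare with plain "java HeapCheck".
The same -Xmx option goes on whatever java command line you use to start
your text extraction; 512m is just an example value.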

On Tue, Apr 28, 2009 at 11:39 AM, Paul Sowden (JIRA) <[email protected]> wrote:

>
>    [
> https://issues.apache.org/jira/browse/PDFBOX-453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703681#action_12703681]
>
> Paul Sowden commented on PDFBOX-453:
> ------------------------------------
>
> Is anyone looking into this? I'm trying to work around the issue but with
> little success :-(
>
> > FlateFilter decode() throwing OutOfMemoryError
> > ----------------------------------------------
> >
> >                 Key: PDFBOX-453
> >                 URL: https://issues.apache.org/jira/browse/PDFBOX-453
> >             Project: PDFBox
> >          Issue Type: Bug
> >          Components: Text extraction
> >         Environment: OSX, Windows, Centos 5.2
> >            Reporter: Paul Sowden
> >         Attachments: s417sec_1.pdf, s4uk12ter_1.pdf
> >
> >
> > When parsing certain PDF files an OutOfMemoryError occurs at FlateFilter line 100. The files in question are not big.
> > An exception occured in        parsing the PDF Document.
> > java.lang.OutOfMemoryError
> >       at java.util.zip.Inflater.inflateBytes(Native Method)
> >       at java.util.zip.Inflater.inflate(Inflater.java:221)
> >       at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:135)
> >       at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:100)
> >       at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:279)
> >       at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:221)
> >       at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:156)
> >       at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:367)
> >       at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:325)
> >       at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:50)
> >       at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:493)
> >       at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:214)
> >       at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:173)
> >       at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:335)
> >       at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:259)
> >       at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:215)
> >       at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:148)
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
