[ 
https://issues.apache.org/jira/browse/PDFBOX-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mel Martinez updated PDFBOX-536:
--------------------------------

    Attachment: PDFXrefStreamParser.java
                09_05_11_Archiv.pdf

The attached PDF (09_05_11_Archiv.pdf) triggers the bug during text extraction.

The attached replacement PDFXrefStreamParser.java source file fixes the problem.

> missing iterator.hasNext() test in PDFXrefStreamParser
> ------------------------------------------------------
>
>                 Key: PDFBOX-536
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-536
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>            Reporter: Mel Martinez
>         Attachments: 09_05_11_Archiv.pdf, PDFXrefStreamParser.java
>
>
> The class org.apache.pdfbox.pdfparser.PDFXrefStreamParser
> advances an iterator in its parser method without first checking hasNext().
> Specifically, line 100 should be changed from:
>             while(pdfSource.available() > 0)
> to:
>             while(pdfSource.available() > 0 && objIter.hasNext())
> Without this check, line 115 throws a NoSuchElementException once the
> iterator is exhausted while the stream still has bytes available.
> I will attach a test file that triggers the problem (during text extraction)
> and also a patched version of PDFXrefStreamParser.java.
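
The guard proposed in the description can be illustrated with a small standalone sketch. This is not the PDFBox source: the class XrefLoopSketch and the method readEntries are hypothetical names, and a plain InputStream stands in for pdfSource. It only shows how adding objIter.hasNext() to the loop condition stops the loop cleanly when the stream holds more data than the iterator has elements.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the parser loop; not the actual PDFXrefStreamParser code.
public class XrefLoopSketch {

    // Pairs each object number from objIter with one byte read from source.
    // The loop ends as soon as either the stream or the iterator is exhausted.
    static List<String> readEntries(InputStream source, Iterator<Integer> objIter)
            throws IOException {
        List<String> entries = new ArrayList<>();
        // Without the hasNext() test, objIter.next() below would throw
        // NoSuchElementException when the stream outlasts the iterator.
        while (source.available() > 0 && objIter.hasNext()) {
            int objNr = objIter.next();
            int value = source.read();
            entries.add(objNr + ":" + value);
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Three bytes of data but only two object numbers.
        InputStream source = new ByteArrayInputStream(new byte[] {10, 20, 30});
        Iterator<Integer> objIter = Arrays.asList(1, 2).iterator();
        System.out.println(readEntries(source, objIter)); // prints [1:10, 2:20]
    }
}
```

With the guard in place the surplus byte is simply left unread. Whether the real parser should also report such a mismatch is a separate design question that this sketch does not address.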

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
