[ https://issues.apache.org/jira/browse/PDFBOX-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mel Martinez updated PDFBOX-536:
--------------------------------

    Attachment: PDFXrefStreamParser.java
                09_05_11_Archiv.pdf

This PDF triggers the bug during text extraction. The replacement PDFXrefStreamParser.java source file fixes the problem.

> missing iterator.hasNext() test in PDFXrefStreamParser
> ------------------------------------------------------
>
>                 Key: PDFBOX-536
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-536
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>            Reporter: Mel Martinez
>         Attachments: 09_05_11_Archiv.pdf, PDFXrefStreamParser.java
>
>
> The class org.apache.pdfbox.pdfparser.PDFXrefStreamParser
> uses an unbounded iterator in its parser method.
> Specifically, line 100 should be changed from:
>     while(pdfSource.available() > 0)
> to:
>     while(pdfSource.available() > 0 && objIter.hasNext())
> Not having this check causes line 115 to blow up with a
> NoSuchElementException.
> I will attach a test file that triggers the problem (during text extraction)
> and also a patched version of PDFXrefStreamParser.java.
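For readers without the attachment, here is a minimal sketch of the guarded loop described in the report. It is not the PDFBox source: the class name `GuardedIteratorSketch`, the method `pairEntries`, and the one-byte-per-entry layout are hypothetical and only illustrate why the loop needs both conditions. Without the `hasNext()` test, `objIter.next()` throws NoSuchElementException as soon as the stream holds more data than there are object numbers.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the parser loop; not the actual PDFBox code.
class GuardedIteratorSketch {

    // Pairs each object number with one byte of stream data (an assumed,
    // simplified entry layout). The loop stops when EITHER the stream or
    // the iterator is exhausted, so next() is never called on an empty iterator.
    public static List<String> pairEntries(InputStream source, Iterator<Integer> objIter)
            throws IOException {
        List<String> entries = new ArrayList<>();
        while (source.available() > 0 && objIter.hasNext()) {
            int data = source.read();
            int objNumber = objIter.next();
            entries.add(objNumber + ":" + data);
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Three bytes of data but only two object numbers: the unguarded
        // loop would fail on the third pass.
        InputStream source = new ByteArrayInputStream(new byte[] {10, 20, 30});
        Iterator<Integer> objIter = Arrays.asList(1, 2).iterator();
        System.out.println(pairEntries(source, objIter)); // prints [1:10, 2:20]
    }
}
```

Note that the guard silently ignores any trailing stream data. Whether that is acceptable, or whether the leftover bytes should be logged as a sign of a malformed file, is a design choice the report does not address.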