[ https://issues.apache.org/jira/browse/PDFBOX-503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-503.
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.8.0-incubator

I've added Dave's patch with version 810130. Thanks to Dave for the contribution.

> PDF loader causes infinite loop on non-PDF inputs
> -------------------------------------------------
>
>                 Key: PDFBOX-503
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-503
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>            Reporter: Dave Engberg
>             Fix For: 0.8.0-incubator
>
>
> The current SVN head for the pdfbox incubator will experience an infinite
> loop in PDFParser.parseHeader() if you feed any non-PDF document to the
> parser. The problem is that it tries to find the PDF header within the
> document by skipping over any non-matching lines which don't start with a
> numeric digit. It relies on a readLine() function from BaseParser.java which
> will return an empty string when the stream is at the end-of-file. The
> parseHeader() call will loop on these empty lines.
> I've patched this in our system by throwing an IOException from
> BaseParser.readLine() if the stream is already at the end-of-file at the
> beginning of that call.
> Index: src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java
> ===================================================================
> --- src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java	(revision 802578)
> +++ src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java	(working copy)
> @@ -1088,6 +1088,11 @@
>      {
>          StringBuffer buffer = new StringBuffer( 11 );
>  
> +        if (pdfSource.isEOF())
> +        {
> +            throw new IOException( "Error: End-of-File, expected line");
> +        }
> +
>          int c;
>          while ((c = pdfSource.read()) != -1)
>          {

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
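
For readers without the PDFBox sources at hand, the failure mode and the fix described in the issue can be sketched in isolation. This is a minimal, standalone illustration and not the PDFBox code: the class EofGuardSketch and its methods are hypothetical, a java.io.PushbackInputStream stands in for pdfSource, and the header search is simplified to "skip lines until one starts with %PDF-". Without the end-of-file check at the top of readLine(), that method would keep returning empty strings at end-of-file and the loop in findHeader() would never exit on non-PDF input.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names and structure are assumptions, not PDFBox's API.
final class EofGuardSketch {

    // Stand-in for pdfSource.isEOF(): peek one byte and push it back.
    static boolean isEOF(PushbackInputStream in) throws IOException {
        int c = in.read();
        if (c == -1) {
            return true;
        }
        in.unread(c);
        return false;
    }

    // Reads one line. Throws at end-of-file instead of returning "",
    // which is the guard the patch adds.
    static String readLine(PushbackInputStream in) throws IOException {
        if (isEOF(in)) {
            throw new IOException("Error: End-of-File, expected line");
        }
        StringBuilder buffer = new StringBuilder(11);
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n') {
                break;
            }
            if (c == '\r') {
                // Treat "\r\n" as a single line terminator.
                int next = in.read();
                if (next != '\n' && next != -1) {
                    in.unread(next);
                }
                break;
            }
            buffer.append((char) c);
        }
        return buffer.toString();
    }

    // Simplified header search: skip lines until one starts with "%PDF-".
    // On non-PDF input this now ends with an IOException from readLine().
    static String findHeader(byte[] data) throws IOException {
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(data));
        String line = readLine(in);
        while (!line.startsWith("%PDF-")) {
            line = readLine(in);
        }
        return line;
    }

    public static void main(String[] args) {
        byte[] notPdf = "just some text\n".getBytes(StandardCharsets.US_ASCII);
        try {
            System.out.println(findHeader(notPdf));
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Throwing from readLine() keeps the caller's loop unchanged, at the cost that every caller of readLine() now sees an exception at end-of-file rather than an empty string.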