[ https://issues.apache.org/jira/browse/PDFBOX-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hacho updated PDFBOX-537:
-------------------------

    Attachment: TestPDFBOX537.java
                corrupt-endless-loop-in-0.8.pdf

The attached file [corrupt-endless-loop-in-0.8.pdf] causes an endless loop in parseCOSDictionary() with the example code below. I am also attaching a JUnit test case containing the same code.

    public void testEndlessLoop() throws IOException
    {
        FileInputStream fin = new FileInputStream("testfiles/corrupt-endless-loop-in-0.8.pdf");
        try
        {
            PDDocument pdfDoc = PDDocument.load(fin);
            try
            {
                @SuppressWarnings("unchecked")
                List<PDPage> pageList = (List<PDPage>) pdfDoc.getDocumentCatalog().getAllPages();
                int pageCntr = 0;
                // Walk every page and read its content stream.
                for( PDPage page: pageList )
                {
                    pageCntr++;
                    byte[] content = page.getContents().getByteArray();
                    if ( null == content )
                    {
                        //errors.add("Err on page " + pageCntr );
                    }
                }
            }
            finally
            {
                pdfDoc.close();
            }
        }
        finally
        {
            fin.close();
        }
    }

> Endless loop in org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary() on certain corrupt PDF streams
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-537
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-537
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 0.8.0-incubator
>         Environment: The problem occurs on certain corrupt streams. I have a
> file which is 417 bytes long and has this problem. I don't see a way to
> attach the file here. Let me know how I can do so, or to whom I should send
> it. I think it would be a good idea to add this file to the SVN repository
> and maybe create a test case for it, unless there is a test case that already
> checks a folder with test files.
>            Reporter: Hacho
>         Attachments: corrupt-endless-loop-in-0.8.pdf, TestPDFBOX537.java
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The issue seems to have been introduced on 01-Sep-2009 in svn revision 810122
> with the addition of the loop to wait for a valid dictionary:
>
> Index: PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java
> ===================================================================
> --- PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java    (revision 793364)
> +++ PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java    (revision 810122)
> @@ -183,7 +183,23 @@
>                  if( c == '>')
>                  {
>                      done = true;
> -                }
> +                }
> +                else
> +                if(c != '/')
> +                {
> +                    //an invalid dictionary, we are expecting
> +                    //the key, read until we can recover
> +                    logger().warning("Invalid dictionary, found:" + (char)c + " but expected:\''");
> +                    int read = pdfSource.read();
> +                    while(read != -1 && read != '/' && read != '>')
> +                    {
> +                        read = pdfSource.read();
> +                    }
> +                    if(read != -1)
> +                    {
> +                        pdfSource.unread(read);
> +                    }
> +                }
>                  else
>                  {
>                      COSName key = parseCOSName();
> @@ -206,9 +222,12 @@
>
>                      if( value == null )
>                      {
> -                        throw new IOException("Bad Dictionary Declaration " + pdfSource );
> +                        logger().warning("Bad Dictionary Declaration " + pdfSource );
>                      }
> -                    obj.setItem( key, value );
> +                    else
> +                    {
> +                        obj.setItem( key, value );
> +                    }
>                  }
>              }
>              char ch = (char)pdfSource.read();
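
For illustration only, here is a minimal standalone sketch of how a recovery loop of this shape could be made to terminate. It rests on an assumption that is not confirmed for the attached file: that the inner loop reaches end of stream (read == -1), unreads nothing, and the outer loop then never sees '>' and keeps spinning. The class DictionaryRecoverySketch and the methods skipToKeyOrEnd and countKeys are hypothetical names, not PDFBox code, and java.io.PushbackInputStream stands in for pdfSource.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical toy parser, NOT the PDFBox implementation.
class DictionaryRecoverySketch {

    // Skips bytes until the next '/' (start of a key) or '>' (end of the
    // dictionary) and pushes that byte back. Returns false at end of stream,
    // so the caller can stop instead of looping again.
    static boolean skipToKeyOrEnd(PushbackInputStream in) throws IOException {
        int read = in.read();
        while (read != -1 && read != '/' && read != '>') {
            read = in.read();
        }
        if (read == -1) {
            return false;
        }
        in.unread(read);
        return true;
    }

    // Counts '/' key markers in a simplified dictionary body that is
    // terminated by '>'. Everything between markers is treated as junk
    // to be skipped, which is a deliberate simplification.
    static int countKeys(byte[] data) throws IOException {
        PushbackInputStream in =
            new PushbackInputStream(new ByteArrayInputStream(data));
        int keys = 0;
        boolean done = false;
        while (!done) {
            int c = in.read();
            if (c == -1) {
                // Assumed failure mode: without this check the loop never ends.
                throw new IOException("Unexpected end of stream in dictionary");
            }
            if (c == '>') {
                done = true;
            } else if (c == '/') {
                keys++;
            } else {
                in.unread(c);
                if (!skipToKeyOrEnd(in)) {
                    throw new IOException("Unexpected end of stream in dictionary");
                }
            }
        }
        return keys;
    }

    public static void main(String[] args) throws IOException {
        byte[] ok = "/A 1 /B 2 >".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(countKeys(ok));
    }
}
```

In this sketch a truncated input such as "/A 1 xx" raises an IOException rather than hanging. Whether the real parser should throw or log a warning and return at end of stream is a separate design decision.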