Tilman Hausherr created PDFBOX-2306:
---------------------------------------
Summary: Error reading stream, expected='endstream' actual='endobj'
Key: PDFBOX-2306
URL: https://issues.apache.org/jira/browse/PDFBOX-2306
Project: PDFBox
Issue Type: Bug
Components: Parsing
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
Assignee: Tilman Hausherr
Fix For: 2.0.0

I get this exception with the file of PDFBOX-269:
{code}
java.io.IOException: Error reading stream, expected='endstream' actual='endobj' at offset 183468
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseCOSStream(NonSequentialPDFParser.java:1578)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseObjectDynamically(NonSequentialPDFParser.java:1249)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseObjectDynamically(NonSequentialPDFParser.java:1176)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseDictObjects(NonSequentialPDFParser.java:1152)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:487)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:755)
    at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1155)
    at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1138)
{code}

The cause is that a stream ends with endobj instead of endstream. This is accepted in the non sequential parser in readUntilEndStream() but later it isn't. It is a problem that was fixed in the old parser many years ago.

My fix is for the sequential parser. I also changed a misleading error message nearby.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
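
For illustration only, the lenient handling described in the report (tolerating endobj where endstream is expected) could be sketched roughly as below. This is not the PDFBox code or the committed fix: the class StreamEndCheck, the enum StreamEnd and the method classify are hypothetical names, and the sketch assumes the keyword following the stream data has already been read as a string.

```java
// Hypothetical sketch, not taken from PDFBox: classify the keyword that
// follows the raw stream data so a caller can tolerate a missing "endstream".
public class StreamEndCheck {

    /** Possible outcomes when checking the keyword after the stream data. */
    public enum StreamEnd {
        PROPER,             // stream closed with "endstream" as expected
        MISSING_ENDSTREAM,  // stream closed directly by "endobj" (malformed, tolerable)
        INVALID             // anything else; a caller would report an error
    }

    public static StreamEnd classify(String keyword) {
        if ("endstream".equals(keyword)) {
            return StreamEnd.PROPER;
        }
        if ("endobj".equals(keyword)) {
            // Malformed file: "endstream" was omitted. A lenient parser can
            // treat the stream as finished and leave "endobj" for the
            // enclosing object parser to consume.
            return StreamEnd.MISSING_ENDSTREAM;
        }
        return StreamEnd.INVALID;
    }

    public static void main(String[] args) {
        System.out.println(classify("endstream"));
        System.out.println(classify("endobj"));
        System.out.println(classify("xref"));
    }
}
```

In this sketch only the INVALID case would lead to an exception such as the "Error reading stream" message quoted above; the other two cases let parsing continue.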