[
https://issues.apache.org/jira/browse/PDFBOX-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andreas Lehmkühler resolved PDFBOX-3292.
----------------------------------------
Resolution: Fixed
COSParser#checkXRefStreamOffset only checked whether a dictionary could be
found at the given offset. In rare cases, such as the given PDFs, there is a
dictionary at that offset, but it is not the one we are looking for. I've
therefore implemented an additional check to verify that the dictionary at the
given offset is the right one, and now everything works fine.
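
To illustrate the difference between the two checks described above, here is a minimal, self-contained sketch. It is not the PDFBox code: the class name XRefOffsetCheck, both method names, and the regex-based scanning are hypothetical, and the assumption that "the right one" means a dictionary declaring /Type /XRef is mine, based on how the PDF specification marks cross-reference streams.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not the PDFBox implementation.
public class XRefOffsetCheck {
    // "<object number> <generation> obj" followed by the start of a dictionary.
    private static final Pattern OBJ_HEADER =
            Pattern.compile("\\s*\\d+\\s+\\d+\\s+obj\\s*<<");
    // Assumption: the wanted dictionary declares /Type /XRef.
    private static final Pattern TYPE_XREF = Pattern.compile("/Type\\s*/XRef\\b");

    private static String tail(byte[] pdf, int offset) {
        return new String(pdf, offset, pdf.length - offset, StandardCharsets.ISO_8859_1);
    }

    /** Weak check: is there an object with any dictionary at the offset? */
    public static boolean hasDictionaryAt(byte[] pdf, int offset) {
        if (offset < 0 || offset >= pdf.length) {
            return false;
        }
        return OBJ_HEADER.matcher(tail(pdf, offset)).lookingAt();
    }

    /** Stricter check: the dictionary at the offset must also be an XRef stream dictionary. */
    public static boolean hasXRefStreamDictionaryAt(byte[] pdf, int offset) {
        if (offset < 0 || offset >= pdf.length) {
            return false;
        }
        String text = tail(pdf, offset);
        Matcher header = OBJ_HEADER.matcher(text);
        if (!header.lookingAt()) {
            return false;
        }
        // Naive scan for the ">>" that closes the dictionary, tracking nesting depth.
        // A real parser would tokenize properly (strings, comments, and so on).
        int depth = 1;
        int i = header.end();
        while (i < text.length() - 1 && depth > 0) {
            if (text.startsWith("<<", i)) {
                depth++;
                i += 2;
            } else if (text.startsWith(">>", i)) {
                depth--;
                i += 2;
            } else {
                i++;
            }
        }
        if (depth != 0) {
            return false; // dictionary never closed
        }
        // Simplification: a /Type /XRef entry in a nested dictionary would also match.
        String dictionary = text.substring(header.end(), i - 2);
        return TYPE_XREF.matcher(dictionary).find();
    }
}
```

With this sketch, an offset that points at an ordinary object such as a page dictionary passes the weak check but fails the stricter one, which is the kind of false positive the resolution describes.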
[[email protected]] Thanks for the report!
> Error reading stream, expected='endstream' actual='' in non-truncated files
> ---------------------------------------------------------------------------
>
> Key: PDFBOX-3292
> URL: https://issues.apache.org/jira/browse/PDFBOX-3292
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 2.0.0
> Reporter: Tim Allison
> Assignee: Andreas Lehmkühler
> Priority: Minor
> Fix For: 2.0.1, 2.1.0
>
>
> When PDF files are truncated, one of the most common exceptions in PDFBox
> 2.0.0 is:
> {noformat}
> java.io.IOException: Error reading stream, expected='endstream' actual='' at offset 165888
>     at org.apache.pdfbox.pdfparser.COSParser.parseCOSStream(COSParser.java:999)
>     at org.apache.pdfbox.pdfparser.COSParser.parseXrefObjStream(COSParser.java:326)
>     at org.apache.pdfbox.pdfparser.COSParser.parseXref(COSParser.java:287)
>     at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:192)
>     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:249)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:847)
> {noformat}
> There are two files in govdocs1 that are NOT truncated and trigger this
> exception in 2.0.0, but were parsed by PDFBox 1.8.11 with the classic parser.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)