[
https://issues.apache.org/jira/browse/PDFBOX-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225872#comment-14225872
]
Tilman Hausherr commented on PDFBOX-2523:
-----------------------------------------
Indeed, this occurs only with the non-sequential parser (second test from
yesterday in TIKA-1442).
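
To illustrate the kind of failure reported in the trace quoted below, here is a minimal, hypothetical sketch: a reader that expects a numeric token at a given offset and finds the keyword 'xref' instead. This is not PDFBox's BaseParser.readLong. The class ReadLongSketch and its readLong(String, int) method are invented for illustration, and the error text only imitates the reported message.

```java
import java.io.IOException;

class ReadLongSketch {

    // Hypothetical helper (not PDFBox code): reads the whitespace-delimited
    // token starting at 'offset' and parses it as a long.
    static long readLong(String data, int offset) throws IOException {
        int end = offset;
        while (end < data.length() && !Character.isWhitespace(data.charAt(end))) {
            end++;
        }
        String token = data.substring(offset, end);
        try {
            return Long.parseLong(token);
        } catch (NumberFormatException e) {
            // Message format imitates the one in the report; it is an assumption.
            throw new IOException("Error: Expected a long type at offset "
                    + offset + ", instead got '" + token + "'");
        }
    }

    public static void main(String[] args) {
        try {
            // An object header such as "12 0 obj" starts with a number.
            System.out.println(readLong("12 0 obj", 0));
            // A classic cross-reference table starts with the keyword "xref".
            System.out.println(readLong("xref\n0 1\n", 0));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The sketch suggests one plausible reading of the trace: the parser expected an object number (as at the start of a cross-reference stream object) but the offset pointed at a plain 'xref' table. Whether that is the actual cause for the attached file is not confirmed here.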
> IOException: Error: Expected a long type at offset 1218571, instead got 'xref'
> ------------------------------------------------------------------------------
>
> Key: PDFBOX-2523
> URL: https://issues.apache.org/jira/browse/PDFBOX-2523
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.8.8, 2.0.0
> Reporter: Tilman Hausherr
> Attachments: 853115.pdf
>
>
> I get this with the attached file when using the non-sequential parser:
> {code}
> Exception in thread "main" java.io.IOException: Error: Expected a long type at offset 1218571, instead got 'xref'
>     at org.apache.pdfbox.pdfparser.BaseParser.readLong(BaseParser.java:1689)
>     at org.apache.pdfbox.pdfparser.BaseParser.readObjectNumber(BaseParser.java:1617)
>     at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseXrefObjStream(NonSequentialPDFParser.java:746)
>     at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseXref(NonSequentialPDFParser.java:697)
>     at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:480)
>     at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:1013)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:951)
> {code}