[
https://issues.apache.org/jira/browse/PDFBOX-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andreas Lehmkühler updated PDFBOX-2351:
---------------------------------------
Fix Version/s: 1.8.8
> /XRefStm content missing in saved file
> ---------------------------------------
>
> Key: PDFBOX-2351
> URL: https://issues.apache.org/jira/browse/PDFBOX-2351
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 2.0.0
> Reporter: Tilman Hausherr
> Fix For: 1.8.8, 2.0.0
>
>
> Do this:
> - open one of the files immo-kurier_arsenal_93x62.pdf, PDFBOX-1577.pdf,
> PDFBOX-1756-436857.pdf, PDFBOX-2251-070075.pdf, test-landscape2.pdf (or any
> other file that has an /XRefStm) with loadNonSeq()
> - call getDocumentCatalog()
> - save to another file
> - open that file with loadNonSeq(), which fails:
> {code}
> java.io.IOException: Error: Expected a long type at offset 688, instead got '"'
> at org.apache.pdfbox.pdfparser.BaseParser.readLong(BaseParser.java:1718)
> at org.apache.pdfbox.pdfparser.BaseParser.readObjectNumber(BaseParser.java:1645)
> at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseXrefObjStream(NonSequentialPDFParser.java:548)
> at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:410)
> at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:794)
> at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1156)
> {code}
> The saved file still has the old /XRefStm value, but the xref stream content
> it points to is missing. I debugged a bit and it is confusing: when the
> original file is loaded, the /XRefStm is never read; instead the /Prev is
> used, which leads to an old-style xref table. When saving, doWriteXRef()
> keeps the existing /XRefStm value even if PDFBox "believes" it has no
> XRefStream, whereas doWriteXRefInc() is smarter and deletes the item if
> there is no XRefStream.
> I haven't tested this with 1.8. We should test it there once there is a fix.
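
The symptom described above (an /XRefStm value that no longer points at any content) can be checked on a saved file without PDFBox. The following is only a rough sketch: the class and method names are hypothetical, it uses nothing but the JDK, and it scans the raw bytes as text, so it is a heuristic and not a real PDF parser. It assumes that a valid /XRefStm offset must land on an indirect object header of the form "<number> <generation> obj".

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
class XRefStmCheck {
    // "/XRefStm <offset>" as it appears in a trailer dictionary.
    private static final Pattern XREF_STM =
            Pattern.compile("/XRefStm\\s+(\\d{1,10})");
    // Indirect object header: "<objNr> <genNr> obj".
    private static final Pattern OBJ_HEADER =
            Pattern.compile("\\d+\\s+\\d+\\s+obj");

    /**
     * Returns true if every /XRefStm offset found in the bytes points at
     * something that looks like an object header (or if there is no
     * /XRefStm entry at all).
     */
    static boolean xrefStmTargetsLookValid(byte[] pdf) {
        // ISO-8859-1 maps each byte to one char, so char index == byte offset.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        Matcher m = XREF_STM.matcher(text);
        while (m.find()) {
            long offset = Long.parseLong(m.group(1));
            if (offset >= text.length()) {
                return false; // offset points past the end of the file
            }
            Matcher obj = OBJ_HEADER.matcher(text);
            obj.region((int) offset, text.length());
            if (!obj.lookingAt()) {
                return false; // stale offset: no object starts here
            }
        }
        return true;
    }
}
```

Passing the saved file's bytes (for example from java.nio.file.Files.readAllBytes) and getting false would be consistent with the "Expected a long type" failure in the stack trace, since the parser expects an object number at that offset.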