[ https://issues.apache.org/jira/browse/PDFBOX-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255787#comment-14255787 ]
Arthur Blake commented on PDFBOX-2527:
--------------------------------------
OK, thanks. I can try that. I may also look into the code to see whether
I can help fix both of these bugs.
> IOException: Negative seek offset in NonSequentialPDFParser
> -----------------------------------------------------------
>
> Key: PDFBOX-2527
> URL: https://issues.apache.org/jira/browse/PDFBOX-2527
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.8.8, 2.0.0
> Reporter: Tilman Hausherr
> Assignee: Andreas Lehmkühler
> Priority: Minor
> Fix For: 1.8.9, 2.0.0
>
> Attachments: PDFBOX-2527-069020.pdf
>
>
> {code}
> Exception in thread "main" java.io.IOException: Negative seek offset
> at java.io.RandomAccessFile.seek(Native Method)
> at org.apache.pdfbox.io.RandomAccessBufferedFileInputStream.seek(RandomAccessBufferedFileInputStream.java:116)
> at org.apache.pdfbox.io.PushBackInputStream.seek(PushBackInputStream.java:234)
> at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:492)
> at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:1013)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:951)
> at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:897)
> at org.apache.pdfbox.tools.PDFReader.parseDocument(PDFReader.java:375)
> at org.apache.pdfbox.tools.PDFReader.openPDFFile(PDFReader.java:340)
> at org.apache.pdfbox.tools.PDFReader.main(PDFReader.java:326)
> at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:80)
> {code}
> This happens with several malformed PDFs from the test set in TIKA-1442.
> These files (303385, 069020, 303385, 742141, 982996) all have some trash at
> the end.
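
For anyone reproducing this, here is a minimal sketch of one way to scan the
tail of a file for the last "startxref" keyword without ever passing a
negative offset to seek(). It is not PDFBox code and not the actual fix for
this issue. The class name TailScan, its method names, and the 2048-byte
window are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class TailScan {
    // Hypothetical window size: how many bytes before EOF to inspect.
    static final int TAIL_WINDOW = 2048;

    /** Start of the tail window, clamped so it can never be negative. */
    static long tailStart(long fileLength, int window) {
        return Math.max(0L, fileLength - window);
    }

    /** Index of the last occurrence of marker in buf, or -1 if absent. */
    static int lastIndexOf(byte[] buf, byte[] marker) {
        for (int i = buf.length - marker.length; i >= 0; i--) {
            int j = 0;
            while (j < marker.length && buf[i + j] == marker[j]) {
                j++;
            }
            if (j == marker.length) {
                return i;
            }
        }
        return -1;
    }

    /** Absolute offset of the last "startxref" in the tail window, or -1. */
    static long findLastStartXref(RandomAccessFile raf) throws IOException {
        long start = tailStart(raf.length(), TAIL_WINDOW);
        byte[] tail = new byte[(int) (raf.length() - start)];
        raf.seek(start); // start >= 0 by construction
        raf.readFully(tail);
        byte[] marker = "startxref".getBytes(StandardCharsets.ISO_8859_1);
        int idx = lastIndexOf(tail, marker);
        return idx < 0 ? -1 : start + idx;
    }
}
```

A caller would treat -1 as "keyword not found in the window" and fall back to
another strategy rather than computing an offset from it. Trailing bytes after
%%EOF only shift where the keyword sits inside the window.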