Tilman Hausherr created PDFBOX-2527:
---------------------------------------

             Summary: IOException: Negative seek offset in NonSequentialPDFParser
                 Key: PDFBOX-2527
                 URL: https://issues.apache.org/jira/browse/PDFBOX-2527
             Project: PDFBox
          Issue Type: Bug
          Components: Parsing
    Affects Versions: 1.8.8, 2.0.0
            Reporter: Tilman Hausherr
            Priority: Minor


{code}
Exception in thread "main" java.io.IOException: Negative seek offset
        at java.io.RandomAccessFile.seek(Native Method)
        at org.apache.pdfbox.io.RandomAccessBufferedFileInputStream.seek(RandomAccessBufferedFileInputStream.java:116)
        at org.apache.pdfbox.io.PushBackInputStream.seek(PushBackInputStream.java:234)
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:492)
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:1013)
        at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:951)
        at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:897)
        at org.apache.pdfbox.tools.PDFReader.parseDocument(PDFReader.java:375)
        at org.apache.pdfbox.tools.PDFReader.openPDFFile(PDFReader.java:340)
        at org.apache.pdfbox.tools.PDFReader.main(PDFReader.java:326)
        at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:80)
{code}
This happens with several malformed PDFs from the test set in TIKA-1442. These 
files (303385, 069020, 742141, 982996) all have trailing garbage at the end of the file.
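For illustration only, here is a minimal sketch of how a negative offset can reach RandomAccessFile.seek. It is not the PDFBox code: it assumes the offset is computed by subtracting a fixed amount from the file length, and the class NegativeSeekSketch and the helpers clampOffset and seekFails are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class NegativeSeekSketch {

    /** Hypothetical guard: forces a computed offset into the range [0, length]. */
    static long clampOffset(long offset, long length) {
        return Math.max(0L, Math.min(offset, length));
    }

    /** Returns true if seeking to the given offset throws an IOException. */
    static boolean seekFails(RandomAccessFile raf, long offset) {
        try {
            raf.seek(offset);
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("negative-seek", ".bin");
        tmp.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
            raf.write(new byte[] {1, 2, 3, 4});

            // Assumed scenario: an offset derived from the file length goes negative.
            long computed = raf.length() - 10; // 4 - 10 = -6

            System.out.println(seekFails(raf, computed)); // prints "true"

            // With the guard, the seek lands at the start of the file instead.
            raf.seek(clampOffset(computed, raf.length()));
            System.out.println(raf.getFilePointer()); // prints "0"
        }
    }
}
```

Clamping is just one possible guard; whether a parser should clamp, rescan, or reject such a file is a separate decision.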




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)