Using the Tika 1.5 release (not a snapshot), I get an IOException when a PDF's header doesn't contain version information.

java.io.IOException: Error: Header doesn't contain versioninfo
    at org.apache.pdfbox.pdfparser.PDFParser.parseHeader(PDFParser.java:335)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:177)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1238)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1203)
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:111)
...

My code handles file uploads, so IOExceptions can easily occur for other reasons, and I can't tell a malformed PDF apart from a genuine I/O failure. Shouldn't this be a TikaException of some type, or at least something more specific than a plain IOException?
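
For now, the only workaround I can think of is to inspect the stack trace of the IOException and check whether it was thrown from PDFBox code. Below is a rough sketch of that idea, not anything Tika provides: the class name ParseFailureClassifier and the method originatedInPdfBox are my own hypothetical names, and the check relies only on the org.apache.pdfbox package prefix visible in the trace above.

```java
import java.io.IOException;

// Hypothetical helper, not part of Tika or PDFBox.
final class ParseFailureClassifier {

    // Package prefix taken from the stack trace in this message.
    private static final String PDFBOX_PREFIX = "org.apache.pdfbox.";

    private ParseFailureClassifier() {
    }

    // Returns true if the exception, or any exception in its cause chain,
    // has a stack frame inside a PDFBox class. That suggests a malformed
    // document rather than a failed upload or disk error.
    static boolean originatedInPdfBox(IOException e) {
        for (Throwable t = e; t != null; t = t.getCause()) {
            for (StackTraceElement frame : t.getStackTrace()) {
                if (frame.getClassName().startsWith(PDFBOX_PREFIX)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

I would call this in the catch block around the parse call and treat a true result as "bad document" instead of "I/O problem". It is fragile, since it depends on package names and on stack traces being available, which is why a dedicated exception type would be preferable.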

--

Thanks,

Daniel Gibby
