Using the latest Tika release, 1.5 (not a snapshot), I get an IOException
when a PDF doesn't have certain headers:
java.io.IOException: Error: Header doesn't contain versioninfo
at org.apache.pdfbox.pdfparser.PDFParser.parseHeader(PDFParser.java:335)
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:177)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1238)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1203)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:111)
...
Since my code deals with file uploads, IOExceptions can easily occur for
other reasons, so I can't tell a malformed PDF apart from a real I/O failure.
Shouldn't this be a TikaException of some type, or at least something
other than just an IOException?
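
In the meantime, one workaround might be to check whether the IOException's
stack trace passes through PDFBox. This is only a minimal sketch using the
JDK alone; the class name IoExceptionOrigin and the method thrownFrom are
made up for illustration and are not part of Tika or PDFBox. The main method
fakes the stack trace above rather than parsing a real PDF.

```java
import java.io.IOException;

// Hypothetical helper, not part of Tika or PDFBox.
public class IoExceptionOrigin {

    // Returns true if any frame of t (or of one of its causes) belongs to a
    // class whose fully qualified name starts with packagePrefix.
    public static boolean thrownFrom(Throwable t, String packagePrefix) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            for (StackTraceElement frame : cur.getStackTrace()) {
                if (frame.getClassName().startsWith(packagePrefix)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulate the exception from the trace above instead of loading a PDF.
        IOException e = new IOException("Error: Header doesn't contain versioninfo");
        e.setStackTrace(new StackTraceElement[] {
            new StackTraceElement("org.apache.pdfbox.pdfparser.PDFParser",
                                  "parseHeader", "PDFParser.java", 335)
        });
        System.out.println(thrownFrom(e, "org.apache.pdfbox."));
    }
}
```

In a catch (IOException e) block around the parse call, something like
thrownFrom(e, "org.apache.pdfbox.") would separate malformed-PDF errors from
upload failures, but it depends on stack traces and is brittle compared with
a dedicated exception type.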
--
Thanks,
Daniel Gibby