Hi,
On Tue, Jul 1, 2014 at 1:51 PM, Nick Burch <[email protected]> wrote:
> On Fri, 27 Jun 2014, Daniel Gibby wrote:
>> Shouldn't this be a TikaException of some type, or at least something
>> other than just an IOException?
>
> One option might be to catch the IOException in the Tika code, then re-throw
> it as a TikaException. However, I'd probably prefer it if we could get the
> PDFBox project to make it a more specific exception, which we could then
> catch and re-throw as a TikaException. I'm not sure we want to be catching
> all PDFBox IOExceptions, as that might mask a real IOException?

The TaggedInputStream class [1] was designed for exactly this case: it
lets us distinguish IOExceptions thrown by the underlying InputStream
from those thrown by the library processing the stream. It can be used
like this:

    TaggedInputStream tagged = new TaggedInputStream(stream);
    try {
        parse(tagged);
    } catch (IOException e) {
        // throws an IOException if e came from the underlying stream
        tagged.throwIfCauseOf(e);
        // otherwise the failure came from the parsing library
        throw new TikaException("Parse error", e);
    }

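To make the idea behind the tagging concrete, here is a rough,
self-contained sketch of how such a wrapper might work. This is not
Tika's actual implementation: TaggingSketch, SimpleTaggedStream,
TaggedException, Parser, classify and failingStream are all names made
up for this illustration, and the real class may differ in detail.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; not the Tika TaggedInputStream source.
final class TaggingSketch {

    /** IOException that remembers which wrapper observed the failure. */
    static class TaggedException extends IOException {
        final Object tag;

        TaggedException(IOException cause, Object tag) {
            super(cause.getMessage(), cause);
            this.tag = tag;
        }
    }

    /** Wraps a stream and tags every IOException the stream itself throws. */
    static class SimpleTaggedStream extends FilterInputStream {
        private final Object tag = new Object(); // unique per wrapper instance

        SimpleTaggedStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            try {
                return super.read();
            } catch (IOException e) {
                throw new TaggedException(e, tag);
            }
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            try {
                return super.read(b, off, len);
            } catch (IOException e) {
                throw new TaggedException(e, tag);
            }
        }

        /** True only if the exception was tagged by this wrapper. */
        boolean isCauseOf(Exception e) {
            return e instanceof TaggedException
                    && ((TaggedException) e).tag == tag;
        }
    }

    /** Stand-in for a parsing library that may throw its own IOExceptions. */
    interface Parser {
        void parse(InputStream in) throws IOException;
    }

    /** Reports where a failure originated: "stream", "parser", or "ok". */
    static String classify(InputStream raw, Parser parser) {
        SimpleTaggedStream tagged = new SimpleTaggedStream(raw);
        try {
            parser.parse(tagged);
            return "ok";
        } catch (IOException e) {
            return tagged.isCauseOf(e) ? "stream" : "parser";
        }
    }

    /** A stream whose reads always fail, simulating a real I/O problem. */
    static InputStream failingStream() {
        return new InputStream() {
            @Override
            public int read() throws IOException {
                throw new IOException("simulated disk error");
            }
        };
    }
}
```

With this sketch, a parser that reads from failingStream() is classified
as "stream", while a parser that throws its own IOException on a healthy
stream is classified as "parser". That second case is the one we would
want to turn into a TikaException.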
[1] http://tika.apache.org/1.0/api/org/apache/tika/io/TaggedInputStream.html
BR,
Jukka Zitting