Hi,

On Thu, Feb 5, 2009 at 2:34 AM, Jonathan Koren <jonat...@soe.ucsc.edu> wrote:
> I bring this up, because what I assume is an IIOException from
> com.sun.imageio.plugins.jpeg.JPEGMetadata ("JFIF not permitted in stream
> metadata") got rethrown by Tika and it caused my program to fail as it got
> missed by all my catches and eventually rethrown all the way back up to
> main.

This is something I've been worrying about as well. The problem is that
Tika currently has no way to distinguish between IOExceptions caused by
the document input stream failing and those caused by the parser library
failing to parse the document. The former should be allowed to reach the
client application, as documented in the @throws IOException clause of
the parse() method, but the latter should be caught and wrapped into a
TikaException.

I've been doing some background work to enable such distinctions, see
https://issues.apache.org/jira/browse/IO-192. Would you be interested in
joining the effort?

BR,

Jukka Zitting
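
To make the distinction above concrete, here is a minimal sketch of one way it could be done: wrap the document stream so that any IOException it throws is marked, then let marked exceptions through and wrap everything else. All names here (MarkingInputStream, StreamIOException, ParseFailedException, DocumentParser) are hypothetical and chosen for illustration only; this is not Tika's code and not necessarily the API proposed in IO-192. ParseFailedException merely stands in for TikaException.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class TaggedStreamSketch {

    /** Hypothetical marker for IOExceptions that originate in the document stream. */
    public static class StreamIOException extends IOException {
        public StreamIOException(IOException cause) {
            super(cause.getMessage(), cause);
        }
    }

    /** Hypothetical stand-in for TikaException. */
    public static class ParseFailedException extends Exception {
        public ParseFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Marks every IOException thrown by the wrapped stream's read methods.
     * A fuller version would also cover skip(), available(), reset() and close().
     */
    public static class MarkingInputStream extends FilterInputStream {
        public MarkingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            try {
                return in.read();
            } catch (IOException e) {
                throw new StreamIOException(e);
            }
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            try {
                return in.read(b, off, len);
            } catch (IOException e) {
                throw new StreamIOException(e);
            }
        }
    }

    /** Hypothetical parser interface; a real parser library may throw any IOException. */
    public interface DocumentParser {
        void parse(InputStream in) throws IOException;
    }

    /**
     * Stream failures reach the caller as the original IOException;
     * any other IOException is treated as a parser failure and wrapped.
     */
    public static void parse(DocumentParser parser, InputStream raw)
            throws IOException, ParseFailedException {
        try {
            parser.parse(new MarkingInputStream(raw));
        } catch (StreamIOException e) {
            // The document stream itself failed: rethrow the original exception.
            throw (IOException) e.getCause();
        } catch (IOException e) {
            // The parser library failed: wrap instead of leaking to the client.
            throw new ParseFailedException("Parser failed to parse the document", e);
        }
    }
}
```

With something like this, the "JFIF not permitted in stream metadata" case from the quoted message would surface as the wrapped exception type, while a genuine read error on the input would still arrive as a plain IOException. One weakness of the sketch is that a parser which itself throws StreamIOException would be misclassified; tagging each wrapper with a unique identity would avoid that.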