On 22/08/12 06:29, Ramachandran, Karthik wrote:
> I'm having some trouble with the TNEFParser so I would like to prevent
> the AutoDetect parser from using it.
> Is there a way to override the default org.apache.tika.parser.Parser to
> prevent it from using the TNEFParser?
For a long term fix, you should open a bug in JIRA for the TNEF problem,
attach a problematic file to the bug report, and work with us to get the bug
fixed.
Short term, unpack the Tika Parsers jar file, edit the
META-INF/services/org.apache.tika.parser.Parser file and remove the TNEF parser
from the list. That will stop it being auto-loaded and used by AutoDetectParser.
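
If you'd rather script the edit than do it by hand, here is a minimal
sketch in plain Java (no Tika dependency) of filtering the services file
contents. The class and method names are made up for illustration, and the
three parser entries are just a sample, not the full list shipped in the jar.
Reading the file out of the unpacked jar and writing it back are left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ServicesFileFilter {
    // Returns the services-file lines with every entry whose class name
    // ends in ".<simpleClassName>" removed. Other lines are kept as-is.
    public static List<String> withoutParser(List<String> lines, String simpleClassName) {
        return lines.stream()
                .filter(line -> !line.trim().endsWith("." + simpleClassName))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample entries only; the real file lists many more parsers.
        List<String> original = Arrays.asList(
                "# sample entries, not the full list",
                "org.apache.tika.parser.microsoft.OfficeParser",
                "org.apache.tika.parser.microsoft.TNEFParser",
                "org.apache.tika.parser.pdf.PDFParser");
        withoutParser(original, "TNEFParser").forEach(System.out::println);
    }
}
```

Once the filtered file is written back, repack the jar and use that in
place of the stock Tika Parsers jar.
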
If you can't change the Tika Parsers jar file, it's a little trickier. There
are two options available. One is to create a TikaConfig instance yourself,
rather than relying on the default one, and only supply a limited list of
parsers to that. Depending on whether you want to whitelist or blacklist, that
might be easy or more difficult. Alternatively, you could use the fact that the
last registered parser for a mimetype wins. So, create your own jar with a
services file, and your own dummy parser. Have that parser declare that it
handles the TNEF mimetype, but have it do nothing. Add the jar to your
classpath, and then your dummy parser will be used instead.
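
To illustrate the "last registered wins" idea, here is a toy model in plain
Java. It is not Tika's actual code, just a sketch of the lookup behaviour
described above: registrations are applied in order, and a later one for the
same mimetype replaces the earlier one. The class name and
com.example.NoOpTnefParser are hypothetical; in the real setup the dummy
parser's class name would go in your own jar's
META-INF/services/org.apache.tika.parser.Parser file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LastRegisteredWins {
    // Builds a mimetype -> parser-name table from {mimetype, parser} pairs,
    // applied in order. A later pair for the same mimetype overwrites the
    // earlier one.
    public static Map<String, String> register(String[][] registrations) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] registration : registrations) {
            table.put(registration[0], registration[1]);
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> table = register(new String[][] {
                {"application/vnd.ms-tnef", "org.apache.tika.parser.microsoft.TNEFParser"},
                {"application/pdf", "org.apache.tika.parser.pdf.PDFParser"},
                // Hypothetical do-nothing parser, registered after the stock one.
                {"application/vnd.ms-tnef", "com.example.NoOpTnefParser"}});
        System.out.println(table.get("application/vnd.ms-tnef"));
    }
}
```

Whether your jar really is picked up after the stock parsers depends on how
your classpath is ordered, so it's worth checking which parser ends up
handling a TNEF file before relying on this.
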
Nick