Hi,

The quick guide to adding parsers
(http://tika.apache.org/0.9/parser_guide.html) says that you should modify
the following files to make Tika aware of a new parser:

   -  tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
   -  tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser

Is there a way to add and/or replace parsers without modifying the Tika source?

For instance, in one place where I use Tika I might want to replace the
standard Tika RFC822 parser with one that captures more email headers as
metadata. Can I change the parser used by AutoDetectParser by calling
getParsers()/setParsers() on the AutoDetectParser? Is there some other
preferred way to programmatically change the mapping from some MediaType to
a specific parser?
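
To make the question concrete, here is a rough sketch of the kind of override
I have in mind. It uses only plain Java maps with string stand-ins, so it runs
without Tika; the ParserOverrides class and its withOverride helper are
hypothetical names of mine, not part of Tika.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Tika: models "replace the parser
// registered for one media type" as a copy-and-override on a map.
class ParserOverrides {

    // Returns a copy of base with key remapped to replacement.
    // The original map is left untouched.
    static <K, V> Map<K, V> withOverride(Map<K, V> base, K key, V replacement) {
        Map<K, V> copy = new HashMap<K, V>(base);
        copy.put(key, replacement);
        return copy;
    }

    public static void main(String[] args) {
        // Strings stand in for MediaType keys and Parser values.
        Map<String, String> parsers = new HashMap<String, String>();
        parsers.put("message/rfc822", "standard RFC822 parser");
        parsers.put("text/plain", "text parser");

        Map<String, String> updated =
                withOverride(parsers, "message/rfc822", "my extended RFC822 parser");

        System.out.println(updated.get("message/rfc822"));
    }
}
```

With Tika I would expect the equivalent call to look roughly like
parser.setParsers(withOverride(parser.getParsers(),
MediaType.parse("message/rfc822"), new MyRfc822Parser())), where
MyRfc822Parser is a hypothetical class of mine and I am assuming that
getParsers() returns a map keyed by MediaType. Is that a supported way to do
it?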

Thanks,
Paul
