Hi,
I have an existing example of how to override a built-in parser with the
ServiceRegistry mechanism. The file
"META-INF/services/org.apache.tika.parser.Parser" lists my desired
parser class, and the directory containing the META-INF directory is
added to the class path.
With Tika 1.0 this worked fine for both TikaCLI and direct use of the
Tika API.
Now I tested with Tika 1.9, and the TikaCLI class seems to ignore the
ServiceRegistry mechanism. The only way I could get TikaCLI to use my
externally supplied parser was by creating a Tika XML configuration
file, and by specifying that one with the "--config" option.
Is that intended behavior for TikaCLI now? It looks like the "--config"
option is fairly new.
And is there any documentation on the syntax of the Tika XML
configuration file? I was able to find some examples of configuration
files that I used as blueprints, but I could not find a description of
the XML syntax on the Tika website.
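For reference, here is a minimal sketch of the kind of configuration file
I mean. It is not my exact file: the class name "com.example.MyPdfParser"
and the MIME type are placeholders, and I am only assuming that listing
DefaultParser first and my own parser afterwards lets mine take precedence
for that type.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <!-- Assumption: keeps the built-in parsers available for all other types -->
    <parser class="org.apache.tika.parser.DefaultParser"/>
    <!-- Placeholder class name: the externally supplied parser on the class path -->
    <parser class="com.example.MyPdfParser">
      <!-- Placeholder MIME type that the custom parser should handle -->
      <mime>application/pdf</mime>
    </parser>
  </parsers>
</properties>
```

I pass such a file to TikaCLI with the "--config" option mentioned above.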
Thanks
Stephan
- Overriding built-in parser for TikaCLI with Tika 1.9 Stephan Mühlstrasser