Hi,

I have an existing example of how to override a built-in parser with the ServiceRegistry mechanism. The file "META-INF/services/org.apache.tika.parser.Parser" lists my desired parser class, and the directory containing the META-INF directory is added to the class path.
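
To rule out a plain class path problem on my side, a check along the following lines can be used. This is only a minimal sketch using the standard library: the class name ServiceFileCheck and its helper methods are hypothetical names made up for illustration, it assumes the service file follows the usual java.util.ServiceLoader format ("#" starts a comment), and it only shows whether the service file is visible on the class path, not whether Tika actually picks it up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of Tika: lists the entries of every
// copy of the parser service file that the class loader can see.
public class ServiceFileCheck {

    // Service file name taken from the setup described above.
    static final String SERVICE_FILE =
            "META-INF/services/org.apache.tika.parser.Parser";

    // Drops a trailing "#" comment and surrounding whitespace from one line.
    static String stripComment(String line) {
        int hash = line.indexOf('#');
        if (hash >= 0) {
            line = line.substring(0, hash);
        }
        return line.trim();
    }

    // Returns "url -> class name" for each provider entry found under the
    // given resource name; empty if the resource is not on the class path.
    static List<String> listProviders(ClassLoader loader, String resource) {
        List<String> entries = new ArrayList<>();
        try {
            Enumeration<URL> urls = loader.getResources(resource);
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        String name = stripComment(line);
                        if (!name.isEmpty()) {
                            entries.add(url + " -> " + name);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    public static void main(String[] args) {
        ClassLoader loader = ServiceFileCheck.class.getClassLoader();
        for (String entry : listProviders(loader, SERVICE_FILE)) {
            System.out.println(entry);
        }
    }
}
```

When run with the same class path that is passed to TikaCLI, the output should include the service file from my own directory next to any copies shipped inside the Tika jars.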

With Tika 1.0 this worked fine for both TikaCLI and direct use of the Tika API.

Now I have tested with Tika 1.9, and the TikaCLI class seems to ignore the ServiceRegistry mechanism. The only way I could get TikaCLI to use my externally supplied parser was to create a Tika XML configuration file and pass it with the "--config" option.

Is that the intended behavior for TikaCLI now? It looks like the "--config" option is fairly new.

And is there any documentation on the syntax of the Tika XML configuration file? I was able to find some examples of configuration files that I used as blueprints, but I could not find a description of the XML syntax on the Tika website.

Thanks
Stephan
