I was wrong. "excluded" parsers are not loaded. We have a unit test for this. If we find that they are loaded, that's a bug that needs to be fixed.
As mentioned elsewhere, https://issues.apache.org/jira/browse/TIKA-4215 is the source of your problem. We were loading two "default" tika configs just to get the version number in tika-server. That issue fixes that problem. This fix may improve the loading speed of tika-server, too. :D

On Wed, Mar 20, 2024 at 3:54 PM Tim Allison <talli...@apache.org> wrote:

> Looking at TikaConfig, it looks like the "excluded" parsers are actually
> loaded and initialized, but they are not added to the composite parser if
> they're on the exclude list.
>
> We should try to avoid loading them at all if they are excluded. IIRC,
> this is a bit complex in TikaConfig. Let me take a look...
>
> On Wed, Mar 20, 2024 at 3:25 PM Josh Burchard <burch...@pnp-hcl.com>
> wrote:
>
>> Hi all,
>>
>> I've got Tika 2.9.1 server running on Linux and Tika is checking for the
>> presence of ImageMagick. I tried disabling the TesseractOCR parser in my
>> xml config file, but the check is still happening. I can certainly try
>> changing my request headers to disable it but that's in compiled code and I
>> was hoping to make the xml change as a more immediate workaround.
>>
>> I can reply with my config if interested, but I'm just using exactly
>> what's mentioned in the doc
>> <https://cwiki.apache.org/confluence/display/TIKA/TikaOCR> as far as the
>> <parsers> element is concerned.
>>
>> Josh Burchard
>> HCL Domnio
>>
>
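
For readers following along, the kind of exclusion discussed in this thread might look roughly like the fragment below. This is only a sketch based on the general pattern on the TikaOCR wiki page, not Josh's actual config (which was not posted), and it assumes the standard Tika 2.x class names for the default parser and the Tesseract parser.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative tika-config.xml; assumed, not taken from the original poster's setup -->
<properties>
  <parsers>
    <!-- Load the default parser set, but leave the Tesseract OCR parser out of it -->
    <parser class="org.apache.tika.parser.DefaultParser">
      <parser-exclude class="org.apache.tika.parser.ocr.TesseractOCRParser"/>
    </parser>
  </parsers>
</properties>
```

With a config like this, the ImageMagick check reported above would not be expected to come from the excluded parser itself; per the reply, it came from the extra "default" configs that tika-server loaded, which TIKA-4215 addresses.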