Thanks, Nick. It looks like the third option is the one I was after, but the docs say it is only available in Tika 2.x. Is that right?
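
For context, this is roughly how I would use the header option (X-Tika-OCRskipOcr) from client code, so the if/else lives in the application rather than in the k8s yaml. It is only a minimal sketch: the header name is taken from Nick's list, while the host, port 9998, the PUT /tika endpoint, and the helper names build_headers and extract_text are my assumptions about a default tika-server setup, and whether the server honours the header depends on the Tika version in question.

```python
import http.client


def build_headers(skip_ocr: bool) -> dict:
    """Build per-request headers; add the OCR-skip header only when asked."""
    headers = {"Accept": "text/plain"}
    if skip_ocr:
        # Header name as quoted in the thread; support may depend on the Tika version.
        headers["X-Tika-OCRskipOcr"] = "true"
    return headers


def extract_text(data: bytes, skip_ocr: bool,
                 host: str = "localhost", port: int = 9998) -> str:
    """Send a document to a Tika server and return the extracted text.

    host, port and the /tika path are assumptions about a default deployment.
    """
    conn = http.client.HTTPConnection(host, port, timeout=60)
    try:
        # http.client sends header names exactly as written here.
        conn.request("PUT", "/tika", body=data, headers=build_headers(skip_ocr))
        response = conn.getresponse()
        body = response.read()
        if response.status != 200:
            raise RuntimeError(f"Tika server returned HTTP {response.status}")
        return body.decode("utf-8", errors="replace")
    finally:
        conn.close()
```

The caller then decides per document, for example extract_text(pdf_bytes, skip_ocr=True) for files where OCR is not wanted.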

On Thu, Jun 10, 2021 at 3:47 PM Nick Burch <apa...@gagravarr.org> wrote:

> On Thu, 10 Jun 2021, Cristian Zamfir wrote:
> > It would be nice if this was feasible via the headers of each request. I
> > find it more convenient to use if/else in my code than in the yaml files
> > used for k8s configuration. Is there such an option?
>
> Three options, see
> https://cwiki.apache.org/confluence/display/TIKA/TikaOCR#TikaOCR-DisableOCRinTikadisable-ocr
>  * Don't install tesseract on the machine hosting Tika
>  * Supply a Tika Config file that disables the Tesseract parser
>  * Send the Server the custom header X-Tika-OCRskipOcr: true
>
> Nick