On Mon, 21 Sep 2015, Brian Young wrote:
> Hello, we are long-time Tika users who have recently started using
> Tesseract. We would like to be able to enable/disable Tesseract per
> extraction, with Tesseract disabled until we choose to enable it.

The easiest way would be to have two different TikaConfig objects, and
pick between them (and their parsers) at runtime.

Have your with-Tesseract one just be the default config if you want.

Have your no-Tesseract one be created from a config file along the lines
of:

  <properties>
    <parsers>
      <parser class="org.apache.tika.parser.DefaultParser">
        <parser-exclude class="org.apache.tika.parser.ocr.TesseractOCRParser"/>
      </parser>
    </parsers>
  </properties>
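
On the Java side, picking between the two at runtime could look roughly
like the sketch below. The class name OcrToggle and the idea of passing
the no-Tesseract config in as a File are just illustrative assumptions,
so adapt them to however your application loads its configs. It assumes
Tika 1.x with tika-parsers on the classpath.

```java
import java.io.File;

import org.apache.tika.Tika;
import org.apache.tika.config.TikaConfig;

// Hypothetical helper: holds one Tika facade per config and picks per call.
public class OcrToggle {
    private final Tika withOcr;
    private final Tika withoutOcr;

    public OcrToggle(File noOcrConfigFile) throws Exception {
        // Default config: includes TesseractOCRParser via DefaultParser
        this.withOcr = new Tika(TikaConfig.getDefaultConfig());
        // Custom config: the XML above, which excludes TesseractOCRParser
        this.withoutOcr = new Tika(new TikaConfig(noOcrConfigFile));
    }

    // OCR only runs when the caller explicitly asks for it.
    public String extract(File input, boolean enableOcr) throws Exception {
        Tika tika = enableOcr ? withOcr : withoutOcr;
        // Note: parseToString truncates long output by default
        // (see Tika.setMaxStringLength).
        return tika.parseToString(input);
    }
}
```

Both configs are built once up front and reused, since creating a
TikaConfig involves loading all the parsers and is not something you
would want to do per extraction.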
Thanks
Nick