Hey Sergey, why not just remove the directory containing your tesseract binary from the $PATH environment variable that Tika sees? Leave the binary where it is (e.g., /usr/bin), but simply exclude that directory from the path.
If you want to go the exclude route, check out:

http://s.apache.org/3O0

You could use mime-excludes to exclude the TesseractParser for MIME types that you don’t want it to get called on.

HTH.

Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Sergey Tsalkov <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, August 19, 2015 at 11:19 PM
To: "[email protected]" <[email protected]>
Subject: want to disable tesseract ocr parser

>Hey awesome Tika folks!
>The reason I'm writing is that I want to disable the
>TesseractOCRParser. The reason is that it makes Tika take longer to
>finish, and I don't need the OCRed results.
>
>I can't simply uninstall tesseract from the system because I use it
>for other things.
>
>I thought about sending Tika a custom PATH that excludes /usr/bin so
>it can't find tesseract, but that seems ugly and likely to break
>things.
>
>Then I thought I could pass a custom config.xml to disable it, but I
>can't figure out how to write the config file.
>
>I would greatly appreciate any help!
>
>Thanks,
>Sergey
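
For the config-file route asked about in the original message, a minimal sketch of what such a file might look like is below. It keeps the default parser set and excludes only the Tesseract parser. This assumes a Tika release whose config format supports the parser-exclude element and that the parser's class name is org.apache.tika.parser.ocr.TesseractOCRParser; the file name tika-config.xml is just an example, so check it against the docs for your version.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: assumes parser-exclude is supported by your Tika version -->
<properties>
  <parsers>
    <!-- Keep everything the default parser would normally load... -->
    <parser class="org.apache.tika.parser.DefaultParser">
      <!-- ...except the Tesseract OCR parser -->
      <parser-exclude class="org.apache.tika.parser.ocr.TesseractOCRParser"/>
    </parser>
  </parsers>
</properties>
```

How the file gets loaded depends on how Tika is being run (app, server, or embedded in Java code), so that part is left out here.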
