On Mon, 7 Jan 2013, Maciej Liżewski wrote:
I am using Tika with Apache Solr. What I need to achieve is to process all images with an external parser that I provide, instead of the default image/jpeg parser. In general, this is all about plugging in some external OCR software.

Your best bet is to include a parser service file in your custom parser jar. Have that file list your custom parser class, and have the parser report that it handles image/jpeg. If Tika finds two parsers that claim to handle the same mimetype, it will prefer the custom one over the built-in one. No config needed, it's all ServiceLoader stuff.
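
As a rough sketch of what such a parser could look like, assuming tika-core is on the classpath: the class name OcrJpegParser and the runOcr helper are made up for illustration, and the OCR call itself is left as a placeholder since it depends entirely on your external software.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Collections;
import java.util.Set;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.AbstractParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.XHTMLContentHandler;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;

// Hypothetical custom parser; the name is illustrative, not part of Tika.
public class OcrJpegParser extends AbstractParser {
    private static final long serialVersionUID = 1L;

    // The only type this parser claims: image/jpeg.
    private static final Set<MediaType> SUPPORTED =
            Collections.singleton(MediaType.image("jpeg"));

    @Override
    public Set<MediaType> getSupportedTypes(ParseContext context) {
        return SUPPORTED;
    }

    @Override
    public void parse(InputStream stream, ContentHandler handler,
                      Metadata metadata, ParseContext context)
            throws IOException, SAXException, TikaException {
        // Hand the image to the OCR step and emit the result as XHTML text.
        String text = runOcr(stream);
        XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, metadata);
        xhtml.startDocument();
        xhtml.element("p", text);
        xhtml.endDocument();
    }

    // Placeholder (assumption): replace with the call into your OCR software.
    String runOcr(InputStream stream) throws IOException {
        return "";
    }
}
```

The service file in the jar would then be META-INF/services/org.apache.tika.parser.Parser, containing one line with the fully qualified class name (just OcrJpegParser here, since the sketch declares no package).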

Nick
