Hi All,

I want to index images, and PDF documents that contain images, into Solr. I
am testing this with Solr 6.3.0.

I've installed Tesseract on my computer (Mac) and verified that it
extracts text from an image correctly when run on its own.
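
For anyone reproducing this, here is a minimal sketch of that kind of check. It only confirms that a `tesseract` executable is visible on the PATH of the current process and reports its version. The helper names `find_tesseract` and `tesseract_version` are made up for illustration. The assumption (not confirmed in this thread) is that the OCR integration looks up `tesseract` on the PATH of the process running Solr, so the check is most meaningful when run in the same environment that starts Solr.

```python
import shutil
import subprocess


def find_tesseract(executable="tesseract"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


def tesseract_version(path):
    """Return the first line printed by `<path> --version`, or "" if none."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some Tesseract builds print the version to stderr instead of stdout.
    output = result.stdout or result.stderr
    lines = output.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found on PATH")
    else:
        print(path, tesseract_version(path))
```

If this prints a path in a terminal but the process that launches Solr has a different PATH, the two environments may not see the same binary.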

I indexed an image into Solr, but the resulting document has no content.
As far as I know, I shouldn't need to do anything else to integrate
Tesseract with Solr.
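
One way to see what the extraction step returns, without indexing anything, is to call the extracting request handler with `extractOnly=true`. The sketch below is only illustrative: it assumes the handler is registered at `/update/extract`, that Solr listens on `http://localhost:8983/solr`, and that the core is called `mycore` (a hypothetical name); the function names are also made up for this example.

```python
import urllib.parse
import urllib.request


def build_extract_url(solr_base, core, extract_format="text"):
    """Build an extract-only URL for the (assumed) /update/extract handler."""
    params = urllib.parse.urlencode(
        {"extractOnly": "true", "extractFormat": extract_format, "wt": "json"}
    )
    return f"{solr_base.rstrip('/')}/{core}/update/extract?{params}"


def extract_only(url, file_path, content_type="image/png"):
    """POST the raw file bytes and return the response body as text."""
    with open(file_path, "rb") as f:
        data = f.read()
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": content_type}
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # "mycore" and "sample.png" are placeholders, not values from this thread.
    url = build_extract_url("http://localhost:8983/solr", "mycore")
    print(extract_only(url, "sample.png"))
```

If the returned body contains metadata but no text for an image that Tesseract reads fine on the command line, that suggests the OCR step is not being invoked during extraction.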

I've checked these threads, but they did not help me:

http://lucene.472066.n3.nabble.com/TIKA-OCR-not-working-td4201834.html
http://lucene.472066.n3.nabble.com/Fwd-configuring-Solr-with-Tesseract-td4361908.html

My question is: how can I enable OCR support in Solr?

Kind Regards,
Furkan KAMACI
