Honestly, I would not run Tesseract on the same server as Solr. OCR is 
resource-intensive and may degrade Solr's performance. Instead, write a small 
program using Tika + Tesseract that runs on a different server or container 
and posts the results to Solr.
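
A minimal sketch of such a standalone program, assuming a few things that are
not from this thread: it calls the tesseract command-line tool directly
instead of going through Tika, the collection name "docs" and the field
"content_txt" are hypothetical, and Solr is reachable at localhost:8983.
Treat it as an outline of the idea, not a finished indexer.

```python
import json
import subprocess
import urllib.request


def ocr_image(image_path, lang="eng"):
    """Run the tesseract CLI on one image and return the recognized text."""
    # Passing "stdout" as the output base makes tesseract print the text
    # instead of writing it to a .txt file.
    result = subprocess.run(
        ["tesseract", image_path, "stdout", "-l", lang],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def build_update_request(solr_base_url, collection, doc_id, text):
    """Build (but do not send) a JSON update request for Solr."""
    # "collection" and the "content_txt" field are assumptions; adjust them
    # to match your own schema.
    url = f"{solr_base_url}/solr/{collection}/update?commit=true"
    body = json.dumps([{"id": doc_id, "content_txt": text}]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_image(image_path, solr_base_url="http://localhost:8983", collection="docs"):
    """OCR one image on this machine and post the text to a remote Solr."""
    text = ocr_image(image_path)
    request = build_update_request(solr_base_url, collection, image_path, text)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Run on the OCR machine, index_image("scan.png") would do the heavy work there
and send only the extracted text to Solr, so Solr itself never runs Tesseract.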

About your question: most likely Tika (a dependency of Solr) detected it or, 
depending on your file format, PDFBox (which Tika uses).
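
To illustrate how that detection can work without any reference to
TESSDATA_PREFIX in the Solr sources: as far as I know, Tika's OCR parser just
tries to run the tesseract executable it finds on the PATH, and the tesseract
binary itself reads TESSDATA_PREFIX from its environment. That description is
my assumption, not something confirmed in this thread. The sketch below only
mimics the lookup; the function names are made up for illustration.

```python
import os
import shutil


def find_tesseract(path=None):
    """Return the full path of the tesseract executable on PATH, or None."""
    # shutil.which performs the same kind of PATH search a shell would do.
    return shutil.which("tesseract", path=path)


def describe_ocr_environment(env=None):
    """Report what a process with this environment would see."""
    env = os.environ if env is None else env
    return {
        "executable": find_tesseract(env.get("PATH")),
        # Read by the tesseract binary itself, not by the caller.
        "tessdata_prefix": env.get("TESSDATA_PREFIX"),
    }
```

Calling describe_ocr_environment() as the user that runs Solr shows whether
the Solr process would be able to find the executable at all.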

> On 11.02.2020 at 19:15, Karan Jain <sachu8...@gmail.com> wrote:
> 
> Hi All,
> 
> The Solr version 7.6.0 is running on my local machine. I have installed
> Tesseract through following steps:-
> yum install tesseract
> echo export PATH=$PATH:/usr/share/tesseract >> ~/.bash_profile
> echo export TESSDATA_PREFIX=/usr/share/tesseract >>~/.bash_profile
> 
> Now the deployed Solr supports Tesseract. I searched for TESSDATA_PREFIX
> in https://github.com/apache/lucene-solr and found no reference there. I
> could not understand how Solr came to know about the installed Tesseract.
> Please name the specific Java class in Solr, if possible.
> 
> Thanks for your time,
> Best,
> Karan
