Honestly, I would not run Tesseract on the same server as Solr. OCR takes a lot of CPU and memory and may negatively impact Solr. Just write a small program using Tika + Tesseract that runs on a different server or container and posts the results to Solr.
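To give an idea of how small such a program can be, here is a rough sketch. Note that it is not the Tika route: to keep it dependency-free it calls the `tesseract` command-line tool directly and posts the text to Solr's JSON update handler. The host `solr-host`, the core name `documents`, and the field `content_txt` are placeholders I made up, so adjust them to your setup.

```python
import json
import subprocess
import urllib.request

# Hypothetical Solr host and core name; replace with your own.
SOLR_UPDATE_URL = "http://solr-host:8983/solr/documents/update?commit=true"


def ocr_image(path):
    """Run the tesseract CLI on one image and return the recognized text.

    Passing "stdout" as the output base makes tesseract print the text
    instead of writing a .txt file.
    """
    result = subprocess.run(
        ["tesseract", path, "stdout"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def build_update_request(doc_id, text, url=SOLR_UPDATE_URL):
    """Build a POST request carrying one document for Solr's /update handler.

    The field name "content_txt" is an assumption; use a field from your schema.
    """
    body = json.dumps([{"id": doc_id, "content_txt": text}]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_image(path):
    """OCR one image on this machine and send the text to the remote Solr."""
    request = build_update_request(path, ocr_image(path))
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling `index_image("/data/scans/page1.png")` (a made-up path) on the OCR machine would then run Tesseract there and send only the extracted text to Solr, so the Solr server never carries the OCR load.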
About your question: Tika (a dependency of Solr) probably detected it, or, depending on your format, PDFBox (which Tika uses).

> On 11.02.2020 at 19:15, Karan Jain <sachu8...@gmail.com> wrote:
>
> Hi All,
>
> Solr version 7.6.0 is running on my local machine. I have installed
> Tesseract through the following steps:
>
> yum install tesseract
> echo export PATH=$PATH:/usr/share/tesseract >> ~/.bash_profile
> echo export TESSDATA_PREFIX=/usr/share/tesseract >> ~/.bash_profile
>
> Now the deployed Solr is supporting Tesseract. I searched for TESSDATA_PREFIX
> in https://github.com/apache/lucene-solr and found no reference there. I
> could not understand how Solr came to know about the deployed Tesseract.
> Please tell me the specific Java class in Solr if possible.
>
> Thanks for your time,
> Best,
> Karan