Hello,

On OS X at least, the tesseract binary and the tessdata directory may not
be under a common root, e.g.:

/opt/local/share/tessdata

/opt/local/bin/tesseract


Unfortunately, it looks like TesseractOCRParser does not accommodate
this, since there is only one configuration value, and it is used both for
finding the binary and for setting the TESSDATA_PREFIX environment variable.


Now, TESSDATA_PREFIX does not get set if I do not pass in the path on the
config object.  However, even though tesseract is on my PATH, it isn't
found when the ProcessBuilder executes unless I've given it the full
path... which of course sets TESSDATA_PREFIX to the wrong thing.


It seems like maybe it would be best to handle these as two separate
configuration values?  But short of that and a new version of Tika, does
anyone have any other advice?
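
To illustrate what I mean by two separate values, here is a rough sketch
using plain java.lang.ProcessBuilder. This is not Tika's actual code: the
class TesseractLauncher and its build method are names I made up for this
example, and the arguments passed to tesseract are placeholders. The right
value for TESSDATA_PREFIX may also depend on the Tesseract version (some
expect the parent of the tessdata directory, others the directory itself),
so treat the path below as an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tika: keeps the binary location and the
// TESSDATA_PREFIX value as two independent settings.
public class TesseractLauncher {

    public static ProcessBuilder build(String tesseractBinary,
                                       String tessdataPrefix,
                                       List<String> args) {
        // The first element of the command is the executable to run.
        List<String> command = new ArrayList<>();
        command.add(tesseractBinary);
        command.addAll(args);

        ProcessBuilder pb = new ProcessBuilder(command);

        // Only override TESSDATA_PREFIX when a value was actually supplied;
        // otherwise the child process keeps whatever it inherits.
        if (tessdataPrefix != null && !tessdataPrefix.isEmpty()) {
            pb.environment().put("TESSDATA_PREFIX", tessdataPrefix);
        }
        return pb;
    }

    public static void main(String[] args) {
        // Paths taken from my MacPorts layout; adjust for your system.
        ProcessBuilder pb = build(
                "/opt/local/bin/tesseract",
                "/opt/local/share/",          // assumed form of the prefix
                Arrays.asList("input.png", "output"));

        System.out.println(pb.command());
        System.out.println(pb.environment().get("TESSDATA_PREFIX"));
    }
}
```

With something like this, the binary path could be given in full without
affecting where tesseract looks for its data files.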


Thank you

Brian
