Hi folks,

It looks to me as though TesseractOCRParser doesn't work on Linux unless the
Tesseract executable and the 'tessdata' folder are in the same location on
the filesystem. That assumption makes sense in a Windows environment (where
everything is installed together by default), but on Linux, package managers
(*and* source code installations) tend to split the files up across the
filesystem.

I believe this could be alleviated by creating a second property in
TesseractOCRConfig that points to the 'tessdata' folder separately from the
Tesseract executable. That, or a bit of documentation that clarifies that
the files need to be together.
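
To make the first option concrete, here is a rough sketch of what I have in
mind. It does not use the real Tika classes: OcrConfig, tessdataPath and
buildCommand are hypothetical names of mine. It also assumes that the
Tesseract binary honours the TESSDATA_PREFIX environment variable, and the
exact directory that variable should point to differs between Tesseract
versions.

```java
import java.io.File;

public class TessdataSketch {

    // Hypothetical stand-in for TesseractOCRConfig with the proposed
    // second property. Empty strings mean "not configured".
    public static class OcrConfig {
        private String tesseractPath = "";
        private String tessdataPath = "";

        public String getTesseractPath() { return tesseractPath; }
        public void setTesseractPath(String path) { this.tesseractPath = path; }

        public String getTessdataPath() { return tessdataPath; }
        public void setTessdataPath(String path) { this.tessdataPath = path; }
    }

    // Builds (but does not start) the Tesseract process. The executable and
    // the data folder are resolved independently of each other.
    public static ProcessBuilder buildCommand(OcrConfig config,
                                              String input,
                                              String outputBase) {
        // Fall back to whatever "tesseract" resolves to on the PATH.
        String exe = config.getTesseractPath().isEmpty()
                ? "tesseract"
                : new File(config.getTesseractPath(), "tesseract").getPath();

        ProcessBuilder pb = new ProcessBuilder(exe, input, outputBase);

        // Assumption: Tesseract reads TESSDATA_PREFIX to locate its data.
        if (!config.getTessdataPath().isEmpty()) {
            pb.environment().put("TESSDATA_PREFIX", config.getTessdataPath());
        }
        return pb;
    }

    public static void main(String[] args) {
        OcrConfig config = new OcrConfig();
        config.setTesseractPath("/usr/bin");                 // example path only
        config.setTessdataPath("/usr/share/tesseract-ocr");  // example path only

        ProcessBuilder pb = buildCommand(config, "page.png", "page");
        System.out.println(pb.command());
        System.out.println(pb.environment().get("TESSDATA_PREFIX"));
    }
}
```

Leaving the new property unset keeps today's behaviour, so existing
configurations would not need to change.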

I would be more than willing to work on either solution, but only if the
team considers it worthwhile.

Anyway, thanks for making a great library, and for taking time to read this.
