Hi folks,
It looks to me as though TesseractOCRParser doesn't work on Linux unless the Tesseract executable and the 'tessdata' folder are in the same location on the filesystem. That layout makes sense in a Windows environment (where everything is installed together by default), but on Linux, package managers (*and* source code installations) tend to split the files up across the filesystem.
I believe this could be alleviated by adding a second property to TesseractOCRConfig that points to the 'tessdata' folder separately from the Tesseract executable. Failing that, a bit of documentation clarifying that the files need to be together would help. I would be more than willing to work on either solution, but only if the team considers it worthwhile. Anyway, thanks for making a great library, and for taking the time to read this.
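
P.S. For what it's worth, here is a rough sketch of the separate-property idea. It is not the real TesseractOCRConfig: the class `OcrPathsSketch`, its setters, and `buildProcess` are all hypothetical names made up for illustration. It assumes the executable is called `tesseract` and relies on Tesseract reading the TESSDATA_PREFIX environment variable to find its language data. Whether that variable should point at the 'tessdata' folder itself or at its parent seems to depend on the Tesseract version, so please treat the value as an assumption too.

```java
import java.nio.file.Paths;

// Hypothetical stand-in for a config class; NOT the actual TesseractOCRConfig.
class OcrPathsSketch {
    // Folder containing the tesseract executable; empty means "look it up on PATH".
    private String tesseractPath = "";
    // Folder to hand to Tesseract via TESSDATA_PREFIX; empty means "leave the environment alone".
    private String tessdataPath = "";

    public void setTesseractPath(String path) { this.tesseractPath = path; }
    public void setTessdataPath(String path) { this.tessdataPath = path; }

    // Builds (but does not start) the OCR process with the two locations kept independent.
    public ProcessBuilder buildProcess(String inputImage, String outputBase) {
        String exe = tesseractPath.isEmpty()
                ? "tesseract"
                : Paths.get(tesseractPath, "tesseract").toString();
        ProcessBuilder pb = new ProcessBuilder(exe, inputImage, outputBase);
        if (!tessdataPath.isEmpty()) {
            // Assumption: Tesseract locates its language data through this variable.
            pb.environment().put("TESSDATA_PREFIX", tessdataPath);
        }
        return pb;
    }
}
```

With something like this, a Linux user could set the executable folder to e.g. /usr/bin and the data folder to wherever the package manager put 'tessdata', while a Windows user could leave the second property unset and keep the current behaviour.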
