On Wed, 22 Jul 2015, Christian Wolfe wrote:
It looks to me like TesseractOCRParser doesn't work on Linux unless the Tesseract executable and the 'tessdata' folder are in the same location on the filesystem. This makes sense in a Windows environment (where everything is installed together by default), but on Linux, package managers (*and* source code installations) tend to split the files up across the filesystem.

I'm not sure that's correct as a blanket statement. I have Tesseract installed on Ubuntu 14.04, from packages, and it works just fine with Tika. The tesseract binary and its data aren't in the same folder, and no additional parameters are needed.

I'm wondering what's different with your setup?
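
If it helps narrow things down, a rough sketch like the one below would show where things live on your machine. It assumes Python 3 is available, and the function name describe_tesseract_setup is made up for illustration (it isn't part of Tika or Tesseract). It only reports where the tesseract binary is found on the PATH and what the TESSDATA_PREFIX environment variable, which Tesseract consults when locating its data, is set to.

```python
import os
import shutil


def describe_tesseract_setup(env=None, which=shutil.which):
    """Report where the tesseract binary and tessdata appear to live.

    Hypothetical helper for illustration; not part of Tika or Tesseract.
    """
    env = os.environ if env is None else env
    binary = which("tesseract")          # None if tesseract is not on PATH
    prefix = env.get("TESSDATA_PREFIX")  # None if the variable is unset
    return {
        "binary": binary,
        "binary_dir": os.path.dirname(binary) if binary else None,
        "tessdata_prefix": prefix,
        "prefix_exists": os.path.isdir(prefix) if prefix else False,
    }


if __name__ == "__main__":
    for key, value in describe_tesseract_setup().items():
        print(f"{key}: {value}")
```

Comparing that output between a setup that works and one that doesn't might point at the difference.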

Thanks
Nick
