Like I said, it works fine for me if I do *not* specify the Tesseract path in my Java program, in which case it uses the system environment variables to find the Tesseract files (namely $PATH to find the executable, and $TESSDATA_PREFIX to find the tessdata folder).
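
For what it's worth, here is a rough sketch of how one could check what the JVM actually sees for those two variables. It assumes the lookup behaves the way I described above (first executable named "tesseract" on $PATH, data directory from $TESSDATA_PREFIX). The class TesseractEnvCheck and its findOnPath method are hypothetical helpers written for this mail; they are not part of Tika or Tesseract and do not reproduce Tika's own lookup code.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical diagnostic helper; not part of Tika or Tesseract.
class TesseractEnvCheck {

    // Returns the first regular, executable file called `name` found in the
    // directories listed in `pathValue` (a PATH-style string), if any.
    static Optional<Path> findOnPath(String name, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty PATH entries
            }
            Path candidate = Paths.get(dir, name);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumption: the executable is named "tesseract", as on a typical Linux install.
        Optional<Path> exe = findOnPath("tesseract", System.getenv("PATH"));
        System.out.println("tesseract on PATH: "
                + exe.map(Path::toString).orElse("(not found)"));

        String prefix = System.getenv("TESSDATA_PREFIX");
        System.out.println("TESSDATA_PREFIX: "
                + (prefix == null ? "(not set)" : prefix));
    }
}
```

Running this from the same environment as the Java program (same user, same shell or service wrapper) should show whether the two variables are visible to the JVM at all.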

On Thu, Jul 23, 2015 at 2:23 PM, Nick Burch <[email protected]> wrote:

> On Wed, 22 Jul 2015, Christian Wolfe wrote:
>
>> It looks to me that TesseractOCRParser doesn't work on Linux unless the
>> Tesseract executable and the 'tessdata' folder are in the same location on
>> the filesystem. This makes sense in a Windows environment (where everything
>> is installed together by default), but in linux, package managers (*and*
>> source code installations) tend to split the files up across the filesystem.
>>
>
> I'm not sure that's correct as a blanket statement. I have tesseract
> installed on ubuntu 14.04, from packages, and it works just fine with Tika.
> The tesseract binary and data aren't in the same folders, and no additional
> parameters are needed
>
> I'm wondering what's different with your setup?
>
> Thanks
> Nick
>
