Brian,

I submitted a patch for this bug that was accepted by the team -
https://github.com/apache/tika/pull/56

I don't think it has made it into any released version yet.
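
In the meantime, here is a rough sketch of the "two separate values" idea, in case it helps as a stopgap. It does not use Tika's TesseractOCRParser at all; it just shows a plain ProcessBuilder where the binary path and TESSDATA_PREFIX are set independently. The class and method names (TesseractLauncher, build) are made up for illustration, and the paths are the ones from your message. Whether TESSDATA_PREFIX should point at the tessdata directory itself or at its parent depends on your tesseract version, so treat the value below as an assumption to check.

```java
// Hypothetical helper, not part of Tika: keeps the tesseract binary path
// and the TESSDATA_PREFIX value as two independent settings.
final class TesseractLauncher {

    // Builds (but does not start) a process for: tesseract <image> <outputBase>
    static ProcessBuilder build(String tesseractBinary,
                                String tessdataPrefix,
                                String inputImage,
                                String outputBase) {
        // The full path to the binary is used so it is found even when the
        // JVM's PATH does not include the directory tesseract lives in.
        ProcessBuilder pb = new ProcessBuilder(tesseractBinary, inputImage, outputBase);

        // Only set TESSDATA_PREFIX when a value was supplied; otherwise
        // leave the inherited environment untouched.
        if (tessdataPrefix != null && !tessdataPrefix.isEmpty()) {
            pb.environment().put("TESSDATA_PREFIX", tessdataPrefix);
        }
        pb.redirectErrorStream(true);
        return pb;
    }

    public static void main(String[] args) {
        // Paths taken from the original message; the prefix value is an
        // assumption (parent of tessdata) and may need to be the tessdata
        // directory itself on newer tesseract versions.
        ProcessBuilder pb = build("/opt/local/bin/tesseract",
                                  "/opt/local/share",
                                  "input.png",
                                  "output");
        System.out.println(pb.command());
        System.out.println(pb.environment().get("TESSDATA_PREFIX"));
    }
}
```

Calling pb.start() on the result would run tesseract on input.png and write the text next to the given output base name.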

On Wed, Sep 9, 2015 at 3:55 PM, Brian Young <[email protected]> wrote:

> Hello,
>
> On OS X at least, tesseract and tessdata may not be under a common root.
>  e.g.:
>
> /opt/local/share/tessdata
>
> /opt/local/bin/tesseract
>
>
> Unfortunately it looks like TesseractOCRParser does not accommodate
> this, since there is only one configuration value that is used both for
> finding the binary and for setting the TESSDATA_PREFIX environment var.
>
>
> Now, TESSDATA_PREFIX does not get set if I do not pass in the path on the
> config object.  However, even though tesseract is in my path, it isn't
> found when the ProcessBuilder executes unless I've given it the full
> path... which of course sets the TESSDATA_PREFIX to the wrong thing.
>
>
> It seems like maybe it would be best to handle these as two separate
> configuration values?  But short of that and a new version of Tika, does
> anyone have any other advice?
>
>
> Thank you
>
> Brian
