Ah, that is very good, thank you. Looks like it will be in 1.11.


On Wed, Sep 9, 2015 at 4:00 PM, Christian Wolfe <[email protected]> wrote:

> Brian,
>
> I submitted a patch for this bug that was accepted by the team -
> https://github.com/apache/tika/pull/56
>
> I don't think it has made it into any release version.
>
> On Wed, Sep 9, 2015 at 3:55 PM, Brian Young <[email protected]>
> wrote:
>
>> Hello,
>>
>> On OS X at least, tesseract and tessdata may not be under a common root,
>> e.g.:
>>
>> /opt/local/share/tessdata
>>
>> /opt/local/bin/tesseract
>>
>>
>> Unfortunately, it looks like TesseractOCRParser does not accommodate
>> this, since there is only one configuration value, which is used both for
>> finding the binary and for setting the TESSDATA_PREFIX environment var.
>>
>>
>> Now, TESSDATA_PREFIX does not get set if I do not pass the path in on the
>> config object.  However, even though tesseract is on my PATH, it isn't
>> found when the ProcessBuilder executes unless I've given it the full
>> path... which of course sets TESSDATA_PREFIX to the wrong thing.
>>
>>
>> It seems like maybe it would be best to handle these as two separate
>> configuration values?  But short of that and a new version of Tika, does
>> anyone have any other advice?
>>
>>
>> Thank you
>>
>> Brian
>
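
For illustration, the "two separate configuration values" idea from the
original message could look roughly like the sketch below. This is not
Tika's code and not the patch from the pull request; the class name
TesseractLauncher and its parameters are hypothetical. It only shows a
plain ProcessBuilder taking the binary location and the TESSDATA_PREFIX
value independently. Which directory TESSDATA_PREFIX should point at (the
tessdata directory itself or its parent) depends on the Tesseract version,
so treat the paths as assumptions.

```java
// Hypothetical helper, not part of Tika: keeps the tesseract binary path
// and the TESSDATA_PREFIX value as two independent settings.
class TesseractLauncher {

    /**
     * Builds (but does not start) a process for: tesseract <input> <outputBase>.
     *
     * @param tesseractBinary full path to the tesseract executable
     * @param tessdataPrefix  value for TESSDATA_PREFIX, or null to leave the
     *                        inherited environment untouched
     */
    static ProcessBuilder build(String tesseractBinary, String tessdataPrefix,
                                String input, String outputBase) {
        ProcessBuilder pb = new ProcessBuilder(tesseractBinary, input, outputBase);
        if (tessdataPrefix != null && !tessdataPrefix.isEmpty()) {
            // Set only when explicitly configured, independent of the binary path.
            pb.environment().put("TESSDATA_PREFIX", tessdataPrefix);
        }
        // Merge stderr into stdout so the caller reads a single stream.
        pb.redirectErrorStream(true);
        return pb;
    }
}
```

With the layout from the thread, a call might be
TesseractLauncher.build("/opt/local/bin/tesseract", "/opt/local/share",
"page.png", "page"), followed by start() on the returned builder.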
