Christian Wolfe created TIKA-1703:
-------------------------------------
Summary: Can't Specify Tesseract Data Folder Distinct from
Tesseract Executable Path
Key: TIKA-1703
URL: https://issues.apache.org/jira/browse/TIKA-1703
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.9
Reporter: Christian Wolfe
Priority: Minor
Fix For: 1.9
If a user specifies the path to the Tesseract executable using
{{TesseractOCRConfig.setTesseractPath}}, then Tika will assume that the
Tesseract config folder (usually referred to as the 'tessdata' folder) is in
the same location. This is usually true in a Windows environment, where
everything is installed into a central location.

However, this is not necessarily the case in a Linux environment. If Tesseract
is built from source, for example, the config folder is installed in a
different location from the Tesseract executable.

One way to fix this would be to add a setting for the location of the
Tesseract config folder that is independent of the path to the executable.
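
A minimal sketch of what such a separation could look like, not a patch against Tika itself. The class {{TesseractLocations}} and its methods are hypothetical names used only for illustration. The sketch assumes the data folder is handed to the child process through the {{TESSDATA_PREFIX}} environment variable, which Tesseract consults when looking for its data. Whether that variable should name the 'tessdata' folder itself or its parent depends on the Tesseract version, so the value used here is an assumption. When no data folder is given, the sketch falls back to the executable's folder, which is the behaviour described above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder, not part of Tika: keeps the executable folder and
// the tessdata folder as two independent settings.
class TesseractLocations {
    private final Path executableDir;
    private final Path tessdataDir; // null means "same folder as the executable"

    TesseractLocations(String executableDir, String tessdataDir) {
        this.executableDir = Paths.get(executableDir);
        this.tessdataDir = (tessdataDir == null || tessdataDir.isEmpty())
                ? null
                : Paths.get(tessdataDir);
    }

    // Full path of the binary to launch (assumes it is named "tesseract").
    Path executable() {
        return executableDir.resolve("tesseract");
    }

    // The explicitly configured data folder, or the executable folder
    // when none was configured.
    Path effectiveTessdataDir() {
        return tessdataDir != null ? tessdataDir : executableDir;
    }

    // Extra environment entries for the child process.
    // Assumption: TESSDATA_PREFIX is how the data folder is communicated.
    Map<String, String> environment() {
        Map<String, String> env = new HashMap<>();
        env.put("TESSDATA_PREFIX", effectiveTessdataDir().toString());
        return env;
    }
}
```

A caller could then launch the process with {{new ProcessBuilder(loc.executable().toString(), ...)}} and merge the extra entries with {{pb.environment().putAll(loc.environment())}}. Keeping the fallback in one place means existing Windows-style setups, where both live together, would not need any new configuration.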