Dave Meikle created TIKA-1477:
---------------------------------
Summary: Add custom header to allow overriding of OCR language
to be used in Tika Server
Key: TIKA-1477
URL: https://issues.apache.org/jira/browse/TIKA-1477
Project: Tika
Issue Type: Bug
Components: server
Reporter: Dave Meikle
Assignee: Dave Meikle
Priority: Minor
Fix For: 1.7
The _TesseractOCRParser_ relies on language-specific models to accurately OCR
content written in different languages. At present, the Tika Server provides
no way to specify which language model to use without code changes.
To enable clients to ask for processing to be performed using specific language
models, we should add an optional new custom HTTP header (e.g.
X-Tika-OCRLanguage) which will override the TesseractOCRConfig language value
and set it on the ParseContext for use during parsing.
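
A minimal sketch of the override logic described above, written in plain Java
so it runs without Tika on the classpath. The nested OcrConfig class is a
hypothetical stand-in for TesseractOCRConfig, the "eng" default is an
assumption, and the header name is only the example suggested above; none of
this is the committed implementation.

```java
import java.util.Map;
import java.util.TreeMap;

public class OcrLanguageHeader {
    // Header name taken from the proposal text, where it is given only as an example.
    static final String HEADER = "X-Tika-OCRLanguage";

    /** Hypothetical stand-in for the OCR configuration; not the Tika class. */
    static final class OcrConfig {
        private String language = "eng"; // assumed default language

        String getLanguage() {
            return language;
        }

        void setLanguage(String language) {
            this.language = language;
        }
    }

    /**
     * Builds a per-request config: the header value overrides the default
     * language when it is present and non-blank, otherwise the default stays.
     */
    static OcrConfig configFor(Map<String, String> headers) {
        // HTTP header names are case-insensitive, so look them up that way.
        Map<String, String> caseInsensitive = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        caseInsensitive.putAll(headers);

        OcrConfig config = new OcrConfig();
        String requested = caseInsensitive.get(HEADER);
        if (requested != null && !requested.trim().isEmpty()) {
            config.setLanguage(requested.trim());
        }
        // In the server, the resulting config would then be registered on the
        // ParseContext so that the OCR parser picks it up during parsing.
        return config;
    }

    public static void main(String[] args) {
        System.out.println(configFor(Map.of("x-tika-ocrlanguage", "fra")).getLanguage());
        System.out.println(configFor(Map.of()).getLanguage());
    }
}
```

Building a fresh config for each request, rather than mutating a shared one,
keeps one client's language choice from leaking into another client's request.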
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)