[jira] [Commented] (TIKA-1477) Add custom header to allow overriding of OCR language to be used in Tika Server

Hudson (JIRA) Wed, 19 Nov 2014 05:05:25 -0800

    [ https://issues.apache.org/jira/browse/TIKA-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217865#comment-14217865 ]


Hudson commented on TIKA-1477:
------------------------------

SUCCESS: Integrated in tika-trunk-jdk1.6 #304 (See 
[https://builds.apache.org/job/tika-trunk-jdk1.6/304/])
TIKA-1477: Added new custom header to Tika resource to override Tesseract OCR 
language (dmeikle: 
http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1640535)
* /tika/trunk/tika-server/src/main/java/org/apache/tika/server/TikaResource.java


> Add custom header to allow overriding of OCR language to be used in Tika 
> Server
> -------------------------------------------------------------------------------
>
>                 Key: TIKA-1477
>                 URL: https://issues.apache.org/jira/browse/TIKA-1477
>             Project: Tika
>          Issue Type: Bug
>          Components: server
>            Reporter: Dave Meikle
>            Assignee: Dave Meikle
>            Priority: Minor
>             Fix For: 1.7
>
>
> The _TesseractOCRParser_ relies on language-specific models to accurately 
> OCR content written in different languages.  At present, the Tika Server 
> provides no way to specify which languages to use without code changes.
> To let clients request processing with specific language models, we should 
> add an optional custom HTTP header (e.g. X-Tika-OCRLanguage) whose value 
> overrides the TesseractOCRConfig language; the resulting config is then set 
> on the ParseContext for use during parsing.
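
The override described above might look roughly like the sketch below. It is not the committed TikaResource change. To stay self-contained it uses two hypothetical stand-ins, OcrConfig and Context, in place of TesseractOCRConfig and ParseContext, and it assumes a default language of "eng" and a case-insensitive match on the header name. Only the header name X-Tika-OCRLanguage comes from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

public class OcrLanguageHeaderSketch {

    // Header name taken from the example in the issue description.
    static final String OCR_LANGUAGE_HEADER = "X-Tika-OCRLanguage";

    /** Hypothetical stand-in for TesseractOCRConfig; models only the language value. */
    static class OcrConfig {
        private String language = "eng"; // assumed default, for illustration only

        String getLanguage() { return language; }

        void setLanguage(String language) { this.language = language; }
    }

    /** Hypothetical stand-in for ParseContext: objects stored and looked up by class. */
    static class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) { entries.put(key, value); }

        <T> T get(Class<T> key) { return key.cast(entries.get(key)); }
    }

    /**
     * Builds a context for one request. If the optional header is present and
     * non-blank, its value replaces the default OCR language.
     */
    static Context buildContext(Map<String, String> requestHeaders) {
        OcrConfig config = new OcrConfig();
        for (Map.Entry<String, String> header : requestHeaders.entrySet()) {
            String value = header.getValue();
            if (OCR_LANGUAGE_HEADER.equalsIgnoreCase(header.getKey())
                    && value != null && !value.trim().isEmpty()) {
                config.setLanguage(value.trim());
            }
        }
        Context context = new Context();
        context.set(OcrConfig.class, config);
        return context;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("X-Tika-OCRLanguage", "deu");
        Context context = buildContext(headers);
        System.out.println(context.get(OcrConfig.class).getLanguage());
    }
}
```

Because the header is optional, a request without it keeps the default language, so existing clients would see no change in behaviour.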



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
