[ https://issues.apache.org/jira/browse/TIKA-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419669#comment-16419669 ]
Hudson commented on TIKA-2584:
------------------------------
UNSTABLE: Integrated in Jenkins build tika-2.x-windows #226
(See [https://builds.apache.org/job/tika-2.x-windows/226/])
Fix for TIKA-2584 contributed by ewanmellor.
(commits: rev 0954278d204fa05ac16ada7dcc81a1ae3e4f42fc)
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRConfig.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRParser.java
> Tika should have a way to pass arbitrary Tesseract options
> ----------------------------------------------------------
>
> Key: TIKA-2584
> URL: https://issues.apache.org/jira/browse/TIKA-2584
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Affects Versions: 1.17
> Reporter: Ewan Mellor
> Priority: Minor
> Fix For: 1.18, 2.0.0
>
>
> Tesseract has a very large number of config options (use tesseract
> --print-parameters to see them). There is no mechanism for
> TesseractOCRParser / TesseractOCRConfig to pass these to Tesseract, and so
> they cannot be controlled by user code.
> Tika should pass these through as opaque key-value pairs, so that user code
> can set them as necessary.
>
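To illustrate the request, here is a minimal sketch of what "opaque key-value pairs" could look like. It is not the code from the commit above: the class name TesseractOptionsSketch and the methods addOption and buildCommand are hypothetical, invented for this example. The only outside fact it relies on is that the tesseract command line accepts parameters as "-c VAR=VALUE".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for pass-through Tesseract parameters.
// This is an illustration only, not Tika's TesseractOCRConfig.
final class TesseractOptionsSketch {

    // LinkedHashMap keeps insertion order, so the generated command is predictable.
    private final Map<String, String> otherOptions = new LinkedHashMap<>();

    // Stores a parameter without interpreting it; a repeated key overwrites the old value.
    void addOption(String key, String value) {
        if (key == null || key.trim().isEmpty()) {
            throw new IllegalArgumentException("key must not be empty");
        }
        if (value == null) {
            throw new IllegalArgumentException("value must not be null");
        }
        otherOptions.put(key.trim(), value);
    }

    // Builds the argument list: tesseract <input> <outputBase> -c k1=v1 -c k2=v2 ...
    List<String> buildCommand(String inputImage, String outputBase) {
        List<String> command = new ArrayList<>();
        command.add("tesseract");
        command.add(inputImage);
        command.add(outputBase);
        for (Map.Entry<String, String> option : otherOptions.entrySet()) {
            command.add("-c");
            command.add(option.getKey() + "=" + option.getValue());
        }
        return command;
    }
}
```

The resulting list could be handed to java.lang.ProcessBuilder. Keeping each "-c" and its "key=value" as separate list elements avoids shell quoting problems when a value contains spaces. The sketch does not check that a key is a parameter Tesseract actually knows; a real implementation might want to validate keys and values before passing them on.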