[ 
https://issues.apache.org/jira/browse/TIKA-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18059505#comment-18059505
 ] 

Hudson commented on TIKA-4667:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk17 #1215 (See 
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk17/1215/])
TIKA-4667 - add Tess4J in-process OCR parser and docs (#2615) (github: 
[https://github.com/apache/tika/commit/55bd3e7dd68e1279af460608aa7e357b6556f2e3])
* (add) docs/modules/ROOT/examples/tess4j-full.json
* (add) 
tika-parsers/tika-parsers-ml/tika-parser-tess4j-module/src/test/java/org/apache/tika/parser/ocr/tess4j/Tess4JConfigTest.java
* (add) 
tika-parsers/tika-parsers-ml/tika-parser-tess4j-module/src/test/resources/test-documents/testOCR.jpg
* (add) docs/modules/ROOT/examples/tess4j-basic.json
* (add) 
tika-parsers/tika-parsers-ml/tika-parser-tess4j-module/src/main/java/org/apache/tika/parser/ocr/tess4j/Tess4JConfig.java
* (add) 
tika-parsers/tika-parsers-ml/tika-parser-tess4j-module/src/test/java/org/apache/tika/parser/ocr/tess4j/Tess4JParserTest.java
* (edit) docs/modules/ROOT/nav.adoc
* (add) 
tika-parsers/tika-parsers-ml/tika-parser-tess4j-module/src/main/java/org/apache/tika/parser/ocr/tess4j/Tess4JParser.java
* (add) docs/modules/ROOT/pages/configuration/parsers/tess4j-parser.adoc
* (add) tika-parsers/tika-parsers-ml/tika-parser-tess4j-module/pom.xml
* (edit) tika-parsers/tika-parsers-ml/pom.xml


> Add tess4j wrapper in 4.x
> -------------------------
>
>                 Key: TIKA-4667
>                 URL: https://issues.apache.org/jira/browse/TIKA-4667
>             Project: Tika
>          Issue Type: New Feature
>            Reporter: Tim Allison
>            Priority: Major
>
> A long while ago we declined the contribution of a tess4j wrapper. The reason 
> was that we didn't want to be responsible for getting tess4j working on 
> everyone's various OS.
> I still don't think we want this responsibility, but I think we should make 
> it available for testing and evaluation. Given we know the OS of the docker 
> image, if it is substantially better than shelling out to tesseract, we could 
> put it in our server docker image.
> The other thing that's changed is tika-pipes. I had concerns about native 
> code. That would now be isolated into the pipes forked process so we don't 
> have to worry as much about damaging the main jvm.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
