[jira] [Commented] (TIKA-2293) Tess4jOCRParser - A simpler Java version of TesseractOCRParser

ASF GitHub Bot (JIRA) Mon, 20 May 2019 21:52:05 -0700


    [ 
https://issues.apache.org/jira/browse/TIKA-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844488#comment-16844488
 ]


ASF GitHub Bot commented on TIKA-2293:
--------------------------------------

changetoblow commented on issue #158: TIKA-2293 - Tess4jOCRParser - A simpler 
Java version of TesseractOCRParser
URL: https://github.com/apache/tika/pull/158#issuecomment-494233047
 
 
   I found that the main cause of this problem is that when Tika parses an 
image embedded in a Word document, it writes the image to a temporary file 
with a .tmp suffix and passes that file to Tess4J for recognition, and Tess4J 
does not recognize it. How can I change the type of file that Tika generates 
so that Tess4J accepts it? Can you give me some ideas?
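
One possible direction, sketched under assumptions rather than taken from the pull request: decode the temporary file by its content instead of relying on its suffix. The JDK's `ImageIO.read(File)` picks a reader by inspecting the bytes, so a `.tmp` name does not matter to it. The class name `TmpImageLoader` and the method `decode` are hypothetical, and the final hand-off to Tess4J is only described in a comment because it needs the Tess4J library on the classpath.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class TmpImageLoader {

    /**
     * Decodes an image file by inspecting its content rather than its name,
     * so a file with a .tmp suffix is handled like a .png or .jpg.
     * Returns null when no registered ImageIO reader recognizes the bytes.
     */
    public static BufferedImage decode(File tmpFile) throws IOException {
        return ImageIO.read(tmpFile);
    }

    public static void main(String[] args) throws IOException {
        // Simulate the situation from the comment: image bytes stored
        // in a temporary file whose suffix is .tmp.
        File tmp = File.createTempFile("embedded-image", ".tmp");
        tmp.deleteOnExit();
        BufferedImage original =
                new BufferedImage(40, 20, BufferedImage.TYPE_INT_RGB);
        ImageIO.write(original, "png", tmp);

        BufferedImage decoded = decode(tmp);
        if (decoded == null) {
            System.out.println("No ImageIO reader recognized the file content");
            return;
        }
        System.out.println(decoded.getWidth() + "x" + decoded.getHeight());

        // Assumption: with Tess4J available, the decoded image could be
        // passed to ITesseract.doOCR(BufferedImage) instead of handing
        // Tess4J the .tmp file itself.
    }
}
```

Whether this fits the parser depends on how the pull request passes data to Tess4J; an alternative would be to create the temporary file with a suffix that matches the detected media type.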
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


>  Tess4jOCRParser - A simpler Java version of TesseractOCRParser
> ---------------------------------------------------------------
>
>                 Key: TIKA-2293
>                 URL: https://issues.apache.org/jira/browse/TIKA-2293
>             Project: Tika
>          Issue Type: Improvement
>          Components: ocr
>            Reporter: Thejan Wijesinghe
>            Priority: Major
>
> Right now, TesseractOCRParser calls tesseract and imagemagick from the 
> command line. The intention of this new parser, "Tess4jOCRParser", is to use 
> the Tess4J API instead of executing tesseract out of process via Runtime.exec.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
