[jira] [Commented] (TIKA-2293) Tess4jOCRParser - A simpler Java version of TesseractOCRParser

Thejan Wijesinghe (JIRA) Sat, 18 Mar 2017 01:43:42 -0700

    [ 
https://issues.apache.org/jira/browse/TIKA-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931126#comment-15931126
 ]


Thejan Wijesinghe commented on TIKA-2293:
-----------------------------------------

Other than that, I have also added an image preprocessing function to the 
Tess4JOCRParser. It only supports OCRing rotated images at the moment. However, 
it does not use a Python script such as Rotation.py to calculate the rotation 
angle, or ImageMagick to correct the image angle. The approach I have 
implemented is pretty straightforward, so there is no redundant I/O and no 
temporary resources are created. I therefore presume it is faster.
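
For illustration only, here is a minimal sketch of what an in-process 
rotation could look like using just java.awt. It is not the code from the 
patch: the class and method names are hypothetical, and the rotation angle is 
assumed to be known already (how the parser estimates it is not shown here). 
The point is that the image is rotated entirely in memory, with no external 
process and no temporary file.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of Tika or Tess4J.
final class InMemoryRotation {

    // Rotates src by the given angle in degrees around its centre and
    // returns a new image large enough to hold the rotated content.
    static BufferedImage rotate(BufferedImage src, double degrees) {
        double rad = Math.toRadians(degrees);
        double sin = Math.abs(Math.sin(rad));
        double cos = Math.abs(Math.cos(rad));
        int w = src.getWidth();
        int h = src.getHeight();

        // Bounding box of the rotated rectangle.
        int newW = (int) Math.round(w * cos + h * sin);
        int newH = (int) Math.round(h * cos + w * sin);

        BufferedImage dst = new BufferedImage(newW, newH, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            // White background so the exposed corners do not confuse OCR.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, newW, newH);

            // Move to the centre of the target, rotate, then centre the source.
            AffineTransform at = new AffineTransform();
            at.translate(newW / 2.0, newH / 2.0);
            at.rotate(rad);
            at.translate(-w / 2.0, -h / 2.0);
            g.drawImage(src, at, null);
        } finally {
            g.dispose();
        }
        return dst;
    }
}
```

The resulting BufferedImage could then be handed directly to an OCR call that 
accepts in-memory images, which is where the saving over a script plus 
ImageMagick pipeline would come from.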

>  Tess4jOCRParser - A simpler Java version of TesseractOCRParser
> ---------------------------------------------------------------
>
>                 Key: TIKA-2293
>                 URL: https://issues.apache.org/jira/browse/TIKA-2293
>             Project: Tika
>          Issue Type: Improvement
>          Components: ocr
>            Reporter: Thejan Wijesinghe
>             Fix For: 1.15
>
>
> Right now, TesseractOCRParser calls tesseract and imagemagick from the 
> command line. The intention of this new parser, "Tess4jOCRParser", is to use 
> the Tess4J API instead of executing tesseract out of process via Runtime.exec.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
