[
https://issues.apache.org/jira/browse/TIKA-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931126#comment-15931126
]
Thejan Wijesinghe commented on TIKA-2293:
-----------------------------------------
In addition, I have added an image preprocessing function to the
Tess4JOCRParser. At the moment it only supports OCR of rotated images. It
does not use a Python script such as Rotation.py to calculate the rotation
angle, or ImageMagick to correct the image angle. The approach I have
implemented is straightforward: there is no redundant I/O and no temporary
resources are created, so I presume it is faster.
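
To illustrate the idea of correcting rotation without temporary files, here is a
minimal sketch of an in-memory rotation using only java.awt. It is not the code
from the patch: the class name InMemoryRotation and the method rotate are
hypothetical, and the rotation angle is assumed to be already known (how the
angle is detected is not shown here). The resulting BufferedImage could then be
passed directly to Tess4J, for example via ITesseract.doOCR(BufferedImage),
without writing anything to disk.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.AffineTransform;
import java.awt.image.BufferedImage;

// Hypothetical helper, for illustration only (not taken from the TIKA-2293 patch).
class InMemoryRotation {

    /**
     * Rotates src about its centre by the given angle in degrees and
     * returns a new image. Nothing is written to disk.
     */
    static BufferedImage rotate(BufferedImage src, double degrees) {
        double rad = Math.toRadians(degrees);
        double sin = Math.abs(Math.sin(rad));
        double cos = Math.abs(Math.cos(rad));
        int w = src.getWidth();
        int h = src.getHeight();

        // Size of the bounding box of the rotated image, so corners are not clipped.
        int newW = (int) Math.round(w * cos + h * sin);
        int newH = (int) Math.round(h * cos + w * sin);

        BufferedImage dst = new BufferedImage(newW, newH, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            // White background for the areas the rotated source does not cover.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, newW, newH);
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);

            // Move the origin to the centre of the target, rotate, then
            // shift back by half the source size so the source is centred.
            AffineTransform at = new AffineTransform();
            at.translate(newW / 2.0, newH / 2.0);
            at.rotate(rad);
            at.translate(-w / 2.0, -h / 2.0);
            g.drawImage(src, at, null);
        } finally {
            g.dispose();
        }
        return dst;
    }
}
```

A caller would pass the negated skew angle to straighten a page, e.g.
InMemoryRotation.rotate(image, -detectedAngle), where detectedAngle is whatever
the angle-detection step produced.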
> Tess4jOCRParser - A simpler Java version of TesseractOCRParser
> ---------------------------------------------------------------
>
> Key: TIKA-2293
> URL: https://issues.apache.org/jira/browse/TIKA-2293
> Project: Tika
> Issue Type: Improvement
> Components: ocr
> Reporter: Thejan Wijesinghe
> Fix For: 1.15
>
>
> Right now, TesseractOCRParser calls tesseract and ImageMagick from the command
> line. The intention of this new parser, "Tess4jOCRParser", is to use the Tess4J
> API instead of executing tesseract out of process via Runtime.exec.