[
https://issues.apache.org/jira/browse/TIKA-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15644455#comment-15644455
]
Matthew Caruana Galizia commented on TIKA-2167:
-----------------------------------------------
Steps to replicate the issue:
1) build tika-app from master
2) java -jar target/tika-app-1.15-SNAPSHOT.jar
3) drag simple.tiff onto the window
4) select View > Plain text
Result: the only output is a series of newlines.
> Image processing causes OCR to fail
> -----------------------------------
>
> Key: TIKA-2167
> URL: https://issues.apache.org/jira/browse/TIKA-2167
> Project: Tika
> Issue Type: Bug
> Components: ocr
> Affects Versions: 1.14
> Environment: Mac OS X 10.11.6; Java 1.8.0_45; tesseract 3.04.01;
> ImageMagick 6.9.6-2
> Reporter: Matthew Caruana Galizia
> Priority: Critical
> Labels: convert, image, ocr, tiff
> Attachments: simple.tiff
>
>
> Image processing before OCR is enabled by default in the OCR configuration
> properties file. Unless this is disabled, running Tika on a simple TIFF image
> (attached) with two clear words fails. When image processing is disabled, it
> succeeds.
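
The workaround described above (disabling image processing in the OCR configuration properties file) could be sketched as follows. This is a minimal sketch under stated assumptions, not a confirmed fix: the key name enableImageProcessing and the resource path org/apache/tika/parser/ocr/TesseractOCRConfig.properties are assumptions that should be checked against the properties file shipped with the Tika build in use, and the class name OcrConfigOverride is hypothetical. It uses only the Java standard library to write an override file with image processing switched off.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper: writes an OCR properties override that turns off
// image processing before OCR.
public class OcrConfigOverride {

    // Assumed key name; verify against the properties file in your Tika build.
    public static final String IMAGE_PROCESSING_KEY = "enableImageProcessing";

    // Assumed resource location of the OCR configuration properties file.
    public static final String RESOURCE_PATH =
            "org/apache/tika/parser/ocr/TesseractOCRConfig.properties";

    // Build a Properties object containing only the override.
    public static Properties buildOverride() {
        Properties props = new Properties();
        props.setProperty(IMAGE_PROCESSING_KEY, "0"); // 0 = disabled (assumed)
        return props;
    }

    // Write the override under the given root directory, mirroring the
    // assumed package layout, and return the path of the written file.
    public static Path writeOverride(Path classpathRoot) throws IOException {
        Path target = classpathRoot.resolve(RESOURCE_PATH);
        Files.createDirectories(target.getParent());
        try (OutputStream out = Files.newOutputStream(target)) {
            buildOverride().store(out, "Workaround sketch for TIKA-2167");
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("tika-ocr-override");
        System.out.println("Wrote " + writeOverride(root));
    }
}
```

Whether Tika picks up such a file depends on the root directory being placed on the classpath ahead of the Tika jars; that behaviour is also an assumption here and should be confirmed before relying on it.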