[ 
https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093987#comment-14093987
 ] 

Petr Vas edited comment on TIKA-93 at 8/12/14 11:45 AM:
--------------------------------------------------------

[~chrismattmann], do you know when we can expect this OCR parser to appear in a 
released version (i.e. is there an expected release date for Tika 1.7)?
Will there be any RC / beta version that can be used?

I can see that previous versions of Tika were released roughly every six months, 
which would put the 1.7 release date somewhere around Feb 2015. Does that sound right?


was (Author: yonyonson):
Chris, do you know when we can expect this OCR parser to appear in a released 
version (i.e. is there an expected release date for Tika 1.7)?
Will there be any RC / beta version that can be used?

I can see that previous versions of Tika were released roughly every six months, 
which would put the 1.7 release date somewhere around Feb 2015. Does that sound right?

> OCR support
> -----------
>
>                 Key: TIKA-93
>                 URL: https://issues.apache.org/jira/browse/TIKA-93
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>             Fix For: 1.7
>
>         Attachments: TIKA-93.patch, TIKA-93.patch, TIKA-93.patch, 
> TIKA-93.patch, TesseractOCRParser.patch, TesseractOCRParser.patch, 
> TesseractOCR_Tyler.patch, TesseractOCR_Tyler_v2.patch, testOCR.docx, 
> testOCR.pdf, testOCR.pptx
>
>
> I don't know of any decent open source pure Java OCR libraries, but there are 
> command line OCR tools like Tesseract 
> (http://code.google.com/p/tesseract-ocr/) that could be invoked by Tika to 
> extract text content (where available) from image files.



