[jira] [Commented] (TIKA-2749) OCR on PDFs should "just work" out of the box

2018-11-22 Thread Rick Leir (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695964#comment-16695964 ] Rick Leir commented on TIKA-2749: Luis, Tesseract accepts TIFF and JPEG, so why convert it to a PDF?

[jira] [Commented] (TIKA-2749) OCR on PDFs should "just work" out of the box

2018-11-21 Thread Rick Leir (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694939#comment-16694939 ] Rick Leir commented on TIKA-2749: Hi Tim [~talli...@apache.org] Yes, the "just work" goal is great.

[jira] [Commented] (TIKA-2749) OCR on PDFs should "just work" out of the box

2018-11-21 Thread Rick Leir (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694911#comment-16694911 ] Rick Leir commented on TIKA-2749: Hi Luis [~lfcnassif] Your main goal is "to ocr scanned docs". Can I