Tim Allison created TIKA-1995:
---------------------------------
Summary: Improve OCR Strategy options for the PDFParser
Key: TIKA-1995
URL: https://issues.apache.org/jira/browse/TIKA-1995
Project: Tika
Issue Type: Improvement
Reporter: Tim Allison
In TIKA-1994, we added the capability to run OCR on a full page of a PDF
instead of on the inline images. The initial patch had only three OCR
strategies: no_ocr, ocr_only, and ocr_and_text. Let's add other strategies that
might improve performance (speed, accuracy, or redundancy).
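
For discussion, here is a minimal sketch of how the three existing strategy
names could be modeled so that new ones can be added later. It is not taken
from the TIKA-1994 patch: the class OcrStrategySketch, the enum OcrStrategy,
and the methods parse, runsOcr, and extractsText are all hypothetical names,
and the behavior attached to each strategy is only my reading of its name.

```java
import java.util.Locale;

public class OcrStrategySketch {

    // Hypothetical enum; the constants mirror the strategy names in this issue.
    public enum OcrStrategy {
        NO_OCR,        // assumed: use the PDF's text layer only
        OCR_ONLY,      // assumed: OCR the full page, skip the text layer
        OCR_AND_TEXT;  // assumed: do both

        // Parse a config-style name such as "ocr_only" (case-insensitive).
        // Throws IllegalArgumentException for an unknown name.
        public static OcrStrategy parse(String name) {
            return valueOf(name.trim().toUpperCase(Locale.ROOT));
        }

        // True if this strategy would run OCR on the page.
        public boolean runsOcr() {
            return this != NO_OCR;
        }

        // True if this strategy would also extract the existing text layer.
        public boolean extractsText() {
            return this != OCR_ONLY;
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"no_ocr", "ocr_only", "ocr_and_text"}) {
            OcrStrategy s = OcrStrategy.parse(name);
            System.out.println(name + ": ocr=" + s.runsOcr()
                    + ", text=" + s.extractsText());
        }
    }
}
```

A new strategy would then be one more constant plus whatever predicate it
needs, which keeps the speed/accuracy/redundancy trade-offs in one place.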
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)