Tim Allison created TIKA-3264:
---------------------------------

             Summary: Improve the per page OCR heuristics for AUTO mode
                 Key: TIKA-3264
                 URL: https://issues.apache.org/jira/browse/TIKA-3264
             Project: Tika
          Issue Type: Improvement
    Affects Versions: 2.0.0
            Reporter: Tim Allison


We're currently using the per-page character count as the sole criterion for
deciding whether to run OCR in AUTO mode on PDFs.

Let's use this issue to discuss better options.
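
As a starting point for that discussion, the current behaviour can be sketched roughly as follows. This is only an illustration and not Tika's actual code: the class name PageOcrHeuristic, the methods shouldOcr and pagesToOcr, the cutoff of 10 characters, and the choice to ignore whitespace when counting are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of a character-count-only OCR decision.
 * None of these names come from Tika; they exist only for illustration.
 */
final class PageOcrHeuristic {

    // Assumed cutoff, chosen for the example; not a value taken from Tika.
    static final int MIN_CHARS_PER_PAGE = 10;

    private PageOcrHeuristic() {
    }

    /**
     * Returns true if the text extracted from a page is so short that
     * the page would be sent to OCR. Whitespace is not counted
     * (an assumption of this sketch).
     */
    static boolean shouldOcr(String extractedPageText, int minChars) {
        if (extractedPageText == null) {
            return true;
        }
        long visibleChars = extractedPageText.codePoints()
                .filter(cp -> !Character.isWhitespace(cp))
                .count();
        return visibleChars < minChars;
    }

    /** Returns the zero-based indices of the pages that would be OCR'd. */
    static List<Integer> pagesToOcr(List<String> pageTexts, int minChars) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < pageTexts.size(); i++) {
            if (shouldOcr(pageTexts.get(i), minChars)) {
                result.add(i);
            }
        }
        return result;
    }
}
```

A rule of this shape looks only at how much text was extracted, not at whether that text is meaningful. A scanned page that carries a short header or footer in its text layer could clear the cutoff and skip OCR, and a page whose extracted text is garbage would skip it too.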



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
