Tim Allison created TIKA-3271:
---------------------------------

             Summary: Change default image resize size in TesseractParser's 
pre-processing step
                 Key: TIKA-3271
                 URL: https://issues.apache.org/jira/browse/TIKA-3271
             Project: Tika
          Issue Type: Improvement
            Reporter: Tim Allison


If users have ImageMagick installed and they select image preprocessing, one of 
the things we currently do is tell ImageMagick to expand the image by 900%. 
This _may_ make sense for small images (TBD). However, it can lead to massive 
files and dramatic increases in processing time.

At some point, we should probably scale the image based on its initial size, 
i.e. dynamic resizing.
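
A minimal sketch of what dynamic resizing could look like, purely as an 
illustration. The class name, the method, the target long-side length and the 
clamp bounds are all hypothetical; none of them are existing Tika or 
ImageMagick settings, and the right target would need to be determined from 
data.

```java
/**
 * Hypothetical helper, not part of Tika: picks a resize percentage from the
 * initial image dimensions instead of using one fixed default.
 */
public final class DynamicResize {

    private DynamicResize() {
    }

    /**
     * Returns a percentage that scales the longer side of the image towards
     * targetLongSidePx, clamped to the range [100, maxPercent] so that
     * images are never shrunk and never enlarged beyond the cap.
     */
    public static int resizePercent(int widthPx, int heightPx,
                                    int targetLongSidePx, int maxPercent) {
        if (widthPx <= 0 || heightPx <= 0 || targetLongSidePx <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        if (maxPercent < 100) {
            throw new IllegalArgumentException("maxPercent must be >= 100");
        }
        int longSide = Math.max(widthPx, heightPx);
        // Percentage needed to bring the longer side to the target length.
        long percent = Math.round(100.0 * targetLongSidePx / longSide);
        // Clamp: small images hit the cap, large images stay at 100%.
        return (int) Math.max(100L, Math.min((long) maxPercent, percent));
    }
}
```

With an assumed target of 3000 px and a cap of 900%, a 300x200 image would be 
capped at 900%, a 1500x1000 image would get 200%, and a 4000x3000 image would 
be left at 100%. Since a percentage resize applies to both dimensions, the 
pixel count grows with the square of the factor: 900% yields 81 times as many 
pixels, 200% yields 4 times as many.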

Until then, for Tika 2.0.0, I propose that we change the default to 200%.  This 
value is purely a heuristic; the only data behind it is Peter Kronenberg's 
work: 
https://lists.apache.org/thread.html/rb1dece05760d10f1b165b03b97fef8b609dc40c4cd06bdb8cc36469d%40%3Cuser.tika.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
