On 7/2/08, [EMAIL PROTECTED] ([EMAIL PROTECTED]) wrote:
>I'd like to use PIL to prep an image file to improve OCR quality.
>
>Specifically, I need to filter out all but black pixels from the image (i.e.,
>convert all non-black pixels to white while retaining the black pixels).
>
>Can someone please direct me to the appropriate PIL function/method to
>accomplish this along with a brief description of the correct arguments to use?

I don't have the exact arguments to hand, but getting the best results from a bi-level image derived from grayscale takes a bit more than a single call (IMO). The best results I have seen come from applying a moderately strong 'S' curve with sharp shoulders, then two passes of unsharp masking, the first with a large aperture and the second with lower intensity and a smaller aperture, and finally mapping the result to the bit depth required by the OCR software (usually a threshold into a bitmap).

Another trick, if you have the time, is to scan at a higher resolution (in integer multiples, i.e. 2x, 3x, 4x, so interpolation doesn't interfere), process the image as described, then reduce the resolution to the optimum OCR resolution.

I have to admit this is from a while ago, and I'm not sure what the current state of affairs is with OCR software (it has been 10 years, if a day, since I used any).

Scott
_______________________________________________
Image-SIG maillist - Image-SIG@python.org
http://mail.python.org/mailman/listinfo/image-sig
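
As a rough illustration of the steps described above, here is a minimal sketch using PIL (a reasonably recent Pillow is assumed). None of the numeric values come from the post: the cutoffs, the sigmoid gain, and the UnsharpMask radius/percent/threshold settings are guesses to tune against your own scans. The helper names keep_black, s_curve_lut, and prep_for_ocr are made up for this example, and the 'S' curve is approximated with a logistic lookup table, which is only one of several ways to build such a curve.

```python
import math

from PIL import Image, ImageFilter


def keep_black(img, cutoff=64):
    """Literal version of the request: anything brighter than `cutoff`
    becomes white, the rest stays black. `cutoff` is an assumed value."""
    gray = img.convert("L")
    return gray.point(lambda p: 255 if p > cutoff else 0, mode="1")


def s_curve_lut(gain=8.0):
    """256-entry lookup table for an 'S'-shaped tone curve.
    A logistic function is assumed here; larger `gain` gives a steeper
    middle section. The table is rescaled so 0 -> 0 and 255 -> 255."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-gain * (x - 0.5)))

    lo, hi = sigmoid(0.0), sigmoid(1.0)
    return [round(255 * (sigmoid(i / 255) - lo) / (hi - lo)) for i in range(256)]


def prep_for_ocr(img, cutoff=128):
    """S curve, two unsharp-mask passes, then threshold to a 1-bit image."""
    gray = img.convert("L")
    gray = gray.point(s_curve_lut())
    # First pass: large radius ("aperture"), stronger effect (assumed values).
    gray = gray.filter(ImageFilter.UnsharpMask(radius=8, percent=150, threshold=2))
    # Second pass: smaller radius, lower intensity (assumed values).
    gray = gray.filter(ImageFilter.UnsharpMask(radius=2, percent=80, threshold=2))
    # Final mapping to a bilevel bitmap.
    return gray.point(lambda p: 255 if p > cutoff else 0, mode="1")
```

Something like prep_for_ocr(Image.open("scan.png")).save("scan_ocr.png") would run the whole chain on a file (the file names are placeholders). If the scan was made at 2x or 3x, the downsampling step mentioned above could be done with Image.resize; where exactly it belongs relative to the final threshold is left open here.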