DPI is a confusing measure where Tesseract's OCR is concerned; forget
about it. What matters is the x-height of the font within the image,
measured in pixels. Tesseract, trained on ordinary fonts, has proved
to work well with fonts of 12-64 pixel height. If your characters are
bigger, scale them down. If the font is bold, use morphology to erode
the characters after binarization. Experiment. Removing "greyness"
won't help, as it's not a general way of getting rid of uneven
illumination; for that you need more sophisticated algorithms.
Photoshop alone won't get you far.
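
To make the scaling and erosion advice concrete, here is a rough
sketch in plain Python. It is only an illustration: the function
names, the 32-pixel target height, and the 3x3 erosion neighbourhood
are my own assumptions, not anything Tesseract prescribes. A real
pipeline would use an image library for the resize and the morphology.

```python
def scale_factor_for_height(height_px, lo=12, hi=64, target=32):
    """Return a resize factor that brings a measured character height
    into the lo..hi pixel range. target=32 is an assumed midpoint."""
    if height_px <= 0:
        raise ValueError("height must be positive")
    if lo <= height_px <= hi:
        return 1.0  # already in the range that works well
    return target / height_px


def erode(binary):
    """3x3 erosion of a binarized image given as rows of 0/1,
    where 1 means ink. A pixel stays ink only if its whole 3x3
    neighbourhood is ink; pixels outside the image count as
    background. This thins bold strokes by one pixel per side."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                0 <= y + dy < h and 0 <= x + dx < w
                and binary[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ))
    return out
```

For example, characters measured at 128 pixels would get a factor of
0.25 with these defaults, and a stroke three pixels wide comes out of
erode() one pixel wide. Tune the target and the number of erosion
passes by experiment.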

Warm regards,
Dmitri Silaev
www.CustomOCR.com





On Fri, Aug 19, 2011 at 8:18 PM, Andriy Malovanyy <[email protected]> wrote:
> To Zdenko:
> I think I have 3.0 version installed, so maybe I should reinstall the
> new version and try it. Thanks for the description of psm. Did you try
> to recognize other unedited images which I attached to
> the first post??
>
> To Rob:
> Initially I had 640x480 image with 72dpi with number occupying almost
> all the image. What I did is just opened the image in Photoshop, went
> to size of image menu, changed the resolution to 300 dpi (image
> increased in size) and set the image size back to 640x480. So, with
> that I got 640x480 image with 300dpi resolution.
>
> On 19 Aug, 17:56, Robert Komar <[email protected]> wrote:
>> On Fri, 19 Aug 2011, Andriy Malovanyy wrote:
>> > To sriranga:
>> > I tried changing dpi (check the previous post). It doesn't work.
>>
>> Did you rescale the image from 72 dpi to 300 dpi, or just change
>> the tag on the original image to say 300 dpi?  The latter won't work.
>> Tesseract seems to be tuned to work best for scans at 300 dpi
>> (although I've often successfully used 600 dpi).  Scans done at
>> 72 dpi usually get very poor results from tesseract.
>>
>> Cheers,
>> Rob Komar
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
