Here are some things to try for better results:
1) Resize the image so that lowercase characters such as 'e' are at
least 20 to 30 pixels high.
2) Threshold to remove noise (map gray values above 130 or so to
255).
3) I'm not sure what tesseract does with bullets; does anyone else know?
4) If this is a scanned image, rescan at 300 dpi.
5) I vaguely remember that JPEG is not the preferred format; PNG, BMP,
and TIFF work better with tesseract, if I remember correctly.
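
To make tips 1 and 2 concrete, here is a rough sketch in plain Python.
It treats a grayscale image as a list of rows of 0-255 values. The
function names, the default target height of 20 and the cutoff of 130
are my own assumptions taken from the numbers above, not anything
tesseract provides. In practice you would do this in an image editor or
imaging library, which will also resize more smoothly than the
pixel-repeat used here.

```python
import math


def scale_factor(measured_height, target_height=20):
    """Smallest whole-number factor that brings a character of
    measured_height pixels up to at least target_height pixels."""
    if measured_height <= 0:
        raise ValueError("measured_height must be positive")
    return max(1, math.ceil(target_height / measured_height))


def upscale_nearest(pixels, factor):
    """Enlarge a grayscale image (list of rows) by repeating every
    pixel `factor` times horizontally and vertically."""
    out = []
    for row in pixels:
        wide = [v for v in row for _ in range(factor)]
        # Copy the widened row so the output rows are independent lists.
        out.extend(list(wide) for _ in range(factor))
    return out


def whiten_background(pixels, cutoff=130):
    """Map every gray value above `cutoff` to 255 (pure white) and
    leave darker values, i.e. the text, unchanged."""
    return [[255 if v > cutoff else v for v in row] for row in pixels]


if __name__ == "__main__":
    # Hypothetical 2x2 image; assume an 'e' measured 8 pixels high.
    tiny = [[20, 200], [140, 90]]
    factor = scale_factor(measured_height=8)
    cleaned = whiten_background(upscale_nearest(tiny, factor))
    print(factor, len(cleaned), len(cleaned[0]))
```

The cutoff only whitens the light background and leaves the dark text
alone, so faint speckles disappear without thinning the characters. The
right value depends on the image, so treat 130 as a starting point.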

See some of my other posts for additional details, or search other
posts in this group.

On Dec 20, 8:45 pm, tomlei <[email protected]> wrote:
> I just installed tesseract for OCR, and on the first attempt it
> failed to give me the right text (most of the words came out as
> weird characters).
>
> The picture is:
> http://www.rentingtime.com/uploads/listing/l0033/0000033158/or48255.jpg
>
> I ran it through some free online OCR websites and they can read it.
>
> Can anybody explain what I am doing wrong, or how to improve tesseract?
