Here are some things to try to get better results:
1) Enlarge the image so that lowercase characters such as 'e' are at least 20 to 30 pixels high.
2) Threshold to remove noise (map gray values above 130 or so to 255).
3) I'm unsure what tesseract does with bullets; does anyone else know?
4) If this is a scanned image, rescan at 300 dpi.
5) I vaguely remember that JPEG is not the preferred format; PNG, BMP, and TIFF work better with tesseract, if I remember correctly.
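As a rough illustration of tips 1, 2 and 5, here is a minimal Python sketch. It assumes the third-party Pillow library is installed, and the function names (scale_factor, build_lut, preprocess) and the 25-pixel target are my own choices for the example, not anything tesseract requires. You have to measure the height of an 'e' in your own image and pass it in.

```python
def scale_factor(e_height_px, target_px=25):
    """How much to enlarge so an 'e' reaches target_px (never shrinks)."""
    return max(1.0, target_px / e_height_px)


def build_lut(cutoff=130):
    """256-entry lookup table: gray values above cutoff become 255 (white),
    everything at or below cutoff is left unchanged."""
    return [255 if v > cutoff else v for v in range(256)]


def preprocess(src_path, dst_path, e_height_px, cutoff=130):
    """Grayscale, enlarge, threshold, and save in a lossless format."""
    from PIL import Image  # Pillow; assumed to be installed

    img = Image.open(src_path).convert("L")  # 8-bit grayscale
    f = scale_factor(e_height_px)
    if f > 1.0:
        new_size = (round(img.width * f), round(img.height * f))
        img = img.resize(new_size, Image.LANCZOS)
    img = img.point(build_lut(cutoff))  # wash light gray noise out to white
    img.save(dst_path)  # format follows the extension, e.g. .png or .tif


if __name__ == "__main__":
    # Hypothetical file names; 12 is a guessed 'e' height for the example.
    preprocess("or48255.jpg", "or48255.png", e_height_px=12)
```

After that, something like "tesseract or48255.png out" should write the recognized text to out.txt. The cutoff of 130 is only a starting point; look at the thresholded image and adjust it if the characters get too thin or the noise survives.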
See some of my other posts for additional details, or search other posts in this group.

On Dec 20, 8:45 pm, tomlei <[email protected]> wrote:
> I just installed tesseract for OCR usage and on the first attempt
> it failed to give me the right text (most of the words were weird
> characters).
>
> The pic is: http://www.rentingtime.com/uploads/listing/l0033/0000033158/or48255.jpg
>
> I ran it through some free online OCR websites and they can read it.
>
> Can anybody explain what I am doing wrong or how to improve tesseract?

