Besides the grayscale processing, and after the threshold, try to dilate and then erode the image; this way you can fill the white gaps inside the characters. Dilate expands the black pixels, both inside and outside the character outlines. Erode does the opposite, but where the inside is already filled with black it stays black, so the erosion only trims and smooths the outside of the outline. Also try higher-resolution images if you run into problems with these two operations.
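The dilate-then-erode step described above can be sketched in plain Python as follows. This is only a minimal illustration, not Tesseract's or any library's API: the names `dilate`, `erode`, `close_holes` and `glyph` are made up for the example, and it assumes a thresholded image stored as a list of rows where 1 is a black (ink) pixel and 0 is white, a 3x3 neighbourhood, and that anything outside the image counts as white.

```python
def _neighbourhood(image, row, col):
    """Yield the 3x3 neighbourhood of (row, col); pixels outside the image count as white (0)."""
    height, width = len(image), len(image[0])
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            r, c = row + d_row, col + d_col
            yield image[r][c] if 0 <= r < height and 0 <= c < width else 0


def dilate(image):
    """Grow the black area: a pixel becomes black if any pixel in its 3x3 neighbourhood is black."""
    return [
        [1 if any(_neighbourhood(image, r, c)) else 0 for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def erode(image):
    """Shrink the black area: a pixel stays black only if its whole 3x3 neighbourhood is black."""
    return [
        [1 if all(_neighbourhood(image, r, c)) else 0 for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def close_holes(image):
    """Dilate, then erode: small white gaps get filled, the outer outline returns to its size."""
    return erode(dilate(image))


# Hypothetical 7x7 glyph: a solid 5x5 black block with a one-pixel white hole in the middle.
glyph = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

closed = close_holes(glyph)

# Print the result with '#' for black and '.' for white.
for row in closed:
    print("".join("#" if pixel else "." for pixel in row))
```

If you use OpenCV instead, keep in mind that `cv2.dilate` and `cv2.erode` grow and shrink the white (non-zero) pixels, so for black text on a white background you would either invert the image first or swap the order of the two calls.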
If you find that Tesseract doesn't recognize most characters, you may need to train it on the font, as you would for a new language. But I think the key is the preprocessing. If dilate and erode don't work for you, try to find another image transformation that helps; there are many that may be useful for you (and many that I don't know yet... sorry).

2012/10/4 [email protected] <[email protected]>

> hi,
>
> i would like to recognize a costum font with tesseract, ive played
> around with the screens below but did not get anything besides some
> chars that were recognized.
> any idea howto get the data from pictures like these?
>
> heres the source material:
> http://dmk-crew.dyndns.info/files/bf2-a-z.jpg
>
> and here with some modifications
> http://dmk-crew.dyndns.info/files/bf2-a-z-grayscale.jpg
> dmk-crew.dyndns.info/files/bf2-a-z-threshold.jpg
>
> is the train option maybe the way to go?
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en

--
Francisco Loché Costa,
Technical Telecommunications Engineer, specializing in Telematics.

