Before the grayscale processing and after the threshold, try to dilate and
erode the image; this way you can fill the white spaces inside the
characters. Dilate expands the black pixels, both inside and outside the
character outlines. Erode does the opposite, but if the inside is already
filled with black it stays black, and the outside of the outline is
smoothed. If you run into problems with these two operations, also try
higher-resolution images.
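
To make the dilate-then-erode idea concrete, here is a minimal,
dependency-free sketch. It is only an illustration under assumptions of
mine: the image is a list of rows where 1 means black (ink) and 0 means
white, the neighbourhood is 3x3, pixels outside the image count as white,
and the names dilate, erode, close_holes and ring are hypothetical.

```python
def _get(img, r, c):
    """Return pixel (r, c); anything outside the image counts as white (0)."""
    if 0 <= r < len(img) and 0 <= c < len(img[0]):
        return img[r][c]
    return 0


def _neighbourhood(img, r, c):
    """Return the 3x3 block of pixels centred on (r, c)."""
    return [_get(img, r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def dilate(img):
    """A pixel becomes black if ANY pixel in its 3x3 neighbourhood is black."""
    return [[1 if any(_neighbourhood(img, r, c)) else 0
             for c in range(len(img[0]))]
            for r in range(len(img))]


def erode(img):
    """A pixel stays black only if ALL pixels in its 3x3 neighbourhood are black."""
    return [[1 if all(_neighbourhood(img, r, c)) else 0
             for c in range(len(img[0]))]
            for r in range(len(img))]


def close_holes(img, iterations=1):
    """Dilate, then erode, the same number of times (a morphological closing)."""
    out = img
    for _ in range(iterations):
        out = dilate(out)
    for _ in range(iterations):
        out = erode(out)
    return out


# A hollow character outline: a ring of black pixels with a white hole.
ring = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

for row in close_holes(ring):
    print("".join("#" if px else "." for px in row))
```

On a real image you would normally use an image library instead. OpenCV,
for example, has cv2.dilate, cv2.erode and cv2.morphologyEx with
cv2.MORPH_CLOSE, but note that it treats white as the foreground, so black
text on a white background may need to be inverted first.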

If you find that tesseract doesn't recognize most characters, you may need
to train it on the font, as you would for a new language. But I think the
key is the preprocessing. If dilate and erode don't work for you, try to
find another image transformation that helps; there are many that may be
useful for you (and many that I don't know yet... sorry).

2012/10/4 [email protected] <[email protected]>

> hi,
>
>
> I would like to recognize a custom font with tesseract. I've played
> around with the screens below but did not get anything besides a few
> chars that were recognized.
> Any idea how to get the data from pictures like these?
>
> here's the source material:
> http://dmk-crew.dyndns.info/files/bf2-a-z.jpg
>
> and here with some modifications
> http://dmk-crew.dyndns.info/files/bf2-a-z-grayscale.jpg
> dmk-crew.dyndns.info/files/bf2-a-z-threshold.jpg
>
> is the train option maybe the way to go?
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>



-- 
*  Francisco Loché Costa,*
*  Ingeniero Técnico de Telecomunicación, esp. Telemática.*
