On Friday, 5 October 2012 18:33:30 UTC+2, Francisco Loché Costa wrote:
>
> Looking at the first image you attached, I think you may need to
> eliminate the black outline that surrounds the blue in the characters (by
> "eliminate" I mean turn it into the same colour as the background). If you
> manage that, you may be able to turn the blue into black and get
> separated characters. Maybe try expanding the background pixels a little,
> just enough to cover the black outlines. Experiment and see whether it
> works for you.
>
> 2012/10/4 [email protected] <[email protected]>
>
>> hi,
>>
>> I would like to recognize a custom font with tesseract. I've played
>> around with the screens below but did not get anything besides a few
>> chars that were recognized.
>> Any idea how to get the data from pictures like these?
>>
>> Here's the source material:
>> http://dmk-crew.dyndns.info/files/bf2-a-z.jpg
>>
>> and here it is with some modifications:
>> http://dmk-crew.dyndns.info/files/bf2-a-z-grayscale.jpg
>> dmk-crew.dyndns.info/files/bf2-a-z-threshold.jpg
>>
>> Is the train option maybe the way to go?
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "tesseract-ocr" group.
>> To post to this group, send email to [email protected]
>> To unsubscribe from this group, send email to
>> [email protected]
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en
>
> --
> Francisco Loché Costa,
> Technical Telecommunications Engineer, spec. Telematics.

Thanks for all the input. I managed to get quite far with help from #gimp, and from a guy named Ankh in particular, who knows his GIMP stuff. Here is the picture he created:

http://dmk-crew.dyndns.info/files/bf2-a-z-ankh.png

Steps in GIMP:

1. Blur / Selective Gaussian Blur, radius 2, threshold 20
2. Select by colour: click in the middle of one of the letters, but before letting go of the button, drag to the right until the letters are mostly selected (takes some trial and error to see what looks best)
3. Cut

I haven't looked into GIMP scripting yet, but since the second step needs manual input I don't think it can be automated in GIMP. I'm looking at ImageMagick right now.

Tesseract output: it recognized 26 chars. They are not all correct, but it finds something, so training a new font will probably help:

nhndelghliklmnnpqrxluumxyz
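
For what it's worth, the manual drag in the select-by-colour step is essentially picking a tolerance around the clicked colour, so outside GIMP a fixed tolerance might stand in for it. Below is a rough sketch of that idea in plain Python. The function name `select_by_colour`, the max-per-channel distance, and the sample colour values are all assumptions for illustration and are not GIMP's actual algorithm. It works on a nested list of RGB tuples; loading the real image would need an imaging library, which is left out here.

```python
def select_by_colour(pixels, reference, tolerance):
    """Build a black-on-white mask from rows of (r, g, b) tuples.

    Pixels whose colour is within `tolerance` of `reference` become black
    (the letter body); everything else, including the dark outline and the
    background, becomes white.
    """
    black, white = (0, 0, 0), (255, 255, 255)
    mask = []
    for row in pixels:
        mask_row = []
        for pixel in row:
            # Assumed distance measure: largest difference over the three channels.
            distance = max(abs(c - r) for c, r in zip(pixel, reference))
            mask_row.append(black if distance <= tolerance else white)
        mask.append(mask_row)
    return mask


if __name__ == "__main__":
    # Made-up 2x3 sample: bluish letter pixels, black outline, grey background.
    sample = [
        [(40, 40, 40), (0, 0, 0), (30, 60, 200)],
        [(35, 70, 210), (90, 90, 90), (0, 0, 0)],
    ]
    for row in select_by_colour(sample, reference=(30, 60, 200), tolerance=30):
        print(row)
```

The tolerance would still need some trial and error per screenshot, much like the drag distance in GIMP, but once a value works it can be applied to a whole batch without manual input.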

