Hi Valent, as far as I know this is the only developer group for Tesseract. Activity here has its highs and lows.
Tesseract uses Otsu's algorithm to create a 1-bit (black-and-white) image before it starts recognition. Otsu works best if the background is a single uniform color, so it's good that you already preprocess your images, for example by converting to grayscale, removing noise, and smoothing.

What surprises me is that your cleaned image doesn't give good results. Maybe it's because the numbers look a bit "broken" and you still have some noise in the background, which can confuse Tesseract quite a bit. Actually, broken characters shouldn't be that much of a problem, since recognizing broken characters is one of Tesseract's strengths. I would apply a smoothing filter or some morphological operations (erosion, dilation, closing, opening) to get rid of the background noise and to "fix" your numbers.

On Thursday, July 12, 2012 9:49:03 AM UTC+2, Valent wrote:
> Has anybody tested my image files? Are they the problem or is this
> some tesseract bug that I should report?
>
> Is this google group right place to ask these questions? Is there some
> other developers mailing list?
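
P.S. Here is a rough sketch of the Otsu thresholding and morphological cleanup mentioned above, in plain Python so the logic is easy to follow. This is not Tesseract's internal code, and all function names are made up for illustration. I assume an 8-bit grayscale image given as a list of rows, dark text on a light background, and a 3x3 square structuring element. In practice you would probably use an image library instead of nested lists.

```python
# Illustrative sketch only: hypothetical helper names, not Tesseract's API.
# Assumes 8-bit grayscale input (values 0..255), dark text on light background.

def otsu_threshold(gray):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form the dark class, pixels > t the light class.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the dark class
    sum0 = 0  # intensity sum of the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break             # light class empty, nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(gray, t):
    """Map dark pixels (<= t) to 1 (ink) and everything else to 0."""
    return [[1 if v <= t else 0 for v in row] for row in gray]


def _morph(img, rule):
    """Apply `rule` (all or any) to each pixel's 3x3 neighborhood.

    Neighborhoods are clipped at the image border.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = 1 if rule(window) else 0
    return out


def erode(img):
    return _morph(img, all)   # keep ink only where the whole window is ink


def dilate(img):
    return _morph(img, any)   # set ink wherever the window touches ink


def opening(img):
    return dilate(erode(img))  # removes small specks of noise


def closing(img):
    return erode(dilate(img))  # bridges small gaps in broken strokes
```

With a 3x3 element, opening removes specks smaller than 3x3 pixels and closing bridges gaps about one pixel wide. Be careful with opening on thin digits: strokes narrower than 3 pixels get erased too, so for "broken" numbers it may be safer to try closing first and only then clean up the remaining noise.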

