Tesseract does not handle small amounts of text well. You could try duplicating the image area so that the numbers appear more than once, then post-process the output to collapse the repeats.
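A rough sketch of that idea in Python, in case it helps. It is only an illustration under some assumptions: the image is a grayscale list of pixel rows, the region is stacked vertically with blank gaps, and the OCR output comes back with the same number of lines for every copy. The names `tile_vertically` and `majority_vote` are made up for this example and are not part of Tesseract.

```python
from collections import Counter


def tile_vertically(pixels, copies=3, gap=10, background=255):
    """Stack `copies` of a grayscale image (a list of pixel rows),
    separated by `gap` blank rows of `background` colour."""
    if copies < 1 or not pixels:
        raise ValueError("need at least one copy and a non-empty image")
    width = len(pixels[0])
    tiled = []
    for i in range(copies):
        if i:  # blank separator before every copy except the first
            tiled.extend([background] * width for _ in range(gap))
        tiled.extend(list(row) for row in pixels)
    return tiled


def majority_vote(ocr_lines, copies=3):
    """Collapse OCR output from a tiled image back to a single column.

    Assumes every copy produced the same number of non-empty lines;
    for each position, the most frequent reading across copies wins."""
    lines = [ln.strip() for ln in ocr_lines if ln.strip()]
    if not lines or len(lines) % copies:
        raise ValueError("line count does not divide evenly into copies")
    n = len(lines) // copies
    chunks = [lines[i * n:(i + 1) * n] for i in range(copies)]
    return [Counter(col).most_common(1)[0][0] for col in zip(*chunks)]
```

The tiled rows would be turned back into an image and passed to Tesseract in whatever way you already do it; that step is left out here on purpose. If the copies come back with different line counts, the vote raises an error rather than guessing how to align them.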
On Sun, Feb 12, 2012 at 7:56 PM, JD <[email protected]> wrote:
> I'm using v 3.01 on Windows 7 to perform OCR on another program. I
> don't have access to the fonts the program is using, so I trained
> tesseract using some screenshots, and so far the text recognition is
> far better than I expected. However, when I try to process a
> screenshot that contains only a few numbers, it doesn't match anything
> at all. If it was matching garbage, or the wrong numbers, then I'd just
> keep working on improving the training... but it doesn't find
> anything. Does anyone have a suggestion about what I should try?
>
> It doesn't look like I can attach a screenshot, but the numbers are in
> a column... something like this:
>
> 10
> 13
> 14
> 15
> 17
>
> I pre-process the screenshots so the text is black on white. I also
> zoom in on the images, so they're slightly blurred (only very
> slightly)... but the text recognition is near perfect, so I don't
> think that's an issue. Plus, it seems like it should find SOMETHING.
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en

--
"All that is gold does not glitter,
not all those who wander are lost;
the old that is strong does not wither,
deep roots are not reached by the frost.
From the ashes a fire shall be woken,
a light from the shadows shall spring;
renewed shall be blade that was broken,
the crownless again shall be king."

