Oh. So whatever gets read has to follow an underlying rule: it has to be a word from a dictionary or something like that. I'm trying to read addresses, different kinds of zip codes and so on, and they don't follow any rule of that sort. I think the only option I have is what Patrick suggested: comparing the widths of the characters. Thanks for the clarification.
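For the record, here is roughly how I picture the width comparison. It is only a sketch: it assumes I can get one bounding box per character in the Tesseract box-file layout (`char left bottom right top page`), and the function names and the 0.75 width/height threshold are my own guesses that would have to be tuned for the actual font.

```python
def parse_box_line(line):
    """Parse one 'char left bottom right top page' line of a box file."""
    parts = line.split()
    char = parts[0]
    left, bottom, right, top = (int(p) for p in parts[1:5])
    return char, right - left, top - bottom


def relabel_by_width(box_lines, ratio_threshold=0.75):
    """Relabel O/0 boxes using the width/height ratio of each box.

    Assumption: in the target font the digit 0 is noticeably narrower
    than the capital letter O. ratio_threshold is a hypothetical value,
    not something taken from Tesseract.
    """
    chars = []
    for line in box_lines:
        char, width, height = parse_box_line(line)
        if char in ("O", "0") and height > 0:
            # Narrow glyph -> digit zero, wide glyph -> capital O.
            char = "0" if width / height < ratio_threshold else "O"
        chars.append(char)
    return "".join(chars)


if __name__ == "__main__":
    sample = [
        "O 10 10 44 50 0",   # 34 wide, 40 tall: treated as the letter
        "O 50 10 72 50 0",   # 22 wide, 40 tall: treated as the digit
        "A 80 10 110 50 0",  # other characters are left alone
    ]
    print(relabel_by_width(sample))
```

This would only help with fonts where the two glyphs really differ in width; in a monospaced font it probably would not.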
2010/7/19 Jimmy O'Regan <[email protected]>

> On 19 July 2010 17:09, Andres <[email protected]> wrote:
> > Hello people,
> >
> > I'm trying to distinguish between 0 (number) and O (vowel).
> >
> > O vowel is in uppercase.
> >
> > In my training tif image, I included lots of zeros and lots of Os, like
> > this: O0O0O0O0O0 OOOO 0000
> >
> > Boxes and all the training procedure is ok, the log with no errors, but when
> > it reads this line O0O0O0O0O0 all of these characters are read as O vowels.
>
> It's a classification problem: 0 and O look identical to OCR (as do 1,
> I and l). There's a post-processing step that normalises 'words'
> containing digits/letters, which is what's happening here
>
> > Could you people have some tip for this ?
> >
> > Thanks,
> >
> > Andres
>
> --
> <Leftmost> jimregan, that's because deep inside you, you are evil.
> <Leftmost> Also not-so-deep inside you.

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en.

