Oh, I see. So whatever is read is expected to follow some underlying rule:
it has to be a word from a dictionary or something like that. I'm trying to
read addresses, different kinds of zip codes and things like that, and they
don't follow any rule of that sort. I think the only option I have is what
Patrick suggested: comparing the widths of the characters.
Thanks for the clarification.
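
For the archive, here is a rough sketch of what that width comparison
could look like. It is only an illustration, not something Tesseract does
for you: it assumes per-character bounding boxes are available in the
box-file layout (char left bottom right top page), and that in the font
being read a zero is noticeably narrower than a capital O. The names
CharBox, disambiguate_zero_oh and correct_line, and the 0.75 threshold,
are made up for this example; the threshold would have to be measured
from real samples of the font.

```python
from typing import Iterable, NamedTuple


class CharBox(NamedTuple):
    """One character with its bounding box, as in a box-file line."""
    char: str
    left: int
    bottom: int
    right: int
    top: int


def parse_box_line(line: str) -> CharBox:
    # Box-file layout: "<char> <left> <bottom> <right> <top> <page>"
    parts = line.split()
    return CharBox(parts[0], *(int(p) for p in parts[1:5]))


def disambiguate_zero_oh(box: CharBox, ratio_threshold: float = 0.75) -> str:
    """Relabel 'O'/'0' from the width/height ratio of the box.

    ratio_threshold is a hypothetical, font-dependent value: boxes
    narrower than this are called '0', wider ones 'O'.
    """
    if box.char not in ("O", "0"):
        return box.char  # leave every other character alone
    width = box.right - box.left
    height = box.top - box.bottom
    if height <= 0:
        return box.char  # degenerate box, nothing to measure
    return "0" if width / height < ratio_threshold else "O"


def correct_line(box_lines: Iterable[str], ratio_threshold: float = 0.75) -> str:
    """Apply the width check to every box line and join the result."""
    return "".join(
        disambiguate_zero_oh(parse_box_line(line), ratio_threshold)
        for line in box_lines
        if line.strip()
    )


if __name__ == "__main__":
    sample = [
        "O 10 10 42 50 0",  # 32 wide, 40 tall: wide, kept as O
        "O 45 10 67 50 0",  # 22 wide, 40 tall: narrow, relabelled 0
        "A 70 10 95 50 0",  # not O/0, left untouched
    ]
    print(correct_line(sample))
```

This only helps where the font really draws the two glyphs with different
widths; in a font where 0 and O have the same proportions the ratio
carries no information.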


2010/7/19 Jimmy O'Regan <[email protected]>

> On 19 July 2010 17:09, Andres <[email protected]> wrote:
> > Hello people,
> >
> > I'm trying to distinguish between 0 (number) and O (vowel).
> >
> > O vowel is in uppercase.
> >
> > In my training tif image, I included lots of zeros and lots of Os, like
> > this: O0O0O0O0O0 OOOO 0000
> >
> > Boxes and all the training procedure is ok, the log with no errors,
> > but when it reads this line O0O0O0O0O0 all of these characters are
> > read as O vowels.
>
> It's a classification problem: 0 and O look identical to OCR (as do 1,
> I and l). There's a post-processing step that normalises 'words'
> containing digits/letters, which is what's happening here
>
> >
> > Could you people have some tip for this ?
> >
> > Thanks,
> >
> > Andres
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "tesseract-ocr" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/tesseract-ocr?hl=en.
> >
>
>
>
> --
> <Leftmost> jimregan, that's because deep inside you, you are evil.
> <Leftmost> Also not-so-deep inside you.
>
