Tesseract always recognizes the letter 'C' as number '0'. Strange.

------------------ Original ------------------
From: "Dmitri Silaev" <[email protected]>
Date: Mon, Aug 15, 2011 02:21 PM
To: "tesseract-ocr" <[email protected]>
Cc: "trifusion" <[email protected]>
Subject: Re: Finding zero instead of letter 'O'
There's no way to do this in a generic, high-level manner. All you can do is restrict zeros or O's for the entire document. If you want flexibility, you should run OCR on a word-by-word basis. If there are only a few such words or digit combinations, you can also try adding them to the dictionary, which may suffice with no extra setup. Another approach is post-processing, which you mentioned.

Warm regards,
Dmitri Silaev
www.CustomOCR.com

On Sun, Aug 14, 2011 at 10:52 PM, Greg <[email protected]> wrote:
> Is there any way to force a preference for the number zero '0' over the
> letter 'O'? Particularly when adjacent to another number (123456789)?
>
> Sometimes I am getting 1 0 O returned.
>
> I can convert later if I have to, but prefer not to if possible.
>
> Thanks
>
> Greg
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
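For anyone finding this thread later, the post-processing route mentioned above could look roughly like the sketch below. It is not part of Tesseract; the function name `prefer_zero` and the rule it applies (treat 'O' or 'o' as zero only inside a token that otherwise consists of digits) are assumptions made for illustration, and the rule will misfire on strings such as "O2" that really do start with a letter.

```python
import re

# Hypothetical rule: a token made up only of digits and the letters O/o,
# containing at least one real digit, is assumed to be a number.
NUMERIC_TOKEN = re.compile(r"\b[0-9Oo]*[0-9][0-9Oo]*\b")


def prefer_zero(text):
    """Replace O/o with 0 inside tokens that otherwise contain only digits."""
    def fix(match):
        # Swap both letter cases for the digit zero within the matched token.
        return match.group(0).replace("O", "0").replace("o", "0")

    return NUMERIC_TOKEN.sub(fix, text)


if __name__ == "__main__":
    # The text would normally come from the OCR output.
    print(prefer_zero("Total: 1O0 items, code 10O."))
```

Ordinary words such as "Order" are left alone because they contain no digit. A standalone "O" separated from the digits by spaces is also left alone, so output like "1 0 O" would need the spacing handled first.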

