Re: unicharambigs doesn't seem to work [Tesseract 3.01]

Nick White Mon, 21 Jan 2013 01:58:46 -0800

Hi Preston,

> However, when I set the type indicator to 0 (NOT_AMBIG) the replacement
> never happens even when a replacement would change a word from a
> non-dictionary word into a dictionary word. I've assured that the
> dictionary contains the necessary word.


As I understand it, type 0 doesn't guarantee that a non-dictionary
word gets changed into a dictionary word. It uses weighting to decide
whether to make the change, so if it's pretty confident (however
erroneously) that the characters are 'c l' and not 'd', due to
spacing or whatever, it won't necessarily make the switch. That said,
I haven't tested it much or read the code, but it would explain why
the replacement isn't always happening where you expect it to.
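
For reference, here is a rough sketch of what a rule for the 'c l' / 'd'
case would look like and how its fields break down. It assumes the "v1"
unicharambigs layout as described in the Tesseract training
documentation: five tab-separated fields (source count, source
characters, target count, target characters, type indicator), with the
characters inside a field separated by spaces. The names AmbigRule and
parse_ambig_line are hypothetical helpers for illustration only, not
part of Tesseract, and the sketch says nothing about how Tesseract
weighs the rule internally.

```python
from typing import List, NamedTuple


class AmbigRule(NamedTuple):
    """One unicharambigs rule (hypothetical container, not a Tesseract type)."""
    source: List[str]   # characters as recognised, e.g. ['c', 'l']
    target: List[str]   # proposed replacement, e.g. ['d']
    ambig_type: int     # type indicator from the last field


def parse_ambig_line(line: str) -> AmbigRule:
    """Parse one rule line, assuming tab-separated v1 fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 tab-separated fields, got {len(fields)}")
    n_src, src, n_tgt, tgt, ambig_type = fields
    source = src.split(" ")
    target = tgt.split(" ")
    # The declared counts must match the number of characters listed.
    if len(source) != int(n_src) or len(target) != int(n_tgt):
        raise ValueError("character count does not match the declared count")
    return AmbigRule(source, target, int(ambig_type))


# The 'c l' -> 'd' case from above, with type indicator 0.
rule = parse_ambig_line("2\tc l\t1\td\t0")
print(rule)
```

A check like this only confirms that the line is well formed (tabs
rather than spaces between fields, counts matching the characters). It
can rule out a malformed file as the cause, but not the weighting
behaviour described above.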

> Furthermore 2 (DEFINITE_AMBIG), 3 (SIMILAR_AMBIG) and 4 (CASE_AMBIG) don't
> seem to have any effect, though I'm not clear what they're supposed to do
> anyways.

Yes, it would be great to get some proper documentation on these. I
also don't have a good idea of what they're supposed to do (though
they are used in some of the .traineddata files).

Hope this helps.

Nick

