Hi Preston,

> However, when I set the type indicator to 0 (NOT_AMBIG) the replacement
> never happens even when a replacement would change a word from a
> non-dictionary word into a dictionary word. I've assured that the
> dictionary contains the necessary word.

As I understand it, type 0 doesn't guarantee that a non-dictionary word gets changed into a dictionary word. It uses weighting to decide whether to make the change, so if Tesseract is fairly confident (however erroneously) that the characters are 'c l' and not 'd', due to spacing or whatever, it won't necessarily make the switch. That said, I haven't tested it much or read the code, but it would explain why it isn't always working where you expect it to.

> Furthermore 2 (DEFINITE_AMBIG), 3 (SIMILAR_AMBIG) and 4 (CASE_AMBIG) don't
> seem to have any effect, though I'm not clear what they're supposed to do
> anyways.

Yes, it would be great to get some proper documentation on these. I also don't have a good idea of what they're supposed to do (though they are used in some of the .traineddata files).

Hope this helps.

Nick
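
P.S. In case it helps when checking your entries, here is a rough Python sketch that splits a unicharambigs line into its parts. It assumes the whitespace-separated layout described in the training docs (source count, source characters, replacement count, replacement characters, type indicator). The helper name `parse_ambig_line`, the `AmbigEntry` class and the sample line "2 c l 1 d 0" are made up for illustration, and the name for type 1 (REPLACE_AMBIG) is my assumption from the Tesseract source rather than anything tested. It only reads the file format; it says nothing about how Tesseract actually weights type 0 entries.

```python
from dataclasses import dataclass
from typing import List

# Type indicator names as used in this thread; the name for 1 is an
# assumption taken from the Tesseract source, not from testing.
AMBIG_TYPES = {
    0: "NOT_AMBIG",
    1: "REPLACE_AMBIG",
    2: "DEFINITE_AMBIG",
    3: "SIMILAR_AMBIG",
    4: "CASE_AMBIG",
}


@dataclass
class AmbigEntry:
    """One unicharambigs entry (hypothetical helper type)."""
    source: List[str]
    replacement: List[str]
    type_id: int

    @property
    def type_name(self) -> str:
        return AMBIG_TYPES.get(self.type_id, "UNKNOWN")


def parse_ambig_line(line: str) -> AmbigEntry:
    """Parse '<n_src> <src...> <n_rep> <rep...> <type>' into an AmbigEntry."""
    fields = line.split()
    n_src = int(fields[0])
    source = fields[1:1 + n_src]
    n_rep = int(fields[1 + n_src])
    rep_start = 2 + n_src
    replacement = fields[rep_start:rep_start + n_rep]
    type_id = int(fields[rep_start + n_rep])
    return AmbigEntry(source, replacement, type_id)


# Hypothetical entry for the 'c l' versus 'd' case mentioned above.
entry = parse_ambig_line("2 c l 1 d 0")
print(entry.source, "->", entry.replacement, entry.type_name)
```

Running each line of your unicharambigs file through something like this at least confirms that the counts and the type indicator are being read the way you intend.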

