Hello, I'm using Tesseract 3.01 and unicharambigs doesn't work as expected.
For lines where the type indicator is *1* (REPLACE_AMBIG), the replacement always occurs as expected. However, when I set the type indicator to *0* (NOT_AMBIG), the replacement never happens, even when it would change a non-dictionary word into a dictionary word. I've made sure that the dictionary contains the necessary word.

Furthermore, *2* (DEFINITE_AMBIG), *3* (SIMILAR_AMBIG) and *4* (CASE_AMBIG) don't seem to have any effect, though I'm not clear on what they're supposed to do anyway.

Also confusing: I've unpacked eng.unicharambigs from eng.traineddata, and it contains several lines with type indicator *0* (NOT_AMBIG) as well as several with *1* (REPLACE_AMBIG). To me this suggests that in English mode Tesseract correctly applies each of these rules.

Lastly, for reference, I have been able to track down two tickets which seem to be related to my problem:

http://code.google.com/p/tesseract-ocr/issues/detail?id=719
http://code.google.com/p/tesseract-ocr/issues/detail?id=542

Is there anything I can do to resolve this issue?

Many thanks.
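To make the question easier to reproduce, here is a minimal sketch of how the type indicators in an unpacked unicharambigs file could be tallied. It assumes the tab-separated layout described in the Tesseract training documentation (source count, space-separated source characters, target count, space-separated target characters, type indicator) and uses the type names quoted above. The function names `parse_ambig_line` and `tally_types` are hypothetical helpers and not part of Tesseract; lines that don't fit the assumed layout (such as a version header) are simply skipped.

```python
from collections import Counter

# Type indicator -> name, as listed in the post above.
AMBIG_TYPES = {
    0: "NOT_AMBIG",
    1: "REPLACE_AMBIG",
    2: "DEFINITE_AMBIG",
    3: "SIMILAR_AMBIG",
    4: "CASE_AMBIG",
}


def parse_ambig_line(line):
    """Parse one rule line; return (source, target, type) or None.

    Assumed layout (tab-separated):
    <n_src> <src chars, space-separated> <n_dst> <dst chars> <type>
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        return None  # e.g. a version header or a blank line
    try:
        n_src, n_dst, kind = int(fields[0]), int(fields[2]), int(fields[4])
    except ValueError:
        return None
    src = fields[1].split(" ")
    dst = fields[3].split(" ")
    # Reject lines whose declared counts don't match the listed characters.
    if len(src) != n_src or len(dst) != n_dst:
        return None
    return src, dst, kind


def tally_types(lines):
    """Count how many rules use each type indicator."""
    counts = Counter()
    for line in lines:
        parsed = parse_ambig_line(line)
        if parsed is not None:
            counts[AMBIG_TYPES.get(parsed[2], "UNKNOWN")] += 1
    return counts
```

With an unpacked file on disk, something like `tally_types(open("eng.unicharambigs", encoding="utf-8"))` would show how many rules of each type the shipped English data contains, which can then be compared against a custom file.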

