unicharambigs doesn't seem to work [Tesseract 3.01]

preston . grossman Thu, 27 Dec 2012 23:05:06 -0800

Hello,

I'm using Tesseract 3.01 and unicharambigs doesn't work as expected.


For starters, for lines where the type indicator is *1* (REPLACE_AMBIG) the 
replacement always occurs as expected.

However, when I set the type indicator to *0* (NOT_AMBIG) the replacement 
never happens, even when it would change a non-dictionary word into a 
dictionary word. I've made sure that the dictionary contains the necessary 
word.
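
To make the expectation concrete, here is a toy Python model of the behavior 
I expected from types 0 and 1. It is only a sketch of my understanding, not 
Tesseract's actual code: `apply_ambig`, the plain substring replacement and 
the one-word dictionary are all made up for illustration.

```python
def apply_ambig(word, source, target, ambig_type, dictionary):
    """Toy model of one ambiguity rule (hypothetical, not Tesseract's logic)."""
    if source not in word:
        return word
    candidate = word.replace(source, target)
    if ambig_type == 1:
        # Type 1: the replacement is always applied.
        return candidate
    if ambig_type == 0:
        # Type 0: replace only if that turns a non-dictionary word
        # into a dictionary word.
        if word not in dictionary and candidate in dictionary:
            return candidate
    # Other types are left alone in this sketch.
    return word


dictionary = {"corn"}  # stand-in for the real word list
print(apply_ambig("com", "m", "rn", 0, dictionary))
```

With a rule "m" -> "r n" of type 0 and "corn" in the dictionary, I would 
expect "com" to come out as "corn". That is the case that never triggers 
for me.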

Furthermore, *2* (DEFINITE_AMBIG), *3* (SIMILAR_AMBIG) and *4* (CASE_AMBIG) 
don't seem to have any effect, though I'm not clear on what they're supposed 
to do anyway.

Also confusing is that I've unpacked eng.unicharambigs from 
eng.traineddata and there are several lines where the type indicator is 
either *0* (NOT_AMBIG) or *1* (REPLACE_AMBIG). To me this suggests that in 
English mode Tesseract correctly applies each of these rules.
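
For anyone who wants to check their own file, this is roughly how the type 
indicators can be tallied. It is a minimal sketch that assumes the v1 
layout as I understand it: a version line, then five tab-separated fields 
per rule (source count, space-separated source characters, target count, 
space-separated target characters, type). `parse_unicharambigs` and 
`SAMPLE` are my own names, and the sample lines are illustrative rather 
than copied from eng.unicharambigs.

```python
from collections import Counter

# Illustrative v1-style content (assumed layout, not the real English file).
SAMPLE = "v1\n2\t' '\t1\t\"\t1\n1\tm\t2\tr n\t0\n"


def parse_unicharambigs(text):
    """Return a list of (source_chars, target_chars, type) tuples."""
    rules = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) != 5:
            continue  # skips the version line and blank lines
        n_src, src, n_dst, dst, kind = fields
        src_chars = src.split(" ")
        dst_chars = dst.split(" ")
        try:
            counts_ok = (len(src_chars) == int(n_src)
                         and len(dst_chars) == int(n_dst))
            ambig_type = int(kind)
        except ValueError:
            continue  # not a well-formed rule line
        if counts_ok:
            rules.append((src_chars, dst_chars, ambig_type))
    return rules


rules = parse_unicharambigs(SAMPLE)
type_counts = Counter(ambig_type for _, _, ambig_type in rules)
print(type_counts)
```

To run it on a real file, read the unpacked file as UTF-8 and pass its 
contents to `parse_unicharambigs` in place of `SAMPLE`.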

Lastly, for reference I have been able to track down two tickets which seem 
to be related to my problem.
http://code.google.com/p/tesseract-ocr/issues/detail?id=719
http://code.google.com/p/tesseract-ocr/issues/detail?id=542

Is there anything I can do to resolve this issue?

Many thanks.
