I'm using Tesseract as part of a larger OMR program to recognize students' names. I expect to improve accuracy by handing Tesseract a list of the 30 or so names on the roster: right now I'm just passing an identical roster to lang.freq-dawg, lang.word-dawg, and lang.user-words.
Training with known lists makes Tesseract a lot better, but it's still spitting out nonsense words. Is there any way to FORCE Tesseract to use one of the words in its dictionary (preferably each one exactly once)? Otherwise, I'm looking at some kind of HMM post-processing, and the code is already bloated...

-b
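For reference, here is a rough sketch of the lightest post-processing fallback I can picture. It is not an HMM and not a Tesseract feature: it just fuzzy-matches each OCR string against the roster with Python's standard difflib and hands out each roster name at most once, greedily. The function names (similarity, assign_names) and the sample names are made up for illustration, and a greedy pass is not guaranteed to find the best overall assignment.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] (hypothetical helper)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def assign_names(ocr_outputs, roster):
    """Greedily map each OCR string to a distinct roster name.

    Returns a list parallel to ocr_outputs. An entry stays None if the
    roster runs out of unused names before that string is matched.
    """
    # Score every (OCR string, roster name) pair.
    pairs = [
        (similarity(text, name), i, j)
        for i, text in enumerate(ocr_outputs)
        for j, name in enumerate(roster)
    ]
    # Best-scoring pairs first; ties fall back to index order.
    pairs.sort(key=lambda p: (-p[0], p[1], p[2]))

    result = [None] * len(ocr_outputs)
    used_names = set()
    for _score, i, j in pairs:
        # Take a pair only if both the OCR string and the name are still free.
        if result[i] is None and j not in used_names:
            result[i] = roster[j]
            used_names.add(j)
    return result


if __name__ == "__main__":
    roster = ["Alice Smith", "Bob Jones", "Carol White"]  # made-up roster
    ocr = ["A1ice Srnith", "B0b Jonas", "Caro1 Whlte"]    # made-up OCR noise
    print(assign_names(ocr, roster))
```

If the greedy pass turns out to be too crude, the same score table could be fed to a proper assignment solver instead, but that is more code again.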

