I trained Tesseract (based on the eng language) to work with a particular customer-derived font. When I finished training, I ran it on other image files to recognize characters, and I get errors reading a 0 (zero): it sometimes comes out as an 8, Q, D, or O (the letter). The images I am using do not mix font sizes and contain no lower case.

I have tried using the unicharambigs file, but I'm not sure it is being applied correctly. I also renamed it to eng.unicharambigs, and nothing changed. I even set the Type Indicator value to 1 to make the substitution mandatory, and still nothing changed.

I feel I'm missing something simple, but I have wound myself around this problem to the point where I've put myself in the middle of the forest. If someone can point me in the right direction or give a few pointers, I would appreciate the help.

The files created or used:

Trained Data: https://drive.google.com/file/d/0B3S3PcVl6aznbHh5TW9HX3FCdlU/view?usp=sharing
Image: https://drive.google.com/file/d/0B3S3PcVl6azndXRvcGhCRlJRUWc/view?usp=sharing
Box: https://drive.google.com/file/d/0B3S3PcVl6aznYWNZaUJ2RVFDU2s/view?usp=sharing
unicharambigs: https://drive.google.com/file/d/0B3S3PcVl6aznUHdKX2RucExBaXM/view?usp=sharing
font_properties: https://drive.google.com/file/d/0B3S3PcVl6aznTVl6OUNvRnh0LXM/view?usp=sharing

Thanks,
Jim

