Hi Doug,

> Anybody have instructions on training Tesseract that actually work?

Sorry you're having trouble training Tesseract. The instructions at TrainingTesseract3 should work. They are rather long and detailed, but many people here have followed them and created useful training files as a result. Could you be more specific about what you've tried and what failed?

> And I am still not clear why I have to create a new "language"? I have a number
> of bitmap (not truetype) English fonts that Tesseract does a mediocre job on
> "out of the box". All I want to do is add these couple fonts and work with
> them. I suppose if the files used to create the English training data were
> available I could add them to the English language files. But they don't appear
> to exist? Or a tool to extract them from the training data file???

Unfortunately the box & tif files for the English training aren't available, which is why you can't simply add extra fonts to it. You could, however, replace just the character shape parts of the training (re-using the dawgs, unicharambigs, etc.) by replacing the parts built from the .tr files, using the combine_tessdata tool.

> (Yes, I am a *tiny* bit frustrated at the crap documentation at this point)

Hopefully you'll find us here helpful enough to offset that, and hopefully we'll be able to improve the documentation based on where it's failing for you.

Nick

P.S. Note that the documentation has been changed very recently, as the way training is done will change a little in the next version of Tesseract.

-- 
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

