Unpack the eng.traineddata file with the combine_tessdata command (its -u option). I did so and found several -dawg files, along with the unicharset and unicharambigs files. Use the dawg2wordlist command to convert the -dawg files back into plain-text lists (wordlist, freqlist, etc.). However, you cannot recover the font_properties and .box files this way.
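Roughly, the unpacking step looks like the sketch below. It only builds the command lines and does not run them, so you can check them first. The helper names, the "out/eng." prefix, and the ".txt" output suffix are my own assumptions for illustration, and the component name "word-dawg" is what I would expect from a Tesseract 3.x eng.traineddata; yours may differ.

```python
import shlex


def unpack_cmd(traineddata, prefix):
    # combine_tessdata -u <traineddata> <output prefix>
    # writes each component as <prefix><component>, e.g. out/eng.unicharset
    return ["combine_tessdata", "-u", traineddata, prefix]


def dawg_to_wordlist_cmd(prefix, dawg):
    # dawg2wordlist <unicharset> <dawg file> <wordlist output>
    # The ".txt" suffix for the output file is an arbitrary choice.
    return [
        "dawg2wordlist",
        prefix + "unicharset",
        prefix + dawg,
        prefix + dawg + ".txt",
    ]


if __name__ == "__main__":
    # Hypothetical paths; adjust to where your traineddata lives.
    for cmd in (
        unpack_cmd("eng.traineddata", "out/eng."),
        dawg_to_wordlist_cmd("out/eng.", "word-dawg"),
    ):
        print(shlex.join(cmd))
```

To actually execute a command, pass the list to subprocess.run(cmd, check=True) once the Tesseract training tools are on your PATH.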
The only way you can add to eng.traineddata is by adding new words to the wordlist, new ambiguity rules to the unicharambigs file, or new bigram entries to the bigram file. You cannot unpack the .box files and add a few more the way you would when training a new language.
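For the wordlist case, the round trip back into eng.traineddata might be sketched as follows: merge your new words into the extracted list, rebuild the dawg with wordlist2dawg, then overwrite that component with combine_tessdata -o. The helper names and file paths here are hypothetical, and the script only prints the commands rather than running them.

```python
import shlex


def merge_words(existing, new_words):
    # Keep the original order and append only words not already present.
    seen = set(existing)
    merged = list(existing)
    for word in new_words:
        if word not in seen:
            seen.add(word)
            merged.append(word)
    return merged


def rebuild_dawg_cmd(wordlist, prefix, dawg):
    # wordlist2dawg <wordlist file> <dawg output> <unicharset>
    return ["wordlist2dawg", wordlist, prefix + dawg, prefix + "unicharset"]


def overwrite_cmd(traineddata, components):
    # combine_tessdata -o <traineddata> <component files...>
    # replaces the listed components inside the existing traineddata file.
    return ["combine_tessdata", "-o", traineddata, *components]


if __name__ == "__main__":
    # Hypothetical example values.
    print(merge_words(["apple", "pear"], ["pear", "quince"]))
    print(shlex.join(rebuild_dawg_cmd("out/eng.word-dawg.txt", "out/eng.", "word-dawg")))
    print(shlex.join(overwrite_cmd("eng.traineddata", ["out/eng.word-dawg"])))
```

Work on a copy of eng.traineddata, since the -o option modifies the file in place.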

