Unpack the eng.traineddata file with the combine_tessdata command (its -u
option). I did so and found several -dawg files, plus the unicharset and
unicharambigs files. Use the dawg2wordlist command to convert the -dawg
files back into wordlist, freqlist, etc. However, you cannot recover the
font_properties and .box files.
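
For illustration, the unpacking steps could be wrapped roughly like this. It
is only a sketch: the function names, the work/eng. prefix and the output
file names are my own assumptions, and the component names you actually get
depend on your Tesseract version.

```shell
#!/bin/sh
# Sketch only: function names, paths and prefixes are illustrative assumptions.

# Unpack every component of a traineddata file.
# $1 = path to eng.traineddata
# $2 = output prefix, e.g. work/eng. (components are written as <prefix><name>)
unpack_traineddata() {
    combine_tessdata -u "$1" "$2"
}

# Convert one unpacked DAWG component back into a plain-text word list.
# $1 = the same prefix used when unpacking
# $2 = component name, e.g. word-dawg or freq-dawg
# $3 = text file to write
dawg_to_wordlist() {
    # dawg2wordlist takes the unicharset first, then the dawg, then the output.
    dawg2wordlist "${1}unicharset" "${1}${2}" "$3"
}
```

Typical use would be `unpack_traineddata eng.traineddata work/eng.` followed
by `dawg_to_wordlist work/eng. word-dawg wordlist.txt`, with the work
directory created beforehand.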

The only way you can extend eng.traineddata is by adding new words to the
wordlist, new ambiguity rules to the unicharambigs file, or bigram rules
to the bigram file. You cannot unpack the .box files and add a few more,
the way you do when training a new language.
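
A rough sketch of the word-list route, assuming the components were unpacked
with a prefix such as work/eng. and that the word list was already dumped to
a text file. The function names and file names are hypothetical. Work on a
copy of eng.traineddata, since the last step overwrites a component in place.

```shell
#!/bin/sh
# Sketch only: function names, paths and prefixes are illustrative assumptions.

# Merge extra words into an existing word list, dropping duplicates.
# $1 = word list to update in place, $2 = file with the new words
merge_words() {
    LC_ALL=C sort -u -o "$1" "$1" "$2"
}

# Rebuild the word DAWG from the edited list and put it back.
# $1 = prefix of the unpacked components, e.g. work/eng.
# $2 = edited word list (plain text, one word per line)
# $3 = traineddata file to update
rebuild_word_dawg() {
    # wordlist2dawg takes the word list, the dawg to write, then the unicharset.
    wordlist2dawg "$2" "${1}word-dawg" "${1}unicharset" &&
    # -o overwrites just the named component inside the traineddata file.
    combine_tessdata -o "$3" "${1}word-dawg"
}
```

For example: `merge_words wordlist.txt extra.txt`, then
`rebuild_word_dawg work/eng. wordlist.txt eng.traineddata`.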
