On Tue, May 01, 2012 at 11:29:43AM -0700, Falke wrote:
> The above, of course, would beg the question: Can you just swap out
> the dictionary component of traineddata? I am assuming one can. (So
> as not to have to retrain from scratch)

You should be able to, yes: use combine_tessdata to extract the components, wordlist2dawg to create new dictionary files (see the training wiki page), and combine_tessdata again to recombine the training data with the new dictionary.

An alternative would be to specify a custom dictionary file in a config, and not use the dictionaries in the existing training file. This is explained well in the "CONFIG FILES AND AUGMENTING WITH USER DATA" section of
http://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html

Nick
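
For reference, the extract / rebuild / recombine sequence could be sketched roughly as below. This is only a sketch under assumptions: the file names (eng.traineddata, my_words.txt), the work directory and the helper function are made up for illustration, and the -u flag and the wordlist2dawg argument order are as I understand them from the 3.0x tools, so check them against the usage output of your installed version. The helper only builds the command lines; it does not run anything.

```python
def dictionary_swap_commands(traineddata, wordlist, workdir, lang="eng"):
    """Build the three command lines (as argv lists) for a dictionary swap.

    All paths are hypothetical placeholders. Nothing is executed here.
    """
    # Component files share a common prefix, e.g. work/eng.
    prefix = f"{workdir}/{lang}."
    return [
        # 1. Unpack every component of the traineddata file under the prefix.
        ["combine_tessdata", "-u", traineddata, prefix],
        # 2. Build a new word DAWG from a plain word list (one word per line),
        #    using the unicharset that step 1 unpacked.
        ["wordlist2dawg", wordlist, prefix + "word-dawg", prefix + "unicharset"],
        # 3. Recombine all components found under the prefix into a new
        #    traineddata file (written as <prefix>traineddata).
        ["combine_tessdata", prefix],
    ]


if __name__ == "__main__":
    for argv in dictionary_swap_commands("eng.traineddata", "my_words.txt", "work"):
        print(" ".join(argv))
```

Review the printed commands and run them by hand (or pass each list to subprocess.run with check=True) once the work directory exists. Keep a backup of the original traineddata file before replacing it with the recombined one.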

