"Now, if you pass the word *bazaar* as a trailing command line parameter to Tesseract, Tesseract will not bother loading the system dictionary nor the dictionary of frequent words and will load and use the eng.user-words and eng.user-patterns files you provided. The former is a simple word list, one per line. The format of the latter is documented in dict/trie.h on read_pattern_list()."
I did not quite understand. I have done what you suggested.

On Tuesday, 11 February 2014 12:11:07 UTC, Nick White wrote:
>
> Hi Carl,
>
> > I haven't been able to figure out how to do either of those, but I get the
> > feeling that is the wrong direction.
>
> No, it sounds right, and you're nearly there. The relevant
> documentation for you is the "CONFIG FILES AND AUGMENTING WITH USER
> DATA" section of the manual[0].
>
> So, call your word list eng.user-words, put it in the tessdata
> directory, then create a config file called 'customwords' in the
> tessdata/configs directory, with the following contents:
>
> load_system_dawg F
> load_freq_dawg F
> user_words_suffix user-words
>
> Note that when I say "the tessdata directory", I mean a directory
> that by default will probably be /usr/share/tesseract-ocr/tessdata.
>
> Hope that helps.
>
> Nick
>
> 0. http://tesseract-ocr.googlecode.com/svn-history/trunk/doc/tesseract.1.html#_config_files_and_augmenting_with_user_data
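
For anyone following along, the steps in Nick's message can be sketched in Python. This is only a rough illustration, not something from the thread: the helper names `write_custom_dictionary` and `tesseract_command` are hypothetical, the example word list is made up, and the demo writes into a temporary directory instead of the real tessdata directory (which, as noted above, will probably be /usr/share/tesseract-ocr/tessdata and usually needs root access). It assumes the usual `tesseract imagename outputbase [-l lang] [configfile...]` invocation, with the config name passed as the trailing parameter.

```python
from pathlib import Path
import tempfile

# The three settings quoted in the message above.
CONFIG_LINES = [
    "load_system_dawg F",
    "load_freq_dawg F",
    "user_words_suffix user-words",
]


def write_custom_dictionary(tessdata_dir, words, lang="eng", config_name="customwords"):
    """Write <lang>.user-words and configs/<config_name> under tessdata_dir.

    Hypothetical helper: it only creates the two files described in the
    thread and does not call Tesseract itself.
    """
    tessdata = Path(tessdata_dir)
    configs = tessdata / "configs"
    configs.mkdir(parents=True, exist_ok=True)

    # Simple word list, one word per line.
    word_file = tessdata / f"{lang}.user-words"
    word_file.write_text("\n".join(words) + "\n", encoding="utf-8")

    # Config file that disables the built-in dictionaries and names the suffix.
    config_file = configs / config_name
    config_file.write_text("\n".join(CONFIG_LINES) + "\n", encoding="utf-8")
    return word_file, config_file


def tesseract_command(image, output_base, lang="eng", config_name="customwords"):
    """Build the argument list; the config name goes last on the command line."""
    return ["tesseract", image, output_base, "-l", lang, config_name]


if __name__ == "__main__":
    # Demo in a throwaway directory; point tessdata_dir at the real
    # tessdata directory to use the files with Tesseract.
    with tempfile.TemporaryDirectory() as tmp:
        word_file, config_file = write_custom_dictionary(tmp, ["bazaar", "tesseract"])
        print(word_file.read_text(encoding="utf-8"))
        print(config_file.read_text(encoding="utf-8"))
    print(" ".join(tesseract_command("page.png", "out")))
```

The resulting argument list can be handed to `subprocess.run` once the two files are in the real tessdata directory. If the custom words still seem to be ignored, it is worth checking that the file name prefix matches the language passed with `-l`.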

