2010/8/1 Zdenko Podobný:
> On 30.07.2010 15:04, patrickq wrote:
>
> > This is what I did:
> >
> > 1. Created a text file called eng.user-words, containing:
> >    Chest
> >    Chestnut
> >    Floor
> >    Vice
> >
> > 2. Placed the file in the tessdata folder (next to eng.traineddata).
> >
> > 3. Ran recognition on an image, which returned "Chesf" instead of "Chest"
> >    and "Fioor" instead of "Floor". Both the mistaken "f" and "i" look quite
> >    right visually, so I can only assume their confidence level would be
> >    low (but I didn't check).
> >
> > No effect whatsoever - zip. I can only assume that a variable must be
> > set or a function needs to be called to turn this on (even though
> > there is no mention of needing to set anything in the documentation)
> > or (most likely) I just don't understand how this works and the
> > dictionary kicks in only on the day of the summer solstice and when
> > there is a full moon or something.
>
> I played with strace & grep and found out that the user dictionary is not
> used (opened) in a standard installation (svn revision 447).
>
> When I set the variable "global_user_words_suffix" to "user-words" (or
> something else you like ;-) ), tesseract opened the user dictionary file.
>
> global_user_words_suffix can be found in 2 files:
> dict/dict.h: extern STRING_VAR_H(global_user_words_suffix, "user-words", "A list of user-provided words.");
> dict/permute.cpp: STRING_VAR(global_user_words_suffix, "", "A list of user-provided words.");
>
> I believe the problem is in dict/permute.cpp, which defines this variable
> as an empty string.

Seems right; the *_VAR and *_VAR_H declarations are usually 'balanced'. I put it back in r448.

-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.
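
To illustrate the mismatch discussed above, here is a minimal C++ sketch. It is not Tesseract's code: the DEMO_* macros, demo_user_words_suffix, UserWordsPath and Demo are hypothetical names. It assumes that the header-side macro only declares the variable (so the default written there is never applied) and that an empty suffix means no user word list is looked up, which is consistent with the strace result but not checked against the source.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the *_VAR_H / *_VAR macro pair (assumption, not
// Tesseract's real macros). The header-side macro only declares the variable,
// so the default written there is documentation only.
#define DEMO_STRING_VAR_H(name, val, comment) extern std::string name
// The source-side macro defines the variable, so its value is the effective default.
#define DEMO_STRING_VAR(name, val, comment) std::string name = val

// "Header": advertises "user-words" as the default.
DEMO_STRING_VAR_H(demo_user_words_suffix, "user-words", "A list of user-provided words.");
// "Source file": actually initialises the variable to an empty string.
DEMO_STRING_VAR(demo_user_words_suffix, "", "A list of user-provided words.");

// Returns the path of the user word list, or an empty string when no suffix
// is configured (assumed behaviour: nothing is opened in that case).
std::string UserWordsPath(const std::string& datadir, const std::string& lang,
                          const std::string& suffix) {
  if (suffix.empty()) return "";
  return datadir + lang + "." + suffix;
}

// Walks through both cases; call it from main() to try the sketch out.
void Demo() {
  // With the empty definition-side default, no file path is produced.
  assert(UserWordsPath("tessdata/", "eng", demo_user_words_suffix).empty());
  // Once the variable is set, the path resolves to the user's word list.
  demo_user_words_suffix = "user-words";
  assert(UserWordsPath("tessdata/", "eng", demo_user_words_suffix) ==
         "tessdata/eng.user-words");
}
```

Under these assumptions, making the two defaults agree (as in r448) has the same effect as setting the variable by hand: eng.user-words next to eng.traineddata would then be picked up without extra configuration.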

