I ran into this problem recently. Here is the solution (I'm using Tesseract 3.01).

To use a user-words list, find user_words_suffix in dict.h and dict.cpp and change its default value from "" to "user-words", then rebuild Tesseract:

// dict.h
STRING_VAR_H(user_words_suffix, "user-words", "A list of user-provided words.");

// dict.cpp
STRING_INIT_MEMBER(user_words_suffix, "user-words", "A list of user-provided words.", getImage()->getCCUtil()->params()),

This assumes that your tessdata folder contains a file named "eng.user-words" with your own word list.

.bj.

On Sep 27, 8:03 am, Slavko Kocjancic wrote:
> Hello...
>
> I have a question about user-words.
> I use eng.traineddata and OCR works well. But the problem is that the text
> has a lot of foreign names, and those are not recognized correctly. So I
> tried making a file eng.user-words in the same directory as eng.traineddata
> and put those names in the file, one name per line. Then I ran OCR again,
> but there was no difference. So the question is:
> is it enough to just create the file eng.user-words, or does something else
> need to be done?
>
> Thanks.
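
P.S. If it still seems to have no effect, it may be worth checking that the word list really sits where it is expected. Below is a minimal sketch of such a check, assuming the file name is simply <language>.<suffix> inside the tessdata folder (as in "eng.user-words" above) and that paths use forward slashes. UserWordsPath and FileIsReadable are hypothetical helpers of my own, not part of the Tesseract API.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not part of Tesseract): builds the path where the
// user word list is assumed to live, i.e. <tessdata_dir>/<lang>.<suffix>.
std::string UserWordsPath(const std::string& tessdata_dir,
                          const std::string& lang,
                          const std::string& suffix) {
  std::string path = tessdata_dir;
  // Add a separator only if the directory does not already end with one.
  if (!path.empty() && path.back() != '/') path += '/';
  return path + lang + "." + suffix;
}

// Hypothetical helper: returns true if the file can be opened for reading.
bool FileIsReadable(const std::string& path) {
  std::ifstream in(path.c_str());
  return in.good();
}
```

For example, calling FileIsReadable(UserWordsPath("tessdata", "eng", "user-words")) from a small test program tells you whether tessdata/eng.user-words can be opened. It says nothing about whether Tesseract actually loads the list.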

