I'm not sure, but this may be related to the "tessedit_ok_mode" parameter, which can take values from 0 to 5 in the current revision (590), or to some other parameter near it in "tesseractclass.h". Giving a better answer would probably require some experimentation with your data. If you're not sure what to do with this information, send your data and image files along with the full set of command lines you used, and perhaps someone here will be able to answer.

Warm regards,
Dmitri Silaev
www.CustomOCR.com

On Wed, Jul 27, 2011 at 9:33 PM, Worms <[email protected]> wrote:
> I wonder why lang.user-words is more accurate than
> lang.freq-dawg&lang.word-dawg though they were both made by almost same
> word_list file.
> -------------------------------------------------------------------------
> case 1 :
> word_list.txt -> change name -> lang.user-words
>
> case 2 :
> word_list.txt('word_list.txt', which was same file used in case1.) ->
> change name -> frequent_words_list.txt
> ->wordlist2dawg frequent_words_list lang.freq-dawg lang.unicharset
>
> and
> wordlist2dawg words_list lang.word-dawg lang.unicharset ->(this
> lang.word-dawg file is only add 2 words)
>
> ------------------------------------
> I thought 'case 2' supposed to be better than 'case 1' or same as
> 'case 1' in its result. However, the result was worse about 'case2'
> than about 'case1' and I want to know why that happened.
>
> thank you for reading!!
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>

