There is (was) a bug in wordlist2dawg. It was losing whole subtrees when a prefix word followed a longer word. This is fixed in the new 2.04 code in svn.

Ray.

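For anyone who cannot move to 2.04 yet, here is a minimal sketch for checking whether a word list contains the pattern described above, i.e. a word that appears later in the list than a longer word it is a prefix of. The function name and the sample words are hypothetical and are not part of tesseract. Sorting the list puts every prefix ahead of its extensions, but whether that avoids the bug in older wordlist2dawg builds is an assumption and is not confirmed in this thread.

```python
def find_prefix_after_longer(words):
    """Return (prefix, longer_word) pairs where `prefix` appears later in
    `words` than a word it is a proper prefix of.

    Hypothetical helper for inspecting a word list; not a tesseract API.
    """
    first_longer = {}  # proper prefix -> first earlier word that starts with it
    hits = []
    for word in words:
        # The word was already recorded as a proper prefix of an earlier word.
        if word in first_longer:
            hits.append((word, first_longer[word]))
        # Record every proper prefix of this word for later lookups.
        for i in range(1, len(word)):
            first_longer.setdefault(word[:i], word)
    return hits


if __name__ == "__main__":
    sample = ["cats", "cat", "dog", "do", "door"]
    print(find_prefix_after_longer(sample))
    # Lexicographic sorting places each prefix before its extensions.
    print(find_prefix_after_longer(sorted(sample)))
```

If the first call reports any pairs, those words are candidates for the missing-lookup symptom; an empty result after sorting only shows that the ordering pattern is gone, not that the dawg is correct.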
On Wed, Jan 14, 2009 at 7:30 AM, Piranha <[email protected]> wrote:
>
> Hi, I'm using tesseract on linux and have trouble with the IsValidWord
> function from the API.
>
> Because I rewrote tesseract to support multiple dictionaries I need to
> use IsValidWord to find out in which dictionary the word has been
> found. For the dictionaries I used wordlist2dawg on my word lists.
>
> Now some words can be found and others in the same dictionaries
> cannot. Creating the dawg does not yield any warning or error so I
> suppose it is created correctly. I did not rewrite anything beyond the
> valid_word function in permute.cpp. Are there parameters which abort a
> search beforehand or anything similar? I cannot figure out, how this
> is possible. Anybody made similar experiences with the dawg?

