Hi All, After having used the suggested algoritms for a few weeks I found that the suggestions were not completely to my liking;
1. small words (1,2 or 3) characters were never corrected. Especially three letter words and abbreviations suffered. 2. often used misspelings in Dutch words between 4 and 5 characters were missed. E.g. 'fiets' was suggested as a possible spell suggestion for 'feits' since no matching 3gram exist between the two. The same held true for misspellings based on 'ch' and 'g' both being the same sound in Dutch but written differently. 3. words that could never be part of a suggestion were added based on a single matchting n-gram. (e.g. if I ask for suggestions on 'per' then tupperware is also suggested. But solely based on length it is clear that it has a minimal distance of 7. This is how i solved it. I introduced two new fields to the ngram index 1. length (LEN) 2. firstletterlastcharacter (FLLC) At index time I use the following rules: for all words the length is stored; for all words of 1,2,3 character i store the 1 grams for all words of 4 character i store the 2 grams and the FLLC field for all words of 5 and 6 character i store the 3 grams and the FLLC field for all words of 7 or more characters i store the 3 grams When searching I limit the search to those fields that have a length [goal.length - maxd, goal.length+d] and based on the possible outcomes I apply the 1,2,3 and the FLLC field same as at index time. The results are ordered first on distance and second on the difference in length. So if two suggestions share the levenhstein distance with the goal then the suggestion that has the lowest (goal.length - suggestion.length) gets priority. The result for me is a much more natural spellcheck result. I am not at all sure if these results translate to English. But with this approach I can guarantee that all spellingmistakes that involve changing two characters and all spelling mistakes that involve ommitting or adding one character are found (or at least that is what i think now ;-). Just want to thank the people again that responsed to my original mail, your great suggestions have greatly improved the user and technical performance of my solution. -- Aad Nales --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]