On Wed, Jun 25, 2003, Oron Peled wrote about "Re: kde":
> Is there a "soundex" algorithm for Hebrew? If there is, then we
> can precompute soundex values for all dictionary words and store
> them in some sorted data structure (say, a dbm file). Then we
> can present all words with the same soundex.

Think for a moment about why "soundex" was invented for English. The
problem in English is that sometimes you know a word, say "enough", and
you know how to say it, but you have forgotten how to write it. You try
"eenaf", "inuf", "eenoph", and these have hardly any letters in common
with the correct word "enough"; they just sound the same. So in English,
changing a few letters of the misspelling is rarely enough to arrive at
the correct word.

In Hebrew, this problem is not very serious. Apart from a few letter
substitutions that Hspell already attempts (het/khaf, he/`ain/aleph,
bet/vav, and a few more), and the addition or deletion of immot qri'a
(vav, yud, aleph), which Hspell also tries, spelling in Hebrew is very
consistent with the way a word sounds. This is why adding or deleting
immot qri'a and exchanging het/khaf, etc., is enough in Hebrew.

There are other kinds of errors that are not spelling mistakes but
rather typos: an accidentally missing or accidentally added consonant,
two transposed letters, two separate words run together, etc. Hspell
does not currently attempt to correct such errors, because I believed
that the user could easily correct them without Hspell's help, and that
automatic correction might list too many irrelevant suggestions. Maybe
in the future I'll rethink this.

-- 
Nadav Har'El                        |  Wednesday, Jun 25 2003, 25 Sivan 5763
[EMAIL PROTECTED]                   |-----------------------------------------
Phone: +972-53-245868, ICQ 13349191 |Willpower: The ability to eat only one
http://nadav.harel.org.il           |salted peanut.
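
To make the correction strategy described in the message concrete, here
is a rough sketch in Python. It is not Hspell's actual code: the
function names (`candidates`, `suggest`), the one-word dictionary, and
the restriction to a single edit per suggestion are all assumptions made
for illustration, and only the letter groups named in the message are
included (the message says Hspell knows "a few more").

```python
# Illustrative sketch only; NOT Hspell's implementation.
# Hebrew letters are written as Unicode escapes to avoid bidi display issues.
ALEPH, BET, HE, VAV = "\u05d0", "\u05d1", "\u05d4", "\u05d5"
HET, YUD, KHAF, AYIN = "\u05d7", "\u05d9", "\u05db", "\u05e2"

# Groups of letters that can sound alike (only those named in the message).
SIMILAR = [
    {HET, KHAF},
    {HE, AYIN, ALEPH},
    {BET, VAV},
]

# Immot qri'a: vav, yud, aleph.
IMMOT_QRIA = (VAV, YUD, ALEPH)


def candidates(word):
    """Return all strings exactly one edit away from `word`.

    An edit is: swapping a letter for a similar-sounding one, deleting
    one of the immot qri'a, or inserting one of them at any position.
    """
    out = set()
    for i, ch in enumerate(word):
        # Swap a letter for another member of its sound group.
        for group in SIMILAR:
            if ch in group:
                for other in group - {ch}:
                    out.add(word[:i] + other + word[i + 1:])
        # Delete a vav, yud, or aleph.
        if ch in IMMOT_QRIA:
            out.add(word[:i] + word[i + 1:])
    # Insert a vav, yud, or aleph at every possible position.
    for i in range(len(word) + 1):
        for m in IMMOT_QRIA:
            out.add(word[:i] + m + word[i:])
    out.discard(word)
    return out


def suggest(word, dictionary):
    """Keep only the candidates that appear in `dictionary` (a set)."""
    return sorted(candidates(word) & dictionary)


# Hypothetical one-word dictionary: "shulchan" (shin vav lamed het nun).
SHULCHAN = "\u05e9" + VAV + "\u05dc" + HET + "\u05df"
DICTIONARY = {SHULCHAN}

# Misspelling with khaf in place of het.
print(suggest("\u05e9" + VAV + "\u05dc" + KHAF + "\u05df", DICTIONARY))
# Spelling with the vav left out.
print(suggest("\u05e9" + "\u05dc" + HET + "\u05df", DICTIONARY))
```

Both calls recover the dictionary word. The candidate set grows only
linearly with the length of the word, which is why no phonetic index
is needed for this class of errors.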
