Nadav Har'El wrote:
> 1. Aspell does not (or at least we didn't figure out how to) support prefixes,
> so instead of a 125,000 word word list (in this release) we had to multiply
> this by the number of prefixes (he, shin, etc. - about 20 prefixes in all)
> and the resulting over-million-word list took ages to load into aspell
> (hspell is much faster, even when written in Perl!).
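The blow-up described in the quoted text can be sketched roughly as follows. This is only an illustration, not how hspell or aspell actually work: the prefix list and the words are transliterated placeholders chosen for the example, and real Hebrew rules about which prefixes may attach to which words (and about stacked prefixes) are ignored.

```python
# Hypothetical illustration only: prefixes and words are transliterated
# placeholders, not real hspell or aspell data.
PREFIXES = ["ha", "she", "ve", "be", "le", "me", "ke"]  # assumed subset


def expand_word_list(base_words, prefixes):
    """Precompute every prefix+word combination (the 'multiply' approach)."""
    expanded = set(base_words)
    for word in base_words:
        for prefix in prefixes:
            expanded.add(prefix + word)
    return expanded


def is_known(word, base_words, prefixes):
    """Check a word by stripping one candidate prefix at lookup time."""
    if word in base_words:
        return True
    return any(
        word.startswith(p) and word[len(p):] in base_words
        for p in prefixes
    )


base = {"bayit", "sefer", "yeled"}
expanded = expand_word_list(base, PREFIXES)

# The expanded list is bounded by len(base) * (len(PREFIXES) + 1) entries.
print(len(base), len(expanded))
print(is_known("habayit", base, PREFIXES))
```

The first function grows the list by roughly the number of prefixes, while the second keeps only the base list and does a few extra set lookups per word.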
If anybody is curious, I learned the following from Melingo and/or
Prof. Choueka (I don't remember exactly who, but I think both agree
with it):

<off-topic, academic stuff>

If "Tzurot" is the term for all the variations of Hebrew words,
including all the combinations of prefixes AND SUFFIXES (which Nadav
and Dan didn't count), then there are about 70 million Tzurot in Hebrew.

Many Tzurot have never been used. Some of them may become popular in
the future, so it is not a good idea to just harvest a zillion Hebrew
texts and corpora and build a dictionary of everything. Such a
dictionary would suffer from two problems:

1. Many Tzurot would still be missing from it.
2. If it is really big, it will include many mistakes.

</off-topic, academic stuff>

-- 
Eli Marmor
[EMAIL PROTECTED]
CTO, Founder
Netmask (El-Mar) Internet Technologies Ltd.
__________________________________________________________
Tel.:   +972-9-766-1020          8 Yad-Harutzim St.
Fax.:   +972-9-766-1314          P.O.B. 7004
Mobile: +972-50-23-7338          Kfar-Saba 44641, Israel
