Re: Important News

Nadav Har'El Mon, 24 Feb 2003 01:19:04 -0800

On Mon, Feb 24, 2003, Eli Marmor wrote about "Re: Important News":
> hspell will have to wait...
> (in any case, hspell is still in the process of being ported to C, and
> as long as it isn't ready, I can't integrate it).


Note that a C version of the hspell front-end can probably be ready in
less than 2 weeks, if I shift my focus to it (or if someone else does it).
The major algorithms are ready (with a 50-fold decrease in start-up time
and 6-fold decrease in memory use). The most complicated thing I'm working
on now is how to specify which words can sensibly take which prefixes (this
feature was missing in the Perl version, and I want the C version to have
it; if push comes to shove, I can write a C version without this feature).

Anyway, our current focus is on expanding the dictionary. This is very
important because if we assume that words are distributed in a power-law
distribution (and I think this assumption is close to being true), then
doubling the number of words in the dictionary *squares* the chance of a
false negative (not recognizing a correct word). So if release 0.1 had 120,000
inflections, and in a typical document (based on some experiment I did) 10% of
the distinct correct words in the document were not recognized, then doubling
the number of words to 240,000 (we're nearing that mark) should hopefully
reduce the false negatives to 1%.
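
To make the arithmetic concrete: if the squaring rule above is taken at face
value, the miss rate at N inflections works out to m0^(N/N0), where m0 is the
miss rate measured at N0 inflections. Here is a minimal Python sketch of that
extrapolation. The function name and the intermediate 180,000 point are only
illustrative assumptions; the 120,000 / 10% starting figures are the ones
quoted above, and the whole thing is only as good as the squaring assumption.

```python
def projected_miss_rate(n_words, base_words=120_000, base_miss=0.10):
    """Miss rate implied by the rule 'doubling the dictionary squares the miss rate'.

    Assumption: the rule holds smoothly between doublings, so the miss rate
    at n_words is base_miss ** (n_words / base_words).
    """
    return base_miss ** (n_words / base_words)


if __name__ == "__main__":
    # 180,000 is a made-up intermediate point, just to show the trend.
    for n in (120_000, 180_000, 240_000):
        rate = projected_miss_rate(n)
        print(f"{n:>7} inflections: {rate:.1%} of distinct correct words missed")
```

At 240,000 inflections this reproduces the 1% figure (0.10 squared); it is an
extrapolation, not a measurement.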

Please tell me when you are getting serious about integrating hspell with
OpenOffice, and I *will* shift my focus to finishing the C interface. Anyway,
I guess that you'll have some OpenOffice-specific work to do to support
a Hebrew spellchecker (any spellchecker), so you don't have to wait for
my side of the work to be finished before you start yours.

-- 
Nadav Har'El                        |      Monday, Feb 24 2003, 22 Adar I 5763
[EMAIL PROTECTED]             |-----------------------------------------
Phone: +972-53-245868, ICQ 13349191 |Ways to Relieve Stress #10: Make up a
http://nadav.harel.org.il           |language and ask people for directions.
