On Sun, 15 Dec 2002, Nadav Har'El wrote:

> On Sun, Dec 15, 2002, Oded Arbel wrote about "Re: Announcement: free Hebrew
> Spell-Checker":
> > Can you please expand on the reasons you chose not to incorporate the
> > generated word lists as a language package to some existing spell checker
> > (such as myspell or aspell), thereby making it immediately useful for "end
> > users"?
>
> During our tests, we got both ispell and aspell to work with our word lists.
> However, this approach proved both impractical and limited because:
>
>  1. Aspell does not (or at least we didn't figure out how to) support prefixes,
>     so instead of a 125,000-word list (in this release) we had to multiply
>     this by the number of prefixes (he, shin, etc. - about 20 prefixes in all)
>     and the resulting over-million-word list took ages to load into aspell
>     (hspell is much faster, even when written in Perl!).

  However, you could submit a *patch* to aspell that replaces the
word-checking routine for Hebrew.

  BTW - any plans to create a CPAN module, Lingua::HE::Spell (or the
like)? If so, I suggest offering a tied hash for checking words (in
addition to a separate function). That way, programs based on simple hashed
wordlists will still work with minimal change.
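
  To make the tied-hash idea concrete, here is a rough sketch. It is in
Python rather than Perl, with a read-only mapping standing in for the tied
hash. The class name SpellHash, the transliterated prefix list, and the
sample words are all made up for illustration; they are not hspell's actual
data or interface. A lookup succeeds if the word, or the word minus one known
prefix, is in the base list, so the prefixed forms never need to be stored:

```python
from collections.abc import Mapping

# Hypothetical, transliterated stand-ins for the Hebrew prefixes.
# The empty string means "no prefix".
PREFIXES = ("", "h", "sh", "v", "b", "k", "l", "m")


class SpellHash(Mapping):
    """Read-only mapping that looks like a plain hashed wordlist.

    words[w] is True when w, or w with one known prefix stripped,
    appears in the base list; otherwise KeyError is raised.
    """

    def __init__(self, base_words, prefixes=PREFIXES):
        self._base = frozenset(base_words)
        self._prefixes = tuple(prefixes)

    def _known(self, word):
        # Try every prefix; accept if the remainder is a base word.
        return any(
            word.startswith(p) and word[len(p):] in self._base
            for p in self._prefixes
        )

    def __getitem__(self, word):
        if self._known(word):
            return True
        raise KeyError(word)

    # Iteration and length cover only the stored base words,
    # not the (much larger) set of accepted prefixed forms.
    def __iter__(self):
        return iter(self._base)

    def __len__(self):
        return len(self._base)


words = SpellHash(["bayit", "sefer"])
print("hbayit" in words)   # membership test, as with a plain dict
print(words.get("xbayit")) # .get() also works, inherited from Mapping
```

  A program that only does `word in words` or `words.get(word)` would not
need to change. Note that this naive version accepts any listed prefix on
any word, which is looser than the real Hebrew prefix rules.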

>  2. In hspell I could add our home-brew code which (for example) recognizes
>     acronyms and correct gimatria, while adding it into aspell will require
>     a lot of cooperation with the aspell project.
>
> You must understand that out of the time Dan and I spent on this project,
> only about 1% (!) went into writing the "hspell" program (a Perl script,
> actually). 99% of the work went into writing the inflection programs for
> nouns and verbs, and building the dictionaries, and this was the actual
> important work, work that nobody has done before us (in Hebrew and free).

  Did you look at the work of Erel Segal
(http://www.cs.technion.ac.il/~erelsgl) and his morphological analyzer?

[snip]

> BTW, if you look at our TODO, you'll see that one of the plans for some
> future release is to write a C library for interfacing with the word lists;
> This C library, once written (by us or by someone else), could be used
> from aspell, pspell, OpenOffice, kword, or whatever.

  Do you intend to release the C library under the LGPL or the GPL? Also,
don't forget a Perl module!

  Alon

-- 
This message was sent by Alon Altman ([EMAIL PROTECTED]) ICQ:1366540
The RIGHT way to contact me is by e-mail. I am otherwise nonexistent :)
