Daniel Naber <daniel.na...@languagetool.org> wrote:

> Hi,
>
> yesterday I tried to update the English dictionary that LT includes. The
> details are documented at
> https://github.com/languagetool-org/languagetool/issues/329 but in a
> nutshell: our spell checking is so complicated that the dictionary
> update didn't work.
>
> We really need a process that allows us to use hunspell
> dictionaries directly, without conversion to other formats. The original
> reason we don't use hunspell (or only parts of it) is that it's slow,
> especially when it comes to generating suggestions. Today I ran a test
> with hunspell 1.4.1 and LT, and it turns out LT is about 4-5 times
> faster.
>
> What could be a solution:
>
> A) Improve hunspell to be faster. We'd need someone who can do this and
> then we'd still rely on native code, which isn't what we want in Java
> (but we've lived with it for years now).
>
> B) Finally write a Java-based spell checker that can read hunspell
> dictionaries. The internet is full of spell checkers, but we need one
> with support for advanced features like compound words (important for
> German).
>
> C) I don't know, do you have an idea?
>
> If we cannot find a solution, the current situation will persist so that
> some dictionaries probably won't be updated.


If Hunspell is thread-safe (?), could we compute the suggestions for
several words in parallel, spread across multiple threads?

Dominique

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
