2013/11/26 Daniel Naber <list2...@danielnaber.de>

> On 2013-11-26 15:27, Jaume Ortolà i Font wrote:
>
> > Look at these wordlists [1]. They are Apache 2.0. The words are
> > classified in 256 ranges.
>
> > [1] https://github.com/mozilla-b2g/gaia/tree/master/keyboard/dictionaries
>
> The German one looks okay. Unless the quality is a serious problem for
> some language, I'd suggest simply using these lists.


I think the quality won't be a serious problem (even with the tokenization
differences in Catalan). The goal is just to avoid common words (usually
short ones) being hidden by dozens of other uncommon words in spelling
suggestions. So these wordlists seem good enough.
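As a rough illustration of that goal (not LanguageTool's actual code), the
sketch below reorders spelling candidates by frequency class. It assumes a
hypothetical mapping from word to a class in 0..255, with higher meaning more
common; the function name and the sample data are made up for the example.

```python
def rank_suggestions(candidates, freq_class):
    """Return candidates ordered so that more frequent words come first.

    freq_class is a hypothetical dict mapping word -> class in 0..255
    (assumed here: higher means more common). Words missing from the
    dict are treated as class 0. sorted() is stable, so candidates with
    the same class keep their original relative order.
    """
    return sorted(candidates, key=lambda word: -freq_class.get(word, 0))


# Made-up sample data, for illustration only.
sample_classes = {"the": 255, "then": 200}
ranked = rank_suggestions(["thee", "the", "then"], sample_classes)
```

With this ordering a common short word is no longer buried behind rarer
candidates that the speller happens to generate first.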

Now, we need Marcin to say something about how to add this data to the FSA
dictionaries. I guess we just need to add an extra field (with a separator)
after the POS tag.
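
To make the idea concrete, here is a minimal sketch of reading a plain-text
dictionary line that carries such an extra field. Everything about the layout
is an assumption for illustration: tab-separated form, lemma and POS tag, and
a "|" separator before the frequency class. The real format is for Marcin to
decide.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    form: str
    lemma: str
    pos_tag: str
    freq_class: Optional[int]  # None when the line has no frequency field


def parse_entry(line, field_sep="\t", freq_sep="|"):
    """Parse one dictionary line of the assumed shape form<TAB>lemma<TAB>tag|freq.

    Both separators are hypothetical defaults. Lines without the extra
    field still parse, so old and new data could coexist.
    """
    form, lemma, tag_field = line.rstrip("\n").split(field_sep)
    pos_tag, found, freq = tag_field.partition(freq_sep)
    return Entry(form, lemma, pos_tag, int(freq) if found else None)


# Made-up example lines, with and without the frequency field.
with_freq = parse_entry("cases\tcasa\tNCFP000|187")
without_freq = parse_entry("cases\tcasa\tNCFP000\n")
```

The separator would have to be a character that never occurs inside a POS tag.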

Regards,
Jaume
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
