> On 2014-10-27 10:26, R.J. Baars wrote:
>
>> I was able to make a file though. It is 3 MB uncompressed.
>>
>> You can download it from dev.taaltik.nl/is.okay.zip
>
> Thanks, what was the exact command you used to create this list?

Several commands, plus some manual editing.

First I converted it to UTF-8.
I removed the po: flags.
I changed the tab characters into spaces.
Then I unmunched.
I used sed to remove the trailing flags that this creates, as well as
the trailing numbers.
Then I added my own collection of Icelandic words.
And finally I used hunspell -G to generate a list of accepted words from it.
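
For anyone who wants to repeat the text clean-up steps, here is a rough
sketch in Python rather than the sed commands I actually used. It makes
assumptions: the source encoding (ISO-8859-1 here) is a guess, the function
names are made up for illustration, and the "trailing flags" are taken to
look like "/AB" and the "trailing numbers" like " 12" at the end of a line.
Adjust the pattern to whatever your unmunch output really contains.

```python
import re


def to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode dictionary bytes as UTF-8 (the source encoding is an assumption)."""
    return raw.decode(source_encoding).encode("utf-8")


def prepare_for_unmunch(line: str) -> str:
    """Turn tabs into spaces and drop any field that starts with 'po:'."""
    tokens = line.replace("\t", " ").split()
    return " ".join(t for t in tokens if not t.startswith("po:"))


# Assumed residue: an optional "/FLAGS" part, then an optional trailing number.
_RESIDUE = re.compile(r"(/\S*)?(\s+\d+)?\s*$")


def strip_unmunch_residue(line: str) -> str:
    """Remove trailing flags and trailing numbers from one expanded line."""
    return _RESIDUE.sub("", line, count=1)
```

The cleaned lines, plus any extra word collection, can then be fed to
hunspell -G as described above.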

There is another trick to enhance it a bit more: you could run all the
collected words of all languages through Hunspell with the Icelandic
dictionary, and catch the suggestions to add to the list.

I did not do that.
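
If someone wants to try it, a minimal sketch could look like the following.
It assumes Hunspell's ispell-compatible pipe mode (hunspell -a), where a
line starting with "& " carries the suggestions after the colon. The
dictionary name "is_IS" and both function names are assumptions for
illustration, and I have not run this against the real dictionary.

```python
import subprocess


def parse_suggestions(pipe_output: str) -> set[str]:
    """Collect the suggestions from ispell-style pipe output.

    Only lines of the form '& word count offset: sug1, sug2' carry
    suggestions; every other line is ignored.
    """
    suggestions: set[str] = set()
    for line in pipe_output.splitlines():
        if not line.startswith("& "):
            continue
        _, _, tail = line.partition(": ")
        suggestions.update(s.strip() for s in tail.split(",") if s.strip())
    return suggestions


def collect_suggestions(words, dictionary="is_IS"):
    """Feed words to 'hunspell -a' and return every suggestion it makes.

    Assumes hunspell is on PATH and that the named dictionary is installed.
    """
    result = subprocess.run(
        ["hunspell", "-a", "-d", dictionary],
        input="\n".join(words) + "\n",
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,
    )
    return parse_suggestions(result.stdout)
```

The returned set would still need a pass through hunspell -G, since
suggestions made for foreign words are not guaranteed to be useful entries.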


I had a go at Galician, but that one is even worse. I am quite sure it is
not as good either, since I saw quite a few Dutch proper names in it...

Nevertheless, I can generate something Catalan using Spanish and
Portuguese as input, catching the suggestions and using them as words.
It is the best I can do with this set...



Ruud







>
> Regards
>   Daniel
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Languagetool-devel mailing list
> Languagetool-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>



