> On 2014-10-27 10:26, R.J. Baars wrote:
>
>> I was able to make a file though. It is 3 Mb uncompressed.
>>
>> You can download it from dev.taaltik.nl/is.okay.zip
>
> Thanks, what was the exact command you used to create this list?

Multiple commands, plus manual editing.

First I converted the file to UTF-8.
I removed the po: flags.
I changed the tab characters into spaces.
Then I unmunched it.
I used sed to remove the trailing flags that this creates, as well as trailing numbers.
Then I added my own collection of Icelandic words.
Finally, I used hunspell -G to generate a list of accepted words from it.

There is another trick to enhance it a bit more: you could run all collected words of all languages through Hunspell using the Icelandic dictionary, catch the suggestions, and add those to the list again. I did not do that.

I had a go at Galician, but that one is even worse. I am quite sure it is not as good either, since I saw quite a few Dutch proper names in it...

Nevertheless, I can generate something for Catalan using Spanish and Portuguese as input, catching the suggestions and using them as words.

It is the best I can do with this set...

Ruud

>
> Regards
> Daniel
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Languagetool-devel mailing list
> Languagetool-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
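
For anyone wanting to repeat the steps in the reply above, here is a minimal Python sketch of the text cleanup, the hunspell -G filter, and the suggestion-catching trick. It is not the exact procedure used for the file: the original work was done with sed and manual editing, and the sed expressions are not given in the thread. The regular expressions, the function names, and the dictionary name "is_IS" are all assumptions made for illustration. The suggestion parser assumes the ispell-style output of hunspell -a, where a line starting with "&" lists suggestions after a colon.

```python
import re
import subprocess

# Assumed patterns; the sed expressions actually used are not given in the thread.
PO_FIELD = re.compile(r"\s+po:\S+")      # morphological field such as " po:noun"
TRAILING_FLAGS = re.compile(r"/\S*$")    # affix flags such as "/AB" at end of line
TRAILING_DIGITS = re.compile(r"\d+$")    # digits left at the end of a word


def clean_entry(line):
    """Strip tabs, po: fields, trailing flags and trailing digits from one line."""
    text = line.replace("\t", " ")
    text = PO_FIELD.sub("", text).strip()
    text = TRAILING_FLAGS.sub("", text)
    text = TRAILING_DIGITS.sub("", text)
    return text.strip()


def clean_entries(lines):
    """Clean every line and drop the ones that end up empty."""
    cleaned = (clean_entry(line) for line in lines)
    return [word for word in cleaned if word]


def accepted_words(words, dictionary="is_IS"):
    """Keep only the words that `hunspell -G` accepts.

    Requires the hunspell binary and an installed dictionary; the
    dictionary name "is_IS" is an assumption.
    """
    proc = subprocess.run(
        ["hunspell", "-d", dictionary, "-i", "utf-8", "-G"],
        input="\n".join(words) + "\n",
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return proc.stdout.splitlines()


def parse_suggestions(pipe_output):
    """Collect suggestions from `hunspell -a` output.

    Only lines of the form "& word count offset: sugg1, sugg2" carry
    suggestions; every other line is ignored.
    """
    suggestions = []
    for line in pipe_output.splitlines():
        if line.startswith("& ") and ": " in line:
            tail = line.partition(": ")[2]
            suggestions.extend(s.strip() for s in tail.split(", ") if s.strip())
    return suggestions
```

A typical run would feed the unmunched word list through clean_entries, append any extra words, and pass the result to accepted_words. For the suggestion trick, words from other languages would be piped through hunspell -a with the target dictionary, and the output of parse_suggestions added back to the list before filtering again.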