I am currently exporting word frequencies for all languages I have collected over the years.
These frequency lists are 'dirty', meaning that no check has been done on whether the words are correct. That will be handled by the speller anyway. Spell checker maintainers could also use the lists as input.

The counting is case- and accent-specific, so Maxima, maxima and Máxima (the capitalised plural of 'the most', the same word in lowercase, and the name of our queen) are counted separately. The data is completely UTF-8 encoded. The gaia header is a little bit off; I had no description for the language at hand, and there is no real use for a version. But there should be no problem using the data to add frequency classes to Morfologikspeller for LT.

If you can spare a bit of computer time on a workstation, you could help collect this kind of data by running a tiny Java application: data.taaltik.nl/tool/TaalTik.jar

Is there anyone who could help me figure out an appropriate license? (I don't know anything about those.)

Ruud
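
The case- and accent-specific counting described above can be sketched as follows. This is only a minimal illustration in Python, not the code used by TaalTik.jar: the function name `count_words`, the tokenising regex and the tab-separated output are all assumptions made for the example. The only normalisation applied is Unicode NFC, which unifies different encodings of the same accented character without removing the accent.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical tokenizer: runs of Unicode letters (no digits, no underscores).
WORD_RE = re.compile(r"[^\W\d_]+")


def count_words(text: str) -> Counter:
    """Count word forms exactly as written: no lowercasing, no accent stripping."""
    # NFC merges "a" + combining acute into the single character "á",
    # so both encodings are counted as the same word form.
    text = unicodedata.normalize("NFC", text)
    return Counter(WORD_RE.findall(text))


sample = "Maxima maxima Máxima maxima"
counts = count_words(sample)

# Print one "word<TAB>count" line per distinct form, most frequent first.
for word, n in counts.most_common():
    print(f"{word}\t{n}")
```

Because nothing is lowercased or stripped of accents, the three spellings end up as three separate entries, and misspelled words would be counted just like correct ones.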