Hi Lars, and all,

The current German dictionary maintained by Björn Jacke has 80,000 basic forms which expand to 300,000 variations, a factor of 3.75. Swedish, Danish and Norwegian form basic words (including compounds) the same way German does. Basic words can often be translated syllable by syllable, so the number of basic forms should be about the same. But the Scandinavian languages attach endings to the word where German uses a separate definite article (the/der/die/das), which results in a larger number of expanded variations.

If we're into statistics: the Polish dictionary has something like 3.5 million expanded forms and about 300,000 base forms. The quality of the dictionary is excellent.
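
To put the two sets of figures side by side, here is a quick sketch in Python that computes the expansion factor (expanded forms per base form) for each dictionary. The numbers are the ones quoted in this mail, and the Polish ones are only approximate; the names `dictionaries` and `expansion_factor` are mine, purely for illustration.

```python
# Figures as quoted in this mail; the Polish counts are approximate.
dictionaries = {
    "German": {"base_forms": 80_000, "expanded_forms": 300_000},
    "Polish": {"base_forms": 300_000, "expanded_forms": 3_500_000},
}


def expansion_factor(base_forms, expanded_forms):
    """Average number of expanded forms generated per base form."""
    return expanded_forms / base_forms


factors = {
    name: expansion_factor(**counts) for name, counts in dictionaries.items()
}

for name, factor in factors.items():
    print(f"{name}: {factor:.2f} expanded forms per base form")
```

With these inputs the German factor comes out at 3.75 and the Polish one at roughly 11.7, so the Polish list expands about three times as much per base form.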

How was that achieved? Simple: set up a local Scrabble-like community and develop a Scrabble dictionary using the Scrabble players' linguistic competence. It's incredibly efficient.

Then you simply tweak the Scrabble dict to your needs (like removing rare and confusing forms).
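
As a rough illustration of that tweaking step, the sketch below drops a hand-maintained list of unwanted forms from a Scrabble word list. It is only a guess at what such a filter could look like, not how the Polish dictionary was actually processed; the function name `tweak_wordlist` and the sample words are made up for the example.

```python
def tweak_wordlist(scrabble_words, unwanted):
    """Return the Scrabble words minus the unwanted ones, sorted and de-duplicated.

    Comparison is case-insensitive; blank entries are ignored.
    """
    unwanted_folded = {word.strip().casefold() for word in unwanted}
    kept = set()
    for word in scrabble_words:
        word = word.strip()
        if word and word.casefold() not in unwanted_folded:
            kept.add(word)
    return sorted(kept)


# Made-up sample data: "aa" and "qi" stand in for forms that are valid in
# Scrabble but too rare or confusing for a spell-checking dictionary.
scrabble_words = ["house", "aa", "tree", "qi", "house", ""]
unwanted = ["AA", "qi"]

spelling_words = tweak_wordlist(scrabble_words, unwanted)
print(spelling_words)
```

Keeping the unwanted forms in a separate list means the Scrabble list itself can keep growing through the community, and the filter is simply re-run for each new dictionary release.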

I recommend this kind of technique to all l10n teams and dict developers. Look at www.kurnik.pl to see how the site is managed; there is some info on the dict at www.kurnik.pl/dictionary.

Best regards, and happy holidays,
Marcin
