[native-lang] Re: [l10n-dev] Re: [website-dev] Re: [l10n-dev] Site and tools infrastructure : irc meeting proposal

2007-10-16 Thread Marcin Miłkowski

Kazunari Hirano wrote:

Hi,

On 10/16/07, Rafaella Braconi [EMAIL PROTECTED] wrote:

My suggestion: October 20-21 from 13 h UTC

+1
I prefer Saturday to Sunday :)


+1 for Saturday

(We have parliamentary elections in Poland on Sunday, and they're pretty 
important, so I won't be available).


Regards,
Marcin

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



[native-lang] Re: [lingu-dev] Re: [native-lang] Status update season!

2006-12-22 Thread Marcin Miłkowski

Hi Lars, and all,


The current German dictionary maintained by Björn Jacke has 80,000 
basic forms which expand to 300,000 variations, for a factor of 
3.75.  Swedish/Danish/Norwegian have the same way to form basic 
words (with compounds) as German.  Basic words can often be 
translated syllable by syllable, so the number of basic forms 
should be about the same. But the Scandinavian languages use 
endings instead of the definite article (the/der/die/das), 
resulting in a larger number of expanded variations.


If we're into statistics, then the Polish dictionary has something like 
3.5 million expanded forms and about 300,000 base forms. The quality of 
the dictionary is excellent.
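
To put the two sets of figures side by side, here is a rough sketch of the arithmetic. It uses only the approximate counts quoted in this thread; the function name `expansion_factor` is just an illustrative helper, not part of any dictionary tool.

```python
def expansion_factor(expanded_forms: int, base_forms: int) -> float:
    """Average number of expanded word forms generated per base form."""
    if base_forms <= 0:
        raise ValueError("base_forms must be positive")
    return expanded_forms / base_forms

# Approximate counts as quoted in this thread (not measured here).
german = expansion_factor(300_000, 80_000)
polish = expansion_factor(3_500_000, 300_000)

print(f"German: {german:.2f} forms per base form")
print(f"Polish: {polish:.2f} forms per base form")
```

With these rounded inputs, the Polish dictionary expands roughly three times more per base form than the German one.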


How was that achieved? Simple: set up a local Scrabble-like community 
and develop a Scrabble dictionary using the Scrabble players' linguistic 
competence. It's incredibly efficient.


Then you simply tweak the Scrabble dict to your needs (like removing 
rare and confusing forms).
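
A minimal sketch of what such a tweak could look like, assuming the Scrabble dictionary is available as a plain list of word forms and that someone has compiled a separate list of forms to drop. Both lists and the helper name `prune_wordlist` are hypothetical; this is not the actual process used for the Polish dictionary.

```python
def prune_wordlist(words, excluded):
    """Return the sorted, de-duplicated words that are not in `excluded`."""
    excluded_set = {w.strip() for w in excluded}
    # Drop empty entries and duplicates while stripping stray whitespace.
    kept = {w.strip() for w in words if w.strip()}
    return sorted(kept - excluded_set)

# Placeholder tokens standing in for real word forms.
scrabble_forms = ["alpha", "beta", "beta", "gamma", "delta", ""]
rare_or_confusing = ["gamma"]

spell_forms = prune_wordlist(scrabble_forms, rare_or_confusing)
print(spell_forms)
```

Keeping the exclusion list in its own file means it can be reapplied whenever the Scrabble word list is updated.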


I recommend this technique to all l10n teams and dictionary developers. 
Look at www.kurnik.pl to see how the site is managed; 
www.kurnik.pl/dictionary has some info on the dict.


Best regards, and happy holidays,
Marcin
