eleonora46 wrote:

> If both the above are true, then the spell checker 
> did a really good work.

Have you tried computing these numbers for your own German 
dictionary and comparing them with the numbers for the other 
German dictionaries, from Björn Jacke and Franz Michael Baumann?  
German is one of the few languages where more than one free 
dictionary is available, so it could be a good test case.  Since 
you continue to work in parallel, I guess each of you is convinced 
that you do a better job than the others?  How do you measure or 
compare this?

German is also a good test case for another reason: many people 
in Europe (such as me) know it as their third language, after 
their native language and English.

> The recognition of obscure words is more the area of grammar checkers,
> they should mark obscure words being similar to often used,
> misspelled words.

This note on obscure words connects to what Kevin wrote:

> > cases, like the obscure word "yor" in English, should clearly 
> > not be included since they are most likely to be a misspelling 
> > of a common word.

It seems we would need statistics on how common "yor" (or should 
that be "yore"?) is when used correctly and how common it is as a 
misspelling of "your" (or "you're").  It is easy enough to find 
statistics on word frequencies, but how or where can we find 
statistics on errors?  A simple Google search finds 2.59 billion 
hits for "your" and 4.17 million for "yore", but I cannot tell 
which of the "yore" occurrences are errors.  There are also 4.37 
million (!) hits for "yor", but they seem to be a film title, a 
surname, various company names and the ISO language code for 
Yoruba.  The first obviously nonstandard usage I find is "all yor 
base r blong 2 us", which is apparently stylistic and not a 
mistake.

One idea for finding statistics on errors is to compare changes 
made to Wikipedia articles.  The complete text revision history is 
available from download.wikimedia.org.  All you need to do is step 
through the revisions and compile statistics on all the small 
changes, such as "yor" being changed to "your".  Has anybody done 
this?
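
To make the idea a bit more concrete, here is a rough Python sketch 
of the counting step.  It is only an illustration under my own 
assumptions: it takes pairs of (old text, new text) for consecutive 
revisions that have already been extracted from the dump (parsing 
the dump itself is not shown), and it treats a change as a likely 
spelling correction when one word is replaced by a similar-looking 
word.  The function names and the similarity threshold of 0.7 are 
made up for this example and are not taken from any existing tool.

```python
import difflib
import re
from collections import Counter

# A word is a run of letters, optionally with one internal apostrophe
# (so "you're" stays a single token).
WORD = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")


def tokenize(text):
    """Split text into word tokens, dropping punctuation and digits."""
    return WORD.findall(text)


def similar(a, b, threshold=0.7):
    """Heuristic: two words look alike enough to be a spelling fix.

    The threshold is an arbitrary assumption and would need tuning.
    """
    ratio = difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= threshold


def word_corrections(old_text, new_text):
    """Yield (before, after) pairs for small word-for-word replacements."""
    old, new = tokenize(old_text), tokenize(new_text)
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only consider spans where N words were replaced by N words.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            for before, after in zip(old[i1:i2], new[j1:j2]):
                if before != after and similar(before, after):
                    yield before, after


def count_corrections(revision_pairs):
    """Count how often each (before, after) replacement occurs."""
    counts = Counter()
    for old_text, new_text in revision_pairs:
        counts.update(word_corrections(old_text, new_text))
    return counts


if __name__ == "__main__":
    pairs = [
        ("This is yor book.", "This is your book."),
        ("I lost yor keys", "I lost your keys"),
        ("in days of yore", "in days of old"),  # rewording, not a typo fix
    ]
    for (before, after), n in count_corrections(pairs).most_common():
        print(f"{before} -> {after}: {n}")
```

A similarity filter like this cannot tell a corrected typo from a 
deliberate change between two similar words (for example "affect" to 
"effect"), so the counts would only be a starting point for the kind 
of statistics discussed above.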

Another idea is to make OpenOffice.org report all corrections made 
by users worldwide to some centralized database.  I guess this 
would conflict with users' interest in their own privacy.


-- 
  Lars Aronsson ([EMAIL PROTECTED])
  Aronsson Datateknik - http://aronsson.se
