On Friday 03-08-2007 at 13:18 [timezone +0200], Stefan
Baltzer wrote:
> Thank you for the hint, Daniel.
> 
> The first part of QAing them will indeed be to get an overview:
>   - What dicts do we have?
>   - What dicts are "maintained", who is responsible?
>   - Which one is the latest and/or greatest?
>   - What are the differences within one language (e.g. Dutch "green" vs.
> "white", German "old" vs. "new"...)?
>   - How do we get rid of "outdated" ones?
>   - What "quality level" do they have? This is
> yet-to-be-defined-for-each-language, e.g. amount of "errors",
> usefulness of proposals, memory and speed performance...
> 
> ... since "QAing dictionaries" is no "industrial standard procedure", we 
> must find a way to get some kind of "structure" to do so.
> It will be interesting to get "linguists" into QA work.
> I heard that some native lang teams have a more or less strict 
> separation between localisation and QA.
> 
> I am looking forward to working in this "grey zone" this fall :-)
> 
> Regards
> Stefan


Hi Stefan

About the 11 official languages of South Africa, the information is
summarised here: http://translate.org.za/content/view/1610/54/

This page mentions the quality and should give a link to the newest
version.

Pavel helped a lot recently to get the Afrikaans checker into the
official builds, but the others are only hosted by ourselves.

translate.org.za maintains all of these, although many of them really
haven't received much attention at all yet. Basically all the non-English
languages needed hunspell features to some extent, so at least
improvement is possible now, but resources are scarce. For some of them
really big morphology work is necessary and up to now we just haven't
been able to dedicate time to it.

You can contact me if you need any information on any of these 11
checkers.

Although there have been some orthography changes for some of the
languages, these don't need to be considered yet, since our support for
either orthography is not all that good.

We also have the initial files for a Swahili checker in our version
control, but I'm not aware of any work on that since it was initially
created. 

All these languages use our unified build system for maintaining a
single word list and from that creating OpenOffice.org packs and
Mozilla .xpi files (also some support for aspell and ispell). Recently
there was some progress in getting hunspell into the Mozilla projects,
so hopefully we'll see it as part of Firefox 3 and Thunderbird 3.

About a QA procedure:
Somebody did a review of some Afrikaans spell checkers a while ago
(including ours) and devised a few simple metrics. It is written in
Afrikaans, but I can help with the basic idea or put you in contact with
the author.
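
As a rough illustration of what "a few simple metrics" for a spell
checker could look like, here is a minimal Python sketch. It is an
assumption for discussion only, not the metrics used in the Afrikaans
review mentioned above. The function name checker_metrics, the two
metric names and the toy word lists are all hypothetical; a real test
would plug in an actual checker and proper test lists.

```python
from typing import Callable, Dict, Iterable


def checker_metrics(
    accepts: Callable[[str], bool],
    valid_words: Iterable[str],
    misspelled_words: Iterable[str],
) -> Dict[str, float]:
    """Score a checker against two hand-made test lists (hypothetical helper).

    lexical_recall: fraction of known-correct words the checker accepts.
    error_recall:   fraction of known-misspelled words the checker flags.
    """
    valid = list(valid_words)
    wrong = list(misspelled_words)
    accepted_valid = sum(1 for w in valid if accepts(w))
    flagged_wrong = sum(1 for w in wrong if not accepts(w))
    return {
        "lexical_recall": accepted_valid / len(valid) if valid else 0.0,
        "error_recall": flagged_wrong / len(wrong) if wrong else 0.0,
    }


# Toy stand-in for a real checker: a plain set lookup (assumption).
toy_lexicon = {"huis", "boom", "water"}
result = checker_metrics(
    lambda w: w in toy_lexicon,
    valid_words=["huis", "boom", "water", "kat"],
    misspelled_words=["huiss", "boomm"],
)
print(result)
```

The same two numbers can be tracked per language and per release of a
word list, which gives at least a crude way to compare checkers.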

While developing the Afrikaans hyphenation, I used some simple metrics
to track my progress, but nothing rigorous.  I might not even have the
exact scripts anymore, but can share the ideas if you are interested.

About thesauri I don't know - we haven't even gotten to that yet.

Keep well
Friedel
