On Friday 03-08-2007 at 13:18 [timezone +0200], Stefan Baltzer wrote:
> Thank you for the hint, Daniel.
>
> The first part of QAing them will indeed be to get an overview:
> - What dicts do we have?
> - What dicts are "maintained", who is responsible?
> - Which one is the latest and/or greatest
> - What are the differences within one language (i.e. Dutch "green" vs.
> "white", German "old vs. new"...)
> - How do we get rid of "outdated" ones?
> - What "quality level" do they have? This is
> yet-to-be-defined-for-each-language, i.e. amount of "errors",
> usefulness of proposals, memory and speed performance...
>
> ... since "QAing dictionaries" is no "industrial standard procedure", we
> must find a way to get some kind of "structure" to do so.
> It will be interesting to get "linguists" into QA work.
> I heard that some native lang teams have a more or less strict
> separation between localisation and QA.
>
> I am looking forward to work in this "grey zone" this fall :-)
>
> Regards
> Stefan
Hi Stefan

About the 11 official languages of South Africa: the information is summarised here:
http://translate.org.za/content/view/1610/54/
This page mentions the quality of each checker and should give a link to the newest version. Pavel helped a lot recently to get the Afrikaans checker into the official builds, but the others are only hosted by ourselves.

translate.org.za maintains all of these, although many of them really haven't received much attention yet. Basically all the non-English languages needed hunspell features to some extent, so at least improvement is possible now, but resources are scarce. For some of them really big morphology work is necessary, and up to now we just haven't been able to dedicate time to it. You can contact me if you need any information on any of these 11 checkers. Although there have been some orthography changes for some of the languages, this doesn't need to be considered while our support for either orthography is not all that good.

We also have the initial files for a Swahili checker in our version control, but I'm not aware of any work on that since it was initially created.

All these languages use our unified build system, which maintains a single word list and creates from it OpenOffice.org packs and Mozilla .xpi files (with some support for aspell and ispell as well). Recently there was some progress in getting hunspell into the Mozilla projects, so hopefully we'll see it as part of Firefox 3 and Thunderbird 3.

About a QA procedure: somebody did a review of some Afrikaans spell checkers a while ago (including ours) and devised a few simple metrics. It is written in Afrikaans, but I can help with the basic idea or put you in contact with the author. While developing the Afrikaans hyphenation, I used some simple metrics to track my progress, but nothing rigorous. I might not even have the exact scripts anymore, but can share the ideas if you are interested.

About thesauri I don't know; we haven't even gotten to that yet.
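
To make the QA idea a bit more concrete, here is a minimal sketch of the kind of simple metrics one could track for a spell checker. It is only an illustration: these are not the metrics from the Afrikaans review or from the hyphenation scripts mentioned above, and the names `checker_metrics`, `is_correct` and `suggest` are hypothetical stand-ins for whatever interface a real checker exposes. It covers two of the "quality level" points from the quoted list (amount of errors and usefulness of proposals), not memory or speed.

```python
from typing import Callable, Dict, Iterable, List


def checker_metrics(
    is_correct: Callable[[str], bool],    # hypothetical: True if the checker accepts the word
    suggest: Callable[[str], List[str]],  # hypothetical: ranked suggestions for a word
    good_words: Iterable[str],            # words known to be spelled correctly
    misspellings: Dict[str, str],         # misspelled word -> intended word
    top_n: int = 5,
) -> Dict[str, float]:
    """Compute three simple quality figures for a spell checker."""
    good = list(good_words)
    # Correct words the checker wrongly flags as errors.
    false_alarms = sum(1 for w in good if not is_correct(w))
    # Misspellings the checker actually flags.
    detected = [w for w in misspellings if not is_correct(w)]
    # Flagged misspellings whose intended word is among the top suggestions.
    useful = sum(1 for w in detected if misspellings[w] in suggest(w)[:top_n])
    return {
        "false_alarm_rate": false_alarms / len(good) if good else 0.0,
        "detection_rate": len(detected) / len(misspellings) if misspellings else 0.0,
        "suggestion_hit_rate": useful / len(detected) if detected else 0.0,
    }


# Toy stand-in for a real checker, only so the sketch runs on its own.
WORDS = {"huis", "boom", "water"}


def toy_is_correct(word: str) -> bool:
    return word in WORDS


def toy_suggest(word: str) -> List[str]:
    # Naive suggestion rule: known words that share the first letter.
    return sorted(w for w in WORDS if w[0] == word[0])


metrics = checker_metrics(
    toy_is_correct,
    toy_suggest,
    good_words=["huis", "boom", "kat"],  # "kat" is correct but missing from the toy list
    misspellings={"hius": "huis", "watr": "water", "xoom": "boom"},
)
print(metrics)
```

Tracking figures like these per word-list release would at least show whether a change made things better or worse, even without a rigorous test corpus.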

Keep well
Friedel
