In May [1] I wrote about the status of Finnish spellchecking and about plans 
for creating a WWW application for editing spellchecking dictionaries. The 
status of Finnish spellchecking is currently quite good, and our work is now 
concentrated on the WWW application for maintaining and further developing 
the vocabulary.

A test version of the WWW application "Joukahainen" is now available at
http://joukahainen.lokalisointi.org/
While the current version more or less works, it is still unfinished and, 
unfortunately, available only in Finnish. Most of the interesting features 
also require a user account, so I do not expect the test installation to be 
of much use to most readers of this list. I have, however, reached the point 
where it makes sense to internationalise the application so that it can be 
used for other languages as well. I expect to be able to do this by the end 
of this month. This means that I will translate the application into English 
and replace the Finnish data with the contents of a current English hunspell 
dictionary file. If any language teams are already interested in using this 
application for maintaining their dictionaries, I can provide instructions on 
what to do and where to start at any time; just ask.

The current feature set includes a word editor that can be used to add string 
attributes, flags and alternative spellings to any word. All edit actions 
are logged, and comments can be added in much the same way as in a typical 
Bugzilla installation (the process is just a little less complicated). Words 
can be added either manually or by first storing a list of candidate words in 
the database, which is then used to pre-fill the word entry form (I have 
received some test material from Kevin Scannell's language-recognising web 
crawler, and this feature is built around that kind of data). Once the data 
is in the database, creating a spellchecking dictionary only requires writing 
an exporter for the particular spellchecker. For hunspell this will be quite 
easy.
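
To give a rough idea of what such an exporter involves, here is a minimal 
sketch in Python. It is not Joukahainen's actual code: the function name and 
the (word, flags) record shape are assumptions made for illustration. It 
relies only on the basic hunspell .dic layout, where the first line holds the 
approximate number of entries and each following line is a word, optionally 
followed by a slash and its affix flags. Words that themselves contain a 
slash are not handled here.

```python
def export_hunspell_dic(entries):
    """Render (word, flags) pairs as the text of a hunspell .dic file.

    entries: iterable of (word, flags) tuples, where flags is a string of
    single-character affix flags (possibly empty). This record shape is a
    hypothetical stand-in for whatever the database actually stores.
    """
    lines = []
    # Drop exact duplicates and sort so the output is stable between exports
    for word, flags in sorted(set(entries)):
        # hunspell separates the word from its affix flags with a slash
        lines.append(f"{word}/{flags}" if flags else word)
    # The first line of a .dic file is the (approximate) number of entries
    return "\n".join([str(len(lines))] + lines) + "\n"


if __name__ == "__main__":
    # Example rows; the flag letters are made up and would have to match
    # whatever the accompanying .aff file defines.
    print(export_hunspell_dic([("talo", "AB"), ("ja", "")]), end="")
```

The matching .aff file would still have to be maintained separately; the 
sketch covers only the word list.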

For languages that only need simple affix flags associated with each word, 
this system may seem overly complex. I will try to make it easy to use in 
such cases as well. The main benefit of the system is that it allows 
distributed editing of the word list in a way that records the editing 
history and allows anyone to review the changes.
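
The idea of recording every edit next to the word it changes can be 
illustrated with a small Python sketch. The class, field and action names 
below are hypothetical and say nothing about how Joukahainen actually stores 
its data; the point is only that each change and comment is appended to a 
per-word history that a reviewer can read back in order.

```python
from dataclasses import dataclass, field


@dataclass
class WordEntry:
    """One dictionary word with its flags and a reviewable edit history."""
    word: str
    flags: set = field(default_factory=set)
    # Each history item is (user, action, detail), oldest first
    history: list = field(default_factory=list)

    def add_flag(self, user, flag):
        # Record the change only if it actually modifies the entry
        if flag not in self.flags:
            self.flags.add(flag)
            self.history.append((user, "add_flag", flag))

    def comment(self, user, text):
        # Comments live in the same history as edits, much like a bug log
        self.history.append((user, "comment", text))


if __name__ == "__main__":
    entry = WordEntry("talo")
    entry.add_flag("editor1", "N")
    entry.comment("editor2", "Flag looks correct.")
    for user, action, detail in entry.history:
        print(user, action, detail)
```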

I will write more later, once things are further along and I have something 
working to show in English as well.

Harri

[1] http://lingucomponent.openoffice.org/servlets/ReadMsg?list=dev&msgNo=1806
