Hello,

In a recent message I invited you to help make Verbiste conjugate in
your own language, with little success I must confess. Today I'd like to
invite you to help make Tanglet [1] work fully in your language. Tanglet
is a game in which you have to form as many words as possible from a
random set of letters placed in a grid. Tanglet checks the player's
input against word lists and, at the end of a game, can also show all
the words that could be formed according to these lists.

[1] http://gottcode.org/tanglet/

While translating the interface is easy, since it is already available
on Transifex, this doesn't solve the problem that the version of Tanglet
in Debian Squeeze, and therefore in our development version, can only
check and propose words in Czech, English and French. I checked more
recent versions of Tanglet and found only 3 additional languages: Dutch,
German and Hebrew. Using these would also require adapting their word
lists, because the list format is a bit more complex than it was in our
version of Tanglet.

So I decided to write a script that builds the word list from the
standard dictionaries available in Linux. As I don't like manual
operations, the script does everything needed, from installing the
dictionary package to producing the final, clean word list. I contacted
the upstream author to tell him about my script and he, in turn, gave me
additional scripts to build lists in the new format (not useful for us
yet) and to build the letter dice used to generate the letter grid.

Using this, I've been able to generate the following new languages:
Greek, Spanish, Italian, Norwegian Bokmål, Portuguese (PT and BR),
Romanian, Russian. Note that we can generate lists for non-Latin
alphabets without any problem. I uploaded them to our server [2]; you
can test them in Tanglet by copying the full language directory to the
proper Tanglet data directory [3]. The purpose is to play the game in
your language and tell me whether it seems to work well. NB: training is
required if you want to reach high scores!

[2] http://media.doudoulinux.org/tanglet/
[3] /usr/share/tanglet/data/

However, before going too far, discussions with the upstream author
showed that we have to answer several questions for each language:

* how can we detect proper nouns, which are not allowed in Tanglet (by
their uppercase letters?)
* the same for abbreviations (by their dots?)
* how should we deal with accented letters or any other letter
decoration (convert them all to their undecorated form?)

Currently I discard any word containing dots, commas or separators, as
well as any word containing uppercase letters. Moreover, all
accented/decorated letters are converted to their plain form using the
tool unaccent, which can only convert all of them at once. If your
language requires different rules, please tell me. If you want me to
generate a language that is not on our server yet so that you can test
it, please tell me too!
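
To make these rules concrete, here is a minimal Python sketch of the
filtering described above. It is not the actual script: it swaps in
Python's unicodedata module for the unaccent tool, treats any
non-alphabetic character as a "separator", and removes the duplicates
that appear once accents are stripped. The function names and those
three choices are assumptions made for illustration only.

```python
import unicodedata


def strip_marks(word):
    """Remove combining marks (accents, diacritics) from a word.

    Stand-in for the unaccent tool: decompose each letter, drop the
    combining marks, then recompose what is left.
    """
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def clean_words(words):
    """Apply the filtering rules to an iterable of dictionary entries."""
    seen = set()
    result = []
    for raw in words:
        word = unicodedata.normalize("NFC", raw.strip())
        # Assumption: dots, commas, apostrophes, hyphens, digits and
        # spaces all count as characters that disqualify a word.
        if not word or not word.isalpha():
            continue
        # Any uppercase letter is taken as a sign of a proper noun.
        if word != word.lower():
            continue
        plain = strip_marks(word)
        # Assumption: words that collide after stripping are kept once.
        if plain not in seen:
            seen.add(plain)
            result.append(plain)
    return result
```

Like unaccent, this converts every decorated letter without exception,
so a language in which a decorated letter is a distinct letter of the
alphabet would need an exception list before the strip_marks step.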

Looking forward to hearing from you,
-- 
Cheers,
JM.

_______________________________________________
Doudoulinux-dev mailing list
[email protected]
https://mail.gna.org/listinfo/doudoulinux-dev
