Benjamin Kampmann wrote:
Hi there.

First of all, if you have questions about i18n or want to blame someone
for its bad behaviour, just point to me. I'm the guy who designed it
and also the one who decided on the 3-letter code thingy.

Hi Benjamin,

I just joined the mailing list, so I didn't have access to your original mail and followed up on Philippe's mail instead. Sorry about that. And no, I'm definitely not out to blame anybody. I just wanted to make sure you knew about Babel, since it's a fairly new project.


As I can see, this 3-letter-code decision needs a lot of explanation.
So I'll write a document about it and link it from the
3-letter-code page on the wiki. I'm getting tired of explaining it.

Okay, let's start with your remarks.

-------- CUT --------
Hi,

I'm glad to see that Elisa is getting i18n support again. Have you guys
considered using Babel[1] instead of "just" the stock gettext module? I
might be a bit biased but I think it provides a few interesting
advantages:

 * Not only for message translation. Babel gives access to most of the
CLDR[2] locale database. This includes things like
date/number/currency formatting, locale-specific names for currencies,
countries, languages etc.
During my research for the new system, I also came across this one. But
most of the advantages it has are not used in Elisa. Not now and maybe
never. For the number and currency stuff, we also have the built-in
Python modules [1]. I'm not sure we really need it anyway, but I'll
keep it in mind ;).
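
A minimal sketch of the kind of number formatting the stdlib locale module offers. The helper name format_number is made up for this example, and which locale names are installed depends on the system, so the sketch falls back to the portable "C" locale:

```python
import locale

def format_number(value, locale_name="C"):
    """Format a float with two decimals using the given locale's rules."""
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
    except locale.Error:
        # The requested locale is not installed; fall back to the "C" locale.
        locale.setlocale(locale.LC_NUMERIC, "C")
    # grouping=True inserts the locale's thousands separator, if it defines one.
    return locale.format_string("%.2f", value, grouping=True)

formatted = format_number(1234.5)
```

Note that setlocale changes process-wide state, so a real application would probably set the locale once at startup rather than per call.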

I don't know enough about your future plans for Elisa. But if you're planning to add PVR support, you will probably need a lot of locale-specific data to properly localize the recording scheduler: not only the easy stuff like translating weekdays and formatting dates and times, but also things like "does the week start on Monday or on Sunday in this locale".


 * A pure Python implementation. Does not require GNU gettext for
   message extraction or catalog compilation, which is usually not
   available on non-Linux platforms.
We are not using GNU gettext. We are using a GNU gettext compatible
Python implementation [2]. We are only using Python code, which means
that it is available on every system that has Python. As for extraction,
we currently ship the pot files with each release, so normally the user
does not have to do it on their own.
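
A minimal sketch of catalog loading with the stdlib gettext module. The domain name "elisa", the locale directory and the helper name load_translator are assumptions for illustration, not Elisa's actual layout. With fallback=True, a missing catalog yields the untranslated message instead of raising an error:

```python
import gettext

def load_translator(languages, domain="elisa", localedir="locale"):
    """Return a gettext-style translation function for the given languages."""
    # fallback=True yields a NullTranslations object when no .mo file is found.
    translation = gettext.translation(
        domain, localedir=localedir, languages=languages, fallback=True
    )
    return translation.gettext

# Hypothetical directory with no catalogs, so the message passes through as-is.
_ = load_translator(["de"], localedir="/nonexistent/locale")
message = _("Play")
```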

True, I was talking about the message extraction part. But you're right, that's only an issue for non-Linux developers.


 * gettext compatible api.
well, aren't we?

Yes, I was just pointing out that Babel uses the same api as you currently use, so no need to "rewrite everything".



 * Supports message extraction from python, genshi and glade by default
   but more formats can be added easily using plugins.
No Glade, no Genshi needed. We only have Python.

You see, there is no need for such a system here. Especially if it
requires the user to install additional libs or modules (at least on
Feisty). We have the Python implementation, which is (at this point)
good enough for us.

True, it's up to you to decide whether the benefits are enough to justify an additional dependency. As I said before, I just wanted to make sure you were aware of Babel and the additional benefits it provides compared to the gettext+locale Python modules.



[1]: http://babel.edgewall.org/
[2]: http://unicode.org/cldr/

my 2 cents ;) :
[1] http://docs.python.org/lib/module-locale.html
[2] http://docs.python.org/lib/module-gettext.html


Btw, why are you using a 3-letter language code instead of the more
common language_TERRITORY? Will this not make it impossible to have
different translations for, for example, UK and American English
(en_GB, en_US)?
I always get this example, because it is nearly the only one that fits.
What about Prussian? I mean, your argument does not explain why this
3-letter code exists in the first place. If you could cover all
languages with this 2-letter thing, why would there be an ISO-639-3
at all?

It is very simple: because you won't get all the languages with this
code. In Germany alone, for example, I know without looking it up at
least 6 different languages. Each one is very different and can be
regarded as a language of its own. And that is exactly what ISO-639-3
is about: _all_ languages.

We always think of 'my grandma' as the ultimate user of our
system. And for my grandma it would be a killer feature if the
multimedia system spoke to her in Prussian, a language she spoke
before the 2nd World War when she was a child. Even if no
Prussian translation exists yet, we at least wanted to have a system
that is able to handle it.

In my opinion, we should generally use the 3-letter codes more. And btw,
when looking at Babel, I wasn't sure whether that system offers the
possibility to use the 3-letter code.

I hope it is now easier to understand.

Well, I have no idea what the gettext language code would be for Prussian, but I'm sure there is one. The language_TERRITORY convention used by GNU gettext does not always use 2-letter codes (ISO-639). For more rarely used languages, a 3-letter code (ISO-639-2) is used instead. So as far as I can tell, this convention also covers all languages. Another important detail is that this convention also allows a TERRITORY to be specified, which is very important for determining which locale data to use.

http://www.gnu.org/software/gettext/manual/html_node/Language-Codes.html

Another thing: wouldn't the use of 3-letter codes force you to maintain some kind of translation table between 3-letter codes and the language_TERRITORY convention used by the operating system, in order to set the locale and to detect the locale and language the operating system is using?
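
To make the question concrete, here is a sketch of what such a translation table could look like. The table contents, the default territories and the helper name to_posix_locale are all hypothetical; only the ISO 639-3 codes themselves (deu, eng, fra) are real:

```python
# Hypothetical table: ISO 639-3 code -> (2-letter language, default territory).
ISO639_3_TO_LOCALE = {
    "deu": ("de", "DE"),
    "eng": ("en", "US"),
    "fra": ("fr", "FR"),
}

def to_posix_locale(code, territory=None):
    """Map a 3-letter code to language_TERRITORY, or return None if unknown."""
    entry = ISO639_3_TO_LOCALE.get(code)
    if entry is None:
        return None
    language, default_territory = entry
    # A 3-letter code carries no territory, so one must be supplied or assumed.
    return f"{language}_{territory or default_territory}"
```

The table has to pick a default territory for every entry, because the 3-letter code alone says nothing about which regional variant is meant.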

Cheers,
Jonas
