Re: [translate-pootle] rename and clone language

2012-06-09 Thread Christian PERRIER
Quoting Chris Leonard (cjlhomeaddr...@gmail.com):

These examples are indeed the most common exceptions. Let me add my
own experience with them.

 1)
 en, en_US and en_GB are commonly found because
 a) there are a number of distinct spelling differences (orthography)
 that you do not see with Spanish.
 b) the burden of maintaining the translations is actually quite minimal.
 In Gnome, there is even a PERL script that takes a first pass at this
 for you, with pretty good results.

The main problem here is deciding about the *original*
strings. Depending on the maintainer of the software, they're either
en_US, en_GB or, most often, en_airport (the same English I'm writing
right now...:-)).

Standardi{z|s}ing on one of the two (not to mention en_AU, en_ZA, etc.)
is always a good idea, but requires quite a good knowledge of English,
most often available only to native speakers.

In Debian, we try to standardize on en_US when we can (indeed, when we
have someone reviewing the texts). Not because it's better or
whatever, but mostly because it seems to be the most widely used, so
that requires fewer modifications. We even have a debian-l10n-english
mailing list aimed at encouraging reviews of texts, with a few
volunteers trying to help package maintainers, documentation writers,
our publicity team, etc. to get good texts out.

And we almost never have an en_GB translation, probably
because the often very technical texts we have do not require one.


 2)
 zh_CN and zh_TW are in fact quite different, you never see just zh (in
 my experience).   When present, both zh_CN and zh_TW typically get
 well maintained.  Occasionally one will also see zh_HK, but it is less
 common.

Those two are indeed different *written* forms of the
Chinese language*s* (Simplified, used mostly in mainland China, and
Traditional, used in Taiwan, Hong Kong, Singapore, etc.). Actually we
should use zh@simplified and zh@traditional (modifiers) rather than zh_CN for
Simplified and zh_TW for Traditional (country variants).

Using country variants instead of modifiers for Chinese is common
practice (we used it in Debian) but actually bad practice, as it leaves
people using zh_HK or zh_SG out of the game (they have to add zh_TW as
an alternative in their locale settings).
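
To illustrate the point about zh_HK users, here is a minimal Python
sketch of how a gettext-style lookup might walk a priority list of
locales. The function names (candidates, pick_catalog) and the set of
available catalogs are hypothetical, and real gettext implementations
handle more cases (codesets, aliases), so take it only as an
approximation of the behaviour described above.

```python
def candidates(locale_name):
    """Return catalog names to try for one locale, most specific first.

    Simplified assumption: only language[_TERRITORY][@modifier] is
    handled; codesets such as ".UTF-8" are not.
    """
    base, _, modifier = locale_name.partition("@")
    language, _, territory = base.partition("_")
    names = []
    if territory and modifier:
        names.append(f"{language}_{territory}@{modifier}")
    if territory:
        names.append(f"{language}_{territory}")
    if modifier:
        names.append(f"{language}@{modifier}")
    names.append(language)
    return names


def pick_catalog(preferences, available):
    """Walk a LANGUAGE-style priority list; return the first catalog found."""
    for locale_name in preferences:
        for name in candidates(locale_name):
            if name in available:
                return name
    return None  # nothing matched: the original strings are shown


# Hypothetical project that ships only country variants.
available = {"zh_CN", "zh_TW"}
print(pick_catalog(["zh_HK"], available))           # None
print(pick_catalog(["zh_HK", "zh_TW"], available))  # zh_TW
```

A zh_HK user only gets a translation after listing zh_TW explicitly as
a second choice, which is the workaround mentioned above.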


 
 3)
 pt_PT and pt_BR are very frequently found.  I don't know the
 distinctions well, but when present, they both get well maintained.

Yet another difference established by common practice. Indeed,
Brazilian is different enough from Portuguese to nearly warrant its
own ISO code, which would make things better.

Most software uses pt.po and pt_BR.po files. Using pt_BR and pt_PT
would be a mistake, as continental Portuguese is also used in former
Portuguese colonies and there are locales for some of them.
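
A small Python sketch of why the pt.po / pt_BR.po layout serves more
users. It assumes a simplified lookup that tries the full locale name
and then the bare language code. The lookup function and the pt_AO
example locale are illustrative assumptions, not a description of any
particular gettext implementation.

```python
def lookup(locale_name, available):
    """Try the full locale name first, then the bare language code."""
    language = locale_name.split("_")[0]
    for name in (locale_name, language):
        if name in available:
            return name
    return None  # no catalog: the user sees the original strings


# pt_AO stands in for any Portuguese locale other than pt_PT and pt_BR.
users = ["pt_PT", "pt_BR", "pt_AO"]

# Compare the two possible file layouts for the same three users.
for catalogs in ({"pt", "pt_BR"}, {"pt_PT", "pt_BR"}):
    print(sorted(catalogs), {user: lookup(user, catalogs) for user in users})
```

With pt and pt_BR, the pt_AO user falls back to pt. With pt_PT and
pt_BR, nothing matches for that user under this simplified lookup.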


 
 4)
 Occasionally one will see de_DE and de_CH, although much more rarely,
 and generally with much less completeness on de_CH

I've seen very few of these and I often try to discourage them. From
what I have heard from German and Swiss fellows, the difference is more in
the spoken language than in the written one.

The same stands for French, where the written language is standardized
across French-speaking countries with few enough variants (the best
known being the way of saying the numbers 7x, 8x and 9x, which differs
between France and Belgium/Switzerland).


 
 5)
 fa and fa_AF seem to be an important distinction as well (Iranian
 Persian vs Afghani Dari).

I do not have enough experience with these. In Debian we have a few
Persian translations and all of them use fa.


Another common variant case is bn_IN and bn_BD. I have a very hard
time understanding whether that's for real reasons or for political
reasons. Unfortunately, politics often enters such things and
everything related to languages and countries becomes sensitive one
day or another.

My last story about this concerns the two variants of Serbian: ekavian and
ijekavian. I still remember a meeting at the last Debian conference (in
Banja Luka, Republika Srpska, part of the federation of Bosnia and
Herzegovina) where I was dropped into something that looked like
the Dayton negotiations in the 90's, between Serbs from Serbia and Serbs
from Republika Srpska. Both wanted to do their own work and, believe
me, you really don't want to be in the middle of this..:-)

As a result, we now have sr.po translations for Serbian ekavian
(the form used in Serbia) and sr@ijekavian for ijekavian (the form
used in the Serb part of Bosnia). My proposal to use sr_BS was
completely ruled out by locals for political reasons (even though
they are part of Bosnia, they don't feel that way). My proposal to
use bs (Bosnian), which also exists, was ruled out as well, even
though bs is actually exactly sr@ijekavian but only used by
people in the Muslim part of Bosnia. The conclusion of this meeting
was "Welcome to the Balkans".

Interesting game, isn't it?




Re: [translate-pootle] rename and clone language

2012-06-09 Thread Michael Bauer

09/06/2012 07:05, Christian PERRIER wrote:
 Those two are indeed different *written* forms of the 
 Chinese language*s* (Simplified, used mostly in mainland China, and 
 Traditional, used in Taiwan, Hong Kong, Singapore, etc.). 
Agreed, _HK actually would only make sense if there was a yue_HK locale.
 I've seen very few of these and I often try to discourage them. From 
 what I have heard from German and Swiss fellows, the difference is more in 
 the spoken language than in the written one. 
It's a bit like the crazy situation in China/HK: Swiss German, spoken 
and written, is a linguistically distinct language, but because they 
can't agree on a single written standard, and because for historical 
reasons Standard German is taught in all schools (in the German part), 
they're quite capable of using de_DE.

 Another common variant case is bn_IN and bn_BD. 
I suspect that's a bit like the difference between Urdu and Hindi, which 
are structurally very similar, but (simplifying vastly) Hindi prefers 
Sanskrit-derived roots whereas Urdu prefers Persian/Arabic-derived roots.

On the other hand, if history is anything to go by, most emigrant 
varieties of a language eventually drift to such an extent that a split 
occurs. I'm no judge of how much South American Spanish has drifted but 
from what I know of Portuguese _BR and _PT, I suspect it may have 
drifted considerably, but not necessarily in obvious ways. The 
vocabulary may remain the same but the semantics may have drifted and 
the choice of preferred tenses/forms may also have drifted, both of 
which are less clear than ordenador vs computador to the outsider but 
may be very noticeable to the native.

Being practical, splitting up South American Spanish into a dozen 
locales may be a little over the top though, I somehow can't see the 
written language having drifted THAT much in each country. But how about 
using some suffix for general South American Spanish?

Just a thought.

Michael

--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
___
Translate-pootle mailing list
Translate-pootle@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/translate-pootle


Re: [translate-pootle] rename and clone language

2012-06-09 Thread Leandro Regueiro
On Sat, Jun 9, 2012 at 11:22 AM, Michael Bauer f...@akerbeltz.org wrote:

 Being practical, splitting up South American Spanish into a dozen
 locales may be a little over the top though, I somehow can't see the
 written language having drifted THAT much in each country. But how about
 using some suffix for general South American Spanish?

 Just a thought.

I am not quite sure how things really are, but if you search a
little you will find that the most common locales for Spanish are
es, es_AR, es_MX and es_ES, in that order; then you may find es_VE,
es_CO or even es_CL, each of which I have seen only once in my
life. The es_ES locale appears mainly when another
Spanish locale is also used for localization, but it is also common to see
only es instead of es_ES.

Bye
