Re: [translate-pootle] rename and clone language
On 06/09/2012 03:05 AM, Christian PERRIER wrote: Interesting game, isn't it? I think it's fascinating. The politics of software development :-) I think that information is rather valuable, and it can be useful to put it on some wiki page somewhere. Some kind of How to choose your languages link from http://en.wikipedia.org/wiki/Pootle Michiel -- Michiel Dethmers mich...@phplist.com http://www.phplist.com Open Source newsletter manager -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ ___ Translate-pootle mailing list Translate-pootle@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/translate-pootle
Re: [translate-pootle] rename and clone language
Quoting Chris Leonard (cjlhomeaddr...@gmail.com):

These examples are indeed the most common exceptions. Let me add my experience with them.

1) en, en_US and en_GB are commonly found because a) there are a number of distinct spelling differences (orthography) that you do not see with Spanish. b) the burden of maintaining the translations is actually quite minimal. In Gnome, there is even a PERL script that takes a first pass at this for you, with pretty good results.

The main problem here is deciding about the *original* strings. Depending on the maintainer of the software, they're either en_US, en_GB or, most often, en_airport (the same English I'm writing right now... :-)). Standardi{z|s}ing on one of the two (not to mention en_AU, en_ZA, etc.) is always a good idea, but requires quite a good knowledge of English, most often available to native speakers only.

In Debian, we try to standardize on en_US when we can (indeed, when we have someone reviewing the texts). Not because it's better or whatever, but mostly because it seems to be the most widely used, so it requires fewer modifications. We even have a debian-l10n-english mailing list aimed at encouraging reviews of texts, with a few volunteers trying to help package maintainers, documentation writers, our publicity team, etc. to get good texts out.

And, nearly always, we don't have any en_GB translation. Probably because the often very technical texts we have do not require this.

2) zh_CN and zh_TW are in fact quite different, you never see just zh (in my experience). When present, both zh_CN and zh_TW typically get well maintained. Occasionally one will also see zh_HK, but it is less common.

Those two are indeed different variations in the *written* forms of the Chinese language*s* (Simplified, used mostly in mainland China, and Traditional, used in Taiwan, Hong Kong, Singapore, etc.).
Actually we should use zh@simplified and zh@traditional (modifiers) rather than zh_CN for Simplified and zh_TW for Traditional (country variants). Using country variants instead of modifiers for Chinese is common practice (we used it in Debian) but actually bad practice, as it leaves people using zh_HK or zh_SG out of the game (they have to add zh_TW as an alternative in their locale settings).

3) pt_PT and pt_BR are very frequently found. I don't know the distinctions well, but when present, they both get well maintained.

Yet another difference established by common practice. Indeed, Brazilian is different enough from Portuguese to nearly warrant its own ISO code, which would make things better. Most software uses pt.po and pt_BR.po files. Using pt_BR and pt_PT would be a mistake, as continental Portuguese is also used in former Portuguese colonies and there are locales for some of them.

4) Occasionally one will see de_DE and de_CH, although much more rarely, and generally with much less completeness on de_CH.

I've seen very few of these and I often try to discourage them. From what I heard from German and Swiss fellows, the difference is more in the spoken language than in the written one. The same stands for French, where the written language is standardized across French-speaking countries with few enough variants (the best known is the way to say 7x, 8x, 9x numbers, between France and Belgium/Switzerland).

5) fa and fa_AF seems to be an important distinction as well (Iranian Persian vs Afghani Dari).

I do not have enough experience with these. In Debian we have a few Persian translations and all of them use fa.

Another common variant case is bn_IN and bn_BD. I have a very hard time understanding whether that's for real reasons or for political reasons.

Unfortunately, politics often enters such things, and everything related to languages and countries becomes sensitive one day or another.

My last story about this concerns the two variants of Serbian: ekavian and ijekavian.
I still remember a meeting at the last Debian conference (in Banja Luka, Republika Srpska, part of the federation of Bosnia and Herzegovina) where I was dropped into something that looked like the Dayton negotiations in the 90's, between Serbs from Serbia and Serbs from Republika Srpska. Both wanted to do their own work and, believe me, you really don't want to be in the middle of this... :-)

As a result, we now have sr.po translations for Serbian ekavian (the form used in Serbia) and sr@ijekavian for ijekavian (the form used in the Serb part of Bosnia). My proposal to use sr_BS was completely ruled out by locals for political reasons (even though they are part of Bosnia, they don't feel like this). My proposal to use bs (Bosnian), which also exists, was also ruled out, even though bs is actually exactly s...@ijekavian but only used by people in the Muslim part of Bosnia.

The conclusion of this meeting was "Welcome to the Balkans".

Interesting game, isn't it?
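The difference between a country variant (zh_TW) and a modifier (zh@traditional, sr@ijekavian) is easier to see once a locale name is split into its parts. The following is only a minimal sketch, assuming the usual language[_territory][.codeset][@modifier] layout; the names LocaleName and parse_locale are made up for illustration and are not part of Pootle or gettext.

```python
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    """Hypothetical container for the parts of a locale name."""
    language: str
    territory: Optional[str]
    modifier: Optional[str]


def parse_locale(name: str) -> LocaleName:
    """Split 'll_CC.codeset@modifier' into its parts (the codeset is dropped)."""
    base, _, modifier = name.partition("@")
    base = base.split(".", 1)[0]  # drop ".UTF-8" and similar
    # Accept "zh-TW" as well as "zh_TW", since both spellings occur in practice.
    language, _, territory = base.replace("-", "_").partition("_")
    return LocaleName(language, territory or None, modifier or None)


if __name__ == "__main__":
    for name in ("zh_TW", "zh@traditional", "sr@ijekavian", "pt_BR.UTF-8"):
        print(name, "->", parse_locale(name))
```

With this split, zh_TW carries a territory and no modifier, while zh@traditional carries a modifier and no territory, which is why the latter does not tie the script choice to one country.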
Re: [translate-pootle] rename and clone language
On 09/06/2012 07:05, Christian PERRIER wrote:

Those both are indeed different variations in *written* forms of the Chinese language*s* (Simplified, used mostly in mainland China and Traditional used in Taiwan, Hong-Kong, Singapore, etc.).

Agreed, _HK actually would only make sense if there was a yue_HK locale.

I've seen very few of these and I often try to discourage them. From what I heard of German and Swiss fellows, the difference is more in spoken language than written one.

It's a bit like the crazy situation in China/HK: spoken and written Swiss German is a linguistically distinct language, but because they can't agree on a single written standard and because, for historical reasons, Standard German is taught in all schools (in the German part), they're quite capable of using de_DE.

Another common variant case is bn_IN and bn_BD.

I suspect that's a bit like the difference between Urdu and Hindi, which are structurally very similar but (simplifying vastly) Hindi prefers Sanskrit-derived roots whereas Urdu prefers Persian/Arabic-derived roots.

On the other hand, if history is anything to go by, most emigrant varieties of a language eventually drift to such an extent that a split occurs. I'm no judge of how much South American Spanish has drifted, but from what I know of Portuguese _BR and _PT, I suspect it may have drifted considerably, though not necessarily in obvious ways. The vocabulary may remain the same, but the semantics may have drifted and the choice of preferred tenses/forms may also have drifted, both of which are less clear than ordenador vs computador to the outsider but may be very noticeable to the native.

Being practical, splitting up South American Spanish into a dozen locales may be a little over the top though; I somehow can't see the written language having drifted THAT much in each country. But how about using some suffix for general South American Spanish? Just a thought.
Michael
Re: [translate-pootle] rename and clone language
On Sat, Jun 9, 2012 at 11:22 AM, Michael Bauer f...@akerbeltz.org wrote:

Being practical, splitting up South American Spanish into a dozen locales may be a little over the top though, I somehow can't see the written language having drifted THAT much in each country. But how about using some suffix for general South American Spanish? Just a thought.
I am not quite sure how things really are, but if you search a little you will find that the most common locales for Spanish are es, es_AR, es_MX and es_ES, in this order. After those you may find es_VE, es_CO or even es_CL, each of which I have only seen once in my life. The es_ES locale appears mainly when another Spanish locale is also used for localization, but it is also common to see only es instead of es_ES.

Bye
[translate-pootle] rename and clone language
Hello Everyone

My translators would like to split off Argentinian from Spanish, in order to account for the language differences. I currently have an es language. http://translate.phplist.com/

So, I'd like to rename es to es_ES and clone es to es_AR, so that the Argentinians continue on the translation from where es is now.

Is there an easy way to do this using Pootle?

Thanks

--
Michiel Dethmers mich...@phplist.com http://www.phplist.com Open Source newsletter manager
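For what the "clone" half of the question amounts to at the file level, here is a rough sketch. It assumes the translations live on disk as one directory per language inside a project directory (for example project/es/*.po); that layout, and the function name clone_language, are assumptions made for illustration. This is not a Pootle API, and Pootle would still have to be told about the new language separately.

```python
import re
import shutil
from pathlib import Path

# Matches the PO header line  "Language: es\n"  (with a literal backslash-n).
_LANGUAGE_HEADER = re.compile(r'^"Language: [^"\\]*\\n"$', re.MULTILINE)


def clone_language(project_dir: Path, source: str, target: str) -> Path:
    """Copy project_dir/source to project_dir/target and retag the PO headers."""
    src = project_dir / source
    dst = project_dir / target
    if not src.is_dir():
        raise FileNotFoundError(f"no such language directory: {src}")
    if dst.exists():
        raise FileExistsError(f"target already exists: {dst}")

    shutil.copytree(src, dst)  # the source directory is left untouched

    for po_file in dst.rglob("*.po"):
        text = po_file.read_text(encoding="utf-8")
        # A function is used as replacement so the backslash is not reinterpreted.
        text = _LANGUAGE_HEADER.sub(
            lambda _m: f'"Language: {target}\\n"', text, count=1
        )
        po_file.write_text(text, encoding="utf-8")
    return dst


if __name__ == "__main__":
    # Hypothetical path; adjust to wherever the project's PO files are stored.
    clone_language(Path("po/phplist"), "es", "es_AR")
```

Copying rather than renaming keeps the existing es directory in place, so the Argentinian team starts from the current state of es without anything being taken away from it.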
Re: [translate-pootle] rename and clone language
Quoting Michiel Dethmers (mich...@phplist.com):

Hello Everyone My translators would like to split off Argentinian from Spanish, in order to account for the language differences. I currently have an es language. http://translate.phplist.com/ So, I'd like to rename es to es_ES and clone es to es_AR, so that the Argentinians continue on the translation from where es is now.

I would not recommend doing this. At the very least, leave es as is, so that users with locales for countries that are neither Argentina nor Spain still have a Spanish translation.

Also, I think this is the best way to waste resources: splitting work just because people can't agree on a few words and translations (for Spanish, differences are really minimal and most l10n teams I know have been able to find compromises to avoid fights about computador vs. ordenador).

At least, for French, I always fight very hard when I find PO files named fr_CA, fr_CH and *also* fr_FR.

So, as a short conclusion:
- try to avoid the split
- if you can't, don't change es to es_ES
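The reason for keeping a plain es can be shown with a small sketch of catalog lookup. It assumes the simplified fallback rule "try ll_CC first, then ll", which is roughly what gettext-style lookups do; real implementations also consider modifiers and the LANGUAGE priority list. The helper names candidates and pick_catalog are hypothetical.

```python
from typing import Iterable, List, Optional


def candidates(locale: str) -> List[str]:
    """Return lookup candidates, e.g. 'es_MX.UTF-8' -> ['es_MX', 'es']."""
    base = locale.split(".", 1)[0].split("@", 1)[0]  # drop codeset and modifier
    language = base.split("_", 1)[0]
    return [base] if base == language else [base, language]


def pick_catalog(locale: str, available: Iterable[str]) -> Optional[str]:
    """Pick the first available catalog for a user's locale, or None."""
    have = set(available)
    for name in candidates(locale):
        if name in have:
            return name
    return None


if __name__ == "__main__":
    # A Mexican user, with es kept and es_AR split off:
    print(pick_catalog("es_MX.UTF-8", ["es", "es_AR"]))
    # The same user after es has been renamed to es_ES:
    print(pick_catalog("es_MX.UTF-8", ["es_ES", "es_AR"]))
```

Under this rule the Mexican user gets the es catalog in the first case and nothing at all in the second, which is the situation the advice above is meant to avoid.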
Re: [translate-pootle] rename and clone language
On Fri, Jun 8, 2012 at 12:46 PM, Christian PERRIER bubu...@debian.org wrote:

So, as a short conclusion:
- try to avoid the split
- if you can't, don't change es to es_ES

I must agree with Christian. At Sugar Labs, even with major OLPC deployments in Uruguay, Paraguay, Peru, Mexico and elsewhere in the Spanish-speaking world, our Spanish L10n community has found common ground in the advantages of maintaining a single lang-es project.

Having taken a fairly thorough look around at various translation hosting servers, my sense is that many people begin these country-specific branches with the best of intentions, but they are almost never maintained in the long run.

I am entirely in favor of linguistic self-determination, but I do think people over-estimate the variability of the Spanish vocabulary from which software UIs actually draw. If the text concerned cuisine or culture, there would be much more justification for the burden of maintaining the various branches.
On the other hand, I am quite interested in developing more country-specific voices for e-speak, because spoken Spanish is far more variable, with the yeísmo and seseo (for instance).

Gnome upstream as an example:

Many 100% complete
http://l10n.gnome.org/teams/es/

Mostly 0%
http://l10n.gnome.org/languages/es_AR/
http://l10n.gnome.org/languages/es_CL/
http://l10n.gnome.org/languages/es_CO/
http://l10n.gnome.org/languages/es_CR/
http://l10n.gnome.org/languages/es_DO/
http://l10n.gnome.org/languages/es_EC/
http://l10n.gnome.org/languages/es_GT/
http://l10n.gnome.org/languages/es_HN/
http://l10n.gnome.org/languages/es_NI/
http://l10n.gnome.org/languages/es_PA/
http://l10n.gnome.org/languages/es_PE/
http://l10n.gnome.org/languages/es_PR/
http://l10n.gnome.org/languages/es_SV/
http://l10n.gnome.org/languages/es_UY/
http://l10n.gnome.org/languages/es_VE/

cjl
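To check how well a country-specific branch is being kept up, a completion figure per PO file is usually enough. What follows is a deliberately simplified sketch, not a full PO parser: it assumes entries are separated by blank lines, counts fuzzy entries as untranslated, skips the header and obsolete entries, and treats a plural entry as translated as soon as any of its forms is filled in. The function name completion is made up for illustration.

```python
import re


def completion(po_text: str) -> float:
    """Percentage of entries that have a non-empty, non-fuzzy msgstr."""
    total = translated = 0
    for block in re.split(r"\n\s*\n", po_text.strip()):
        lines = [line.strip() for line in block.splitlines()]
        if any(line.startswith("#~") for line in lines):
            continue  # obsolete entry
        fuzzy = any(line.startswith("#,") and "fuzzy" in line for line in lines)

        parts = {"msgid": "", "msgstr": ""}
        field = None
        for line in lines:
            if line.startswith("msgid "):
                field = "msgid"
                line = line[len("msgid "):]
            elif line.startswith("msgstr"):  # also covers msgstr[0], msgstr[1], ...
                field = "msgstr"
                line = line[line.find('"'):] if '"' in line else ""
            elif line.startswith(("msgid_plural", "msgctxt")):
                field = None  # ignore these fields and their continuation lines
                continue
            elif not line.startswith('"'):
                continue  # comment or unknown keyword
            if field and len(line) >= 2 and line.startswith('"'):
                parts[field] += line[1:-1]  # strip the surrounding quotes

        if parts["msgid"] == "":
            continue  # header entry or empty block
        total += 1
        if parts["msgstr"] and not fuzzy:
            translated += 1
    return 100.0 * translated / total if total else 0.0
```

Run over each language directory, a figure like this makes it easy to see whether a split-off locale keeps pace with the main es translation or stalls.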
Re: [translate-pootle] rename and clone language
Thanks both, that's very helpful. Great to be able to learn from others' experiences.

It's interesting though. You do often find that things are very different in some languages. Even British English and US English (organisation/organization), or Brazilian Portuguese and Portuguese Portuguese, or Flemish and Dutch. My project already has zh_CN and zh_TW, which I know nothing about, but they seem very different.

Being Dutch (Dutch Dutch, that is), speaking mostly (British) English in my daily life and living in Argentina, I see these issues pop up quite often in my environment.

But I do agree that it seems more efficient to combine forces and try to work out the differences. Eventually, in some cases, the words need to be made up anyway.

I'll throw it in the group, and see what they think.

Michiel

On 06/08/2012 03:24 PM, Chris Leonard wrote:

I must agree with Christian. At Sugar Labs, even with major OLPC deployments in Uruguay, Paraguay, Peru, Mexico and elsewhere in the Spanish-speaking world, our Spanish L10n community has found common ground in the advantages of maintaining a single lang-es project.

--
Michiel Dethmers mich...@phplist.com http://www.phplist.com Open Source newsletter manager
Re: [translate-pootle] rename and clone language
On Fri, Jun 8, 2012 at 3:35 PM, Michiel Dethmers mich...@phplist.com wrote:

Thanks both, that's very helpful. Great to be able to learn from others' experiences. It's interesting though. You do often find that things are very different in some languages. Even British English and US English (organisation/organization) or Brazilian Portuguese and Portuguese Portuguese, or Flemish and Dutch. My project already has zh_CN and zh_TW, which I have no idea, but seem very different.

1) en, en_US and en_GB are commonly found because
   a) there are a number of distinct spelling differences (orthography) that you do not see with Spanish.
   b) the burden of maintaining the translations is actually quite minimal. In Gnome, there is even a PERL script that takes a first pass at this for you, with pretty good results. http://git.gnome.org/browse/gnome-i18n/tree/en_GB/en_GB.pl

2) zh_CN and zh_TW are in fact quite different, you never see just zh (in my experience). When present, both zh_CN and zh_TW typically get well maintained. Occasionally one will also see zh_HK, but it is less common.

3) pt_PT and pt_BR are very frequently found. I don't know the distinctions well, but when present, they both get well maintained.

4) Occasionally one will see de_DE and de_CH, although much more rarely, and generally with much less completeness on de_CH.

5) fa and fa_AF seems to be an important distinction as well (Iranian Persian vs Afghani Dari).

Just my experience, FWIW.

cjl
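The idea of a scripted first pass from en_US to en_GB, as in point 1b, can be sketched in a few lines. This is only an illustration of the general approach and does not reproduce the logic of the Gnome script linked above; the four-word table US_TO_GB and the function name first_pass_en_gb are assumptions, and a real table would be far larger and would need human review afterwards.

```python
import re

# Tiny illustrative table; whole words only, inflected forms are not handled.
US_TO_GB = {
    "color": "colour",
    "organization": "organisation",
    "customize": "customise",
    "center": "centre",
}

_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, US_TO_GB)) + r")\b", re.IGNORECASE
)


def _match_case(template: str, word: str) -> str:
    """Give `word` the same capitalisation style as `template`."""
    if template.isupper():
        return word.upper()
    if template[0].isupper():
        return word.capitalize()
    return word


def first_pass_en_gb(text: str) -> str:
    """Replace known US spellings with British ones, preserving case."""
    return _PATTERN.sub(
        lambda m: _match_case(m.group(0), US_TO_GB[m.group(0).lower()]), text
    )


if __name__ == "__main__":
    print(first_pass_en_gb("Choose a Color for your organization"))
```

Because only whole words from the table are touched, strings such as "colors" are left alone, so the output is a starting point for a translator rather than a finished en_GB catalog.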