Re: [translate-pootle] Language Alphabets and ISO 3066 codes

2010-03-09 Thread Ihar Hrachyshka

On Wed, 2010-03-10 at 00:39 +1300, Amos Jeffries wrote:
 Christian PERRIER wrote:
  Quoting Amos Jeffries (squ...@treenet.co.nz):
  
  Problem 1) Alphabets versus Languages
   I've hit it with Serbian. They use two different alphabets Latin and
  Cyrillic. But only one language.
   Distinguished by two codes sr-Latn and sr-Cyrl. The same issue occurs in
  Chinese Hans/Hant/Ming/* and has been hacked around previously by appending
  the specific ISO-3166 country code where it's most frequently needed.
 
   What I'm hoping for is to use the ISO-3066 alphabet codes as part of the
  language tag somewhere.
  
  
  This is indeed the first time I hear about ISO-3066.
  
  As one of the iso-codes maintainers, I know about ISO-15924, which is
  meant to be a standard for script names. We include it in the package
  since October 2007. Reference is http://unicode.org/iso15924/
 
 Ah thanks. Good to know.
 
  
  Example entry in the XML file we provide:
  
  <iso_15924_entry
      alpha_4_code="Cyrl"
      numeric_code="220"
      name="Cyrillic" />
  <iso_15924_entry
      alpha_4_code="Cyrs"
      numeric_code="221"
      name="Cyrillic (Old Church Slavonic variant)" />
  .../...
  <iso_15924_entry
      alpha_4_code="Latn"
      numeric_code="215"
      name="Latin" />
  
  
  These examples use your own example. Note that the alpha4 code is
  indeed the same.
  
  I'd say that ISO-15924 seems to be an evolution of 3066 or something
  like this.
 
 I guess so. I only found the ISO-3066 code this week in some fairly old 
 university language papers about Serbian/Croatian alphabet splits.
 
  
  WRT your general message, I agree that using ISO 15924 codes in locale
  names would be a great progress over the current hacks implemented in
  various ways (zh_CN vs. zh_TW as a hack between Simplified and
  Traditional Chinese, or Hans vs. Hant, or variants for Serbian,
  or probably others I don't know about).
  
 
 So far I know of Chinese and Serbian for certain, with hints indicating 
 Azerbaijan and Croatian will need it in future as well.

...and Belarusian Latin is assigned to b...@latin in glibc (IIRC Serbian
uses '@Latn' tag for the same thing). Actually, these locale 'variants'
don't have good support in different l10n software (e.g. Rosetta doesn't
know about their existence at all).

 
 Amos
 Squid Project
 
 --
 Download Intel® Parallel Studio Eval
 Try the new software tools for yourself. Speed compiling, find bugs
 proactively, and fine-tune applications for parallel performance.
 See why Intel Parallel Studio got high marks during beta.
 http://p.sf.net/sfu/intel-sw-dev
 ___
 Translate-pootle mailing list
 Translate-pootle@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/translate-pootle




Re: [translate-pootle] Language Alphabets and ISO 3066 codes

2010-03-09 Thread Amos Jeffries

Alaa Abd El Fattah wrote:
 On Wed, 10 Mar 2010 02:42:10 +1300
 Amos Jeffries squ...@treenet.co.nz wrote:
 
 I can continue that part manually for now, it has not been difficult
 so far.

 But please consider the problem of symlinks versus language 
 creation/updates from template folder as a feature request. I'd love
 to be able to automate that part. Generating symlink in the language
 folder from the base path of a symlink in templates folder seems to
 be the easy way and would come close to a usable solution for me.
 
 I don't understand what you need. can you elaborate?
 

/var/lib/pootle/templates/:
   errpages.pot (symlink to /src/errors/errpages.pot)
   manuals.pot (symlink to /src/manuals/manuals.pot)

ACTION: Adding language af to project squid and initializing from 
templates needs to create:

/var/lib/pootle/po/squid/af/:
??.po (symlink to /src/errors/af.po)
??.po (symlink to /src/manuals/af.po)


(assuming that either the src/*/*.po files already exist or that Pootle 
has write access to the /src/* directories to create a new real .po there.)
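
A minimal sketch of the requested behaviour, not something Pootle provides: for every symlinked template, create a matching symlink in the language folder that points next to the template's real location. The function name `link_language_files` is hypothetical, and naming each link after its template (errpages.pot -> errpages.po) is an assumption, since the names above are left as ??.po.

```python
import os


def link_language_files(templates_dir, lang_dir, lang):
    """Create one symlink per symlinked .pot template.

    Each link in lang_dir points at <real template directory>/<lang>.po.
    The target file does not need to exist yet. Returns the list of
    (link, target) pairs that were created.
    """
    os.makedirs(lang_dir, exist_ok=True)
    created = []
    for name in sorted(os.listdir(templates_dir)):
        template = os.path.join(templates_dir, name)
        # Only templates that are themselves symlinks are mirrored.
        if not (name.endswith(".pot") and os.path.islink(template)):
            continue
        source_dir = os.path.dirname(os.path.realpath(template))
        target = os.path.join(source_dir, lang + ".po")
        # Assumption: the link takes the template's base name.
        link = os.path.join(lang_dir, name[: -len(".pot")] + ".po")
        if not os.path.lexists(link):
            os.symlink(target, link)
            created.append((link, target))
    return created
```

With the layout above, link_language_files("/var/lib/pootle/templates", "/var/lib/pootle/po/squid/af", "af") would create errpages.po and manuals.po as links to /src/errors/af.po and /src/manuals/af.po.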

Amos



Re: [translate-pootle] Language Alphabets and ISO 3066 codes

2010-03-09 Thread Amos Jeffries

Dwayne Bailey wrote:
 On Tue, 2010-03-09 at 06:59 +0100, Christian PERRIER wrote:
 Quoting Amos Jeffries (squ...@treenet.co.nz):

 Problem 1) Alphabets versus Languages
  I've hit it with Serbian. They use two different alphabets Latin and
 Cyrillic. But only one language.
  Distinguished by two codes sr-Latn and sr-Cyrl. The same issue occurs in
 Chinese Hans/Hant/Ming/* and has been hacked around previously by appending
 the specific ISO-3166 country code where it's most frequently needed.

  What I'm hoping for is to use the ISO-3066 alphabet codes as part of the
 language tag somewhere.

 This is indeed the first time I hear about ISO-3066.

 As one of the iso-codes maintainers, I know about ISO-15924, which is
 meant to be a standard for script names. We include it in the package
 since October 2007. Reference is http://unicode.org/iso15924/

 Example entry in the XML file we provide:

 <iso_15924_entry
     alpha_4_code="Cyrl"
     numeric_code="220"
     name="Cyrillic" />
 <iso_15924_entry
     alpha_4_code="Cyrs"
     numeric_code="221"
     name="Cyrillic (Old Church Slavonic variant)" />
 .../...
 <iso_15924_entry
     alpha_4_code="Latn"
     numeric_code="215"
     name="Latin" />


 These examples use your own example. Note that the alpha4 code is
 indeed the same.

 I'd say that ISO-15924 seems to be an evolution of 3066 or something
 like this.

 WRT your general message, I agree that using ISO 15924 codes in locale
 names would be a great progress over the current hacks implemented in
 various ways (zh_CN vs. zh_TW as a hack between Simplified and
 Traditional Chinese, or Hans vs. Hant, or variants for Serbian,
 or probably others I don't know about).
 
 We're following the Gettext/POSIX convention here which is different
 from the RFC.
 
 I think this is handled with something like s...@latn and s...@cyrl -
 these should work in Pootle as we're currently running with c...@valencia
 and we're able to manage that correctly.
 
 Still doesn't solve your problem about having to link the name on Pootle
 to the name you need for your files.
 

I can continue that part manually for now, it has not been difficult so far.

But please consider the problem of symlinks versus language 
creation/updates from template folder as a feature request. I'd love to 
be able to automate that part. Generating symlink in the language folder 
from the base path of a symlink in templates folder seems to be the easy 
way and would come close to a usable solution for me.


Amos
Squid Project


Re: [translate-pootle] Language Alphabets and ISO 3066 codes

2010-03-09 Thread Alaa Abd El Fattah

On Wed, 10 Mar 2010 02:42:10 +1300
Amos Jeffries squ...@treenet.co.nz wrote:

 I can continue that part manually for now, it has not been difficult
 so far.
 
 But please consider the problem of symlinks versus language 
 creation/updates from template folder as a feature request. I'd love
 to be able to automate that part. Generating symlink in the language
 folder from the base path of a symlink in templates folder seems to
 be the easy way and would come close to a usable solution for me.

I don't understand what you need. can you elaborate?

cheers,
Alaa


Re: [translate-pootle] Language Alphabets and ISO 3066 codes

2010-03-09 Thread Alaa Abd El Fattah

On Tue, 09 Mar 2010 13:56:14 +0200
Ihar Hrachyshka ihar.hrachys...@gmail.com wrote:

 On Wed, 2010-03-10 at 00:39 +1300, Amos Jeffries wrote:
  Christian PERRIER wrote:
   Quoting Amos Jeffries (squ...@treenet.co.nz):
   
   Problem 1) Alphabets versus Languages
I've hit it with Serbian. They use two different alphabets
   Latin and Cyrillic. But only one language.
Distinguished by two codes sr-Latn and sr-Cyrl. The same issue
   occurs in Chinese Hans/Hant/Ming/* and has been hacked around
   previously by appending the specific ISO-3166 country code where
   it's most frequently needed.
  
What I'm hoping for is to use the ISO-3066 alphabet codes as
   part of the language tag somewhere.
   
   
   This is indeed the first time I hear about ISO-3066.
   
   As one of the iso-codes maintainers, I know about ISO-15924,
   which is meant to be a standard for script names. We include it
   in the package since October 2007. Reference is
   http://unicode.org/iso15924/
  
  Ah thanks. Good to know.
  
   
   Example entry in the XML file we provide:
   
   <iso_15924_entry
       alpha_4_code="Cyrl"
       numeric_code="220"
       name="Cyrillic" />
   <iso_15924_entry
       alpha_4_code="Cyrs"
       numeric_code="221"
       name="Cyrillic (Old Church Slavonic variant)" />
   .../...
   <iso_15924_entry
       alpha_4_code="Latn"
       numeric_code="215"
       name="Latin" />
   
   
   These examples use your own example. Note that the alpha4 code is
   indeed the same.
   
   I'd say that ISO-15924 seems to be an evolution of 3066 or
   something like this.
  
  I guess so. I only found the ISO-3066 code this week in some fairly
  old university language papers about Serbian/Croatian alphabet
  splits.
  
   
   WRT your general message, I agree that using ISO 15924 codes in
   locale names would be a great progress over the current hacks
   implemented in various ways (zh_CN vs. zh_TW as a hack between
   Simplified and Traditional Chinese, or Hans vs. Hant, or
   variants for Serbian, or probably others I don't know about).
   
  
  So far I know of Chinese and Serbian for certain, with hints
  indicating Azerbaijan and Croatian will need it in future as well.
 
 ...and Belarusian Latin is assigned to b...@latin in glibc (IIRC
 Serbian uses '@Latn' tag for the same thing). Actually, these locale
 'variants' don't have good support in different l10n software (e.g.
 Rosetta doesn't know about their existence at all).

Pootle uses glibc locales and supports codes like b...@latin. They're
used inconsistently for other types of variation, like c...@valencia, but
the good news is that they work fine with our tools
(check http://pootle.locamotion.org/c...@valencia/ for example).

I'm not sure I understood the issues Amos is facing. How much of it is
solved by using s...@latin?
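
To make the relationship between glibc-style codes and web-style tags concrete, here is a rough sketch of a converter. The function name `posix_to_web_tag` and the modifier table are illustrative assumptions, not part of Pootle or glibc; treating any other modifier (such as valencia) as a trailing variant subtag is also an assumption.

```python
# Hypothetical table: glibc-style modifiers -> ISO 15924 script codes.
MODIFIER_TO_SCRIPT = {
    "latin": "Latn",
    "latn": "Latn",
    "cyrillic": "Cyrl",
    "cyrl": "Cyrl",
}


def posix_to_web_tag(code):
    """Convert ll[_CC][@modifier] into ll[-Scrp][-CC][-variant]."""
    base, _, modifier = code.partition("@")
    lang, _, region = base.partition("_")
    parts = [lang.lower()]
    script = MODIFIER_TO_SCRIPT.get(modifier.lower())
    if script:
        parts.append(script)
    if region:
        parts.append(region.upper())
    if modifier and not script:
        # Assumption: unknown modifiers are passed through as variants.
        parts.append(modifier.lower())
    return "-".join(parts)


print(posix_to_web_tag("sr@latin"))
print(posix_to_web_tag("zh_TW"))
```

Under these assumptions the two calls print sr-Latn and zh-TW.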

cheers,
Alaa


Re: [translate-pootle] Language Alphabets and ISO 3066 codes

2010-03-09 Thread Alaa Abd El Fattah

On Tue, 09 Mar 2010 15:51:42 +1300
Amos Jeffries squ...@treenet.co.nz wrote:

 
 Problem 2)
 I've experimented on 1.3 a while back and that just resulted in:
  * erasure of the symlinks, replaced with physical files (empty like
 the .pot).
  * loss of .po files not explicitly named identical to the available
 language ie (X.pot - $LANG/$LANG.po)
 
  Do the 2.0 improvements help with these at all?

1.3 was the prerelease name of 2.0, so anything you tested under the 1.3
name was far from stable.

Pootle 2.0.1 had fixes related to symlinked translation files, so you
should be fine.

cheers,
Alaa


Re: [translate-pootle] Language Alphabets and ISO 3066 codes

2010-03-09 Thread Ihar Hrachyshka

IIRC s...@latn is obsolete and s...@latin is used for new translations.
(We tried to move b...@latin to b...@latn, since the latter is an
IANA-approved keyword, but we got the response from the glibc maintainers
that s...@latn was itself going to move to s...@latin because of some
internal glibc rules.)
Though you'd better ask the Serbian guys to get a 100% right answer :)

On Wed, 2010-03-10 at 02:59 +1300, Amos Jeffries wrote:
 Alaa Abd El Fattah wrote:
  On Tue, 09 Mar 2010 13:56:14 +0200
  Ihar Hrachyshka ihar.hrachys...@gmail.com wrote:
  
  On Wed, 2010-03-10 at 00:39 +1300, Amos Jeffries wrote:
  Christian PERRIER wrote:
  Quoting Amos Jeffries (squ...@treenet.co.nz):
 
  Problem 1) Alphabets versus Languages
   I've hit it with Serbian. They use two different alphabets
  Latin and Cyrillic. But only one language.
   Distinguished by two codes sr-Latn and sr-Cyrl. The same issue
  occurs in Chinese Hans/Hant/Ming/* and has been hacked around
  previously by appending the specific ISO-3166 country code where
  it's most frequently needed.
 
   What I'm hoping for is to use the ISO-3066 alphabet codes as
  part of the language tag somewhere.
 
  This is indeed the first time I hear about ISO-3066.
 
  As one of the iso-codes maintainers, I know about ISO-15924,
  which is meant to be a standard for script names. We include it
  in the package since October 2007. Reference is
  http://unicode.org/iso15924/
  Ah thanks. Good to know.
 
  Example entry in the XML file we provide:
 
  <iso_15924_entry
      alpha_4_code="Cyrl"
      numeric_code="220"
      name="Cyrillic" />
  <iso_15924_entry
      alpha_4_code="Cyrs"
      numeric_code="221"
      name="Cyrillic (Old Church Slavonic variant)" />
  .../...
  <iso_15924_entry
      alpha_4_code="Latn"
      numeric_code="215"
      name="Latin" />
 
 
  These examples use your own example. Note that the alpha4 code is
  indeed the same.
 
  I'd say that ISO-15924 seems to be an evolution of 3066 or
  something like this.
  I guess so. I only found the ISO-3066 code this week in some fairly
  old university language papers about Serbian/Croatian alphabet
  splits.
 
  WRT your general message, I agree that using ISO 15924 codes in
  locale names would be a great progress over the current hacks
  implemented in various ways (zh_CN vs. zh_TW as a hack between
  Simplified and Traditional Chinese, or Hans vs. Hant, or
  variants for Serbian, or probably others I don't know about).
 
  So far I know of Chinese and Serbian for certain, with hints
  indicating Azerbaijan and Croatian will need it in future as well.
  ...and Belarusian Latin is assigned to b...@latin in glibc (IIRC
  Serbian uses '@Latn' tag for the same thing). Actually, these locale
  'variants' don't have good support in different l10n software (e.g.
  Rosetta doesn't know about their existence at all).
  
  Pootle uses glibc locales and supports codes like b...@latin. They're
  used inconsistently for other types of variation, like c...@valencia, but
  the good news is that they work fine with our tools
  (check http://pootle.locamotion.org/c...@valencia/ for example).
  
  I'm not sure I understood the issues Amos is facing. How much of it is
  solved by using s...@latin?
  
 
 My problem #1 can be resolved completely by s...@latin. Thanks for 
 pointing it out. I had seen c...@valencia without really understanding 
 what that was about, it slipped my mind.
 
 But ... where do I find a reliable index of these @... codes? searching 
 online for stuff with '@' in it seems to be one of the difficult tasks, 
 and even @valencia did not lead anywhere useful.
 
 FYI: The web standard my raw .po files have to use in VCS uses '-' 
 instead of '@' and the ISO-15924 codes instead of valencia or latin 
 glibc codes. Otherwise identical in meaning.
 
 
 My problem #2 is partly about needing to store man page translations 
 (with system Locales) and these web-format translations side by side for 
 each language.
 ie
   s...@latin/sr-Latn.po, s...@latin/sr_SP.po s...@latin/sr_SB.po
   s...@cyrillic/sr-Cyrl.po, s...@cyrillic/sr_SP.po s...@cyrillic/sr_SB.po
 
 Or do I need the .pot name in the .po filename like Rosetta appears to use?
 
 Amos
 Squid Project
 

Re: [translate-pootle] Language Alphabets and ISO 3066 codes

2010-03-09 Thread Dwayne Bailey

On Tue, 2010-03-09 at 06:59 +0100, Christian PERRIER wrote:
Quoting Amos Jeffries (squ...@treenet.co.nz):

Problem 1) Alphabets versus Languages
I've hit it with Serbian. They use two different alphabets Latin and
Cyrillic. But only one language.
Distinguished by two codes sr-Latn and sr-Cyrl. The same issue occurs in
Chinese Hans/Hant/Ming/* and has been hacked around previously by appending
the specific ISO-3166 country code where it's most frequently needed.

What I'm hoping for is to use the ISO-3066 alphabet codes as part of the
language tag somewhere.

This is indeed the first time I hear about ISO-3066.

As one of the iso-codes maintainers, I know about ISO-15924, which is
meant to be a standard for script names. We include it in the package
since October 2007. Reference is http://unicode.org/iso15924/

Example entry in the XML file we provide:

<iso_15924_entry
    alpha_4_code="Cyrl"
    numeric_code="220"
    name="Cyrillic" />
<iso_15924_entry
    alpha_4_code="Cyrs"
    numeric_code="221"
    name="Cyrillic (Old Church Slavonic variant)" />
.../...
<iso_15924_entry
    alpha_4_code="Latn"
    numeric_code="215"
    name="Latin" />

These examples use your own example. Note that the alpha4 code is
indeed the same.

I'd say that ISO-15924 seems to be an evolution of 3066 or something
like this.

WRT your general message, I agree that using ISO 15924 codes in locale
names would be a great progress over the current hacks implemented in
various ways (zh_CN vs. zh_TW as a hack between Simplified and
Traditional Chinese, or Hans vs. Hant, or variants for Serbian,
or probably others I don't know about).

We're following the Gettext/POSIX convention here which is different
from the RFC.

I think this is handled with something like s...@latn and s...@cyrl -
these should work in Pootle as we're currently running with c...@valencia
and we're able to manage that correctly.

Still doesn't solve your problem about having to link the name on Pootle
to the name you need for your files.

--
Dwayne Bailey
Associate Research Director    +27 12 460 1095 (w)
Translate.org.za ANLoc         +27 83 443 7114 (c)

Recent blog posts:
* Translate Toolkit - a powerful localisation toolkit
http://www.translate.org.za/blogs/dwayne/en/content/translate-toolkit-powerful-localisation-toolkit
* The sky's the limit for new Zulu spell checker
* Everyone has the power to champion their language

Firefox web browser in Afrikaans - http://af.www.mozilla.com/af/
African Network for Localisation (ANLoc) - http://africanlocalisation.net/


Re: [translate-pootle] Language Alphabets and ISO 3066 codes

2010-03-09 Thread Amos Jeffries

On Tue, 9 Mar 2010 16:40:13 +0200, Alaa Abd El Fattah
a...@translate.org.za wrote:
 On Wed, 10 Mar 2010 02:59:41 +1300
 Amos Jeffries squ...@treenet.co.nz wrote:
 
 My problem #2 is partly about needing to store man page translations 
 (with system Locales) and these web-format translations side by side
 for each language.
 ie
   s...@latin/sr-Latn.po, s...@latin/sr_SP.po s...@latin/sr_SB.po
   s...@cyrillic/sr-Cyrl.po, s...@cyrillic/sr_SP.po s...@cyrillic/sr_SB.po
 
 Pootle supports two directory schemes (we call them tree styles). GNU
 and the horribly named Non-Gnu
 
 GNU style means files are named after language codes. it is usually a
 single directory.
 
 po/foo/
   foo.pot
   ar.po
   af.po
   s...@latin.po
   ...
 
 if it involves multiple templates then each extra template file gets
 its own directory and it looks like this (note template and
 subdirectory can be named anything, they don't have to match)
 
 po/foo/
  manual/
  manual.pot
  ar.po
  af.po
  s...@latin.po
  ...
  foo.pot
  ar.po
  af.po
  s...@latin.po
 
 for Non-Gnu each language gets a subdirectory, files could be called
 anything but they tend to have a name that reflects where the
 translation strings came from, templates should reside in the
 templates directory. like
 
 po/foo/
  templates/
 main.pot
 manual.pot
 ...
  ar/
 main.po
 manual.po
 ...
  af/
 main.po
 manual.po
 ...
  s...@latin/
 main.po
 manual.po
 
 we realize these do not conform to the way every single project works,
 but they cover the vast majority of them.
 
 you can use symlinks to adapt your current structure, but Pootle won't
 be aware of the symlinks and so can't imitate them when adding new
 languages. it will have to be a manual process.
 
 if I understand correctly you are relying on the difference between
 POSIX locales and web locales to keep two different translation files
 side by side while still being named after the language code. I don't
 think this is a good idea in general (adds confusion where things are
 unnecessarily confusing already), and we are unlikely to ever support a
 scheme like this.
 
 cheers,
 Alaa
 

Thank you Alaa. This is exactly the type of advice I was looking for.

Looks like I'm going to be migrating from non-GNU to GNU structure to
get this to work.
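
A rough sketch of what that migration means for file paths, based only on the two tree styles described above. The function name `gnu_path` is hypothetical, and placing every template in its own subdirectory (rather than keeping one at the top level) is a simplifying assumption.

```python
from pathlib import PurePosixPath


def gnu_path(non_gnu_path):
    """Map a non-GNU path to a GNU-style one.

    po/foo/<lang>/<name>.po      -> po/foo/<name>/<lang>.po
    po/foo/templates/<name>.pot  -> po/foo/<name>/<name>.pot
    """
    path = PurePosixPath(non_gnu_path)
    lang = path.parent.name
    project = path.parent.parent
    if lang == "templates":
        # Templates keep their own file name inside the new subdirectory.
        return str(project / path.stem / path.name)
    # Translation files are renamed after the language code.
    return str(project / path.stem / (lang + path.suffix))


for old in ("po/foo/templates/manual.pot",
            "po/foo/ar/manual.po",
            "po/foo/af/main.po"):
    print(old, "->", gnu_path(old))
```

The mapping only computes new names; actually moving the files (and re-pointing any symlinks) would still be a separate, manual step.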


Amos


[translate-pootle] Language Alphabets and ISO 3066 codes

2010-03-08 Thread Amos Jeffries


I've encountered a few problems with our usage of Pootle and am seeking
advice from the experts on how best to proceed.

Problem 1) Alphabets versus Languages
 I've hit it with Serbian. They use two different alphabets Latin and
Cyrillic. But only one language.
 Distinguished by two codes sr-Latn and sr-Cyrl. The same issue occurs in
Chinese Hans/Hant/Ming/* and has been hacked around previously by appending
the specific ISO-3166 country code where it's most frequently needed.

 What I'm hoping for is to use the ISO-3066 alphabet codes as part of the
language tag somewhere.
The options that appear to be present now are:
 (a) create a fake country code and do the xx_YY code hack to create
entire new languages as per Chinese (eww)
 (b) add two .po files with sr-Latn.po and sr-Cyrl.po names.

The latter seems cleaner and will be of some help when problem (3) starts
to happen. But when last tested it broke the use of templates, as described
in problem (2) below. It may also lead to Pootle reporting language X as
having twice as many words as other languages, and thus falsely reporting
one or the other alphabet as incomplete.

 Also, as in the case with Chinese, when the alphabets have different
special characters and maybe even grammar rules, things break badly.

 I can see at least two other languages on the horizon with similar
alphabet issues. What has other people's experience been with multiple .po
files per language for one .pot?

FYI:  Web language codings are tagged by the BNF ::= ISO-639-* ['-'
ISO-3066] ['-' ISO-3166].
 Pootle denies (a) the use of '-' in language codes, and AFAICT (b) the
use of more than 2 chars. So these all have to be hacked down to xx_YY
(making ISO-639-2 and ISO-639-* '-' ISO-3066 base codings unusable). With
the Chinese hacks things get nasty very fast.
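
As an illustration of the BNF above, here is a minimal sketch that parses such a web tag and rewrites it in the xx_YY style with an '@' modifier for the alphabet. The function name `web_tag_to_posix` and the script-to-modifier table are assumptions for the example only; the alphabet part is taken to be a four-letter code such as Latn or Cyrl.

```python
import re

# Hypothetical mapping from four-letter alphabet codes to '@' modifiers.
SCRIPT_TO_MODIFIER = {"Latn": "latin", "Cyrl": "cyrillic"}

# language ['-' script] ['-' region], as in the BNF above.
TAG_RE = re.compile(
    r"^(?P<lang>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}))?$"
)


def web_tag_to_posix(tag):
    """Convert ll[-Scrp][-CC] into ll[_CC][@modifier]."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError("not a language[-script][-region] tag: %r" % tag)
    code = match.group("lang").lower()
    if match.group("region"):
        code += "_" + match.group("region").upper()
    if match.group("script"):
        script = match.group("script").title()
        # Scripts without a known modifier keep their four-letter code.
        code += "@" + SCRIPT_TO_MODIFIER.get(script, script)
    return code


print(web_tag_to_posix("sr-Latn"))
print(web_tag_to_posix("pt-BR"))
```

Under these assumptions the two calls print sr@latin and pt_BR.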

Feature Request:
  If Pootle accepted ISO-3066 alphabet codes in language codes the Chinese
hack could easily be dropped out of existence for us in favor of Hans/Hant
namings.



Problem 2)
 Pootle folder structure. I've been using a flat folder layout with no
templates.

 How can I use the update-from-templates feature when the .po files are all
symlinks to files with slightly different names (ISO639-ISO3066-ISO3166
named, not ISO639_ISO3166 named), and/or with the multiple files needed for
problem (1)?
I've experimented on 1.3 a while back and that just resulted in:
 * erasure of the symlinks, replaced with physical files (empty like the
.pot).
 * loss of .po files not explicitly named identical to the available
language ie (X.pot - $LANG/$LANG.po)

 Do the 2.0 improvements help with these at all?

 Also, if I go for a solution to problem (1) where a specific language has
two ISO-3066 sub-coded .po (ie  $POTNAME '_' $LANG '-' ISO-3066 '.po') 
will that feature update both to the .pot? one? neither?

Problem 3) potential clash.
 I am about to begin dealing with system locale xx_YY encodings
AND web encodings simultaneously within each language. Different .pot for
each style. Any clues as to where to start looking for guidance on that nest
of issues?

Amos
Squid Project



Re: [translate-pootle] Language Alphabets and ISO 3066 codes

2010-03-08 Thread Christian PERRIER

Quoting Amos Jeffries (squ...@treenet.co.nz):

 Problem 1) Alphabets versus Languages
  I've hit it with Serbian. They use two different alphabets Latin and
 Cyrillic. But only one language.
  Distinguished by two codes sr-Latn and sr-Cyrl. The same issue occurs in
 Chinese Hans/Hant/Ming/* and has been hacked around previously by appending
 the specific ISO-3166 country code where it's most frequently needed.
 
  What I'm hoping for is to use the ISO-3066 alphabet codes as part of the
 language tag somewhere.


This is indeed the first time I hear about ISO-3066.

As one of the iso-codes maintainers, I know about ISO-15924, which is
meant to be a standard for script names. We include it in the package
since October 2007. Reference is http://unicode.org/iso15924/

Example entry in the XML file we provide:

<iso_15924_entry
    alpha_4_code="Cyrl"
    numeric_code="220"
    name="Cyrillic" />
<iso_15924_entry
    alpha_4_code="Cyrs"
    numeric_code="221"
    name="Cyrillic (Old Church Slavonic variant)" />
.../...
<iso_15924_entry
    alpha_4_code="Latn"
    numeric_code="215"
    name="Latin" />


These examples use your own example. Note that the alpha4 code is
indeed the same.
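
For anyone wanting to read these entries programmatically, a minimal sketch follows. It assumes the entries are attribute-only XML elements named iso_15924_entry, as in the example; the wrapping root element name and the function name `load_scripts` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Sample data built from the three entries quoted above; the root
# element name is an assumption.
SAMPLE_XML = """<iso_15924_entries>
  <iso_15924_entry alpha_4_code="Cyrl" numeric_code="220"
                   name="Cyrillic" />
  <iso_15924_entry alpha_4_code="Cyrs" numeric_code="221"
                   name="Cyrillic (Old Church Slavonic variant)" />
  <iso_15924_entry alpha_4_code="Latn" numeric_code="215"
                   name="Latin" />
</iso_15924_entries>"""


def load_scripts(xml_text):
    """Return {alpha_4_code: (numeric_code, name)} for every entry."""
    root = ET.fromstring(xml_text)
    return {
        entry.get("alpha_4_code"): (entry.get("numeric_code"), entry.get("name"))
        for entry in root.iter("iso_15924_entry")
    }


scripts = load_scripts(SAMPLE_XML)
print(scripts["Latn"])
```

A table like this could be used to check that the script part of a code such as sr-Latn is a known four-letter code.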

I'd say that ISO-15924 seems to be an evolution of 3066 or something
like this.

WRT your general message, I agree that using ISO 15924 codes in locale
names would be a great progress over the current hacks implemented in
various ways (zh_CN vs. zh_TW as a hack between Simplified and
Traditional Chinese, or Hans vs. Hant, or variants for Serbian,
or probably others I don't know about).


