Re: Please do not use en_US.UTF-8 outside the US

Ienup Sung Thu, 17 Oct 2002 12:34:15 -0700

I would just like to point out that we started with en_US.UTF-8 and ko.UTF-8
in Solaris 2.6, back in 1996 or so. Since then, we've been gradually and
consistently increasing the number of Unicode/UTF-8 locales, and that remains
our goal: to supply as many Unicode/UTF-8 locales as our (limited)
resources allow.


Also, as the locale name specifies, en_US.UTF-8 is a locale for American
English in the United States. We have never tried to persuade anyone to
use that locale as the only solution, and we are quite surprised that people
have seen it that way.

As additional evidence, in Solaris 9 we have:

system% uname -a
SunOS aal 5.9 Generic sun4u sparc SUNW,Ultra-5_10
system% locale -a | grep UTF-8 | sort -u
ar_EG.UTF-8
de.UTF-8
de_DE.UTF-8
de_DE.UTF-8@euro
en_US.UTF-8
es.UTF-8
es_ES.UTF-8
es_ES.UTF-8@euro
fi_FI.UTF-8
fr.UTF-8
fr_BE.UTF-8
fr_BE.UTF-8@euro
fr_FR.UTF-8
fr_FR.UTF-8@euro
he_IL.UTF-8
hi_IN.UTF-8
it.UTF-8
it_IT.UTF-8
it_IT.UTF-8@euro
ja_JP.UTF-8
ko.UTF-8
ko_KR.UTF-8
ko_KR.UTF-8@dict
pl.UTF-8
pl_PL.UTF-8
pt_BR.UTF-8
ru.UTF-8
ru_RU.UTF-8
sv.UTF-8
sv_SE.UTF-8
sv_SE.UTF-8@euro
th_TH.UTF-8
tr_TR.UTF-8
zh.UTF-8
zh_CN.UTF-8
zh_CN.UTF-8@pinyin
zh_CN.UTF-8@radical
zh_CN.UTF-8@stroke
zh_HK.UTF-8
zh_HK.UTF-8@radical
zh_HK.UTF-8@stroke
zh_TW.UTF-8
zh_TW.UTF-8@pinyin
zh_TW.UTF-8@radical
zh_TW.UTF-8@stroke
zh_TW.UTF-8@zhuyin
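
To check which UTF-8 locales exist for one particular language on a given
machine, the listing above can be filtered further. The following is only a
minimal sketch: the function name utf8_locales_for is hypothetical, and it
assumes locale names of the form language[_territory].codeset[@modifier] as
printed by locale -a.

```shell
# utf8_locales_for LANGCODE
# Reads locale names on stdin (one per line, as printed by `locale -a`) and
# prints those that belong to LANGCODE and use the UTF-8 codeset.
# The function name is hypothetical, chosen for this illustration only.
utf8_locales_for() {
    # Match "de", "de_DE", and so on, followed by ".UTF-8" and an optional
    # "@modifier". Case is ignored because some systems spell the codeset
    # "utf8" rather than "UTF-8".
    grep -i -E "^$1(_[A-Za-z]+)?\.utf-?8(@.*)?\$"
}

# Typical use on a live system (output depends on what is installed):
#   locale -a | utf8_locales_for de
```

If nothing is printed for a language, no UTF-8 locale for it is installed on
that system.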

With regards,

Ienup

] Date: Thu, 17 Oct 2002 15:24:48 +0100
] From: Markus Kuhn <[EMAIL PROTECTED]>
] Subject: Re: Please do not use en_US.UTF-8 outside the US
] To: [EMAIL PROTECTED]
] MIME-version: 1.0
] 
] [EMAIL PROTECTED] wrote on 2002-10-16 14:48 UTC:
] > I came across this older mail by Markus:
] > 
] > > General warning: Please do not use the locale name en_US.UTF-8 anywhere
] > > outside North America. Some older Solaris documentation suggested that
] > > this is the only UTF-8 locale you'll ever need, as locales don't change
] > > much sensible beyond the encoding anyway. This is not the case any more
] > > today!
] > 
] > The problem is that on many Sun installations, en_US.UTF-8 is the 
] > only UTF-8 locale available at all!
] 
] I can't reproduce this problem report on our current Suns:
] 
] $ uname -a ; locale -a | grep UTF-8
] SunOS piper 5.8 Generic_108528-12 sun4u sparc SUNW,Ultra-4
] en_US.UTF-8
] fr.UTF-8
] fr_FR.UTF-8
] fr_FR.UTF-8@euro
] de.UTF-8
] es.UTF-8
] it.UTF-8
] ja_JP.UTF-8
] ko.UTF-8
] sv.UTF-8
] zh.UTF-8
] zh_TW.UTF-8
] 
] It is slightly unpleasant that there is no Commonwealth en.UTF-8 or
] British en_GB.UTF-8, but as long as you use en_US only in LC_CTYPE and
] not in LANG, you are usually fairly safe from the terror of US cultural
] conventions.
] 
] > A decent solution to this problem would be to handle basic locale 
] > information ("en_US") and encoding suffix ("UTF-8") separately and 
] > specify that ANY available locale can be suffixed with ANY known 
] > encoding, so installed de, gb, whatever locales could always be 
] > run with UTF-8.
] > Is anything specified anywhere about this?
] 
] http://www.opengroup.org/onlinepubs/007904975/functions/setlocale.html
] 
] In principle, you could set
] 
]   LANG=de LC_CTYPE=en_US.UTF-8
] 
] However, in practice, if "de" is for ISO 8859-1, then it will contain
] only collating data for ISO 8859-1 and will therefore not work as well as if
] you had taken the collating data from a full UTF-8 locale that comes
] with all the necessary data. Therefore, in practice, the locales that
] you mix with LC_* should preferably come with identical encodings.
] 
] Markus
] 
] -- 
] Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
] Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>
] 
] --
] Linux-UTF8:   i18n of Linux on all levels
] Archive:      http://mail.nl.linux.org/linux-utf8/
] 
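
The advice in the quoted message, that locales mixed through LANG and LC_*
should preferably share the same encoding, can be checked mechanically
before the variables are set. This is a rough sketch under stated
assumptions: the helper names codeset_of and same_codeset are hypothetical,
names are compared as plain strings (so "UTF-8" and "utf8" count as
different), and a name without an explicit codeset, such as plain "de", is
treated as unknown rather than guessed.

```shell
# codeset_of NAME
# Prints the codeset part of a locale name of the form
# language[_territory][.codeset][@modifier]. Prints an empty line if the
# name carries no explicit codeset. Hypothetical helper.
codeset_of() {
    name=${1%%@*}                 # drop an optional "@modifier"
    case $name in
        *.*) printf '%s\n' "${name#*.}" ;;   # text after the first dot
        *)   printf '\n' ;;
    esac
}

# same_codeset A B
# Succeeds only if both names carry the same explicit codeset.
same_codeset() {
    a=$(codeset_of "$1")
    b=$(codeset_of "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Possible use before exporting a mixed setting:
#   same_codeset "$LANG" "$LC_CTYPE" || echo "warning: encodings may differ" >&2
```

With these helpers, the pair de and en_US.UTF-8 from the quoted example is
reported as a mismatch, while de_DE.UTF-8 and en_US.UTF-8 pass.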
