[EMAIL PROTECTED] wrote on 2002-10-16 14:48 UTC:
> I came across this older mail by Markus:
>
> > General warning: Please do not use the locale name en_US.UTF-8 anywhere
> > outside North America. Some older Solaris documentation suggested that
> > this is the only UTF-8 locale you'll ever need, as locales don't change
> > much of substance beyond the encoding anyway. This is no longer the
> > case today!
>
> The problem is that on many Sun installations, en_US.UTF-8 is the
> only UTF-8 locale available at all!
I can't reproduce this problem report on our current Suns:
$ uname -a ; locale -a | grep UTF-8
SunOS piper 5.8 Generic_108528-12 sun4u sparc SUNW,Ultra-4
en_US.UTF-8
fr.UTF-8
fr_FR.UTF-8
fr_FR.UTF-8@euro
de.UTF-8
es.UTF-8
it.UTF-8
ja_JP.UTF-8
ko.UTF-8
sv.UTF-8
zh.UTF-8
zh_TW.UTF-8
It is slightly unpleasant that there is no Commonwealth en.UTF-8 or
British en_GB.UTF-8, but as long as you use en_US only in LC_CTYPE and
not in LANG, you are usually fairly safe from the terror of US cultural
conventions.
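This works because of the usual POSIX precedence among the locale
variables: LC_ALL overrides everything, an individual LC_* variable
overrides LANG, and LANG is only the fallback. A minimal sketch of that
lookup in POSIX shell follows. The name effective_locale is a
hypothetical helper, not a standard utility, and the final fallback to
"POSIX" is an assumption, since the real default is implementation-defined.

```shell
# effective_locale CATEGORY
# Print the locale name that applies to CATEGORY (e.g. LC_CTYPE),
# checking LC_ALL first, then the category variable, then LANG.
# Hypothetical helper for illustration only.
effective_locale() {
    category=$1
    # Indirectly read the variable whose name is in $category.
    eval "value=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$value" ]; then
        printf '%s\n' "$value"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Assumed default; the actual one is implementation-defined.
        printf '%s\n' POSIX
    fi
}
```

With LC_ALL unset, LC_CTYPE=en_US.UTF-8 and LANG set to some other
locale, only "effective_locale LC_CTYPE" reports en_US.UTF-8; categories
such as LC_TIME or LC_MONETARY still resolve to the LANG value.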
> A decent solution to this problem would be to handle basic locale
> information ("en_US") and encoding suffix ("UTF-8") separately and
> specify that ANY available locale can be suffixed with ANY known
> encoding, so installed de, gb, whatever locales could always be
> run with UTF-8.
> Is anything specified anywhere about this?
http://www.opengroup.org/onlinepubs/007904975/functions/setlocale.html
In principle, you could set
LANG=de LC_CTYPE=en_US.UTF-8
In practice, however, if "de" is an ISO 8859-1 locale, then it will
contain only collating data for ISO 8859-1 and will therefore not work
as well as a full UTF-8 locale that comes with all the necessary
collating data. The locales that you mix via LC_* should therefore
preferably use identical encodings.
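As a rough illustration of that rule, the following POSIX shell sketch
compares the codeset suffixes of a set of locale names. Both function
names are hypothetical, and the check looks only at the names: it cannot
tell which encoding a suffix-less locale such as "de" really uses, so it
treats such names as a mismatch. Running "locale charmap" under the
locale in question is the more reliable way to find out.

```shell
# codeset_of LOCALE
# Print the codeset part of a name of the form
# language[_territory][.codeset][@modifier]; print an empty line
# if the name carries no explicit codeset (as with "de" or "C").
codeset_of() {
    name=${1%%@*}            # drop an optional @modifier
    case $name in
        *.*) printf '%s\n' "${name#*.}" ;;
        *)   printf '\n' ;;
    esac
}

# same_codeset LOCALE...
# Succeed only if every given name carries the same explicit codeset.
# The comparison is an exact string match, so "UTF-8" and "utf8" differ.
same_codeset() {
    first=$(codeset_of "$1")
    [ -n "$first" ] || return 1
    for loc in "$@"; do
        [ "$(codeset_of "$loc")" = "$first" ] || return 1
    done
    return 0
}
```

For example, "same_codeset de en_US.UTF-8" fails, because "de" names no
codeset, while "same_codeset de.UTF-8 en_US.UTF-8" succeeds.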
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/