On Wed, Oct 16, 2002 at 04:48:15PM +0200, [EMAIL PROTECTED] wrote:
> The problem is that on many Sun installations, en_US.UTF-8 is the
> only UTF-8 locale available at all!
> A decent solution to this problem would be to handle basic locale
> information ("en_US") and encoding suffix ("UTF-8") separately and
> specify that ANY available locale can be suffixed with ANY known
> encoding, so installed de, gb, whatever locales could always be
> run with UTF-8.
> Is anything specified anywhere about this? Perhaps someone might
> nag Sun to fix this broken thing.
A good reason for doing something like this: Admins, with no personal
interest in UTF-8 and few users using it, are likely to only generate
legacy locales, and not enable UTF-8 ones. This probably doesn't reflect
any particular desire *not* to have it; they just don't know the difference
(and shouldn't need to).
So, even though my system and terminal are UTF-8, and all of the systems
I connect to are *capable* of it, only a few actually have the locale
available. This is a senseless hurdle to using UTF-8; I have to nag
admins to generate UTF-8 locales, even though all of the software I'm
using has already been updated to handle it! Long before UTF-8 can ever
be the default encoding everywhere, it needs to be *available* everywhere
(without root intervention).
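
To make the "base locale plus encoding suffix" idea concrete, here is a
rough Python sketch that splits locale names into a base and a codeset and
reports which bases have no UTF-8 variant. The function names are made up
for illustration, and the input is assumed to be a list of names in the
form `locale -a` prints (e.g. `en_US.utf8`, `de_DE.iso88591`, or a bare
`de_DE`).

```python
def split_locale(name):
    """Split a locale name into (base, normalized codeset).

    Assumed format: language[_territory][.codeset][@modifier].
    The modifier, if any, is kept as part of the base.
    """
    rest, _, modifier = name.partition("@")
    base, _, codeset = rest.partition(".")
    if modifier:
        base = base + "@" + modifier
    # Normalize so "UTF-8", "utf-8" and "utf8" all compare equal.
    codeset = codeset.replace("-", "").lower()
    return base, codeset


def bases_without_utf8(names):
    """Return the sorted base locales that lack a UTF-8 variant."""
    bases = set()
    have_utf8 = set()
    for name in names:
        base, codeset = split_locale(name)
        if base in ("C", "POSIX"):
            continue  # not language locales; skip them
        bases.add(base)
        if codeset == "utf8":
            have_utf8.add(base)
    return sorted(bases - have_utf8)
```

On a real system the list could come from something like
`subprocess.check_output(["locale", "-a"], text=True).split()`; anything
the function returns is a locale that would need an admin to generate its
UTF-8 counterpart.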
This is a problem on Debian, at least. It shows a list of locale names;
you only get UTF-8 if you ask for it. It should probably show a list of
country/language codes; e.g. choosing en_US should generate both
en_US (ISO-8859-1) and en_US.UTF-8, unless the user specifically asks
for UTF-8 to not be generated.
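
A minimal sketch of that behaviour, in Python: pick a country/language
code and get back both the legacy and the UTF-8 entry. It assumes the
two-column `name charset` form used in /etc/locale.gen; the
`LEGACY_CHARSET` table and the function name are hypothetical, and a real
tool would take the legacy charset from the system's locale data rather
than a hard-coded dictionary.

```python
# Hypothetical table of legacy charsets, for illustration only.
LEGACY_CHARSET = {
    "en_US": "ISO-8859-1",
    "de_DE": "ISO-8859-1",
    "ru_RU": "ISO-8859-5",
}


def locale_gen_lines(base, want_utf8=True):
    """Return locale.gen-style lines for one country/language code.

    The legacy entry is emitted when a legacy charset is known; the
    UTF-8 entry is emitted unless the caller explicitly opts out.
    """
    lines = []
    legacy = LEGACY_CHARSET.get(base)
    if legacy is not None:
        lines.append(f"{base} {legacy}")
    if want_utf8:
        lines.append(f"{base}.UTF-8 UTF-8")
    return lines
```

With this, `locale_gen_lines("en_US")` yields the two entries described
above, and `want_utf8=False` is the explicit opt-out.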
I suppose one reason this isn't done is that locale generation does
take quite a while (maybe 20 seconds per locale on my system). There
are probably other, less obvious reasons this isn't done, but I don't
know them. One such might be http://bugs.debian.org/99623 ; but that
doesn't seem to prevent generating UTF-8 most of the time.
This would be less of an issue if locales could convert on the fly; e.g.
if only en_US.ISO-8859-1 is selected, and I'm in en_US.UTF-8, iconv it
on the fly--slower, but better than not working. Complicated, though,
and there are probably plenty of not-so-subtle reasons this isn't done.
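
For what it's worth, the conversion step by itself is the easy part. A
sketch in Python, using its built-in codecs rather than iconv proper (the
function name and defaults are my own, not anything a libc actually does):

```python
def transcode(data, source="iso-8859-1", target="utf-8"):
    """Re-encode bytes from the source charset into the target charset.

    Characters the target cannot represent are replaced with "?"
    instead of raising an error.
    """
    text = data.decode(source)
    return text.encode(target, errors="replace")
```

This covers only the byte-level conversion of strings such as messages
stored in the legacy encoding. Going the other way (UTF-8 to a legacy
charset) is lossy, which is one of the more obvious complications.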
--
Glenn Maynard
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/