On Wed, Oct 16, 2002 at 04:48:15PM +0200, [EMAIL PROTECTED] wrote:
> The problem is that on many Sun installations, en_US.UTF-8 is the
> only UTF-8 locale available at all!
> A decent solution to this problem would be to handle basic locale
> information ("en_US") and encoding suffix ("UTF-8") separately and
> specify that ANY available locale can be suffixed with ANY known
> encoding, so installed de, gb, whatever locales could always be
> run with UTF-8.
> Is anything specified anywhere about this? Perhaps someone might
> nag Sun to fix this broken thing.
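
As a rough illustration of the split proposed in the quoted text, here is a minimal Python sketch. It assumes the usual POSIX-style name layout language[_territory][.codeset][@modifier]. The helper names split_locale and with_codeset are hypothetical, and the code only manipulates names; it does not check whether the resulting locale is actually installed.

```python
def split_locale(name):
    """Split a locale name into (base, codeset, modifier).

    Assumes the layout language[_territory][.codeset][@modifier];
    missing parts are returned as None.
    """
    # The modifier, if any, follows the last component after '@'.
    rest, _, modifier = name.partition("@")
    # The codeset, if any, follows the base name after '.'.
    base, _, codeset = rest.partition(".")
    return base, codeset or None, modifier or None


def with_codeset(name, codeset="UTF-8"):
    """Return the same locale name with its codeset replaced (hypothetical helper)."""
    base, _, modifier = split_locale(name)
    result = f"{base}.{codeset}"
    if modifier:
        result += f"@{modifier}"
    return result


if __name__ == "__main__":
    for name in ("en_US", "en_US.ISO-8859-1", "de_DE@euro"):
        print(name, "->", with_codeset(name))
```

Treating the codeset as a separable suffix like this is only the naming half of the idea; whether a system can actually serve an arbitrary base/codeset pair is the open question in this thread.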

A good reason for doing something like this: admins with no personal
interest in UTF-8, and few users using it, are likely to generate only
legacy locales and not enable the UTF-8 ones. This probably isn't any
particular desire *not* to have it; they just don't know the difference
(and shouldn't need to).

So, even though my system and terminal are UTF-8, and all of the systems
I connect to are *capable* of it, only a few actually have the locale
available. This is a senseless hurdle to using UTF-8; I have to nag
admins to generate UTF-8 locales, even though all of the software I'm
using has already been updated to handle it! Long before UTF-8 can ever
be the default encoding everywhere, it needs to be *available*
everywhere (without root intervention).

This is a problem on Debian, at least. Its locale setup shows a list of
locale names; you only get UTF-8 if you ask for it. It should probably
show a list of country/language codes instead; e.g. choosing en_US
should generate both en_US (ISO-8859-1) and en_US.UTF-8, unless the user
specifically asks for UTF-8 not to be generated.

I suppose one reason this isn't done is that locale generation takes
quite a while (maybe 20 seconds per locale on my system). There are
probably other, less obvious reasons, but I don't know them. One such
might be http://bugs.debian.org/99623 ; but that doesn't seem to prevent
generating UTF-8 most of the time.

This would be less of an issue if locales could convert on the fly; e.g.
if only en_US.ISO-8859-1 is generated and I'm in en_US.UTF-8, iconv it
on the fly: slower, but better than not working. That would be
complicated, though, and there are probably plenty of not-so-subtle
reasons it isn't done.

-- 
Glenn Maynard
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/