Re: [gentoo-user] Glibc, userlocales, and ENV Variables

Holly Bostick Wed, 02 Nov 2005 12:21:16 -0800

Hans-Werner Hilse wrote:
> Hi,
> 
> On Wed, 02 Nov 2005 15:53:11 +0100 Holly Bostick <[EMAIL PROTECTED]> 
> wrote:
> 
> 
>> [...] /etc/locales.build
>> 
>> which says
>> 
>> # This file names the list of locales to be built when glibc is installed.
>> # The format is <locale>/<charmap>, where <locale> is a locale from the
>> # /usr/share/i18n/locales directory, and <charmap> is name of one of the files
>> # in /usr/share/i18n/charmaps/. All blank lines and lines starting with # are
>> # ignored. Here is an example:
>> # en_US/ISO-8859-1
>>
>> [...] Glibc built fine (afaict), but my problem is that I now don't know
>> what to export with a LANG variable.
>> 
>> For example, if I want [EMAIL PROTECTED]/UTF-8, how do I export that as 
>> opposed to [EMAIL PROTECTED]/ISO-8859-15 (or worse, ISO-8859-1)?
> 
> 
> Note the comment you've cited: The format is "locale/charmap". This 
> generates the locale data for a certain "language" (it's a little bit
>  more than just language, though) for the specified charmap.
> 
> In LANG/LC_* you only set the locale. The charmap is (semi-) 
> automatically chosen, which makes sense, since it's terminal 
> dependent which charset is used.
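
To illustrate the locale-versus-charmap split described in the quote: glibc locale names follow the pattern language[_territory][.codeset][@modifier], so a value for LANG may optionally name the charmap after a dot. The following is only a minimal Python sketch of that naming pattern; the function name `parse_locale_name` and the example values `en_US.UTF-8` and `de_DE@euro` are illustrative assumptions, not entries taken from the locales.build discussed here.

```python
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


def parse_locale_name(name: str) -> LocaleName:
    """Split language[_territory][.codeset][@modifier] into its parts.

    Hypothetical helper for illustration; it does not ask glibc anything.
    """
    rest, _, modifier = name.partition("@")   # "@euro" style modifier, if any
    rest, _, codeset = rest.partition(".")    # ".UTF-8" style codeset, if any
    language, _, territory = rest.partition("_")
    return LocaleName(language, territory or None, codeset or None, modifier or None)


if __name__ == "__main__":
    for example in ("en_US.UTF-8", "de_DE@euro"):
        print(example, "->", parse_locale_name(example))
```

In the first example the codeset is stated explicitly; in the second it is left out, and only the modifier is given.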


OK, I kinda get that.... and dmesg says during boot that the terminal
(agetty) is being configured to use UTF-8 (which is what I told it to do
when I built the kernel, so that's OK).

So does that mean that when I log in to my DE/WM, and start X, the
charmap will be automatically UTF-8, because that's what the getty was?

I want the full ISO-8859-15 charset and the Euro symbol. UTF-8 gets me
the charset, but afaik I need some attachment to "@euro" to get the Euro
symbol (for those fonts that even have the character(s), which is
another horror show that I won't get into, since once you've found a
reasonably attractive font with all the characters, half the time it
doesn't have bold or italic or bold italic, so it's not very useful on
the desktop.... a horror show).

It's not clear to me whether the Euro symbol is included in UTF-8
encodings, or only as a special variant of ISO-8859-15 (the "@euro"
variant), which is one of the reasons I try to encode both.
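
On the narrow question of which encodings can represent the Euro sign at all, a quick check is possible outside of glibc. This is a small Python sketch using Python's own codec tables (an assumption: they are taken here as a stand-in for the charmaps, not as a statement about how glibc locales or fonts behave); the helper name `can_encode` is made up for the example.

```python
EURO = "\u20ac"  # the Euro sign as a Unicode code point


def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for charset in ("utf-8", "iso-8859-15", "iso-8859-1"):
        print(charset, can_encode(EURO, charset))
```

UTF-8 and ISO-8859-15 both have a place for the character, while ISO-8859-1 does not. Whether a given font can draw it is a separate matter.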

> 
> 
>> Was I supposed to give the locales individual names as the 
>> Localization Guide implies? locales.build doesn't indicate that you
>>  can do that (and in fact, I thought perhaps the reason why 
>> language exports were mildly borked might be because I had done 
>> so).
> 
> 
> [EMAIL PROTECTED]/ISO-8859-1 didn't make much sense to me (and maybe causes
>  some failures when building?), but other than that it seemed OK.

Well, of course I know less about this than you do, but my native Dutch
boyfriend runs an English Windows machine, I run Windows programs with
Wine, and about the only thing I think I know about the whole issue is
that Windows pretty much only knows ISO-8859-1 (unless you had a
multi-lingual version, which neither of us did). So I wanted support for
ISO-8859-1 to be available (with support for the Euro symbol for those
MS fonts that support it, which I think that the core MS fonts now do by
default, though I'm not sure about that either).

In any case, if such an application called for ISO-8859-1 , I wanted it
to be there, though as you can tell, I don't get how this is all
supposed to work well enough to be sure that was the way to accomplish
the goal.

> 
> 
>> Should I just get rid of the 'extra' locales (ISO-8859-15 and 
>> ISO-8859-1)? Since I guess I'm going to try to stick to UTF-8, 
>> maybe I don't really need them (I was mostly covering my butt, 
>> concerned that my current and future network connections might not 
>> support UTF-8, since they're mostly to Windows machines).
> 
> 
> All the terminals you're using support UTF-8?

Well, I thought so, but maybe I was wrong. I use mostly
multi-gnome-terminal (which does appear to have unicode support by
default), but when I switched window managers to fvwm-crystal, I started
using mrxvt and aterm a bit more (because fvwm-crystal "likes" them, and
xterm-- which crystal also "likes"-- takes forever to open for some
reason, likely unrelated but very annoying). This may well be when I
started noticing this as a problem rather than an annoyance, because I
was suddenly seeing it so much. Previously, the issue had only raised
its ugly head in some X programs, but not X programs I use that often,
so it was easy to ignore.

None of the terms I use have a unicode USE flag, but I have been by the
homepages. Now I see that support for CJK does not mean that UTF-8 is
automatically supported; it seems that mrxvt does not support unicode,
nor do aterm/multi-aterm/rxvt.

OK, that answers that, I guess, but what did "you" Europeans do when
these terminals were all you had, for Pete's sake? Your output would
have been half-gibberish, and I don't see how people would have stood
for that.

> 
> 
>> I guess I've made a mistake, but I'm not quite sure what to do 
>> about it. Since fixing it will almost certainly require a 
>> recompile of glibc, and since compiling glibc takes nine-tenths of 
>> forever, I'd like to get on with it as soon as possible (sigh). 
>> So any hints would be appreciated.
> 
> 
> How does the "borkism" of your locales manifest?

Most of the time, when Dutch characters are meant to be used, they are,
as in the following example:

[EMAIL PROTECTED] -> killall -9 conky
conky: geen proces beëindigd

but sometimes I get this:

 killall -9 MPlayer
MPlayer: geen proces beëindigd

..... now *that's* interesting... I copied and pasted the second from a
terminal (mrxvt, whereas the first was from multi-gnome-terminal), where
what appeared was

 killall -9 MPlayer
MPlayer: geen proces beÃ«indigd   (an A with the ~ over it, then << but tiny ones)

in place of the ë . But when I pasted it into this compose window, it
came out right! But it isn't in the term.
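
That pattern (an accented A followed by tiny angle quotes where one ë should be) is what results when UTF-8 bytes are displayed by something that assumes a single-byte Latin charset. A minimal Python sketch of that effect follows; it assumes the program emits UTF-8 and the terminal interprets it as ISO-8859-15, which is a guess about this setup rather than something confirmed, and the function name `misread` is invented for the example.

```python
def misread(text: str, actual: str = "utf-8", assumed: str = "iso-8859-15") -> str:
    """Encode text in its actual charset, then decode it as the assumed one."""
    return text.encode(actual).decode(assumed)


if __name__ == "__main__":
    # The two UTF-8 bytes of "ë" are shown as two separate Latin characters.
    print(misread("beëindigd"))
```

The bytes themselves are unchanged, which would fit with the text pasting correctly into a program that does understand UTF-8.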

And sometimes the same thing happens in X programs (depending on....
something I now perhaps have a hint about, but am not sure): the "Copy"
command (in Dutch, "Kopiëren") appears with the same mangling of the ë
character.

But I've just noticed that when I tried to copy and paste the 'borked'
output to text and then copy it to the compose window (which still
pasted correctly, which was for once not what I wanted), that I used
gnotepad+ (for speed) rather than gedit-- and gnotepad+ displays the
lack of Unicode support as well (Kopiëren borked).

Is that because it's a GTK+ 1 program? (That's really all it could be,
seems to me.) GTK+-1.2.10-r11 is compiled with nls support, but that
doesn't mean unicode support?

What a mess... does this mean I have to set GTK 1 somehow to use
ISO-8859-15 and all the 'modern' programs to use UTF-8 as they do?

Or give up/prune any GTK 1 programs I might use, and solve the problem
that way?

I mean, is it really so much to ask that accented and special characters
appear correctly no matter what program I'm using? It's not like there's
so many of them!!! But I have to tie my system in knots to get it?

How do you do it? I presume that the bulk of your system is displayed in
German.

Sorry, I'm getting a bit frazzled by this, and I'm annoyed because I
don't think this should be a frazzle-worthy issue, but I've been
struggling with it off and on for the past three-and-a-half years, and
it's about time to get a handle on it.

Thanks for any further instruction,
Holly
-- 
gentoo-user@gentoo.org mailing list
