[gentoo-user] UTF-8 troubles

Matthias Bethke Thu, 30 Nov 2006 15:10:32 -0800

I switched a few systems to all-UTF-8 a while ago, and while it's
generally a big improvement, a few apps are playing up. These are
pretty common apps, most notably tin and centericq, so I think the
problem is probably on my end.
Thing is, tin seems to decode messages correctly and tries to show
umlauts. However, I only see the lowercase ä, ö and ü; the uppercase
versions and the German "sharp s" (ß) are garbled. The latter for
example is displayed as a diamond with a question mark inside
(supposedly indicating "invalid UTF-8 sequence") followed by "~_" (0x7e
0x5f; the correct UTF-8 sequence is 0xc3 0x9f). Centericq is similar; I
see all umlauts I type in the input area as two question marks, but the
lowercase ones get transmitted correctly and I can read others'
lowercase umlauts. No capitals, no ß either.
The only distinction I could make out between the sets of characters that
are displayed correctly and those that aren't is that the latter contain
UTF-8 bytes that would not be printable when interpreted as ISO-8859-x,
so my hypothesis is that something in between the app's text output and
the terminal eats bytes unless they're deemed "printable".
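
To make that distinction concrete, here is a minimal Python sketch that
encodes the affected characters as UTF-8 and classifies each byte the
way an ISO-8859-1 reader would. The helper names (latin1_class, survey,
tilde_escape) are made up for illustration. The "~" escaping rule in
tilde_escape is an assumption about how a non-UTF-8-aware display layer
might render C1 control bytes; I have not confirmed that this is what
happens on my systems.

```python
def latin1_class(byte):
    """Classify one byte as an ISO-8859-1 consumer would see it."""
    if byte < 0x20 or byte == 0x7F:
        return "C0 control"
    if byte < 0x80:
        return "ASCII printable"
    if byte < 0xA0:
        return "C1 control"          # 0x80..0x9f: not printable in Latin-1
    return "Latin-1 printable"       # 0xa0..0xff


def survey(chars):
    """Return (char, hex bytes, per-byte class) for each character."""
    rows = []
    for ch in chars:
        encoded = ch.encode("utf-8")
        hexbytes = " ".join(f"{b:02x}" for b in encoded)
        rows.append((ch, hexbytes, [latin1_class(b) for b in encoded]))
    return rows


def tilde_escape(byte):
    """ASSUMPTION: C1 bytes get shown as '~' plus (byte - 0x40).

    Returns None for bytes outside 0x80..0x9f.
    """
    if 0x80 <= byte < 0xA0:
        return "~" + chr(byte - 0x40)
    return None


# Print code points rather than the characters themselves so the output
# does not depend on the terminal's own encoding.
for ch, hexbytes, classes in survey("äöüÄÖÜß"):
    print(f"U+{ord(ch):04X}  {hexbytes}  second byte: {classes[1]}")
```

In this sketch the second byte of ä, ö and ü lands in 0xa0..0xff, while
the second byte of Ä, Ö, Ü and ß lands in the C1 range 0x80..0x9f, and
tilde_escape(0x9f) yields "~_", which matches what I see for ß.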
The affected programs all seem to use ncurses. I couldn't find anything
in terminfo that could be causing this, but then I don't have much of a
clue about terminfo in the first place. Google doesn't seem to have
heard of the problem. Any ideas where I could look?


cheers!
        Matthias
-- 
I prefer encrypted and signed messages. KeyID: FAC37665
Fingerprint: 8C16 3F0A A6FC DF0D 19B0  8DEF 48D9 1700 FAC3 7665

