On 7/28/07, Bernhard Kuemel <[EMAIL PROTECTED]> wrote:
> Hi debian-user!
>
> I converted to utf8 in the hope that my non ASCII character problems
> would disappear. They are now ... different.
>
> I used utf8migrationtool and locale now says:
>
> [EMAIL PROTECTED]:~$ locale
> LANG=en_US.UTF-8
<snip>
> I wanted to print a German text containing umlauts from a web page.
> I marked it in iceweasel and pasted it into a 'konsole' running bash
> running 'cat >x'. 'lpr x' printed only a page with the character 'K'.
>
> 'hexdump -C x' says:
>
> 00000010 20 20 20 20 20 20 4b fc 6e 64 69 67 75 6e 67 73 |      K.ndigungs|
> 00000020 62 65 73 63 68 72 e4 6e 6b 75 6e 67 65 6e 0a 0a |beschr.nkungen..|
>
> so ü is 0xfc, ä is 0xe4, and the characters are printed as
> periods '.'.
>
> mc's viewer says:
>
> 00000010 20 20 20 20 20 20 4B FC 6E 64 69 67 75 6E 67 73       Kündigungs
> 00000020 62 65 73 63 68 72 E4 6E 6B 75 6E 67 65 6E 0A 0A beschränkungen..
>
> Here ü is still only the single byte 0xFC, but it gets printed
> as 'A' with a tilde and a '1/4' character. ä is again 0xE4 but
> printed as 'A' with a tilde and a circle with 4 short lines
> extending from the circle diagonally.
>
> Opening x in openoffice writer shows rhombuses with question marks
> for each umlaut.
>
> Opening x.html in openoffice writer I was unable to remove all the
> table etc. stuff and so was unable to reformat the text so it would
> fit on one page. Hmm, it might work, if I copied the text from there
> into a new document. But here I want to solve the locale problems,
> or what should I call the problem?

I think this has to do with the use of HTML entities (&auml;) instead
of actual UTF-8 characters. An additional possible issue is that the
web page may not be UTF-8.

When I want to fix up an HTML page before printing it, I use a WYSIWYG
HTML editor (I use vim when writing my own HTML). SeaMonkey Composer /
Nvu / KompoZer (which are basically all the same program in different
forms) have worked well for me.

> mc (midnight commander, a norton commander clone) of course goes
> crazy again, but I was not surprised and accepted that it prints 'a'
> with '^' instead of line art, etc.
> More serious was that when I
> 'ssh'ed to a different computer (not sure which) it got confused
> about which line it was on and I messed up editing /etc/fstab.
>
> man gets quote characters wrong, printing 'a' with '^' instead and
> so does gcc.
>
> I also have problems with kvirc. IIRC I can get it to display
> iso8859-1 correctly, but not utf8, and the smart utf8/iso8859-1 mode
> does not work. I chat with users who use iso8859-1 and utf8.

This sounds more like a real locale problem. Have you tried running
"dpkg-reconfigure locales"? That can fix some locale problems.

Cheers,
Kelly
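
A note on the hexdump quoted above: 0xfc and 0xe4 as single bytes are how
ISO-8859-1 encodes ü and ä, and a lone 0xfc byte is never valid UTF-8, so
the file x does not appear to be UTF-8 at all. Below is a minimal Python
sketch for checking this and transcoding, not a definitive diagnosis. It
assumes the source encoding is ISO-8859-1 (ISO-8859-15 and Windows-1252
agree on these two bytes), and the helper name is_valid_utf8 is made up
for illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The non-space bytes of the word from the quoted hexdump.
raw = bytes.fromhex(
    "4b fc 6e 64 69 67 75 6e 67 73"
    " 62 65 73 63 68 72 e4 6e 6b 75 6e 67 65 6e"
)

print(is_valid_utf8(raw))  # the lone 0xfc byte makes this False

# Assumption: the bytes are ISO-8859-1. Decode them, then re-encode as UTF-8.
text = raw.decode("iso-8859-1")
fixed = text.encode("utf-8")
print(text)
print(is_valid_utf8(fixed))

# For comparison: UTF-8 bytes misread as ISO-8859-1 turn each umlaut
# into two characters.
mojibake_u = "\u00fc".encode("utf-8").decode("iso-8859-1")
mojibake_a = "\u00e4".encode("utf-8").decode("iso-8859-1")
print(mojibake_u, mojibake_a)
```

The last two values come out as 'Ã' followed by '¼' and 'Ã' followed by
'¤', which resembles the "A with a tilde" pairs described for mc's viewer.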

