These days we have at least xterm, urxvt, mlterm, gnome-terminal, and konsole, which support utf-8 fairly well, but on the flip side there is still a huge number of terminal emulators which do not respect the user's encoding at all and always behave as if the display were a legacy 8-bit codepage.
Trying to help users in #irssi, etc. with charset issues, I've come to believe that this is a fairly significant problem: users get frustrated with utf-8 because the terminal emulator they want to use (which might be chosen out of anti-bloat sentiment or, quite the opposite, out of a desire for specialized eye candy only available in one or two programs) forces their system into a mixed-encoding scenario where they have both utf-8 and non-utf-8 data in the filesystem and in text files.

How hard would it be to go through the available terminal emulators, evaluate which ones lack utf-8 support, and provide at least minimal fixes? In particular, are there any volunteers?

What I'm thinking of as a minimal fix is just putting utf-8 conversion into the input and output layers. It would still be fine for most users of these apps if the terminal were limited to a 256-character subset of UCS, didn't support combining characters or CJK, etc., as long as the data sent and received over the PTY device is valid UTF-8, so that applications running on the terminal can keep making the (valid and correct) assumption that characters are encoded in the locale's encoding.

Perhaps this could be done via a "reverse luit" -- that is, a program like luit, or an extension to luit, that assumes the physical terminal is using an 8-bit legacy codepage rather than UTF-8. Then these terminals could simply be patched to run luit whenever the locale's encoding is not single-byte.
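To make the idea concrete, here is a very rough sketch of the kind of conversion I mean, done with plain iconv(3). This is not luit and not meant as a real patch; the function names and the hardcoded ISO-8859-1 internal charset are just placeholders for whatever 8-bit table a given emulator already draws with, and error handling is omitted.

/* Rough sketch only: one possible shape of the "minimal fix", using iconv(3).
 * The terminal's internal charset is assumed to be ISO-8859-1 here; a real
 * patch would use whatever 8-bit table the emulator already draws with. */

#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Output layer: bytes read from the PTY are UTF-8 (the locale's encoding);
 * convert them to the terminal's 8-bit charset before drawing.  Characters
 * outside the 256-character subset get transliterated or dropped
 * (//TRANSLIT is a glibc extension). */
static size_t pty_to_display(const char *in, size_t inlen,
                             char *out, size_t outlen)
{
    iconv_t cd = iconv_open("ISO-8859-1//TRANSLIT", "UTF-8");
    if (cd == (iconv_t)-1)
        return 0;
    char *inp = (char *)in, *outp = out;
    size_t inleft = inlen, outleft = outlen;
    iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    return outlen - outleft;
}

/* Input layer: keystrokes the emulator generates in its 8-bit charset are
 * converted to UTF-8 before being written to the PTY, so applications on
 * the slave side only ever see valid UTF-8. */
static size_t keyboard_to_pty(const char *in, size_t inlen,
                              char *out, size_t outlen)
{
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd == (iconv_t)-1)
        return 0;
    char *inp = (char *)in, *outp = out;
    size_t inleft = inlen, outleft = outlen;
    iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    return outlen - outleft;
}

int main(void)
{
    char disp[64], back[64];
    const char *from_app = "caf\xc3\xa9";   /* UTF-8 arriving over the PTY */
    size_t n = pty_to_display(from_app, strlen(from_app), disp, sizeof disp);
    size_t m = keyboard_to_pty(disp, n, back, sizeof back);
    printf("drawn as %zu bytes, sent back as %zu bytes of UTF-8\n", n, m);
    return 0;
}

Whether to turn the conversion on at all could be decided the obvious way: look at nl_langinfo(CODESET) at startup and only interpose when it reports a multibyte encoding such as UTF-8.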
Rich

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/