just two cents: i did this some years back for the links and elinks web browsers (it's the "utf-8 i/o" option available in some versions of each) and the results are fairly mixed -- copy-n-paste fails horribly in an app converted in this way, and i assume the same would be true of a terminal emulator in a window system like X11. on the other hand, it meant i and others could use these browsers on e.g. mac os x years before someone undertook the much more in-depth utf-8 and unicode support now in progress for elinks.
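for illustration, here is a minimal sketch of what conversion at the i/o layer amounts to. it is not the code from links or elinks; the class name `Utf8IoLayer`, its method names, and the choice of latin-1 as the internal 8-bit charset are all assumptions made for the example.

```python
import codecs


class Utf8IoLayer:
    """Hypothetical shim between an 8-bit terminal core and a UTF-8 PTY."""

    def __init__(self, legacy="latin-1"):
        # Assumption: the terminal core works internally in one 8-bit charset.
        self.legacy = legacy
        # An incremental decoder buffers a multi-byte UTF-8 sequence that
        # happens to be split across two reads from the PTY.
        self._decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

    def from_pty(self, data: bytes) -> bytes:
        """UTF-8 bytes written by the application -> legacy bytes to display."""
        text = self._decoder.decode(data)
        # Anything outside the legacy repertoire degrades to '?'.
        return text.encode(self.legacy, errors="replace")

    def to_pty(self, keys: bytes) -> bytes:
        """Legacy bytes from the keyboard -> valid UTF-8 for the application."""
        return keys.decode(self.legacy, errors="replace").encode("utf-8")


layer = Utf8IoLayer()
typed = layer.to_pty(b"caf\xe9")           # latin-1 keyboard input
shown = layer.from_pty(b"caf\xc3\xa9 ok")  # UTF-8 application output
```

ascii bytes, including ordinary escape sequences, pass through unchanged in both directions. the sketch only touches the byte stream on the pty, so everything else in the program still sees the legacy charset.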
using luit for this sounds appealing, but in my experience luit (a) crashes frequently and (b) is easily confused by escape sequences and has no user interface for resetting all its iso-2022 state, so in practice it works for only a few apps. that said, it would probably be better than the current state of affairs.

On 2/23/07, Rich Felker <[EMAIL PROTECTED]> wrote:
These days we have at least xterm, urxvt, mlterm, gnome-terminal, and konsole which support utf-8 fairly well, but on the flip side there's still a huge number of terminal emulators which do not respect the user's encoding at all and always behave in a legacy-8bit-codepage way. Trying to help users in #irssi, etc. with charset issues, I've come to believe that it's a fairly significant problem: users get frustrated with utf-8 because the terminal emulator they want to use (which might be chosen based on anti-bloat sentiment or, quite the opposite, on a desire for specialized eye candy only available in one or two programs) forces their system into a mixed-encoding scenario where they have both utf-8 and non-utf-8 data in the filesystem and text files.

How hard would it be to go through the available terminal emulators, evaluate which ones lack utf-8 support, and provide at least minimal fixes? In particular, are there any volunteers?

What I'm thinking of as a minimal fix is just putting utf-8 conversion into the input and output layers. It would still be fine for most users of these apps if the terminal were limited to a 256-character subset of UCS, didn't support combining characters or CJK, etc. as long as the data sent and received over the PTY device is valid UTF-8, so that the (valid and correct) assumption of applications running on the terminal that characters are encoded in the locale's encoding is satisfied.

Perhaps this could be done via a "reverse luit" -- that is, a program like luit or an extension to luit that assumes the physical terminal is using an 8bit legacy codepage rather than UTF-8. Then these terminals could simply be patched to run luit if the locale's encoding is not single-byte.

Rich

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
