Frank da Cruz wrote on 2001-05-30 16:39 UTC:
> Yes, of course if you modify the terminal driver to decode UTF-8 before
> handling control characters, that solves the problem.
>
> However, the original goal of UTF-8 was to be able to use it "behind the
> back" of UNIX, just as we do now with ISO 8859. But because UTF-8 is not
> C1-safe, this idea breaks down in the terminal-to-host direction, and now
> terminal drivers must be modified.
I am not aware of any common practice of assigning special semantics to C1
control characters in Unix tty drivers. Could you be specific about where
exactly UTF-8 breaks something because C1 bytes are interpreted before there
is a realistic chance of inserting a UTF-8 decoder? I can't think of
anything in the Unix world!
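
To make the byte overlap under discussion concrete, here is a minimal Python sketch. The helper name c1_bytes is made up for illustration. It lists the bytes of a string's UTF-8 encoding that fall into the C1 range 0x80..0x9F, which is what a byte-oriented device would see if it interpreted C1 controls before any UTF-8 decoding took place.

```python
def c1_bytes(text: str) -> list:
    """Return the bytes of text's UTF-8 encoding that lie in 0x80..0x9F.

    Hypothetical helper for illustration only; it is not part of any
    terminal driver or library.
    """
    return [b for b in text.encode("utf-8") if 0x80 <= b <= 0x9F]


# U+20AC EURO SIGN encodes as E2 82 AC; 0x82 lies in the C1 range.
print([hex(b) for b in c1_bytes("\u20ac")])

# U+06DB encodes as DB 9B; 0x9B is the C1 code for CSI.
print([hex(b) for b in c1_bytes("\u06db")])

# U+00E4 encodes as C3 A4; neither byte lies in the C1 range.
print([hex(b) for b in c1_bytes("\u00e4")])
```

Whether such a byte ever does harm depends on whether something on the path actually acts on C1 codes before decoding, which is exactly the open question here.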
The only problem with UTF-8 that we have with ttys is that their
"cooked" mode is a full-fledged editor that is not aware of *any*
multi-byte character encodings, including UTF-8. My hope is that one of
the next POSIX revisions will add a UTF-8 flag to struct termios, but I
have no idea whether that is already in the queue.
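
The cooked-mode problem can be sketched in a few lines of Python. Both function names below are hypothetical and only model the erase key. This is not how any real tty driver is written, just an illustration of the difference between erasing one byte and erasing one UTF-8 character.

```python
def erase_bytewise(buf: bytes) -> bytes:
    """Model of an encoding-unaware erase: drop exactly one byte."""
    return buf[:-1]


def erase_utf8(buf: bytes) -> bytes:
    """Model of a UTF-8-aware erase: drop one whole encoded character."""
    i = len(buf)
    # Step back over trailing continuation bytes (bit pattern 10xxxxxx).
    while i > 0 and (buf[i - 1] & 0xC0) == 0x80:
        i -= 1
    # Then drop the lead byte (or the single ASCII byte) in front of them.
    return buf[:max(i - 1, 0)]


line = "na\u00efve\u20ac".encode("utf-8")

# Byte-wise erase leaves a truncated sequence (E2 82) at the end of the line.
print(erase_bytewise(line))

# UTF-8-aware erase removes all three bytes of the euro sign.
print(erase_utf8(line).decode("utf-8"))
```

A termios flag of the kind hoped for above would, under this assumption, simply tell the line editor to behave like the second function instead of the first.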
I assume the other "host communication" problems that you have referred
to in a rather abstract way have to do with some DEC dinosaurs (VMS?) that
most of us really don't care about (because a cruel visionary had it
reimplemented, crossed with Windows 3.1, and it's now called WinNT. :-)
If it is VMS you are talking about, is there anyone interested in using
UTF-8 on VMS?
> It would have been possible to design a C1-safe UTF-8 (and in fact it has
> been done as a proof-of-concept), but it's too late now.
It was called UTF-1 and was for some time in ISO 10646-1:1993, until an
early amendment removed it. It was a quite dreadful encoding, with nasty
properties and modulo-190 arithmetic.
> So the real
> question is: can UTF-8 and ISO 4873 (which has specified the very structure
> of coded character sets for 30 years) coexist without special assistance
> from the terminal driver? No.
Which terminal drivers interpret C1 codes? I have never seen one!
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/