Frank da Cruz wrote on 2001-05-30 16:39 UTC:
> Yes, of course if you modify the terminal driver to decode UTF-8 before
> handling control characters, that solves the problem.
> 
> However, the original goal of UTF-8 was to be able to use it "behind the
> back" of UNIX, just as we do now with ISO 8859.  But because UTF-8 is not
> C1-safe, this idea breaks down in the terminal-to-host direction, and now
> terminal drivers must be modified.

I am not aware of any common practice of assigning special semantics to
C1 control characters in Unix tty drivers. Could you be specific about
where exactly UTF-8 breaks something because C1 bytes are interpreted
before there is a realistic chance of inserting a UTF-8 decoder? I can't
think of anything in the Unix world!
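
For concreteness, the overlap under discussion is that UTF-8
continuation bytes (0x80..0xBF) include the 8-bit C1 range (0x80..0x9F).
A minimal Python sketch of that check follows; the function name is made
up for illustration, and it models no real terminal driver:

```python
def c1_bytes_in_utf8(text: str) -> list[int]:
    """Return the bytes of text's UTF-8 encoding that lie in 0x80..0x9F.

    Hypothetical helper for illustration only. It reports which bytes a
    byte-oriented device that interprets 8-bit C1 controls would see as
    control codes instead of as parts of a multi-byte character.
    """
    return [b for b in text.encode("utf-8") if 0x80 <= b <= 0x9F]


# U+00E9 encodes as C3 A9: no byte falls in the C1 range.
print(c1_bytes_in_utf8("\u00e9"))
# U+20AC encodes as E2 82 AC: the byte 0x82 falls in the C1 range.
print(c1_bytes_in_utf8("\u20ac"))
# U+00DB encodes as C3 9B: the byte 0x9B is the 8-bit CSI code.
print(c1_bytes_in_utf8("\u00db"))
```

Only a component that acts on such bytes before any UTF-8 decoding is
affected by this.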

The only problem with UTF-8 that we have with ttys is that their
"cooked" mode is a full-fledged line editor that is not aware of *any*
multi-byte character encoding, including UTF-8. My hope is that one of
the next POSIX revisions will add a UTF-8 flag to struct termios, but I
have no idea whether that is already in the queue.
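
To illustrate the cooked-mode problem, here is a minimal Python sketch
that contrasts a byte-wise erase with a UTF-8-aware one. It is a toy
model of a line buffer, not actual tty driver code, and both function
names are hypothetical:

```python
def erase_byte(line: bytes) -> bytes:
    """Erase as an encoding-unaware line editor would: drop one byte."""
    return line[:-1]


def erase_utf8_char(line: bytes) -> bytes:
    """Erase one whole UTF-8 character from the end of the buffer.

    Walk backwards over continuation bytes (bit pattern 10xxxxxx) and
    stop at the first byte that is not a continuation byte, which is
    the lead byte of the last character.
    """
    i = len(line)
    while i > 0:
        i -= 1
        if line[i] & 0xC0 != 0x80:
            break
    return line[:i]


buf = "na\u00ef".encode("utf-8")   # b'na\xc3\xaf'
print(erase_byte(buf))             # leaves a dangling lead byte 0xC3
print(erase_utf8_char(buf))        # removes both bytes of U+00EF
```

The byte-wise version leaves an incomplete sequence in the buffer, which
is the kind of breakage an encoding-aware cooked mode would avoid.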

I assume the other "host communication" problems that you have referred
to in a rather abstract way have to do with some DEC dinosaurs (VMS?)
that most of us really don't care about (because a cruel visionary had it
reimplemented, crossed with Windows 3.1, and it's now called WinNT. :-)

If it is VMS you are talking about, is there anyone interested in using
UTF-8 on VMS?

> It would have been possible to design a C1-safe UTF-8 (and in fact it has
> been done as a proof-of-concept), but it's too late now.

It was called UTF-1 and was in ISO 10646-1:1993 for some time until an
early amendment removed it. It was quite a dreadful encoding, with nasty
properties and modular 96 arithmetic.

> So the real
> question is: can UTF-8 and ISO 4873 (which has specified the very structure
> of coded character sets for 30 years) coexist without special assistance
> from the terminal driver?  No.

Which terminal drivers interpret C1 codes? I have never seen one!

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
