Followup to:  <[EMAIL PROTECTED]>
By author:    Markus Kuhn <[EMAIL PROTECTED]>
In newsgroup: linux.utf8
> 
> > So the real
> > question is: can UTF-8 and ISO 4873 (which has specified the very structure
> > of coded character sets for 30 years) coexist without special assistance
> > from the terminal driver?  No.
> 
> Which terminal drivers interpret C1 codes? I have never seen one!
> 

He's talking about terminal emulators, I guess, and *most* of them do
(CSI especially).  This isn't a problem in practice, as I've pointed
out in the past, because C1 control codes were broken by so many other
things (nonstandard 8-bit character sets, 7-bit-only communications
channels, etc.) that by and large ALL uses of C1 are in the form
<ESC><GL>, where <GL> is the GL character 64 positions below the
corresponding C1 character; <ESC>[ for CSI is the classic
example.  This is in fact so close to universal that it isn't at all
hard to find terminal emulators which will not recognize C1 characters
in their "plain" 8-bit form; sometimes this is done intentionally (e.g.
in the PC world), sometimes it is accidental.
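
To make the "64 positions below" rule concrete, here is a minimal
sketch in Python.  The function names are mine, purely for
illustration; the only facts assumed are that C1 occupies 0x80-0x9F
and that ESC is 0x1B.

```python
ESC = 0x1B  # the 7-bit escape character


def c1_to_escape(c1: int) -> bytes:
    """Return the 7-bit <ESC><GL> equivalent of an 8-bit C1 control code."""
    if not 0x80 <= c1 <= 0x9F:
        raise ValueError("not a C1 control code: %#x" % c1)
    # The GL character sits 64 (0x40) positions below the C1 character.
    return bytes([ESC, c1 - 0x40])


def escape_to_c1(seq: bytes) -> int:
    """Inverse mapping: turn a two-byte <ESC><GL> sequence back into C1."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not an <ESC><GL> form of a C1 control code")
    return seq[1] + 0x40


CSI = 0x9B
print(c1_to_escape(CSI))  # b'\x1b[', i.e. <ESC>[
```

A terminal emulator that only understands the <ESC><GL> form never
needs to look at bytes 0x80-0x9F as controls at all.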

Yes, for UTF-8 to work properly the terminal emulator needs to be able
to recognize C1 characters using either the <ESC><GL> form (which, I
believe, you will find to be nearly universal) or the UTF-8 encoded
form (prefixed with a 0xC2 byte) -- I suspect that you will, in fact,
never see the latter form at all.
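
As a rough illustration of the two forms (a sketch only; the helper
names are made up for this example), Python's built-in UTF-8 codec
shows both the 0xC2 prefix and why a bare C1 byte cannot appear in a
UTF-8 stream:

```python
def utf8_c1(c1: int) -> bytes:
    """Encode a C1 control code (U+0080..U+009F) as UTF-8."""
    if not 0x80 <= c1 <= 0x9F:
        raise ValueError("not a C1 control code: %#x" % c1)
    return chr(c1).encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(utf8_c1(0x9B))             # b'\xc2\x9b': CSI with the 0xC2 prefix
print(is_valid_utf8(b"\x9b"))    # False: a lone 0x9B is a stray continuation byte
print(is_valid_utf8(b"\x1b["))   # True: the <ESC>[ form is plain ASCII
```

The <ESC><GL> form passes through a UTF-8 stream untouched, which is
why it is the safe one to rely on.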

I had someone ask me once why UTF-8 wasn't made Latin-1 invariant, for
ease of migration.  I pointed out that if UTF-8 had been restricted to
the C1 bytes (0x80-0x9F) for everything beyond Latin-1, it would have
been a much clumsier encoding.  The more you restrict your bytes, the
more awkward the resulting encoding gets.
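
A back-of-the-envelope count shows the effect.  This is a pure
counting bound under my own assumptions, not a description of any real
encoding: suppose a hypothetical Latin-1-invariant scheme may use only
the 32 C1 byte values (0x80-0x9F) for everything beyond Latin-1, and
compare that with the 64 continuation-byte values (0x80-0xBF) that
UTF-8 has available.

```python
def min_sequence_length(code_points: int, byte_values: int) -> int:
    """Smallest n such that byte_values ** n >= code_points.

    This is only a lower bound for fixed-length sequences; it ignores
    lead bytes, self-synchronization and everything else a real
    encoding needs, all of which make matters worse, not better.
    """
    length = 1
    capacity = byte_values
    while capacity < code_points:
        capacity *= byte_values
        length += 1
    return length


# Code points in the 16-bit range that are neither ASCII nor Latin-1.
beyond_latin1 = 0x10000 - 0x100

print(min_sequence_length(beyond_latin1, 32))  # 4
print(min_sequence_length(beyond_latin1, 64))  # 3
```

Even before any structure is added, halving the usable byte values
costs an extra byte per character over this range.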

It has been an unofficial fact for a long, long time that you can't
trust C1 to be nongraphic.  UTF-8 uses this fact to make the
encoding cleaner.  I personally think they made the right tradeoff.

        -hpa

-- 
<[EMAIL PROTECTED]> at work, <[EMAIL PROTECTED]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
