Markus Kuhn wrote:
> Bram Moolenaar wrote on 2001-02-15 21:08 UTC:
> > And since UTF-8 includes 0x9b bytes only as a trailing byte, it would
> > still be possible to use a single CSI byte.
>
> That is not what ISO 10646 says or what any of the UTF-8 terminal
> emulators implement. If you use C1 control sequences such as CSI under
> UTF-8, then you have to encode them using UTF-8 (that means prefixing
> with the byte 0xC2, i.e. 11000010 10xxxxxx), and not by inserting them as
> malformed UTF-8 sequences.
OK, that should work.
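
As an illustration of the point above, here is a minimal Python sketch. It is not taken from any terminal emulator, and the helper name is_valid_utf8 and the sample sequence "CSI 1 m" are only assumptions for the example. It shows that CSI (U+009B) encodes to the two bytes 0xc2 0x9b, and that a lone 0x9b byte is not well-formed UTF-8:

```python
# CSI is the C1 control character U+009B.
CSI = "\u009b"

# Encoding it as UTF-8 gives a two-byte sequence with the 0xC2 prefix.
encoded = CSI.encode("utf-8")
print(" ".join(f"{b:02x}" for b in encoded))


def is_valid_utf8(data: bytes) -> bool:
    """Return True when data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A bare 0x9b is a continuation byte without a lead byte: malformed.
lone_csi_valid = is_valid_utf8(b"\x9b" + b"1m")

# The same control sequence with CSI encoded as UTF-8 is well-formed.
encoded_csi_valid = is_valid_utf8(encoded + b"1m")

print(lone_csi_valid, encoded_csi_valid)
```
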
Now the question is how to handle this in termcap/terminfo entries. Using the
raw byte sequences there should work. The library could do the conversion
itself, but then it would need to know which encoding the terminal uses.
That makes it more complicated. It's probably better to store raw codes to
keep it straightforward.
But what if the termcap file itself is in UTF-8? Do the raw bytes then get
converted to UTF-8 a second time? Or is the character that the bytes
represent stored, so that it is converted to UTF-8 only once?
This is not a big problem, but a choice. It should be made very clear what
the choice is, and it should be documented with examples. Otherwise people
will make wrong termcap entries, which would be very annoying.
I suppose the termcap entries as obtained by tgetstr() should contain the raw
byte sequence, thus CSI would be 0xc2 0x9b. When the termcap file is in UTF-8
format, the file then contains the four bytes 0xc3 0x82 0xc2 0x9b.
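
The two choices can be compared with a small Python sketch. It assumes that "the termcap file is in UTF-8 format" means each raw byte is written as the character with the same code point (a Latin-1 interpretation); the variable names are made up for the example, and nothing here is an existing termcap or terminfo API:

```python
def hexdump(data: bytes) -> str:
    """Format bytes as space-separated hex, e.g. 'c2 9b'."""
    return " ".join(f"{b:02x}" for b in data)


# What tgetstr() would hand to the application: CSI encoded once as UTF-8.
raw = "\u009b".encode("utf-8")

# Choice 1: the file stores the character itself, so it is converted to
# UTF-8 only once and the file holds the same two bytes.
stored_once = raw

# Choice 2: each raw byte is treated as a character (assumed Latin-1
# interpretation) and the file is then written as UTF-8, so the bytes
# are converted a second time.
double_encoded = raw.decode("latin-1").encode("utf-8")

# A reader of the choice 2 file has to undo that second conversion to get
# back the raw byte sequence.
recovered = double_encoded.decode("utf-8").encode("latin-1")

print("raw:           ", hexdump(raw))
print("stored once:   ", hexdump(stored_once))
print("double encoded:", hexdump(double_encoded))
print("recovered:     ", hexdump(recovered))
```
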
--
[SIR LAUNCELOT runs back up the stairs, grabs a rope
off the wall and swings out over the heads of the CROWD in a
swashbuckling manner towards a large window. He stops just short
of the window and is left swinging pathetically back and forth.]
LAUNCELOT: Excuse me ... could somebody give me a push ...
"Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD
/// Bram Moolenaar -- [EMAIL PROTECTED] -- http://www.moolenaar.net \\\
((( Creator of Vim - http://www.vim.org -- ftp://ftp.vim.org/pub/vim )))
\\\ Help me helping AIDS orphans in Uganda - http://iccf-holland.org ///
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/