Re: Termcap and UTF-8 CSI

Markus Kuhn Fri, 16 Feb 2001 03:49:24 -0800
Bram Moolenaar wrote on 2001-02-16 11:30 UTC:
> But what if the termcap file is in UTF-8 itself, do the raw bytes then get
> converted to UTF-8 another time?  Or is the interpretation of the bytes
> stored, and then converted to UTF-8 only once?

You can write "\302\233" in /etc/termcap to represent the UTF-8 encoded
CSI (U+009B) while keeping the file itself completely in ASCII.
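As a quick check of the byte values, here is a minimal Python sketch. The helper name decode_octal_escapes is made up for illustration, and it only handles \NNN octal escapes, not the other termcap escapes such as \E or ^X. It suggests that the two octal escapes expand to the bytes 0xC2 0x9B, which is how UTF-8 encodes U+009B.

```python
import re

def decode_octal_escapes(text: str) -> bytes:
    """Expand \\NNN octal escapes (1 to 3 digits) into raw bytes.

    Hypothetical helper for illustration only; everything outside an
    escape is assumed to be plain ASCII.
    """
    out = bytearray()
    pos = 0
    for m in re.finditer(r"\\([0-7]{1,3})", text):
        out += text[pos:m.start()].encode("ascii")  # literal ASCII before the escape
        out.append(int(m.group(1), 8) & 0xFF)       # octal digits -> one byte
        pos = m.end()
    out += text[pos:].encode("ascii")               # trailing literal ASCII
    return bytes(out)

csi_utf8 = decode_octal_escapes(r"\302\233")
print(csi_utf8.hex())                          # c29b
print(csi_utf8 == "\u009b".encode("utf-8"))    # True
```

The entry text stays pure ASCII on disk; only the program that expands the escapes ever sees the two non-ASCII bytes.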

Perhaps some good soul even wants to extend termcap to have two new
special codes for added convenience:

  \c = \233       (CSI in some ISO 4873 conforming character set, e.g. ISO 8859)
  \C = \302\233   (CSI in UTF-8)
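To make the proposal concrete, the following Python sketch rewrites the two suggested codes into the octal escapes that termcap already understands. Everything here is an assumption for illustration: no termcap implementation is known to support \c or \C in this sense, and the names PROPOSED and expand_proposed as well as the sample "cl" entry are invented.

```python
# Hypothetical mapping for the two proposed codes; not part of any real termcap.
PROPOSED = {
    "\\c": "\\233",        # CSI as the single byte 0x9B (e.g. ISO 8859)
    "\\C": "\\302\\233",   # CSI as the UTF-8 byte pair 0xC2 0x9B
}

def expand_proposed(entry: str) -> str:
    """Rewrite \\c and \\C into existing octal escapes, leaving the rest alone."""
    out = []
    i = 0
    while i < len(entry):
        pair = entry[i:i + 2]
        if pair in PROPOSED:
            out.append(PROPOSED[pair])
            i += 2
        elif pair == "\\\\":       # escaped backslash: copy both characters
            out.append(pair)
            i += 2
        else:
            out.append(entry[i])
            i += 1
    return "".join(out)

print(expand_proposed(r"cl=\C2J"))   # cl=\302\2332J
print(expand_proposed(r"cl=\c2J"))   # cl=\2332J
```

Because the output is still plain ASCII with ordinary octal escapes, a preprocessing step like this would not require any change to how the file is stored or mailed.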

I don't think you want to use any non-G0 characters apart from LF in a
termcap file. It's just too much hassle to edit/print/mail these across
systems.

Don't even think about horror scenarios such as character encoding
conversions on termcap entries ... :)

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/