Kai Henningsen wrote on 2001-03-14 22:43 UTC:
> * There should be a certain octet sequence that will *always* get the  
> terminal into a state where one can use the query-and-modify-state  
> functions, no matter what the terminal is currently doing. (Think of the  
> situation where a program is kill -9ed and you want to restore the display,  
> for example. Even if you just were in the middle of a multi-megabyte font  
> download, you still want to be able to reset right now.) Implementing this  
> right probably means you need to keep a state machine running directly at  
> the receive-a-character point, and you also need to keep this in mind when  
> designing escape mechanisms for binary data such as for font downloads.

ISO 6429 already has a mechanism for that: CANcel (0x18) resets both the
ESC sequence and the UTF-8 parsers into their initial state. And ESC
itself as well as all other characters in the range 0x00-0x2F do that,
too, so you do not have to precede an ESC sequence with CAN and no ESC
sequence will be parsed beyond the end of the line. It is all very
robust and well designed, if only authors of terminal emulators would
read it.
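
As a rough illustration of the reset behaviour described above, here is a
minimal sketch of such a receive-a-byte state machine. The class name
SeqParser and its fields are hypothetical, the abort range 0x00-0x2F is
taken from this message rather than checked against the standard, and the
UTF-8 decoder state is left out for brevity.

```python
ESC, CAN = 0x1B, 0x18

class SeqParser:
    """Toy parser for sequences of the form ESC [ <0x30-0x3F>* <letter>."""

    def __init__(self):
        self.state = "ground"
        self.params = bytearray()
        self.completed = []  # (parameter bytes, final letter) tuples

    def feed(self, data: bytes) -> None:
        for b in data:
            if b == ESC:
                # ESC always starts a fresh sequence, whatever came before.
                self.state, self.params = "esc", bytearray()
            elif b <= 0x2F:
                # CAN and (per the message) any other byte in 0x00-0x2F
                # abandon the sequence in progress.
                self.state, self.params = "ground", bytearray()
            elif self.state == "esc":
                self.state = "csi" if b == ord("[") else "ground"
            elif self.state == "csi":
                if 0x30 <= b <= 0x3F:
                    self.params.append(b)
                else:
                    # An ASCII letter completes the sequence; anything
                    # else is dropped. Either way we return to ground.
                    if 0x41 <= b <= 0x5A or 0x61 <= b <= 0x7A:
                        self.completed.append((bytes(self.params), chr(b)))
                    self.state, self.params = "ground", bytearray()
            # In the ground state a real terminal would print the byte;
            # this sketch ignores it.
```

Feeding b"\x1b[12;3" followed by CAN and then b"\x1b[1m" leaves only the
second sequence in `completed`; the interrupted one is discarded without
any special recovery step.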

ISO 6429 defines exactly the state machines that you are asking for. ISO
6429 forbids "escape mechanisms for binary data". An ISO 6429 ESC sequence
(including all permitted private extensions) always starts with ESC [,
continues with an arbitrary sequence of characters from the 0x3X range
('0' .. '?') and ends with a letter. You can't transfer more than 4 bits
of payload data per character inside an ESC sequence.
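
The 4-bit limit follows from the arithmetic: the range '0' .. '?' is
0x30-0x3F, which is 16 values per character. The following sketch makes
that concrete by spreading each octet over two such characters. The
function names are made up for illustration, and this is not an encoding
defined by ISO 6429 or used by any terminal that I know of.

```python
def to_param_bytes(data: bytes) -> bytes:
    """Spread each octet over two bytes in 0x30-0x3F (high nibble first)."""
    out = bytearray()
    for b in data:
        out.append(0x30 | (b >> 4))
        out.append(0x30 | (b & 0x0F))
    return bytes(out)

def from_param_bytes(params: bytes) -> bytes:
    """Inverse of to_param_bytes; rejects anything outside 0x30-0x3F."""
    if len(params) % 2 != 0:
        raise ValueError("odd number of parameter bytes")
    if any(not 0x30 <= p <= 0x3F for p in params):
        raise ValueError("byte outside 0x30-0x3F")
    return bytes(
        ((params[i] & 0x0F) << 4) | (params[i + 1] & 0x0F)
        for i in range(0, len(params), 2)
    )
```

Any payload carried this way doubles in size, and none of its bytes can
be mistaken for ESC, CAN or a final letter.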

If people invented a binary font upload ESC sequence, then that is

  a) simply bad engineering (for the reasons you indicated)
  b) based on ignorance about ISO 6429/ECMA-48/etc.

Such practice should be stopped in VT100 emulators and the relevant code
should be removed quickly.

One of the projects on my todo list is a cleanup of the Linux console
driver. That will include the removal of all ESC sequences that violate
ISO 6429. That will also include the removal of ISO 2022 G0 switching
capabilities. Having in G0 anything but ASCII is far more dangerous than
useful. Such a capability should only be enabled by configuration options
that are off by default. If people want to have DEC line drawing
characters, they can map them to G1 or (far better!) they can use UTF-8
surrounded by ESC % G and ESC % @. Same for xterm.
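
To make the UTF-8 alternative concrete, here is a small sketch that wraps
text in the ISO 2022 sequences ESC % G (enter UTF-8) and ESC % @ (return
to ISO 2022). The helper name utf8_wrapped is hypothetical, and whether a
given terminal honours these sequences is an assumption.

```python
UTF8_ON = b"\x1b%G"   # ESC % G: switch the terminal to UTF-8
UTF8_OFF = b"\x1b%@"  # ESC % @: switch back to ISO 2022 mode

def utf8_wrapped(text: str) -> bytes:
    """Return text as UTF-8 bytes, bracketed by the two mode switches."""
    return UTF8_ON + text.encode("utf-8") + UTF8_OFF

# A small box built from Unicode box-drawing characters instead of the
# DEC line drawing set.
box = "\u250c\u2500\u2500\u2510\n\u2514\u2500\u2500\u2518\n"
payload = utf8_wrapped(box)
```

Writing `payload` to the terminal, for example with
sys.stdout.buffer.write(payload), draws the box without touching G0.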

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/
