Re: wcwidth and bidi

Markus Kuhn Wed, 27 Sep 2000 03:58:34 -0700
Roozbeh Pournader wrote on 2000-09-26 21:15 UTC:
> On Mon, 25 Sep 2000, Markus Kuhn wrote:
> > The zero-width spaces/joiner are only required for ligature
> > output. This is for the foreseeable future probably outside the scope of
> > VT100-like terminal emulators, and therefore also outside the scope of
> > wcwidth().
> 
> You're ignoring Arabic and Hebrew?

At the moment, xterm supports Arabic and Hebrew output only in the sense
that the application sending data to xterm has to take care of both
bidi processing and ligature substitution. Please read

  http://www.cl.cam.ac.uk/~mgk25/unicode.html#xterm

wcwidth() is only there for predicting how many character cells the
cursor will have moved after the provided character has been sent to
the terminal. At the moment, this is +1 for all spacing characters,
including the Hebrew and Arabic ones, because all these characters move
the cursor one cell to the right.
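
To illustrate the cursor-prediction role described above, here is a
minimal Python sketch. It is not the real wcwidth(): the names
approx_cell_width and cursor_advance are made up for this example, the
width rule is a rough approximation based on Unicode character
properties, and control characters (for which wcwidth() returns -1) are
assumed not to occur in the input.

```python
import unicodedata

def approx_cell_width(ch: str) -> int:
    """Rough stand-in for wcwidth(): cells the cursor advances for one
    printable character. Hypothetical helper, not a library function."""
    # Non-spacing marks, enclosing marks and format characters
    # (e.g. zero-width joiners) do not move the cursor.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian wide and fullwidth characters occupy two cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    # Everything else, including Hebrew and Arabic letters, moves the
    # cursor one cell to the right.
    return 1

def cursor_advance(text: str) -> int:
    """Total number of cells the cursor moves after sending text."""
    return sum(approx_cell_width(ch) for ch in text)
```

Under these assumptions a Hebrew or Arabic letter counts as one cell,
exactly like a Latin letter, which is all an application needs in order
to keep track of the cursor on a left-to-right terminal.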

It is completely unclear to me at the moment whether it makes any sense
to add any bidi functionality to xterm. There are two very different
processing models potentially available: ISO 6429 = ECMA-48 and the
Unicode bidi algorithm. See also ECMA Technical Report TR/53. Neither
model is widely used in terminals in practice at the moment.

Unless we have a very clear and easy-to-understand algorithm that
specifies exactly how the cursor moves for all ISO 6429 ESC sequences in
the presence of bidi processing in xterm, there will only be huge chaos
and confusion in the synchronization between applications (especially
editors and screen management libraries such as curses) and the terminal
emulator. So instead of an immensely complicated bidi protocol between
the terminal and the application, I'd prefer the terminal to be
left-to-right, and the application to do all the bidi trickery by
reversing RTL substrings and taking full control of the cursor position.
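
As a rough illustration of "reversing RTL substrings" on the
application side, here is a simplified Python sketch. It is
deliberately not the Unicode bidi algorithm: the function names are
hypothetical, a run is assumed to consist only of characters with bidi
class R or AL plus the spaces between them, and digits, combining
marks, nested directions and ligature substitution are all ignored.

```python
import unicodedata

def is_rtl(ch: str) -> bool:
    """True for strongly right-to-left characters (Hebrew, Arabic)."""
    return unicodedata.bidirectional(ch) in ("R", "AL")

def to_visual_ltr(logical: str) -> str:
    """Reverse each RTL run so that a plain left-to-right terminal
    shows it in visual order. Hypothetical helper for illustration."""
    out = []
    i, n = 0, len(logical)
    while i < n:
        if not is_rtl(logical[i]):
            # LTR and neutral characters are passed through unchanged.
            out.append(logical[i])
            i += 1
            continue
        # Extend the run over RTL characters and the spaces between
        # them; 'end' points just past the last RTL character seen.
        j, end = i, i
        while j < n and (is_rtl(logical[j]) or logical[j] == " "):
            if is_rtl(logical[j]):
                end = j + 1
            j += 1
        out.append(logical[i:end][::-1])
        i = end
    return "".join(out)
```

The application would send the result to the terminal as ordinary
left-to-right text, so every character still advances the cursor to the
right and the application keeps full control of the cursor position.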

ISO 6429 does provide a full specification for in-terminal bidi
processing, but it is my understanding that the ISO 6429 algorithm is
difficult to merge with the Unicode algorithm and that people have not
found ISO 6429 bidi on its own that useful so far (does anyone know of
existing implementations?). Bidi processing in the application and simple
left-to-right cursor motion in the terminal emulator on the other hand
seem to be widely used in Hebrew VT100 compatible systems, and xterm
does already fully support this easy-to-understand mode of usage. The
Unicode bidi algorithm operates in terms of reformatting whole
paragraphs, while what we need for xterm is an algorithm that operates
in terms of what the cursor does after the next character has been
displayed.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/