Tomohiro KUBOTA writes:

> All existing softwares which can deal with doublewidth characters
> (including terminals and applications) assume that two backspace
> characters are needed to erase one doublewidth character.  This is a
> de-facto standard in CJK world, though it is not documented.

It is actually well documented, in the X/Open Curses specification:
http://www.opengroup.org/onlinepubs/007908799/xcurses/intov.html#tag_001_004_003

     "Unless the cursor was already in column 0, <backspace> moves the
     cursor one column toward the start of the current line and any
     characters after the <backspace> are added or inserted starting
     there."
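
Since <backspace> moves by one column, a character that occupies two
columns needs two of them. A minimal sketch of the resulting erase
sequence follows; the helper name erase_sequence is made up for
illustration, and it assumes the usual "backspace, space, backspace"
echo convention that ttys use to blank a cell.

```c
#include <stddef.h>

/* Hypothetical helper: build the byte sequence that visually erases one
 * character occupying `columns` terminal columns.
 * Returns the number of bytes written (not counting the NUL terminator),
 * or 0 if `columns` is not positive or the buffer is too small. */
size_t erase_sequence(int columns, char *buf, size_t size)
{
    size_t i = 0;
    int k;

    if (columns < 1)
        return 0;
    if (size < 3 * (size_t)columns + 1)
        return 0;

    for (k = 0; k < columns; k++)   /* move back over the character */
        buf[i++] = '\b';
    for (k = 0; k < columns; k++)   /* overwrite its cells with blanks */
        buf[i++] = ' ';
    for (k = 0; k < columns; k++)   /* move back again to the start */
        buf[i++] = '\b';
    buf[i] = '\0';
    return i;
}
```

For a single-width character this yields "\b \b"; for a doublewidth
character it yields two backspaces, two spaces, two backspaces.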

> Though I am not familiar with kernel programming, can it use the same
> design as the recent XTerm?  I.e, use Unicode as internal encoding and
> use iconv() and wcwidth() to support all other encodings such as
> ISO-8859-*, EUC-*, ISO-2022-*, and so on.

This is probably overkill. It's easier to have a separate wcwidth-style
function for each encoding. For ISO-8859-* it is trivial, for EUC-* it's
easy, for UTF-8 it needs a 2 KB table, and for GB18030 the table is also
acceptably small (20 KB).
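
To illustrate how small the ISO-8859-* and EUC cases are, here is a rough
sketch. The function names latin1_width and eucjp_width are invented for
this example, and the EUC-JP version works on bytes rather than wide
characters, which is an assumption about what a tty driver would want.
It relies only on the EUC-JP layout: SS2 (0x8E) introduces a half-width
katakana, SS3 (0x8F) a three-byte JIS X 0212 character, and a lead byte
in 0xA1..0xFE a two-byte JIS X 0208 character.

```c
#include <stddef.h>

/* Width of one ISO-8859-* byte: 0 for NUL, -1 for C0/C1 controls and
 * DEL, 1 for everything else. */
int latin1_width(unsigned char c)
{
    if (c == 0)
        return 0;
    if (c < 0x20 || (c >= 0x7F && c <= 0x9F))
        return -1;
    return 1;
}

static int is_euc_trail(unsigned char c)
{
    return c >= 0xA1 && c <= 0xFE;
}

/* Width in columns of the EUC-JP character starting at s[0], looking at
 * no more than n bytes. On success *len is set to its length in bytes.
 * Returns -1 (with *len == 0) for invalid or truncated input. */
int eucjp_width(const unsigned char *s, size_t n, size_t *len)
{
    *len = 0;
    if (n == 0)
        return -1;

    if (s[0] < 0x80) {                       /* ASCII range */
        *len = 1;
        return latin1_width(s[0]);
    }
    if (s[0] == 0x8E) {                      /* SS2: half-width katakana */
        if (n >= 2 && s[1] >= 0xA1 && s[1] <= 0xDF) {
            *len = 2;
            return 1;
        }
        return -1;
    }
    if (s[0] == 0x8F) {                      /* SS3: JIS X 0212 */
        if (n >= 3 && is_euc_trail(s[1]) && is_euc_trail(s[2])) {
            *len = 3;
            return 2;
        }
        return -1;
    }
    if (is_euc_trail(s[0])) {                /* JIS X 0208 */
        if (n >= 2 && is_euc_trail(s[1])) {
            *len = 2;
            return 2;
        }
        return -1;
    }
    return -1;                               /* other C1 bytes */
}
```

Neither function needs a lookup table; only UTF-8 and GB18030 do.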

> Note that Linux is likely to run only with GNU libc and we can
> expect these functions are always available.

libc functions are not available in kernel space, so we have a problem
here. We could write a kernel module for each supported encoding and
have the appropriate module loaded dynamically when a tty is actually
put into that encoding.
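
The module-per-encoding idea could be sketched as a table of width
functions looked up by encoding name. This is plain user-space C meant
only to show the shape of such an interface; struct tty_encoding,
register_encoding and find_encoding are hypothetical names, not existing
kernel APIs, and real modules would register themselves at load time.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ENCODINGS 8

/* Hypothetical per-encoding operations that a module would provide. */
struct tty_encoding {
    const char *name;
    /* Returns the column width of the character at s, sets *len to its
     * byte length, or returns -1 on invalid input. */
    int (*width)(const unsigned char *s, size_t n, size_t *len);
};

static const struct tty_encoding *registry[MAX_ENCODINGS];
static size_t registry_count;

/* Called by an encoding "module" when it is loaded. Returns 0 on
 * success, -1 if the table is full. */
int register_encoding(const struct tty_encoding *enc)
{
    if (registry_count == MAX_ENCODINGS)
        return -1;
    registry[registry_count++] = enc;
    return 0;
}

/* Called when a tty is switched to the named encoding. A NULL result is
 * the point where the matching module would be loaded on demand. */
const struct tty_encoding *find_encoding(const char *name)
{
    size_t i;

    for (i = 0; i < registry_count; i++)
        if (strcmp(registry[i]->name, name) == 0)
            return registry[i];
    return NULL;
}

/* Example "module": a single-byte encoding such as ISO-8859-1. */
static int single_byte_width(const unsigned char *s, size_t n, size_t *len)
{
    *len = 0;
    if (n == 0)
        return -1;
    *len = 1;
    if (s[0] == 0)
        return 0;
    if (s[0] < 0x20 || (s[0] >= 0x7F && s[0] <= 0x9F))
        return -1;
    return 1;
}

const struct tty_encoding iso8859_1_encoding = {
    "ISO-8859-1", single_byte_width
};
```

The tty code would then call only enc->width() and never need to know
which encoding is active.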

> Also note that we will need a utility to set tty's locale since LANG
> variable cannot be used for kernel.

The program which creates the tty (xterm or fbgetty) can put the tty
into the proper encoding.

Bruno
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/
