Tomohiro KUBOTA writes:
> All existing softwares which can deal with doublewidth characters
> (including terminals and applications) assume that two backspace
> characters are needed to erase one doublewidth character. This is a
> de-facto standard in CJK world, though it is not documented.
It is actually very well documented, in the X/Open Curses specification:
http://www.opengroup.org/onlinepubs/007908799/xcurses/intov.html#tag_001_004_003
"Unless the cursor was already in column 0, <backspace> moves the
cursor one column toward the start of the current line and any
characters after the <backspace> are added or inserted starting
there."
A <backspace> moves the cursor by one column, and a doublewidth
character occupies two columns, so two of them are needed to get back
to its start.
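
As a rough illustration of that rule, here is a minimal sketch of how
an application might build the byte sequence that erases the character
just before the cursor. The name erase_sequence() is hypothetical, and
the sketch assumes the caller already knows the column width of the
character (1 or 2) and that the terminal treats a space as overwriting
a single column.

```c
#include <stddef.h>

/*
 * Hypothetical helper: fill buf with the bytes that erase a character
 * occupying `width` columns: move back over it, overwrite it with
 * spaces, then move back again.  Returns the number of bytes written
 * (excluding the terminating NUL), or 0 if width is invalid or buf is
 * too small.
 */
static size_t erase_sequence(int width, char *buf, size_t bufsize)
{
    size_t n = 0;
    int i;

    if (width < 1 || (size_t)width * 3 + 1 > bufsize)
        return 0;
    for (i = 0; i < width; i++)
        buf[n++] = '\b';        /* one <backspace> per column */
    for (i = 0; i < width; i++)
        buf[n++] = ' ';         /* blank out each column */
    for (i = 0; i < width; i++)
        buf[n++] = '\b';        /* return to the start of the erased cell */
    buf[n] = '\0';
    return n;
}
```

For a doublewidth character this yields two backspaces, two spaces and
two backspaces, which matches the behaviour described in the quoted
message.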
> Though I am not familiar with kernel programming, can it use the same
> design as the recent XTerm? I.e, use Unicode as internal encoding and
> use iconv() and wcwidth() to support all other encodings such as
> ISO-8859-*, EUC-*, ISO-2022-*, and so on.
This is probably overkill. It is easier to have a separate wcwidth
function for each encoding. For ISO-8859-* it is trivial, for EUC-* it
is easy, for UTF-8 it needs a 2 KB table, and for GB18030 the table is
also acceptably small (20 KB).
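
To give an idea of how small the trivial and easy cases are, here is a
sketch of two such functions. The names latin1_width() and
eucjp_width() are made up for illustration. The first assumes an
ISO-8859 part without combining characters (such as ISO-8859-1); the
second assumes the usual EUC-JP layout: two-byte JIS X 0208 characters
are doublewidth, SS2 (0x8E) introduces a single-width half-width
katakana, and SS3 (0x8F) introduces a three-byte JIS X 0212 character.

```c
#include <stddef.h>

/* ISO-8859-1 style: one byte per character, one column each.
 * Returns -1 for C0/C1 control characters and DEL. */
static int latin1_width(unsigned char c)
{
    if (c < 0x20 || (c >= 0x7f && c < 0xa0))
        return -1;
    return 1;
}

/* EUC-JP: returns the column width of the character starting at s
 * and stores its byte length in *len.  Returns -1 for control
 * characters and for invalid or truncated sequences; *len is only
 * meaningful when the return value is not negative. */
static int eucjp_width(const unsigned char *s, size_t n, size_t *len)
{
    unsigned char c;

    if (n == 0)
        return -1;
    c = s[0];
    if (c < 0x80) {                     /* ASCII */
        *len = 1;
        return (c < 0x20 || c == 0x7f) ? -1 : 1;
    }
    if (c == 0x8e) {                    /* SS2: half-width katakana */
        if (n < 2 || s[1] < 0xa1 || s[1] > 0xdf)
            return -1;
        *len = 2;
        return 1;
    }
    if (c == 0x8f) {                    /* SS3: JIS X 0212, three bytes */
        if (n < 3 || s[1] < 0xa1 || s[1] > 0xfe ||
            s[2] < 0xa1 || s[2] > 0xfe)
            return -1;
        *len = 3;
        return 2;
    }
    if (c >= 0xa1 && c <= 0xfe) {       /* JIS X 0208, two bytes */
        if (n < 2 || s[1] < 0xa1 || s[1] > 0xfe)
            return -1;
        *len = 2;
        return 2;
    }
    return -1;
}
```

Neither function needs a table or anything from libc.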
> Note that Linux is likely to run only with GNU libc and we can
> expect these functions are always available.
libc functions are not available in kernel space. We have a problem
here. We could make a kernel module for every possible encoding, and
have the relevant module loaded dynamically when a tty is actually
put into that encoding.
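
The shape of such a scheme could look roughly like the following
user-space sketch: each encoding supplies a small table entry with a
width function, and the tty code looks the entry up by name when the
encoding is set. All names here (struct width_ops,
register_width_ops(), find_width_ops()) are hypothetical and are not
existing kernel interfaces; in the kernel, a failed lookup would be
the point at which the module for that encoding gets loaded.

```c
#include <stddef.h>
#include <string.h>

/* What each per-encoding module would provide (hypothetical). */
struct width_ops {
    const char *name;
    /* Column width of the character at s; byte length goes to *len. */
    int (*width)(const unsigned char *s, size_t n, size_t *len);
};

#define MAX_ENCODINGS 16
static const struct width_ops *registry[MAX_ENCODINGS];
static size_t registry_count;

/* Called by a module when it is loaded.  Returns 0 on success,
 * -1 if the registry is full. */
static int register_width_ops(const struct width_ops *ops)
{
    if (registry_count == MAX_ENCODINGS)
        return -1;
    registry[registry_count++] = ops;
    return 0;
}

/* Called when a tty is put into the encoding `name`.  Returns NULL
 * if no module for that encoding has registered itself yet. */
static const struct width_ops *find_width_ops(const char *name)
{
    size_t i;

    for (i = 0; i < registry_count; i++)
        if (strcmp(registry[i]->name, name) == 0)
            return registry[i];
    return NULL;
}

/* The simplest possible module: printable US-ASCII only. */
static int ascii_width(const unsigned char *s, size_t n, size_t *len)
{
    if (n == 0 || s[0] < 0x20 || s[0] >= 0x7f)
        return -1;
    *len = 1;
    return 1;
}

static const struct width_ops ascii_ops = { "US-ASCII", ascii_width };
```

The tty would then keep a pointer to the selected width_ops and call
its width function for every character it has to erase.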
> Also note that we will need a utility to set tty's locale since LANG
> variable cannot be used for kernel.
The program which creates the tty (xterm or fbgetty) can put the tty
into the proper encoding.
Bruno
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/