Hi,
At Mon, 29 Jan 2001 20:25:00 +0000 (GMT),
Robert Brady <[EMAIL PROTECTED]> wrote:
> If it has to display just the first or second half of a wide-character,
> U+303F is the right thing to display? Does this apply to Hangul, Kana,
> and Yi as well as Ideographs? What about the wide punctuation?
There are three situations that cause 'half of a wide-character' problems:
(1) printf(79 columns of characters and a doublewidth character);
(2) move(1,1); printf(a doublewidth character);
    move(1,2); printf(any character);
(3) move(1,1); printf(a doublewidth character);
    move(1,1); printf(a singlewidth character);
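To make the three cases concrete, here is a minimal sketch in Python that
builds the output a program would send to the terminal. The names WIDE,
move, case1, case2 and case3 are my own for illustration, the cursor
movement assumes an ANSI/VT100-style terminal (1-based row;column), and
case (1) assumes an 80-column screen with the cursor at the start of a line.

```python
WIDE = "\u3042"  # HIRAGANA LETTER A, used here as a sample doublewidth character


def move(row, col):
    """Return the ANSI/VT100 cursor-position sequence (1-based row and column)."""
    return f"\x1b[{row};{col}H"


def case1(columns=80):
    # (1) Fill all but the last column, then print a doublewidth character.
    return "x" * (columns - 1) + WIDE


def case2():
    # (2) Print a doublewidth character, then overwrite its right half.
    return move(1, 1) + WIDE + move(1, 2) + "x"


def case3():
    # (3) Print a doublewidth character, then overwrite its left half.
    return move(1, 1) + WIDE + move(1, 1) + "x"
```

Writing the result of case2() or case3() to a doublewidth-enabled terminal
(for example with sys.stdout.write) shows how that terminal treats the
remaining half.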
I have been using doublewidth-enabled terminals for many years (mainly
MS-DOS, kterm, and rxvt). In general, they handle these cases badly. At
worst, all following characters are garbled. Though I haven't read
their source code, I think these terminals don't pay attention to
such cases. (MS-DOS breaks the line before a doublewidth character in
case (1) to avoid the 'half-character problem'. I know XTerm already
uses the same algorithm. Excellent. XTerm is better than even Kterm
or Rxvt in this respect.)
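The line-breaking rule for case (1) can be sketched as follows. This is
only an illustration of the idea, not the code of MS-DOS or XTerm; the
function names are mine, and the width test (East Asian Width W or F
counts as two columns) is a simplification that ignores combining and
control characters.

```python
import unicodedata


def char_width(ch):
    """Columns a character occupies: 2 for East Asian Wide/Fullwidth, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def wrap(text, columns=80):
    """Split text into rows, breaking before any character that would not fit.

    A doublewidth character that has only one column left on the current
    row is moved to the next row as a whole, so it is never split.
    """
    rows, row, used = [], "", 0
    for ch in text:
        w = char_width(ch)
        if used + w > columns:
            rows.append(row)
            row, used = "", 0
        row += ch
        used += w
    rows.append(row)
    return rows
```

With 79 singlewidth characters followed by one doublewidth character, the
first row keeps the 79 characters (its last column stays empty) and the
doublewidth character starts the second row.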
The main cause of (2) and (3) is text-based, window-oriented software.
The edge of a window can destroy a doublewidth character. I think window
libraries should be responsible for taking care of this case. (Erasing
a window and restoring the window 'under' it cannot be managed well by
the terminal.) Thus it is the application software's responsibility to
avoid (2) and (3).
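As a rough sketch of what a window library could do for (2) and (3): keep
one cell per screen column, and when a write lands on one half of a
doublewidth character, blank the other half. Everything here (the cell
model, RIGHT_HALF, put_char, render) is hypothetical and not taken from
any existing library, and replacing the orphaned half with a plain space
is my assumption; a library could just as well use another filler.

```python
import unicodedata

BLANK = " "
RIGHT_HALF = None  # marker stored in the second cell of a doublewidth character


def char_width(ch):
    """Columns a character occupies: 2 for East Asian Wide/Fullwidth, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def put_char(row, col, ch):
    """Write ch into row (a list of cells) at 0-based col, repairing orphaned halves."""
    w = char_width(ch)
    if col < 0 or col + w > len(row):
        raise ValueError("character does not fit in the row")
    # Starting on the right half of a wide character: blank its left half.
    if row[col] is RIGHT_HALF:
        row[col - 1] = BLANK
    # Ending on the left half of a wide character: blank its right half.
    last = col + w - 1
    if last + 1 < len(row) and row[last + 1] is RIGHT_HALF:
        row[last + 1] = BLANK
    row[col] = ch
    if w == 2:
        row[col + 1] = RIGHT_HALF


def render(row):
    """Join the cells into the string that should be sent to the terminal."""
    return "".join(cell for cell in row if cell is not RIGHT_HALF)
```

For case (2), writing a doublewidth character at column 0 and then "x" at
column 1 leaves a blank in column 0 and "x" in column 1, so the terminal
is never asked to show half a character.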
I think you are interested in how XTerm should be implemented.
However, I feel the recent XTerm (150-23-k5) is now usable for CJK
people even without 'proper' half-character handling. It will be
big news when this XTerm is released, since CJK people could not
use XTerm at all so far. (In other words, this XTerm will be regarded
as the initial release for CJK people.) I am sure a certain number of
people will want ISO-2022 support (as Kterm and Rxvt have) [1] but I think
it is important that XTerm (150-23-k5) is usable anyway. Thus, I think you
may leave half-character problems for further discussion while we should
release XTerm with locale sensitivity ASAP. (Well, we need to integrate
my patch and the selection patch.)
[1] I posted a message to the Debian JP Developers Mailing List saying
that XTerm will support locale encodings and that we are now developing
it. Then I received a mail asking whether it will support ISO-2022,
from a developer who is involved in an internationalization
project for a text-based web browser.
Theoretically, ISO-2022 can co-exist with almost all other encodings
such as ISO-8859-*, EUC-*, UTF-8, Shift_JIS, and so on, because
these encodings don't include ISO-2022's escape sequences.
Thus, theoretically it is possible for XTerm to support ISO-2022
without damaging 8bit/UTF-8/locale encodings. Though I would
be glad if XTerm supported ISO-2022, I know it needs a large
amount of labor and none of us (including me) is brave enough
to try this work...
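The point about escape sequences can be checked on a small sample. The
sketch below uses Python's standard codecs and a sample string of my own
choosing; it only shows that the ESC byte (0x1B) appears in the
ISO-2022-JP encoding of that text and not in the others, which is an
illustration and not a proof for every possible input.

```python
ESC = b"\x1b"  # every ISO-2022 escape sequence starts with this byte
SAMPLE = "\u65e5\u672c\u8a9e"  # "Japanese" written in kanji, an arbitrary sample


def contains_escape(text, encoding):
    """Return True if the encoded form of text contains the ESC byte."""
    return ESC in text.encode(encoding)


def report(text=SAMPLE):
    """Map each encoding name to whether its output contains ESC."""
    encodings = ("utf-8", "euc_jp", "shift_jis", "iso2022_jp")
    return {name: contains_escape(text, name) for name in encodings}
```

Because printable text in the other encodings does not contain ESC, a
terminal could in principle treat ESC-introduced designations as ISO-2022
and pass everything else to the locale's decoder.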
---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://surfchem0.riken.go.jp/~kubota/
"Introduction to I18N"
http://www.debian.org/doc/manuals/intro-i18n/
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/