Hi,
At Sat, 8 Sep 2001 20:54:37 +0100 (BST),
Markus Kuhn <[EMAIL PROTECTED]> wrote:
> The following 15 characters went from neutral to ambiguous,
> probably someone discovered them in some CJK character set
> that is displayed there double-width:
I imagine so, though these characters are not related to
my report at http://www.debian.or.jp/~kubota/unicode-symbols.html
However, there is another problem: the Unicode Consortium
has abolished all of its East Asian cross mapping tables. I once
pointed out that there are many cross mapping tables between Japanese
Shift_JIS / JIS X 0208 and Unicode, and that this causes
a problem: an identical document in JIS X 0208 can become
different when converted into Unicode in different environments.
Now we have lost even these mapping tables. Thus, the situation
I pointed out has become even worse, because anyone can now implement
an arbitrary mapping table, as there is no standard one.
I will ask the Unicode Consortium to supply one authoritative,
reliable reference mapping table between Unicode and JIS X 0208.
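As a rough illustration of the divergence (a sketch only, not taken from any of the mapping tables discussed here), the following decodes the same Shift_JIS bytes with two codecs that ship with Python, "shift_jis" and "cp932". The sample bytes and the helper names code_points() and mapping_differences() are my own assumptions, chosen because the symbols at these positions are commonly reported to map differently.

```python
# Hypothetical sample: three JIS X 0208 symbols, encoded in Shift_JIS.
SAMPLE = b"\x81\x60\x81\x61\x81\x7c"


def code_points(data, codec):
    """Decode data with the given codec and list the resulting code points."""
    return ["U+%04X" % ord(ch) for ch in data.decode(codec)]


def mapping_differences(data, codec_a="shift_jis", codec_b="cp932"):
    """Return (a, b) character pairs where the two codecs disagree.

    Assumes both codecs decode data into the same number of characters,
    which holds for the double-byte symbols in SAMPLE.
    """
    text_a = data.decode(codec_a)
    text_b = data.decode(codec_b)
    return [(a, b) for a, b in zip(text_a, text_b) if a != b]


if __name__ == "__main__":
    for codec in ("shift_jis", "cp932"):
        print(codec, code_points(SAMPLE, codec))
    print("differing characters:", len(mapping_differences(SAMPLE)))
```

The byte sequence is identical in both cases; only the choice of mapping table changes the Unicode text that comes out.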
This problem also affects EastAsianWidth. We have now lost
any basis for discussing which Unicode characters are double-width in
East Asian contexts, except for characters used only in CJK (such as
Han ideographs, Hiragana, Katakana, Hangul, and CJK-only
punctuation).
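To make the width question concrete, here is a minimal sketch of how the East Asian Width property can drive a column-width decision. It is not the wcwidth() or wcwidth_cjk() implementation quoted below; char_width(), string_width() and the cjk_context flag are hypothetical names of mine, and Python's unicodedata module reflects whatever Unicode version the interpreter was built with, not 3.1.1.

```python
import unicodedata


def char_width(ch, cjk_context=False):
    """Return an approximate column width for one character.

    Sketch only: control characters and other special cases that a
    real wcwidth() has to handle are ignored here.
    """
    if unicodedata.combining(ch):
        return 0  # combining marks take no column of their own
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2  # Wide and Fullwidth are always double-width
    if eaw == "A" and cjk_context:
        return 2  # Ambiguous: double-width only in a CJK context
    return 1      # Neutral, Narrow, Halfwidth, or Ambiguous elsewhere


def string_width(text, cjk_context=False):
    """Sum the per-character widths of text."""
    return sum(char_width(ch, cjk_context) for ch in text)


if __name__ == "__main__":
    sample = "a\u3042\u00a7"  # Latin a, Hiragana A, SECTION SIGN
    print(string_width(sample), string_width(sample, cjk_context=True))
```

The only place the two modes disagree is the Ambiguous class, which is exactly where a reference mapping to the legacy CJK character sets would be needed to decide.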
> The normal wcwidth() did not change as a result of Unicode 3.1.1,
> because both neutral and ambiguous characters result there in
> the same width: 1
>
> I just updated the still somewhat experimental wcwidth_cjk(),
> in case people found that so far actually useful. It contains
> a new table of EastAsianWidth Ambiguous characters.
>
> http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
Thanks.
---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N" http://www.debian.org/doc/manuals/intro-i18n/
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/