Hi,

At Sat, 8 Sep 2001 20:54:37 +0100 (BST),
Markus Kuhn <[EMAIL PROTECTED]> wrote:

> The following 15 characters went from neutral to ambiguous,
> probably someone discovered them in some CJK character set
> that is displayed there double-width:

I imagine so, though these characters are not related to
my report http://www.debian.or.jp/~kubota/unicode-symbols.html .

However, there is another problem: the Unicode Consortium
has abolished all of its East Asian cross-mapping tables.  I once
pointed out that there are many different cross-mapping tables for
Japanese Shift_JIS and JIS X 0208 <-> Unicode, and that this causes
a problem: an identical document in JIS X 0208 can become
different when converted into Unicode in different environments.
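
To illustrate the kind of divergence I mean, here is a small Python
sketch.  It assumes that Python's built-in "shift_jis" and "cp932"
codecs are acceptable stand-ins for two of the competing tables; they
are not the tables the Unicode Consortium used to publish.  The same
two bytes are decoded with each codec and the resulting code points
are compared.

```python
# One Shift_JIS byte sequence, decoded with two widely deployed tables.
# 0x81 0x60 is the JIS X 0208 "wave dash" (row 1, cell 33).
raw = b"\x81\x60"


def codepoints(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Assumption: these two Python codecs stand in for two competing tables.
jis_style = raw.decode("shift_jis")  # table following JIS X 0208 naming
ms_style = raw.decode("cp932")       # Microsoft's Shift_JIS variant

print("shift_jis:", codepoints(jis_style))
print("cp932:    ", codepoints(ms_style))
print("identical after conversion?", jis_style == ms_style)
```

With current Python versions the first codec should give WAVE DASH
and the second FULLWIDTH TILDE, so a round trip through the "wrong"
table does not return the original document.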

Now we have lost these mapping tables.  Thus, the situation
I pointed out has become even worse: since there is no longer any
standard, anyone can implement arbitrary mapping tables.
I will request that the Unicode Consortium supply one authoritative,
reliable reference mapping table between Unicode and JIS X 0208.

This problem also affects EastAsianWidth.  We have now lost
a way to discuss which Unicode characters are double-width in
East Asian contexts, except for characters used only in CJK (such as
Han ideographs, Hiragana, Katakana, Hangul, and CJK-only
punctuation).
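
As a rough sketch of why the ambiguous class matters, the following
Python snippet looks up the East Asian Width property with the
standard unicodedata module.  The function name column_width and its
cjk_context flag are hypothetical, and the sketch deliberately ignores
combining and control characters, so it is not a replacement for a
real wcwidth().

```python
import unicodedata


def column_width(ch, cjk_context=False):
    """Guess the display width of one character from its EAW class.

    Hypothetical helper: combining and control characters are ignored.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        # Wide and Fullwidth characters occupy two columns everywhere.
        return 2
    if eaw == "A" and cjk_context:
        # Ambiguous characters are double-width only in a CJK context.
        return 2
    return 1


for ch in ("a", "\u3042", "\u00b1"):
    print(f"U+{ord(ch):04X}",
          unicodedata.east_asian_width(ch),
          column_width(ch),
          column_width(ch, cjk_context=True))
```

Only the ambiguous characters change width between the two columns
of output, and deciding which characters belong in that class is
exactly what needs an agreed mapping to the legacy CJK character sets.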


> The normal wcwidth() did not change as a result of Unicode 3.1.1,
> because  both neutral and ambiguous characters result there in
> the same width: 1
> 
> I just updated the still somewhat experimental wcwidth_cjk(),
> in case people found that so far actually useful. It contains
> a new table of EastAsianWidth Ambiguous characters.
> 
> http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

Thanks.

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
