Tomohiro KUBOTA wrote on 2002-04-01 13:34 UTC:
> Michael B. Allen <[EMAIL PROTECTED]> wrote:
> > Does wcwidth require __STDC_ISO_10646__?
>
> No, wcwidth() does not require __STDC_ISO_10646__ .
In more detail:
wcwidth() was first proposed in the draft for the 1994 amendment of
ISO C, but removed at the last minute. The X/Open Unix spec then
included it, and it is now part of the 2001 version of POSIX.1. It is
therefore older than __STDC_ISO_10646__ (which was introduced with
ISO C 99) and in no way dependent on it.
wcwidth() is trivial to implement for EUC, because:
- existing display practice for EUC is that single-byte
  encoded characters are displayed in a single charcell,
  double-byte encoded characters in two charcells.
- EUC does not introduce new control characters beyond
  what ASCII already has.
- EUC does not introduce combining characters.
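As a rough illustration of why the EUC case is easy, here is a minimal
sketch of the rule described above. The function name euc_char_width()
is hypothetical, and it works on the encoded bytes rather than on a
wchar_t, because the wchar_t representation in an EUC locale is up to
the implementation. It also ignores the single-shift (SS2/SS3)
sequences that some EUC variants use, so treat it as a simplification
and not as what any particular libc does.

```c
#include <stddef.h>

/* Hypothetical helper, not part of any standard library.
 * Returns the number of character cells for one EUC-encoded
 * character: 0 for NUL, -1 for control characters or malformed
 * input, 1 for single-byte and 2 for double-byte characters. */
int euc_char_width(const unsigned char *s, size_t len)
{
    if (s == NULL || len == 0)
        return -1;
    if (s[0] == 0x00)
        return 0;                    /* NUL occupies no cell */
    if (s[0] < 0x20 || s[0] == 0x7f)
        return -1;                   /* ASCII control characters */
    if (s[0] < 0x80)
        return 1;                    /* single-byte (ASCII) character */
    /* Double-byte character: both bytes in 0xA1..0xFE.
     * Simplification: SS2/SS3 sequences are treated as malformed. */
    if (len >= 2 &&
        s[0] >= 0xa1 && s[0] <= 0xfe &&
        s[1] >= 0xa1 && s[1] <= 0xfe)
        return 2;
    return -1;
}
```

There are no combining characters to handle, so a lookup of the lead
byte is all that is needed.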
wcwidth() for Unicode is a *far* trickier story. Three years ago there
was absolutely no implementation practice, so to get people started
on understanding the issues involved, I wrote
http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
as an example of what possible wcwidth() semantics could look like
in a Unicode-based locale. This implementation assumes wchar_t =
ISO 10646, something that is guaranteed for all locales on a system
that defines __STDC_ISO_10646__. That is the only relationship between
the two.
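To make that relationship concrete, here is a heavily reduced sketch.
It is not the code from wcwidth.c: the names sketch_ucs_width() and
sketch_wcwidth() are made up for this mail, and the ranges are only a
small illustrative subset of what a real table has to cover. The point
is that the width logic works on ISO 10646 code points, and the
wchar_t wrapper is only meaningful when __STDC_ISO_10646__ is defined.

```c
#include <wchar.h>

static int in_range(unsigned long c, unsigned long lo, unsigned long hi)
{
    return c >= lo && c <= hi;
}

/* Hypothetical helper: cell width of one ISO 10646 code point.
 * Illustrative subset only; real tables are far larger. */
int sketch_ucs_width(unsigned long ucs)
{
    if (ucs == 0)
        return 0;
    /* C0 and C1 control characters have no defined width */
    if (ucs < 0x20 || in_range(ucs, 0x7f, 0x9f))
        return -1;
    /* Combining Diacritical Marks occupy no cell of their own */
    if (in_range(ucs, 0x0300, 0x036f))
        return 0;
    /* A few East Asian wide ranges */
    if (in_range(ucs, 0x1100, 0x115f) ||  /* Hangul Jamo initials */
        in_range(ucs, 0x4e00, 0x9fff) ||  /* CJK Unified Ideographs */
        in_range(ucs, 0xac00, 0xd7a3) ||  /* Hangul Syllables */
        in_range(ucs, 0xff01, 0xff60))    /* Fullwidth forms */
        return 2;
    return 1;
}

/* Only valid if wchar_t values are ISO 10646 code points. */
int sketch_wcwidth(wchar_t wc)
{
#ifdef __STDC_ISO_10646__
    return sketch_ucs_width((unsigned long)wc);
#else
    (void)wc;
    return -1;  /* assumption: refuse to guess the wchar_t encoding */
#endif
}
```

Unlike the EUC case, this needs explicit handling of combining
characters and of additional control characters, which is where most
of the difficulty lies.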
That example implementation was used to define the bi-width semantics
of XFree86 xterm, and it also served as a model for the
(configuration-file-dependent) way in which glibc 2.2 implements
wcwidth() in most locales (the non-CJK ones).
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/