Harald Alvestrand wrote on 2000-11-15 17:59 UTC:
> Is wcwidth() actually owned by some standard organization?
I am not sure what you mean by "owned".
wcwidth() is defined in various places, for example by the X/Open SUS in
http://www.opengroup.org/onlinepubs/007908799/xsh/wcwidth.html
It seems to be commonly considered a locale-dependent function.
That makes sense, because EUC and similar encodings associate a width with
every coding position (namely: wcswidth(s) == strlen(s)).
Beyond that EUC tradition, I see no reason why wcwidth() should depend
in any way on the culture, language, or region. Since it is not a
cultural convention, I don't think ISO DTR 14652 is the right place to
put this. I see lots of excellent reasons for fixing one wcwidth() definition
for all UTF-8 locales, as this will tremendously simplify fixed-width terminal
emulator interoperability.
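A minimal sketch of what such a locale-independent width function could look like, assuming the width is derived only from Unicode character properties and never from the locale. The names sketch_wcwidth() and sketch_wcswidth() are hypothetical, and the rules below are a simplification for illustration, not the SUS definition and not a proposed standard table.

```python
import unicodedata

def sketch_wcwidth(ch):
    """Hypothetical locale-independent width: -1, 0, 1 or 2 columns."""
    if ch == "\0":
        return 0
    cat = unicodedata.category(ch)
    if cat == "Cc":                    # C0/C1 control characters
        return -1
    if cat in ("Mn", "Me", "Cf"):      # non-spacing, enclosing, format characters
        return 0
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2                       # East Asian Wide or Fullwidth
    return 1

def sketch_wcswidth(text):
    """Sum of the character widths, or -1 if any character is non-printable."""
    widths = [sketch_wcwidth(ch) for ch in text]
    return -1 if -1 in widths else sum(widths)

print(sketch_wcswidth("e\u0301"), sketch_wcswidth("日本語"))
```

Because the result depends only on the code point, every UTF-8 locale and every terminal emulator using the same table would agree on the cursor position.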
Suitable standards bodies for such a specification of a recommended wcwidth()
for all UTF-8 communication with terminal emulators are:
- ECMA/ISO, because they own ECMA-48/ISO-6429, the mother of all terminal
emulation standards
- Unicode, because they have already done part of the work in their
EastAsianWidth tables
- IETF, because they own telnet and ssh, the lower layer transmission
protocols used today for almost all terminal access
There are still a few open wcwidth() questions, which probably
deserve a bit more practical experience:
- handling Arabic ligatures as double-width (probably yes)
- handling non-spacing characters like combining characters (probably yes)
- handling of Hangul Jamo (probably best not to use Jamo at all in terminals)
- handling of Indic scripts (no clue so far on my side; talking to
local computer-literate Sanskrit gurus is still on my todo list)
- handling of the math characters (white brackets, etc.) that were
accidentally placed into the CJK section (probably should be single width)
- should the em dash be double width? (still not sure, probably not,
would just open a big can of worms)
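
To see why some of these cases are open, it can help to look at what the Unicode data itself says about a few representative characters. The snippet below is only a probe: the choice of sample code points and the helper name describe() are assumptions made for illustration, and the values printed come from whatever Unicode version the local Python installation ships.

```python
import unicodedata

# One sample code point for each of four of the open cases above.
samples = [0x0301, 0x1100, 0x301A, 0x2014]

def describe(cp):
    """Return (name, general category, East Asian Width) for a code point."""
    ch = chr(cp)
    return (unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.east_asian_width(ch))

for cp in samples:
    name, cat, eaw = describe(cp)
    print(f"U+{cp:04X} {name}: category={cat}, East Asian Width={eaw}")
```

With current Unicode data the white bracket should be reported as "W" (wide) and the em dash as "A" (ambiguous), so neither can be settled by the EastAsianWidth tables alone without a further decision.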
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/