Harald Alvestrand wrote on 2000-11-15 17:59 UTC:
> Is wcwidth() actually owned by some standard organization?

I am not sure what you mean by "owned".

wcwidth() is defined in various places, for example by the X/Open SUS at

  http://www.opengroup.org/onlinepubs/007908799/xsh/wcwidth.html

It seems to be commonly considered a locale-dependent function. That
makes sense, because EUC and similar encodings have a width associated
with every coding position (namely: wcswidth(s) == strlen(s)).

Beyond that EUC tradition, I see no reason why wcwidth() should depend
in any way on the culture, language, or region. Since it is not a
cultural convention, I don't think ISO DTR 14652 is the right place to
put this. I see lots of excellent reasons for defining one fixed
wcwidth for all UTF-8 locales, as this will tremendously simplify
interoperability between fixed-width terminal emulators.

Suitable standards bodies for specifying a recommended wcwidth for all
UTF-8 communication with terminal emulators are:

  ECMA/ISO   because they own ECMA-48/ISO-6429, the mother of all terminal
             emulation standards

  Unicode    because they have already done part of the work in their
             EastAsianWidth tables

  IETF       because they own telnet and ssh, the lower-layer transmission
             protocols used today for almost all terminal access

There are still a few open wcwidth() questions, which probably
deserve a bit more practical experience:

  - handling Arabic ligatures as double-width (probably yes)
  - handling non-spacing characters like combining characters (probably yes)
  - handling of Hangul Jamo (probably best not to use Jamo at all in terminals)
  - handling of Indic scripts (no clue so far on my side, talking to
    local computer-literate Sanskrit gurus is still on my todo list)
  - handling of the math characters (white brackets, etc.) that were
    accidentally placed into the CJK section (probably should be single width)
  - should the em dash be double width? (still not sure, probably not,
    would just open a big can of worms)
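
To make the idea of a locale-independent width function concrete, here
is a minimal sketch built on the Unicode character database that ships
with Python. It is not the X/Open definition and not a finished
proposal: the names sketch_wcwidth() and sketch_wcswidth() are
hypothetical, and the rules (zero width for non-spacing and enclosing
marks, two columns for East Asian Wide and Fullwidth, one column
otherwise) are assumptions that leave the open questions above
(ligatures, Jamo, Indic scripts) unanswered.

```python
import unicodedata


def sketch_wcwidth(ch: str) -> int:
    """Hypothetical locale-independent column width of one character.

    Returns -1 for control characters, 0 for NUL and for non-spacing or
    enclosing marks, 2 for East Asian Wide/Fullwidth, and 1 otherwise.
    """
    cp = ord(ch)
    if cp == 0:
        return 0
    if cp < 0x20 or 0x7F <= cp < 0xA0:
        return -1  # C0/C1 control characters have no defined width
    # Assumption: combining marks (categories Mn, Me) take no column.
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    # Assumption: only Wide (W) and Fullwidth (F) are double width;
    # Ambiguous (A) characters are treated as single width.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def sketch_wcswidth(s: str) -> int:
    """Sum of the widths in s, or -1 if any character is a control."""
    total = 0
    for ch in s:
        w = sketch_wcwidth(ch)
        if w < 0:
            return -1
        total += w
    return total
```

Under these assumptions, a base letter followed by a combining accent
occupies one column, and characters that the EastAsianWidth data lists
as ambiguous, such as the em dash, come out single width.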

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/