2011/1/5 Alexander Polakov <polac...@gmail.com>:
> 1) wcwidth(0x200B)
> This is from http://unicode.org/Public/UNIDATA/ :
>
> 200B;ZERO WIDTH SPACE;Cf;0;BN;;;;;N;;;;;
> 200C;ZERO WIDTH NON-JOINER;Cf;0;BN;;;;;N;;;;;
> 200D;ZERO WIDTH JOINER;Cf;0;BN;;;;;N;;;;;
>
> --- share/locale/ctype/en_US.UTF-8.src.orig    Tue Jan  4 22:49:22 2011
> +++ share/locale/ctype/en_US.UTF-8.src    Tue Jan  4 22:50:55 2011
> @@ -1672,7 +1672,8 @@
>  BLANK     0x2000 - 0x200b  0x202f  0x205f
>  PRINT     0x2000 - 0x200b  0x2010 - 0x2029  0x202f - 0x2052  0x2057
>  PRINT     0x205f
> -SWIDTH1   0x2000 - 0x200b  0x2010 - 0x2029  0x202f - 0x2052  0x2057
> +SWIDTH1   0x2000 - 0x200c  0x2010 - 0x2029  0x202f - 0x2052  0x2057
> +SWIDTH0   0x200b - 0x200d
>  SWIDTH1   0x205f

That only solves the test case. All combining characters (diacritical marks), including 0x300, should be zero width as well. The accepted interpretation of the Unicode rules appears to be that the Cf, Me and Mn categories, plus or minus a few characters, are zero width; see the comments in:

http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

That file also happens to be in xenocara/app/xterm/wcwidth.c, so that was the behavior in xterm until (I assume) it started using the system version.

The database file in OpenBSD is just too old. The same problem was fixed in FreeBSD's version of the file in 2006, see:

http://code.bsd64.org/cvsweb/freebsd/src/share/mklocale/UTF-8.src
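
To make the category rule concrete, here is a rough sketch in Python (not the OpenBSD or xterm code) that derives a zero width from the Unicode general category. The helper name sketch_width and both sets are my own assumptions for illustration. The set of categories follows my reading of the comments in wcwidth.c, and the only exception I model is SOFT HYPHEN, which that file documents as width 1 even though it is Cf. Control characters and East Asian wide characters are ignored here.

```python
import unicodedata

# Assumption: general categories treated as zero width, per the comments
# in Markus Kuhn's wcwidth.c (non-spacing marks, enclosing marks, format).
ZERO_WIDTH_CATEGORIES = {"Mn", "Me", "Cf"}

# Assumption: the "plus or minus a few characters" part, reduced to the
# one exception modelled here. SOFT HYPHEN is Cf but occupies one column.
WIDTH_ONE_EXCEPTIONS = {0x00AD}


def sketch_width(cp: int) -> int:
    """Hypothetical helper: return 0 for zero-width code points, else 1.

    This is a simplification: it never returns -1 for control characters
    and never returns 2 for East Asian wide characters.
    """
    if cp in WIDTH_ONE_EXCEPTIONS:
        return 1
    if unicodedata.category(chr(cp)) in ZERO_WIDTH_CATEGORIES:
        return 0
    return 1


if __name__ == "__main__":
    # Code points from this thread, plus SOFT HYPHEN and a plain letter.
    for cp in (0x200B, 0x200C, 0x200D, 0x0300, 0x00AD, 0x0041):
        name = unicodedata.name(chr(cp), "?")
        print(f"U+{cp:04X} {name}: {sketch_width(cp)}")
```

The point is only that a table generated from the categories covers 0x300 and the rest of the combining marks at once, instead of patching individual ranges like 0x200b - 0x200d by hand.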