Re: wcwidth and locale

Rich Felker Tue, 10 Apr 2007 06:51:06 -0700

On Mon, Apr 09, 2007 at 12:26:51PM -0400, ＳｒｉｎＴｕａｒ wrote:
> Just a question:
> 
> Does anyone know of locales where ambiguous char-cell width
> characters, such as ※☠☢☣☤ ♀♂★☆, are treated as double width rather than
> single width?


Ambiguous width from a Unicode perspective means just that the
characters did not exist in legacy CJK encodings, or that they were
wide in legacy CJK encodings but narrow in others (and should be
narrow), such as Greek.
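
To see which class a given character falls in, one can query the East
Asian Width property directly. A minimal sketch in Python, assuming the
standard unicodedata module; the helper name is_ambiguous is made up
for illustration, and the sample string is a few of the characters
quoted above plus a Greek letter:

```python
import unicodedata

def is_ambiguous(ch: str) -> bool:
    """Return True if ch has East Asian Width class 'A' (ambiguous)."""
    return unicodedata.east_asian_width(ch) == "A"

# Print the code point and width class of each sample character.
for ch in "※♀♂★☆α":
    print(f"U+{ord(ch):04X} {unicodedata.east_asian_width(ch)}")
```

Note that this reports the Unicode property only; what wcwidth returns
for such a character is still up to the C library and the locale.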

> It seems they are double width in most fonts, but on my systems, even
> in East Asian locales, they still return widths of 1 (so I get funny
> overlaps in my terminals).

I think this is a problem with the fonts. There’s no reason a
character like ♀ should be double-width. A few of the examples you
gave are hard to make look nice at 8x16 and could benefit from a
double-width cell, but all of them are legible and distinguishable at
8x16. If you’re using a smaller font size you shouldn’t expect
non-Latin characters to be particularly legible.

At times I’ve thought it would be beneficial to update and standardize
the wcwidth table to make certain characters wide, such as the em
dash and various letters in certain Indic and other scripts which
cannot adequately be represented in a single cell due to their
proportions and level of detail. But I’m not entirely sure how this
should be done, and even if it were done, I don’t think dingbats are
appropriate candidates.
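
As a rough sketch of what such a table change could look like, here is
a wcwidth-like function in Python with a small override set layered on
top of the East Asian Width property. Everything here is hypothetical:
the names FORCED_WIDE and cell_width are invented, the override set
holds only the em dash as a placeholder, and the zero-width handling is
much cruder than a real wcwidth table.

```python
import unicodedata

# Hypothetical override set: code points forced to two cells.
# U+2014 EM DASH is the one concrete example named in the text.
FORCED_WIDE = {0x2014}

def cell_width(ch: str) -> int:
    """Approximate the number of terminal cells a single character occupies."""
    if ord(ch) in FORCED_WIDE:
        return 2
    # Nonspacing and enclosing marks take no cell of their own.
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    # Wide and fullwidth characters take two cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    # Everything else, including ambiguous ('A') characters, is narrow.
    return 1
```

With this sketch, dingbats such as ♀ stay at one cell, while anything
added to the override set becomes wide regardless of locale.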

~Rich

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
