On 2021/04/06 13:09, Martijn van Duren wrote:
> I´m also not convinced that the other wcwidth implementations might be
> on to something and that the unicode consortium is having inertia
> problems.

The difficulty is that it isn't *possible* to give a single correct
answer for the width of SHY. It varies, and can only be determined by
taking other information into account (how the terminal behaves and
whether the word currently being printed is wrapped), which is out of
scope for wcwidth(3). So it's no surprise that different people have
come up with different ways to handle it.

> If you want to show a hyphen in your text, use a hyphen. If you want to
> indicate where a word might be broken up in a hyphenated way across two
> lines if the software knows the localized grammar rules use a SHY.
>
> Also thanks to sthen@ for pointing out where the confusion comes from:
> we´re using UTF-8 here, not ISO-8859-1, so we must make sure that we
> use the UTF-8 definitions.

But guess what happens when text is converted from ISO-8859-1 to
UTF-8...

$ printf '\xad' | iconv -f iso-8859-1 -t utf-8 | hexdump -C
00000000  c2 ad                                             |..|
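
For anyone who wants to check the same thing without iconv, here is a
minimal sketch in Python (my choice of language for illustration, not
something used elsewhere in this thread; the variable names are made up).
It only shows that the ISO-8859-1 byte 0xad and the UTF-8 sequence c2 ad
are the same character, U+00AD SOFT HYPHEN, so converted text carries the
SHY along unchanged in meaning.

```python
import unicodedata

# The single ISO-8859-1 byte from the printf example above.
latin1_shy = b"\xad"

# ISO-8859-1 maps each byte to the code point with the same value,
# so decoding 0xad yields U+00AD.
ch = latin1_shy.decode("iso-8859-1")

# Re-encode that code point as UTF-8.
utf8_shy = ch.encode("utf-8")

print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")  # U+00AD SOFT HYPHEN
print(utf8_shy.hex(" "))                          # c2 ad
print(unicodedata.category(ch))                   # Cf (format character)
```

This says nothing about what width to report; it only confirms that the
character's identity survives the conversion.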