On Thu, Aug 23, 2012 at 10:08 AM, Nicholas Cole <[email protected]> wrote:
> I've recently been working hard to improve support for non-western
> languages. A problem I've hit is that there seems to be no reliable
> way to determine how many columns a given unicode character will
> occupy on the terminal, even if you assume the output will be in
> utf-8. With python's ever-increasing emphasis on unicode, I had hoped
> python 3.3 would help, but it doesn't seem to.[1] I assume you've
> also run in to this problem with urwid - have you found any solutions?
> Anything I've tried only works some of the time, and for parts of the
> unicode character set; I've always preferred not to offer a feature
> than to offer one that is unreliable.
>
> Best wishes,
>
> Nicholas
>
> [1] unicodedata.east_asian_width, as the name suggests, works only on
> 'East Asian' characters, and even then (as far as I have been able to
> tell) not 100% reliably for the purposes of curses -- the way a
> character is rendered can depend on the font. Beyond that, there are
> characters in unicode that are rendered by utf-8 on my terminal as 2
> or even 3 columns wide, but without any way to reliably determine
> that.
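For reference, the kind of guess described in [1] can be sketched roughly as below. This is only an illustration, not Urwid's actual code: the names guess_width and guess_text_width are made up for this example, and it assumes that wide and fullwidth characters take two columns and that nonspacing and enclosing marks take none. As the footnote says, a real terminal and font may disagree with it.

```python
import unicodedata


def guess_width(ch):
    """Guess how many terminal columns a single code point occupies.

    Hypothetical helper: this is a heuristic, not a guarantee.
    """
    # Nonspacing (Mn) and enclosing (Me) marks are drawn over the
    # previous character, so assume they take no column of their own.
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    # "W" (wide) and "F" (fullwidth) are assumed to take two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    # Everything else is assumed to take one column.
    return 1


def guess_text_width(text):
    """Sum the per-character guesses for a whole string."""
    return sum(guess_width(ch) for ch in text)
```

Anything outside those two cases falls through to a width of 1, which is exactly where the guess is least trustworthy.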
Urwid only has a partial solution, which does seem to work with East
Asian characters. There is also some handling for combining (0-width)
characters, but it's not very well tested.

My plan is to modify my raw_display module to query the terminal's
cursor position after sending any character whose width it is
guessing, collect the cursor positions received, record any that
don't match the expectation, and redraw the screen when that happens.

So far I haven't been polling the cursor position at all; I have just
been hoping the width values I expect are correct. The only protection
I have is a manual position reset at the beginning of each line drawn,
so that if there is corruption it should be limited to a single line.

Ian
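A rough sketch of the bookkeeping that plan would need, under some assumptions: the terminal answers the standard "ESC [ 6 n" cursor position request with "ESC [ row ; col R" (1-based), and the text being checked does not wrap onto the next line. The names parse_cursor_reports and find_width_mismatches are hypothetical and are not part of raw_display.

```python
import re

# Request the terminal sends back its cursor position for.
CURSOR_QUERY = "\x1b[6n"
# Assumed reply format: ESC [ <row> ; <col> R, both 1-based.
CURSOR_REPORT = re.compile(r"\x1b\[(\d+);(\d+)R")


def parse_cursor_reports(data):
    """Extract every (row, col) report found in raw terminal input.

    Other input (such as key presses) mixed into data is ignored.
    """
    return [(int(row), int(col)) for row, col in CURSOR_REPORT.findall(data)]


def find_width_mismatches(expected, reports):
    """Compare guessed widths with what the terminal reported.

    expected: list of (text, start_col, guessed_width), one entry per
              cursor query sent, in the order the queries were sent.
    reports:  list of (row, col) replies, in the same order.
    Returns a list of (text, guessed_width, actual_width) for each
    entry where the cursor did not end up where the guess predicted.
    """
    mismatches = []
    for (text, start_col, guessed), (_row, col) in zip(expected, reports):
        actual = col - start_col
        if actual != guessed:
            mismatches.append((text, guessed, actual))
    return mismatches
```

Any entry returned by find_width_mismatches could then be recorded as a corrected width and used to trigger the redraw.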
