Adeodato Simó suggests following through on the suggestion in this bug report and linking ncurses-ruby against ncursesw instead of ncurses, because he has observed that the sup email program displays non-ASCII characters better on a UTF-8 terminal when linked that way.
I am the upstream author of ncurses-ruby. I admit that until today I had no clear idea what the difference was between ncurses and ncursesw, apart from ncursesw "somehow" enabling "wide characters". I have investigated the matter today and I recommend against linking ncurses-ruby against ncursesw.

Reasoning:

I agree that it would be a good thing to have a Ruby ncurses binding that links against ncursesw. Conventional ncurses (without the trailing w) only works well with 8-bit charsets. A few years ago this was not a problem for most users, as it was common then for Linux distributions to configure local 8-bit charsets like ISO-8859-1. Now, however, virtually every Linux installation defaults to UTF-8 character encoding, with the consequence that non-ASCII characters require more than one byte to encode. Ncurses programs that worked fine in the old environment will no longer display non-ASCII characters reliably.

Is ncursesw the rescue? Yes, but it's not that simple. You cannot simply link an ncurses program against ncursesw and expect it to magically work with UTF-8 strings. In the email program mentioned above, you will still notice display errors when you use the cursor keys to highlight a line in the message body that contains non-ASCII characters: the line is not highlighted completely, and a few character cells remain black. If an email runs over several pages, then flipping the pages may cause some garbage from the previous page to remain on the screen in lines containing non-ASCII characters.

What is happening? The email program still calls mvaddstr with a UTF-8 encoded string. As far as ncurses(w) is concerned, the multiple bytes that make up a single non-ASCII character are distributed to different character cells on the screen.
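To make the byte/cell mismatch concrete, here is a minimal sketch in plain Ruby. It does not touch ncurses at all, and it assumes Ruby 1.9 or later, where strings carry an encoding. The sample string and the helper name byte_and_char_counts are made up for illustration and do not come from the mailer's source.

```ruby
# Hypothetical helper (not part of ncurses-ruby or the mailer):
# returns [number of bytes, number of characters] for a string.
def byte_and_char_counts(str)
  [str.bytesize, str.length]
end

# "Grüße aus Köln", written with \u escapes so the literal is UTF-8
# regardless of the source file's encoding.
line = "Gr\u00FC\u00DFe aus K\u00F6ln"

bytes, chars = byte_and_char_counts(line)
puts "bytes: #{bytes}"   # bytes: 17
puts "chars: #{chars}"   # chars: 14

# Bytes needed per character: the three non-ASCII characters take 2 each.
p line.each_char.map { |c| c.bytesize }
```

A byte-oriented call that is handed this string sees 17 units, while the terminal draws only 14 characters. That gap of 3 is the kind of discrepancy described above.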
The only reason why the user can still recognise the original non-ASCII character on the screen is that ncurses probably also happens to "print" the individual bytes to the terminal in the correct sequence, and the terminal then interprets the resulting UTF-8 encoding. After the printing, however, the terminal and the ncurses(w) library disagree on the horizontal position of the cursor.

The correct way to use ncursesw to print non-ASCII, UTF-8 encoded characters on a UTF-8 terminal is for the application to split the string to be printed into (possibly multibyte) characters, compute the Unicode code point of each character, and call the wide-character functions of ncursesw (e.g. mvadd_wch, mvaddwstr). This requires an ncursesw-ruby wrapper as well as changes to the application. Looking at the source code of the mailer, I'd say that it is not really suited for UTF-8 encoded strings yet, as it still assumes that the length of a string in bytes is equal to the number of characters in the string.

Conclusions:

- The switch from 8-bit character sets to UTF-8 requires serious modifications to applications using ncurses. Ncursesw cannot be used as a drop-in replacement.
- A separate ncursesw-ruby wrapper is desirable. It has to export the additional wide-character functions.

Tobias

