Some background first:

Version 3.2 of the Unicode standard introduced a block of so-called
"variation selector" characters, which are not themselves printable but
rather provide a hint about what glyph to use for the preceding
grapheme cluster. Since version 9.0, variation selector 16 (Unicode code
point U+FE0F) is defined to request an emoji-like appearance for the
preceding character, and emoji are expected to appear "full width", i.e.
2 columns wide in a terminal. Some references:
https://en.wikipedia.org/wiki/Variation_Selectors_(Unicode_block)
https://www.unicode.org/reports/tr51/#Display
https://gitlab.freedesktop.org/terminal-wg/specifications/-/issues/9

As the third link shows, terminal authors have struggled with the
details of implementing this, since a character can now supposedly
*modify* the width of a *previous* character rather than simply having
a width of its own. That invalidates the entire concept of `wcwidth`,
for example.
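
To make the `wcwidth` problem concrete, here is a rough sketch (not
ncurses code). `toy_wcwidth` is a made-up stand-in for the real
`wcwidth` that covers only the characters used in this report, and
`naive_width` / `vs16_aware_width` are hypothetical helpers. It assumes
a 32-bit `wchar_t`, and it simplifies by letting U+FE0F widen any narrow
character, whereas Unicode only defines emoji variation sequences for
specific base characters.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for wcwidth(): a per-character width that
 * knows nothing about neighbouring characters. */
int toy_wcwidth(wchar_t c)
{
    if (c >= 0xFE00 && c <= 0xFE0F)
        return 0;               /* variation selectors: zero width */
    if (c >= 0x1F300 && c <= 0x1FAFF)
        return 2;               /* rough range of default-wide emoji */
    return 1;                   /* everything else: one column */
}

/* Sum of per-character widths, as a wcwidth()-based caller would do. */
int naive_width(const wchar_t *s)
{
    int w = 0;

    assert(s != NULL);
    for (; *s != L'\0'; s++)
        w += toy_wcwidth(*s);
    return w;
}

/* Same sum, except that U+FE0F promotes a preceding narrow character
 * from one column to two. */
int vs16_aware_width(const wchar_t *s)
{
    int w = 0;
    int prev = 0;               /* width of the last base character */

    assert(s != NULL);
    for (; *s != L'\0'; s++) {
        if (*s == 0xFE0F && prev == 1) {
            w += 1;             /* 1 column becomes 2 */
            prev = 2;
        } else {
            int cw = toy_wcwidth(*s);
            w += cw;
            if (cw > 0)
                prev = cw;
        }
    }
    return w;
}
```

With these definitions, `naive_width(L"a\u2764\ufe0fa")` is 3 while
`vs16_aware_width` gives 4 for the same string, so the two approaches
disagree about where the final "a" lands. For `L"a\U0001f49aa"` both
give 4, since no selector is involved.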

However, currently ncurses is compounding the problem by handling the
variation selector character inconsistently *when processing it* and
supplying it to the terminal, which is the bug I am reporting here.

I first reported this on the CPython bug tracker, since I originally
discovered the issue in the Python `curses` wrapper:

https://github.com/python/cpython/issues/135521

However, after some further investigation I have established that the
underlying C library (ncurses itself) is to blame.

Here is a test program:

------------------------------------------------------------------------
#define _XOPEN_SOURCE_EXTENDED 1

#include <locale.h>
#include <ncurses.h>

/* Based on code from: https://stackoverflow.com/questions/69804246 */

int main(void) {
    WINDOW* win;
    setlocale(LC_ALL, "");
    initscr();
    raw();
    noecho();
    curs_set(0);
    refresh();
    
    win = newwin(6, 12, 0, 0);
    box(win, 0, 0);
    mvwaddwstr(win, 1, 6, L"vwxyz");
    mvwaddwstr(win, 2, 6, L"vwxyz");
    mvwaddwstr(win, 3, 6, L"vwxyz");
    mvwaddwstr(win, 4, 6, L"vwxyz");
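    /* Rows 1-2: U+2764 + U+FE0F (red heart), added vs. inserted. */
    mvwaddwstr(win, 1, 1, L"a\u2764\ufe0fa");
    mvwins_wstr(win, 2, 1, L"a\u2764\ufe0fa");
    /* Rows 3-4: U+1F49A (green heart, no selector), added vs. inserted. */
    mvwaddwstr(win, 3, 1, L"a\U0001f49aa");
    mvwins_wstr(win, 4, 1, L"a\U0001f49aa");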
    wrefresh(win);
    delwin(win);

    getch();
    endwin();
    return 0;
}
------------------------------------------------------------------------

Very simple - I draw a box and put some letters and then some heart
emojis in it (red and green), in varying ways.

On every terminal I can find, the heart on line 2 is just an ordinary
U+2764 HEAVY BLACK HEART with a blank space next to it, whereas on
terminals that support the red-heart emoji, the emoji does appear on
line 1.
Even in terminals with no emoji support, such as a vanilla xterm, the
results differ between adding vs. inserting the text - there's still a
blank space on line 2, but not on line 1 (where the variation selector
has presumably just been discarded by the terminal).

It is as if `mvwins_wstr` were silently replacing the variation selector
character with a space.

The green hearts work without issue and are included for reference.
