New submission from STINNER Victor <victor.stin...@haypocalc.com>:

curses functions that accept strings implicitly encode character strings to UTF-8. This is wrong. We should either add a function to set the encoding (see issue #6745) or use the wide character C functions. I don't think UTF-8 is the right default encoding; the locale encoding is probably a better choice.
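To illustrate why the choice of encoding matters, here is a minimal sketch that encodes the same character under two encodings. The helper name encode_for_curses is hypothetical and not part of the curses module; it only assumes that a byte-oriented curses call would receive the encoded bytes, and that the locale encoding is used when none is given.

```python
import locale


def encode_for_curses(text, encoding=None):
    """Hypothetical helper: encode text as a byte-oriented curses call would need it."""
    if encoding is None:
        # Assumption: fall back to the locale encoding rather than a hard-coded UTF-8.
        encoding = locale.getpreferredencoding(False)
    return text.encode(encoding)


# The same character yields different byte sequences depending on the encoding.
utf8_bytes = encode_for_curses('é', 'utf-8')         # two bytes: 0xC3 0xA9
latin1_bytes = encode_for_curses('é', 'iso-8859-1')  # one byte: 0xE9
```

A terminal running in an ISO-8859-1 locale expects the single byte, while a UTF-8 terminal expects the two-byte sequence, so a fixed UTF-8 default can only be right for one of them.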
Accepting characters (and character strings) but calling byte functions is wrong. For example, addch('é') doesn't work with a UTF-8 locale encoding: it calls waddch(0xE9) (é is U+00E9), whereas waddch(0xC3) followed by waddch(0xA9) should be called.

Workaround in Python:

    for byte in 'é'.encode('utf-8'):
        win.addch(byte)

I see two possible solutions:

 A) Add new functions that only accept characters, and stop accepting characters in the existing functions.
 B) Fix the existing functions to call the right C function depending on the input type. For example, Python addch(10) and addch(b'\n') would call waddch(10), whereas addch('é') would call wadd_wch(233).

I prefer solution (B) because addch('é') would just work as expected.

----------
components: Library (Lib)
messages: 140375
nosy: Nicholas.Cole, akuchling, cben, gpolo, haypo, inigoserna, python-dev, r.david.murray, schodet, zeha
priority: normal
severity: normal
status: open
title: curses implementation of Unicode is wrong in Python 3
versions: Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12567>
_______________________________________