New submission from Amaury Forgeot d'Arc <amaur...@gmail.com>:

On narrow unicode builds:

    unicodedata.category(chr(0x10000)) == 'Lo'    # correct
    Py_UNICODE_ISPRINTABLE(0x10000) == 1          # correct
    str.isprintable(chr(0x10000)) == False        # inconsistent

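The inconsistency can be reproduced on any build by emulating narrow storage. The sketch below is only an illustration, not CPython code: it assumes that UTF-16 code units are a fair stand-in for a narrow build's Py_UNICODE array, and the helper names (to_utf16_units, iter_code_points, isprintable_per_unit, isprintable_per_code_point) are hypothetical.

```python
def to_utf16_units(s):
    """Split a str into UTF-16 code units (assumed model of narrow storage)."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def iter_code_points(units):
    """Yield whole code points, joining each high/low surrogate pair."""
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Standard UTF-16 decoding of a surrogate pair.
            yield 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            yield unit
            i += 1


def isprintable_per_unit(units):
    """Test every code unit on its own, as the reported loop does."""
    return all(chr(u).isprintable() for u in units)


def isprintable_per_code_point(units):
    """Test whole code points after joining surrogate pairs."""
    return all(chr(cp).isprintable() for cp in iter_code_points(units))


units = to_utf16_units(chr(0x10000))
print([hex(u) for u in units])            # the two surrogate code units
print(isprintable_per_unit(units))        # surrogates tested separately
print(isprintable_per_code_point(units))  # pair joined before testing
```

Testing per unit fails because each surrogate has category 'Cs', which is never printable, while the joined code point U+10000 has category 'Lo'.
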
On narrow unicode builds, large code points are stored as a surrogate pair. But str.isprintable() simply loops over the Py_UNICODE array and tests each surrogate separately, so the result is always False for such code points. There should be a way to walk a unicode string in C, character by character, and the str methods (str.is*, str.to*) should use it.

----------
components: Unicode
messages: 109542
nosy: amaury.forgeotdarc, ezio.melotti, lemburg
priority: normal
severity: normal
status: open
title: str.isprintable() is always False for large code points
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9200>
_______________________________________