New submission from Amaury Forgeot d'Arc <amaur...@gmail.com>:

On narrow unicode builds:
unicodedata.category(chr(0x10000)) == 'Lo'  # correct
Py_UNICODE_ISPRINTABLE(0x10000)    == 1     # correct 
str.isprintable(chr(0x10000))      == False # inconsistent

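On narrow unicode builds, large code points are stored as a surrogate pair.
But str.isprintable() simply loops over the Py_UNICODE array and tests the
two surrogates separately. Each surrogate on its own is in category 'Cs',
which is not printable, so the result is always False.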
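
To illustrate, here is a rough sketch in Python that mimics what the loop sees
on a narrow build by splitting the character into its UTF-16 code units. The
helper narrow_units() is hypothetical and only exists for this illustration;
it is not CPython code.

```python
import unicodedata

def narrow_units(s):
    # Hypothetical helper: return the UTF-16 code units that a narrow
    # build would store for s, each as a one-character string.
    data = s.encode("utf-16-le", "surrogatepass")
    return [chr(int.from_bytes(data[i:i + 2], "little"))
            for i in range(0, len(data), 2)]

units = narrow_units(chr(0x10000))

# Each half of the pair is a surrogate (category 'Cs'), so a test applied
# per code unit fails even though the full code point is printable.
categories = [unicodedata.category(u) for u in units]
per_unit_printable = all(u.isprintable() for u in units)
```

Here units holds the two surrogates U+D800 and U+DC00, and testing them one by
one reproduces the inconsistent result reported above.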

There should be a way to walk a unicode string in C, character by character, 
and the str methods (str.is*, str.to*) should use it.
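
A minimal sketch of such a walk, written in Python rather than C and operating
on a list of UTF-16 code units. The names iter_code_points() and
isprintable_by_code_point() are made up for this example; this is only one
possible shape for the idea, not a proposed API.

```python
def iter_code_points(units):
    # Walk a sequence of UTF-16 code units (ints) and yield whole code
    # points, joining a high surrogate with the low surrogate after it.
    i = 0
    n = len(units)
    while i < n:
        cp = units[i]
        i += 1
        if 0xD800 <= cp <= 0xDBFF and i < n and 0xDC00 <= units[i] <= 0xDFFF:
            cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        # An unpaired surrogate is yielded unchanged.
        yield cp

def isprintable_by_code_point(units):
    # Apply the printable test once per code point, not per code unit.
    return all(chr(cp).isprintable() for cp in iter_code_points(units))
```

With this walk, the pair 0xD800, 0xDC00 is tested once as 0x10000, so the
answer agrees with unicodedata.category() and Py_UNICODE_ISPRINTABLE().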

----------
components: Unicode
messages: 109542
nosy: amaury.forgeotdarc, ezio.melotti, lemburg
priority: normal
severity: normal
status: open
title: str.isprintable() is always False for large code points
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9200>
_______________________________________