Martin v. Löwis wrote:
> Shane Hathaway wrote:
> 
>>I agree that UCS4 is needed.  There is a balancing act here; UTF-16 is
>>widely used and takes less space, while UCS4 is easier to treat as an
>>array of characters.  Maybe we can have both: unicode objects start with
>>an internal representation in UTF-16, but get promoted automatically to
>>UCS4 when you index or slice them.  The difference will not be visible
>>to Python code.  A compile-time switch will not be necessary.  What do
>>you think?
> 
> 
> This breaks backwards compatibility with existing extension modules.
> Applications that do PyUnicode_AsUnicode get a Py_UNICODE*, and
> can use that to directly access the characters.

Py_UNICODE would always be 32 bits wide.  PyUnicode_AsUnicode would
cause the unicode object to be promoted automatically.  Extensions that
break as a result are technically broken already, aren't they?  They're
not supposed to depend on the size of Py_UNICODE.

Shane
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to