New submission from Gergely Erdélyi:
create_unicode_buffer() fails on Windows if the initializer string contains
Unicode code points outside the Basic Multilingual Plane and no explicit
length is specified.
The problem appears to be rooted in the fact that, since PEP 393, len() returns
the number of code points, which does not always match the number of 16-bit
wchar_t units needed to encode the string on Windows: each non-BMP code point
takes a surrogate pair, i.e. two units. As a result, the preallocated c_wchar
buffer is too short for the UTF-16 string.
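
As a rough illustration of the mismatch (a sketch only, not code taken from
ctypes), the two counts can be compared directly. The variable names are
made up for this example:

```python
# Two code points outside the Basic Multilingual Plane.
s = "\U00028318\U00028319"

# Since PEP 393, len() counts code points.
code_points = len(s)

# UTF-16 uses 2 bytes per 16-bit unit; each non-BMP code point
# is encoded as a surrogate pair, i.e. two units.
utf16_units = len(s.encode("utf-16-le")) // 2

print(code_points, utf16_units)  # 2 4
```

A buffer sized from len() therefore has room for 2 characters plus the
terminating NUL, while the UTF-16 form needs 4 units plus the NUL.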
The following small snippet demonstrates the problem:

    from ctypes import create_unicode_buffer
    b = create_unicode_buffer("\U00028318\U00028319")
    print(b)

      File "c:\Python33\lib\ctypes\__init__.py", line 294, in create_unicode_buffer
        buf.value = init
    ValueError: string too long

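
A possible workaround, sketched under the assumption that passing an explicit
size avoids the undersized allocation: compute the number of c_wchar elements
by hand and pass it as the second argument. The helper names wchar_units and
create_unicode_buffer_safe are hypothetical and are not part of ctypes.

```python
from ctypes import c_wchar, create_unicode_buffer, sizeof


def wchar_units(s):
    """Return how many c_wchar elements s needs, excluding the final NUL."""
    if sizeof(c_wchar) == 2:
        # 16-bit wchar_t (e.g. Windows): count UTF-16 code units, so that
        # each non-BMP code point is counted as a surrogate pair.
        return len(s.encode("utf-16-le")) // 2
    # 32-bit wchar_t: one element per code point.
    return len(s)


def create_unicode_buffer_safe(init):
    """Hypothetical wrapper: allocate with an explicit, sufficient size."""
    return create_unicode_buffer(init, wchar_units(init) + 1)


b = create_unicode_buffer_safe("\U00028318\U00028319")
print(len(b))
```

On a platform with a 32-bit wchar_t the helper falls back to len(), so the
result is the same size the one-argument call would have chosen.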
----------
components: ctypes
messages: 205045
nosy: gergely.erdelyi
priority: normal
severity: normal
status: open
title: create_unicode_buffer() fails on non-BMP strings on Windows
type: crash
versions: Python 3.3, Python 3.4
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue19865>
_______________________________________