[issue19865] create_unicode_buffer() fails on non-BMP strings on Windows

Gergely Erdélyi Mon, 02 Dec 2013 11:37:24 -0800

New submission from Gergely Erdélyi:

create_unicode_buffer() fails on Windows if the initializer string contains 
Unicode code points outside the Basic Multilingual Plane (BMP) and no explicit 
length is specified.


The problem appears to be rooted in the fact that, since PEP 393, len() returns 
the number of code points, which does not always match the number of 16-bit 
wchar_t units needed to encode the string on Windows. Each code point outside 
the BMP takes two units (a surrogate pair), so the preallocated c_wchar buffer 
is too short for the UTF-16 string.
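
To make the mismatch concrete, here is a minimal sketch that compares the two 
counts for the string used in this report. The variable names are illustrative 
only; the count of 16-bit units is obtained by encoding to UTF-16 without a BOM 
and dividing the byte length by two.

```python
# Sketch: code points vs. UTF-16 code units for a non-BMP string.
s = "\U00028318\U00028319"

# Since PEP 393, len() counts code points.
code_points = len(s)

# Each UTF-16 code unit is two bytes; "utf-16-le" adds no BOM.
utf16_units = len(s.encode("utf-16-le")) // 2

print(code_points, utf16_units)  # prints: 2 4
```

A buffer sized from len() has room for two characters plus the terminator, 
while the UTF-16 form needs four units plus the terminator.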

The following small snippet demonstrates the problem:

from ctypes import create_unicode_buffer
b = create_unicode_buffer("\U00028318\U00028319")
print(b)

  File "c:\Python33\lib\ctypes\__init__.py", line 294, in create_unicode_buffer
    buf.value = init
ValueError: string too long
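
Since the failure only occurs when no explicit length is given, a possible 
workaround is to pass the size explicitly. The sketch below is not a fix for 
ctypes itself; utf16_len() is a hypothetical helper introduced here for 
illustration, and the size is chosen on the assumption that one slot per UTF-16 
code unit plus a terminating NUL is enough on Windows. On platforms with a 
32-bit wchar_t the buffer is simply larger than needed.

```python
from ctypes import create_unicode_buffer


def utf16_len(text):
    """Hypothetical helper: number of 16-bit UTF-16 code units in text."""
    return len(text.encode("utf-16-le")) // 2


init = "\U00028318\U00028319"

# Reserve one slot per UTF-16 code unit plus one for the terminating NUL,
# instead of relying on the default size derived from len(init).
buf = create_unicode_buffer(init, utf16_len(init) + 1)

print(len(buf))  # number of c_wchar slots allocated
```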

----------
components: ctypes
messages: 205045
nosy: gergely.erdelyi
priority: normal
severity: normal
status: open
title: create_unicode_buffer() fails on non-BMP strings on Windows
type: crash
versions: Python 3.3, Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19865>
_______________________________________