[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-02 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r85172 changes PyUnicode_AsWideCharString() (don't count the trailing nul character in the output size) and adds unit tests. r85173 patches unicode_aswidechar() to support non-BMP characters for all known wchar_t/Py_UNICODE size

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-02 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r85174+r85177: ctypes.c_wchar supports non-BMP characters with a 32-bit wchar_t => fix this issue (I also committed an unwanted change on _testcapi to fix r85172 in r85174: r85175 reverts this change, and r85176 fixes the _testcapi

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-02 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r85173 patches unicode_aswidechar() to support non-BMP characters for all known wchar_t/Py_UNICODE size combinations (2/2, 2/4 and 4/2). Oh, and 4/4 ;-)
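The size combinations above differ only in whether surrogate pairs have to be joined (16-bit units widened to 32-bit) or split (32-bit code points narrowed to 16-bit units). The following is a minimal Python sketch of that arithmetic, not the C code from r85173; the function names are hypothetical and chosen only for illustration.

```python
def utf16_units_to_code_points(units):
    """Join surrogate pairs: 16-bit units -> 32-bit code points.

    Hypothetical helper; models the narrow-to-wide direction only.
    """
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Standard UTF-16 decoding of a surrogate pair.
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            # BMP character (or a lone surrogate, copied unchanged).
            out.append(u)
            i += 1
    return out


def code_points_to_utf16_units(code_points):
    """Split non-BMP code points: 32-bit code points -> 16-bit units.

    Hypothetical helper; models the wide-to-narrow direction only.
    """
    out = []
    for cp in code_points:
        if cp > 0xFFFF:
            cp -= 0x10000
            out.append(0xD800 | (cp >> 10))    # high surrogate
            out.append(0xDC00 | (cp & 0x3FF))  # low surrogate
        else:
            out.append(cp)
    return out
```

When both types have the same size, no conversion is needed and the units can be copied as they are.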

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-02 Thread Daniel Stutzbach
Daniel Stutzbach dan...@stutzbachenterprises.com added the comment: Thanks for working on this! Since this was a bugfix, it should be merged back into 2.7, yes? -- stage: unit test needed -> committed/rejected

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-02 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment:
> Since this was a bugfix, it should be merged back into 2.7, yes?
Mmmh, the fix requires changing the PyUnicode_AsWideChar() function (to support non-BMP characters and surrogate pairs) (and maybe also to create

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-02 Thread Daniel Stutzbach
Daniel Stutzbach dan...@stutzbachenterprises.com added the comment: Since I noticed the bug through source code inspection and no one has reported it occurring in practice, that sounds reasonable to me. -- versions: -Python 2.7

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Update the patch for the new PyUnicode_AsWideCharString() function: - use Py_UNICODE_SIZE and SIZEOF_WCHAR_T in the preprocessor tests - faster loop: don't use a counter + pointer, but only use pointers (for the stop condition)

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file17322/pyunicode_aswidechar_surrogates-py3k.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue8670

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Patch version 3: - fix unicode_aswidechar if Py_UNICODE_SIZE == SIZEOF_WCHAR_T and w == NULL (return the number of characters, don't write into w!) - improve unicode_aswidechar() comment -- Added file:

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: I don't know how to test if Py_UNICODE_SIZE == 4 && SIZEOF_WCHAR_T == 2. On Windows, sizeof(wchar_t) is 2, but it looks like Python is not prepared to have Py_UNICODE != wchar_t for its Windows implementation. wchar_t is 32 bits long

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread Daniel Stutzbach
Daniel Stutzbach dan...@stutzbachenterprises.com added the comment: I, too, can't think of any platforms where Py_UNICODE_SIZE == 4 && SIZEOF_WCHAR_T == 2, and I'm not sure what the previous policy has been. Have you noticed any other code that would set a precedent? If no one else chimes in,

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment:
STINNER Victor wrote:
> I don't know how to test if Py_UNICODE_SIZE == 4 && SIZEOF_WCHAR_T == 2. On Windows, sizeof(wchar_t) is 2, but it looks like Python is not prepared to

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread Daniel Stutzbach
Daniel Stutzbach dan...@stutzbachenterprises.com added the comment: You can tweak the Windows pyconfig.h to use UCS4, AFAIK, if you want to test drive this case. I seem to recall seeing some other code that assumed Windows implied UCS2. Proceed with caution. ;-) But it's probably easier

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment:
Daniel Stutzbach wrote:
> You can tweak the Windows pyconfig.h to use UCS4, AFAIK, if you want to test drive this case. I seem to recall seeing some other code that

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Patch version 4: - implement unicode_aswidechar() for 16-bit wchar_t and 32-bit Py_UNICODE - PyUnicode_AsWideCharString() returns the number of wide characters excluding the nul character, as does PyUnicode_AsWideChar() For 16
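To make the "number of wide characters excluding the nul character" rule concrete, here is a small Python sketch of the size computation. It is an illustration under assumptions, not the patch itself: the function name wide_length is hypothetical, and the input string is treated as a sequence of full code points (the 32-bit Py_UNICODE case).

```python
def wide_length(text, wchar_size):
    """Count the wchar_t units needed for text, without the trailing nul.

    Hypothetical helper. With a 2-byte wchar_t, every non-BMP character
    takes two units (a surrogate pair); with a 4-byte wchar_t, every
    character takes exactly one unit.
    """
    if wchar_size == 2:
        return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text)
    return len(text)
```

A caller that allocates the buffer itself would reserve wide_length(...) + 1 units so that the terminating nul still fits.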

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file19082/aswidechar_nonbmp-2.patch

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file19083/aswidechar_nonbmp-3.patch

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-10-01 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Ooops, I lost my patch to fix the initial (ctypes) issue. Here is an updated patch: ctypes_nonbmp.patch (which needs aswidechar_nonbmp-4.patch). -- Added file: http://bugs.python.org/file19101/ctypes_nonbmp.patch

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-09-28 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: #9979 proposes creating a new PyUnicode_AsWideCharString() function.

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-09-28 Thread Daniel Stutzbach
Daniel Stutzbach dan...@stutzbachenterprises.com added the comment: I know enough about Unicode to have reported this bug, but I don't feel knowledgeable enough about Python's Unicode implementation to comment on your suggested solution. I'm adding the other people listed in

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-05-13 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Support for characters outside the Unicode BMP (code 0x) is not complete in a narrow build (sizeof(Py_UNICODE) == 2) of Python 2: $ ./python Python 2.7b2+ (trunk:81139M, May 13 2010, 18:45:37) x=u'\U0001' x[0], x[1]

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-05-13 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Patch for Python3: - Fix PyUnicode_AsWideChar() to support surrogates (Py_UNICODE: 2 bytes, wchar_t: 4 bytes) - u_set() of _ctypes uses PyUnicode_AsWideChar() - add a test (skipped if sizeof(wchar_t) is smaller than 4 bytes)
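A test that is skipped when wchar_t is too small could look roughly like the sketch below. This is not the test from the patch: the class name, the helper run_sketch() and the chosen character are assumptions made for illustration; only ctypes.sizeof(), ctypes.c_wchar and unittest.skipIf are real library calls.

```python
import ctypes
import unittest

# Size of the platform's wchar_t as seen by ctypes (2 or 4 bytes in practice).
WCHAR_SIZE = ctypes.sizeof(ctypes.c_wchar)


class NonBMPWcharTest(unittest.TestCase):
    """Hypothetical test case, named here only for the example."""

    @unittest.skipIf(WCHAR_SIZE < 4,
                     "wchar_t is too small to hold a non-BMP character")
    def test_non_bmp(self):
        ch = "\U00010000"  # first code point outside the BMP
        self.assertEqual(ctypes.c_wchar(ch).value, ch)


def run_sketch():
    """Run the test case and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NonBMPWcharTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

On a platform with a 16-bit wchar_t the test is reported as skipped rather than failed, which matches the behaviour described in the comment.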

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-05-12 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: -- components: +Unicode

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-05-12 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: -- nosy: +haypo

[issue8670] c_types.c_wchar should not assume that sizeof(wchar_t) == sizeof(Py_UNICODE)

2010-05-09 Thread Daniel Stutzbach
New submission from Daniel Stutzbach dan...@stutzbachenterprises.com: Using a UCS2 Python on a platform with a 32-bit wchar_t, the following code throws an exception (but should not):
ctypes.c_wchar('\u1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: one