Marc-Andre Lemburg <m...@egenix.com> added the comment:

Antoine Pitrou wrote:
> 
> Antoine Pitrou <pit...@free.fr> added the comment:
> 
> Well, the patch was technically useless since, as mentioned, unicode strings
> are terminated by a NUL character by design.

There are two things to keep in mind:

 * Unicode objects are NUL-terminated, but only very external APIs
   rely on this (e.g. code using the Windows Unicode API). Please
   don't make the code in unicodeobject.c itself rely on this
   subtle detail.

 * The codecs work on Py_UNICODE* buffers which are *never*
   guaranteed to be NUL-terminated, so the problem in question
   is real.

> Anyway, I now get the following error on the 2.7 branch. Perhaps it's related:
> 
> ======================================================================
> FAIL: test_ucs4 (test.test_unicode.UnicodeTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/antoine/cpython/27/Lib/test/test_unicode.py", line 941, in test_ucs4
>     self.assertEqual(x, y)
> AssertionError: '\\udbc0\\udc00' != '\\U00100000'
> 
> ----------
> nosy: +pitrou
> status: closed -> open

----------
nosy: +lemburg

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8821>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
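
To illustrate the second point above (buffers that are not guaranteed to be NUL-terminated), here is a minimal sketch in Python rather than C. It models a Py_UNICODE* buffer as a list of 16-bit code units plus an explicit length, and only peeks at the following unit when that unit is still inside the buffer. The function name `combine_surrogates` and its parameters are hypothetical and chosen for illustration; this is not the code in unicodeobject.c.

```python
# Surrogate ranges as defined for UTF-16.
HIGH_START, HIGH_END = 0xD800, 0xDBFF
LOW_START, LOW_END = 0xDC00, 0xDFFF


def combine_surrogates(units, length):
    """Combine surrogate pairs found in units[:length] into code points.

    `units` stands in for a raw buffer that may hold unrelated data past
    `length` (assumption: no terminator is present). The function never
    reads index `length` or beyond.
    """
    out = []
    i = 0
    while i < length:
        ch = units[i]
        i += 1
        # Look at the next unit only if it is still inside the buffer,
        # instead of relying on a trailing NUL to stop the lookahead.
        if HIGH_START <= ch <= HIGH_END and i < length:
            ch2 = units[i]
            if LOW_START <= ch2 <= LOW_END:
                ch = 0x10000 + ((ch - 0xD800) << 10) + (ch2 - 0xDC00)
                i += 1
        out.append(ch)
    return out
```

With the pair from the failing test, `combine_surrogates([0xDBC0, 0xDC00], 2)` yields the single code point 0x100000. Passing a length of 1 for the same list leaves the lone high surrogate untouched, because the second unit lies outside the declared buffer.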