Marc-Andre Lemburg <m...@egenix.com> added the comment:

Alexander Belopolsky wrote:
>
> Alexander Belopolsky <belopol...@users.sourceforge.net> added the comment:
>
> On Sat, Nov 27, 2010 at 6:38 PM, Raymond Hettinger
> <rep...@bugs.python.org> wrote:
> ..
>> I suggest Py_UNICODE_ADVANCE() to avoid false suggestion that the iterator
>> protocol is being used.
>>
>
> As a data point, ICU defines U16_NEXT() for similar purpose. I also
> like ICU terminology for surrogates ("lead" and "trail") better than
> the backward "high" and "low".

"High" and "low" are Unicode standard terms, so we should use those. Regarding Py_UCS4_READ_CODE_POINT: you're right that surrogates are code points, so how about Py_UCS4_READ_NEXT() ?! Regarding Py_UCS4_READ_NEXT() vs. Py_UNICODE_READ_NEXT(): the return value of the macro is a Py_UCS4 value, not a Py_UNICODE value. The first argument of the macro can be any array, not just Py_UNICODE*, but also Py_UCS4* or even int*. Py_UCS2_READ_NEXT() would be plain wrong :-) Also note that Python does have a Py_UCS4 type; it doesn't have a Py_UCS2 type. That's why we should use *Py_UCS4*_READ_NEXT(). ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue10542> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com