Ezio Melotti <ezio.melo...@gmail.com> added the comment:

As I said in msg142175, I think the Py_UNICODE_IS{HIGH|LOW|}SURROGATE and
Py_UNICODE_JOIN_SURROGATES macros can be committed without a leading _ in 3.3
and with a leading _ in 2.7/3.2.  They should go in unicodeobject.h and be
public in 3.3+.
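
For reference, here is a rough sketch of what macros of this kind compute.  The
SKETCH_* names are made up for illustration and this is not the code from the
patch or Victor's implementation; it only assumes the standard UTF-16 surrogate
ranges (high: U+D800..U+DBFF, low: U+DC00..U+DFFF).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the macros from the patch.
   Note that each range check evaluates its argument twice. */
#define SKETCH_IS_SURROGATE(ch)      (0xD800 <= (ch) && (ch) <= 0xDFFF)
#define SKETCH_IS_HIGH_SURROGATE(ch) (0xD800 <= (ch) && (ch) <= 0xDBFF)
#define SKETCH_IS_LOW_SURROGATE(ch)  (0xDC00 <= (ch) && (ch) <= 0xDFFF)

/* Combine a high/low surrogate pair into the code point it encodes. */
#define SKETCH_JOIN_SURROGATES(high, low) \
    (0x10000 + ((((uint32_t)(high) & 0x03FF) << 10) | \
                ((uint32_t)(low) & 0x03FF)))

static void sketch_demo(void)
{
    /* U+1F600 is encoded in UTF-16 as the pair D83D DE00. */
    uint16_t high = 0xD83D, low = 0xDE00;
    assert(SKETCH_IS_HIGH_SURROGATE(high));
    assert(SKETCH_IS_LOW_SURROGATE(low));
    assert(SKETCH_JOIN_SURROGATES(high, low) == 0x1F600);
}
```

Because the range checks evaluate their argument more than once, callers of
macros written this way should avoid passing expressions with side effects.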

Regarding the name, it would be fine with me to use
Py_UNICODE_IS_HIGH_SURROGATE.  Other IS* macros don't separate words with
underscores, but JOIN_SURROGATES and other proposed macros (e.g.
PUT_NEXT/WRITE_NEXT) do.  Also, these macros are not related to any existing
API such as isalpha.  I think HIGH/LOW are fine; we can mention lead/trail in
the doc.

Regarding the implementation, we could use Victor's if it's faster and has no
other side effects.

Regarding the other macros:
 * _Py_UNICODE_NEXT and _Py_UNICODE_PUT_NEXT are useful, so once we have agreed
on the names they can go in.  They can be private in all 3 branches and made
public in 3.4 if they work well;
 * IS_NONBMP doesn't simplify the code much but makes it more readable.  ICU
has U_IS_BMP, but in most cases we want to check for non-BMP, so if we add
this macro it might be ok for it to check for non-BMP;
 * I'm not sure HIGH_SURROGATE/LOW_SURROGATE are useful with _Py_UNICODE_NEXT.
If they are, they should get a better name because the current one is not clear
about what they do.
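
To make the discussion of these helpers more concrete, here is a minimal
sketch of the same ideas written as plain C functions over a UTF-16 buffer.
All sketch_* names and signatures are hypothetical, and the behaviour for lone
surrogates (returned unchanged) is an assumption, not something taken from the
proposed macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers; the names and signatures are illustrative only. */
#define SKETCH_IS_NONBMP(cp) ((cp) > 0xFFFF)

/* Surrogate halves that encode a non-BMP code point (cp >= 0x10000). */
static uint16_t sketch_high_surrogate(uint32_t cp)
{
    return (uint16_t)(0xD800 + ((cp - 0x10000) >> 10));
}

static uint16_t sketch_low_surrogate(uint32_t cp)
{
    return (uint16_t)(0xDC00 + ((cp - 0x10000) & 0x03FF));
}

/* Read one code point from [*pp, end) and advance *pp past it.
   Requires *pp < end.  A lone surrogate is returned unchanged
   (an assumption made for this sketch). */
static uint32_t sketch_next(const uint16_t **pp, const uint16_t *end)
{
    uint32_t ch = *(*pp)++;
    if (0xD800 <= ch && ch <= 0xDBFF && *pp < end
        && 0xDC00 <= **pp && **pp <= 0xDFFF) {
        uint32_t low = *(*pp)++;
        ch = 0x10000 + (((ch & 0x03FF) << 10) | (low & 0x03FF));
    }
    return ch;
}

/* Write one code point at *pp and advance *pp.
   The caller must ensure there is room for two code units. */
static void sketch_put_next(uint16_t **pp, uint32_t cp)
{
    if (SKETCH_IS_NONBMP(cp)) {
        *(*pp)++ = sketch_high_surrogate(cp);
        *(*pp)++ = sketch_low_surrogate(cp);
    }
    else {
        *(*pp)++ = (uint16_t)cp;
    }
}
```

In this sketch the high/low helpers are only needed on the writing side, which
is one way to read the doubt above about whether they are useful together with
a NEXT-style reader.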


Unless someone disagrees, I'll prepare a patch that adds
Py_UNICODE_IS_{HIGH_|LOW_|}SURROGATE and Py_UNICODE_JOIN_SURROGATES to
unicodeobject.h, uses them where necessary, and is based on Victor's
implementation, and I'll commit it (after a review).

We can think about the rest later.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10542>
_______________________________________