Daniel Stutzbach <[email protected]> added the comment:
In bltinmodule.c, it looks like some of the indentation doesn't line up?
Bikeshedding aside, it looks good to me.
I agree with Eric Smith that the first part of the macro name usually refers to
the type of the first argument (or the type the first argument points to).
Examples:
- Py_UNICODE_ISSPACE : Py_UNICODE -> int
- Py_UNICODE_TOLOWER : Py_UNICODE -> Py_UNICODE
- Py_UNICODE_strlen: Py_UNICODE * -> size_t
This is true elsewhere in the code as well:
- PyList_GET_SIZE : PyListObject * -> Py_ssize_t
Yes, I know there are some unfortunate exceptions. ;-)
I agree that it would be nice if something in the name hinted that the return
type was Py_UCS4 though.
Marc-Andre Lemburg wrote:
> The first argument of the macro can be any array, not just
> Py_UNICODE*, but also Py_UCS4* or even int*.
It's true that macros in C do not have any type safety. While technically
passing a Py_UCS4 * will work, on a UCS2 build it would needlessly check the
Py_UCS4 data for surrogates. I think we should discourage that.
You can also technically pass a PyListObject * to PyTuple_GET_SIZE, but that's
also not a good idea. ;-)
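A minimal sketch of the type-safety point, assuming a UCS2 build. SKETCH_NEXT and the two count_* helpers are hypothetical names made up for illustration; this is not the macro from the patch. The same macro body compiles for either element type, but the surrogate test is only meaningful for the 16-bit array:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Py_UNICODE_NEXT on a UCS2 build (not the
   macro from the patch).  Reads one code point from [ptr, end),
   advances ptr, and combines a surrogate pair when it finds one. */
#define SKETCH_IS_HIGH(c) (0xD800 <= (c) && (c) <= 0xDBFF)
#define SKETCH_IS_LOW(c)  (0xDC00 <= (c) && (c) <= 0xDFFF)
#define SKETCH_NEXT(ptr, end)                                          \
    ((SKETCH_IS_HIGH((ptr)[0]) && (end) - (ptr) > 1 &&                 \
      SKETCH_IS_LOW((ptr)[1]))                                         \
     ? ((ptr) += 2,                                                    \
        0x10000 + ((((uint32_t)(ptr)[-2] & 0x3FF) << 10) |             \
                   ((uint32_t)(ptr)[-1] & 0x3FF)))                     \
     : (uint32_t)*(ptr)++)

/* Intended use: the array holds UTF-16 code units. */
static size_t count_utf16(const uint16_t *p, const uint16_t *end)
{
    size_t n = 0;
    assert(p <= end);
    while (p < end) {
        (void)SKETCH_NEXT(p, end);
        n++;
    }
    return n;
}

/* Also compiles: the macro does not care about the element type.
   Every element is still tested against the surrogate ranges, even
   though each one is already a complete code point. */
static size_t count_ucs4(const uint32_t *p, const uint32_t *end)
{
    size_t n = 0;
    assert(p <= end);
    while (p < end) {
        (void)SKETCH_NEXT(p, end);
        n++;
    }
    return n;
}
```

Nothing in the compiler stops the second call, which is why the naming convention is the only hint about which array type the macro is meant for.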
Alexander Belopolsky wrote:
> The issue is that once in the process of reading the codepoint, it
> is determined whether the code point is BMP or non-BMP. Testing the
> result again in order to write it is somewhat wasteful. I don't
> think this would matter in practice, but would like to hear
> alternative opinions before moving further.
If the common pattern is:
    ch = Py_UNICODE_NEXT(rp, end);
    uc = Py_UNICODE_SOME_TRANSFORMATION(ch);
    Py_UNICODE_PUT_NEXT(wp, uc);
then the second check for surrogates in Py_UNICODE_PUT_NEXT is necessary, unless
you can prove that Py_UNICODE_SOME_TRANSFORMATION will never transform
characters < 0x10000 into characters >= 0x10000 or vice versa.
Can we prove that this will always be the case, for current and future versions
of Unicode, for all or almost all of the transformations we care about?
Answering that question and figuring out what to do about it is probably more
trouble than it's worth. If a particular point proves to be a bottleneck, we
can always specialize the code there later.
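To make the read/transform/write concern concrete, here is a rough sketch for a UTF-16 buffer. All of the sketch_* names are hypothetical, and sketch_transform is a made-up mapping (not a real Unicode case mapping) chosen only because it moves a BMP character outside the BMP:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader: returns the next code point and advances *rp,
   combining a surrogate pair when one is present. */
static uint32_t sketch_next(const uint16_t **rp, const uint16_t *end)
{
    uint32_t hi, lo;
    assert(*rp < end);
    hi = *(*rp)++;
    if (hi >= 0xD800 && hi <= 0xDBFF && *rp < end) {
        lo = **rp;
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            (*rp)++;
            return 0x10000 + (((hi & 0x3FF) << 10) | (lo & 0x3FF));
        }
    }
    return hi;
}

/* Hypothetical writer: this is the "second check".  It has to look at
   the value again because the transformation may have changed which
   side of 0x10000 the character is on. */
static void sketch_put_next(uint16_t **wp, uint32_t ch)
{
    if (ch >= 0x10000) {
        ch -= 0x10000;
        *(*wp)++ = (uint16_t)(0xD800 | (ch >> 10));
        *(*wp)++ = (uint16_t)(0xDC00 | (ch & 0x3FF));
    }
    else {
        *(*wp)++ = (uint16_t)ch;
    }
}

/* Made-up transformation that crosses the BMP boundary:
   'A' (U+0041) becomes U+1D400; everything else is unchanged. */
static uint32_t sketch_transform(uint32_t ch)
{
    return ch == 0x41 ? 0x1D400 : ch;
}

/* Returns the number of code units written.  The caller must provide
   room for 2 * (end - rp) code units in out. */
static size_t sketch_transform_all(const uint16_t *rp, const uint16_t *end,
                                   uint16_t *out)
{
    uint16_t *wp = out;
    while (rp < end) {
        uint32_t ch = sketch_next(&rp, end);
        uint32_t uc = sketch_transform(ch);
        sketch_put_next(&wp, uc);
    }
    return (size_t)(wp - out);
}
```

Here one input code unit turns into two output code units, so a writer that reused the BMP/non-BMP decision made while reading would emit the wrong thing.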
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue10542>
_______________________________________