Hi Neil,

On Tue, Jun 23, 2020 at 03:47, Neil Schemenauer
<nas-pyt...@arctrix.com> wrote:
> Thanks for putting work into this.

You're welcome. I took some ideas from your tagged pointer proof of
concept ;-) I recall that we ran into the same C API issues in our
experiments ;-)


> Changes must be made for
> well founded reasons and not just because we think it makes a
> "cleaner" API.  I believe you are following those principles.

I mostly used the tagged pointer as a concrete goal to decide which
changes are required and which are not. PyPy and HPy developers also
told me which APIs they would like to see disappear :-)


> One aspect of the API that could be improved is memory management
> for PyObjects.  The current API is quite a mess and for no good
> reason except legacy, IMHO.  The original API design allowed
> extension types to use their own memory allocator.  E.g. they could
> call their own malloc()/free() implementation and the rest of the
> CPython runtime would handle that.  One consequence is that
> Py_DECREF() cannot call PyObject_Free() but instead has to call
> tp_dealloc().  There was supposed to be multiple layers of
> allocators, PyMem vs PyObject, but since the layering was not
> enforced, we ended up with a bunch of aliases to the same underlying
> function.

I vaguely recall someone explaining that the Python memory allocator
caused high memory fragmentation, and that using a dedicated memory
allocator was far more efficient. But I concur that the majority of
people never override the default tp_new and tp_free functions.

By the way, in Python 3.8, heap types started to increase their
reference counter when an instance is created, but decrementing the
type reference counter is the responsibility of the tp_dealloc
function and we failed to find a way to automate it.

More info on this issue:

* https://bugs.python.org/issue35810
* https://bugs.python.org/issue40217
* https://docs.python.org/dev/whatsnew/3.9.html#changes-in-the-c-api

C extension maintainers now have to update their tp_dealloc method,
or their application will never be able to destroy their heap types.
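
To make the bookkeeping concrete, here is a toy model in plain C. It
does not use the real CPython structs: ToyType, ToyObject and the
toy_* functions are made-up names that only mimic the reference
counting described above. The comments say which real API each step
stands in for.

```c
#include <stdlib.h>

/* Toy stand-ins for PyTypeObject and PyObject (hypothetical names). */
typedef struct ToyType {
    long refcnt;        /* plays the role of the heap type's refcount */
} ToyType;

typedef struct ToyObject {
    ToyType *type;      /* plays the role of Py_TYPE(obj) */
} ToyObject;

/* Allocation: since Python 3.8, creating an instance of a heap type
   takes a strong reference to the type. */
ToyObject *toy_new(ToyType *type)
{
    ToyObject *obj = malloc(sizeof(*obj));
    if (obj == NULL) {
        return NULL;
    }
    obj->type = type;
    type->refcnt++;
    return obj;
}

/* Updated dealloc: read the type first, free the instance, then give
   the reference back. Real code would use Py_TYPE(), the type's
   tp_free and Py_DECREF() for these three steps. */
void toy_dealloc(ToyObject *obj)
{
    ToyType *type = obj->type;
    free(obj);
    type->refcnt--;
}

/* Dealloc that was not updated: the instance is freed, but the type's
   count never drops back, so the type can never be destroyed. */
void toy_dealloc_legacy(ToyObject *obj)
{
    free(obj);
}
```

In this model, each instance released through toy_dealloc_legacy()
leaves the type's count one too high, which is the leak the issues
above are about.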


> Perhaps there are a few cases when the flexibility to use a custom
> object allocator is useful.  I think in practice it is very rare
> that an extension needs to manage memory itself.  To achieve
> something similar, allow a PyObject to have a reference to some
> externally managed resource and then the tp_del method would take
> care of freeing it.  IMHO, the Python runtime should be in charge of
> allocating and freeing PyObject memory.

Do you think that it should be in PEP 620 or can it be done
independently? I don't know how to implement it, I have no idea how
many C extensions would be broken, etc.

I don't see an obvious relationship between hiding tp_del/tp_free and
interoperability with other Python implementations or the stable ABI.

While making object allocation and deallocation simpler would be nice,
it doesn't seem "required" in PEP 620 for now. What do you think?


> Another place for improvement is that the C API is unnecessarily
> large.  E.g. we don't really need PyList_GetItem(),
> PyTuple_GetItem(), and PyObject_GetItem().  Every extra API is a
> potential leak of implementation details and a burden for
> alternative VMs.  Maybe we should introduce something like
> WIN32_LEAN_AND_MEAN that hides all the extra stuff.  The
> Py_LIMITED_API define doesn't really mean the same thing since it
> tries to give ABI compatibility.  It would make sense to cooperate
> with the HPy project on deciding what parts are unnecessary.  Things
> like Cython might still want to use the larger API, to extract every
> bit of performance.  The vast majority of C extensions don't require
> that.

At the beginning, I had a plan to remove all functions and only keep
"abstract" functions like PyObject_GetItem().

Then someone asked what the performance overhead of only using
abstract functions would be. I couldn't answer. Also, I didn't see a
need to only use abstract functions for now, so I abandoned this idea.

PyTuple_GetItem() returns a borrowed reference, which is bad, whereas
PyObject_GetItem() returns a strong reference. Since PyPy cpyext
already solved this problem, I chose to leave the borrowed references
problem aside for now. Trying to fix all issues at once doesn't work
:-)
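
For readers less familiar with the two conventions, here is a small
sketch in plain C. It is only a model: ToyItem, ToyTuple and the
toy_* functions are invented for illustration and are not CPython
APIs. A borrowed getter hands out a pointer without touching the
reference count, while a strong getter increments it and makes the
caller responsible for releasing it.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a refcounted item and a tuple. */
typedef struct ToyItem {
    long refcnt;
} ToyItem;

typedef struct ToyTuple {
    ToyItem *items[4];
    size_t size;
} ToyTuple;

/* Borrowed reference, in the style of PyTuple_GetItem(): the count is
   untouched, so the pointer is only safe while the container keeps
   the item alive. */
ToyItem *toy_get_borrowed(ToyTuple *t, size_t i)
{
    return (i < t->size) ? t->items[i] : NULL;
}

/* Strong reference, in the style of PyObject_GetItem(): the caller
   owns one reference and must release it. */
ToyItem *toy_get_strong(ToyTuple *t, size_t i)
{
    ToyItem *item = toy_get_borrowed(t, i);
    if (item != NULL) {
        item->refcnt++;
    }
    return item;
}

/* Release a strong reference (the role Py_DECREF() plays). */
void toy_release(ToyItem *item)
{
    item->refcnt--;
}
```

The borrowed form is cheaper but ties the pointer's validity to the
container's internals, which is why it is awkward for implementations
that do not store items as plain refcounted pointers.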

One issue with calling PyTuple_GetItem() or PyDict_GetItem() is that
it doesn't take into account the ability to override __getitem__() in
a subclass. Few developers write the correct code, like:

            if (PyDict_CheckExact(ns))
                err = PyDict_SetItem(ns, name, v);
            else
                err = PyObject_SetItem(ns, name, v);
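
The same dispatch can be modeled without CPython at all. The sketch
below is plain C with made-up names (ToyMapping and the toy_*
functions are hypothetical): an "exact" mapping takes the concrete
fast path, and anything with an overridden slot goes through the
slot, the way the abstract PyObject_SetItem() call would.

```c
#include <stddef.h>

typedef struct ToyMapping ToyMapping;
typedef int (*toy_setitem_fn)(ToyMapping *m, int key, int value);

struct ToyMapping {
    toy_setitem_fn setitem;  /* slot that a "subclass" may override */
    int slots[8];            /* toy storage: key is an index 0..7 */
    int override_calls;      /* counts calls to the override */
};

/* Concrete fast path, in the style of PyDict_SetItem(): writes the
   storage directly and never consults the slot. */
int toy_dict_setitem(ToyMapping *m, int key, int value)
{
    if (key < 0 || key >= 8) {
        return -1;
    }
    m->slots[key] = value;
    return 0;
}

/* Subclass-style override, standing in for a Python-level
   __setitem__() defined on a dict subclass. */
int toy_logging_setitem(ToyMapping *m, int key, int value)
{
    m->override_calls++;
    return toy_dict_setitem(m, key, value);
}

/* Exact-type check, in the style of PyDict_CheckExact(). */
int toy_check_exact(const ToyMapping *m)
{
    return m->setitem == toy_dict_setitem;
}

/* Same shape as the snippet above: fast path only for the exact
   type, otherwise go through the overridable slot. */
int toy_store(ToyMapping *m, int key, int value)
{
    if (toy_check_exact(m)) {
        return toy_dict_setitem(m, key, value);
    }
    return m->setitem(m, key, value);
}
```

Calling toy_dict_setitem() directly on the "subclass" still stores
the value but silently skips the override, which is the bug the
exact-type check avoids.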

PEP 620 is already quite long and introduces many incompatible
changes. I tried to make the PEP as short as possible and to minimize
the number of incompatible C API changes.

Using Py_LIMITED_API provides a stable ABI, but it doesn't reduce the
Python maintenance burden, and other Python implementations must
continue to implement the full C API since C extensions actually use
it.

Unless we make Py_LIMITED_API (or another new macro to reduce the C
API size) the default, there is no benefit for CPython or for other
Python implementations. Also, only very few extensions use
Py_LIMITED_API, even though it has existed since Python 3.2 (released
in 2011).

As I wrote in the introduction, PEP 620 is my third attempt.
Previous attempts tried to keep backward compatibility and were based
on an "opt-in" option (I want to use the new limited C API because the
carrot looks delicious!). But IMO there is a high risk that developers
don't opt in (the carrot isn't as good as I expected :-( ) if there is
little benefit in the short term, and then it doesn't reduce the
maintenance burden. Also, having two C APIs may explode the test
matrix, and some people didn't like that.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZKXM2CIW7AEC26YYCVQUUX2LNQPWDIQY/
Code of Conduct: http://python.org/psf/codeofconduct/
