[Python-Dev] Re: PEP: Modify the C API to hide implementation details

Steve Dower Fri, 10 Apr 2020 15:39:45 -0700

On 10Apr2020 2055, Antoine Pitrou wrote:

On Fri, 10 Apr 2020 19:20:00 +0200
Victor Stinner <vstin...@python.org> wrote:


Note: Cython and cffi should be preferred to write new C extensions.
This PEP is about existing C extensions which cannot be rewritten with
Cython.


Using Cython does not make the C API irrelevant.  In some
applications, the C API has to be low-level enough for performance.
Whether the application is written in Cython or not.


It does to the code author.

The point here is that we want authors who insist on coding against the C API to be aware that they have fewer compatibility guarantees - maybe even to the point of needing to rebuild for each minor version if you want to insist on using macros (i.e. anything starting with "_Py").

Examples of issues to make structures opaque:

* ``PyGC_Head``: https://bugs.python.org/issue40241
* ``PyObject``: https://bugs.python.org/issue39573
* ``PyTypeObject``: https://bugs.python.org/issue40170


How do you keep fast type checking such as PyTuple_Check() if extension
code doesn't have access e.g. to tp_flags?

Measured in isolation, sure. But what task are you doing that is being held up by builtin type checks?

If the type check is the bottleneck, you need to work on more interesting algorithms ;)
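As a rough illustration of a type check that does not read tp_flags from the structure directly, here is a minimal sketch that goes through the exported function PyType_GetFlags() via ctypes. The helper name is_tuple_type is hypothetical, and the flag value is copied by hand from CPython's headers (Py_TPFLAGS_TUPLE_SUBCLASS, assumed to be 1 << 26). The ctypes call overhead says nothing about the cost of the same call made from C.

```python
import ctypes

# Assumption: value of Py_TPFLAGS_TUPLE_SUBCLASS as defined in CPython's
# Include/object.h. It is copied by hand here, not read from the headers.
PY_TPFLAGS_TUPLE_SUBCLASS = 1 << 26

# PyType_GetFlags() is an exported function, so a caller needs no knowledge
# of the PyTypeObject layout to obtain the flags.
_get_flags = ctypes.pythonapi.PyType_GetFlags
_get_flags.argtypes = [ctypes.py_object]
_get_flags.restype = ctypes.c_ulong


def is_tuple_type(tp):
    """Hypothetical helper: True if tp is tuple or a subclass of tuple."""
    return bool(_get_flags(tp) & PY_TPFLAGS_TUPLE_SUBCLASS)


class Point(tuple):
    """A tuple subclass, used only to exercise the flag check."""


if __name__ == "__main__":
    for tp in (tuple, Point, list):
        print(tp.__name__, is_tuple_type(tp))
```

The check costs one function call plus a bit test, which is the trade being argued about above: slightly more work per check in exchange for not depending on the structure layout.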

I notice you did:
"""
Add fast inlined version _PyType_HasFeature() and _PyType_IS_GC()
for object.c and typeobject.c.
"""

So you understand there is a need.


These are private APIs.

**Backward compatibility:** backward incompatible on purpose. Break the
limited C API and the stable ABI, with the assumption that `Most C
extensions don't rely directly on CPython internals`_ and so will remain
compatible.


The problem here is not only compatibility but potential performance
regressions in C extensions.

I don't think we've ever guaranteed performance between releases. Correctness, sure, but not performance.

New optimized CPython runtime
==============================

Backward incompatible changes is such a pain for the whole Python
community. To ease the migration (accelerate adoption of the new C
API), one option is to provide not only one but two CPython runtimes:

* Regular CPython: fully backward compatible, support direct access to
   structures like ``PyObject``, etc.
* New optimized CPython: incompatible, cannot import C extensions which
   don't use the limited C API, has new optimizations, limited to the C
   API.


Well, this sounds like a distribution nightmare.  Some packages will
only be available for one runtime and not the other.  It will confuse
non-expert users.

Agreed (except that it will also confuse expert users). Doing "Python4"-by-stealth like this is a terrible idea.

If it's incompatible, give it a new version number. If you don't want a new version number, maintain compatibility. There are no alternatives.

O(1) bytearray to bytes conversion
..................................

Convert bytearray to bytes without memory copy.

Currently, bytearray is used to build a bytes string, but it's usually
converted into a bytes object to respect an API. This conversion
requires to allocate a new memory block and copy data (O(n) complexity).

It is possible to implement O(1) conversion if it would be possible to
pass the ownership of the bytearray object to bytes.

That requires modifying the ``PyBytesObject`` structure to support
multiple storages (support storing content into a separate memory
block).


If that's desirable (I'm not sure it is), there is a simpler solution:
instead of allocating a raw memory area, bytearray could allocate... a
private bytes object that you can detach without copying it.

Yeah, I don't see the point in this one, unless you mean a purely internal change. Is this a major bottleneck?

Having a broader concept of "freezable" objects may be a valuable thing to enable in a new runtime, but retrofitting it to CPython doesn't seem likely to have a big impact.
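For reference, the copy being discussed is visible from Python. This is a minimal sketch of today's behaviour, not of the proposed change; the function name build_payload and the sample data are made up for illustration.

```python
def build_payload(chunks):
    """Accumulate chunks in a mutable bytearray (hypothetical helper)."""
    buf = bytearray()
    for chunk in chunks:
        buf += chunk
    return buf


buf = build_payload([b"spam", b"eggs"])

# bytes(buf) allocates a new block and copies the content: O(n).
frozen = bytes(buf)

# A memoryview shares the bytearray's memory without copying, but it is
# not a bytes object, so it does not satisfy an API that requires bytes.
view = memoryview(buf)

# Mutating the bytearray afterwards shows which objects share storage.
buf[0] = ord("S")

if __name__ == "__main__":
    print(frozen)       # the independent copy
    print(bytes(view))  # the current content of the shared buffer
```

After the mutation, frozen still holds the original content because it is an independent copy, while the view reflects the change.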

Fork and "Copy-on-Read" problem
...............................

Solve the "Copy on read" problem with fork: store reference counter
outside ``PyObject``.


Nowadays it is strongly recommended to use multiprocessing with the
"forkserver" start method:
https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

With "forkserver", the forked process is extremely lightweight and
there are little savings to be made in the child.

Unfortunately, a recommendation that only applies to a minority of Python users. Oh well.
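A minimal sketch of what selecting "forkserver" looks like in practice, with a fallback for platforms where it is not offered. The helper names pick_start_method, square and run_demo are hypothetical; the fallback to "spawn" is an assumption made for this example, chosen because "spawn" is available on every platform.

```python
import multiprocessing as mp


def pick_start_method(preferred="forkserver"):
    """Return preferred if this platform offers it, else fall back to spawn."""
    available = mp.get_all_start_methods()
    return preferred if preferred in available else "spawn"


def square(x):
    """Trivial worker function; it must be importable by the child processes."""
    return x * x


def run_demo():
    """Run a small pool using an explicit context, not the global default."""
    ctx = mp.get_context(pick_start_method())
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, range(5))
```

run_demo() should be called from under an `if __name__ == "__main__":` guard in a script, since both "forkserver" and "spawn" import the main module in the child processes.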

Separating refcounts theoretically improves cache locality, specifically the case where cache invalidation impacts multiple CPUs (and even the case where a single thread moves between CPUs). But I don't think there's been a convincing real benchmark of this yet.
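The underlying issue is that merely taking a reference to an object updates a counter associated with that object, which in current CPython lives in the object's own header. The sketch below only shows the counter moving; it does not measure cache or page effects. The helper name refcount_delta is hypothetical, and the exact numbers reported by sys.getrefcount() are a CPython implementation detail.

```python
import sys


def refcount_delta(obj, extra_refs):
    """Return how much obj's refcount grows while extra_refs references exist."""
    before = sys.getrefcount(obj)
    holders = [obj for _ in range(extra_refs)]  # only reads obj, never mutates it
    after = sys.getrefcount(obj)
    del holders
    return after - before


if __name__ == "__main__":
    payload = object()
    print(refcount_delta(payload, 3))
```

Nothing in the loop modifies the object logically, yet its counter changes, which is why a forked child that only reads shared objects can still end up with private copies of the pages holding them.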

Debug runtime and remove debug checks in release mode
.....................................................

If the C extensions are no longer tied to CPython internals, it becomes
possible to switch to a Python runtime built in debug mode to enable
runtime debug checks to ease debugging C extensions.


That's the one convincing feature in this PEP, as far as I'm concerned.

Eh, this assumes that someone is fully capable of rebuilding CPython and their own extension, but not one of their dependencies, and this code that they're using doesn't have any system dependencies that differ in debug builds (spoiler: they do). That seems like an oddly specific scenario that we don't really have to support.
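For anyone trying this, a quick way to tell which kind of runtime is in use is sketched below. It relies on sys.gettotalrefcount(), which as far as I know is only present when CPython is built with reference debugging (as a debug build is); the helper name is_debug_build is hypothetical.

```python
import sys


def is_debug_build():
    """Best-effort check: sys.gettotalrefcount exists only in debug builds."""
    return hasattr(sys, "gettotalrefcount")


if __name__ == "__main__":
    kind = "debug" if is_debug_build() else "release"
    print(f"Running on a {kind} build of Python {sys.version_info.major}.{sys.version_info.minor}")
```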


I'd love to hear what Victor's working on that makes him so keen on this :)

All up, moving the C API away from macros and direct structure member access to *real* (not "static inline") functions is a very good thing for compatibility. Personally I'd like to go to a function table model to support future extensibility and back-compat shims, but first we have to deal with the macros.
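One way to see the difference between a real function and a macro is to look for the symbol in the running interpreter. This is a minimal sketch using ctypes; the helper name is_exported is hypothetical, and the expectation that PyTuple_GET_SIZE is not an exported symbol is an assumption about typical CPython builds, so no result is stated for it.

```python
import ctypes


def is_exported(name):
    """True if the running interpreter exports a C symbol with this name."""
    return hasattr(ctypes.pythonapi, name)


# PyTuple_Size() is a real function: any consumer can call it by name,
# without compiling against the PyTupleObject layout.
tuple_size = ctypes.pythonapi.PyTuple_Size
tuple_size.argtypes = [ctypes.py_object]
tuple_size.restype = ctypes.c_ssize_t

if __name__ == "__main__":
    print(tuple_size((1, 2, 3)))
    # Macros and static inline functions exist only in the headers, so
    # they are not expected to be found at runtime.
    for name in ("PyTuple_Size", "PyTuple_GET_SIZE"):
        print(name, is_exported(name))
```

A function resolved by name at runtime keeps working when the structure layout changes, whereas code compiled against a macro has the layout baked in and needs a rebuild.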


Cheers,
Steve
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IEFT3PTWCAASZJ6XW7VGSIJUFTLVEWEY/
Code of Conduct: http://python.org/psf/codeofconduct/
