On Fri, 10 Apr 2020 19:20:00 +0200
Victor Stinner <vstin...@python.org> wrote:
>
> Note: Cython and cffi should be preferred to write new C extensions.
> This PEP is about existing C extensions which cannot be rewritten with
> Cython.

Using Cython does not make the C API irrelevant.  In some
applications, the C API has to be low-level enough for performance,
whether or not the application is written in Cython.

> **Status:** not started. The performance overhead must be measured with
> benchmarks and this PEP should be accepted.

Surely you mean "before this PEP should be accepted"?

> Examples of issues to make structures opaque:
>
> * ``PyGC_Head``: https://bugs.python.org/issue40241
> * ``PyObject``: https://bugs.python.org/issue39573
> * ``PyTypeObject``: https://bugs.python.org/issue40170

How do you keep fast type checking such as PyTuple_Check() if
extension code doesn't have access e.g. to tp_flags?  I notice you
did:

"""
Add fast inlined versions _PyType_HasFeature() and _PyType_IS_GC()
for object.c and typeobject.c.
"""

So you understand there is a need.
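For reference, here is roughly what the difference looks like (a
sketch based on the current headers; the real macros differ in detail
across CPython versions):

    #include <Python.h>

    /* Fast path: with full access to PyTypeObject, a feature check is
       a single inlined memory read.  PyTuple_Check() boils down to
       this. */
    static inline int
    has_feature_fast(PyTypeObject *type, unsigned long feature)
    {
        return (type->tp_flags & feature) != 0;
    }

    /* Opaque path: once tp_flags is hidden, as in the limited C API,
       the flags must be fetched through an out-of-line call to
       PyType_GetFlags(). */
    static inline int
    has_feature_opaque(PyTypeObject *type, unsigned long feature)
    {
        return (PyType_GetFlags(type) & feature) != 0;
    }

    /* PyTuple_Check(op) is essentially: */
    static inline int
    tuple_check(PyObject *op)
    {
        return has_feature_fast(Py_TYPE(op), Py_TPFLAGS_TUPLE_SUBCLASS);
    }

Every Py*_Check() in a hot loop pays that difference once tp_flags can
no longer be read inline.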
> **Backward compatibility:** backward incompatible on purpose. Break the
> limited C API and the stable ABI, with the assumption that `Most C
> extensions don't rely directly on CPython internals`_ and so will remain
> compatible.

The problem here is not only compatibility but potential performance
regressions in C extensions.

> New optimized CPython runtime
> ==============================
>
> Backward incompatible changes are such a pain for the whole Python
> community. To ease the migration (accelerate adoption of the new C
> API), one option is to provide not one but two CPython runtimes:
>
> * Regular CPython: fully backward compatible, supports direct access
>   to structures like ``PyObject``, etc.
> * New optimized CPython: incompatible, cannot import C extensions
>   which don't use the limited C API, has new optimizations, is
>   restricted to the limited C API.

Well, this sounds like a distribution nightmare.  Some packages will
only be available for one runtime and not the other.  It will confuse
non-expert users.

> O(1) bytearray to bytes conversion
> ..................................
>
> Convert bytearray to bytes without memory copy.
>
> Currently, bytearray is used to build a bytes string, but it's usually
> converted into a bytes object to respect an API. This conversion
> requires allocating a new memory block and copying data (O(n)
> complexity).
>
> It would be possible to implement an O(1) conversion if the ownership
> of the bytearray buffer could be passed to the bytes object.
>
> That requires modifying the ``PyBytesObject`` structure to support
> multiple storages (support storing content in a separate memory
> block).

If that's desirable (I'm not sure it is), there is a simpler solution:
instead of allocating a raw memory area, bytearray could allocate... a
private bytes object that you can detach without copying it.

But really, this is why we have BytesIO, which already uses that exact
strategy: allocate a private bytes object.

> Fork and "Copy-on-Read" problem
> ...............................
>
> Solve the "Copy on read" problem with fork: store the reference
> counter outside ``PyObject``.

Nowadays it is strongly recommended to use multiprocessing with the
"forkserver" start method:
https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

With "forkserver", the forked process is extremely lightweight and
there are few savings to be made in the child.

> `Dismissing Python Garbage Collection at Instagram
> <https://engineering.instagram.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172>`_
> (Jan 2017) by Instagram Engineering.
>
> Instagram contributed `gc.freeze()
> <https://docs.python.org/dev/library/gc.html#gc.freeze>`_ to Python 3.7
> which works around the issue.
>
> One solution for that would be to store reference counters outside
> ``PyObject``: for example, in a separate hash table mapping each
> object pointer to its reference counter. Changing ``PyObject``
> structures requires that C extensions don't access them directly.

You're planning to introduce a large overhead for each reference count
lookup just to satisfy a rather niche use case?  CPython probably does
millions of reference count operations per second.
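To make the cost concrete, here is a toy sketch of one possible
out-of-line scheme; the table and refcnt_lookup() are entirely
hypothetical, made up for illustration (a real design would need
resizing, deletion and thread safety):

    #include <Python.h>
    #include <stdint.h>

    /* Toy open-addressing hash table mapping an object pointer to its
       reference counter.  Purely illustrative. */
    #define TABLE_SIZE 4096            /* must be a power of two */

    static struct {
        const void *key;
        Py_ssize_t count;
    } table[TABLE_SIZE];

    static Py_ssize_t *
    refcnt_lookup(const void *obj)
    {
        size_t i = ((uintptr_t)obj >> 4) & (TABLE_SIZE - 1);
        while (table[i].key != NULL && table[i].key != obj)
            i = (i + 1) & (TABLE_SIZE - 1);    /* linear probing */
        table[i].key = obj;
        return &table[i].count;
    }

    /* Today, Py_INCREF(op) inlines to a single increment of
       op->ob_refcnt.  With the counter moved out of PyObject, every
       INCREF pays for a hash and a probe first: */
    static inline void
    incref_external(PyObject *op)
    {
        ++*refcnt_lookup(op);
    }

Even with a good hash function, that is a hash computation and at
least one extra dependent memory access where there used to be a
single increment.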
> Debug runtime and remove debug checks in release mode
> .....................................................
>
> If the C extensions are no longer tied to CPython internals, it becomes
> possible to switch to a Python runtime built in debug mode to enable
> runtime debug checks to ease debugging C extensions.

That's the one convincing feature in this PEP, as far as I'm concerned.

Regards

Antoine.