On 22/8/22 18:59, Eric Snow wrote:
Hi all,

CPython has supported multiple interpreters (in the same process) for
a long time, but only through the C-API.  I'm working on exposing that
functionality to Python code (see PEP 554), aiming for 3.12.  I expect
that users will find the feature useful (particularly with a
per-interpreter GIL--see PEP 684) and that it will be used a lot more
over the coming years.  This has the potential to impact extension
module projects, especially large ones like numpy, which is why I'm
reaching out to you.

Use of multiple interpreters depends on isolation between them.  When
an extension module is imported in multiple interpreters, it is loaded
separately into a new module object in each.  Extensions often store
module data/state in C globals, which means the multiple instances
end up sharing data.  This causes problems, more so once we have one
GIL per interpreter.
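
The sharing problem described above can be modeled roughly in pure Python. This is only an analogy, not the CPython C-API: a dict stands in for a C static variable, and `types.ModuleType` plus `exec` stand in for importing the same extension into two interpreters. All names here (`PROCESS_GLOBAL`, `TOY_SOURCE`, `load_instance`, the `bump_*` functions) are made up for illustration.

```python
import types

# Stand-in for a C static variable: a single copy exists per process,
# so every module instance sees the same object. (Hypothetical name.)
PROCESS_GLOBAL = {"counter": 0}

# Source for a toy "extension"; all names in it are made up for illustration.
TOY_SOURCE = """
counter = 0  # per-module state: lives in this module object's own namespace

def bump_module_state():
    global counter
    counter += 1
    return counter

def bump_process_global():
    shared["counter"] += 1
    return shared["counter"]
"""


def load_instance(name):
    """Create a fresh module object, roughly as each interpreter would on import."""
    module = types.ModuleType(name)
    module.shared = PROCESS_GLOBAL  # models a reference to the C global
    exec(TOY_SOURCE, module.__dict__)
    return module


first = load_instance("toy_in_interp_1")
second = load_instance("toy_in_interp_2")

# Per-module state: each instance counts independently.
module_counts = (first.bump_module_state(), second.bump_module_state())

# Process-global state: the second instance sees the first one's update.
global_counts = (first.bump_process_global(), second.bump_process_global())

print(module_counts)  # (1, 1)
print(global_counts)  # (1, 2)
```

In a real extension the shared object would be a C global touched by two interpreters, and with one GIL per interpreter nothing would serialize those accesses.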

Over the years we have added machinery to help extensions get the
necessary isolation, moving away from global variables.  This includes
PEPs 384, 3121, and 489.  This has culminated in the guide you can
find in PEP 630.

Note that nothing should change when only a single interpreter is in
use (basically the status quo).  With PEP 684, importing an
incompatible extension outside the main (initial) interpreter will now
be an ImportError.  (Currently the behavior is undefined and too often
results in hard-to-debug failures and crashes.)

Thus extension module maintainers do have the option to *not* support
multiple interpreters.  Unfortunately, that doesn't mean their users
won't pester them about adding support.  We all recognize how that
dynamic can be draining on a project.  The potential burden on
maintainers is a serious factor for these upcoming changes.  numpy is
likely to be affected more than any other project.  That's why I'm
starting this thread.

PEP 684 discusses all of the above.  What I'm after with this thread is:

* to make sure the numpy maintainers are clear on what interpreter
isolation requires of the project
* a clear picture of what changes numpy would need (and how much work
that would be)
* feedback on what the CPython team can do to minimize that work
(incl. adding new C APIs)

I'm fine with having the discussion here, but I will probably create a
new category on discuss.python.org for a variety of similar threads
related to multiple interpreters and supporting them.  Having our
discussion there may lead to more participation from more CPython core
devs than just me.  Do you have any preference for or against any
particular venue?

Thanks!

-eric
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: matti.pi...@gmail.com

Thanks for starting the conversation. I would personally prefer that the discussion about NumPy stay here; general discussions could happen elsewhere.


Please correct me if I am wrong: I understand that multiple interpreters would require us to (at least):

- refactor all the static module-global state in NumPy to make it re-entrant or immortal, including converting statically allocated PyTypeObjects to heap types

- find a mechanism to access the per-interpreter module state

- carefully consider places in the code where we steal references, either intentionally or because the CPython C-API we are using does so

- measure the performance implications of the necessary changes

- plan forward/backward compatibility


This seems like a significant undertaking, which is why we have rejected casual calls for supporting multiple interpreters in the past [2], [3], [4]. Supporting multiple interpreters is currently not on the NumPy roadmap [0]. Priorities can be changed through dialog with the NumPy community, and others can propose changes to NumPy via NEPs, PRs, and issues, but we are unlikely to engage directly in the work if it is not an agreed-upon goal. There are other initiatives around NumPy that may dovetail with multiple interpreters. For instance, the HPy group hit many of the issues above when creating a port of NumPy [5]. It would be good to get like-minded people talking about this and to pool resources. Maybe someone on this list has a strong opinion and would be willing to put in some work on the subject.


One thing CPython could do is to provide clear documentation on how to port a small C extension module [1].


Matti


[0] https://numpy.org/neps/roadmap.html

[1] https://github.com/python/cpython/issues/79601

[2] https://github.com/numpy/numpy/issues/665

[3] https://github.com/numpy/numpy/issues/14384

[4] https://github.com/numpy/numpy/issues/16963

[5] https://github.com/hpyproject/numpy-hpy/tree/graal-team/hpy#readme
