[Python-Dev] Re: My take on multiple interpreters (Was: Should we be making so many changes in pursuit of PEP 554?)

Mark Shannon Wed, 10 Jun 2020 05:44:04 -0700

Hi Petr,

On 09/06/2020 2:24 pm, Petr Viktorin wrote:

On 2020-06-05 16:32, Mark Shannon wrote:
Hi,
There have been a lot of changes both to the C API and to internal implementations to allow multiple interpreters in a single O/S process.
These changes cause backwards compatibility changes, have a negative performance impact, and cause a lot of churn.
While I'm in favour of PEP 554, or some similar model for parallelism in Python, I am opposed to the changes we are currently making to support it.
What are sub-interpreters?
--------------------------
A sub-interpreter is a logically independent Python process which supports inter-interpreter communication built on shared memory and channels. Passing of Python objects is supported, but only by copying, not by reference. Data can be shared via buffers.
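
To illustrate "only by copying, not by reference", here is a minimal sketch. It does not use any real PEP 554 channel API; `send_by_copy` is a hypothetical stand-in that uses a pickle round trip to mimic what a receiving interpreter would see.

```python
import pickle


def send_by_copy(obj):
    """Hypothetical stand-in for a channel: serialize, then rebuild a new object.

    Assumption: real channels would do their own copying; pickle is only
    used here to make the copy semantics visible.
    """
    return pickle.loads(pickle.dumps(obj))


original = {"items": [1, 2, 3]}
received = send_by_copy(original)

# Mutating the received copy does not touch the sender's object.
received["items"].append(4)
```

After this runs, `original["items"]` still has three elements, while `received["items"]` has four, because the two dictionaries share nothing.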
Here's my biased take on the subject:
Interpreters are contexts in which Python runs. They contain configuration (e.g. the import path) and runtime state (e.g. the set of imported modules). An interpreter is created at Python startup (Py_InitializeEx), and you can create/destroy additional ones with Py_NewInterpreter/Py_EndInterpreter.
This is long-standing API that is used, most notably by mod_wsgi.
Many extension modules and some stdlib modules don't play well with the existence of multiple interpreters in a process, mainly because they use process-global state (C static variables) rather than some more granular scope. This tends to result in nasty bugs (C-level crashes) when multiple interpreters are started in parallel (Py_NewInterpreter) or in sequence (several Py_InitializeEx/Py_FinalizeEx cycles). The bugs are similar in both cases.
Whether Python interpreters run sequentially or in parallel, having them work will enable a use case I would like to see: allowing me to call Python code from wherever I want, without thinking about global state. Think calling Python from a utility library that doesn't care about the rest of the application it's used in. I personally call this "the Lua use case", because light-weight, worry-free embedding is an area where Python loses to Lua. (And JS as well; that's a relatively recent development, but much more worrying.)

This seems like a worthwhile goal. However, I don't see why this requires having multiple Python interpreters in a single O/S process.

The part I have been involved in is moving away from process-global state. Process-global state can be made to work, but it is much safer to always default to module-local state (roughly what Python-language's `global` means), and treat process-global state as exceptions one has to think through. The API introduced in PEPs 384, 489, 573 (and future planned ones) aims to make module-local state possible to use, then later easy to use, and the natural default.
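
As a rough Python-level analogy for module-local state (not the C API from the PEPs above), the sketch below builds two module objects from the same source. The names `SOURCE`, `load_copy`, `demo_a` and `demo_b` are made up for illustration. Each copy keeps its own `counter`, which is roughly what the `global` keyword refers to.

```python
import types

# Source for a tiny module that keeps state in a module-level variable.
SOURCE = """
counter = 0

def bump():
    global counter
    counter += 1
    return counter
"""


def load_copy(name):
    """Create a fresh module object and run SOURCE in its own namespace."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module


first = load_copy("demo_a")
second = load_copy("demo_b")

first.bump()
first.bump()
second.bump()
```

Here `first.counter` ends up at 2 and `second.counter` at 1: the state belongs to each module object rather than to the process. A C static variable, by contrast, would be one shared value for every copy.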


I don't agree. Process level state is *much* safer than module-local state.

Suppose two interpreters have both imported the same module.

By using O/S processes to keep the interpreters separate, the hardware prevents the two copies of the module from interfering with each other. By sharing an address space, the separation is maintained by trust and hoping that third-party modules don't have too many bugs.


I don't see how you can claim the latter case is safer.
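
A small sketch of the process-separation argument, assuming a child Python started with `subprocess` stands in for "an interpreter in its own O/S process". The helper `run_child` and the `CHILD_CODE` string are illustrative names only. The child changes its working directory, which is process-global state, and the parent is unaffected.

```python
import os
import subprocess
import sys
import tempfile

# Code run by the child: change process-global state (the working directory)
# and report where it ended up.
CHILD_CODE = "import os, sys; os.chdir(sys.argv[1]); print(os.getcwd())"


def run_child(target_dir):
    """Run CHILD_CODE in a separate Python process and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, target_dir],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


parent_cwd_before = os.getcwd()
with tempfile.TemporaryDirectory() as scratch:
    child_cwd = run_child(scratch)
    # Compare resolved paths, since temp directories may sit behind symlinks.
    child_moved = os.path.realpath(child_cwd) == os.path.realpath(scratch)
parent_cwd_after = os.getcwd()
```

The child really did move (`child_moved` is true), yet `parent_cwd_before` and `parent_cwd_after` are identical. The O/S enforced that separation; no cooperation from the code running in the child was needed.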

Relatively recently, there is an effort to expose interpreter creation & finalization from Python code, and also to allow communication between them (starting with something rudimentary, sharing buffers). There is also a push to explore making the GIL per-interpreter, which ties in to moving away from process-global state. Both are interesting ideas, but (like banishing global state) not the whole motivation for changes/additions. It's probably possible to do similar things with threads or subprocesses, sure, but if these efforts went away, the other issues would remain.


What other issues? Please be specific.

I am not too fond of the term "sub-interpreters", because it implies some kind of hierarchy. Of course, if interpreter creation is exposed to Python, you need some kind of "parent" to start the "child" and get its result when done. Also, due to some practical issues you might (sadly, currently) need some notion of "the main interpreter". But ideally, we can make interpreters entirely independent to allow the "Lua use case". In the end-game of these efforts, I see Py_NewInterpreter transparently calling Py_InitializeEx if global state isn't set up yet, and similarly, Py_EndInterpreter turning the lights off if it's the last one out.


I'll drop the "sub" from now on :)

If each interpreter runs in its own process, then initializing an interpreter and initializing the "global" state are the same thing and wouldn't need a separate step.


Cheers,
Mark.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Q4ZJJ26YUXUGDNAEMKDAZR56STGFIL5C/
