Hello,

First, I would like to say that I have no fundamental problem with this PEP. While I agree with Nathaniel that the rationale given about the CSP concurrency model seems a bit weak, the author is obviously expressing his opinion there and I won't object to that. However, I think the PEP is desirable for other reasons. Mostly, I hope that by making the subinterpreters functionality available to pure Python programmers (while it was formerly an advanced and arcane part of the C API), we will spur a bunch of interesting third-party experimentation, including possibilities that we on python-dev have not thought about.

The appeal of the PEP for experimentation is manifold:

1) the ability to concurrently run independent execution environments without spawning child processes (which on some platforms and in some situations may not be very desirable: for example on Windows, where the cost of spawning is rather high; also, child processes may crash, and sometimes it is not easy for the parent to recover, especially if a synchronization primitive is left in an unexpected state)

2) the potential for parallelizing CPU-bound pure Python code in a single process, if a per-interpreter GIL is finally implemented

3) easier support for sharing large data between separate execution environments, without the hassle of setting up shared memory or the fragility of relying on fork() semantics

(and as I said, I hope people find other applications)

As for the argument that we already have asyncio and several other packages, I actually think that combining these different concurrency mechanisms would be interesting for complex applications (such as distributed systems). For that, however, I think the PEP as currently written is a bit lacking, see below.

Now for the detailed comments.

* I think the module should indeed be provisional. Experimentation may discover warts that call for a change in the API or semantics. Let's not prevent ourselves from fixing those issues.

* The "association" timing seems quirky and potentially annoying: an interpreter only becomes associated with a channel the first time it calls recv() or send(). How about, instead, associating an interpreter with a channel as soon as that channel is given to it through `Interpreter.run(..., channels=...)` (or received through `recv()`)?

* How hard would it be, in the current implementation, to add buffering to channels? It doesn't have to be infinite: you can choose a fixed buffer size (or make it configurable in the create() function, which allows passing 0 for unbuffered). Like Nathaniel, I think unbuffered channels will quickly be annoying to work with (yes, you can create a helper thread... now you have one additional thread per channel, which isn't pretty -- especially with the GIL).

* In the same vein, I think channels should allow adding readiness callbacks (that are called whenever a channel becomes ready for sending or receiving, respectively). This would make it easy to plug them into an event loop or other concurrency systems (such as Future-based concurrency). Note that each interpreter "associated" with a channel should be able to set its own readiness callback: so one callback per Python object representing the channel, but potentially multiple callbacks for the underlying channel primitive.

  (how would the callback be scheduled for execution in the right interpreter? perhaps using `_PyEval_AddPendingCall()` or a similar mechanism?)

* I think either `interpreters.get_main()` or `interpreters.is_main()` is desirable. Inevitably, the slight differences between main and non-main interpreters will surface in non-trivial applications (finalization issues in distributed systems can really be hairy). It seems this should be mostly costless to provide, so let's do it.

* I do think a minimal synchronization primitive would be nice.
  Either a Lock (in the Python sense) or a Semaphore: both should be relatively easy to provide, by wrapping an OS-level synchronization primitive. Then you can recreate all high-level synchronization primitives, like the threading and multiprocessing modules do (using a Lock or a Semaphore, respectively).

  (note you should be able to emulate a semaphore using blocking send() and recv() calls, but that's probably not very efficient, and efficiency is important)

Of course, I hope these are all actionable before beta1 :-) If not, here is my priority list, in order of preference:

* High priority: fix association timing
* High priority: either buffering /or/ readiness callbacks
* Middle priority: get_main() /or/ is_main()
* Middle / low priority: a simple synchronization primitive

But I would stress that the more of these we provide, the more we encourage people to experiment without pulling out too much of their hair.

(also, of course, I hope other people read the PEP and give feedback)

Best regards

Antoine.
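
P.S. To make the remark about emulating a semaphore with blocking send() and recv() calls a bit more concrete, here is a rough sketch. It does not use the PEP's API: `queue.Queue` stands in for a channel with a fixed-size buffer (something the PEP as currently written does not offer), threads stand in for interpreters, and `ChannelSemaphore` and `run_demo` are names made up for this illustration.

```python
import queue
import threading
import time


class ChannelSemaphore:
    """Counting semaphore emulated with blocking send/recv of tokens.

    Assumption: queue.Queue plays the role of a buffered channel;
    put() stands in for send() and get() for recv().
    """

    def __init__(self, value=1):
        self._channel = queue.Queue(maxsize=value)
        for _ in range(value):
            self._channel.put(None)  # "send" one token per available permit

    def acquire(self):
        self._channel.get()  # blocking "recv": waits until a token is available

    def release(self):
        self._channel.put(None)  # "send" the token back

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc_info):
        self.release()


def run_demo(workers=6, permits=2):
    """Return the peak number of workers inside the guarded section."""
    sem = ChannelSemaphore(permits)
    state = {"active": 0, "peak": 0}
    state_lock = threading.Lock()  # only protects the demo's own counters

    def worker():
        with sem:
            with state_lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)  # pretend to do some work
            with state_lock:
                state["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    print("peak concurrency:", run_demo())
```

Every acquire and release goes through a full message exchange here, which illustrates the efficiency concern: a primitive wrapping an OS-level lock or semaphore would avoid that round trip.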