On Wed., 17 Jun. 2020, 4:28 am Mark Shannon, <m...@hotpy.org> wrote:

>
> On 16/06/2020 1:24 pm, Nick Coghlan wrote:
> > Multiprocessing serialisation overheads are abysmal. With enough OS
> > support you can attempt to mitigate that via shared memory mechanisms
> > (which Davin added to the standard library), but it's impossible to get
> > the overhead of doing that as low as actually using the address space of
> > one OS process.
>
> What does "multiprocessing serialisation" even mean? I assume you mean
> the overhead of serializing objects for communication between processes.
>
> The cost of serializing an object has absolutely nothing to do with
> which process the interpreter is running in.
>
> Separate interpreters within a single process will still need to
> serialize objects for communication.
>
> The overhead of passing data through shared memory is the same for
> threads and processes. It's just memory.
>

No, it's not. With multiple processes, you have to instruct the OS to poke
holes in the isolated-by-default behavior in order to give multiple Python
interpreters access to a common memory store.

When the interpreters are in the same process, that isn't true - to give
multiple Python interpreters access, you just give them all a pointer to
the common data.
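
To make the contrast concrete: even the stdlib shared memory support has
to go through the OS to create a named segment and then map it into each
process (a minimal sketch; the segment size and worker function are
illustrative):

    from multiprocessing import Process
    from multiprocessing.shared_memory import SharedMemory

    def worker(segment_name):
        # The child process has to ask the OS to map the existing
        # named segment into its own address space.
        shm = SharedMemory(name=segment_name)
        shm.buf[0] = 42
        shm.close()

    if __name__ == "__main__":
        # The parent asks the OS to create the named segment.
        shm = SharedMemory(create=True, size=16)
        p = Process(target=worker, args=(shm.name,))
        p.start()
        p.join()
        print(shm.buf[0])  # 42
        shm.close()
        shm.unlink()  # ask the OS to release the segment

With interpreters in the same process, none of that create/attach/unlink
ceremony is needed; each interpreter can simply be handed the address of
the buffer.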

This will work most easily when the state being shared is not itself a
Python object. PEP 3118 buffers are one example (including when pickle
protocol 5 is used to pass data between interpreters; see the sketch
below). The application embedding use case is the other one I expect to
be reasonably common: there's no real "main" interpreter, just multiple
subinterpreters manipulating the application state.
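
For the buffer-based case, here's a rough sketch of how protocol 5 keeps
the data out of the byte stream (PEP 574; the variable names are mine):

    import pickle

    payload = bytearray(10 ** 6)  # large buffer-backed state

    # Out-of-band pickling: the stream records only a buffer
    # reference; the megabyte of data is not copied into it.
    buffers = []
    data = pickle.dumps(pickle.PickleBuffer(payload), protocol=5,
                        buffer_callback=buffers.append)
    print(len(data))  # a dozen or so bytes, not ~1 MB

    # The receiving side supplies the buffers. With a same-process
    # channel that can be the original object, so nothing is copied.
    restored = pickle.loads(data, buffers=[payload])
    assert restored is payload

Across processes, "supplying the buffers" is where the shared memory
dance above comes back in; within one process it's just a pointer
handoff.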

This is the Ceph/mod_wsgi/hexchat plugin use case, which is beneficial
enough for people to have pursued it *despite* the significant usability
problems with the current state of subinterpreter support.

Doing full-blown zero-copy ownership transfer of actual Python objects
would be more difficult, since the current plan is to have separate memory
allocation pools per interpreter to avoid excessive locking overhead. So I
don't currently expect to see that any time soon, even if PEP 554 is
accepted. Assuming that remains the case, I'd expect multiprocessing to
remain the default choice for CPU-bound use cases where all the
interesting state is held in Python objects (if you're going to have to
mess about with a separate heap of shared objects anyway, you may as well
also enjoy the benefits of greater process isolation).
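
To illustrate that default, the familiar pattern for CPU-bound work over
ordinary Python objects looks something like this (a minimal sketch; the
work function is made up):

    from multiprocessing import Pool

    def crunch(n):
        # CPU-bound work on ordinary Python objects; arguments and
        # results cross the process boundary via pickle.
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        with Pool() as pool:
            results = pool.map(crunch, [10_000, 20_000, 30_000])
        print(results)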

Cheers,
Nick.
