[Python-Dev] Re: PoC: Subinterpreters 4x faster than sequential execution or threads on CPU-bound workaround

Victor Stinner Wed, 06 May 2020 05:44:43 -0700

Hi Nathaniel,

On Wed, May 6, 2020 at 04:00, Nathaniel Smith <n...@pobox.com> wrote:
> As far as I understand it, the subinterpreter folks have given up on
> optimized passing of objects, and are only hoping to do optimized
> (zero-copy) passing of raw memory buffers.

I think that you misunderstood PEP 554. It's a bare minimum API,
and the idea is to *extend* it later to have an efficient
implementation of "shared objects".

IMO it should be easy to share *data* (object "content") between
subinterpreters, but each interpreter should have its own PyObject
which exposes the data at the Python level. See the PyObject as a
proxy to the data.
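
A minimal sketch of the "PyObject as a proxy to the data" idea. It is an illustration only: it runs in a single interpreter, and two memoryview objects stand in for the per-interpreter PyObjects. PEP 554 itself does not define this mechanism.

```python
# Illustration only: two ordinary objects in ONE interpreter stand in for
# the per-interpreter PyObject proxies described above.
data = bytearray(b"hello world")    # the shared "content"

# Two distinct wrapper objects exposing the same underlying memory.
view_a = memoryview(data)
view_b = memoryview(data)

distinct = view_a is not view_b     # each proxy is its own object
view_a[0] = ord("J")                # a write through one proxy...
seen_by_b = bytes(view_b[:5])       # ...is visible through the other

# Release the buffer exports explicitly once the proxies are done.
view_a.release()
view_b.release()
```

Each wrapper has its own reference count and lifetime, while the bytes themselves exist only once.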

It would badly hurt performance if a PyObject is shared by two
interpreters: it would require locking or atomic variables for
PyObject members and PyGC_Head members.

It seems like right now, PEP 554 doesn't support sharing data, so
that still has to be designed and implemented later.

Who owns the data? When can we release memory? Which interpreter
releases the memory? I read somewhere that data is owned by the
interpreter which allocates the memory, and its memory would be
released in the same interpreter.

How do we track data lifetime? I imagine a reference counter. When it
reaches zero, the interpreter which allocated the data can release it
"later" (it doesn't have to be done "immediately").
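
The ownership and lifetime rules above could be sketched as follows. Everything here is hypothetical: `Owner` stands in for the allocating interpreter, `SharedData` for a shared buffer handle, and a lock-protected counter stands in for whatever atomic counter a real implementation would use.

```python
import threading
from collections import deque


class SharedData:
    """Hypothetical handle for data owned by one interpreter."""

    def __init__(self, owner, payload):
        self.owner = owner              # the allocator; only it may free
        self.payload = payload
        self._refs = 1                  # the owner's own reference
        self._lock = threading.Lock()   # protects _refs

    def incref(self):
        with self._lock:
            self._refs += 1

    def decref(self):
        with self._lock:
            self._refs -= 1
            dead = self._refs == 0
        if dead:
            # Do not free here: hand the data back to its owner.
            self.owner.pending.append(self)


class Owner:
    """Stand-in for the interpreter that allocates the data."""

    def __init__(self):
        self.pending = deque()          # deque.append is thread-safe
        self.released = []

    def allocate(self, payload):
        return SharedData(self, payload)

    def collect(self):
        """Run by the owner, at a time of its choosing, to free dead data."""
        while self.pending:
            item = self.pending.popleft()
            item.payload = None         # "release the memory"
            self.released.append(item)


main = Owner()
buf = main.allocate(bytearray(1024))
buf.incref()    # another interpreter takes a reference
buf.decref()    # ...and drops it again
buf.decref()    # the owner drops its reference: the count reaches zero
still_allocated = buf.payload is not None   # release has been deferred
main.collect()  # the owner frees the data "later"
```

The point of the queue is that the last user never frees memory it did not allocate; it only notifies the owner.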

How do we lock the whole data, or a portion of it, to prevent data
races? If the data doesn't contain any PyObject, it may be safe to
allow concurrent writes, but readers should be prepared for
inconsistencies depending on the access pattern. If two interpreters
access separate parts of the data, we may allow lock-free access.
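
A small sketch of lock-free access to separate parts of the data, assuming a raw byte buffer with no PyObject inside. Threads stand in for subinterpreters here, and the `fill` helper is made up for the example.

```python
import threading

data = bytearray(8)     # raw bytes only, no PyObject inside


def fill(start, stop, value):
    # Each worker touches only its own byte range, so no lock is taken.
    view = memoryview(data)[start:stop]
    for i in range(len(view)):
        view[i] = value


workers = [
    threading.Thread(target=fill, args=(0, 4, 1)),
    threading.Thread(target=fill, args=(4, 8, 2)),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

If the two ranges overlapped, the final content of the overlapping bytes would depend on scheduling, which is the inconsistency readers would have to be prepared for.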

I don't think that we have to reinvent the wheel. threading,
multiprocessing and asyncio already designed such APIs. We should
design similar APIs and even simply reuse code.

My hope is that "synchronization" (in general, locks in particular)
will be more efficient within a single process than synchronization
between multiple processes.

I would be interested to have a generic implementation of a "remote
object": an empty proxy object which forwards all operations to a
different interpreter. It will likely be inefficient, but it may be
convenient for a start. If a method returns an object, a new proxy
should be created. Simple scalar types like int and short strings may
be serialized (copied).
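
A rough sketch of such a remote object, under loud assumptions: a worker thread stands in for the other interpreter, only method calls are forwarded (not operators or plain attributes), arguments are passed through unchanged, and `Remote`, `Proxy` and `MAX_COPY_LEN` are names made up for the example.

```python
import itertools
import queue
import threading

MAX_COPY_LEN = 64   # hypothetical limit for "short" strings


def is_copyable(value):
    """Simple scalars and short strings are copied instead of proxied."""
    if value is None or isinstance(value, (bool, int, float)):
        return True
    return isinstance(value, (str, bytes)) and len(value) <= MAX_COPY_LEN


class Remote:
    """Stand-in for the interpreter that owns the real objects."""

    def __init__(self):
        self._objects = {}                  # handle -> real object
        self._handles = itertools.count(1)
        self._lock = threading.Lock()       # protects the registry
        self.requests = queue.Queue()
        threading.Thread(target=self._serve, daemon=True).start()

    def _register(self, obj):
        with self._lock:
            handle = next(self._handles)
            self._objects[handle] = obj
        return handle

    def _serve(self):
        # Every operation on a real object runs in this thread only.
        while True:
            handle, name, args, reply = self.requests.get()
            try:
                result = getattr(self._objects[handle], name)(*args)
                if is_copyable(result):
                    reply.put(("value", result))
                else:
                    reply.put(("handle", self._register(result)))
            except Exception as exc:
                reply.put(("error", exc))

    def export(self, obj):
        return Proxy(self, self._register(obj))


class Proxy:
    """Empty local object: it holds a handle and no data."""

    def __init__(self, remote, handle):
        self._remote = remote
        self._handle = handle

    def __getattr__(self, name):
        def call(*args):
            reply = queue.Queue(maxsize=1)
            self._remote.requests.put((self._handle, name, args, reply))
            kind, payload = reply.get()
            if kind == "error":
                raise payload
            if kind == "handle":
                return Proxy(self._remote, payload)   # new proxy
            return payload                            # copied scalar
        return call


remote = Remote()
words = remote.export(["b", "a"])
words.append("c")               # returns None, which is copied
words.sort()                    # the remote list is now ["a", "b", "c"]
count_a = words.count("a")      # int result: copied
clone = words.copy()            # list result: wrapped in a new Proxy
last = clone.pop()              # short string result: copied
```

Every call costs a round trip through two queues, which matches the expectation that this design is convenient rather than efficient.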

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/5B5M62C7YNMXJW2ULXQ3XAGEM4F3C67S/