Hi Mark,

On Thu, May 7, 2020 at 15:53, Mark Shannon <m...@hotpy.org> wrote:
> I say no. Why the sudden urgency?

The urgency was that I wanted to see quickly whether a per-interpreter
GIL would be doable for Python 3.9. The answer is no: there is too much
work left to write a "correct" implementation. By that I mean fixing
subinterpreter issues properly, not the idea of a special build.


> I think that you are making too many assumptions about how
> inter-interpreter communication is going to work, and about sharing of
> objects.

I'm not sure what you mean here. My changes rely on the assumption
that objects are not shared between two interpreters.

I didn't work at all on the inter-interpreter communication.


> On 06/05/2020 6:49 pm, Victor Stinner wrote:
> > It's a practical solution to be able to experiment quickly
> > per-interpreter GIL without having to fix all issues at once. For
> > example, it disables Unicode interned strings which is unsafe with
> > multiple interpreters running in parallel.
>
> If the interning is done on a per-interpreter basis, then it is safe.

Sure. My idea was to quickly list the areas of the code that should be
reworked to be "compatible" with subinterpreters. As I wrote, the long
term plan is to fix these issues. For example, I made the small integer
singletons per-interpreter. The idea would be the same for interned
strings. It's not hard to do, but I didn't want to make this change
right now because we are close to the 3.9 release.
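
To illustrate the per-interpreter idea, here is a minimal Python model.
It is not CPython code: ToyInterpreter and Box are hypothetical names
invented for this sketch, and the real change would live in C. It only
shows that when each interpreter owns its interning table, the same
text yields one object per interpreter and nothing is shared.

```python
class Box:
    """Hypothetical stand-in for an object owned by one interpreter."""

    def __init__(self, value, owner):
        self.value = value
        self.owner = owner


class ToyInterpreter:
    """Hypothetical model of an interpreter that owns its own caches."""

    def __init__(self, name):
        self.name = name
        # Per-interpreter table: never visible to other interpreters.
        self._interned = {}

    def intern(self, text):
        """Return this interpreter's unique Box for `text`."""
        box = self._interned.get(text)
        if box is None:
            box = Box(text, self.name)
            self._interned[text] = box
        return box


if __name__ == "__main__":
    a = ToyInterpreter("a")
    b = ToyInterpreter("b")
    # Same interpreter: interning returns the same object.
    print(a.intern("spam") is a.intern("spam"))
    # Different interpreters: equal text, distinct objects.
    print(a.intern("spam") is b.intern("spam"))
```

Because no table is shared, no lock is needed around intern() in this
model, which is the property that makes per-interpreter interning safe.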

I think that I will remove my changes from the future 3.9 branch and
only keep them in the master branch, to clarify that 3.9 is now out of
scope.


> > I added this #ifdef to encourage other core developers work on this
> > project, and let early adopters test this experimental feature to give
> > us their feedback.
>
> You say this is to let other core developers work on this, but has
> anyone actually asked for these changes?

I do want these changes, and so does Eric Snow. Another core dev also
asked me how to contribute to this project.

Most of the past changes cleaned up Python internals. For example,
they properly release resources at exit, which fixes old issues with
Py_Initialize()/Py_Finalize() being called multiple times when Python
is embedded. It's not only about subinterpreters. See for example:
https://bugs.python.org/issue1635741


> > Currently, the special build changes:
> >
> > * Per-interpreter GIL
> > * Store the current Python thread state in a TLS
> > * Disable dict, frame, tuple and list free list
> > * Disable type method cache
> > * Disable pymalloc: force usage of libc malloc
> > * Disable the GC in subinterpreters
> > * _xxsubinterpreters.run_string() releases the GIL
>
> These changes are going to have such a large impact on robustness and
> performance as to make any comparisons meaningless.

Oh sure, my benchmark on per-interpreter GIL was run on the same
Python binary where all these caches were disabled.

I'm aware that "regular" Python has all these caches. But I didn't
care about the absolute timing. I only wanted to check whether
subinterpreters actually "scale" with the number of CPUs. My PoC
benchmark says that yes, they do. A CPU-bound workload is faster in
subinterpreters than with threads. It also shows that subinterpreters
run at basically the same speed as multiprocessing, which is an
interesting data point.
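
For reference, a scaling check of this kind can be sketched as follows.
This is not the PoC benchmark mentioned above: the workload
(count_primes) and the helper (timed_run) are assumptions made up for
this example, and subinterpreters are left out because they need the
private _xxsubinterpreters module. It only compares threads with
processes on a CPU-bound task; on a regular build with a single GIL,
the thread pool is not expected to scale with the number of workers.

```python
import math
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound toy workload: count primes below `limit`."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def timed_run(executor_cls, workers, limit):
    """Run `workers` copies of the workload; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(count_primes, [limit] * workers))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, elapsed = timed_run(cls, workers=4, limit=50_000)
        print(f"{cls.__name__}: {elapsed:.2f} s")
```

Only the relative timings matter here, as in the discussion above: the
question is whether adding workers reduces wall-clock time per unit of
work, not how fast a single run is.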

As I wrote in my email, they are only temporary changes using #ifdef,
but I plan to fix all these issues (make the code compatible with
subinterpreters).


> > Most changes are easy to write, but some other changes are non
> > trivial. For example, I modified _PyThreadState_GET() and
> > _PyThreadState_Swap() to use a Thread Local Storage (TLS) to get and
> > set the current Python thread state.
>
> Does that mean you are going to remove all the `PyThreadState *tstate`
> parameters that have been added lately?

I don't plan to make tstate implicit again soon. I like to see where
Python states are coming from in functions. But it's an open question.
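
As a rough analogy for the TLS approach quoted above, the same pattern
can be shown in Python with threading.local. The helpers
get_thread_state() and swap_thread_state() are hypothetical names that
only mimic the roles of _PyThreadState_GET() and _PyThreadState_Swap();
the real functions are C and work differently in detail.

```python
import threading

# One slot per OS thread; each thread sees only its own value.
_tls = threading.local()


def get_thread_state():
    """Return the calling thread's state, or None if none was set."""
    return getattr(_tls, "tstate", None)


def swap_thread_state(new_state):
    """Install `new_state` for the calling thread; return the old one."""
    old_state = get_thread_state()
    _tls.tstate = new_state
    return old_state


def run_workers(names):
    """Start one thread per name and record the state each one sees."""
    seen = {}

    def worker(name):
        swap_thread_state(f"tstate-{name}")
        seen[name] = get_thread_state()

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    print(run_workers(["a", "b"]))
    # The main thread never set a state, so its slot is still empty.
    print(get_thread_state())
```

With the state stored this way, a function can look it up implicitly
instead of receiving it as a parameter, which is why the question about
the explicit tstate parameters comes up.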

> Rather than sprinkling #ifdefs everywhere, could you continue
> consolidating all "global" objects into a single data structure?

Most of my work in 3.9 was to move things from _PyRuntimeState to
PyInterpreterState.

Moving global variables into these structures is the simple solution,
but I am trying to find a way to avoid declaring all structures in
PyInterpreterState.

#include "pycore_interp.h" pulls in condition variables, which in turn
include <windows.h> on Windows. It also pulls in tons of other things,
since PyInterpreterState has become quite large.

Maybe there is a way to have "per-interpreter" variables, something
similar to thread local storage (TLS). But I'm not sure what the API
would look like.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
_______________________________________________
python-committers mailing list -- python-committers@python.org
To unsubscribe send an email to python-committers-le...@python.org
https://mail.python.org/mailman3/lists/python-committers.python.org/
Message archived at 
https://mail.python.org/archives/list/python-committers@python.org/message/3EE64ZZXARKEMR5X6VQ4FWJNTRGYTK44/
Code of Conduct: https://www.python.org/psf/codeofconduct/
