First of all, thanks for the feedback and encouragement!  Responses
in-line below.

-eric


On Thu, Sep 7, 2017 at 3:48 PM, Nathaniel Smith <n...@pobox.com> wrote:
> My concern about this is the same as it was last time -- the work
> looks neat, but right now, almost no-one uses subinterpreters
> (basically it's Jep and mod_wsgi and that's it?), and therefore many
> packages get away with ignoring subinterpreters.

My concern is that this is a chicken-and-egg problem.  The situation
won't improve until subinterpreters are more readily available.

> Numpy is the one I'm
> most familiar with: when we get subinterpreter bugs we close them
> wontfix, because supporting subinterpreters properly would require
> non-trivial auditing, add overhead for non-subinterpreter use cases,
> and benefit a tiny tiny fraction of our users.

The main problem of which I'm aware is C globals in libraries and
extension modules.  PEPs 489 and 3121 are meant to help but I know
that there is at least one major situation which is still a blocker
for multi-interpreter-safe module state.  Other than C globals, is
there some other issue?

> If we add a friendly python-level API like this, then we're committing
> to this being a part of Python for the long term and encouraging
> people to use it, which puts pressure on downstream packages to do
> that work... but it's still not clear whether any benefits will
> actually materialize.

I'm fine with Nick's idea about making this a "provisional" module.
Would that be enough to ease your concern here?

> I've actually argued with the PyPy devs to try to convince them to add
> subinterpreter support as part of their experiments with GIL-removal,
> because I think the semantics would genuinely be nicer to work with
> than raw threads, but they're convinced that it's impossible to make
> this work. Or more precisely, they think you could make it work in
> theory, but that it would be impossible to make it meaningfully more
> efficient than using multiple processes. I want them to be wrong, but
> I have to admit I can't see a way to make it work either...

Yikes!  Given the people involved I don't find that to be a good sign.
Nevertheless, I still consider my ultimate goals to be tractable and
will press forward.  At each step thus far, the effort has led to
improvements that extend beyond subinterpreters and multi-core.  I see
that trend continuing for the entirety of the project.  Even if my
final goal is not realized, the result will still be significantly net
positive...and I still think it will work out. :)

> If this is being justified by the multicore use case, and specifically
> by the theory that having two interpreters in the same process will
> allow for more efficient communication than two interpreters in two
> different processes, then... why should we believe that that's
> actually possible? I want your project to succeed, but if it's going
> to fail then it seems better if it fails before we commit to exposing
> new APIs.

The project is partly about performance.  However, it's also
particularly about offering an alternative concurrency model with an
implementation that can run in multiple threads simultaneously in the
same process.

On Thu, Sep 7, 2017 at 5:15 PM, Nathaniel Smith <n...@pobox.com> wrote:
> The slow case is passing
> complicated objects between processes, and it's slow because pickle
> has to walk the object graph to serialize it, and walking the object
> graph is slow. Copying object graphs between subinterpreters has the
> same problem.

The initial goal is to support passing only strings between
interpreters.  Later efforts will involve investigating approaches to
efficiently and safely passing other objects.
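The hoped-for advantage of string passing can be sketched with ordinary threads as a stand-in (this is not the proposed subinterpreter API): within one process, an immutable string can be handed to another worker by reference, with no pickling and no copy — exactly what cross-process channels cannot do.

```python
# In-process handoff of a string: no serialization, no copy.
import queue
import threading

channel = queue.Queue()
payload = "x" * 1_000_000   # a large immutable string

def consumer(out):
    out.append(channel.get())   # receive whatever the producer sends

received = []
t = threading.Thread(target=consumer, args=(received,))
t.start()
channel.put(payload)   # hand over the object itself, not a copy
t.join()

assert received[0] is payload   # same object: nothing was serialized
```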

> So the only case I can see where I'd expect subinterpreters to make
> communication dramatically more efficient is if you have a "deeply
> immutable" type
> [snip]
> However, it seems impossible to support user-defined deeply-immutable
> types in Python:
> [snip]

I agree that it is currently not an option.  That is part of the
exercise.  There are a number of possible solutions to explore once we
get to that point.  However, this PEP isn't about that.  I'm confident
enough about the possibilities that I'm comfortable with moving
forward here.

> I guess the other case where subprocesses lose to "real" threads is
> startup time on Windows. But starting a subinterpreter is also much
> more expensive than starting a thread, once you take into account the
> cost of loading the application's modules into the new interpreter. In
> both cases you end up needing some kind of process/subinterpreter pool
> or cache to amortize that cost.

Interpreter startup costs (and optimization strategies) are another
aspect of the project that deserves attention.  However, we'll worry
about that after the core functionality has been achieved.
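For reference, the pool/cache amortization Nathaniel describes already has a stdlib shape; a subinterpreter pool would presumably follow the same pattern. Here the thread-based executor is just a stand-in for whatever worker type ends up paying the startup cost:

```python
# Amortizing worker startup: pay it once, reuse the workers for many tasks.
from concurrent.futures import ThreadPoolExecutor

def work(n):
    return n * n

with ThreadPoolExecutor(max_workers=2) as pool:   # workers created once
    first = list(pool.map(work, range(5)))        # reused across submissions,
    second = list(pool.map(work, range(5, 10)))   # no additional startup cost

assert first == [0, 1, 4, 9, 16]
assert second == [25, 36, 49, 64, 81]
```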

> Obviously I'm committing the cardinal sin of trying to guess about
> performance based on theory instead of measurement, so maybe I'm
> wrong. Or maybe there's some deviously clever trick I'm missing.

:)  I'd certainly be interested in more data regarding the relative
performance of fork/multiprocess+IPC vs. subinterpreters.  However,
it's going to be hard to draw any conclusions until the work is
complete. :)
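One half of that comparison is measurable today: the serialization cost any cross-process design pays when the payload is a non-trivial object graph. Absolute numbers are machine-dependent, so treat this only as a template for gathering data:

```python
# Measure the pickle cost of walking a moderately large object graph --
# the per-message overhead of any multiprocessing/IPC design.
import pickle
import time

graph = {i: list(range(50)) for i in range(10_000)}   # ~500k ints

start = time.perf_counter()
blob = pickle.dumps(graph, protocol=pickle.HIGHEST_PROTOCOL)
elapsed = time.perf_counter() - start
print(f"pickled {len(blob)} bytes in {elapsed:.4f}s")

assert pickle.loads(blob) == graph   # the round trip is lossless, just slow
```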

> I hope so -- a really useful subinterpreter multi-core store would be
> awesome.

Agreed!  Thanks for the encouragement. :)
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/