Nathaniel,

Your tone and approach to this conversation concern me.  I appreciate
that you have strong feelings here and readily recognize I have my own
biases, but it's becoming increasingly hard to draw any constructive
insight from what tend to be very long posts from you.  It ends up
being a large commitment of time for small gains.  And honestly, it's
also becoming hard not to counter some of your more elaborate
statements with my own unhelpful prose.  In the interest of making
things better, please take it all down a notch or two.

I apologize if I sound frustrated.  I am frustrated, which is only
more frustrating because I respect you a lot and feel like your
feedback should be more helpful.  I'm trying to moderate my responses,
but I expect some of my emotion may slip through. :/

On Mon, Apr 20, 2020 at 4:30 PM Nathaniel Smith <n...@pobox.com> wrote:
> On Fri, Apr 17, 2020 at 3:57 PM Eric Snow <ericsnowcurren...@gmail.com> wrote:
> That makes it worse, right? If I wrote a PEP saying "here's some
> features that could possibly someday be used to make a new concurrency
> model", that wouldn't make it past the first review.

Clearly, tying this to "concurrency models" is confusing here.  So
let's just say, as Paul Moore put it, the PEP allows us to "organize"
our code in a new way (effectively along the lines of isolated threads
with message passing).
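
As a rough analogy, here is what that organization looks like using
only plain threads and queues (a hedged sketch; no subinterpreter API
is assumed, and with subinterpreters the workers would share no Python
objects at all, rather than sharing by convention):

```python
import threading
import queue

# Hedged sketch: a plain thread plus two queues stand in for the
# "isolated workers passing messages" organization the PEP enables.
# The queues are the only point of contact, by convention.

def worker(inbox, outbox):
    # Process messages until the None shutdown signal arrives.
    for item in iter(inbox.get, None):
        outbox.put(item * 2)

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()

for n in (1, 2, 3):
    inbox.put(n)
inbox.put(None)  # shut the worker down
t.join()

results = [outbox.get() for _ in range(3)]
print(results)  # -> [2, 4, 6]
```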

> I guess your perspective is, subinterpreters are already a CPython
> feature, so we're not adding anything, and we don't really need to
> talk about whether CPython should support subinterpreters.
>
> But this simply isn't true. Yes, there's some APIs for subinterpreters
> added back in the 1.x days, but they were never really thought
> through, and have never actually worked.

The C-API was thought through more than sufficiently.  Subinterpreters
are conceptually and practically a very light wrapper around the
fundamental architecture of CPython's runtime.  The API exposes
exactly that, no more, no less.  What is missing or broken?

They also work fine in most cases.  Mostly they have problems with
extension modules that rely on unsafe process-global state, and they
break in some less common cases due to bugs in CPython (which have
gone unfixed because no one cared enough).

> There are exactly 3 users,
> and all have serious issues, and a strategy for avoiding
> subinterpreters because of the brokenness. In practice, the existing
> ecosystem of C extensions has never supported subinterpreters.

Catch-22: why would they ever bother if no one is using them?

> This is clearly not a great state of affairs – we should either
> support them or not support them. Shipping a broken feature doesn't
> help anyone. But the current status isn't terribly harmful, because
> the general consensus across the ecosystem is that they don't work and
> aren't used.
>
> If we start exposing them in the stdlib and encouraging people to use
> them, though, that's a *huge* change.

You are arguing that this is effectively a new feature.  As you noted
earlier, I am saying it isn't.

> Our users trust us. If we tell
> them that subinterpreters are a real thing now, then they'll spend
> lots of effort on trying to support them.

What is "lots"?  We've yet to see clear evidence of possible severe
impact.  On the contrary, I've gotten feedback from folks highly
involved in the ecosystem that it will not be a big problem.  It won't
take care of itself, but it won't require a massive effort.

> Since subinterpreters are confusing, and break the C API/ABI

How are they confusing and how do they break either the C-API or
C-ABI?  This sort of misinformation (or perhaps just miscommunication)
is not helpful at all to your argument.

>, this
> means that every C extension author will have to spend a substantial
> amount of time figuring out what subinterpreters are, how they work,
> squinting at PEP 489, asking questions, auditing their code, etc.

You make it sound like tons of work, but I'm unconvinced, as noted
earlier.  Consider that we regularly have new features for which
extensions must provide support.  How is this different?

> This
> will take years, and in the mean time, users will expect
> subinterpreters to work, be confused at why they break, yell at random
> third-party maintainers, spend days trying to track down mysterious
> problems that turn out to be caused by subinterpreters, etc. There
> will be many many blog posts trying to explain subinterpreters and
> understand when they're useful (if ever), arguments about whether to
> support them. Twitter threads. Production experiments. If you consider
> that we have thousands of existing C extensions and millions of users,
> accepting PEP 554 means forcing people you don't know to collectively
> spend many person-years on subinterpreters.

Again you're painting a hopeless picture, but so far it's no more
than that: a picture, and one that contrasts with other, less negative
feedback I've gotten.  So it comes off as unhelpful here.

> Random story time: NumPy deprecated some C APIs some years ago, a
> little bit before I got involved. Unfortunately, it wasn't fully
> thought through; the new APIs were a bit nicer-looking, but didn't
> enable any new features, didn't provide any path to getting rid of the
> old APIs, and in fact it turned out that there were some critical use
> cases that still required the old API. So in practice, the deprecation
> was never going anywhere; the old APIs work just as well and are never
> going to get removed, so spending time migrating to the new APIs was,
> unfortunately, a completely pointless waste of time that provided zero
> value to anyone.
>
> Nonetheless, our users trusted us, so lots and lots of projects spend
> substantial effort on migrating to the new API: figuring out how it
> worked, making PRs, reviewing them, writing shims to work across the
> old and new API, having big discussions about how to make the new API
> work with Cython, debating what to do about the cases where the new
> APIs were inadequate, etc. None of this served any purpose: they just
> did it because they trusted us, and we misled them. It's pretty
> shameful, honestly. Everyone meant well, but in retrospect it was a
> terrible betrayal of our users' trust.
>
> Now, that only affected projects that were using the NumPy C API, and
> even then, only developers who were diligent and trying to follow the
> latest updates; there were no runtime warnings, nothing visible to
> end-users, etc. Your proposal has something like 100x-1000x more
> impact, because you want to make all C extensions in Python get
> updated or at least audited, and projects that aren't updated will
> produce mysterious crashes, incorrect output, or loud error messages
> that cause users to come after the developers and demand fixes.

So if the comparison is fair, you are saying:

* extension authors will feel tremendous pressure to support subinterpreters
* it will be years of work
* users won't get much benefit out of subinterpreters

I counter that the opposite of all 3 is true.

> Now maybe that's worth it. I think on net the Py3 transition was worth
> it, and that was even more difficult. But Py3 had an incredible amount
> of scrutiny and rationale. Here you're talking about breaking the C
> API,

Again, no C-API is getting broken.

> and your rationales so far are, I'm sorry, completely half-assed.
> You've never even tried to address the most difficult objections, the
> rationales you have written down are completely hand-wave-y, and
> AFAICT none of them stand up to any serious scrutiny.

That's fine.  I don't feel like my proposal needs more than what I've
written in the PEP.  Based on the feedback so far (other than yours),
that feeling seems to be borne out.

> (For one random example: have you even measured how much
> subinterpreters might improve startup time on Windows versus
> subprocesses? I did, and AFAICT in any realistic scenario it's
> completely irrelevant – the majority of startup cost is importing
> modules, not spawning a subprocess, and anyway in any case where
> subinterpreters make sense to use, startup costs are only a tiny
> fraction of total runtime. Maybe I'm testing the wrong scenario, and
> you can come up with a better one. But how are you at the point of
> asking for PEP acceptance without any test results at all?!)

Performance is not the objective of the PEP, nor does the PEP ever
suggest that it is.  The point is to expose existing functionality to
a broader audience.  If performance were a factor, then I would agree
that demonstrating an improvement would be important.

> Yes, subinterpreters are a neat idea, and a beautiful dream. But on
> its own, that's not enough to justify burning up many person-years of
> our users' lives. You can do better than this, and you need to.

Again, I can just as easily say "it won't cost anyone anything, so it
would be irresponsible not to do it".

> > * provides a minimal way to pass information between subinterpreters
> > (which you don't need in C but do in Python code)
> > * adds a few minor conveniences like propagating exceptions and making
> > it easier to share buffers safely
>
> These are a new API, and the current draft does seem like, well, a
> draft. Probably there's not much point in talking about it until the
> points above are resolved. But even if CPython should support
> subinterpreters, it would still be better to evolve the API outside
> the stdlib until it's more mature. Or at least have some users! Every
> API sucks in its first draft, that's just how API design works.

The proposed API has gone through at least 5 rounds on python-dev, as
well as a lot of careful thought, research, and practical use (by me).
So a "first draft" it is not.

> > Are you concerned about users reporting bugs that surface when an
> > incompatible extension is used in a subinterpreter?  That shouldn't be
> > a problem if we raise ImportError if an extension that does not
> > support PEP 489 is imported in a subinterpreter.
>
> Making subinterpreter support opt-in would definitely be better than
> making it opt-out. When C extensions break with subinterpreters, it's
> often in super-obscure ways where it's not at all clear that
> subinterpreters are involved.
>
> But notice that this means that no-one can use subinterpreters at all,
> until all of their C extensions have had significant reworks to use
> the new API, which will take years and tons of work -- it's similar to
> the Python 3 transition. Many libraries will never make the jump.

Again, that is a grand statement that makes things sound much worse
than they really are.  I expect very, very few extensions will need
"significant reworks".  Adding PEP 489 support will not take much
effort, on the order of minutes.  Dealing with process-global state
will depend on how much of it there is, if any.

Honest question: how many C extensions have process-global state that
will cause problems under subinterpreters?  In other words, how many
already break in mod_wsgi?

> And why should anyone bother to wait?
>
> > Saying it's "obviously" the "only" reason is a bit much. :)  PEP 554
> > exposes existing functionality that hasn't been all that popular
> > (until recently for some reason <wink>) mostly because it is old, was
> > never publicized (until recently), and involved using the C-API.  As
> > soon as folks learn about it they want it, for various reasons
> > including (relative) isolation and reduced resource usage in
> > large-scale deployment scenarios.  It becomes even more attractive if
> > you say subinterpreters allow you to work around the GIL in a single
> > process, but that isn't the only reason.
>
> I'm worried that you might be too close to this, and convincing
> yourself that there's some pent-up demand that doesn't actually exist.
> Subinterpreters have always been documented in the C API docs, and
> they've had decades for folks to try them out and/or improve support
> if it was useful. CPython has seen *huge* changes in that time, with
> massive investments on many fronts. But no serious work happened on
> subinterpreters until you started advocating for the GIL splitting
> idea.

False.

Much of the work that is going on is motivated by a desire to improve
runtime startup/finalization, to improve embeddability, and to address
the demands of certain large-scale deployments.  Most of that work
began either before my project started or independently of it.

> But anyway, you say here that it's useful for "(relative) isolation
> and reduced resource usage". That's great, I'm asking for rationale
> and there's some rationale! Can you expand that into something that's
> detailed enough to actually evaluate?

What are you after here, exactly?  It sounds like you are saying every
new feature in Python should go through an exhaustive trial to
quantitatively prove its worth before inclusion.  That isn't how the
PEP process works, for practical reasons.  Usually the analysis is
much more subjective, under the careful scrutiny of the BDFL-delegate.

Are you saying that PEP 554 is exceptional in that regard?  Thus far I
haven't been able to discern from your feedback any concrete
justification for such a qualification.

> We already have robust support for threads for low-isolation and
> subprocesses for high-isolation. Can you name some use cases where
> neither of these are appropriate and you instead want an in-between
> isolation – like subprocesses, but more fragile and with odd edge
> cases where state leaks between them?
>
> Why do you believe that subinterpreters will have reduced resource
> usage? I assume you're comparing them to subprocesses here.
> Subinterpreters are "shared-nothing"; all code, data, etc. has to be
> duplicated, except for static C code ... which is exactly the same as
> how subprocesses work. So I don't see any theoretical reason why they
> should have reduced resource usage.

The PEP does not talk about resource usage at all.  I do not think
that such side effects should influence the decision to accept or
reject the PEP.

As to actual improvements to resource usage, there are only a few that
folks might see currently when using subinterpreters.  This includes
not running into host limits on resources like #pids, which matters
for large-scale uses of Python.  Otherwise any other benefits are
mostly hypothetical (hence not mentioned in the PEP).  Some of the
improvements should not require much effort, but they do require the
base functionality provided by PEP 554 to be in place first.
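
To make the #pids point concrete, here is a hedged illustration using
plain threads, which share with subinterpreters the one property that
matters for this claim: all workers live inside a single OS process,
so they draw nothing from the host's process/PID budget, unlike a pool
of worker subprocesses:

```python
import os
import threading

# Hedged illustration: workers that live inside one OS process, as
# threads do (and as subinterpreters would), consume no extra PIDs.
pids = set()

def record_pid():
    pids.add(os.getpid())

threads = [threading.Thread(target=record_pid) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(pids))  # -> 1; all eight workers share a single PID
```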

> And theory aside, have you measured the resource usage? Can you share
> your measurements?
>
> > > Or if PEP 554 is really a good idea on its own merits,
> > > purely as a new concurrency API, then why not build that concurrency
> > > API on top of multiprocessing and put it on PyPI and let real users
> > > try it out?
> >
> > As I said, the aim of PEP 554 isn't to provide a full concurrency
> > model, though it could facilitate something like CSP.  FWIW, there are
> > CSP libraries on PyPI already, but they are limited due to their
> > reliance on threads or multiprocessing.
>
> What are these limitations? Can you name some?

Sorry, I don't have any more time to spend looking this up.  One lib
to look at is https://python-csp.readthedocs.io/en/latest/.

> > > etc. is stupendously complex,
> >
> > The project involves lots of little pieces, each supremely tractable.
> > So if by "stupendously complex" you mean "stupendously tedious/boring"
> > then I agree. :)  It isn't something that requires a big brain so much
> > as a willingness to stick with it.
>
> I think you're being over-optimistic here :-/.
>
> The two of us have had a number of conversations about this project
> over the last few years. And as I remember it, I've repeatedly pointed
> out that there were several fundamental unanswered questions, any one
> of which could easily sink the whole project, and also a giant pile of
> boring straightforward work, and I encouraged you to start with the
> high-risk parts to prove out the idea before investing all that time
> in the tedious parts. And you've explicitly told me that no, you
> wanted to work on the easy parts first, and defer the big questions
> until later.

I'm sure I called them the "trickiest" parts.  There are no
"fundamental unanswered questions" at this point, just a lot of little
things to be done.  Mostly the tricky parts entail making the
allocators and GIL per-interpreter.  Neither will require a lot of
direct work, and we have a high level of confidence that it can be
done.  They are also blocked by just about everything else, so doing
them first isn't really an option.

Another important area that needs effort is helping extension
authors support subinterpreters.  Feedback on that would be helpful,
especially given your experience in certain parts of that ecosystem.

> So, well... you're asking for a PEP to be accepted. I think that means
> it's "later".

None of that applies to the PEP.  I continue to argue that the PEP
should stand on its own, aside from the work related to the GIL.

> And I feel like a bit of a jerk raising these difficult
> questions, after all the work you and others have poured into this,
> but... that's kind of what you explicitly decided to set yourself up
> for? I'm not sure what you were expecting.

Honestly, I appreciate the interest and that you are willing to take
an unpopular contrary position.  I'm just disappointed by the
inaccuracies and frustrated by the difficulty of getting more helpful
feedback from you.  From my perspective you have a lot of helpful
insight to offer.

> tl;dr: accepting PEP 554 is effectively a C API break, and will force
> many thousands of people worldwide to spend many hours wrangling with
> subinterpreter support. And I've spent a ton of time thinking about
> it, talking to folks about it, etc., over the last few years, and I
> still just can't see any rationale that stands up to scrutiny. So I
> think accepting PEP 554 now would be a betrayal of our users' trust,
> harm our reputation, and lead to a situation where a few years down
> the road we all look back and think "why did we waste so much energy
> on that?"

As noted throughout this thread, some of that is not correct and some
of it is overstating the challenges.  I would love more feedback on
how we could mitigate the problems you foresee, rather than a
consistent push to abandon all hope.

-eric
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZVV5QYJ3C2Z5CVIPXIKNGBZWTS3KYFXQ/
Code of Conduct: http://python.org/psf/codeofconduct/