[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

Irit Katriel via Python-Dev Mon, 05 Apr 2021 09:53:42 -0700

Hi Nathaniel,

Thank you for your feedback. See a few comment below.

On Mon, Apr 5, 2021 at 11:01 AM Nathaniel Smith <[email protected]> wrote:

> OK, better late than never... here's a much-delayed review of the PEP.
> Thank you Irit and Guido for carrying this forward while I've been AWOL!
> It's fantastic to see my old design sketches turned into something like,
> actually real.
>
> == Overall feelings ==
>
> Honestly, I have somewhat mixed feelings ExceptionGroups. I don't see any
> way around adding ExceptionGroups in some form, because it's just a fact of
> life that in a concurrent program, multiple things can go wrong at once,
> and we want Python to be usable for writing concurrent programs. Right now
> the state of the art is "exceptions in background threads/tasks get dropped
> on the floor", and almost anything is better than that. The current PEP is
> definitely better than that. But at the same time, there are a lot of
> compromises needed to retrofit this onto Python's existing system, and the
> current proposal feels like a bunch of awkward hacks with hacks on top.
> That's largely my fault for not thinking of something better, and maybe
> there is nothing better. But I still wish we could come up with something
> more elegant, and I do see why this proposal has made people uncomfortable.
>

People were uncomfortable about three main points:

1. In the first draft "except Exception" would not catch exception groups,
this was changed in the second draft.
2. In the first draft exception groups were not subclassable, this also
changed in the second draft.
3. Do we really need except*? Why is ExceptionGroup not enough?  I think we
replied, but this question may still be on some people's minds and if so we
should continue discussing it.

Was there anything else?

> For example:
>
> - I'm uncomfortable with how in some contexts we treat EG's as
> placeholders for the contained exceptions, and other places we treat them
> like a single first-class exceptions. (Witness all the feedback about "why
> not just catch the ExceptionGroup and handle it by hand?", and imagine
> writing the docs explaining all the situations where that is or isn't a
> good idea and the pitfalls involved...) If we could somehow pick one and
> stick to it then I think that would make it easier for users to grasp. (My
> gut feeling is that making them pure containers is better, which to me is
> why it makes sense for them to be @final and why I keep hoping we can
> figure out some better way for plain 'except' and EGs to interact.)
>

I'm not sure what you mean by "a placeholder". An exception group is an
exception, and except* is a new syntax that helps you manipulate exception
groups.  I don't think the confusion you mention was due to the fact that
except can catch EGs, but rather due to the fact that the simplest examples
don't show you why except is not good enough, and why we need except*.

Suppose we did as you suggest and made exception group a container which is
not exception. Now suppose that an exception is raised in an except* block.
What is the context of this exception? Do we change the requirement that an
exception's context is always an exception?

How do you raise an exception group if it's not an Exception? Do we go back
to allowing random objects being raised?

>
> - If a function wants to start using concurrency internally, then now
> *all* its exceptions have to get wrapped in EGs and callers have to change
> *all* their exception handling code to use except* or similar. You would
> think this was an internal implementation detail that the caller shouldn't
> have to care about, but instead it forces a major change on the function's
> public API. And this is because regular 'except' can't do anything useful
> with EGs.
>

I don't understand this point. If you are using concurrency internally and
don't want to raise EGs externally, then surely you will catch EGs, select
one of the exceptions to raise and throw away all the others like in the
old days when there weren't EGs (which is presumably what your caller is
expecting).  In other words, if you make drastic changes in a function's
implementation, it is your responsibility to ensure that the old API is
preserved.

>
> - We have a special-case hack to keep 'except Exception' working, but it
> has tricky edge cases (Exceptions can still sneak past if they're paired up
> with a BaseException), and it really is specific to 'except Exception'; it
> doesn't work for any other 'except SomeError' code. This smells funny.
>

I agree with this point, my personal preference would be to have
ExceptionGroup(BaseExceptionGroup) as in the first draft, with a runaway
exception group being something you should fix. The current choice is a
compromise for backwards compatibility, not so much with the language but
with the practice of using "except Exception" to "catch almost everything,
log it and move on".  I don't think we can go against that at this point.

>
> Anyway, that's just abstract context to give an idea where I'm coming
> from. Maybe we just have to accept these trade-offs, but if anyone has any
> ideas, speak up...
>
> == Most important comment ==
>
> Flat ExceptionGroups: there were two basic design approaches we discussed
> last year, which I'll call "flat" vs "nested". The current PEP uses the
> nested design, where ExceptionGroups form a tree, and traceback information
> is distributed in pieces over this tree. This is the source of a lot of the
> complexity in the current PEP: for example, it's why EG's don't have one
> obvious iteration semantics, and it's why once an exception is wrapped in
> an EG, it can never be unwrapped again (because it would lose traceback
> information).
>
> The idea of the "flat" design is to instead store all the traceback info
> directly on the leaf exceptions, so the EG itself can be just a pure
> container holding a list of exceptions, that's it, with no nesting. The
> downside is that it requires changes to the interpreter's code for updating
> __traceback__ attributes, which is currently hard-coded to only update one
> __traceback__ at a time.
>
> For a third-party library like Trio, changing the interpreter is obviously
> impossible, so we never considered it seriously. But in a PEP, changing the
> interpreter is possible. And now I'm worried that we ruled out a better
> solution early on for reasons that no longer apply. The more I think about
> it, the more I suspect that flat EGs would end up being substantially
> simpler all around? So I think we should at least think through what that
> would look like (and Irit, I'd love your thoughts here now that you're the
> expert on the CPython details!), and document an explicit decision one way
> or another. (Maybe we should do a call or something to go over the details?
> I'm trying to keep this email from ballooning out of control...)
>

A Flat EG is in essence a "denormalized EG" - the tracebacks don't share
parts, each leaf exception has its own linked list of frames as its
traceback. It is easy enough to write a denormalize() function in
traceback.py that constructs this from the current EG structure, if you
need it (use the leaf_generator from the PEP). I'm not sure I see why we
should trouble the interpreter with this.

For display purposes, it is probably nicer to look at a normalized
traceback where common parts are not repeated.

>
> == Smaller points ==
>
> - In my original proposal, EGs didn't just hold a list of exceptions, but
> also a list of "origins" for each exception. The idea being that if, say,
> you attempt to connect to a host with an IPv4 address and an IPv6 address,
> and they raised two different OSErrors that got bundled together into one
> EG, then it would be nice to know which OSError came from which attempt. Or
> in asyncio/trio it would be nice if tracebacks could show which task each
> exception came from. It seems like this got dropped at some point?
>

I don't remember seeing this, so it wasn't deliberately dropped.   It's not
that we copied MultiError, or your sketches for MultiError2, and then
modified them into what is now in the PEP. We learned from your experience,
particularly the MutliError2 document which spells out the aspects of
MultiError that didn't work well. But we designed ExceptionGroup pretty
much from first principles (and then redesigned it several times as we
learned our own lessons) . So you shouldn't assume that anything we didn't
take from MultiError or the MultiError2 sketch was deliberately dropped. We
may have just not paid attention to all of the details.

>
>   On further consideration, I think this might be better handled as a
> special kind of traceback entry that we can attach to each exception, that
> just holds some arbitrary text that's inserted into the traceback at the
> appropriate place? But either way, I think it would be good to be able to
> attach this kind of information to tracebacks somehow.
>

It sounds like you want some way to enrich exceptions. This should be
optional (we wouldn't want EG to require additional metadata for
exceptions) so yeah, I agree it should sit on the leaf exception and not on
the group. In that sense it's orthogonal to this PEP.

>
> - Recording pre-empted exceptions: This is another type of metadata that
> would be useful to print along with the traceback. It's non-obvious and a
> bit hard to explain, but multiple trio users have complained about this, so
> I assume it will bite asyncio users too as soon as TaskGroups are added.
> The situation is, you have a parent task P and two child tasks C1 and C2:
>
>         P
>        / \
>       C1  C2
>
>   C1 terminates with an unhandled exception E1, so in order to continue
> unwinding, the nursery/taskgroup in P cancels C2. But, C2 was itself in the
> middle of unwinding another, different exception E2 (so e.g. the
> cancellation arrived during a `finally` block). E2 gets replaced with a
> `Cancelled` exception whose __context__=E2, and that exception unwinds out
> of C2 and the nursery/taskgroup in P catches the `Cancelled` and discards
> it, then re-raises E1 so it can continue unwinding.
>
>   The problem here is that E2 gets "lost" -- there's no record of it in
> the final output. Basically E1 replaced it. And that can be bad: for
> example, if the two children are interacting with each other, then E2 might
> be the actual error that broke the program, and E1 is some exception
> complaining that the connection to C2 was lost. If you have two exceptions
> that are triggered from the same underlying event, it's a race which one
> survives.
>
>   This is conceptually similar to the way an exception in an 'except'
> block used to cause exceptions to be lost, so we added __context__ to avoid
> that. And just like for __context__, it would be nice if we could attach
> some info to E1 recording that E2 had happened and then got preempted. But
> I don't see how we can reuse __context__ itself for this, because it's a
> somewhat different relationship: __context__ means that an exception
> happened in the handler for another exception, while in this case you might
> have multiple preempted exceptions, and they're associated with particular
> points in the stack trace where the preemption occurred.
>
>   This is a complex issue and maybe we should call it out-of-scope for the
> first version of ExceptionGroups. But I mention it because it's a second
> place where adding some extra annotations to the traceback info would be
> useful, and maybe we can keep it simple by adding some minimal hooks in the
> core traceback machinery and let libraries like trio/asyncio handle the
> complicated parts?
>

Isn't this something that Trio/asyncio should handle, as in: if you catch
Cancelled that has a __context__ then you need to put that __context__ into
the exception group you are raising so that it's not lost?

>
> - There are a number of places where the Python VM itself catches
> exceptions and has hard-coded handling for certain exception types. For
> example:
>
>   - Unhandled exceptions that reach the top of the main thread generally
> cause a traceback to be printed, but if the exception is SystemExit then
> the interpreter instead exits silently with status exc.args[0].
>
>   - 'for' loops call iter.__next__, and catch StopIteration while allowing
> other exceptions to escape.
>
>   - Generators catch StopIteration from their bodies and replace it with
> RuntimeError (PEP 479)
>
>   With this PEP, it's now possible for the main thread to terminate with
> ExceptionGroup(SystemExit), __next__ to raise
> ExceptionGroup(StopIteration), a generator to raise
> ExceptionGroup(StopIteration), either alone or mixed with other exceptions.
> How should the VM handle these new cases? Should they be using except* or
> except?
>
>   I don't think there's an obvious answer here, and possibly the answer is
> just "don't do that". But I feel like the PEP should say what the language
> semantics are in these cases, one way or another.
>

I agree that there are some edge cases in the interpreter that we will need
to make a decision about.  I would imagine an ExceptionGroup(StopIteration)
is a bug of some sort, so we would probably want to not handle that as a
StopIteration - just let the program fail and make the error obvious.
ExceptionGroup(SystemExit) -- I don't have an opinion, I think it's the
application's responsibility to unpack it, but we can bikeshed that.

>
> - traceback module API changes: The PEP notes that traceback.print_tb and
> traceback.print_exception will be updated to handle ExceptionGroups. The
> traceback module also has some special data structures for representing
> "pre-processed" stack traces, via the traceback.StackSummary type. This is
> used to capture tracebacks in a structured way but without holding onto the
> full frame objects. Maybe this API should also be extended somehow so it
> can also represent traceback trees?
>

print_tb and print_exception are thin wrappers around these data
structures, so this is implied but we can make it explicit in the PEP.

Thanks again
Irit

_______________________________________________
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/A7ORJ27IY2CAAVPLC3JLEWZADP5QUASR/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-Dev] Re: PEP 654: Exception Groups and except* [REPOST]

Reply via email to