[issue38559] async generators aclose() behavior in 3.8

2019-10-23 Thread Yury Selivanov


Yury Selivanov  added the comment:

The more I think about this the more I like new 3.8 behavior. I'm going to 
close this issue; if anything comes up I'll reopen it later. Sorry for the 
noise.

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38559] async generators aclose() behavior in 3.8

2019-10-22 Thread Yury Selivanov


Yury Selivanov  added the comment:

> So I think asyncio.run is actually totally fine with the 3.8.0 behavior.

Yes.  Looks like it.

> It's only explicit calls to shutdown_asyncgens that might run into this, and 
> I think that's probably OK?

Yes.  Calling shutdown_asyncgens after all tasks are cancelled is still useful 
to cleanup asynchronous generators that were created but not yet GCed or 
iterated.

> And if we did allow aclose() to run at any time, then I worry that that could 
> cause serious problems. Users can call aclose() at will, and it's generally 
> good practice to always call aclose() on your async generators explicitly. So 
> it's entirely possible that users will accidentally call aclose() themselves 
> while they have another task is blocked in __anext__. And in that case 
> what do we do?

I agree.  Avoiding that kind of mess was the main motivation to fix ag_running.

> So... this is super subtle and confusing, but I *think* the conclusion is 
> that yeah, 3.8.0 is fine and there's no urgent action needed.

Let's see if anyone else thinks this might be a problem, but yeah, it seems 
that the only programs that can suffer from this change are those that don't 
use 'asyncio.run'.  And if so they should be fixed to explicitly cancel all 
tasks before calling shutdown_asyncgens.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38559] async generators aclose() behavior in 3.8

2019-10-22 Thread Nathaniel Smith

Nathaniel Smith  added the comment:

I think conceptually, cancelling all tasks and waiting for them to exit is the 
right thing to do. That way you run as much shutdown code as possible in 
regular context, and also avoid this problem – if there are no tasks, then 
there can't be any tasks blocked in __anext__/asend/athrow/aclose, so calling 
aclose is safe.

This is also the right way to handle it from Trio's perspective – we always 
wait for all tasks to exit before exiting our run loop. For asyncio it might be 
a bit more complicated though so we should think through the trade-offs.

> The only way (at least known to me) that that could be the case if some tasks 
> ignores cancellation.  

Looking at runners.py, it seems like asyncio.run doesn't just call 
task.cancel(), but actually waits for all tasks to exit. (See the call to 
gather() inside _cancel_all_tasks.) So I think that does guarantee that by the 
time we hit shutdown_asyncgens(), we can be certain that all tasks have exited. 
(If a task ignores cancellation, then that will cause _cancel_all_tasks to 
hang, so we never reach shutdown_asyncgens at all. Not ideal, but not really 
our problem either – if someone complains we just tell them they need to stop 
ignoring cancellation if they want  their program to shutdown cleanly.)

So I think asyncio.run is actually totally fine with the 3.8.0 behavior. It's 
only explicit calls to shutdown_asyncgens that might run into this, and I think 
that's probably OK? If you're calling shutdown_asyncgens by hand then you're 
kind of taking responsibility for dealing with any subtle issues around 
shutdown.

And if we did allow aclose() to run at any time, then I worry that that could 
cause serious problems. Users can call aclose() at will, and it's generally 
good practice to always call aclose() on your async generators explicitly. So 
it's entirely possible that users will accidentally call aclose() themselves 
while they have another task is blocked in __anext__. And in that case what 
do we do?

The aclose call has to consume the generator's frame, by definition. But in the 
mean time, we still have this __anext__ coroutine object hanging around 
pointing to the consumed frame, that the task scheduler thinks needs to be 
resumed at some point in response to some event, and that can cause all kinds 
of weird situations. The __anext__ might try to resume while the aclose() is 
running. The __anext__ might get stuck and never resume, because of the code 
run by aclose() accidentally unregistering the event that was going to wake it 
up, so you just get a deadlock instead of a useful error. It might cause some 
other code entirely to crash – like if the __anext__ was blocked inside a call 
to 'await queue.get()', then aclose() interrupting that might corrupt the 
queue's internal state, so the next time some other task calls queue.put() they 
get an exception or other incorrect behavior. It all seems really tricky and 
messy to me.

So... this is super subtle and confusing, but I *think* the conclusion is that 
yeah, 3.8.0 is fine and there's no urgent action needed.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38559] async generators aclose() behavior in 3.8

2019-10-22 Thread Yury Selivanov


Yury Selivanov  added the comment:

Just discussed this issue off-list with Nathaniel.  Maybe this isn't as severe 
as I thought it is.

If we cancel all asyncio tasks before calling `loop.shutdown_asyncgens()` this 
*probably* becomes a non-issue.  And "asyncio.run()" does just that [1].

Any asynchronous generator ultimately needs a task to iterate it, so if we 
cancel tasks, we should cancel all asynchronous generators too.  Calling 
`loop.shutdown_asyncgens()` then ensures that there are no orphan async 
generators.  The only way (at least known to me) that that could be the case if 
some tasks ignores cancellation.  

Nathaniel, what do you think about this in the context of Trio?

If anything, we can update asyncio docs explaining the new behavior of async 
generators and how to use "loop.shutdown_asyncgens()".

[1] 
https://github.com/python/cpython/blob/91528f40c30717563a478920861c81d18da5bf63/Lib/asyncio/runners.py#L46

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38559] async generators aclose() behavior in 3.8

2019-10-22 Thread Yury Selivanov


New submission from Yury Selivanov :

I believe I might have discovered a problem with asynchronous generators in 3.8.


# Prelude

In Python prior to 3.8 it was possible to overlap running of "asend()" and 
"athrow()" methods for the same asynchronous generator.

In plain English, it was possible to await on "agen.asend()" while some other 
coroutine is already awaiting on "agen.asend()".  That created all kinds of 
problems with asynchronous generators when people used them to implement 
channels by trying to push values into the same asynchronous generator from 
different coroutines.  Regular code that used asynchronous generators in plain 
and non-sophisticated way did not care.

For classic synchronous generators we have a check for preventing overlapping 
use of "send()" and "throw()" -- the "gi_running" flag.  If the flag is set, 
both methods raise a RuntimeError saying that "generator already executing".


# Python 3.8

As was discussed in issues #30773 and #32526 we decided to replicate the same 
behavior for asynchronous generators, mainly:

* add an "ag_running" flag;

* set "ag_running" when "anext()" or "athrow()" begin to rung, and set it off 
when they are finished executing;

* if the flag is set when we are about to run "anext()" or "athrow()" it means 
that another coroutine is reusing the same generator object in parallel and so 
we raise a RuntimeError.


# Problem

Closing a generator involves throwing a GeneratorExit exception into it.  
Throwing the exception is done via calling "throw()" for sync generators, and 
"athrow()" for async generators.

As shown in https://gist.github.com/1st1/d9860cbf6fe2e5d243e695809aea674c, it's 
an error to close a synchronous generator while it is being iterated.  This is 
how async generators *started to behave* in 3.8.

The problem is that asyncio's "loop.shutdown_asyncgens()" method tries to 
shutdown orphan asynchronous generators by calling "aclose()" on them.  The 
method is public API and is called by "asyncio.run()" automatically.

Prior to 3.8, calling "aclose()" worked (maybe not in the most clean way). A 
GeneratorExit was thrown into an asynchronous generator regardless of whether 
it was running or not, aborting the execution.

In 3.8, calling "aclose()" can crash with a RuntimeError.  It's no longer 
possible to *reliably cancel* a running asynchrounous generator.


# Dilemma

Replicating the behavior of synchronous generators in asynchronous generators 
seems like the right thing.  But it seems that the requirements for 
asynchronous generators are different, and 3.8 breaks backwards compat.


# Proposed Solution

We keep "asend()" and "athrow()" as is in 3.8.  They will continue to raise 
RuntimeError if used in parallel on the same async generator.

We modify "aclose()" to allow it being called in parallel with "asend()" or 
"athrow()".  This will restore the <3.8 behavior and fix the 
"loop.shutdown_asyncgens()" method.


Thoughts?

--
components: Interpreter Core
messages: 355158
nosy: asvetlov, gvanrossum, lukasz.langa, ncoghlan, njs, yselivanov
priority: normal
severity: normal
status: open
title: async generators aclose() behavior in 3.8
type: behavior
versions: Python 3.8, Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com