Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Yury Selivanov

> On Jul 6, 2016, at 9:44 PM, Nathaniel Smith  wrote:
> 
> On Wed, Jul 6, 2016 at 6:17 PM, Yury Selivanov  wrote:
>> 
>>> ...does it actually work to re-enter a main loop from inside a __del__
>>> callback? It seems like you could get into really nasty states with
>>> multiple nested __del__ calls, or if a single sweep detects multiple
>>> pieces of garbage with __del__ methods, then some of those __del__
>>> calls could be delayed indefinitely while the first __del__ runs. Is
>>> the cycle collector even re-entrant?
>> 
>> We can have a flag on the async gen object to make sure that we run the 
>> finalizer only once. The finalizer will likely schedule async_gen.aclose() 
>> coroutine which will ensure a strong ref to the gen until it is closed. This 
>> can actually work.. ;)
> 
> Hmm, if the strategy is to schedule the work to happen outside of the
> actual __del__ call, then I think this is back to assuming that all
> coroutine runners are immortal and always running. Is that an
> assumption you're comfortable with?

No need to require coroutine runners to be immortal.

If a finalizer finds out that its event loop is closed, it does nothing.
The interpreter will then issue a ResourceWarning if the generator wasn’t
properly closed.  This is how a finalizer for an asyncio event loop might
look:

def _finalize_gen(self, gen):
    if not self.is_closed():
        self.create_task(gen.aclose())

And this is how an asyncio loop might set it up:

class AsyncioEventLoop:
    def run_forever(self):
        ...
        old_finalizer = sys.get_async_generator_finalizer()
        sys.set_async_generator_finalizer(self._finalize_gen)
        try:
            while True:
                self._run_once()
                ...
        finally:
            ...
            sys.set_async_generator_finalizer(old_finalizer)


Why do I think that it’s OK that some async generators might end up 
not being properly closed?  Because this can already happen to 
coroutines:

async def coro1():
    try:
        print('try')
        await asyncio.sleep(1)
    finally:
        await asyncio.sleep(0)
        print('finally')

async def coro2():
    await asyncio.sleep(0)

loop = asyncio.get_event_loop()
loop.create_task(coro1())
loop.run_until_complete(coro2())

In other words, an event loop might stop or be interrupted, and there
is no way to guarantee that all coroutines will be finalized properly
in that case.

To address Glyph’s point about many event loops in one process:
a finalizer set with set_async_generator_finalizer() (which should
be thread specific, same as set_coroutine_wrapper) can actually be 
assigned to the async generator when it’s instantiated.
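A rough sketch of how that per-thread capture at instantiation could look. Every name below is an invented stand-in for the proposed sys hooks, not a real API:

```python
import threading

# Hypothetical stand-ins for the proposed sys.set_async_generator_finalizer()
# hooks; all names here are assumptions.  The point being illustrated: the
# finalizer is captured per-thread at *instantiation* time, so each generator
# remembers the loop that was running when it was created.
_local = threading.local()

def set_async_generator_finalizer(finalizer):
    old = getattr(_local, "finalizer", None)
    _local.finalizer = finalizer
    return old

def get_async_generator_finalizer():
    return getattr(_local, "finalizer", None)

class AsyncGenLike:
    """Toy async-generator stand-in that snapshots the thread's finalizer."""

    def __init__(self):
        self._finalizer = get_async_generator_finalizer()
        self._closed = False

    def __del__(self):
        # If never properly closed, hand ourselves to the captured finalizer
        # (which would typically schedule self.aclose() on its loop).
        if not self._closed and self._finalizer is not None:
            self._finalizer(self)
```

With this shape, two loops in two threads each install their own finalizer, and a generator is always cleaned up by the loop that created it.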

Yury
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/

Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Nathaniel Smith
On Wed, Jul 6, 2016 at 6:17 PM, Yury Selivanov  wrote:
>
>> ...does it actually work to re-enter a main loop from inside a __del__
>> callback? It seems like you could get into really nasty states with
>> multiple nested __del__ calls, or if a single sweep detects multiple
>> pieces of garbage with __del__ methods, then some of those __del__
>> calls could be delayed indefinitely while the first __del__ runs. Is
>> the cycle collector even re-entrant?
>
> We can have a flag on the async gen object to make sure that we run the 
> finalizer only once. The finalizer will likely schedule async_gen.aclose() 
> coroutine which will ensure a strong ref to the gen until it is closed. This 
> can actually work.. ;)

Hmm, if the strategy is to schedule the work to happen outside of the
actual __del__ call, then I think this is back to assuming that all
coroutine runners are immortal and always running. Is that an
assumption you're comfortable with?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Yury Selivanov

> ...does it actually work to re-enter a main loop from inside a __del__
> callback? It seems like you could get into really nasty states with
> multiple nested __del__ calls, or if a single sweep detects multiple
> pieces of garbage with __del__ methods, then some of those __del__
> calls could be delayed indefinitely while the first __del__ runs. Is
> the cycle collector even re-entrant?

We can have a flag on the async gen object to make sure that we run the 
finalizer only once. The finalizer will likely schedule async_gen.aclose() 
coroutine which will ensure a strong ref to the gen until it is closed. This 
can actually work.. ;)

Yury


Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Nathaniel Smith
On Wed, Jul 6, 2016 at 5:47 PM, Glyph Lefkowitz  wrote:
>
> On Jul 6, 2016, at 5:25 PM, Yury Selivanov  wrote:
>
> The problem is that the GC can’t execute async code, and we don’t have any
> control over GC.  What if we add a mechanism to control how async generators
> (AG) are destructed.  Let’s say we add new function to the sys module -
> `sys.set_async_generator_finalizer(finalizer)`.  We already have
> sys.set_coroutine_wrapper(), so this isn’t something unprecedented.
>
>
> There isn't just one event loop though, and what trampoline to attach a
> dying coroutine to depends heavily on what event loop it came from.  It
> seems like a single global value for this in 'sys' would just be ... wrong.

I guess one could standardize a "get finalizer" protocol for coroutine
runners, where async generators (or other objects with the same
problem, I guess) would do the equivalent of

class AsyncGenerator:
    async def __anext__(self, *args, **kwargs):
        if not hasattr(self, "_finalizer"):
            self._finalizer = (yield GET_FINALIZER_SENTINEL)
        ...

    def __del__(self):
        self._finalizer.run_until_complete(self.aclose())

i.e., we provide some standard mechanism for coroutine code to take a
reference on the coroutine runner when starting up (when they do have
access to it), and then re-enter it for cleanup.

This feels a bit elaborate, though, and produces some pretty
convoluted control flow.

...does it actually work to re-enter a main loop from inside a __del__
callback? It seems like you could get into really nasty states with
multiple nested __del__ calls, or if a single sweep detects multiple
pieces of garbage with __del__ methods, then some of those __del__
calls could be delayed indefinitely while the first __del__ runs. Is
the cycle collector even re-entrant?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Yury Selivanov


> On Jul 6, 2016, at 8:47 PM, Glyph Lefkowitz  wrote:
> 
> 
>> On Jul 6, 2016, at 5:25 PM, Yury Selivanov  wrote:
>> 
>> The problem is that the GC can’t execute async code, and we don’t have any 
>> control over GC.  What if we add a mechanism to control how async generators 
>> (AG) are destructed.  Let’s say we add new function to the sys module - 
>> `sys.set_async_generator_finalizer(finalizer)`.  We already have 
>> sys.set_coroutine_wrapper(), so this isn’t something unprecedented.
> 
> There isn't just one event loop though, and what trampoline to attach a dying 
> coroutine to depends heavily on what event loop it came from.  It seems like 
> a single global value for this in 'sys' would just be ... wrong.

But there can only be one currently running event loop per thread...

Another way is to add sys.set_async_generator_wrapper() (and a getter, so that 
loops can maintain a stack of them).  With it, a running event loop can create a 
weak reference to any generator instantiated in a coroutine that the loop is 
currently running, and later use that weakref to finalize the generator.
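A sketch of that bookkeeping, with invented names (the real mechanism would live inside the loop's machinery):

```python
import weakref

# Sketch of the weakref idea above; the class and method names are
# assumptions, not a real API.  The loop tracks async generators created
# while it runs via weak references, then finalizes whichever are still
# alive at shutdown.
class TrackingLoop:
    def __init__(self):
        self._agens = weakref.WeakSet()

    def track(self, agen):
        # Called when a generator is instantiated under this loop.
        self._agens.add(agen)

    def shutdown_asyncgens(self):
        # Generators that were already collected vanish from the WeakSet
        # on their own; close the survivors explicitly.
        for agen in list(self._agens):
            agen.close()
```

Because the references are weak, the loop never keeps a dead generator alive; it only gets a chance to finalize the ones that still exist.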

Yury

Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Glyph Lefkowitz

> On Jul 6, 2016, at 5:25 PM, Yury Selivanov  wrote:
> 
> The problem is that the GC can’t execute async code, and we don’t have any 
> control over GC.  What if we add a mechanism to control how async generators 
> (AG) are destructed.  Let’s say we add new function to the sys module - 
> `sys.set_async_generator_finalizer(finalizer)`.  We already have 
> sys.set_coroutine_wrapper(), so this isn’t something unprecedented.

There isn't just one event loop though, and what trampoline to attach a dying 
coroutine to depends heavily on what event loop it came from.  It seems like a 
single global value for this in 'sys' would just be ... wrong.

-glyph


Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Yury Selivanov

> On Jul 6, 2016, at 7:06 PM, Nathaniel Smith  wrote:
> 
> On Wed, Jul 6, 2016 at 1:12 PM, Yury Selivanov  wrote:
>> This is an interesting idea, but I wonder if instead of using ‘async with’ 
>> we can actually augment ‘async for’ to do the async finalization.
>> 
>> We can add an __aiter_close__ special method which will return an awaitable. 
>>  In this case, ‘async for’ can always look for that method and call it at 
>> the end of the iteration.  Async generators will implement the method to 
>> make sure that ‘finally’ is always executed (any number of awaits in 
>> ‘finally’ is OK; ‘yield’ expressions cannot be used).
> 
> I was wondering about that too. This is a fairly substantial change to
> how iterators work, though -- currently, it's totally legal and
> sometimes very useful to do things like
> 
> it = open("...")
> # Discard header line (first non-commented line)
> for line in it:
>     if not line.startswith("#"):
>         break
> for body_line in it:
>     ...

Right.  __aiter_close__ won’t work.

The problem is that the GC can’t execute async code, and we don’t have any 
control over GC.  What if we add a mechanism to control how async generators 
(AG) are destructed.  Let’s say we add new function to the sys module - 
`sys.set_async_generator_finalizer(finalizer)`.  We already have 
sys.set_coroutine_wrapper(), so this isn’t something unprecedented.

With this new function, when an AG is about to be finalized, the interpreter 
will resurrect it and call the `finalizer`.  The finalizer function will be 
installed by an event loop, and will execute ‘await AG.aclose()’ in the context 
of that loop.

We can issue a ResourceWarning if an AG (not fully exhausted) is GCed and there 
is no async finalizer.

Yury

Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Nathaniel Smith
On Wed, Jul 6, 2016 at 1:12 PM, Yury Selivanov  wrote:
> This is an interesting idea, but I wonder if instead of using ‘async with’ we 
> can actually augment ‘async for’ to do the async finalization.
>
> We can add an __aiter_close__ special method which will return an awaitable.  
> In this case, ‘async for’ can always look for that method and call it at the 
> end of the iteration.  Async generators will implement the method to make 
> sure that ‘finally’ is always executed (any number of awaits in ‘finally’ is 
> OK; ‘yield’ expressions cannot be used).

I was wondering about that too. This is a fairly substantial change to
how iterators work, though -- currently, it's totally legal and
sometimes very useful to do things like

it = open("...")
# Discard header line (first non-commented line)
for line in it:
    if not line.startswith("#"):
        break
for body_line in it:
    ...

or nested for loops like:

for section_header_line in it:
    section_info = parse_section_header_line(section_header_line)
    if section_info.type == "multi-line":
        for body_line in it:
            if body_line == "-- end of section --":
                break
    else:
        ...

I guess there are a few different options in this design space -- one
could have a dedicated for+with syntax ("async forthwith"?), or an
opt-out utility like

for section_header_line in it:
    for body_line in protect_from_close(it):
        ...

where protect_from_close returns a wrapper object that intercepts
__iter_close__ and ignores it, while passing through other method
calls (__next__/send/throw).
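A minimal sketch of such a wrapper, assuming the hypothetical __iter_close__ protocol sketched above (plain __next__ delegation shown; send/throw passthrough omitted for brevity):

```python
class protect_from_close:
    """Wrap an iterator so that a (hypothetical) __iter_close__ call from an
    inner for loop is swallowed, while iteration is delegated unchanged.
    The outer loop keeps ownership of the underlying iterator."""

    def __init__(self, it):
        self._it = it

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __iter_close__(self):
        # Intentionally ignore: closing is the outer loop's job.
        pass
```

Even today (where no loop calls __iter_close__), the wrapper is transparent: breaking out of an inner loop over the wrapper leaves the underlying iterator positioned for the outer loop to continue.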

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Герасимов Михаил
I tried to implement asynchronous generators on top of asyncio recently; 
you can see the result here:

https://github.com/germn/aiogen

I also ran into the cleanup problem.

The first question is where to call `.close` (or the async `.aclose` for an 
AG, since it raises an exception inside an async function). A regular 
generator calls it inside `__del__`, but as I understand it there's no 
guarantee the AG's event loop won't already be closed at that moment. I 
think an AG can be closed at the moment its parent task is done. Since that 
is a non-async callback, we can start a task to call `.aclose`:

https://github.com/germn/aiogen/blob/master/aiogen/agenerator.py#L35

Nothing awaits the cleanup task, so we need to make sure it finishes before 
the event loop is closed. I found nothing better than decorating the event 
loop's `.close` method:

https://github.com/germn/aiogen/blob/master/aiogen/agenerator.py#L52

None of this looks like an ideal solution, but it works.
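The decorate-`.close` trick can be sketched like this (the helper name is mine, not aiogen's actual code):

```python
import asyncio
import functools

# Sketch of the "decorate loop.close" approach described above: run any
# still-pending cleanup tasks to completion before the loop really closes,
# so un-awaited aclose() tasks still get to finish.
def ensure_cleanup_before_close(loop, cleanup_tasks):
    original_close = loop.close

    @functools.wraps(original_close)
    def close():
        for task in cleanup_tasks:
            if not task.done():
                loop.run_until_complete(task)
        original_close()

    loop.close = close
```

The obvious downside, as noted, is that it monkey-patches the loop; but it does guarantee the cleanup coroutines are not silently dropped.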

Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Yury Selivanov

> On Jul 6, 2016, at 3:54 PM, David Beazley  wrote:
> 
> 
>> However, as far as I know curio doesn’t have the ability to schedule an 
>> operation in a synchronous manner by means of something like a Future. Is 
>> that correct? If there is no way in curio to spawn a task to occur later 
>> without having to await on it, then clearly there is no option but to allow 
>> coroutines in __aexit__ and finally: how else could curio operate?
>> 
> 
> Yes, Curio does not utilize Futures or callback functions for that matter.   
> However, I think the real issue at hand might be much more subtle and weird.
> 
> If I understand things correctly, the goal is to implement an asynchronous 
> generator for feeding asynchronous iteration.  This is a new kind of 
> generator that runs inside of a coroutine (so, imagine a coroutine inside of 
> another coroutine).   Without seeing much more, I'd guess it would look 
> something roughly akin to this:
> 
> async def agen():
>   ... some sort of async generator ...
>   async yield value #  Syntax 

In my current WIP branch I just use ‘yield’ inside ‘async def’.  That’s what I 
was going to propose in the PEP.

> 
> async def main():
>   async for x in agen():
>  ...
> 
> There's some kind of underlying protocol driving the async iteration, but 
> it's *different* than what is being used by the enclosing coroutines.   Yes, 
> there is a scheduler kernel (or event loop) that makes the outer coroutines 
> run, but that scheduler is not driving the underlying iteration protocol of 
> the async generator part.   So, things get weird when stuff like this happens:
> 
> async def main():
>   async for x in agen():
>   if x == STOP:
> break

Good catch.

[..]

> Since forgetting that last close() step would be easy, naturally an async 
> generator should support the asynchronous context-management protocol.
> 
> async def main():
>   async with agen() as items:
>  async for x in items:
>  if x == STOP:
>break

This is an interesting idea, but I wonder if instead of using ‘async with’ we 
can actually augment ‘async for’ to do the async finalization.

We can add an __aiter_close__ special method which will return an awaitable.  
In this case, ‘async for’ can always look for that method and call it at the 
end of the iteration.  Async generators will implement the method to make sure 
that ‘finally’ is always executed (any number of awaits in ‘finally’ is OK; 
‘yield’ expressions cannot be used).

Yury


Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread David Beazley

> However, as far as I know curio doesn’t have the ability to schedule an 
> operation in a synchronous manner by means of something like a Future. Is 
> that correct? If there is no way in curio to spawn a task to occur later 
> without having to await on it, then clearly there is no option but to allow 
> coroutines in __aexit__ and finally: how else could curio operate?
> 

Yes, Curio does not utilize Futures or callback functions for that matter.   
However, I think the real issue at hand might be much more subtle and weird.

If I understand things correctly, the goal is to implement an asynchronous 
generator for feeding asynchronous iteration.  This is a new kind of generator 
that runs inside of a coroutine (so, imagine a coroutine inside of another 
coroutine).   Without seeing much more, I'd guess it would look something 
roughly akin to this:

async def agen():
    ... some sort of async generator ...
    async yield value  #  Syntax

async def main():
    async for x in agen():
        ...

There's some kind of underlying protocol driving the async iteration, but it's 
*different* than what is being used by the enclosing coroutines.   Yes, there 
is a scheduler kernel (or event loop) that makes the outer coroutines run, but 
that scheduler is not driving the underlying iteration protocol of the async 
generator part.   So, things get weird when stuff like this happens:

async def main():
    async for x in agen():
        if x == STOP:
            break

Here, the asynchronous generator makes it to a point, but never to completion 
because of the break statement.  Instead, it gets garbage collected and all 
hell breaks loose back in the agen() function because what happens now?  
Especially if agen() uses finally or an async context manager:

async def agen():
    async with whatever:
        ...

Assuming that this is getting to the heart of the issue, I spent some time 
pondering it this morning and almost wonder if it could be solved by 
"underthinking" the solution so to speak.  For example, perhaps the __del__() 
method of an async-generator could just raise a RuntimeError if it's ever 
garbage collected before being properly terminated. Maybe you give asynchronous 
generators an async-close method to explicitly shut it down.  So, you'd have to 
do this.

async def main():
    items = agen()
    async for x in items:
        if x == STOP:
            break
    await items.close()

Maybe the async-close() method would merely raise AsyncGeneratorExit in the 
generator and not enforce any other kind of semantics other than having it 
continue to run as a coroutine as it did before (with the understanding that 
the purpose of the exception is to terminate eventually). 

Since forgetting that last close() step would be easy, naturally an async 
generator should support the asynchronous context-management protocol.

async def main():
    async with agen() as items:
        async for x in items:
            if x == STOP:
                break
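That pattern can be factored into a small reusable helper; a minimal sketch (Python 3.10 later standardized the same shape as contextlib.aclosing):

```python
import asyncio

# Minimal async context manager guaranteeing aclose() runs whether the
# loop body breaks out early or not.  Sketch only; mirrors the pattern in
# the example above.
class aclosing:
    def __init__(self, agen):
        self._agen = agen

    async def __aenter__(self):
        return self._agen

    async def __aexit__(self, *exc_info):
        # Always executed, so the generator's finally block gets to run.
        await self._agen.aclose()
```

Used as `async with aclosing(agen()) as items:`, it makes the explicit shutdown impossible to forget.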

Perhaps the only thing missing at this point is a metaclass---or possibly a 
codec.  I'm not sure. 

Yikes

Cheers,
Dave


Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Nathaniel Smith
On Wed, Jul 6, 2016 at 6:42 AM, Cory Benfield  wrote:
>
>> On 6 Jul 2016, at 13:09, David Beazley  wrote:
>>
>> Curio uses asynchronous context managers for much more than closing sockets 
>> (which frankly is the least interesting thing).   For example, they're used 
>> extensively with synchronization primitives such as Locks, Semaphores, 
>> Events, Queues, and other such things.   The ability to use coroutines in 
>> the __aexit__() method is an essential part of these primitives because it 
>> allows task scheduling decisions to be made in conjunction with 
>> synchronization events such as lock releases.   For example, you can 
>> implement fair-locking or various forms of priority scheduling.   Curio also 
>> uses asynchronous context managers for timeouts and other related 
>> functionality where coroutines have to be used in __aexit__.  I would expect 
>> coroutines in __aexit__ to also be useful in more advanced contexts such as 
>> working with databases, dealing with transactions, and other kinds of 
>> processing where asynchronous I/O might be involved.
>
> For my own edification, Dave, do you mind if I dive into this a little bit? 
> My suspicion is that this problem is rather unique to curio (or, at least, 
> does not so strongly effect event loop implementations), and I’d just like to 
> get a handle on it. It’s important to me that we don’t blow away curio, so 
> where it differs from event loops I’d like to understand what it’s doing.

The relevant difference between curio and asyncio here is that in
asyncio, there are two different mechanisms for accessing the event
loop: for some operations, you access it through its coroutine runner
interface using 'await', and for other operations, you get a direct
reference to it through some side-channel (loop= arguments, global
lookups) and then make direct method calls. To make the latter work,
asyncio code generally has to be written in a way that makes sure
loop= objects are always passed throughout the whole call stack, and
always stay in sync [no pun intended] with the coroutine runner
object. This has a number of downsides, but one upside is that it
means that the loop object is available from __del__, where the
coroutine runner isn't.

Curio takes the other approach, of standardizing on 'await' as the
single blessed mechanism for accessing the event loop (or kernel or
whatever you want to call it). So this eliminates all the tiresome
loop= tracking and the potential for out-of-sync bugs, but it means
you can't do *anything* event-loop-related from __del__.

However, I'm not convinced that asyncio really has an advantage here.
Imagine some code that uses asyncio internally, but exposes a
synchronous wrapper to the outside (e.g. as proposed here ;-) [1]):

def synchronous_wrapper(...):
    # Use a new loop since we might not be in the same thread as the global loop
    loop = asyncio.new_event_loop()
    return loop.run_until_complete(work_asynchronously(..., loop=loop))

async def work_asynchronously(..., loop=loop):
    stream = await get_asyncio_stream_writer(..., loop=loop)
    stream.write(b"hello")

stream.write(...) queues some data to be sent, but doesn't send it.
Then stream falls out of scope, which triggers a call to
stream.__del__, which calls stream.close(), which does some calls on
loop requesting that it flush the buffer and then close the underlying
socket. So far so good.

...but then immediately after this, the loop itself falls out of
scope, and you lose. AFAICT from a quick skim of the asyncio code, the
data will not be sent and the socket will not be closed (depending on
kernel buffering etc.).

(And even if you wrap everything with proper 'with' blocks, this is
still true... asyncio event loops don't seem to have any API for
"complete all work and then shut down"? Maybe I'm just missing it --
if not then this is possibly a rather serious bug in actual
currently-existing asyncio. But for the present purposes, the point is
that you really do need something like 'async with' around everything
here to force the I/O to complete before handing things over to the gc
-- you can't rely on the gc to do your I/O.)
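The point can be illustrated without real sockets. Here is a sketch with a fake writer (not asyncio's actual StreamWriter) showing the synchronous wrapper flushing explicitly, inside the loop, before the loop dies:

```python
import asyncio

# Fake writer standing in for a buffered stream: write() only queues data,
# drain() is what actually pushes it out.  Illustrative only.
class FakeWriter:
    def __init__(self):
        self.buffer = b""
        self.sent = b""
        self.closed = False

    def write(self, data):
        self.buffer += data          # queued, not yet "sent"

    async def drain(self):
        await asyncio.sleep(0)       # yield to the loop
        self.sent += self.buffer     # pretend the kernel accepted it
        self.buffer = b""

    def close(self):
        self.closed = True

def synchronous_send(writer):
    loop = asyncio.new_event_loop()
    try:
        async def work():
            writer.write(b"hello")
            await writer.drain()     # explicit flush: don't trust __del__
            writer.close()
        loop.run_until_complete(work())
    finally:
        loop.close()
```

If the flush were left to `__del__`, the loop would be gone by the time the garbage collector ran, and the buffered bytes would simply be dropped.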

-n

[1] https://github.com/kennethreitz/requests/issues/1390#issuecomment-225361421

-- 
Nathaniel J. Smith -- https://vorpus.org

Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Glyph Lefkowitz

> On Jul 5, 2016, at 5:56 PM, Nathaniel Smith  wrote:
> 
> Starting axiom: async functions / async generators need to be prepared
> for the case where they get garbage collected before being run to
> completion, because, well... this is a thing that can happen.

This thread is really confusing, so I'm going to try to just attack this axiom 
(which isn't really axiomatic, it's a conclusion drawn from a few other 
properties of the interpreter and the event loop) and see if it holds :-).

Yury already pointed out that coroutines run under a scheduler that keeps 
strong references to them.  Backing that up a little bit - by definition, if 
your coroutine is not strongly referenced by a scheduler, it can't be doing 
anything interesting; nobody's going to call .send on it.  This is why 'foo()' 
is a no-op whereas 'await foo()' actually causes something to happen.  
Similarly, this is why you need Task() to do something in parallel, and you 
can't just toss a coroutine out into the void.

Furthermore, Task.cancel()'s documentation holds another clue about 
asynchronous cleanup:

"Unlike Future.cancel(), this does not guarantee that the task will be 
cancelled: the exception might be caught and acted upon, delaying cancellation 
of the task or preventing cancellation completely".

Deferred.cancel() behaves in much the same way, for the same reason.  It is 
explicitly allowed that an asynchronous task do asynchronous clean-up.

Now, if you have a task that was scheduled but has come forcibly detached from 
its scheduler, you are in a bit of a mess.  But this is an inherently 
unresolvable mess, for the same reason that Thread.kill() is an inherently 
unresolvable mess: you cannot forcibly terminate a thread of execution and end 
up with your program in a known state.  It's OK to kill a generator (as 
distinct from a 'coroutine' in the sense that it does not expect .send to be 
called on it) from the GC because a generator is just yielding values and since 
it doesn't expect .send it can't expect to keep running, and it can do purely 
synchronous try/finally.  But as soon as you are allowed to invoke asynchronous 
work, you have to be allowed to let that asynchronous work complete.

So: async functions should not and cannot be prepared for the case where they 
get garbage collected; this is not a thing that can happen unless the coroutine 
scheduler is damaged beyond repair and it's time to crash your process.

-glyph

Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Yury Selivanov

> On Jul 5, 2016, at 8:56 PM, Nathaniel Smith  wrote:
[..]
> Starting axiom: async functions / async generators need to be prepared
> for the case where they get garbage collected before being run to
> completion, because, well... this is a thing that can happen.
[..]
> Dismaying conclusion: inside an async function / async generator,
> finally: blocks must never yield (and neither can __aexit__
> callbacks), because they might be invoked during garbage collection.

I agree with David here: coroutines are always running under a scheduler
which, in one way or another, keeps strong references to them.  In curio
it’s a global table of all tasks, in asyncio it’s a chain of references
to callbacks of Tasks/Futures.  The only time a coroutine can be GC’ed 
in asyncio is when the caller did not use ‘await’ on it.

A running coroutine being GC’ed is an exceptional situation, that means 
that there is a bug in your scheduler.  And people actually rely on this
thing.  It’s not so much about __aexit__ returning an awaitable, it’s
about people writing code that awaits in ‘finally’ statements.

That’s why I’m big -1 on changing __aexit__.  If we change __aexit__, 
we should also prohibit awaiting in finally blocks, which is not an 
option.


> For async functions this is... arguably a problem but not super
> urgent, because async functions rarely get garbage collected without
> being run to completion. For async generators it's a much bigger
> problem.

I’m not sure I understand why it’d be a problem for async generators.
Since coroutines shouldn’t ever be GC’ed while running, async generators
generally won’t be GC’ed while running too, because they will have
a strong ref from the running coroutine.

I think to make things simple, we shouldn’t have a ‘close()’
method on async generators at all.  When a running async generator is
GC’ed, we’ll make the interpreter issue a warning.

We might want to add an ‘aclose()’ coroutine method, which will throw 
a GeneratorExit exception (or GeneratorAsyncExit) with the following
semantics:

1. If the running async generator ignores GeneratorAsyncExit and keeps
‘async yielding’, we throw a RuntimeError.

2. If the running async generator receives a GeneratorAsyncExit exception
outside of a try..finally block, the async generator is closed silently.

3. If the running async generator receives a GeneratorAsyncExit exception
inside a ‘finally’ block, it will be able to await on any number of
coroutines inside that block.
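For reference, today's plain generators already implement the synchronous analogue of these three cases via GeneratorExit (GeneratorAsyncExit is hypothetical here):

```python
# Synchronous analogue of the proposed aclose() semantics, using plain
# generators and close()/GeneratorExit.
cleanup_ran = []

def well_behaved():
    try:
        yield 1
        yield 2
    finally:
        cleanup_ran.append(True)   # in the async version this could await

def misbehaved():
    try:
        yield 1
    except GeneratorExit:
        yield 99                   # keeps yielding after being told to exit

g = well_behaved()
next(g)
g.close()                          # cases 2/3: closes silently, finally runs

g2 = misbehaved()
next(g2)
try:
    g2.close()                     # case 1: ignoring the exit is an error
except RuntimeError:
    pass
```

The proposal essentially lifts these exact rules into the async world, with the one addition that the ‘finally’ block may await.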

Thanks,
Yury


Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Cory Benfield

> On 6 Jul 2016, at 13:09, David Beazley  wrote:
> 
> Curio uses asynchronous context managers for much more than closing sockets 
> (which frankly is the least interesting thing).   For example, they're used 
> extensively with synchronization primitives such as Locks, Semaphores, 
> Events, Queues, and other such things.   The ability to use coroutines in the 
> __aexit__() method is an essential part of these primitives because it allows 
> task scheduling decisions to be made in conjunction with synchronization 
> events such as lock releases.   For example, you can implement fair-locking 
> or various forms of priority scheduling.   Curio also uses asynchronous 
> context managers for timeouts and other related functionality where 
> coroutines have to be used in __aexit__.  I would expect coroutines in 
> __aexit__ to also be useful in more advanced contexts such as working with 
> databases, dealing with transactions, and other kinds of processing where 
> asynchronous I/O might be involved.

For my own edification, Dave, do you mind if I dive into this a little bit? My 
suspicion is that this problem is rather unique to curio (or, at least, does 
not so strongly affect event loop implementations), and I’d just like to get a 
handle on it. It’s important to me that we don’t blow away curio, so where it 
differs from event loops I’d like to understand what it’s doing.

In the case of an event loop implementation, all of the above can be 
implemented simply by scheduling a callback which can then do whatever it needs 
to do. For example, scheduling a fair lock can be implemented synchronously, 
either literally or by means of something like asyncio.call_soon(). The only 
advantage I can see for event loops in being able to use a coroutine here is 
that they can suspend execution of the outer block until such time as they 
*know* that the next task has been scheduled into the critical section, which 
while useful does not strike me as *necessary*.
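
To make the point concrete, here is a sketch (my own, not asyncio.Lock’s actual implementation) of a FIFO “fair” lock where release is entirely synchronous: it hands ownership to the oldest waiter by resolving that waiter’s Future, so nothing needs to be awaited at the release point:

```python
import asyncio
import collections

class FairLock:
    """FIFO lock sketch: release() is synchronous and hands the lock
    directly to the oldest waiter via its Future."""

    def __init__(self):
        self._locked = False
        self._waiters = collections.deque()

    async def acquire(self):
        if not self._locked and not self._waiters:
            self._locked = True
            return
        fut = asyncio.get_running_loop().create_future()
        self._waiters.append(fut)
        await fut  # resolved by a release(), in FIFO order

    def release(self):
        if self._waiters:
            # Hand ownership directly to the next waiter; the lock is
            # never free in between, which is what makes it fair.
            self._waiters.popleft().set_result(None)
        else:
            self._locked = False

async def demo():
    lock = FairLock()
    order = []

    async def worker(i):
        await lock.acquire()
        order.append(i)
        await asyncio.sleep(0)  # hold the lock across a suspension
        lock.release()

    await asyncio.gather(*(worker(i) for i in range(3)))
    return order

print(asyncio.run(demo()))  # [0, 1, 2]
```

The point of the sketch is that release() never suspends, so it could equally well be called from a synchronous __exit__ or a finalizer.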

However, as far as I know curio doesn’t have the ability to schedule an 
operation in a synchronous manner by means of something like a Future. Is that 
correct? If there is no way in curio to spawn a task to occur later without 
having to await on it, then clearly there is no option but to allow coroutines 
in __aexit__ and finally: how else could curio operate?

Cory




Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread David Beazley
Cory wrote:

"What’s not entirely clear to me is why we need __aexit__ to *actually* be an 
async function. The example in curio is socket closure, which seems like it 
absolutely does not need to be awaitable."

Curio uses asynchronous context managers for much more than closing sockets 
(which frankly is the least interesting thing).   For example, they're used 
extensively with synchronization primitives such as Locks, Semaphores, Events, 
Queues, and other such things.   The ability to use coroutines in the 
__aexit__() method is an essential part of these primitives because it allows 
task scheduling decisions to be made in conjunction with synchronization events 
such as lock releases.   For example, you can implement fair-locking or various 
forms of priority scheduling.   Curio also uses asynchronous context managers 
for timeouts and other related functionality where coroutines have to be used 
in __aexit__.  I would expect coroutines in __aexit__ to also be useful in more 
advanced contexts such as working with databases, dealing with transactions, 
and other kinds of processing where asynchronous I/O might be involved.
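
The scheduling point Dave describes can be illustrated with a schematic asyncio-based lock (my sketch, not curio’s actual code): because __aexit__ is a coroutine, the release path can itself yield to the scheduler, so the next owner gets to run before the releasing task continues past the with block:

```python
import asyncio

class YieldingLock:
    """Sketch: wraps asyncio.Lock, but gives __aexit__ a real
    suspension point so releasing also yields to the scheduler."""

    def __init__(self):
        self._lock = asyncio.Lock()

    async def __aenter__(self):
        await self._lock.acquire()
        return self

    async def __aexit__(self, *exc_info):
        self._lock.release()
        await asyncio.sleep(0)  # the await only a coroutine __aexit__ allows
        return False

async def demo():
    lock = YieldingLock()
    events = []

    async def task(name):
        async with lock:
            events.append(name + ":in")
        events.append(name + ":out")

    await asyncio.gather(task("a"), task("b"))
    return events

print(asyncio.run(demo()))
```

With the extra await, task "b" enters the lock before task "a" runs the code after its with block; with a purely synchronous release there is no such handoff point.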

As such, I think it's pretty useful to allow coroutines in __aexit__() and 
would be strongly opposed to restricting it.

-Dave


Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Cory Benfield

> On 6 Jul 2016, at 01:56, Nathaniel Smith  wrote:
> 
> There's some more discussion, and a first sketch at conventions we
> might want to use for handling this, here:
> 
>https://github.com/dabeaz/curio/issues/70

This feels like a problem with very few good solutions.

Finalizers (e.g. __del__ and weakref callbacks) obviously cannot await for 
anything (they have no coroutine runner to yield to, and there is currently no 
Python API for handing off to a coroutine runner to execute your finalizer). 
That strongly suggests that all cleanup inside coroutines and async generators 
must be synchronous. This is especially problematic given the existence of 
“async with”, which nominally promises to do asynchronous cleanup.

What’s not entirely clear to me is why we need __aexit__ to *actually* be an 
async function. The example in curio is socket closure, which seems like it 
absolutely does not need to be awaitable. Why can’t close() just tell the 
event loop (or curio kernel) that I’m done with the socket, and have it clean 
the socket up on its own time (again, this is what Twisted does)?

In the worst case you could rule that __aexit__ cannot use coroutines, but of 
course it may continue to use Futures and other fun things. I know that 
callbacks aren’t The Asyncio Way, but they have their uses, and this is 
probably one of them. Allowing Futures/Deferreds allows your __aexit__ to do 
some actual I/O if that’s required (e.g. send an HTTP/2 GOAWAY frame).
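
One way to picture this: since `async with` simply awaits whatever __aexit__ returns, __aexit__ can be a plain (synchronous) function that schedules the real work and hands back a Future. The Connection class and the pretend GOAWAY write below are invented purely for illustration:

```python
import asyncio

class Connection:
    """Sketch: __aexit__ is an ordinary def, so it is safe to call
    from synchronous cleanup paths; it schedules the real I/O and
    returns a Future that 'async with' can still await."""

    def __init__(self, loop):
        self.loop = loop
        self.closed = loop.create_future()

    async def __aenter__(self):
        return self

    def __aexit__(self, *exc_info):  # note: plain def, not async def
        self.loop.call_soon(self._finish_close)
        return self.closed  # 'async with' awaits this Future

    def _finish_close(self):
        # pretend we flushed a GOAWAY frame here
        self.closed.set_result(None)

async def main():
    loop = asyncio.get_running_loop()
    conn = Connection(loop)
    async with conn:
        pass
    return conn.closed.done()

print(asyncio.run(main()))  # True
```

From the caller’s side this is indistinguishable from a coroutine __aexit__, but the cleanup entry point itself never needs a coroutine runner.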

The only reason I can think of that __aexit__ needs to be a coroutine is to 
guarantee that the resource in question is genuinely cleaned up by the time the 
with block is exited. It is not entirely clear to me what the value of this 
guarantee is: does anyone have a good use-case for it that doesn’t seem like it 
violates the spirit of the context manager?

That leaves us with finally, and here I have no good solution except that 
finally inside generators has *always* been a problem. However, in this case, I 
think we’ve got a good analogy. In synchronous generators, if you yield inside 
a finally *and* leak your generator, the interpreter moans at you. I don’t see 
any reason not to treat await/yield from inside finally in exactly the same 
way: if you do it, you eventually explode.
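
The synchronous version of that explosion is easy to reproduce: calling close() by hand does exactly what __del__ would do when the generator is collected:

```python
def leaky():
    try:
        yield 1
    finally:
        yield 2  # a suspension point inside finally: the forbidden case

g = leaky()
next(g)          # advance to the first yield
try:
    g.close()    # throws GeneratorExit in; the finally yields again
except RuntimeError as e:
    print(e)     # generator ignored GeneratorExit
```

The proposal is just that await (or yield from) inside finally in an async function should fail the same way when cleanup can’t run it to completion.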

Basically, there doesn’t seem to be an obvious way to make garbage collection 
of coroutines work while allowing them to be coroutines without having some way 
to register a coroutine runner with the garbage collector. That strikes me as a 
*terrible* idea (e.g. you may have multiple event loops, which requires that 
you register a unique coroutine runner per coroutine). Therefore, the only 
logical thing to do is to have only synchronous functions invoked in cleanup 
methods (e.g. __aexit__ and finally), and if those need to do some form of 
asynchronous I/O they need to use a Future-like construct to actually achieve 
it.

Cory



[Async-sig] Asynchronous cleanup is a problem

2016-07-05 Thread Nathaniel Smith
So here's an interesting issue I just discovered while experimenting
with async generators. It caught me by surprise, anyway. Maybe this
was already obvious to everyone else. But I wanted to get some more
perspectives.

Starting axiom: async functions / async generators need to be prepared
for the case where they get garbage collected before being run to
completion, because, well... this is a thing that can happen.

Currently, the way garbage collection is handled is that their __del__
method calls their .close() method, which does something like:

class GeneratorType:
    ...
    def close(self):
        try:
            self.throw(GeneratorExit)
        except (GeneratorExit, StopIteration):
            return  # it worked, all is good
        except BaseException:
            raise  # double-fault, propagate
        else:
            raise RuntimeError("generator ignored GeneratorExit")

(see PEP 342).
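
For contrast with the failure case, a generator whose finally: block stays synchronous goes through the GeneratorExit branch cleanly:

```python
cleaned = []

def well_behaved():
    try:
        yield 1
    finally:
        cleaned.append(True)  # synchronous cleanup: no yield/await here

g = well_behaved()
next(g)       # advance to the yield
g.close()     # GeneratorExit propagates out; close() returns quietly
print(cleaned)  # [True]
```

This is the contract that garbage collection relies on, and it is exactly the contract that awaiting inside finally would break.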

So far, so obvious -- an async function that gets a GeneratorExit has
to propagate that exception immediately. close() is a regular method, not
an async method, because it has to be callable from synchronous code
like __del__.

Dismaying conclusion: inside an async function / async generator,
finally: blocks must never yield (and neither can __aexit__
callbacks), because they might be invoked during garbage collection.

For async functions this is... arguably a problem but not super
urgent, because async functions rarely get garbage collected without
being run to completion. For async generators it's a much bigger
problem.

There's some more discussion, and a first sketch at conventions we
might want to use for handling this, here:

https://github.com/dabeaz/curio/issues/70

-n

-- 
Nathaniel J. Smith -- https://vorpus.org