Re: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio

2019-03-27 Thread Nathaniel Smith
On Wed, Mar 27, 2019 at 1:49 PM Guido van Rossum  wrote:
>
> On Wed, Mar 27, 2019 at 1:23 PM Nathaniel Smith  wrote:
>>
>> On Wed, Mar 27, 2019 at 10:44 AM Daniel Nugent  wrote:
>> >
>> > FWIW, the asyncio_run_encapsulated approach does not work with the 
>> > transport/protocol apis because the loop needs to stay alive concurrent 
>> > with the connection in order for the awaitables to all be on the same loop.
>>
>> Yeah, there are two basic approaches being discussed here: using two
>> different loops, versus re-entering an existing loop.
>> asyncio_run_encapsulated is specifically for the two-loops approach.
>>
>> In this version, the outer loop, and everything running on it, stop
>> entirely while the inner loop is running – which is exactly what
>> happens with any other synchronous, blocking API. Using
>> asyncio_run_encapsulated(aiohttp.get(...)) in Jupyter is exactly like
>> using requests.get(...), no better or worse.
>
>
> And Yury's followup suggests that it's hard to achieve total isolation 
> between loops, due to subprocess management and signal handling (which are 
> global states in the OS, or at least per-thread -- the OS doesn't know about 
> event loops).

The tough thing about signals is that they're all process-global
state, *not* per-thread.

In Trio I think this wouldn't be a big deal – whenever we touch signal
handlers, we save the old value and then restore it afterwards, so the
inner loop would just temporarily override the outer loop, which I
guess is what you'd expect. (And Trio's subprocess support avoids
touching signals or any global state.) Asyncio could potentially do
something similar, but its subprocess support does rely on signals,
which could get messy since the outer loop can't be allowed to miss
any SIGCHLDs. Asyncio does have a mechanism to share SIGCHLD handlers
between loops (intended to support the case where you have loops
running in multiple threads simultaneously), and it might handle this
case too, but I don't know the details well enough to say for sure.
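
To make the save/restore pattern concrete -- this is just a minimal
sketch of the idea, not Trio's actual internals:

    import signal

    def run_with_temporary_handler(signum, handler, fn):
        # Save whatever handler the outer loop installed (may be None
        # if it was installed from C, in which case we can't restore it)
        old_handler = signal.getsignal(signum)
        signal.signal(signum, handler)
        try:
            return fn()
        finally:
            if old_handler is not None:
                # Put the outer loop's handler back, so it never
                # notices it was temporarily displaced
                signal.signal(signum, old_handler)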

> I just had another silly idea. What if the magical decorator that can be used 
> to create a sync version of an async def (somewhat like tworoutines) made the 
> async version hand off control to a thread pool? Could be a tad slower, but 
> the tenor of the discussion seems to be that performance is not that much of 
> an issue.

Unfortunately I don't think this helps much... If your async def
doesn't use signals, then it won't interfere with the outer loop's
signal state and a thread is unnecessary. And if it *does* use
signals, then you can't put it in a thread, because Python threads are
forbidden to call any of the signal-related APIs.
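
For the record, here's that restriction in action -- in CPython,
installing a handler from a non-main thread just raises:

    import signal, threading

    def try_install():
        try:
            signal.signal(signal.SIGINT, signal.SIG_IGN)
        except ValueError as exc:
            # "signal only works in main thread of the main interpreter"
            print("refused:", exc)

    t = threading.Thread(target=try_install)
    t.start()
    t.join()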

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio

2019-03-27 Thread Daniel Nugent
FWIW, the asyncio_run_encapsulated approach does not work with the 
transport/protocol apis because the loop needs to stay alive concurrent with 
the connection in order for the awaitables to all be on the same loop.
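
To make that concrete -- a rough sketch, not the actual transport
code: asyncio awaitables are tied to the loop that created them, so
once an encapsulated inner loop has exited, anything still bound to it
is stranded:

    import asyncio

    inner = asyncio.new_event_loop()
    fut = inner.create_future()   # awaitable bound to the inner loop
    inner.close()                 # the encapsulated loop has exited

    async def consume():
        await fut                 # "Future attached to a different loop"

    try:
        asyncio.run(consume())
    except RuntimeError as exc:
        print(exc)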

I think the notion that allowing an option for nested loops will inevitably 
lead to a situation where nested loops are always required is maybe a bit 
pessimistic?

-Dan Nugent
On Mar 26, 2019, 23:33 -0400, Nathaniel Smith , wrote:
> On Mon, Mar 25, 2019 at 4:37 PM Guido van Rossum  wrote:
> >
> > I also hope Nathaniel has something to say -- I wonder if trio supports 
> > nested event loops?
>
> Trio does have a similar check to prevent starting a new Trio loop
> inside a running Trio loop, and there's currently no way to disable
> it: 
> https://github.com/python-trio/trio/blob/444234392c064c0ec5e66b986a693e2e9f76bc58/trio/_core/_run.py#L1398-L1402
>
> Like the comment says, I could imagine changing this if there's a good reason.
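>
> For concreteness, assuming Trio's current behavior, a nested run()
> just raises:
>
> import trio
>
> async def main():
>     try:
>         trio.run(trio.sleep, 0)  # trips the check linked above
>     except RuntimeError as exc:
>         print("refused:", exc)
>
> trio.run(main)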
>
> On Tue, Mar 26, 2019 at 11:56 AM Yury Selivanov  wrote:
> > I think that if we implement this feature behind a flag then some libraries 
> > will start requiring that flag to be set. Which will inevitably lead us to 
> > a situation where it's impossible to use asyncio without the flag. 
> > Therefore I suppose we should either just implement this behaviour by 
> > default or defer this to 3.9 or later.
>
> It is weird that a synchronous public interface acts differently
> depending on whether you happened to implement it using the socket
> module directly vs. using asyncio.
>
> If you want to "hide" that your synchronous API uses asyncio
> internally, then you can actually do that now using
> public/quasi-public APIs:
>
> import asyncio
>
> def asyncio_run_encapsulated(*args, **kwargs):
>     # Assumes we're called from inside a running loop (e.g. Jupyter's)
>     old_loop = asyncio.get_running_loop()
>     try:
>         # Hide the outer loop so asyncio.run() agrees to start a
>         # fresh inner one
>         asyncio._set_running_loop(None)
>         return asyncio.run(*args, **kwargs)
>     finally:
>         asyncio._set_running_loop(old_loop)
>
> def my_sync_api(...):
>     return asyncio_run_encapsulated(my_async_api(...))
>
> But this is also a bit weird, because the check is useful. It's weird
> that a blocking socket-module-based implementation and a blocking
> asyncio-based implementation act differently, but arguably the way to
> make them consistent is to fix the socket module so that it does give
> an error if you try to issue blocking calls from inside asyncio,
> rather than remove the error from asyncio. In fact newcomers often
> make mistakes like using time.sleep or requests from inside async
> code, and a common question is how to catch this in real code bases.
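>
> The canonical version of the mistake, for concreteness:
>
> import asyncio, time
>
> async def handler():
>     time.sleep(1)           # blocks the whole event loop for a second
>     await asyncio.sleep(1)  # what they should have written
>
> Nothing flags the first line today; it just silently stalls every
> other task on the loop.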
>
> I wonder if we should have an interpreter-managed thread-local flag
> "we're in async mode", and make blocking operations in the stdlib
> check it. E.g. as a straw man, sys.set_allow_blocking(True/False),
> sys.get_allow_blocking(), sys.check_allow_blocking() -> raises an
> exception if sys.get_allow_blocking() is False, and then add calls to
> sys.check_allow_blocking() in time.sleep, socket operations with
> blocking mode enabled, etc. (And encourage third-party libraries that
> do their own blocking I/O without going through the stdlib to add
> similar calls.) Async I/O libraries (asyncio/trio/twisted/...) would
> set the flag appropriately; and if someone like IPython *really wants*
> to perform blocking operations inside async context, they can fiddle
> with the flag themselves.
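>
> As a pure-Python sketch of that straw man (none of these functions
> exist in sys today; a real version would live in the interpreter):
>
> import threading
>
> _state = threading.local()
>
> def set_allow_blocking(allowed):
>     _state.allow_blocking = allowed
>
> def get_allow_blocking():
>     # Blocking is allowed by default, i.e. outside any event loop
>     return getattr(_state, "allow_blocking", True)
>
> def check_allow_blocking():
>     if not get_allow_blocking():
>         raise RuntimeError("blocking call inside async context")
>
> An event loop would flip the flag off around each task step, and
> time.sleep, blocking socket calls, etc. would call
> check_allow_blocking() on entry.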
>
> > I myself am -1 on making 'run_until_complete()' reentrant. The separation 
> > of async/await code and blocking code is painful enough to some people, 
> > introducing another "hybrid" mode will ultimately do more damage than good. 
> > E.g. it's hard to reason about this even for me: I simply don't know if I 
> > can make uvloop (or asyncio) fully reentrant.
>
> Yeah, pumping the I/O loop from inside a task that's running on the
> I/O loop is just a mess. It breaks the async/await readability
> guarantees, it risks stack overflow, and by the time this stuff bites
> you, you're going to have to backtrack a lonnng way to get to something
> sensible. Trio definitely does not support this, and I will fight to
> keep it that way :-).
>
> Most traditional GUI I/O loops *do* allow this, and in the classic
> Twisted approach of trying to support all the I/O loop APIs on top of
> each other, this can be a problem – if you want an adapter to run Qt
> or Gtk apps on top of your favorite asyncio loop implementation, then
> your loop implementation needs to support reentrancy. But I guess so
> far people are OK with doing things the other way (implementing the
> asyncio APIs on top of the standard GUI event loops). In Trio I have a
> Cunning Scheme to avoid doing either approach, but we'll see how that
> goes...
>
> > In case of Jupyter I don't think it's a good idea for them to advertise 
> > nest_asyncio. IMHO the right approach would be to encourage library 
> > developers to expose async/await APIs and teach Jupyter users to "await" on 
> > async code directly.
> >
> > The linked Jupyter