Re: [Async-sig] Asynchronous cleanup is a problem

2016-07-06 Thread Nathaniel Smith
On Wed, Jul 6, 2016 at 6:42 AM, Cory Benfield  wrote:
>
>> On 6 Jul 2016, at 13:09, David Beazley  wrote:
>>
>> Curio uses asynchronous context managers for much more than closing sockets 
>> (which frankly is the least interesting thing).   For example, they're used 
>> extensively with synchronization primitives such as Locks, Semaphores, 
>> Events, Queues, and other such things.   The ability to use coroutines in 
>> the __aexit__() method is an essential part of these primitives because it 
>> allows task scheduling decisions to be made in conjunction with 
>> synchronization events such as lock releases.   For example, you can 
>> implement fair-locking or various forms of priority scheduling.   Curio also 
>> uses asynchronous context managers for timeouts and other related 
>> functionality where coroutines have to be used in __aexit__.  I would expect 
>> coroutines in __aexit__ to also be useful in more advanced contexts such as 
>> working with databases, dealing with transactions, and other kinds of 
>> processing where asynchronous I/O might be involved.
>
> For my own edification, Dave, do you mind if I dive into this a little bit? 
> My suspicion is that this problem is rather unique to curio (or, at least, 
> does not so strongly affect event loop implementations), and I’d just like to 
> get a handle on it. It’s important to me that we don’t blow away curio, so 
> where it differs from event loops I’d like to understand what it’s doing.

The relevant difference between curio and asyncio here is that in
asyncio, there are two different mechanisms for accessing the event
loop: for some operations, you access it through its coroutine runner
interface using 'await', and for other operations, you get a direct
reference to it through some side-channel (loop= arguments, global
lookups) and then make direct method calls. To make the latter work,
asyncio code generally has to be written in a way that makes sure
loop= objects are always passed throughout the whole call stack, and
always stay in sync [no pun intended] with the coroutine runner
object. This has a number of downsides, but one upside is that it
means that the loop object is available from __del__, where the
coroutine runner isn't.

Curio takes the other approach, of standardizing on 'await' as the
single blessed mechanism for accessing the event loop (or kernel or
whatever you want to call it). So this eliminates all the tiresome
loop= tracking and the potential for out-of-sync bugs, but it means
you can't do *anything* event-loop-related from __del__.

However, I'm not convinced that asyncio really has an advantage here.
Imagine some code that uses asyncio internally, but exposes a
synchronous wrapper to the outside (e.g. as proposed here ;-) [1]):

def synchronous_wrapper(...):
    # Use a new loop since we might not be in the same thread as the global loop
    loop = asyncio.new_event_loop()
    return loop.run_until_complete(work_asynchronously(..., loop=loop))

async def work_asynchronously(..., loop):
    stream = await get_asyncio_stream_writer(..., loop=loop)
    stream.write(b"hello")

stream.write(...) queues some data to be sent, but doesn't send it.
Then stream falls out of scope, which triggers a call to
stream.__del__, which calls stream.close(), which does some calls on
loop requesting that it flush the buffer and then close the underlying
socket. So far so good.

...but then immediately after this, the loop itself falls out of
scope, and you lose. AFAICT from a quick skim of the asyncio code, the
data will not be sent and the socket will not be closed (depending on
kernel buffering etc.).

(And even if you wrap everything with proper 'with' blocks, this is
still true... asyncio event loops don't seem to have any API for
"complete all work and then shut down"? Maybe I'm just missing it --
if not then this is possibly a rather serious bug in actual
currently-existing asyncio. But for the present purposes, the point is
that you really do need something like 'async with' around everything
here to force the I/O to complete before handing things over to the gc
-- you can't rely on the gc to do your I/O.)
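
To make that concrete, here's a rough sketch (hypothetical names, untested)
of what the synchronous wrapper has to do instead -- explicitly drain and
close while the loop still exists, rather than leaning on __del__:

import asyncio

def synchronous_wrapper(host, port, payload):
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(_work(host, port, payload, loop=loop))
    finally:
        loop.close()

async def _work(host, port, payload, *, loop):
    reader, writer = await asyncio.open_connection(host, port, loop=loop)
    try:
        writer.write(payload)
        # Force the buffered data out while the loop is still running:
        await writer.drain()
    finally:
        writer.close()
        # Give the loop a tick to actually close the transport:
        await asyncio.sleep(0, loop=loop)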

-n

[1] https://github.com/kennethreitz/requests/issues/1390#issuecomment-225361421

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/

Re: [Async-sig] [ANN] async_generator v1.2 released

2016-11-25 Thread Nathaniel Smith
On Fri, Nov 25, 2016 at 10:46 AM, Alex Grönholm
<alex.gronh...@nextday.fi> wrote:
> 25.11.2016, 12:09, Nathaniel Smith kirjoitti:
>>
>> On Thu, Nov 24, 2016 at 11:59 PM, Alex Grönholm
>> <alex.gronh...@nextday.fi> wrote:
>>>
>>> 25.11.2016, 09:25, Nathaniel Smith kirjoitti:
>>>>
>>>> On Thu, Nov 24, 2016 at 1:23 PM, Nathaniel Smith <n...@pobox.com> wrote:
>>>> [...]
>>>>>>
>>>>>> One thing I noticed is that there seems to be no way to detect async
>>>>>> generator functions in your implementation. That is something I would
>>>>>> want
>>>>>> to have before switching.
>>>>>
>>>>> Good point. That should be pretty trivial to add.
>>>>
>>>> Just pushed async_generator v1.3 to PyPI, with new isasyncgen,
>>>> isasyncgenfunction, and (on 3.6) registration with
>>>> collections.abc.AsyncGenerator.
>>>
>>> And you just *had* to make it incompatible with the async generators from
>>> asyncio_extras? "_is_async_gen_function" vs "_is_async_generator"?
>>> I thought we agreed on cooperating?
>>
>> I started doing that, but... it's not an async generator, isasyncgen
>> is a different inspect function...
>>
>> -n
>>
> It's an arbitrary string that will likely never be seen by anyone except for
> people working on the libraries. But it makes a world of difference in
> compatibility. I named it this way to be consistent with asyncio which marks
> yield-based coroutines with "_is_coroutine". So please reconsider.

Sure, I guess it doesn't matter too much either way.

Can we pause a moment to ask whether we really want the two async
generator types to return true for each other's introspection
function, though? Right now they aren't really interchangeable, and
the only user for this kind of introspection I've seen is your
contextmanager.py. It seems to use the inspect versions of isasyncgen
and isasyncgenfunction, because it wants to know whether
asend/athrow/aclose are supported. Right now, if it switched to using
async_generator's isasyncgen/isasyncgenfunction, it would continue to
work; but if I then switched async_generator's
isasyncgen/isasyncgenfunction to detect asyncio_extras's generators,
it would stop working again.

Speaking of which, what do you want to do about isasyncgen?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/

Re: [Async-sig] [ANN] async_generator v1.2 released

2016-11-25 Thread Nathaniel Smith
On Fri, Nov 25, 2016 at 12:03 AM, Alex Grönholm
<alex.gronh...@nextday.fi> wrote:
> 24.11.2016, 23:23, Nathaniel Smith kirjoitti:
>
> On Nov 23, 2016 11:29 PM, "Alex Grönholm" <alex.gronh...@nextday.fi> wrote:
>>
>> 23.11.2016, 01:34, Nathaniel Smith kirjoitti:
>>>
>>> On Tue, Nov 22, 2016 at 2:22 PM, Alex Grönholm <alex.gronh...@nextday.fi>
>>> wrote:
>>> > I'm not sure where asyncio_extras's async generator implementation
>>> > assumes
>>> > you're using asyncio. Could you elaborate on that?
>>>
>>> If I'm reading it right, it assumes that the only two things that might
>>> be yielded to the coroutine runner are either (a) the special yield wrapper,
>>> or (b) an awaitable object like an asyncio.Future. This works on asyncio,
>>> because that's all the asyncio runner supports, but it doesn't work with
>>> (for example) curio. async_generator (like native async generators) allows
>>> arbitrary objects to be yielded to the coroutine runner.
>>
>> You are misreading the code. It is in no way limited to what asyncio
>> accepts. It doesn't even import asyncio in the asyncyield or generator
>> modules. The only parts of the library that depend on PEP 3156 event loops
>> are the ones that involve executors and threads.
>
> I didn't say that it imported asyncio. I said that it assumes the only
> things that will be yielded are the things that asyncio yields. This is the
> line that I'm worried about:
>
>
> https://github.com/agronholm/asyncio_extras/blob/aec412e1b7034ca3cad386c381e655ce3547fee3/asyncio_extras/asyncyield.py#L40
>
> The code awaits the value yielded by the coroutine, but there's no guarantee
> that this value is awaitable. It's an arbitrary Python object representing a
> message sent to the coroutine runner. It turns out that asyncio only uses
> awaitable objects for its messages, so this code can get away with this on
> asyncio, but if you try using this code with curio then I'm pretty sure
> you're going to end up doing something like "await (3,)" and then blowing
> up.
>
> PEP 492 clearly states the following:
>
> It is a TypeError to pass anything other than an awaitable object to an
> await expression.
>
> That (3,) is not an awaitable, so the example is invalid. That said, I will
> re-examine this part of the implementation and correct it if necessary.

I feel like I'm running out of ideas for how to explain this, and
starting to just repeat myself :-(. There is nothing that says the
return value from coro.__next__() must be awaitable. Coroutines can
yield arbitrary objects. (3,) is not awaitable, but it's a perfectly
valid thing to be yielded from a coroutine.
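
Here's a tiny self-contained demo -- no event loop involved -- of a
coroutine yielding a plain tuple to a hand-rolled runner:

import types

@types.coroutine
def _send_to_runner(msg):
    # Low-level "trap": yield an arbitrary object to whatever is driving
    # the coroutine, and return whatever the runner sends back in.
    return (yield msg)

async def example():
    return await _send_to_runner((3,))   # (3,) is not awaitable, and that's fine

coro = example()
print(coro.send(None))        # the runner receives (3,)
try:
    coro.send("reply")        # the runner resumes the coroutine with a reply
except StopIteration as exc:
    print(exc.value)          # the coroutine returned "reply"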

> So far I just haven't encountered anything that would produce an error.

That's because you've only tested on asyncio, not curio. I promise!

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/

Re: [Async-sig] Adding asyncio.run() function in Python 3.6

2016-11-16 Thread Nathaniel Smith
What's the use case for the async generator version? Could the yield be
replaced by 'await loop.shutting_down()'?

On Nov 16, 2016 10:12 AM, "Yury Selivanov"  wrote:

> One of the remaining problems with the event loop in asyncio is
> bootstrapping/finalizing asyncio programs.
>
> Currently, there are two different scenarios:
>
> [1] Running a coroutine:
>
> async def main():
>     # your program
>
> loop = asyncio.get_event_loop()
> try:
>     loop.run_until_complete(main())
> finally:
>     loop.close()
>
> [2] Running a server:
>
> loop = asyncio.get_event_loop()
> srv = loop.run_until_complete(
>     loop.create_server(…))
> try:
>     loop.run_forever()
> finally:
>     try:
>         srv.close()
>         loop.run_until_complete(srv.wait_closed())
>     finally:
>         loop.close()
>
> Both cases are *incomplete*: they don’t do correct finalization of
> asynchronous generators. To do that we’ll need to add another 1-3 lines of
> code (extra try-finally).
>
> This manual approach is really painful:
>
> * It makes bootstrapping asyncio code unnecessarily hard.
>
> * It makes the documentation hard to follow.  And we can’t restructure the
> docs to cover the loop only in the advanced section.
>
> * Most people will never know about the `loop.shutdown_asyncgens()`
> coroutine.
>
> * We don’t have a place to add debug code to let people know that their
> asyncio program didn’t clean all resources properly (a lot of unordered
> warnings will be spit out instead).
>
> In https://github.com/python/asyncio/pull/465 I propose to add a new
> function to asyncio in Python 3.6: asyncio.run().
>
> The function can either accept a coroutine, which solves [1]:
>
> async def main():
>     # your program
>
> asyncio.run(main())
>
> Or it can accept an asynchronous generator, which solves [2]:
>
> async def main():
>     srv = await loop.create_server(…)
>     try:
>         yield  # let the loop run forever
>     finally:
>         srv.close()
>         await srv.wait_closed()
>
> asyncio.run(main())
>
> asyncio.run() solves the following:
>
> * An easy way to start an asyncio program that properly takes care of loop
> instantiation and finalization.
>
> * It looks much better in the docs.  With asyncio.run people don’t need to
> care about the loop at all, most probably will never use it.
>
> * Easier to experiment with asyncio in REPL.
>
> * The loop and asynchronous generators will be cleaned up properly.
>
> * We can add robust debug output to the function, listing the unclosed
> tasks, servers, connections, asynchronous generators etc, helping people
> with the cleanup logic.
>
> * Later, if we need to add more cleanup code to asyncio, we will have a
> function to add the logic to.
>
> I feel that we should add this to asyncio.  One of the arguments against
> that, is that overloading asyncio.run to accept both coroutines and
> asynchronous generators makes the API more complex.  If that’s really the
> case, we can add two functions: asyncio.run(coro) and
> asyncio.run_forever(async_generator).
>
> Also take a look at https://github.com/python/asyncio/pull/465.
>
> Thanks,
> Yury
> ___
> Async-sig mailing list
> Async-sig@python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/

Re: [Async-sig] Feedback, loop.load() function

2017-08-11 Thread Nathaniel Smith
It looks like your "load average" is computing something very different
than the traditional Unix "load average". If I'm reading right, yours is a
measure of what percentage of the time the loop spent sleeping waiting for
I/O, taken over the last 60 ticks of a 1 second timer (so generally
slightly longer than 60 seconds). The traditional Unix load average is an
exponentially weighted moving average of the length of the run queue.
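
For reference, the traditional calculation is roughly this (just a sketch;
the 5-second sampling interval and 60-second period are picked to mimic the
kernel's 1-minute average):

import math

def update_load_average(avg, runnable_count, interval=5.0, period=60.0):
    # Exponentially weighted moving average of the run queue length.
    decay = math.exp(-interval / period)
    return avg * decay + runnable_count * (1 - decay)

avg = 0.0
for _ in range(30):
    avg = update_load_average(avg, 3)   # a sustained queue of 3 drives avg toward 3.0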

Is one of those definitions better for your goal of detecting when to shed
load? I don't know. But calling them the same thing is pretty confusing
:-). The Unix version also has the nice property that it can actually go
above 1; yours doesn't distinguish between a service whose load is at
exactly 100% of capacity and barely keeping up, versus one that's at 200%
of capacity and melting down. But for load shedding maybe you always want
your tripwire to be below that anyway.

More broadly we might ask what's the best possible metric for this purpose
– how do we judge? A nice thing about the JavaScript library you mention is
that scheduling delay is a real thing that directly impacts quality of
service – it's more of an "end to end" measure in a sense. Of course, if
you really want an end to end measure you can do things like instrument
your actual logic, see how fast you're replying to http requests or
whatever, which is even more valid but creates complications because some
requests are supposed to take longer than others, etc. I don't know which
design goals are important for real operations.

On Aug 6, 2017 3:57 PM, "Pau Freixes"  wrote:

> Hi guys,
>
> I would appreciate any feedback about the idea of implementing a new
> load function to ask how saturated your reactor is.
>
> I have a proof of concept [1] of how the load function might be
> implemented in the Asyncio python loop.
>
> The idea is to provide a method that can be used to ask about the load
> of the reactor at a specific time. This implementation returns the
> load taking into account the last 60 seconds, but it could easily return
> the 5-minute and 15-minute ones, or others.
>
> This method can help services built on top of asyncio to implement back
> pressure mechanisms that take into account a metric coming from the
> loop, instead of inferring the load from other metrics provided by
> external agents such as the CPU, load average, and others.
>
> Nowadays there exist some alternatives in other languages that address
> this situation using the lag of a scheduler callback produced by
> saturated reactors. The best-known implementation is toobusy [2], a
> nodejs implementation.
>
> IMHO the solution provided by toobusy has a strong dependency on the
> hardware, needing to tune the maximum lag allowed in terms of
> milliseconds [3]. In the PoC presented, the user can use an exact value
> meaning the percentage of the load, for example 0.9.
>
> Any comment would be appreciated.
>
> [1] https://github.com/pfreixes/cpython/commit/5fef3cae043abd62165ce40b181286e18f5fb19c
> [2] https://www.npmjs.com/package/toobusy
> [3] https://www.npmjs.com/package/toobusy#tunable-parameters
> --
> --pau
> ___
> Async-sig mailing list
> Async-sig@python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
>
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] async documentation methods

2017-07-04 Thread Nathaniel Smith
On Mon, Jul 3, 2017 at 11:49 PM, Alex Grönholm  wrote:
> The real question is: why doesn't vanilla Sphinx have any kind of support
> for async functions which have been part of the language for quite a while?

Because no-one's sent them a PR, I assume. They're pretty swamped AFAICT.

One of the maintainers has at least expressed interest in integrating
something like sphinxcontrib-trio if someone does the work:
https://github.com/sphinx-doc/sphinx/issues/3743

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] question re: asyncio.Condition lock acquisition order

2017-06-27 Thread Nathaniel Smith
On Tue, Jun 27, 2017 at 4:15 AM, Chris Jerdonek
<chris.jerdo...@gmail.com> wrote:
> On Tue, Jun 27, 2017 at 3:29 AM, Nathaniel Smith <n...@pobox.com> wrote:
>> In fact asyncio.Lock's implementation is careful to maintain strict
>> FIFO fairness, i.e. whoever calls acquire() first is guaranteed to get
>> the lock first. Whether this is something you feel you can depend on
>> I'll leave to your conscience :-). Though the docs do say "only one
>> coroutine proceeds when a release() call resets the state to unlocked;
>> first coroutine which is blocked in acquire() is being processed",
>> which I think might be intended to say that they're FIFO-fair?
>> ...
>
> Thanks. All that is really interesting, especially the issue you
> linked to in the Trio docs re: fairness:
> https://github.com/python-trio/trio/issues/54
>
> Thinking through the requirements I want for my RW synchronization use
> case in more detail, I think I want the completion of any "write" to
> be followed by exhausting all "reads." I'm not sure if that qualifies
> as barging. Hopefully this will be implementable easily enough with
> the available primitives, given what you say.

I've only seen the term "barging" used in discussions of regular
locks, though I'm not an expert, just someone with eclectic reading
habits. But RWLocks have some extra subtleties that "barging" vs
"non-barging" don't really capture. Say you have the following
sequence:

task w0 acquires for write
task r1 attempts to acquire for read (and blocks)
task r2 attempts to acquire for read (and blocks)
task w1 attempts to acquire for write (and blocks)
task r3 attempts to acquire for read (and blocks)
task w0 releases the write lock
task r4 attempts to acquire for read

What happens? If r1+r2+r3+r4 are able to take the lock, then you're
"read-biased" (which is essentially the same as barging for readers,
but it can be extra dangerous for RWLocks, because if you have a heavy
read load it's very easy for readers to starve writers). If tasks
r1+r2 wake up, but r3+r4 have to wait, then you're "task-fair" (the
equivalent of FIFO fairness for RWLocks). If r1+r2+r3 wake up, but r4
has to wait, then you're "phase fair".
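
If it helps, here's a rough sketch of just the wakeup decision for the
phase-fair case (hypothetical names, nowhere near a full lock
implementation):

from collections import deque

class PhaseFairWakeup:
    def __init__(self):
        self.waiting_readers = deque()   # readers already blocked in acquire-for-read
        self.waiting_writers = deque()   # writers already blocked in acquire-for-write

    def on_write_release(self, wake):
        if self.waiting_readers:
            # Wake every reader that was already waiting (r1+r2+r3 above);
            # readers arriving later (r4) wait for the next read phase.
            batch, self.waiting_readers = self.waiting_readers, deque()
            for reader in batch:
                wake(reader)
        elif self.waiting_writers:
            wake(self.waiting_writers.popleft())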

There are some notes here that are poorly organized but perhaps retain
some small interest:
https://github.com/python-trio/trio/blob/master/trio/_core/_parking_lot.py

If I ever implement one of these it'll probably be phase-fair, because
(a) it has some nice theoretical properties, and (b) it happens to be
particularly easy to implement using my existing wait-queue primitive,
and task-fair isn't :-).

> Can anything similar be said not about synchronization primitives, but
> about awakening coroutines in general? Do event loops maintain strict
> FIFO queues when it comes to deciding which awaiting coroutine to
> awaken? (I hope that question makes sense!)

Something like that. There's some complication because there are two
ways that a task can become runnable: directly by another piece of
code in the system (e.g., releasing a lock), or via some I/O (e.g.,
bytes arriving on a socket). If you really wanted to ensure that tasks
ran exactly in the order that they became runnable, then you need to
check for I/O constantly, but this is inefficient. So usually what
cooperative scheduling systems guarantee is a kind of "batched FIFO":
they do a poll for I/O (at which point they may discover some new
runnable tasks), and then take a snapshot of all the runnable tasks,
and then run all of the tasks in their snapshot once before
considering any new tasks. So this isn't quite strict FIFO, but it's
fair-like-FIFO (the discrepancy between when each task should run
under strict FIFO, and when it actually runs, is bounded by the number
of active tasks; there's no possibility of a runnable task being left
unscheduled for an arbitrary amount of time).
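
In sketch form (hypothetical poll_for_io/step helpers), one pass of that
batched-FIFO scheme looks something like:

def run_one_pass(runnable, poll_for_io):
    runnable.extend(poll_for_io())   # may discover newly runnable tasks
    batch = list(runnable)           # snapshot the current runnable set
    runnable.clear()
    for task in batch:
        # Run each snapshotted task exactly once; anything it wakes is
        # appended to `runnable` and waits for the *next* pass.
        task.step()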

Curio used to allow woken-by-code tasks to starve out woken-by-I/O
tasks, and you might be interested in the discussion in the PR that
changed that: https://github.com/dabeaz/curio/pull/127

In trio I actually randomize the order within each batch because I
don't want people to accidentally encode assumptions about the
scheduler (e.g. in their test suites). This is because I have hopes of
eventually doing something fancier :-):
https://github.com/python-trio/trio/issues/32 ("If you liked issue
#54, you'll love #32!"). Many systems are not this paranoid though,
and actually are strict-FIFO for tasks that are woken-by-code - but
this is definitely one of those features where depending on it is
dubious. In asyncio for example the event loop is pluggable and the
scheduling policy is a feature of the event loop, so even if the
implementation in the stdlib is strict FIFO you don't know about
third-party ones.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

Re: [Async-sig] "read-write" synchronization

2017-06-27 Thread Nathaniel Smith
On Mon, Jun 26, 2017 at 6:41 PM, Chris Jerdonek
 wrote:
> On Mon, Jun 26, 2017 at 12:37 PM, Dima Tisnek  wrote:
>> Chris, here's a simple RWLock implementation and analysis:
>> ...
>> Obv., this code could be nicer:
>> * separate context managers for read and write cases
>> * .unlock can be automatic (if self.writer: unlock_for_write()) at the
>> cost of opening doors wide open to bugs
>> * policy can be introduced if `.lock` identified itself (by an
>> object(), since there's no thread id) in shared state
>> * notifyAll() makes real life use O(N^2) for N being number of
>> simultaneous write lock requests
>>
>> Feel free to use it :)
>
> Thanks, Dima. However, as I said in my earlier posts, I'm actually
> more interested in exploring approaches to synchronizing readers and
> writers in async code that don't require locking on reads. (This is
> also why I've always been saying RW "synchronization" instead of RW
> "locking.")
>
> I'm interested in this because I think the single-threadedness of the
> event loop might be what makes this simplification possible over the
> traditional multi-threaded approach (along the lines Guido was
> mentioning). It also makes the "fast path" faster. Lastly, the API for
> the callers is just to call read() or write(), so there is no need for
> a general RWLock construct or to work through RWLock semantics of the
> sort Nathaniel mentioned.
>
> I coded up a working version of the pseudo-code I included in an
> earlier email so people can see how it works. I included it at the
> bottom of this email and also in this gist:
> https://gist.github.com/cjerdonek/858e1467f768ee045849ea81ddb47901

FWIW, to me this just looks like an implementation of an async RWLock?
It's common for async synchronization primitives to be simpler
internally than threading primitives because the async ones don't need
to worry about being pre-empted at arbitrary points, but from the
caller's point of view you still have basically a blocking acquire()
method, and then you do your stuff (potentially blocking while you're
at it), and then you call a non-blocking release(), just like every
other async lock.
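
I.e., from the caller's side the pattern is just the usual one (sketch,
asyncio-style lock):

async def update_counter(lock, state):
    await lock.acquire()     # blocking acquire (may yield to the loop)
    try:
        # do the guarded work, possibly with more awaits in here
        state["count"] = state.get("count", 0) + 1
    finally:
        lock.release()       # non-blocking release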

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] [ann] sphinxcontrib-trio: make sphinx better at documenting functions/methods, esp. for async/await code

2017-05-12 Thread Nathaniel Smith
[dropped python-announce from CC list]

On Fri, May 12, 2017 at 9:17 AM, Brett Cannon  wrote:
> So are you going to try to upstream this? ;)

Realistically, for me this is a side-project to a side-project, so it
may require someone else do the integration work, but:
 https://github.com/sphinx-doc/sphinx/issues/3743

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] async/sync library reusage

2017-06-08 Thread Nathaniel Smith
On Thu, Jun 8, 2017 at 3:32 PM, manuel miranda  wrote:
> Hello everyone,
>
> After using asyncio for a while, I'm struggling to find information about
> how to support both synchronous and asynchronous use cases for the same
> library.
>
> I.e. imagine you have a package for http requests and you want to give the
> user the choice to use a synchronous or an asynchronous interface. Right now
> the approach the community is following is creating separate libraries, one
> for each version. This is far from ideal for several reasons, some I can
> think of:
>
> - Code duplication, most of the functionality is the same in both libraries,
> only difference is the sync/async behaviors
> - Some new async libraries lack functionality compared to their sync
> siblings. Others will introduce bugs that the sync version already solved
> long ago, etc.
> - Different interfaces for the user for the same exact functionality.
>
> In summary, in some cases it looks like reinventing the wheel. So now comes
> the question, is there any documentation, guide on what would be best
> practice supporting this kind of duality?

I would say that this is something that we as a community are still
figuring out. I really like the Sans-IO approach, and it's a really
valuable piece of the solution, but it doesn't solve the whole problem
by itself - you still need to actually do I/O, and this means things
like error handling and timeouts that aren't obviously a natural fit
to the Sans-IO approach, and this means you may still have some tricky
code that can end up duplicated. (Or maybe the Sans-IO approach can be
extended to handle these things too?) There are active discussions
happening in projects like urllib3 [1] and packaging [2] about what
the best strategy to take is. And the options vary a lot depending on
whether you need to support python 2 etc.

If you figure out a good approach I think everyone would be interested
to hear it :-)

-n

[1] https://github.com/shazow/urllib3/pull/1068#issuecomment-294422348

[2] Here's the same API implemented three different ways:
Using deferreds: https://github.com/pypa/packaging/pull/87
"traditional" sans-IO: https://github.com/pypa/packaging/pull/88
Using the "effect" library: https://github.com/dstufft/packaging/pull/1

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] Cancelling SSL connection

2017-06-21 Thread Nathaniel Smith
SSLObject.unwrap has the contract that if it finishes successfully, then
the SSL connection has been cleanly shut down and both sides remain in
sync, and can continue to use the socket in unencrypted mode. When asyncio
calls unwrap before the handshake has completed, then this contract is
impossible to fulfill, and raising an error is the right thing to do. So
imo the ssl module is correct here, and this is a (minor) bug in asyncio.

On Jun 21, 2017 12:49 PM, "Dima Tisnek"  wrote:

> Looks like a bug in the `ssl` module, not `asyncio`.
>
> Refer to https://github.com/openssl/openssl/issues/710
> IMO `ssl` module should be prepared for this.
>
> I'd say post a bug to cpython and see what core devs have to say about it
> :)
> Please note exact versions of python and openssl ofc.
>
> my 2c: openssl has been a moving target every so often, it's quite
> possible that this change in the API escaped the devs.
>
> On 21 June 2017 at 19:50, Mark E. Haase  wrote:
> > (I'm not sure if this is a newbie question or a bug report or something
> in
> > between. I apologize in advance if its off-topic. Let me know if I should
> > post this somewhere else.)
> >
> > If a task is cancelled while SSL is being negotiated, then an SSLError is
> > raised, but there's no way (as far as I can tell) for the caller to catch
> > it. (The example below is pretty contrived, but in an application I'm
> > working on, the user can cancel downloads at any time.) Here's an
> example:
> >
> > import asyncio, random, ssl
> >
> > async def download(host):
> >     ssl_context = ssl.create_default_context()
> >     reader, writer = await asyncio.open_connection(host, 443,
> >         ssl=ssl_context)
> >     request = f'HEAD / HTTP/1.1\r\nHost: {host}\r\n\r\n'
> >     writer.write(request.encode('ascii'))
> >     lines = list()
> >     while True:
> >         newdata = await reader.readline()
> >         if newdata == b'\r\n':
> >             break
> >         else:
> >             lines.append(newdata.decode('utf8').rstrip('\r\n'))
> >     return lines[0]
> >
> > async def main():
> >     while True:
> >         task = asyncio.Task(download('www.python.org'))
> >         await asyncio.sleep(random.uniform(0.0, 0.5))
> >         task.cancel()
> >         try:
> >             response = await task
> >             print(response)
> >         except asyncio.CancelledError:
> >             print('request cancelled!')
> >         except ssl.SSLError:
> >             print('caught SSL error')
> >         await asyncio.sleep(1)
> >
> > loop = asyncio.get_event_loop()
> > loop.run_until_complete(main())
> > loop.close()
> >
> > Running this script yields the following output:
> >
> > HTTP/1.1 200 OK
> > request cancelled!
> > HTTP/1.1 200 OK
> > HTTP/1.1 200 OK
> > : SSL handshake failed
> > Traceback (most recent call last):
> >   File "/usr/lib/python3.6/asyncio/base_events.py", line 803, in _create_connection_transport
> >     yield from waiter
> >   File "/usr/lib/python3.6/asyncio/tasks.py", line 304, in _wakeup
> >     future.result()
> > concurrent.futures._base.CancelledError
> >
> > During handling of the above exception, another exception occurred:
> >
> > Traceback (most recent call last):
> >   File "/usr/lib/python3.6/asyncio/sslproto.py", line 577, in _on_handshake_complete
> >     raise handshake_exc
> >   File "/usr/lib/python3.6/asyncio/sslproto.py", line 638, in _process_write_backlog
> >     ssldata = self._sslpipe.shutdown(self._finalize)
> >   File "/usr/lib/python3.6/asyncio/sslproto.py", line 155, in shutdown
> >     ssldata, appdata = self.feed_ssldata(b'')
> >   File "/usr/lib/python3.6/asyncio/sslproto.py", line 219, in feed_ssldata
> >     self._sslobj.unwrap()
> >   File "/usr/lib/python3.6/ssl.py", line 692, in unwrap
> >     return self._sslobj.shutdown()
> > ssl.SSLError: [SSL] shutdown while in init (_ssl.c:2299)
> >
> > Is this a bug that I should file, or is there some reason that it's
> intended
> > to work this way? I can work around it with asyncio.shield(), but I
> think I
> > would prefer for the asyncio/sslproto.py to catch the SSLError and ignore
> > it. Maybe I'm being short sighted.
> >
> > Thanks,
> > Mark
> >
> > ___
> > Async-sig mailing list
> > Async-sig@python.org
> > https://mail.python.org/mailman/listinfo/async-sig
> > Code of Conduct: https://www.python.org/psf/codeofconduct/
> >
> ___
> Async-sig mailing list
> Async-sig@python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
>
___
Async-sig mailing list
Async-sig@python.org

Re: [Async-sig] "read-write" synchronization

2017-06-26 Thread Nathaniel Smith
On Mon, Jun 26, 2017 at 12:37 PM, Dima Tisnek  wrote:
> Note that `.unlock` cannot validate that it's called by same coroutine
> as `.lock` was.
> That's because there's no concept for "current_thread" for coroutines
> -- there can be many waiting on each other in the stack.

This is also a surprisingly complex design question. Your async RWLock
actually matches how Python's threading.Lock works: you're explicitly
allowed to acquire it in one thread and then release it from another.
People sometimes find this surprising, and it prevents some kinds of
error-checking. For example, this code *probably* deadlocks:

lock = threading.Lock()
lock.acquire()
# probably deadlocks
lock.acquire()

but the interpreter can't detect this and raise an error, because in
theory some other thread might come along and call lock.release(). On
the other hand, it is sometimes useful to be able to acquire a lock in
one thread and then "hand it off" to e.g. a child thread. (Reentrant
locks, OTOH, do have an implicit concept of ownership -- they kind of
have to, if you think about it -- so even if you don't need reentrancy
they can be useful because they'll raise a noisy error if you
accidentally try to release a lock from the wrong thread.)
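
For example (plain threading, no async involved):

import threading

rlock = threading.RLock()
rlock.acquire()

def release_from_wrong_thread():
    try:
        rlock.release()
    except RuntimeError as exc:
        # The RLock knows this thread isn't the owner and refuses.
        print("refused:", exc)

t = threading.Thread(target=release_from_wrong_thread)
t.start()
t.join()
rlock.release()   # the owning thread can release it just fine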

In trio we do have a current_task() concept, and the basic trio.Lock
[1] does track ownership, and I even have a Semaphore-equivalent that
tracks ownership as well [2]. The motivation here is that I want to
provide nice debugging tools to detect things like deadlocks, which is
only possible when your primitives have some kind of ownership
tracking. So far this just means that we detect and error on these
kinds of simple cases:

lock = trio.Lock()
await lock.acquire()
# raises an error
await lock.acquire()

But I have ambitions to do more [3] :-).

However, this raises some tricky design questions around how and
whether to support the "passing ownership" cases. Of course you can
always fall back on something like a raw Semaphore, but it turns out
that trio.run_in_worker_thread (our equivalent of asyncio's
run_in_executor) actually wants to do something like pass ownership
from the calling task into the spawned thread. So far I've handled
this by adding acquire_on_behalf_of/release_on_behalf_of methods to
the primitive that run_in_worker_thread uses, but this isn't really
fully baked yet.

-n

[1] https://trio.readthedocs.io/en/latest/reference-core.html#trio.Lock
[2] https://trio.readthedocs.io/en/latest/reference-core.html#trio.CapacityLimiter
[3] https://github.com/python-trio/trio/issues/182

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] async generator confusion or bug?

2017-06-26 Thread Nathaniel Smith
I actually thought that async generators already guarded against this using
their ag_running attribute. If I try running Dima's example with
async_generator, I get:

sending user-1
received user-1
sending user-2
sending user-0
Traceback (most recent call last):
[...]
ValueError: async generator already executing

The relevant code is here:
https://github.com/njsmith/async_generator/blob/e303e077c9dcb5880c0ce9930d560b282f8288ec/async_generator/impl.py#L273-L279
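
In rough outline the check is just something like this (a sketch, not the
linked code itself):

class GuardedAsyncGen:
    def __init__(self, agen):
        self._agen = agen
        self._running = False

    async def asend(self, value):
        # Refuse to start a new asend() while a previous one is still
        # suspended inside the generator.
        if self._running:
            raise ValueError("async generator already executing")
        self._running = True
        try:
            return await self._agen.asend(value)
        finally:
            self._running = False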

But I added this in the first place because I thought it was needed for
compatibility with native async generators :-)

-n

On Jun 26, 2017 6:54 PM, "Yury Selivanov"  wrote:

> (Posting here, rather than to the issue, because I think this actually
> needs more exposure).
>
> I looked at the code (genobject.c) and I think I know what's going on
> here.  Normally, when you work with an asynchronous generator (AG) you
> interact with it through "asend" or "athrow" *coroutines*.
>
> Each AG has its own private state, and when you await on "asend" coroutine
> you are changing that state.  The state changes on each "asend.send" or
> "asend.throw" call.  The normal relation between AGs and asends is 1 to 1.
>
>   AG - asend
>
> However, in your example you change that to 1 to many:
>
>        asend
>       /
>   AG - asend
>       \
>        asend
>
> Both 'ensure_future' and 'gather' will wrap each asend coroutine into an
> 'asyncio.Task'. And each Task will call "asend.send(None)" right in its
> '__init__', which changes the underlying *shared* AG instance completely
> out of order.
>
> I don't see how this can be fixed (or that it even needs to be fixed), so
> I propose to simply raise an exception if an AG has more than one asends
> changing it state *at the same time*.
>
> Thoughts?
>
> Yury
>
> > On Jun 26, 2017, at 12:25 PM, Dima Tisnek  wrote:
> >
> > Hi group,
> >
> > I'm trying to cross-use an async generator across several async functions.
> > Is it allowed or a completely bad idea? (if so, why?)
> >
> > Here's MRE:
> >
> > import asyncio
> >
> >
> > async def generator():
> >     while True:
> >         x = yield
> >         print("received", x)
> >         await asyncio.sleep(0.1)
> >
> >
> > async def user(name, g):
> >     print("sending", name)
> >     await g.asend(name)
> >
> >
> > async def helper():
> >     g = generator()
> >     await g.asend(None)
> >
> >     await asyncio.gather(*[user(f"user-{x}", g) for x in range(3)])
> >
> >
> > if __name__ == "__main__":
> >     asyncio.get_event_loop().run_until_complete(helper())
> >
> >
> > And the output it produces when ran (py3.6.1):
> >
> > sending user-1
> > received user-1
> > sending user-2
> > sending user-0
> > received None
> > received None
> >
> >
> > Where are those None's coming from in the end?
> > Where did "user-0" and "user-1" data go?
> >
> > Is this a bug, or am I hopelessly confused?
> > Thanks!
> > ___
> > Async-sig mailing list
> > Async-sig@python.org
> > https://mail.python.org/mailman/listinfo/async-sig
> > Code of Conduct: https://www.python.org/psf/codeofconduct/
>
>
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] "read-write" synchronization

2017-06-25 Thread Nathaniel Smith
On Sun, Jun 25, 2017 at 2:13 PM, Chris Jerdonek
 wrote:
> I'm relatively new to async programming in Python and am thinking
> through possibilities for doing "read-write" synchronization.
>
> I'm using asyncio, and the synchronization primitives that asyncio
> exposes are relatively simple [1]. Have options for async read-write
> synchronization already been discussed in any detail?

As a general comment: I used to think rwlocks were a simple extension
to regular locks, but it turns out there's actually this huge increase
in design complexity. Do you want your lock to be read-biased,
write-biased, task-fair, phase-fair? Can you acquire a write lock if
you already hold one (i.e., are write locks reentrant)? What about
acquiring a read lock if you already hold the write lock? Can you
atomically upgrade/downgrade a lock? This makes it much harder to come
up with a one-size-fits-all design suitable for adding to something
like the python stdlib.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] awaiting task is not chaining exception

2017-11-12 Thread Nathaniel Smith
On Sun, Nov 12, 2017 at 7:27 AM, Guido van Rossum  wrote:
> On Sun, Nov 12, 2017 at 2:53 AM, Chris Jerdonek 
> wrote:
>>
>> By the way, since we're already on the subject of asyncio tasks and
>> (truncated) stack traces, this looks like a good opportunity to ask a
>> question that's been on my mind for a while:
>>
>> There's a mysterious note at the end of the documentation of
>> asyncio.Task's get_stack() method, where it says--
>>
>> > For reasons beyond our control, only one stack frame is returned for a
>> > suspended coroutine.
>>
>> (https://docs.python.org/3/library/asyncio-task.html#asyncio.Task.get_stack
>> )
>>
>> What does the "For reasons beyond our control" mean? What is it that
>> can possibly be beyond the control of Python?
>
>
> It's an odd phrasing, but it refers to the fact that a suspended generator
> frame (which is what a coroutine really is) does not have a "back link" to
> the frame that called it. Whenever a generator yields (or a coroutine
> awaits) its frame is disconnected from the current call stack, control is
> passed to the top frame left on that stack, and the single generator frame
> is just held on to by the generator object. When you call next() on that, it
> will be pushed on top of whatever is the current stack (i.e. whatever calls
> next()), which *may* be a completely different stack configuration than when
> it was suspended.

It is possible to get the await/yield from stack though, even when a
generator/coroutine is suspended, using the gi_yieldfrom / cr_await
attributes. Teaching Task.get_stack to do this would be a nice little
enhancement.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] Simplifying stack traces for tasks?

2017-11-14 Thread Nathaniel Smith
On Tue, Nov 14, 2017 at 6:54 AM, Mark E. Haase  wrote:
> If an exception is thrown while the `asyncio` event loop is running, the
> stack trace is pretty complicated. Here's an example:
>
[...]
>
> I'm posting here to get constructive criticism on the concept and would also
> like to hear if Curio or Trio have done anything like this. (Based on a
> quick skim of the docs, I don't think they are.) I might publish a PyPI
> package if anybody is interested.

Trio does jump through hoops to try to make its tracebacks more
readable, but so far this is mostly aimed at cases where there are
multiple exceptions happening concurrently, and there is still an
unpleasant amount of internal gibberish that can overwhelm the parts
the user actually cares about. Doing something about this is on my
todo list [1], and it shouldn't be *too* hard given that Trio already
has to take over sys.excepthook, but I'm not 100% sure how to best
track which frames are pure bookkeeping noise and which are
interesting -- it's a bit more subtle than just "in the trio namespace
or not". (I'd also like to annotate the traceback with more
information, e.g. showing where an exception jumped between tasks.)

Messing with tracebacks like this does feel a bit weird, but async
libraries are in a weird position. The interpreter hides traceback
lines all the time, because every time you go into C then the
traceback disappears; Python's tracebacks are specifically designed to
show user code, not interpreter internals. Async libraries are doing
the same work the interpreter does, like implementing exception
propagation machinery, but they're doing it in Python code, so it
messes up the normal heuristic that C = low-level interpreter
machinery, Python = user code.
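
For a rough idea of the excepthook approach (a sketch, not Trio's actual
code -- the path marker is made up):

import sys
import traceback

BOOKKEEPING_MARKERS = ("/somelib/_core/",)   # made-up marker for "internal" frames

def filtering_excepthook(exc_type, exc, tb):
    entries = traceback.extract_tb(tb)
    kept = [entry for entry in entries
            if not any(marker in entry.filename for marker in BOOKKEEPING_MARKERS)]
    sys.stderr.write("Traceback (most recent call last):\n")
    sys.stderr.writelines(traceback.format_list(kept))
    sys.stderr.writelines(traceback.format_exception_only(exc_type, exc))

sys.excepthook = filtering_excepthook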

-n

[1] https://github.com/python-trio/trio/issues/56

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] awaiting task is not chaining exception

2017-11-11 Thread Nathaniel Smith
On Fri, Nov 10, 2017 at 9:52 PM, Chris Jerdonek
 wrote:
> Hi, I recently encountered a situation with asyncio where the stack
> trace is getting truncated: an exception isn't getting chained as
> expected.
>
> I was able to reduce it down to the code below.
>
> The reduced case seems like a pattern that can come up a lot, and I
> wasn't able to find an issue on the CPython tracker, so I'm wondering
> if I'm doing something wrong or if the behavior is deliberate.

I think what you're seeing is collateral damage from some known
bugginess in the generator/coroutine .throw() method:
https://bugs.python.org/issue29587

In trio I work around this by never using throw(); instead I send() in
the exception and re-raise it inside the coroutine:
https://github.com/python-trio/trio/blob/389f1e1e01b410756e2833cffb992fd1ff856ae5/trio/_core/_run.py#L1491-L1498

But asyncio doesn't do this -- when an asyncio.Task awaits an
asyncio.Future and the Future raises, the exception is throw()n into
the Task, triggering the bug:
https://github.com/python/cpython/blob/e184cfd7bf8bcfd160e3b611d4351ca3ce52d9e2/Lib/asyncio/tasks.py#L178

(If you try profiling your code you may also see weird/impossible
results in cases like this, because throw() also messes up stack
introspection.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


[Async-sig] ANN: Trio v0.2.0 released

2017-12-07 Thread Nathaniel Smith
Hi all,

I'm proud to announce the release of Trio v0.2.0. Trio is a new async
concurrency library for Python that's obsessed with usability and
correctness -- we want to make it easy to get things right. This is
the second public release, and it contains major new features and
bugfixes from 14 contributors.

You can read the full release notes here:

https://trio.readthedocs.io/en/latest/history.html#trio-0-2-0-2017-12-06

Some things I'm particularly excited about are:

- Comprehensive support for async file I/O

- The new 'nursery.start' method for clean startup of complex task trees

- The new high-level networking API -- this is roughly the same level
of abstraction as twisted/asyncio's protocols/transports. Includes
luxuries like happy eyeballs for most robust client connections, and
server helpers that integrate with nursery.start.

- Complete support for using SSL/TLS encryption over arbitrary
transports. You can even do SSL-over-SSL, which is useful for HTTPS
proxies and AFAIK not supported by any other Python library.

- Task-local storage.

- Our new contributing guide:
https://trio.readthedocs.io/en/latest/contributing.html

To get started with Trio, the best place to start is our tutorial:

https://trio.readthedocs.io/en/latest/tutorial.html

It doesn't assume any prior familiarity with concurrency or async/await.

Share and enjoy,
-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful

2018-04-26 Thread Nathaniel Smith
On Thu, Apr 26, 2018 at 7:55 PM, Dima Tisnek  wrote:
> My 2c after careful reading:
>
> restarting tasks automatically (custom nursery example) is quite questionable:
> * it's unexpected
> * it's not generally safe (argument reuse, side effects)
> * user's coroutine can be decorated to achieve same effect

It's an example of something that a user could implement. I guess if
you go to the trouble of implementing this behavior, then it is no
longer unexpected and you can also cope with handling the edge cases
:-). There may be some reason why it turns out to be a bad idea
specifically in the context of Python, but it's one of the features
that's famously helpful for making Erlang work so well, so it seemed
worth mentioning.

> It's very nice to have the escape hatch of posting tasks to "someone
> else's" nursery.
> I feel there are more caveats to posting a task to parent's or global
> nursery though.
> Consider that local tasks typically await on other local tasks.
> What happens when N1-task1 waits on N2-task2 and N2-task9 encounters an error?
> My guess is N2-task2 is cancelled, which by default cancels N1-task1 too, 
> right?
> That kinda break the abstraction, doesn't it?

"Await on a task" is not a verb that Trio has. (We don't even have
task objects, except in some low-level plumbing/introspection APIs.)
You can do 'await queue.get()' to wait for another task to send you
something, but if the other task gets cancelled then the data will
just... never arrive.

There is some discussion here of moving from a queue.Queue-like model
to a model with separate send- and receive-channels:

  https://github.com/python-trio/trio/issues/497

If we do this (which I suspect we will), then probably the task that
gets cancelled was holding the only reference to the send-channel (or
even better, did 'with send_channel: ...'), so the channel will get
closed, and then the call to get() will raise an error which it can
handle or not...

But yes, you do need to spend some time thinking about what kind of
task tree topology makes sense for your problem. Trio can give you
tools but it's not a replacement for thoughtful design :-).

> If the escape hatch is available, how about allowing tasks to be moved
> between nurseries?

That would be possible (and in fact there's one special case
internally where we do it!), but I haven't seen a good reason yet to
implement it as a standard feature. If someone shows up with use cases
then we could talk about it :-).

> Is dependency inversion allowed?
> (as in given parent N1 and child N1.N2, can N1.N2.t2 await on N1.t1 ?)
> If that's the case, I guess it's not a "tree of tasks", as in the
> graph is arbitrary, not DAG.

See above re: not having "wait on a task" as a verb.

> I've seen [proprietary] strict DAG task frameworks.
> while they are useful to e.g. perform sub-requests in parallel,
> they are not general enough to be useful at large.
> Thus I'm assuming trio does not enforce DAG...

The task tree itself is in fact a tree, not a DAG. But that tree
doesn't control which tasks can talk to each other. It's just used for
exception propagation, and for enforcing that all children have to
finish before the parent can continue. (Just like how in a regular
function call, the caller stops while the callee is running.) Does
that help?

> Finally, slob programmers like me occasionally want fire-and-forget
> tasks, aka daemonic threads.
> Some are long-lived, e.g. "battery status poller", others short-lived,
> e.g. "tail part of low-latency logging".
> Obv., a careful programmer would keep track of those, but we want
> things simple :)
> Perhaps in line with batteries included principle, trio could include
> a standard way to accomplish that?

Well, what semantics do you want? If the battery status poller
crashes, what should happen? If the "tail part of low-latency logging"
command is still running when you go to shut down, do you want to wait
a bit for it to finish, or cancel it, or ...?

You can certainly implement some helper like:

async with open_throwaway_nursery() as throwaway_nursery:
    # If this crashes, we ignore the problem, maybe log it or something
    throwaway_nursery.start_soon(some_fn)
    ...
    # When we exit the with block, it gets cancelled
...

if that's what you want. Before adding anything like this to trio
itself though I'd like to see some evidence of how it's being used in
real-ish projects.

> Thanks again for the great post!
> I think you could publish an article on this, it would be good to have
> wider discussion, academic, ES6, etc.

Thanks for the vote of confidence :-). And, we'll see...

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful

2018-04-26 Thread Nathaniel Smith
On Wed, Apr 25, 2018 at 9:43 PM, Guido van Rossum  wrote:
> Now there's a PEP I'd like to see.

Which part?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful

2018-04-26 Thread Nathaniel Smith
On Wed, Apr 25, 2018 at 3:17 AM, Antoine Pitrou <solip...@pitrou.net> wrote:
> On Wed, 25 Apr 2018 02:24:15 -0700
> Nathaniel Smith <n...@pobox.com> wrote:
>> Hi all,
>>
>> I just posted another essay on concurrent API design:
>>
>> https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/
>>
>> This is the one that finally gets at the core reasons why Trio exists;
>> I've been trying to figure out how to write it for at least a year
>> now. I hope you like it.
>
> My experience is indeed that something like the nursery construct would
> make concurrent programming much more robust in complex cases.
> This is a great explanation why.

Thanks!

> API note: I would expect to be able to use it this way:
>
> class MyEndpoint:
>
>     def __init__(self):
>         self._nursery = open_nursery()
>
>     # Lots of behaviour methods that can put new tasks in the nursery
>
>     def close(self):
>         self._nursery.close()

You might expect to be able to use it that way, but you can't! The
'async with' part of 'async with open_nursery()' is mandatory. This is
what I mean about it forcing you to rethink things, and why I think
there is room for genuine controversy :-). (Just like there was about
goto -- it's weird to think that it could have turned out differently
in hindsight, but people really did have valid concerns...)

I think the pattern we're settling on for this particular case is:

class MyEndpoint:
    def __init__(self, nursery, ...):
        self._nursery = nursery
    # methods here that use nursery

@asynccontextmanager
async def open_my_endpoint(...):
    async with trio.open_nursery() as nursery:
        yield MyEndpoint(nursery, ...)

Then most end-users do 'async with open_my_endpoint() as endpoint:'
and then use the 'endpoint' object inside the block; or if you have
some special reason why you need to have multiple endpoints in the
same nursery (e.g. you have an unbounded number of endpoints and don't
want to have to somehow write an unbounded number of 'async with'
blocks in your source code), then you can call MyEndpoint() directly
and pass an explicit nursery. A little bit of extra fuss, but not too
much.

So that's how you handle it. Why do we make you jump through these hoops?

The problem is, we want to enforce that each nursery object's lifetime
is bound to the lifetime of a calling frame. The point of the 'async
with' in 'async with open_nursery()' is to perform this binding. To
reduce errors, open_nursery() doesn't even return a nursery object –
only open_nursery().__aenter__() does that. Otherwise, if a task in
the nursery has an unhandled error, we have nowhere to report it
(among other issues).

Of course this is Python, so you can always do gross hacks like
calling __aenter__ yourself, but then you're responsible for making
sure the context manager semantics are respected. In most systems
you'd expect this kind of thing to be syntactically enforced as part of
the language; it's actually pretty amazing that Trio is able to make
things work as well as it can as a "mere library". It's really a
testament to how much thought has been put into Python -- other
languages don't really have any equivalent to with or Python's
generator-based async/await.

> Also perhaps more finegrained shutdown routines such as:
>
> * Nursery.join(cancel_after=None):
>
>   wait for all tasks to join, cancel the remaining ones
>   after the given timeout

Hmm, I've never needed that particular pattern, but it's actually
pretty easy to express. I didn't go into it in this writeup, but:
because nurseries need to be able to cancel their contents in order to
unwind the stack during exception propagation, they need to enclose
their contents in a cancel scope. And since they have this cancel
scope anyway, we expose it on the nursery object. And cancel scopes
allow you to adjust their deadline. So if you write:

async with trio.open_nursery() as nursery:
    ... blah blah ...
    # Last line before exiting the block and triggering the implicit join():
    nursery.cancel_scope.deadline = trio.current_time() + TIMEOUT

then it'll give you the semantics you're asking about. There could be
more sugar for this if it turns out to be useful. Maybe a .timeout
attribute on cancel scopes that's a magic property always equal to
(self.deadline - trio.current_time()), so you could do
'nursery.cancel_scope.timeout = TIMEOUT'?
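
(A rough sketch of the idea as free functions, since this isn't real
Trio API; the property would just wrap the same arithmetic:)

import trio

def get_timeout(cancel_scope):
    # Seconds left until the scope's deadline fires.
    return cancel_scope.deadline - trio.current_time()

def set_timeout(cancel_scope, seconds):
    # Same as the manual idiom above: deadline = now + seconds.
    cancel_scope.deadline = trio.current_time() + seconds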

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] task.result() and exception traceback display

2017-12-25 Thread Nathaniel Smith
I haven't thought about this enough to have an opinion about whether
this is correct or how it could be improved, but I can explain why
you're seeing what you're seeing :-).

The traceback is really a trace of where the exception went after it
was raised, with new lines added to the top as it bubbles out. So the
bottom line is the 'raise' statement, because that's where it was
created, and then it bubbled onto the 'call 1' line and was caught.
Then it was raised again and bubbled onto the 'call 2' line. Etc. So
you should think of it not as a snapshot of your stack when it was
created, but as a travelogue.
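
The same mechanics are easy to see without any asyncio involved; here's
a tiny synchronous sketch (re-raising the same exception object keeps
stacking new lines above the older ones):

import traceback

def innermost():
    raise ValueError("boom")

def middle():
    try:
        innermost()
    except ValueError as exc:
        saved = exc
    raise saved  # re-raised here, so this line gets added above the older ones

try:
    middle()
except ValueError:
    traceback.print_exc()
    # Reading from the bottom up: the original raise in innermost(), the
    # innermost() call inside middle(), the 'raise saved' line, and
    # finally the middle() call at module level.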

-n

On Sun, Dec 24, 2017 at 9:55 PM, Chris Jerdonek
 wrote:
> Hi,
>
> I noticed that if a task in asyncio raises an exception, then the
> displayed traceback can be "polluted" by intermediate calls to
> task.result().  Also, the calls to task.result() can appear out of
> order relative to each other and to other lines.
>
> Here is an example:
>
> import asyncio
>
> async def raise_error():
>     raise ValueError()
>
> async def main():
>     task = asyncio.ensure_future(raise_error())
>
>     try:
>         await task  # call 1
>     except Exception:
>         pass
>
>     try:
>         task.result()  # call 2
>     except Exception:
>         pass
>
>     task.result()  # call 3
>
> asyncio.get_event_loop().run_until_complete(main())
>
> The above outputs--
>
> Traceback (most recent call last):
>   File "test.py", line 24, in <module>
>     asyncio.get_event_loop().run_until_complete(main())
>   File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py",
>   line 467, in run_until_complete
>     return future.result()
>   File "test.py", line 21, in main
>     task.result()  # call 3
>   File "test.py", line 17, in main
>     task.result()  # call 2
>   File "test.py", line 12, in main
>     await task  # call 1
>   File "test.py", line 5, in raise_error
>     raise ValueError()
> ValueError
>
> Notice that the "call 2" line appears in the traceback, even though it
> doesn't come into play in the exception.  Also, the lines don't obey
> the "most recent call last" rule.  If this rule were followed, it
> should be something more like--
>
> Traceback (most recent call last):
>   File "test.py", line 24, in <module>
>     asyncio.get_event_loop().run_until_complete(main())
>   File "/Users/.../3.6.4rc1/lib/python3.6/asyncio/base_events.py",
>   line 467, in run_until_complete
>     return future.result()
>   File "test.py", line 12, in main
>     await task  # call 1
>   File "test.py", line 5, in raise_error
>     raise ValueError()
>   File "test.py", line 17, in main
>     task.result()  # call 2
>   File "test.py", line 21, in main
>     task.result()  # call 3
> ValueError
>
> If people agree there's an issue along these lines, I can file an
> issue in the tracker. I didn't seem to find one when searching for
> open issues with search terms like "asyncio traceback".
>
> Thanks,
> --Chris
> ___
> Async-sig mailing list
> Async-sig@python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/



-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


[Async-sig] Blog post: Timeouts and cancellation for humans

2018-01-11 Thread Nathaniel Smith
Hi all,

Folks here might be interested in this new blog post:

https://vorpus.org/blog/timeouts-and-cancellation-for-humans/

It's a detailed discussion of pitfalls and design-tradeoffs in APIs
for timeout and cancellation, and has a proposal for handling them in
a more Pythonic way. Any feedback welcome!

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] Blog post: Timeouts and cancellation for humans

2018-01-16 Thread Nathaniel Smith
On Sun, Jan 14, 2018 at 6:33 PM, Dima Tisnek  wrote:
> Perhaps the latter is what `shield` should do? That is detach computation as
> opposed to blocking the caller past caller's deadline?

Well, it can't do that in trio :-). One of trio's core design
principles is: no detached processes.

And even if you don't think detached processes are inherently a bad
idea, I don't think they're what you'd want in this case anyway. If
your socket shutdown code has frozen, you want to kill it and close
the socket, not move it into the background where it can hang around
indefinitely wasting resources.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] Blog post: Timeouts and cancellation for humans

2018-01-14 Thread Nathaniel Smith
On Sun, Jan 14, 2018 at 5:11 AM, Chris Jerdonek
<chris.jerdo...@gmail.com> wrote:
> On Sun, Jan 14, 2018 at 3:33 AM, Nathaniel Smith <n...@pobox.com> wrote:
>> On Fri, Jan 12, 2018 at 4:17 AM, Chris Jerdonek
>> <chris.jerdo...@gmail.com> wrote:
>>> Say you have a complex operation that you want to be able to timeout
>>> or cancel, but the process of cleanup / cancelling might also require
>>> a certain amount of time that you'd want to allow time for (likely a
>>> smaller time in normal circumstances). Then it seems like you'd want
>>> to be able to allocate a separate timeout for the clean-up portion
>>> (independent of the timeout allotted for the original operation).
>>> ...
>>
>> You can get these semantics using the "shielding" feature, which the
>> post discusses a bit later:
>> ...
>> However, I think this is probably a code smell.
>
> I agree with this assessment. My sense was that shielding could
> probably do it, but it seems like it could be brittle or more of a
> kludge. It would be nice if the same primitive could be used to
> accommodate this and other variations in addition to the normal case.
> For example, a related variation might be if you wanted to let
> yourself extend the timeout in response to certain actions or results.
>
> The main idea that occurs to me is letting the cancel scope be
> dynamic: the timeout could be allowed to change in response to certain
> things. Something like that seems like it has the potential to be both
> simple as well as general enough to accommodate lots of different
> scenarios, including adjusting the timeout in response to entering a
> clean-up phase. One good test would be whether shielding could be
> implemented using such a primitive.

Ah, if you want to change the timeout on a specific cancel scope, that's easy:

async def do_something():
    with move_on_after(10) as cscope:
        ...
        # Actually, let's give ourselves a bit more time
        cscope.deadline += 10
        ...

If you have a reference to a Trio cancel scope, you can change its
timeout at any time. However, this is different from shielding. The
code above only changes the deadline for that particular cancel scope.
If the caller sets their own timeout:

with move_on_after(15):
    await do_something()

then the code will still get cancelled after 15 seconds when the outer
cancel scope's deadline expires, even though the inner scope ended up
with a 20 second timeout.

Shielding is about disabling outer cancel scopes -- the ones you don't
know about! -- in a particular bit of code. (If you compare to C#'s
cancellation sources or Golang's context-based cancellation, it's like
writing a function that intentionally chooses not to pass through the
cancel token it was given into some function it calls.)
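
Concretely, a minimal sketch of what that buys you (timings are arbitrary):

import trio

async def stubborn():
    with trio.move_on_after(0.1):               # outer scope: 100 ms deadline
        with trio.move_on_after(1) as inner:
            inner.shield = True                 # hide this block from the outer scope
            await trio.sleep(0.5)               # completes despite the outer deadline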

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] Blog post: Timeouts and cancellation for humans

2018-01-14 Thread Nathaniel Smith
On Sun, Jan 14, 2018 at 2:45 PM, Nick Badger  wrote:
>> However, I think this is probably a code smell. Like all code smells,
>> there are probably cases where it's the right thing to do, but when
>> you see it you should stop and think carefully.
>
> Huh. That's a really good point. But I'm not sure the source of the smell is
> the code that needs the shield logic -- I think this might instead be
> indicative of upstream code smell. Put a bit more concretely: if you're
> writing a protocol for an unreliable network (and of course, every network
> is unreliable), requiring a closure operation to transmit something over
> that network is inherently problematic, because it inevitably leads to
> multiple-stage timeouts or ungraceful shutdowns.

I wouldn't go that far -- there are actually good reasons to design
protocols like this.

SSL/TLS is a protocol that has a "goodbye" message (they call it
"close-notify"). According to the spec [1], sending this is mandatory
if you want to cleanly shut down an SSL/TLS connection. Why? Well, say
I send you a message, "Should I buy more bitcoin?" and your reply is
"Yes, but only if the price drops below $XX". Unbeknownst to us, we're
being MITMed. Fortunately, we used SSL/TLS, so the MITM can't alter
what we're saying. But they can manipulate the network; for example,
they could cause our connection to drop after the first 3 bytes of
your message, so your answer gets truncated and I think you just said
"Yes" -- which is very different! But, close-notify saves us -- or at
least contains the damage. Since I know that you're supposed to send a
close-notify at the end of your connection, and I didn't get one, I
can tell that this is a truncated message. I can't tell what the rest
was going to be, but at least I know the message I got isn't the
message you intended to send. And an attacker can't forge a
close-notify message, because they're cryptographically authenticated
like all the data we send.

In websockets, the goodbye handshake is used to work around a nasty
case that can happen with common TCP stacks (like, all of them):

1. A sends a message to B.
2. A is done after that, so it closes the connection.
3. Just then, B sends a message to A, like maybe a regular ping on some timer.
4. A's TCP stack receives data on a closed connection, goes "huh
wut?", and sends a RST packet.
5. B goes to read the last message A sent before they closed the
connection... but whoops it's gone! the RST packet caused both TCP
stacks to wipe out all their buffered data associated with this
connection.

So if you have a protocol that's used for streaming indefinite amounts
of data in both directions and supports stuff like pings, you kind of
have to have a goodbye handshake to avoid TCP stacks accidentally
corrupting your data. (The goodbye handshake can also help make sure
that clients end up carrying CLOSE-WAIT states instead of servers, but
that's a finicky and less important issue.)

Of course, it is absolutely true that networks are unreliable, so when
your protocol specifies a goodbye handshake like this then
implementations still need to have some way to cope if their peer
closes the connection unexpectedly, and they may need to unilaterally
close the connection in some circumstances no matter what the spec
says. Correctly handling every possible case here quickly becomes,
like, infinitely complicated. But nonetheless, as a library author one
has to try to provide some reasonable behavior by default (while
knowing that some users will end up needing to tweak things to handle
special circumstances).

My tentative approach so far in Trio is (a) make cancellation stateful
like discussed in the blog post, because accidentally hanging forever
just can't be a good default, (b) in the "trio.abc.AsyncResource"
interface that complex objects like trio.SSLStream implement (and we
recommend libraries implement too), the semantics for the aclose and
__aexit__ methods are that they're allowed to block forever trying to
do a graceful shutdown, but if cancelled then they have to return
promptly *but still freeing any underlying resources*, possibly in a
non-graceful way. So if you write straightforward code like:

with trio.move_on_after(10):
    async with open_websocket_connection(...):
        ...

then it tries to do a proper websocket goodbye handshake by default,
but if the timeout expires then it gives up and immediately closes the
socket. It's not perfect, but it seems like a better default than
anything else I can think of.
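
In code, that contract comes out roughly like this (just a sketch;
_send_goodbye and _sock are made-up placeholders for whatever the real
object wraps):

class MyConnection:
    async def aclose(self):
        try:
            # Graceful path: may block for as long as the caller allows.
            await self._send_goodbye()   # placeholder for the protocol goodbye
        finally:
            # Runs even if we were cancelled mid-goodbye: the underlying
            # resource always gets freed, just not gracefully.
            self._sock.close()           # placeholder for the raw resource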

-n

[1] There's also this whole mess where many SSL/TLS implementations
ignore the spec and don't bother sending close-notify. This is *kinda*
justifiable because the original and most popular use for SSL/TLS is
for wrapping HTTP connections, and HTTP has its own ways of signaling
the end of the connection that are already transmitted through the
encrypted tunnel, so the SSL/TLS end-of-connection handshake is
redundant. Therefore lots of 

Re: [Async-sig] Blog post: Timeouts and cancellation for humans

2018-01-14 Thread Nathaniel Smith
On Fri, Jan 12, 2018 at 4:17 AM, Chris Jerdonek
 wrote:
> Thanks, Nathaniel. Very instructive, thought-provoking write-up!
>
> One thing occurred to me around the time of reading this passage:
>
>> "Once the cancel token is triggered, then all future operations on that 
>> token are cancelled, so the call to ws.close doesn't get stuck. It's a less 
>> error-prone paradigm. ... If you follow the path we did in this blog post, 
>> and start by thinking about applying a timeout to a complex operation 
>> composed out of multiple blocking calls, then it's obvious that if the first 
>> call uses up the whole timeout budget, then any future calls should fail 
>> immediately."
>
> One case that's not clear how should be addressed is the following.
> It's something I've wrestled with in the context of asyncio, and it
> doesn't seem to be raised as a possibility in your write-up.
>
> Say you have a complex operation that you want to be able to timeout
> or cancel, but the process of cleanup / cancelling might also require
> a certain amount of time that you'd want to allow time for (likely a
> smaller time in normal circumstances). Then it seems like you'd want
> to be able to allocate a separate timeout for the clean-up portion
> (independent of the timeout allotted for the original operation).
>
> It's not clear to me how this case would best be handled with the
> primitives you described. In your text above ("then any future calls
> should fail immediately"), without any changes, it seems there
> wouldn't be "time" for any clean-up to complete.
>
> With asyncio, one way to handle this is to await on a task with a
> smaller timeout after calling task.cancel(). That lets you assign a
> different timeout to waiting for cancellation to complete.

You can get these semantics using the "shielding" feature, which the
post discusses a bit later:

try:
    await do_some_stuff()
finally:
    # Always give this 30 seconds to clean up, even if we've
    # been cancelled
    with trio.move_on_after(30) as cscope:
        cscope.shield = True
        await do_cleanup()

Here the inner scope "hides" the code inside it from any external
cancel scopes, so it can continue executing even if the overall
context has been cancelled.

However, I think this is probably a code smell. Like all code smells,
there are probably cases where it's the right thing to do, but when
you see it you should stop and think carefully. If you're writing code
like this, then it means that there are multiple different layers in
your code that are implementing timeout policies, that might end up
fighting with each other. What if the caller really needs this to
finish in 15 seconds? So if you have some way to move the timeout
handling into the same layer, then I suspect that will make your
program easier to understand and maintain. OTOH, if you decide you
want it, the code above works :-). I'm not 100% sure here; I'd
definitely be interested to hear about more use cases.

One thing I've thought about that might help is adding a kind of "soft
cancelled" state to the cancel scopes, inspired by the "graceful
shutdown" mode that you'll often see in servers where you stop
accepting new connections, then try to finish up old ones (with some
time limit). So in this case you might mark 'do_some_stuff()' as being
cancelled immediately when we entered the 'soft cancel' phase, but let
the 'do_cleanup' code keep running until the grace period expired and
the region was hard-cancelled. This idea isn't fully baked yet though.
(There's some more mumbling about this at
https://github.com/python-trio/trio/issues/147.)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] Blog post: Timeouts and cancellation for humans

2018-01-13 Thread Nathaniel Smith
On Thu, Jan 11, 2018 at 7:49 PM, Dima Tisnek  wrote:
> Very nice read, Nathaniel.
>
> The post left me wondering how cancel tokens interact or should
> logically interact with async composition, for example:
>
> with move_on_after(10):
>     await someio.gather(a(), b(), c())
>
> or
>
> with move_on_after(10):
>     await someio.first/race(a(), b(), c())
>
> or
>
> dataset = someio.Future(large_download(), move_on_after=)
>
> task a:
>     with move_on_after(10):
>         use((await dataset)["a"])
>
> task b:
>     with move_on_after(10):
>         use((await dataset)["b"])

It's funny you say "async composition"... Trio's concurrency primitive
(nurseries) is closely related to the core concurrency primitive in
Communicating Sequential Processes, which they call "parallel
composition". (Basically, if P and Q are processes, then "P || Q" is
the process that runs both P and Q in parallel and then finishes when
they've both finished.) If you were using that as your primitive, then
tasks would form an orderly tree and this wouldn't be a problem :-).
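
(In nursery terms, "P || Q" is roughly this sketch:)

import trio

async def P_parallel_Q(P, Q):
    async with trio.open_nursery() as nursery:
        nursery.start_soon(P)
        nursery.start_soon(Q)
    # Only reached once both P and Q have finished.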

Given asyncio's actual primitives though, then yeah, this is clearly
the big question, and I doubt there are any simple answers; so far my
ambition has just been to articulate the problem well enough to start
that conversation (see also the "asyncio" section in the blog post).

One possibility might be a hybrid cancel token / cancel scope API:
create a first-class cancel token API like C# has, make the
low-level asyncio APIs use them, and then on top of that add
mechanisms to attach a stack of implicitly-applied cancel tokens to
each task? That's just a vague handwave of an idea so far though.

Note that last case is the one where asyncio cancellation semantics
are already... well, surprising, anyway. If you cancel task a then
task b will receive a CancelledError, even though task a was not
cancelled. (I talked about this a bit in my "Some thoughts ..." blog
post; search for "spooky-cancellation-at-a-distance.py".)

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] avoiding accidentally calling blocking code

2018-02-14 Thread Nathaniel Smith
On Wed, Feb 14, 2018 at 12:42 AM, Chris Jerdonek
 wrote:
> Thanks, Dima and Andrew, for your suggestions.
>
> Re: loop.slow_callback_duration, that's interesting. I didn't know
> about that. However, it doesn't seem like that alone would work to
> catch many operations that are normally fast, but strictly speaking
> blocking. For example, it seems like simple disk I/O operations
> wouldn't be caught even with slow_callback_duration set to a small
> value. Dima suggests that such calls are okay. Is there a consensus on
> that?

It's not generally possible to avoid occasional arbitrary blocking,
e.g. due to the GC running, the OS scheduler, page faults, etc.
Basically the problem caused by blocking is when it means that other
tasks are stuck waiting when they could be getting useful work done.
If callbacks are finishing quickly then this isn't happening, so
slow_callback_duration is checking for exactly the right thing.
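
(For reference, turning that check on looks like this; set_debug and
slow_callback_duration are standard asyncio loop attributes:)

import asyncio
import time

async def main():
    time.sleep(0.05)  # deliberately blocking, for demonstration

loop = asyncio.new_event_loop()
loop.set_debug(True)                # enables the slow-callback warnings
loop.slow_callback_duration = 0.01  # flag anything hogging the loop for > 10 ms
loop.run_until_complete(main())     # logs a warning about the slow task step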

Where it might fall down is for operations that are only occasionally
slow, so they slip past your testing. E.g. if your disk is fast when
testing on your developer machine, but then in production you run on
some high-occupancy cloud host and a noisy neighbor starts pounding
the disk and suddenly your disk latencies shoot up.

> Dima's mock.patch_all_known_blocking_calls() is an interesting idea
> and seems like it would work for the case I mentioned. Has anyone
> started writing such a method (e.g. for certain standard lib modules)?

The implementation of gevent.monkey.patch_all() is probably not too
far from what you want.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


[Async-sig] new library: sniffio – Sniff out which async library your code is running under

2018-08-16 Thread Nathaniel Smith
Hi all,

A number of people are working on packages that support multiple async
backends (e.g., asyncio + trio, or trio + curio, or trio + twisted,
...). So then the question arises... how can I figure out which async
library my user is actually using?

Answer: install sniffio, and then call
sniffio.current_async_library(), and it tells you.
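
A minimal usage sketch:

import sniffio

async def sleep_one_second():
    library = sniffio.current_async_library()
    if library == "trio":
        import trio
        await trio.sleep(1)
    elif library == "asyncio":
        import asyncio
        await asyncio.sleep(1)
    else:
        raise RuntimeError("unsupported async library: " + library)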

Well, right now it only works for trio and asyncio, but if you
maintain an async library and you want to make it easier for packages
to detect you, then it's easy to add support – see the manual. We
considered various clever things, but ultimately decided that the best
approach was to use a ContextVar and make it the coroutine runner's
responsibility to advertise which async flavor it uses. In particular,
this approach works even for hybrid programs that are using multiple
coroutine runners in the same loop, like a Twisted program with
asyncio-flavored and twisted-flavored coroutines in the same thread,
or a Trio program using trio-asyncio to run both asyncio-flavored and
trio-flavored coroutines in the same thread.

Github: https://github.com/python-trio/sniffio
Manual: https://sniffio.readthedocs.io/
PyPI: https://pypi.org/p/sniffio

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] new library: sniffio – Sniff out which async library your code is running under

2018-08-21 Thread Nathaniel Smith
On Sat, Aug 18, 2018 at 2:44 PM, Chris Jerdonek
 wrote:
> The kind of alternative I had in mind for a neutral location is
> setting an attribute with an agreed upon name on a module in the
> standard lib, perhaps something like
> `contextvars.current_async_library_cvar` to use your naming. This is
> analogous to agreeing on a file name to store information of a certain
> kind in a repository root, like .travis.yml, package.json, or
> pyproject.toml. It's light-weight and doesn't require any
> infrastructure or tying to a particular package on PyPI.

Yeah, it'd be possible. I guess it just didn't seem worth the extra
complication.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] new library: sniffio – Sniff out which async library your code is running under

2018-08-17 Thread Nathaniel Smith
On Fri, Aug 17, 2018, 09:09 Alex Grönholm  wrote:

> This was my approach:
>
> def _detect_running_asynclib() -> str:
>     if 'trio' in sys.modules:
>         from trio.hazmat import current_trio_token
>         try:
>             current_trio_token()
>         except RuntimeError:
>             pass
>         else:
>             return 'trio'
>
>     if 'curio' in sys.modules:
>         from curio.meta import curio_running
>         if curio_running():
>             return 'curio'
>
>     if 'asyncio' in sys.modules:
>         from .backends.asyncio import get_running_loop
>         if get_running_loop() is not None:
>             return 'asyncio'
>
>     raise LookupError('Cannot find any running async event loop')
>
>
> Is there something wrong with this?
>

If you're using trio-asyncio, then you can have both trio-flavored
coroutines and asyncio-flavored coroutines running in the same thread. And
in particular, the trio and asyncio tests you do above will both return
true at the same time, even though at any given moment you can only
'await' one kind of async function or the other.

Twisted running on the asyncio reactor has a similar situation.

-n
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] new library: sniffio – Sniff out which async library your code is running under

2018-08-18 Thread Nathaniel Smith
On Fri, Aug 17, 2018 at 11:44 PM, Chris Jerdonek
 wrote:
> On Fri, Aug 17, 2018 at 12:50 PM, Nathaniel Smith  wrote:
>> On Fri, Aug 17, 2018, 12:12 Chris Jerdonek  wrote:
>>>
>>> Did you also think about whether it would be possible for a library to
>>> advertise itself without having to depend on a third-party library
>>> (e.g. using some sort of convention)? That would permit a less
>>> "centralized" approach.
>>
>>
>> What kind of convention do you have in mind?
>
> Good question. I don't claim to know the answer which is why I asked
> if you had thought about it. The *kind* of thing I had in mind was to
> set a variable with an agreed-upon name and value on an agreed-upon
> module in the standard library -- though I agree that seems hacky as
> stated.
>
> It does seem to me like something that should (already?) have a
> general solution. What other ways does Python let things register or
> "announce" themselves?

Well, you could register an entry in an some global dict under an
agreed-on key, like, say, sys.modules["sniffio"]. Of course, whenever
you're mutating a global object like this you should worry about name
collisions, but fortunately that particular dict has a good convention
for reserving names. In fact there's a whole web service called "PyPI"
devoted to managing those registrations! And then you might as well
upload the code for accessing that variable to the web service, so
everyone doesn't have to copy/paste it into their programs... ;-)

Now that packaging works reliably, it's a pretty good solution for
this kind of thing IMHO.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


[Async-sig] New blog post: Notes on structured concurrency, or: Go statement considered harmful

2018-04-25 Thread Nathaniel Smith
Hi all,

I just posted another essay on concurrent API design:

https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/

This is the one that finally gets at the core reasons why Trio exists;
I've been trying to figure out how to write it for at least a year
now. I hope you like it.

(Guido: this is the one you should read :-). Or if it's too much, you
can jump to the conclusion [1], and I'm happy to come find you
somewhere with a whiteboard, if that'd be helpful!)

-n

[1] 
https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/#conclusion

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] asyncio.Lock equivalent for multiple processes

2018-04-17 Thread Nathaniel Smith
Pretty sure you want to add a try/finally around that yield, so you release
the lock on errors.
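
Something along these lines (a sketch built on the code from the quoted
message below, with the release moved into a finally block):

from contextlib import asynccontextmanager  # 3.7+; use async_generator's on 3.6

@asynccontextmanager
async def lock(env, category='global', name='global'):
    lock_name = '%s.%s' % (category, name)
    await env['aiopg']['cursor'].execute(
        "SELECT pg_advisory_lock( hashtext(%(lock_name)s) );",
        {'lock_name': lock_name})
    try:
        yield None
    finally:
        # Runs even if the body raises or is cancelled, so the advisory
        # lock can't leak.
        await env['aiopg']['cursor'].execute(
            "SELECT pg_advisory_unlock( hashtext(%(lock_name)s) );",
            {'lock_name': lock_name})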

On Tue, Apr 17, 2018, 14:39 Ludovic Gasc  wrote:

> 2018-04-17 15:16 GMT+02:00 Antoine Pitrou :
>
>>
>>
>> You could simply use something like the first 64 bits of
>> sha1("myapp:")
>>
>
> I have followed your idea, except I used hashtext directly, it's an
> internal postgresql function that generates an integer directly.
>
> For now, it seems to work pretty well but I didn't yet finished all tests.
> The final result is literally 3 lines of Python inside an async
> contextmanager, I like this solution ;-) :
>
> @asynccontextmanager
> async def lock(env, category='global', name='global'):
>     # Alternative lock id with 'mytable'::regclass::integer OID
>     await env['aiopg']['cursor'].execute(
>         "SELECT pg_advisory_lock( hashtext(%(lock_name)s) );",
>         {'lock_name': '%s.%s' % (category, name)})
>
>     yield None
>
>     await env['aiopg']['cursor'].execute(
>         "SELECT pg_advisory_unlock( hashtext(%(lock_name)s) );",
>         {'lock_name': '%s.%s' % (category, name)})
>
>
>
>>
>> Regards
>>
>> Antoine.
>>
>>
>> On Tue, 17 Apr 2018 15:04:37 +0200
>> Ludovic Gasc  wrote:
>> > Hi Antoine & Chris,
>> >
>> > Thanks a lot for the advisory lock, I didn't know this feature in
>> > PostgreSQL.
>> > Indeed, it seems to fit my problem.
>> >
>> > The small latest problem I have is that we have string names for locks,
>> > but advisory locks accept only integers.
>> > Nevertheless, it isn't a problem, I will do a mapping between names and
>> > integers.
>> >
>> > Yours.
>> >
>> > --
>> > Ludovic Gasc (GMLudo)
>> >
>> > 2018-04-17 13:41 GMT+02:00 Antoine Pitrou :
>> >
>> > > On Tue, 17 Apr 2018 13:34:47 +0200
>> > > Ludovic Gasc  wrote:
>> > > > Hi Nickolai,
>> > > >
>> > > > Thanks for your suggestions, especially for the file system lock:
>> We
>> > > don't
>> > > > have often locks, but we must be sure it's locked.
>> > > >
>> > > > For 1) and 4) suggestions, in fact we have several systems to sync
>> and
>> > > also
>> > > > a PostgreSQL transaction, the request must be treated by the same
>> worker
>> > > > from beginning to end and the other systems aren't idempotent at
>> all,
>> > > it's
>> > > > "old-school" proprietary systems, good luck to change that ;-)
>> > >
>> > > If you already have a PostgreSQL connection, can't you use a
>> PostgreSQL
>> > > lock?  e.g. an "advisory lock" as described in
>> > > https://www.postgresql.org/docs/9.1/static/explicit-locking.html
>> > >
>> > > Regards
>> > >
>> > > Antoine.
>> > >
>> > >
>> > >
>> >
>>
>>
>>
>> ___
>> Async-sig mailing list
>> Async-sig@python.org
>> https://mail.python.org/mailman/listinfo/async-sig
>> Code of Conduct: https://www.python.org/psf/codeofconduct/
>>
>
> ___
> Async-sig mailing list
> Async-sig@python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
>
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] await question

2018-12-10 Thread Nathaniel Smith
Yeah, 'await' makes it *possible* for the function you're calling to return
control to the event loop, but returning is still an explicit action that
the function has to take.

In asyncio, the only operation that actually returns control to the event
loop is awaiting a Future object.

In Trio, we make a documented guarantee that all the async functions in the
'trio' namespace do in fact return control to the event loop (we call this
a "checkpoint").

In both cases, the effect is the same: if a function never actually awaits
a Future/calls one of Trio's built-in async functions, either directly or
indirectly, then it won't return to the event loop.
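
For example, with asyncio (a toy sketch):

import asyncio

async def square(i):
    return i * i  # plain return; no Future is ever awaited

async def busy():
    total = 0
    for i in range(1000):
        total += await square(i)  # lots of 'await', zero trips to the event loop
    return total

async def polite():
    await asyncio.sleep(0)  # awaiting a real asyncio operation yields to the loop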

(And curio actually has a number of primitives that you call with await,
and that do in fact return to the event loop, but that still don't actually
let other tasks run, which I found pretty confusing. This is one of the
major reasons I stopped using curio.)

-n

On Mon, Dec 10, 2018, 04:32 Dima Tisnek  wrote:

> No, in this case fib(1) is resolved instantly, thus its caller is
> resolved instantly, thus...
>
> On Mon, 10 Dec 2018 at 9:28 PM, Pradip Caulagi  wrote:
>
>> I was wondering if every use of 'await' should return the control to
>> event loop? So in this example -
>> https://gist.github.com/caulagi/3edea8cf734495f2592528a48f99e1d2 - I
>> was hoping I would see 'A', 'B' to be mixed, but I see only 'A'
>> followed by 'B'. What am I missing? I am using Python 3.7.1.
>>
>> How is my example different from
>> https://docs.python.org/3.7/library/asyncio-task.html#asyncio.run?
>>
>> Thanks.
>> ___
>> Async-sig mailing list
>> Async-sig@python.org
>> https://mail.python.org/mailman/listinfo/async-sig
>> Code of Conduct: https://www.python.org/psf/codeofconduct/
>>
> ___
> Async-sig mailing list
> Async-sig@python.org
> https://mail.python.org/mailman/listinfo/async-sig
> Code of Conduct: https://www.python.org/psf/codeofconduct/
>
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


[Async-sig] Preventing `yield` inside certain context managers

2019-04-02 Thread Nathaniel Smith
This proposal might be of interest to folks here:
https://discuss.python.org/t/preventing-yield-inside-certain-context-managers/1091

I posted the main text there instead of here because it's a change to
the core interpreter, but the motivation is async APIs.

Probably best to post replies there, to keep the discussion consolidated.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] Inadvertent layering of synchronous code as frameworks adopt asyncio

2019-03-27 Thread Nathaniel Smith
On Wed, Mar 27, 2019 at 1:49 PM Guido van Rossum  wrote:
>
> On Wed, Mar 27, 2019 at 1:23 PM Nathaniel Smith  wrote:
>>
>> On Wed, Mar 27, 2019 at 10:44 AM Daniel Nugent  wrote:
>> >
>> > FWIW, the ayncio_run_encapsulated approach does not work with the 
>> > transport/protocol apis because the loop needs to stay alive concurrent 
>> > with the connection in order for the awaitables to all be on the same loop.
>>
>> Yeah, there are two basic approaches being discussed here: using two
>> different loops, versus re-entering an existing loop.
>> asyncio_run_encapsulated is specifically for the two-loops approach.
>>
>> In this version, the outer loop, and everything running on it, stop
>> entirely while the inner loop is running – which is exactly what
>> happens with any other synchronous, blocking API. Using
>> asyncio_run_encapsulated(aiohttp.get(...)) in Jupyter is exactly like
>> using requests.get(...), no better or worse.
>
>
> And Yury's followup suggests that it's hard to achieve total isolation 
> between loops, due to subprocess management and signal handling (which are 
> global states in the OS, or at least per-thread -- the OS doesn't know about 
> event loops).

The tough thing about signals is that they're all process global
state, *not* per-thread.

In Trio I think this wouldn't be a big deal – whenever we touch signal
handlers, we save the old value and then restore it afterwards, so the
inner loop would just temporarily override the outer loop, which I
guess is what you'd expect. (And Trio's subprocess support avoids
touching signals or any global state.) Asyncio could potentially do
something similar, but its subprocess support does rely on signals,
which could get messy since the outer loop can't be allowed to miss
any SIGCHLDs. Asyncio does have a mechanism to share SIGCHLD handlers
between loops (intended to support the case where you have loops
running in multiple threads simultaneously), and it might handle this
case too, but I don't know the details well enough to say for sure.
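
The save-and-restore discipline looks roughly like this (a sketch, not
Trio's actual internals):

import signal
from contextlib import contextmanager

@contextmanager
def temporarily_handle(signum, handler):
    previous = signal.signal(signum, handler)  # install ours, remember the old one
    try:
        yield
    finally:
        signal.signal(signum, previous)  # put the outer handler back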

> I just had another silly idea. What if the magical decorator that can be used 
> to create a sync version of an async def (somewhat like tworoutines) made the 
> async version hand off control to a thread pool? Could be a tad slower, but 
> the tenor of the discussion seems to be that performance is not that much of 
> an issue.

Unfortunately I don't think this helps much... If your async def
doesn't use signals, then it won't interfere with the outer loop's
signal state and a thread is unnecessary. And if it *does* use
signals, then you can't put it in a thread, because Python threads are
forbidden to call any of the signal-related APIs.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/


Re: [Async-sig] Reliably make unhandled exceptions crash the event loop

2019-02-25 Thread Nathaniel Smith
On Mon, Feb 25, 2019 at 4:15 PM Josh Quigley <0zer...@gmail.com> wrote:
>
> I've realised the error of my ways: because Task separates the scheduling 
> from the response handling, you cannot know if an exception is unhandled 
> until the task is deleted. So in my example the reference means the task is 
> not deleted, so the exception is not yet unhandled.
>
> This is in contrast to APIs like call_soon(callable, success_callback, 
> error_callback) where there the possibility of delayed error handling is not 
> present. In that case the loop can reliably crash if either callback raises 
> an exception.
>
> So, the 'solution' to this use-case is to always attach error handers to 
> Tasks. A catch-all solution cannot catch every error case.

That's right. There are other ways to structure async code to avoid
running into these cases, that are implemented in Trio, and there are
discussions happening (slowly) about adding them into asyncio as well.
See:

https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/

Also, I could swear I saw some library that tried to implement
nurseries on asyncio, but I can't find it now... :-/ maybe someone
else here knows?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Async-sig mailing list
Async-sig@python.org
https://mail.python.org/mailman/listinfo/async-sig
Code of Conduct: https://www.python.org/psf/codeofconduct/