[Python-ideas] Re: await by default

Kyle Stanley Sat, 13 Jun 2020 22:44:30 -0700

> IOW the solution to the problem is to use threads. You can see here
> why I said what I did: threads specifically avoid this problem and the
> only way for asyncio to avoid it is to use threads.


In the case of the above example, I'd say it's more so "use coroutines by
default and threads as needed" rather than just using threads, but fair
enough. I'll concede that point.

> For instance, maybe during testing (with debug=True), your
> DNS lookups are always reasonably fast, but then some time after
> deployment, you find that they're stalling you out. How much effort is
> it to change this over? How many other things are going to be slow,
> and can you find them all?

That's very situationally dependent, but for any IO-bound call with a
variable duration where async isn't an option (because an async version
isn't available, standardized, or widespread), I'd advise using
loop.run_in_executor()/to_thread() preemptively. This is easier said than
done of course, and it's very possible for some calls to be overlooked. If
one is missed though, I don't think it takes much effort to change it over;
IMO the main challenge is locating all of them in production for a large,
existing codebase.
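
As a rough sketch of what doing this preemptively could look like
(blocking_lookup() is a hypothetical stand-in for any blocking IO-bound
call such as socket.gethostbyname(); it uses time.sleep() so the example
runs without network access, and heartbeat() is only there to show that
the event loop keeps running in the meantime):

```python
import asyncio
import time

def blocking_lookup(hostname):
    # Hypothetical stand-in for a blocking call like socket.gethostbyname().
    time.sleep(0.05)
    return f"resolved:{hostname}"

async def heartbeat(ticks):
    # Only makes progress while the event loop is not blocked.
    for _ in range(5):
        await asyncio.sleep(0.005)
        ticks.append(time.monotonic())

async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    # Run the blocking call in the default thread pool executor.
    # On 3.9+: result = await asyncio.to_thread(blocking_lookup, "python.org")
    result = await loop.run_in_executor(None, blocking_lookup, "python.org")
    await hb
    return result, len(ticks)

result, tick_count = asyncio.run(main())
```

Calling blocking_lookup() directly inside main() would instead hold up
heartbeat() until the call returned.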

> 3) Steven D'Aprano is terrified of them and will rail on you for using
> threads.

Haha, I've somehow completely missed that. I CC'd Steven in the response,
since I'm curious as to what he has to say about that.

> Take your pick. Figure out what your task needs. Both exist for good
> reasons.

Completely agreed. Threads and coroutines are two completely different
approaches, with neither one being clearly superior for all situations.
Even as someone who's invested a significant amount of time in helping to
improve asyncio recently, I'll admit that I fairly often encounter users
who would be better off using threads, particularly for code that isn't
performance or resource critical, or that involves a reasonably small
number of concurrent operations that aren't expected to scale significantly
in volume. The fine-grained control over context switching (which can be a
pro or a con), shorter switch delay, and lower resource usage from
coroutines aren't always worth the added code complexity.
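
For that kind of small, fixed amount of concurrency, a thread pool can be
about as simple as it gets. A minimal sketch, where fetch() is a
hypothetical blocking task and time.sleep() stands in for the real IO:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(name):
    # Hypothetical blocking task; time.sleep() stands in for real IO.
    time.sleep(0.02)
    return name.upper()

names = ["a", "b", "c"]
# One worker per task is fine here because the number of tasks is small.
with ThreadPoolExecutor(max_workers=3) as pool:
    # map() returns results in the same order as the inputs.
    results = list(pool.map(fetch, names))
```

There's no async/await anywhere in the calling code, which is most of the
appeal when the scale doesn't call for coroutines.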



On Sun, Jun 14, 2020 at 12:43 AM Chris Angelico <[email protected]> wrote:

> On Sun, Jun 14, 2020 at 2:16 PM Kyle Stanley <[email protected]> wrote:
> >
> > > If you're fine with invisible context switches, you're probably
> > > better off with threads, because they're not vulnerable to
> > > unexpectedly blocking actions (a common culprit being name lookups
> > > before network transactions - you can connect sockets
> > > asynchronously, but gethostbyname will block the current thread).
> >
> > These "unexpectedly blocking actions" can be identified in asyncio's
> > debug mode. Specifically, any callback or task step that has a duration
> > greater than 100ms will be logged. Then, the user can take a closer look at
> > the offending long running step. If it's like socket.gethostbyname() and is
> > a blocking IO-bound function call, it can be executed in a thread pool
> > using loop.run_in_executor(None, socket.gethostbyname, hostname) to avoid
> > blocking the event loop. In 3.9, there's also a roughly equivalent
> > higher-level function that doesn't require access to the event loop:
> > asyncio.to_thread(socket.gethostbyname, hostname).
> >
> > With the default duration of 100ms, it likely wouldn't pick up on
> > socket.gethostbyname(), but it can rather easily be adjusted via the
> > modifiable loop.slow_callback_duration attribute.
> >
> > Here's a quick, trivial example:
> > ```
> > import asyncio
> > import socket
> >
> > async def main():
> >     loop = asyncio.get_running_loop()
> >     loop.slow_callback_duration = .01 # 10ms
> >     socket.gethostbyname("python.org")
> >
> > asyncio.run(main(), debug=True)
> > # If asyncio.run() is not an option, it can also be enabled via:
> > #     loop.set_debug(True)
> > #     running Python with -X dev
> > #     setting the PYTHONASYNCIODEBUG=1 env var
> > ```
> > Output (3.8.3):
> > Executing <Task finished name='Task-1' coro=<main() done, defined at
> > asyncio_debug_ex.py:5> result=None created at
> > /usr/lib/python3.8/asyncio/base_events.py:595> took 0.039 seconds
> >
> > This is a bit more involved than it is for working with threads; I just
> > wanted to demonstrate one method of addressing the problem, as it's a
> > decently common issue. For more details about asyncio's debug mode, see
> > https://docs.python.org/3/library/asyncio-dev.html#debug-mode.
> >
>
> IOW the solution to the problem is to use threads. You can see here
> why I said what I did: threads specifically avoid this problem and the
> only way for asyncio to avoid it is to use threads. (Yes, you can
> asynchronously do a DNS lookup rather than using gethostbyname, but
> the semantics aren't identical, and you may seriously annoy someone
> who uses other forms of name resolution. So that doesn't count.) As an
> additional concern, you don't always know which operations are going
> to be slow. For instance, maybe during testing (with debug=True), your
> DNS lookups are always reasonably fast, but then some time after
> deployment, you find that they're stalling you out. How much effort is
> it to change this over? How many other things are going to be slow,
> and can you find them all?
>
> That's why threads are so convenient for these kinds of jobs.
>
> Disadvantages of threads:
> 1) Overhead. If you make one thread for each task, your maximum
> simultaneous tasks can potentially be capped. Irrelevant if each task
> is doing things with far greater overhead anyway.
> 2) Unexpected context switching. Unless you use locks, a context
> switch can occur at any time. The GIL ensures that this won't corrupt
> Python's internal data structures, but you have to be aware of it with
> any mutable globals or shared state.
> 3) Steven D'Aprano is terrified of them and will rail on you for using
> threads.
>
> Disadvantages of asyncio:
> 1) Code complexity. You have to explicitly show which things are
> waiting on which others.
> 2) Unexpected LACK of context switching. Unless you use await, a
> context switch cannot occur.
>
> Take your pick. Figure out what your task needs. Both exist for good
> reasons.
>
> ChrisA
> _______________________________________________
> Python-ideas mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/[email protected]/message/AJ2EOLSWSOAPSUG7BOM5MF3CHP3BHS3H/
> Code of Conduct: http://python.org/psf/codeofconduct/
>

Message archived at 
https://mail.python.org/archives/list/[email protected]/message/HWNQAUZPTKPMNHTRJFGLVC3HBBJDM4AF/