Re: [python-tulip] Why Tasks are not callable? (+docs issues)

Paul Sokolovsky Fri, 02 May 2014 09:39:12 -0700

Hello,

On Thu, 1 May 2014 19:41:52 -0700
Guido van Rossum <[email protected]> wrote:

> Paul,
> 
> Where were you when PEP 3156 was being discussed?

I wasn't around, sorry ;-).

> 
> There's probably a very good reason that explains why the current API
> is "right", but the point is moot -- we have selected an API, we have
> implemented it, we have released it, and now we should live with it
> and start using it.

Yeah, "start using" is exactly what I'm trying to do. I hope this close
attention to the asyncio API doesn't get misinterpreted - I'm sure many
people, myself included, consider asyncio a very important module, so
more detailed scrutiny of it is yet to come IMHO. Besides, the PEP
mentions that there's room for adjustments until 3.5, if they are found
worthwhile. But as I mentioned, I'm not pushing for any (specific)
changes (besides docs clarifications); I just consider it good to learn
what asyncio *is* by considering how it compares with expectations from
prior art and how it could be different. If this thread gives some
insight to a casual googler later, I'll consider that it has served its
purpose well.

With that intro, I'd like to finish this thread with conclusions on
the specific use case I intended for asyncio. As I mentioned previously,
the idea was to implement a "light" version of asyncio for
MicroPython, so that 1) the same code could run unmodified on both
CPython and uPython implementations, while 2) the uPython implementation
was really light on resource usage. The 2nd requirement means that the
framework should be a very thin layer on top of coroutines, as they are
implemented at the C level and thus inherently more efficient than
Python callbacks and Future instances to wrap them (and I spent a lot of
time figuring out how "yield from" should work and implementing that).

Well, a careful reading of PEP 3156 sets the record straight: asyncio
was designed with quite different requirements! To put it in a funny
way, I took for granted that Python's async framework would support
coroutines, and then looked for an excuse why all those Futures and
Tasks stand in my way of just using them. But asyncio actually
started with that proverbial "least common denominator" of callbacks,
and then provided an excuse to support native coroutines in the same
framework (because BDFL doesn't like callbacks. Just kidding ;-) ).
Specifically, the event loop abstraction itself lacks any support
for coroutines, and coroutine support is fully implemented on top of
the public event loop API. Here are the relevant quotes from the PEP:

"For users (like myself) who don't like using callbacks, a scheduler is
provided for writing asynchronous I/O code as coroutines using the PEP
380 yield from expressions. The scheduler is not pluggable;
pluggability occurs at the event loop level, and the standard scheduler
implementation should work with any conforming event loop
implementation. (In fact this is an important litmus test for
conforming implementations.)"

"For interoperability between code written using coroutines and other
async frameworks, the scheduler defines a Task class that behaves like
a Future."

"The scheduler has no public interface. You interact with it by using
yield from future and yield from task. In fact, there is no single
object representing the scheduler -- its behavior is implemented by the
Task and Future classes using only the public interface of the event
loop, so it will work with third-party event loop implementations, too."

Does it make sense? Pretty much, especially taking into account that
asyncio's "business case" is providing a foundation for various
existing async frameworks to interoperate.
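
To make the layering in those quotes concrete, here is a minimal sketch
(an illustration only, not asyncio's actual Task code) of a coroutine
being driven through nothing but the public loop.call_soon() method.
The names drive(), countdown() and run_demo() are hypothetical, and the
coroutine is a plain generator using a bare "yield" to give up control:

```python
import asyncio


def drive(loop, gen, on_done):
    """Run generator `gen` to completion using only loop.call_soon()."""

    def step():
        try:
            next(gen)  # run the coroutine up to its next bare `yield`
        except StopIteration as exc:
            on_done(exc.value)  # the coroutine returned; report its result
        else:
            loop.call_soon(step)  # reschedule via the public loop API

    loop.call_soon(step)


def countdown(n):
    # A generator-based coroutine; each `yield` hands control back to the loop.
    while n > 0:
        yield
        n -= 1
    return "done"


def run_demo():
    loop = asyncio.new_event_loop()
    results = []

    def finished(value):
        results.append(value)
        loop.stop()  # nothing else is scheduled, so end run_forever()

    drive(loop, countdown(3), finished)
    loop.run_forever()
    loop.close()
    return results
```

The loop itself never learns that a generator is involved: it only sees
the step() callback, which is the kind of separation the PEP describes.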

As I mentioned previously, I thought that adding a loop method
to schedule a coroutine directly would solve my issues, but based
on the requirements above, it's a no-go, as it would break asyncio's
layering.

So, that's it - asyncio has respectable aims and requirements, but
those unfortunately do not cover all possible requirements for an async
framework that the Python community may have. That of course comes as a
bit of a surprise, but I hope people who use other frameworks will at
least give asyncio a try, and if they dismiss it, then based on a
requirements mismatch, and not because they didn't have time for it,
didn't "like" it, or didn't understand it.

My specific choice for MicroPython then is to develop an event loop with
native coroutine support - it will mimic the asyncio API, but won't be
compatible, so another adaptation layer will be needed to run
asyncio-upy on native asyncio. And I don't know where that will lead me
- maybe I'll find Futures and wrapping coros in Tasks to be unavoidably
useful, and then it will be just one small step to full asyncio
compatibility. We'll see.
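
As a rough illustration of what "an event loop with native coroutine
support" could mean, here is a minimal round-robin sketch that queues
generator objects directly, with no Task or Future wrappers. It is only
an assumption about the general shape of such a loop: TinyLoop,
schedule(), worker() and run_demo() are hypothetical names, taken
neither from asyncio nor from the uPython code, and there is no I/O or
timer handling at all:

```python
from collections import deque


class TinyLoop:
    """Hypothetical round-robin loop that schedules generators directly."""

    def __init__(self):
        self.ready = deque()

    def schedule(self, coro):
        # Queue a generator object as is; no Task or Future wrapper.
        self.ready.append(coro)

    def run(self):
        while self.ready:
            coro = self.ready.popleft()
            try:
                next(coro)  # resume until the coroutine's next `yield`
            except StopIteration:
                continue  # the coroutine finished, so drop it
            self.ready.append(coro)  # still alive: back of the queue


def worker(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        yield  # give other coroutines a turn


def run_demo():
    log = []
    loop = TinyLoop()
    loop.schedule(worker("a", 2, log))
    loop.schedule(worker("b", 1, log))
    loop.run()
    return log
```

Calling run_demo() interleaves the two workers one step at a time; the
only per-coroutine state the loop keeps is the generator object itself.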

In the meantime, while analyzing all this stuff, I drafted a trivial
asyncio subset implementation (native API) which is capable of running
examples from the asyncio docs (event loop & coroutines/tasks sections).
Maybe it will be useful for others studying the asyncio design:
https://github.com/micropython/micropython-lib/tree/asyncio/asyncio_slow

> 
> --Guido
> 
> 
> 
> On Thu, May 1, 2014 at 7:14 PM, Paul Sokolovsky <[email protected]>
> wrote:
> 
> > Hello,
> >
> > On Fri, 25 Apr 2014 21:05:48 +0300
> > Paul Sokolovsky <[email protected]> wrote:
> >
> > []
> >
> > > > I suspect your expectations are tainted by the previous
> > > > knowledge of the threading API, which has a separate
> > > > Thread.start() method.  I
> > >
> > > My expectations are "tainted" by: 1) basic programming rule of
> > > thumb that you first initialize things properly, and then execute
> > > them; 2) intuitive feeling, and even explicit knowledge, of
> > > Python's "explicit is better than implicit" principle; 3)
> > > acquaintance (cursory, I have to admit) with the many-year history
> > > of using generators/coroutines for async cooperative multitasking,
> > > and a desire to do that using the standardized API asyncio promotes.
> > >
> > > > think it makes _some_ sense that Thread objects do not start the
> > > > actual thread automatically, since threads are preemptive and
> > > > prone to race conditions, and you may want to store the Thread
> > > > object in some data structure _before_ the thread actually
> > > > begins executing. With asyncio.Task, even if the task is
> > > > scheduled to be executed, it is guaranteed not to be executed
> > > > until you reach a "yield from" statement, so you have plenty of
> > > > opportunity to do any setup prior to the task executing.
> > >
> > > Let's sum up what you're saying here: the asyncio Task
> > > implementation, by relying on internal asyncio implementation
> > > details (so, naive users who fixate on such behavior
> > > will fail miserably in other contexts), violates the "Explicit is
> > > better than implicit" principle *just because it can*?
> >
> > Ok, I did some (re)reading on the topic, and had some time to think
> > about it, based on the arguments provided, and here are some
> > additional thoughts and arguments:
> >
> > Point #1
> >
> > First of all, I probably should have mentioned that my expectations
> > for a coroutine scheduler were set by the wonderful series on
> > generators and coroutines by David Beazley. This specific slide
> > gives the essence of it:
> >
> > http://www.slideshare.net/dabeaz/a-curious-course-on-coroutines-and-concurrency-5286140/137
> > . So, it's possible to write a *coroutine* scheduler in such a way
> > that coroutines do not (and cannot if needed) access the main loop
> > directly. They communicate with it using yield/yield from, which
> > serve the same purpose as a syscall in an OS design. So, knowing
> > that Python offers such level separation, it added to my cognitive
> > dissonance to see that asyncio not only does not separate object
> > access, it even tightly couples the behavior of Task to a loop.
> >
> > Point #2
> >
> > The latest of David's series was presented just at the recent PyCon
> > 2014: http://www.dabeaz.com/finalgenerator/ . And from slide 43 he
> > presents a step-by-step walkthrough of building a concurrent
> > execution framework, which (un)surprisingly shapes up as having
> > almost the same API and architecture as asyncio. So, it should be
> > fair to say that those slides are a good tutorial on asyncio design
> > for dummies. His framework is very similar to asyncio: it starts
> > with callbacks, then switches to coroutines as a more adequate
> > representation, they get wrapped in Tasks for bookkeeping, results
> > are represented by Futures, then it's shown that Task and Future
> > share many traits, so it makes sense to make one a subclass of the
> > other, etc.
> >
> > They are very similar except for one implementation detail: David's
> > framework doesn't use cooperative multitasking for execution, but
> > rather a thread pool. You can easily imagine what that means: a
> > started Task really does start immediately, so if it suddenly
> > starts behind the user's back, there's no time to add callbacks to
> > it later. That's why David's framework doesn't start Tasks behind
> > the user's back, which is the natural solution (like, you don't
> > need to know that it doesn't start them - it's just the default
> > choice). During the initial stages of the design, Tasks are
> > kickstarted using a .step() method; later, explicit scheduling
> > functions are introduced: start_inline_future(),
> > run_inline_future().
> >
> > So, let's step back and overview the situation.
> > https://docs.python.org/3.4/library/asyncio-task.html#future
> > explicitly says that asyncio.Future is "almost" compatible with
> > concurrent.futures.Future. Why "almost"? Apparently because
> > concurrent.futures.Future has some features depending on concurrent
> > execution model and specifically underlying thread/process
> > implementations, which don't map well to cooperative/event loop
> > execution model. PEP-3156 explicitly mentions that it would be nice
> > to unify both Futures in the future.
> >
> > Certainly, asyncio would learn from such experience and try to
> > provide an API model not relying on particular underlying details
> > which would hamper compatibility and reuse, yes? No, because what
> > we are talking about is that asyncio (ab)uses the fact that the
> > underlying event loop doesn't start execution immediately, so it
> > forcefully schedules a Task and makes the user add important
> > changes to it after it is in the active state, which is backwards
> > from a general point of view.
> >
> > Point #3
> >
> > Yet another perspective. Ok, after all there's nothing wrong with
> > being able to schedule a coroutine using a global function - after
> > all, Point #1 above praises complete separation between coroutines
> > and loop using a yield. As yield cannot be used outside a function,
> > it's not such a bad idea to provide a global function to schedule a
> > coroutine. One problem here is that "Task" or "async" are not very
> > suggestive names for a function which performs scheduling.
> > Actually, I have a hypothesis why it's not very plausible to
> > imagine such a purpose for them at all. It's grounded in a
> > dichotomy in the asyncio API:
> >
> > 1. Some operations are expressed as methods of event loop object,
> > e.g.
> >
> > loop.run_forever()
> > loop.call_soon()
> >
> > 2. While others are expressed as global functions taking an optional
> > loop parameter:
> >
> > asyncio.wait(..., loop=None, ...)
> > asyncio.sleep(..., loop=None)
> >
> >
> > This API asymmetry is not particularly obvious at first look. The
> > docs start with a description of loop methods, which kind of sets
> > the expectation that all important functions should be available as
> > such, and that the rest are just objects/factory functions, and not
> > normal functions with side effects, the category to which both
> >
> > asyncio.Task(..., loop=None)
> > asyncio.async(..., loop=None)
> >
> > should be related (regardless of the actual implementation details,
> > like the fact that "Task" is implemented as a class).
> >
> > How can this issue be solved (besides being clearly described in
> > the docs)? Well, it would help if the module offered just one
> > particular variety of API. For example, my problem is that I
> > expected all operations to be available as methods of the loop.
> >
> > But dropping that and having stuff like:
> >
> > asyncio.run_forever(loop=None)
> >
> > would work just as well, and would probably allow for an even more
> > efficient implementation (no need for a dummy loop object when we
> > have an "embedded loop", for example).
> >
> > Finally, having both models, but offering more complete coverage of
> > operations in both (with easy-to-understand names) would be good
> > too.
> >
> >
> >
> > --
> > Best regards,
> >  Paul                          mailto:[email protected]
> >
> 
> 
> 
> -- 
> --Guido van Rossum (python.org/~guido)

-- 
Best regards,
 Paul                          mailto:[email protected]
