Re: [python-tulip] Why Tasks are not callable? (+docs issues)

Guido van Rossum Thu, 01 May 2014 19:42:25 -0700

Paul,

Where were you when PEP 3156 was being discussed?


There's probably a very good reason that explains why the current API is
"right", but the point is moot -- we have selected an API, we have
implemented it, we have released it, and now we should live with it and
start using it.

--Guido



On Thu, May 1, 2014 at 7:14 PM, Paul Sokolovsky <[email protected]> wrote:

> Hello,
>
> On Fri, 25 Apr 2014 21:05:48 +0300
> Paul Sokolovsky <[email protected]> wrote:
>
> []
>
> > > I suspect your expectations are tainted by the previous knowledge
> > > of the threading API, which has a separate Thread.start() method.  I
> >
> > My expectations are "tainted" by: 1) basic programming rule of thumb
> > that you first initialize things properly, and then execute them; 2)
> > intuitive feeling, and even explicit knowledge, of Python's "explicit
> > is better than implicit" principle; 3) acquaintance (cursory, I have
> > to admit) with many-year history of using generators/coroutines for
> > async cooperative multitasking, and desire to use that using
> > standardized API asyncio promotes.
> >
> > > think it makes _some_ sense that Thread objects do not start the
> > > actual thread automatically, since threads are preemptive and prone
> > > to race conditions, and you may want to store the Thread object in
> > > some data structure _before_ the thread actually begins executing.
> > > With asyncio.Task, even if the task is scheduled to be executed, it
> > > is guaranteed not to be executed until you reach "yield from"
> > > statement, so you have plenty of opportunity to do any setup prior to
> > > the task executing.
> >
> > Let's sum up what you're saying here: asyncio Task implementation, by
> > relying on internal asyncio implementation details (so, naive users
> > who will get fixation on such behavior will fail miserably in other
> > contexts), violates "Explicit is better than implicit" principle *just
> > because it can* ?
>
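
A minimal sketch of the behaviour being argued about above, written in the later async/await spelling rather than the "yield from" form used in this thread. asyncio.ensure_future() is the later name for asyncio.async(); the names worker, main and events are hypothetical and exist only for this illustration. It shows that a Task is scheduled when it is created, but does not run until the creating coroutine yields to the loop, so a callback attached after creation is still in place in time.

```python
import asyncio

events = []  # records the order in which things happen

async def worker():
    events.append("worker ran")
    return 42

async def main():
    # Creating the Task schedules it on the loop, but it has not run yet.
    task = asyncio.ensure_future(worker())
    events.append("task created")

    # Setup done after creation still happens before worker() executes,
    # because main() has not yielded control to the event loop so far.
    task.add_done_callback(
        lambda t: events.append("callback saw %r" % t.result()))
    events.append("callback attached")

    # First point at which the loop gets a chance to run worker().
    return await task

result = asyncio.run(main())
print(events)
```

The ordering depends on the cooperative, single-threaded execution model; it would not hold for a framework that starts work on another thread as soon as it is submitted.
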
> Ok, I did some (re)reading on the topic, and had some time to think
> about it, based on the arguments provided, and here are some additional
> thoughts and arguments:
>
> Point #1
>
> First of all, I probably should have mentioned that my expectations for
> a coroutine scheduler are set by the wonderful series on generators and
> coroutines by David Beazley. This specific slide gives the essence of
> it:
>
> http://www.slideshare.net/dabeaz/a-curious-course-on-coroutines-and-concurrency-5286140/137
> . So, it's possible to write a *coroutine* scheduler in such a way that
> coroutines do not (and cannot, if needed) access the main loop directly.
> They communicate with it using yield/yield from, which serve the same
> purpose as a syscall in an OS design. So, knowing that Python offers such
> a level of separation, it added to my cognitive dissonance to see that
> asyncio not only does not separate object access, it even tightly couples
> the behavior of a Task to a loop.
>
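
A rough sketch of the kind of scheduler Point #1 describes, under the assumption that a bare "yield" is the only way a coroutine talks to the loop. This is not David Beazley's code; Scheduler, new, run and countdown are hypothetical names chosen for this illustration. The coroutines never receive a reference to the scheduler, and nothing runs until run() is called explicitly.

```python
from collections import deque

def countdown(name, n, log):
    # A coroutine that knows nothing about the scheduler object.
    while n > 0:
        log.append((name, n))
        yield          # "trap" back into the scheduler, like a syscall
        n -= 1

class Scheduler:
    def __init__(self):
        self.ready = deque()   # coroutines waiting for their next turn

    def new(self, coro):
        # Registering a coroutine only queues it; it does not run it.
        self.ready.append(coro)

    def run(self):
        # Round-robin: resume each coroutine until its next yield.
        while self.ready:
            coro = self.ready.popleft()
            try:
                next(coro)
            except StopIteration:
                continue       # finished, do not requeue
            self.ready.append(coro)

log = []
sched = Scheduler()
sched.new(countdown("a", 2, log))
sched.new(countdown("b", 1, log))
sched.run()
print(log)
```
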
> Point #2
>
> The latest of David's series was presented just at the recent PyCon
> 2014: http://www.dabeaz.com/finalgenerator/ . From slide 43 on, he
> presents a step-by-step walkthrough of building a concurrent execution
> framework, which (un)surprisingly shapes up as having almost the same
> API and architecture as asyncio. So, it should be fair to say that those
> slides are a good tutorial on asyncio design for dummies. His
> framework is very similar to asyncio: it starts with
> callbacks, then switches to coroutines as a more adequate representation,
> they get wrapped in Tasks for bookkeeping, results are represented by
> Futures, then it's shown that Task and Future share many traits, so it
> makes sense to make one a subclass of the other, etc.
>
> They are very similar except for one implementation detail: David's
> framework doesn't use cooperative multitasking for execution, but
> rather a thread pool. You can easily imagine what that means: a started
> Task really does start immediately, so if it suddenly starts behind the
> user's back, there's no time to add callbacks to it later. That's why
> David's framework doesn't start Tasks behind the user's back, which is
> the natural solution (like, you don't need to know that it doesn't start
> them - it's just the default choice). During the initial stages of the
> design, Tasks are kickstarted using a .step() method; later, explicit
> scheduling functions are introduced: start_inline_future(),
> run_inline_future().
>
> So, let's step back and overview the situation.
> https://docs.python.org/3.4/library/asyncio-task.html#future explicitly
> says that asyncio.Future is "almost" compatible with
> concurrent.futures.Future. Why "almost"? Apparently because
> concurrent.futures.Future has some features depending on concurrent
> execution model and specifically underlying thread/process
> implementations, which don't map well to cooperative/event loop
> execution model. PEP-3156 explicitly mentions that it would be nice to
> unify both Futures in the future.
>
> Certainly, asyncio would learn from such experience and try to provide
> an API model not relying on particular underlying details which would
> hamper compatibility and reuse, yes? No, because what we are talking
> about is that asyncio (ab)uses the fact that the underlying event loop
> doesn't start execution immediately, so it forcefully schedules a Task
> and makes the user add important changes to it after it is in an active
> state, which is backwards from a general point of view.
>
> Point #3
>
> Yet another perspective. Ok, after all there's nothing wrong with being
> able to schedule a coroutine using a global function - after all,
> Point #1 above praises complete separation between coroutines and the loop
> using a yield. As yield cannot be used outside a function, it's not
> such a bad idea to provide a global function to schedule a coroutine. One
> problem here is that "Task" or "async" are not very suggestive names for
> a function which performs scheduling. Actually, I have a hypothesis why
> it's not too plausible to imagine such a purpose for them at all. It's
> grounded in a dichotomy in the asyncio API:
>
> 1. Some operations are expressed as methods of the event loop object, e.g.
>
> loop.run_forever()
> loop.call_soon()
>
> 2. While others are expressed as global functions taking an optional loop
> parameter:
>
> asyncio.wait(..., loop=None, ...)
> asyncio.sleep(..., loop=None)
>
>
> This API asymmetry is not particularly obvious at first look. The docs
> start with a description of loop methods, which kind of sets the
> expectation that all important functions should be available as such, and
> the rest are just objects/factory functions, and not normal functions
> with side effects, to which category both
>
> asyncio.Task(..., loop=None)
> asyncio.async(..., loop=None)
>
> should belong (regardless of the actual implementation details, like
> the fact that "Task" is implemented as a class).
>
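
A small sketch of the two API styles contrasted above, side by side. It assumes a modern Python, where the optional loop= parameter of the module-level functions shown in this message no longer exists (it was removed in Python 3.10), so it is not passed here. The names calls and main are hypothetical.

```python
import asyncio

calls = []  # records which style ran, in order

async def main():
    loop = asyncio.get_running_loop()

    # Style 1: an operation expressed as a method of the event loop object.
    loop.call_soon(calls.append, "loop method")

    # Style 2: an operation expressed as a module-level function.
    # sleep(0) yields to the loop once, which lets the callback above run.
    await asyncio.sleep(0)
    calls.append("module function")

asyncio.run(main())
print(calls)
```
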
> How can this issue be solved (besides being clearly described in the docs)?
> Well, it would help if the module offered just one particular variety of
> API. For example, my problem is that I expected all operations to be
> available as methods of the loop.
>
> But dropping that and having stuff like:
>
> asyncio.run_forever(loop=None)
>
> would work just as well, and would probably allow for an even more
> efficient implementation (no need for a dummy loop object when we have
> an "embedded loop", for example).
>
> Finally, having both models, but offering more complete coverage of
> operations in both (with easy-to-understand names) would be good too.
>
>
>
> --
> Best regards,
>  Paul                          mailto:[email protected]
>



-- 
--Guido van Rossum (python.org/~guido)
