Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-14 Thread Mike Bayer

On Jul 14, 2014, at 9:46 AM, Jay Pipes  wrote:

> 
> The point of eventlet, I thought, was to hide the low-level stuff so that 
> developers could focus on higher-level (and more productive) abstractions. 
> Introducing asyncio constructs into the higher level code like Nova and 
> Neutron seems to be a step in the wrong direction, IMHO. I'd rather see a 
> renewed focus on getting Taskflow incorporated into Nova.

There’s a contingent that disagrees that “hiding low-level stuff”, in the case
of context switching at the point of IO (and in the case of other things too),
is a good thing. It’s a more fundamental argument that drives the push
towards explicit async. In some of my, ahem, discussions on Twitter about
this, I’ve tried to compare such discomfort with that of Python’s GC firing
off at implicit moments: why aren’t they uncomfortable with that? But it’s
Twitter, so by that time the discussion is all over the place.



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-14 Thread Jay Pipes

On 07/09/2014 11:39 AM, Clint Byrum wrote:

Excerpts from Yuriy Taraday's message of 2014-07-09 03:36:00 -0700:

On Tue, Jul 8, 2014 at 11:31 PM, Joshua Harlow 
wrote:


I think Clint's response was likely better than what I can write here, but
I'll add on a few things,



How do you write such code using taskflow?

  @asyncio.coroutine
  def foo(self):
      result = yield from some_async_op(...)
      return do_stuff(result)


The idea (at a very high level) is that users don't write this;

What users do write is a workflow, maybe the following (pseudocode):

# Define the pieces of your workflow.

TaskA():
   def execute():
   # Do whatever some_async_op did here.

   def revert():
   # If execute had any side-effects undo them here.

TaskFoo():
...

# Compose them together

flow = linear_flow.Flow("my-stuff").add(TaskA("my-task-a"),
TaskFoo("my-foo"))



I wouldn't consider this composition very user-friendly.


I find it extremely user friendly when I consider that it gives you
clear lines of delineation between "the way it should work" and "what
to do when it breaks."


Agreed.

snip...


Sorry but the code above is nothing like the code that Josh shared. When
create_network(project) fails, how do we revert its side effects? If we
want to resume this flow after reboot, how does that work?


Exactly.


I understand that there is a desire to write everything in beautiful
python yields, try's, finally's, and excepts. But the reality is that
python's stack is lost the moment the process segfaults, power goes out
on that PDU, or the admin rolls out a new kernel.


Yup.


If we embed taskflow deep in the code, we get those things, and we can
treat tasks as coroutines and let taskflow's event loop be asyncio just
the same. If we embed asyncio deep into the code, we don't get any of
the high level functions and we get just as much code churn.


++


There's no limit to coroutine usage. The only problem is the library that
would bind everything together.
In my example run_task will have to be really smart, keeping track of all
started tasks, results of all finished ones, skipping all tasks that have
already been done (and substituting already generated results).
But all of this is doable. And I find this way of declaring workflows way
more understandable than whatever it would look like with Flow.add's


The way the flow is declared is important, as it leads to more isolated
code. The single place where the flow is declared in Josh's example means
that the flow can be imported, the state deserialized and inspected,
and resumed by any piece of code: an API call, a daemon start up, an
admin command, etc.


Right, this is the main point. We are focusing so much on eventlet vs. 
asyncio, and in doing so we are missing the big picture in how we think 
about the flows of related tasks in our code. Taskflow makes that big 
picture thinking possible, and is what I believe our focus should be on. 
If someone hates seeing eventlet's magic masking of async I/O and wants 
to see Py3K-clean yields, then I think that work belongs in the Taskflow 
engines modules, and not inside Nova directly.


Besides Py3K support and predictable yield points, I haven't seen any
other valid arguments for spending a bunch of time moving from eventlet
to asyncio, and certainly no arguments that address the real
architectural problems inside Nova: that we continue to think at too low
a level, and instead of writing code that naturally groups related sets
of tasks into workflows, we think about how to properly yield
from one coroutine to another. The point of eventlet, I thought, was to
hide the low-level stuff so that developers could focus on higher-level
(and more productive) abstractions. Introducing asyncio constructs into
the higher level code like Nova and Neutron seems to be a step in the
wrong direction, IMHO. I'd rather see a renewed focus on getting
Taskflow incorporated into Nova.


Best,

-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-12 Thread Yuriy Taraday
On Fri, Jul 11, 2014 at 10:34 PM, Joshua Harlow 
wrote:

> So, how about we continue this in #openstack-state-management (or
> #openstack-oslo).
>
> I think we've all made our points and the different viewpoints are visible
> (which was the main intention).
>
> Overall, I'd like to see asyncio more directly connected into taskflow so
> we can have the best of both worlds.
>
> We just have to be careful about letting people blow their feet off vs.
> being too safe; but that discussion, I think, we can have outside this thread.
>

That's what I was about to reply to Clint: "Let the user shoot one's feet;
one can always be creative in doing that anyway".

> Sound good?
>

Sure. TBH I didn't think this thread was the right place for this discussion,
but "coroutines can't do that" kind of set me off :)

> -Josh
>
> On Jul 11, 2014, at 9:04 AM, Clint Byrum  wrote:
>
> > Excerpts from Yuriy Taraday's message of 2014-07-11 03:08:14 -0700:
> >> On Thu, Jul 10, 2014 at 11:51 PM, Josh Harlow 
> wrote:
> >>> 2. Introspection, I hope this one is more obvious. When the coroutine
> >>> call-graph is the workflow there is no easy way to examine it before it
> >>> executes (and change parts of it, for example, before it executes). This
> >>> is a nice feature imho when it's declaratively and explicitly defined;
> >>> you get the ability to do this. This part is key to handling upgrades
> >>> that typically happen (for example, the 5th task in a workflow was
> >>> upgraded to a newer version; we need to stop the service, shut it off,
> >>> do the code upgrade, restart the service and change the 5th task from
> >>> v1 to v1.1).
> >>>
> >>
> >> I don't really understand why one would want to examine or change a
> >> workflow before running it. Shouldn't the workflow provide just enough
> >> info about which tasks should be run in what order?
> >> With coroutines, when you do your upgrade and rerun the workflow, it'll
> >> just skip all steps that have already been run and run your new version
> >> of the 5th task.
> >>
> >
> > I'm kind of with you on this one. Changing the workflow feels like self
> > modifying code.
> >
> >>> 3. Dataflow: tasks in taskflow can declare not just workflow
> >>> dependencies but also dataflow dependencies (this is how tasks transfer
> >>> things from one to another). I suppose the dataflow dependency would
> >>> map to coroutine variables & arguments (except the variables/arguments
> >>> would need to be persisted somewhere so that they can be passed back in
> >>> on failure of the service running that coroutine). How is that possible
> >>> without an abstraction over those variables/arguments (a coroutine
> >>> can't store these things in local variables since those will be lost)?
> >>> It would seem like this would need to recreate the persistence &
> >>> storage layer[5] that taskflow already uses for this purpose.
> >>>
> >>
> >> You don't need to persist local variables. You just need to persist
> >> results of all tasks (and you have to do it if you want to support
> >> workflow interruption and restart). All dataflow dependencies are
> >> declared in the coroutine in plain Python which is what developers are
> >> used to.
> >>
> >
> > That is actually the problem that using declarative systems avoids.
> >
> >
> >@asyncio.coroutine
> >def add_ports(ctx, server_def):
> >    port, volume = yield from asyncio.gather(
> >        ctx.run_task(create_port(server_def)),
> >        ctx.run_task(create_volume(server_def)))
> >    if server_def.wants_drbd:
> >        setup_drbd(volume, server_def)
> >
> >    yield from ctx.run_task(boot_server(volume_az, server_def))
> >
> >
> > Now we have a side effect which is not in a task. If booting fails, and
> > we want to revert, we won't revert the drbd. This is easy to miss
> > because we're just using plain old python, and heck it already even has
> > a test case.
> >
> > I see this type of thing a lot.. we're not arguing about capabilities,
> > but about psychological differences. There are pros and cons to both
> > approaches.
> >
> > ___
> > OpenStack-dev mailing list
> > OpenStack-dev@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 

Kind regards, Yuriy.
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-11 Thread Joshua Harlow
So, how about we continue this in #openstack-state-management (or
#openstack-oslo).

I think we've all made our points and the different viewpoints are visible
(which was the main intention).

Overall, I'd like to see asyncio more directly connected into taskflow so we 
can have the best of both worlds.

We just have to be careful about letting people blow their feet off vs. being
too safe; but that discussion, I think, we can have outside this thread.

Sound good?

-Josh

On Jul 11, 2014, at 9:04 AM, Clint Byrum  wrote:

> Excerpts from Yuriy Taraday's message of 2014-07-11 03:08:14 -0700:
>> On Thu, Jul 10, 2014 at 11:51 PM, Josh Harlow  wrote:
>>> 2. Introspection, I hope this one is more obvious. When the coroutine
>>> call-graph is the workflow there is no easy way to examine it before it
>>> executes (and change parts of it for example before it executes). This is a
>>> nice feature imho when it's declaratively and explicitly defined, you get
>>> the ability to do this. This part is key to handling upgrades that
>>> typically happen (for example, the 5th task in a workflow was upgraded
>>> to a newer version; we need to stop the service, shut it off, do the code
>>> upgrade, restart the service and change the 5th task from v1 to v1.1).
>>> 
>> 
>> I don't really understand why one would want to examine or change a
>> workflow before running it. Shouldn't the workflow provide just enough
>> info about which tasks should be run in what order?
>> With coroutines, when you do your upgrade and rerun the workflow, it'll
>> just skip all steps that have already been run and run your new version of
>> the 5th task.
>> 
> 
> I'm kind of with you on this one. Changing the workflow feels like self
> modifying code.
> 
>> 3. Dataflow: tasks in taskflow can declare not just workflow dependencies
>>> but also dataflow dependencies (this is how tasks transfer things from one
>>> to another). I suppose the dataflow dependency would map to coroutine
>>> variables & arguments (except the variables/arguments would need to be
>>> persisted somewhere so that they can be passed back in on failure of the
>>> service running that coroutine). How is that possible without an
>>> abstraction over those variables/arguments (a coroutine can't store these
>>> things in local variables since those will be lost)? It would seem like this
>>> would need to recreate the persistence & storage layer[5] that taskflow
>>> already uses for this purpose.
>>> 
>> 
>> You don't need to persist local variables. You just need to persist results
>> of all tasks (and you have to do it if you want to support workflow
>> interruption and restart). All dataflow dependencies are declared in the
>> coroutine in plain Python which is what developers are used to.
>> 
> 
> That is actually the problem that using declarative systems avoids.
> 
> 
>@asyncio.coroutine
>def add_ports(ctx, server_def):
>    port, volume = yield from asyncio.gather(
>        ctx.run_task(create_port(server_def)),
>        ctx.run_task(create_volume(server_def)))
>    if server_def.wants_drbd:
>        setup_drbd(volume, server_def)
>
>    yield from ctx.run_task(boot_server(volume_az, server_def))
> 
> 
> Now we have a side effect which is not in a task. If booting fails, and
> we want to revert, we won't revert the drbd. This is easy to miss
> because we're just using plain old python, and heck it already even has
> a test case.
> 
> I see this type of thing a lot.. we're not arguing about capabilities,
> but about psychological differences. There are pros and cons to both
> approaches.
> 
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-11 Thread Clint Byrum
Excerpts from Yuriy Taraday's message of 2014-07-11 03:08:14 -0700:
> On Thu, Jul 10, 2014 at 11:51 PM, Josh Harlow  wrote:
> > 2. Introspection, I hope this one is more obvious. When the coroutine
> > call-graph is the workflow there is no easy way to examine it before it
> > executes (and change parts of it for example before it executes). This is a
> > nice feature imho when it's declaratively and explicitly defined, you get
> > the ability to do this. This part is key to handling upgrades that
> > typically happen (for example, the 5th task in a workflow was upgraded
> > to a newer version; we need to stop the service, shut it off, do the code
> > upgrade, restart the service and change the 5th task from v1 to v1.1).
> >
> 
> I don't really understand why one would want to examine or change a
> workflow before running it. Shouldn't the workflow provide just enough
> info about which tasks should be run in what order?
> With coroutines, when you do your upgrade and rerun the workflow, it'll
> just skip all steps that have already been run and run your new version of
> the 5th task.
> 

I'm kind of with you on this one. Changing the workflow feels like self
modifying code.

> 3. Dataflow: tasks in taskflow can declare not just workflow dependencies
> > but also dataflow dependencies (this is how tasks transfer things from one
> > to another). I suppose the dataflow dependency would map to coroutine
> > variables & arguments (except the variables/arguments would need to be
> > persisted somewhere so that they can be passed back in on failure of the
> > service running that coroutine). How is that possible without an
> > abstraction over those variables/arguments (a coroutine can't store these
> > things in local variables since those will be lost)? It would seem like this
> > would need to recreate the persistence & storage layer[5] that taskflow
> > already uses for this purpose.
> >
> 
> You don't need to persist local variables. You just need to persist results
> of all tasks (and you have to do it if you want to support workflow
> interruption and restart). All dataflow dependencies are declared in the
> coroutine in plain Python which is what developers are used to.
> 

That is actually the problem that using declarative systems avoids.


@asyncio.coroutine
def add_ports(ctx, server_def):
    port, volume = yield from asyncio.gather(
        ctx.run_task(create_port(server_def)),
        ctx.run_task(create_volume(server_def)))
    if server_def.wants_drbd:
        setup_drbd(volume, server_def)

    yield from ctx.run_task(boot_server(volume_az, server_def))


Now we have a side effect which is not in a task. If booting fails, and
we want to revert, we won't revert the drbd. This is easy to miss
because we're just using plain old python, and heck it already even has
a test case.
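
To make that concrete, a sketch of what pulling the side effect into a task
could look like (setup_drbd/teardown_drbd are the same hypothetical helpers
as in the example above):

from taskflow import task

class SetupDrbd(task.Task):
    def execute(self, volume, server_def):
        # The side effect now lives in a task the engine knows about...
        setup_drbd(volume, server_def)

    def revert(self, volume, server_def, **kwargs):
        # ...so a failure in a later task (e.g. booting) triggers this
        # undo automatically.
        teardown_drbd(volume, server_def)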

I see this type of thing a lot.. we're not arguing about capabilities,
but about psychological differences. There are pros and cons to both
approaches.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-11 Thread Yuriy Taraday
On Thu, Jul 10, 2014 at 11:51 PM, Outlook  wrote:

> On Jul 10, 2014, at 3:48 AM, Yuriy Taraday  wrote:
>
> On Wed, Jul 9, 2014 at 7:39 PM, Clint Byrum  wrote:
>
>> Excerpts from Yuriy Taraday's message of 2014-07-09 03:36:00 -0700:
>> > On Tue, Jul 8, 2014 at 11:31 PM, Joshua Harlow 
>> > wrote:
>> >
>> > > I think Clint's response was likely better than what I can write here,
>> > > but I'll add on a few things,
>> > >
>> > >
>> > > >How do you write such code using taskflow?
>> > > >
>> > > >  @asyncio.coroutine
>> > > >  def foo(self):
>> > > >      result = yield from some_async_op(...)
>> > > >      return do_stuff(result)
>> > >
>> > > The idea (at a very high level) is that users don't write this;
>> > >
>> > > What users do write is a workflow, maybe the following (pseudocode):
>> > >
>> > > # Define the pieces of your workflow.
>> > >
>> > > TaskA():
>> > >   def execute():
>> > >   # Do whatever some_async_op did here.
>> > >
>> > >   def revert():
>> > >   # If execute had any side-effects undo them here.
>> > >
>> > > TaskFoo():
>> > >...
>> > >
>> > > # Compose them together
>> > >
>> > > flow = linear_flow.Flow("my-stuff").add(TaskA("my-task-a"),
>> > > TaskFoo("my-foo"))
>> > >
>> >
>> > I wouldn't consider this composition very user-friendly.
>> >
>>
>>
> So just to make this understandable, the above is a declarative structure
> of the work to be done. I'm pretty sure it's generally agreed[1] in the
> programming world that when declarative structures can be used they should
> be (imho openstack should also follow the same pattern more than it
> currently does). The above is a declaration of the work to be done and the
> ordering constraints that must be followed. It's just one of X ways to do
> this (feel free to contribute other variations of these 'patterns' @
> https://github.com/openstack/taskflow/tree/master/taskflow/patterns).
>
> [1] http://latentflip.com/imperative-vs-declarative/ (and many many
> others).
>

I totally agree that a declarative approach is better for workflow
declarations. I'm just saying that we can do it in Python with coroutines
instead. Note that the declarative approach can lead to the reinvention of an
entirely new language, and these "flow.add"s can be the first step down that
road.

>  I find it extremely user friendly when I consider that it gives you
>> clear lines of delineation between "the way it should work" and "what
>> to do when it breaks."
>>
>
> So does plain Python. But for plain Python you don't have to explicitly
> use graph terminology to describe the process.
>
>
>
> I'm not sure where in the above you saw graph terminology. All I see there
> is a declaration of a pattern that explicitly says run things 1 after the
> other (linearly).
>

As long as the workflow is linear there's no difference whether it's
declared with .add() or with yield from. I'm talking about more complex
workflows like the one I described in my example.


>> > > # Submit the workflow to an engine, let the engine do the work to
>> > > execute it (and transfer any state between tasks as needed).
>> > >
>> > > The idea here is that when things like this are declaratively
>> > > specified the only thing that matters is that the engine respects
>> > > that declaration; not whether it uses asyncio, eventlet, pigeons,
>> > > threads, remote workers[1]. It also adds some things that are not
>> > > (imho) possible with co-routines (in part since they are at such a
>> > > low level) like stopping the engine after 'my-task-a' runs and
>> > > shutting off the software, upgrading it, restarting it and then
>> > > picking back up at 'my-foo'.
>> > >
>> >
>> > It's absolutely possible with coroutines and might provide an even
>> > clearer view of what's going on. Like this:
>> >
>> > @asyncio.coroutine
>> > def my_workflow(ctx, ...):
>> >     project = yield from ctx.run_task(create_project())
>> >     # Hey, we don't want to be linear. How about parallel tasks?
>> >     volume, network = yield from asyncio.gather(
>> >         ctx.run_task(create_volume(project)),
>> >         ctx.run_task(create_network(project)),
>> >     )
>> >     # We can put anything here - why not branch a bit?
>> >     if create_one_vm:
>> >         yield from ctx.run_task(create_vm(project, network))
>> >     else:
>> >         # Or even loops - why not?
>> >         for i in range(network.num_ips()):
>> >             yield from ctx.run_task(create_vm(project, network))
>> >
>>
>
>> Sorry but the code above is nothing like the code that Josh shared. When
>> create_network(project) fails, how do we revert its side effects? If we
>> want to resume this flow after reboot, how does that work?
>>
>> I understand that there is a desire to write everything in beautiful
>> python yields, try's, finally's, and excepts. But the reality is that
>> python's stack is lost the moment the process segfaults, power goes out
>> on that PDU, or the admin rolls out a new kernel.
>>
>> We're not saying "asyncio vs. taskflow". I've seen that mistake twice
>> already in this thread.

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-11 Thread Victor Stinner
Hi,

On Monday, July 7, 2014 at 19:18:38, Mark McLoughlin wrote:
> I'd expect us to add e.g.
> 
>   @asyncio.coroutine
>   def call_async(self, ctxt, method, **kwargs):
>       ...
> 
> to RPCClient. Perhaps we'd need to add an AsyncRPCClient in a separate
> module and only add the method there - I don't have a good sense of it
> yet.

I don't want to make trollius a mandatory dependency of Oslo Messaging, at 
least not right now.

An option is to only declare the method if trollius is installed: "try: import
trollius except ImportError: trollius = None" and then "if trollius is not
None: @trollius.coroutine def call_async(): ...".
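
Spelled out, that option might look roughly like this (a sketch; the body of
call_async is illustrative only, not the actual oslo.messaging
implementation):

try:
    import trollius
except ImportError:
    trollius = None

class RPCClient(object):
    def call(self, ctxt, method, **kwargs):
        # Existing synchronous call (stubbed here).
        return None

    if trollius is not None:
        @trollius.coroutine
        def call_async(self, ctxt, method, **kwargs):
            # Illustrative body: run the blocking call in an executor so
            # the coroutine does not block the event loop.
            loop = trollius.get_event_loop()
            result = yield trollius.From(loop.run_in_executor(
                None, lambda: self.call(ctxt, method, **kwargs)))
            raise trollius.Return(result)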

Or maybe a different module (maybe using a subclass) is better.

Victor

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-11 Thread Flavio Percoco
On 07/10/2014 06:46 PM, Mark McLoughlin wrote:
> On Thu, 2014-07-03 at 16:27 +0100, Mark McLoughlin wrote:
>> Hey
>>
>> This is an attempt to summarize a really useful discussion that Victor,
>> Flavio and I have been having today. At the bottom are some background
>> links - basically what I have open in my browser right now thinking
>> through all of this.
>>
>> We're attempting to take baby-steps towards moving completely from
>> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
>> first victim.
> 
> I got a little behind on this thread, but maybe it'd be helpful to
> summarize some things from this good discussion:

Thanks for summing the thread up.


>- "Is moving to asyncio really a priority compared to other things?"
> 
>  I think Victor has made a good case on "what's wrong with 
>  eventlet?"[1] and, personally, I'm excited about the prospect of 
>  the Python community more generally converging on asyncio. 
>  Understanding what OpenStack would need in order to move to asyncio 
>  will help the asyncio effort more generally.
> 
>  Figuring through some of this stuff is a priority for Victor and
>  others, but no-one is saying it's an immediate priority for the 
>  whole project.

Agreed. Let's not underestimate the contributions OpenStack as a
community has made to Python, and the fact that it can and should keep
making them. Experimenting with asyncio will bring to light things that
can be contributed back to the community, and it'll also help create new
scenarios and use-cases around asyncio.

> 
>- Moving from an implicitly async to an explicitly async programming
>  model has enormous implications and we need to figure out what it means
>  for libraries like SQLAlchemy and abstraction layers like ORMs. 
> 
>  I think that's well understood - the topic of this thread is 
>  merely how to make a small addition to oslo.messaging (the ability 
>  to dispatch asyncio co-routines on eventlet) so that we can move on
>  to figuring out the next piece of puzzle.

Let's take one step at a time. oslo.messaging is a core piece of OpenStack
but it's also a library that can be used outside OpenStack. Having
support for explicit async in oslo.messaging is a good thing for the
library itself regardless of whether it'll be adopted throughout
OpenStack in the long run.


>- Taskflow vs asyncio - good discussion, plenty to figure out. 
>  They're mostly orthogonal concerns IMHO but *maybe* we decide
>  adopting both makes sense and that both should be adopted together.
>  I'd like to see more concrete examples showing taskflow vs asyncio
>  vs taskflow/asyncio to understand better.
>

+1

> So, tl;dr is that lots of work remains to even begin to understand how
> exactly asyncio could be adopted and whether that makes sense. The
> thread raises some interesting viewpoints, but I don't think it moves
> our understanding along all that much. The initial mail was simply about
> unlocking one very small piece of the puzzle.
> 

Agreed. I'm happy to help moving this effort forward and gather some
real-life results onto which we can base future plans and decisions.

Flavio.

-- 
@flaper87
Flavio Percoco

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-10 Thread Joshua Harlow
This is not supposed to be from 'Outlook', haha.

Using my adjusted mail account so that it doesn't go to spam on the receiving
end due to DMARC/DKIM issues...

My fault ;)

-Josh

On Jul 10, 2014, at 12:51 PM, Outlook  wrote:

> On Jul 10, 2014, at 3:48 AM, Yuriy Taraday  wrote:
> 
>> On Wed, Jul 9, 2014 at 7:39 PM, Clint Byrum  wrote:
>> Excerpts from Yuriy Taraday's message of 2014-07-09 03:36:00 -0700:
>> > On Tue, Jul 8, 2014 at 11:31 PM, Joshua Harlow 
>> > wrote:
>> >
>> > > I think Clint's response was likely better than what I can write here, but
>> > > I'll add on a few things,
>> > >
>> > >
>> > > >How do you write such code using taskflow?
>> > > >
>> > > >  @asyncio.coroutine
>> > > >  def foo(self):
>> > > >      result = yield from some_async_op(...)
>> > > >      return do_stuff(result)
>> > >
>> > > The idea (at a very high level) is that users don't write this;
>> > >
>> > > What users do write is a workflow, maybe the following (pseudocode):
>> > >
>> > > # Define the pieces of your workflow.
>> > >
>> > > TaskA():
>> > >   def execute():
>> > >   # Do whatever some_async_op did here.
>> > >
>> > >   def revert():
>> > >   # If execute had any side-effects undo them here.
>> > >
>> > > TaskFoo():
>> > >...
>> > >
>> > > # Compose them together
>> > >
>> > > flow = linear_flow.Flow("my-stuff").add(TaskA("my-task-a"),
>> > > TaskFoo("my-foo"))
>> > >
>> >
>> > I wouldn't consider this composition very user-friendly.
>> >
>> 
> 
> So just to make this understandable, the above is a declarative structure of 
> the work to be done. I'm pretty sure it's generally agreed[1] in the 
> programming world that when declarative structures can be used they should be 
> (imho openstack should also follow the same pattern more than it currently 
> does). The above is a declaration of the work to be done and the ordering 
> constraints that must be followed. It's just one of X ways to do this (feel 
> free to contribute other variations of these 'patterns' @ 
> https://github.com/openstack/taskflow/tree/master/taskflow/patterns).
> 
> [1] http://latentflip.com/imperative-vs-declarative/ (and many many others).
> 
>> I find it extremely user friendly when I consider that it gives you
>> clear lines of delineation between "the way it should work" and "what
>> to do when it breaks."
>> 
>> So does plain Python. But for plain Python you don't have to explicitly use 
>> graph terminology to describe the process.
>>  
> 
> I'm not sure where in the above you saw graph terminology. All I see there is 
> a declaration of a pattern that explicitly says run things 1 after the other 
> (linearly).
> 
>> > > # Submit the workflow to an engine, let the engine do the work to execute
>> > > it (and transfer any state between tasks as needed).
>> > >
>> > > The idea here is that when things like this are declaratively specified
>> > > the only thing that matters is that the engine respects that declaration;
>> > > not whether it uses asyncio, eventlet, pigeons, threads, remote
>> > > workers[1]. It also adds some things that are not (imho) possible with
>> > > co-routines (in part since they are at such a low level) like stopping 
>> > > the
>> > > engine after 'my-task-a' runs and shutting off the software, upgrading 
>> > > it,
>> > > restarting it and then picking back up at 'my-foo'.
>> > >
>> >
>> > It's absolutely possible with coroutines and might provide an even
>> > clearer view of what's going on. Like this:
>> >
>> > @asyncio.coroutine
>> > def my_workflow(ctx, ...):
>> >     project = yield from ctx.run_task(create_project())
>> >     # Hey, we don't want to be linear. How about parallel tasks?
>> >     volume, network = yield from asyncio.gather(
>> >         ctx.run_task(create_volume(project)),
>> >         ctx.run_task(create_network(project)),
>> >     )
>> >     # We can put anything here - why not branch a bit?
>> >     if create_one_vm:
>> >         yield from ctx.run_task(create_vm(project, network))
>> >     else:
>> >         # Or even loops - why not?
>> >         for i in range(network.num_ips()):
>> >             yield from ctx.run_task(create_vm(project, network))
>> >
>> 
>> Sorry but the code above is nothing like the code that Josh shared. When
>> create_network(project) fails, how do we revert its side effects? If we
>> want to resume this flow after reboot, how does that work?
>> 
>> I understand that there is a desire to write everything in beautiful
>> python yields, try's, finally's, and excepts. But the reality is that
>> python's stack is lost the moment the process segfaults, power goes out
>> on that PDU, or the admin rolls out a new kernel.
>> 
>> We're not saying "asyncio vs. taskflow". I've seen that mistake twice
>> already in this thread. Josh and I are suggesting that if there is a
>> movement to think about coroutines, there should also be some time spent
>> thinking at a high level: "how do we resume tasks, revert side effects,
>> and control flow?"
>> 
>> If we embed taskflow deep in the code, we get those things, and we can
>> treat tasks as coroutines and let taskflow's event loop be asyncio just
>> the same.

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-10 Thread Outlook
On Jul 10, 2014, at 3:48 AM, Yuriy Taraday  wrote:

> On Wed, Jul 9, 2014 at 7:39 PM, Clint Byrum  wrote:
> Excerpts from Yuriy Taraday's message of 2014-07-09 03:36:00 -0700:
> > On Tue, Jul 8, 2014 at 11:31 PM, Joshua Harlow 
> > wrote:
> >
> > > I think Clint's response was likely better than what I can write here, but
> > > I'll add on a few things,
> > >
> > >
> > > >How do you write such code using taskflow?
> > > >
> > > >  @asyncio.coroutine
> > > >  def foo(self):
> > > >      result = yield from some_async_op(...)
> > > >      return do_stuff(result)
> > >
> > > The idea (at a very high level) is that users don't write this;
> > >
> > > What users do write is a workflow, maybe the following (pseudocode):
> > >
> > > # Define the pieces of your workflow.
> > >
> > > TaskA():
> > >   def execute():
> > >   # Do whatever some_async_op did here.
> > >
> > >   def revert():
> > >   # If execute had any side-effects undo them here.
> > >
> > > TaskFoo():
> > >...
> > >
> > > # Compose them together
> > >
> > > flow = linear_flow.Flow("my-stuff").add(TaskA("my-task-a"),
> > > TaskFoo("my-foo"))
> > >
> >
> > I wouldn't consider this composition very user-friendly.
> >
> 

So just to make this understandable, the above is a declarative structure of 
the work to be done. I'm pretty sure it's generally agreed[1] in the programming 
world that when declarative structures can be used they should be (imho 
openstack should also follow the same pattern more than it currently does). The 
above is a declaration of the work to be done and the ordering constraints that 
must be followed. It's just one of X ways to do this (feel free to contribute 
other variations of these 'patterns' @ 
https://github.com/openstack/taskflow/tree/master/taskflow/patterns).

[1] http://latentflip.com/imperative-vs-declarative/ (and many many others).

> I find it extremely user friendly when I consider that it gives you
> clear lines of delineation between "the way it should work" and "what
> to do when it breaks."
> 
> So does plain Python. But for plain Python you don't have to explicitly use 
> graph terminology to describe the process.
>  

I'm not sure where in the above you saw graph terminology. All I see there is a 
declaration of a pattern that explicitly says run things 1 after the other 
(linearly).

> > > # Submit the workflow to an engine, let the engine do the work to execute
> > > it (and transfer any state between tasks as needed).
> > >
> > > The idea here is that when things like this are declaratively specified
> > > the only thing that matters is that the engine respects that declaration;
> > > not whether it uses asyncio, eventlet, pigeons, threads, remote
> > > workers[1]. It also adds some things that are not (imho) possible with
> > > co-routines (in part since they are at such a low level) like stopping the
> > > engine after 'my-task-a' runs and shutting off the software, upgrading it,
> > > restarting it and then picking back up at 'my-foo'.
> > >
> >
> > It's absolutely possible with coroutines and might provide an even
> > clearer view of what's going on. Like this:
> >
> > @asyncio.coroutine
> > def my_workflow(ctx, ...):
> >     project = yield from ctx.run_task(create_project())
> >     # Hey, we don't want to be linear. How about parallel tasks?
> >     volume, network = yield from asyncio.gather(
> >         ctx.run_task(create_volume(project)),
> >         ctx.run_task(create_network(project)),
> >     )
> >     # We can put anything here - why not branch a bit?
> >     if create_one_vm:
> >         yield from ctx.run_task(create_vm(project, network))
> >     else:
> >         # Or even loops - why not?
> >         for i in range(network.num_ips()):
> >             yield from ctx.run_task(create_vm(project, network))
> >
> 
> Sorry but the code above is nothing like the code that Josh shared. When
> create_network(project) fails, how do we revert its side effects? If we
> want to resume this flow after reboot, how does that work?
> 
> I understand that there is a desire to write everything in beautiful
> python yields, try's, finally's, and excepts. But the reality is that
> python's stack is lost the moment the process segfaults, power goes out
> on that PDU, or the admin rolls out a new kernel.
> 
> We're not saying "asyncio vs. taskflow". I've seen that mistake twice
> already in this thread. Josh and I are suggesting that if there is a
> movement to think about coroutines, there should also be some time spent
> thinking at a high level: "how do we resume tasks, revert side effects,
> and control flow?"
> 
> If we embed taskflow deep in the code, we get those things, and we can
> treat tasks as coroutines and let taskflow's event loop be asyncio just
> the same. If we embed asyncio deep into the code, we don't get any of
> the high level functions and we get just as much code churn.

+1. The declaration of what to do is not connected to how to do it. IMHO
most projects (maybe outside

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-10 Thread Mark McLoughlin
On Mon, 2014-07-07 at 12:48 +0200, Nikola Đipanov wrote:

> When I read all of this stuff and got my head around it (took some time
> :) ), a glaring drawback of such an approach, as I mentioned on the
> spec proposing it [1], is that we would not really be doing asyncio; we
> would just be pretending we are by using a subset of its APIs, and
> having all of the really important stuff for the overall design of the
> code (code that needs to do IO in the callbacks, for example) and
> ultimately performance, completely unavailable to us when porting.
> 
> So in Mark's example above:
> 
>   @asyncio.coroutine
>   def foo(self):
>       result = yield from some_async_op(...)
>       return do_stuff(result)
> 
> A developer would not need to do anything that asyncio requires like
> make sure that some_async_op() registers a callback with the eventloop
> (using for example event_loop.add_reader/writer methods) you could just
> simply make it use a 'greened' call and things would continue working
> happily.

Yes, Victor and I noticed this problem and wondered whether there was a
way to e.g. turn off the monkey-patching at runtime in a single
greenthread, or even just make any attempt to context switch raise an
exception.

i.e. a way to run foo() coroutine above in a greenthread such that
context switching is disallowed, or logged, or whatever while the
function is running. The only way context switching would be allowed to
happen would be if the coroutine yielded.
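
One possible building block for such a guard is greenlet's trace hook, which
sees every switch; a context manager could then log (or raise on) any
implicit switch while a coroutine step runs. A rough sketch
(no_implicit_switch is a hypothetical helper, not an eventlet API):

import contextlib
import greenlet

_guarded = set()

def _trace(event, args):
    # greenlet calls this on every 'switch'/'throw'; args is (origin, target).
    if event in ('switch', 'throw') and args[0] in _guarded:
        print('warning: implicit context switch out of a guarded greenthread')

greenlet.settrace(_trace)

@contextlib.contextmanager
def no_implicit_switch():
    g = greenlet.getcurrent()
    _guarded.add(g)
    try:
        yield  # run one coroutine step here with switches being watched
    finally:
        _guarded.discard(g)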

>  I have a feeling this will in turn have a lot of people writing
> code that they don't understand, and as library writers we would not be
> doing an excellent job at that point.
> 
> Now porting an OpenStack project to another IO library with completely
> different design is a huge job and there is unlikely a single 'right'
> way to do it, so treat this as a discussion starter, that will hopefully
> give us a better understanding of the problem we are trying to tackle.
> 
> So I hacked together a small POC of a different approach. In short -
> we actually use a real asyncio selector eventloop in a separate thread,
> and dispatch stuff to it when we figure out that our callback is in fact
> a coroutine. More will be clear from the code, so:
> 
> (Warning - hacky code ahead): [2]
> 
> I will probably be updating it - but if you just clone the repo, all the
> history is there. I wrote it without the oslo.messaging abstractions
> like listener and dispatcher, but it is relatively easy to see which
> bits of code would go in those.
> 
> Several things are worth noting as you read the above. The first is that
> we do not monkeypatch until we have fired off the asyncio thread (Victor
> correctly noticed this would be a problem in a comment on [1]). This may
> seem hacky (and it is) but if we decide to go further down this road, we
> would probably not be 'greening the world' but rather importing patched
> non-ported modules when we need to dispatch to them. This may sound like
> a big deal, and it is, but it is critical to actually running ported
> code in a real asyncio eventloop. I have not yet tested this further, but
> from briefly reading the eventlet code, it seems like it should work.
> 
> Another interesting problem is (as I have briefly mentioned in [1]) -
> what happens when we need to synchronize between eventlet-run and
> asyncio-run callbacks while we are in the process of porting. I don't
> have a good answer to that yet, but it is worth noting that the proposed
> approach doesn't either, and this is a thing we should have some idea
> about before going in with a knife.
> 
> Now for some marketing :) - I can see several advantages of such an
> approach, the obvious one being as stated, that we are in fact doing
> asyncio, so we are all in. Also as you can see [2] the implementation is
> far from magical - it's (surprisingly?) simple, and requires no other
> additional dependencies apart from trollius itself (granted greenio is
> not too complex either). I am sure that we would hit some other problems
> that were not clear from this basic POC (it was done in ~3 hours on a
> bus), but it seems to me that those problems will likely need to be
> solved anyhow if we are to port Ceilometer (or any other project) to
> asyncio, we will just hit them sooner this way.
> 
> It was a fun approach to ponder anyway - so I am looking forward to
> comments and thoughts.
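
The dispatch idea described above has roughly this shape (a simplified sketch
under plain asyncio, not the actual POC code):

import asyncio
import threading

loop = asyncio.new_event_loop()

def _run_loop():
    asyncio.set_event_loop(loop)
    loop.run_forever()

# The real asyncio eventloop lives in its own native thread.
threading.Thread(target=_run_loop, daemon=True).start()

def dispatch(callback, *args):
    result = callback(*args)
    if asyncio.iscoroutine(result):
        # A ported (coroutine) callback is scheduled onto the asyncio
        # thread; ensure_future was spelled asyncio.async() in 2014.
        loop.call_soon_threadsafe(asyncio.ensure_future, result)
    return result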

It's an interesting idea and I'd certainly welcome a more detailed
analysis of what the approach would mean for a service like Ceilometer.

My instinct is that adding an additional native thread where there is
only one native thread now will lead to tricky concurrency issues and a
more significant change of behavior than with the greenio approach. The
reason I like the greenio idea is that it allows us to make the
programming model changes without very significantly changing what
happens at runtime - the behavior, order of execution, concurrency
concerns, etc. shouldn't be all that different.

Mark.


___

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-10 Thread Mark McLoughlin
On Thu, 2014-07-03 at 16:27 +0100, Mark McLoughlin wrote:
> Hey
> 
> This is an attempt to summarize a really useful discussion that Victor,
> Flavio and I have been having today. At the bottom are some background
> links - basically what I have open in my browser right now thinking
> through all of this.
> 
> We're attempting to take baby-steps towards moving completely from
> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
> first victim.

I got a little behind on this thread, but maybe it'd be helpful to
summarize some things from this good discussion:

   - "Where/when was this decided?!?"

 Victor is working on prototyping how an OpenStack service would
 move to using asyncio. Whether a move to asyncio across the board
 makes sense - and what exactly it would look like - hasn't been
 *decided*. The idea is merely being explored at this point.

   - "Is moving to asyncio really a priority compared to other things?"

 I think Victor has made a good case on "what's wrong with 
 eventlet?"[1] and, personally, I'm excited about the prospect of 
 the Python community more generally converging on asyncio. 
 Understanding what OpenStack would need in order to move to asyncio 
 will help the asyncio effort more generally.

 Figuring through some of this stuff is a priority for Victor and
 others, but no-one is saying it's an immediate priority for the 
 whole project.

   - Moving from an implicitly async to an explicitly async programming
 model has enormous implications and we need to figure out what it means
 for libraries like SQLAlchemy and abstraction layers like ORMs. 

 I think that's well understood - the topic of this thread is 
 merely how to make a small addition to oslo.messaging (the ability 
 to dispatch asyncio co-routines on eventlet) so that we can move on
 to figuring out the next piece of puzzle.

   - Some people are clearly skeptical about whether asyncio is the 
 right thing for Python generally, whether it's the right thing for 
 OpenStack, whatever. Personally, I'm optimistic but I don't find 
 the conversation all that interesting right now - I want to see 
 how the prototype efforts work out before making a call about 
 whether it's feasible and useful.

   - Taskflow vs asyncio - good discussion, plenty to figure out. 
 They're mostly orthogonal concerns IMHO but *maybe* we decide
 adopting both makes sense and that both should be adopted together.
 I'd like to see more concrete examples showing taskflow vs asyncio
 vs taskflow/asyncio to understand better.

So, tl;dr is that lots of work remains to even begin to understand how
exactly asyncio could be adopted and whether that makes sense. The
thread raises some interesting viewpoints, but I don't think it moves
our understanding along all that much. The initial mail was simply about
unlocking one very small piece of the puzzle.

Mark.

[1] - http://techs.enovance.com/6562/asyncio-openstack-python3


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-10 Thread Clint Byrum
Excerpts from Victor Stinner's message of 2014-07-10 05:57:38 -0700:
> On Thursday, July 10, 2014 at 14:48:04, Yuriy Taraday wrote:
> > I'm not suggesting that taskflow is useless and asyncio is better (apple vs
> > oranges). I'm saying that using coroutines (asyncio) can improve ways we
> > can use taskflow and provide clearer method of developing these flows.
> > This was mostly response to the "this is impossible with coroutines". I say
> > it is possible and it can even be better.
> 
> It would be nice to modify taskflow to support trollius coroutines.
> Coroutines support asynchronous operations and have a better syntax than
> callbacks.
> 

You mean like this:

https://review.openstack.org/#/c/90881/1/taskflow/engines/action_engine/executor.py

Abandoned, but I think Josh is looking at it. :)

> For Mark's spec, adding a new greenio executor to Oslo Messaging: I don't
> see the direct link to taskflow. taskflow can use Oslo Messaging to call
> RPC, but I don't see how to use taskflow internally to read a socket
> (driver), wait for the completion of the callback and then send back the
> result to the socket (driver).
> 

So oslo and the other low level bits are going to need to be modified
to support coroutines. That is definitely something that will make them
more generally useful anyway. I don't think Josh or I meant to get in
the way of that.

However, having this available is a step toward removing eventlet and
doing the painful work to switch to asyncio. Josh's original email was
in essence a reminder that we should consider a layer on top of asyncio
and eventlet alike, so that the large scale code changes only happen
once.

> I see trollius as a low-level tool to handle simple asynchronous operations,
> whereas taskflow is more high-level, for chaining more complex operations
> correctly.
> 

_yes_

> trollius and taskflow must not be exclusive options; they should cooperate,
> as we plan to support trollius coroutines in Oslo Messaging.
> 

In fact they are emphatically not exclusive. However, considering the
order of adoption should produce a little less chaos for the project.
> 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-10 Thread Victor Stinner
On Thursday, July 10, 2014 at 14:48:04, Yuriy Taraday wrote:
> I'm not suggesting that taskflow is useless and asyncio is better (apple vs
> oranges). I'm saying that using coroutines (asyncio) can improve ways we
> can use taskflow and provide clearer method of developing these flows.
> This was mostly response to the "this is impossible with coroutines". I say
> it is possible and it can even be better.

It would be nice to modify taskflow to support trollius coroutines. Coroutines
support asynchronous operations and have a better syntax than callbacks.
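
As a quick illustration of the syntax point (conn and its methods here are
hypothetical):

# Callback style: the continuation is pushed into nested functions.
def fetch_then_store_cb(conn, key, on_done):
    def _fetched(value):
        conn.store(key, value, callback=on_done)
    conn.fetch(key, callback=_fetched)

# Trollius coroutine style: the same chain reads top to bottom.
import trollius
from trollius import From, Return

@trollius.coroutine
def fetch_then_store(conn, key):
    value = yield From(conn.fetch_async(key))
    yield From(conn.store_async(key, value))
    raise Return(value)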

For Mark's spec, adding a new greenio executor to Oslo Messaging: I don't see the 
direct link to taskflow. taskflow can use Oslo Messaging to call RPC, but I 
don't see how to use taskflow internally to read a socket (driver), wait for 
the completion of the callback and then send back the result to the socket 
(driver).

I see trollius as a low-level tool to handle simple asynchronous operations,
whereas taskflow is more high-level, for chaining more complex operations
correctly.

trollius and taskflow must not be exclusive options; they should cooperate, as
we plan to support trollius coroutines in Oslo Messaging.

Victor

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-10 Thread Yuriy Taraday
On Wed, Jul 9, 2014 at 7:39 PM, Clint Byrum  wrote:

> Excerpts from Yuriy Taraday's message of 2014-07-09 03:36:00 -0700:
> > On Tue, Jul 8, 2014 at 11:31 PM, Joshua Harlow 
> > wrote:
> >
> > > I think Clint's response was likely better than what I can write here,
> > > but I'll add on a few things,
> > >
> > >
> > > >How do you write such code using taskflow?
> > > >
> > > >  @asyncio.coroutine
> > > >  def foo(self):
> > > >      result = yield from some_async_op(...)
> > > >      return do_stuff(result)
> > >
> > > The idea (at a very high level) is that users don't write this;
> > >
> > > What users do write is a workflow, maybe the following (pseudocode):
> > >
> > > # Define the pieces of your workflow.
> > >
> > > TaskA():
> > >   def execute():
> > >   # Do whatever some_async_op did here.
> > >
> > >   def revert():
> > >   # If execute had any side-effects undo them here.
> > >
> > > TaskFoo():
> > >...
> > >
> > > # Compose them together
> > >
> > > flow = linear_flow.Flow("my-stuff").add(TaskA("my-task-a"),
> > > TaskFoo("my-foo"))
> > >
> >
> > I wouldn't consider this composition very user-friendly.
> >
>
> I find it extremely user friendly when I consider that it gives you
> clear lines of delineation between "the way it should work" and "what
> to do when it breaks."
>

So does plain Python. But for plain Python you don't have to explicitly use
graph terminology to describe the process.


> > > # Submit the workflow to an engine, let the engine do the work to
> > > execute it (and transfer any state between tasks as needed).
> > >
> > > The idea here is that when things like this are declaratively specified
> > > the only thing that matters is that the engine respects that
> > > declaration; not whether it uses asyncio, eventlet, pigeons, threads,
> > > remote workers[1]. It also adds some things that are not (imho)
> > > possible with co-routines (in part since they are at such a low level)
> > > like stopping the engine after 'my-task-a' runs and shutting off the
> > > software, upgrading it, restarting it and then picking back up at
> > > 'my-foo'.
> > >
> >
> > It's absolutely possible with coroutines and might provide an even
> > clearer view of what's going on. Like this:
> >
> > @asyncio.coroutine
> > def my_workflow(ctx, ...):
> >     project = yield from ctx.run_task(create_project())
> >     # Hey, we don't want to be linear. How about parallel tasks?
> >     volume, network = yield from asyncio.gather(
> >         ctx.run_task(create_volume(project)),
> >         ctx.run_task(create_network(project)),
> >     )
> >     # We can put anything here - why not branch a bit?
> >     if create_one_vm:
> >         yield from ctx.run_task(create_vm(project, network))
> >     else:
> >         # Or even loops - why not?
> >         for i in range(network.num_ips()):
> >             yield from ctx.run_task(create_vm(project, network))
> >
>
> Sorry but the code above is nothing like the code that Josh shared. When
> create_network(project) fails, how do we revert its side effects? If we
> want to resume this flow after reboot, how does that work?
>
> I understand that there is a desire to write everything in beautiful
> python yields, try's, finally's, and excepts. But the reality is that
> python's stack is lost the moment the process segfaults, power goes out
> on that PDU, or the admin rolls out a new kernel.
>
> We're not saying "asyncio vs. taskflow". I've seen that mistake twice
> already in this thread. Josh and I are suggesting that if there is a
> movement to think about coroutines, there should also be some time spent
> thinking at a high level: "how do we resume tasks, revert side effects,
> and control flow?"
>
> If we embed taskflow deep in the code, we get those things, and we can
> treat tasks as coroutines and let taskflow's event loop be asyncio just
> the same. If we embed asyncio deep into the code, we don't get any of
> the high level functions and we get just as much code churn.
>
> > There's no limit to coroutine usage. The only problem is the library that
> > would bind everything together.
> > In my example run_task will have to be really smart, keeping track of all
> > started tasks, results of all finished ones, skipping all tasks that have
> > already been done (and substituting already generated results).
> > But all of this is doable. And I find this way of declaring workflows way
> > more understandable than whatever it would look like with Flow.add's
> >
>
> The way the flow is declared is important, as it leads to more isolated
> code. The single place where the flow is declared in Josh's example means
> that the flow can be imported, the state deserialized and inspected,
> and resumed by any piece of code: an API call, a daemon start up, an
> admin command, etc.
>
> I may be wrong, but it appears to me that the context that you built in
> your code example is hard, maybe impossible, to resume after a process
> restart unless _every_ task is entirely idempotent and thus can just be
> repeated over and over.

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-09 Thread Clint Byrum
Excerpts from Yuriy Taraday's message of 2014-07-09 03:36:00 -0700:
> On Tue, Jul 8, 2014 at 11:31 PM, Joshua Harlow 
> wrote:
> 
> > I think Clint's response was likely better than what I can write here, but
> > I'll add on a few things,
> >
> >
> > >How do you write such code using taskflow?
> > >
> > >  @asyncio.coroutine
> > >  def foo(self):
> > >      result = yield from some_async_op(...)
> > >      return do_stuff(result)
> >
> > The idea (at a very high level) is that users don't write this;
> >
> > What users do write is a workflow, maybe the following (pseudocode):
> >
> > # Define the pieces of your workflow.
> >
> > TaskA():
> >   def execute():
> >   # Do whatever some_async_op did here.
> >
> >   def revert():
> >   # If execute had any side-effects undo them here.
> >
> > TaskFoo():
> >...
> >
> > # Compose them together
> >
> > flow = linear_flow.Flow("my-stuff").add(TaskA("my-task-a"),
> > TaskFoo("my-foo"))
> >
> 
> I wouldn't consider this composition very user-friendly.
> 

I find it extremely user friendly when I consider that it gives you
clear lines of delineation between "the way it should work" and "what
to do when it breaks."

> > # Submit the workflow to an engine, let the engine do the work to execute
> > it (and transfer any state between tasks as needed).
> >
> > The idea here is that when things like this are declaratively specified
> > the only thing that matters is that the engine respects that declaration;
> > not whether it uses asyncio, eventlet, pigeons, threads, remote
> > workers[1]. It also adds some things that are not (imho) possible with
> > co-routines (in part since they are at such a low level) like stopping the
> > engine after 'my-task-a' runs and shutting off the software, upgrading it,
> > restarting it and then picking back up at 'my-foo'.
> >
> 
> It's absolutely possible with coroutines and might provide an even
> clearer view of what's going on. Like this:
> 
> @asyncio.coroutine
> def my_workflow(ctx, ...):
>     project = yield from ctx.run_task(create_project())
>     # Hey, we don't want to be linear. How about parallel tasks?
>     volume, network = yield from asyncio.gather(
>         ctx.run_task(create_volume(project)),
>         ctx.run_task(create_network(project)),
>     )
>     # We can put anything here - why not branch a bit?
>     if create_one_vm:
>         yield from ctx.run_task(create_vm(project, network))
>     else:
>         # Or even loops - why not?
>         for i in range(network.num_ips()):
>             yield from ctx.run_task(create_vm(project, network))
> 

Sorry but the code above is nothing like the code that Josh shared. When
create_network(project) fails, how do we revert its side effects? If we
want to resume this flow after reboot, how does that work?

I understand that there is a desire to write everything in beautiful
python yields, try's, finally's, and excepts. But the reality is that
python's stack is lost the moment the process segfaults, power goes out
on that PDU, or the admin rolls out a new kernel.

We're not saying "asyncio vs. taskflow". I've seen that mistake twice
already in this thread. Josh and I are suggesting that if there is a
movement to think about coroutines, there should also be some time spent
thinking at a high level: "how do we resume tasks, revert side effects,
and control flow?"

If we embed taskflow deep in the code, we get those things, and we can
treat tasks as coroutines and let taskflow's event loop be asyncio just
the same. If we embed asyncio deep into the code, we don't get any of
the high level functions and we get just as much code churn.

> There's no limit to coroutine usage. The only problem is the library that
> would bind everything together.
> In my example run_task will have to be really smart, keeping track of all
> started tasks, results of all finished ones, skipping all tasks that have
> already been done (and substituting already generated results).
> But all of this is doable. And I find this way of declaring workflows way
> more understandable than whatever it would look like with Flow.add's
> 

The way the flow is declared is important, as it leads to more isolated
code. The single place where the flow is declared in Josh's example means
that the flow can be imported, the state deserialized and inspected,
and resumed by any piece of code: an API call, a daemon start up, an
admin command, etc.

I may be wrong, but it appears to me that the context that you built in
your code example is hard, maybe impossible, to resume after a process
restart unless _every_ task is entirely idempotent and thus can just be
repeated over and over.
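To make that concrete, here is a minimal runnable sketch of the execute/revert
pairing being discussed (the task names and the forced failure are invented for
illustration; with a persistence backend configured, the same flow can also be
reloaded and resumed after a restart):

  from taskflow import engines, task
  from taskflow.patterns import linear_flow

  class CreateProject(task.Task):
      default_provides = 'project'

      def execute(self):
          # Real work goes here; the return value is recorded by the
          # engine and injected into later tasks by name.
          return {'id': 'demo-project'}

      def revert(self, result, **kwargs):
          # Called by the engine to undo execute()'s side effects when
          # a later task in the flow fails.
          print('reverting project %s' % (result,))

  class CreateNetwork(task.Task):
      def execute(self, project):
          raise RuntimeError('boom')  # simulate a failure mid-flow

  flow = linear_flow.Flow('provision').add(
      CreateProject('create-project'),
      CreateNetwork('create-network'),
  )

  try:
      engines.run(flow)
  except RuntimeError:
      # By the time the failure is re-raised here, the engine has
      # already run CreateProject.revert().
      pass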



Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-09 Thread Yuriy Taraday
On Tue, Jul 8, 2014 at 11:31 PM, Joshua Harlow 
wrote:

> I think Clint's response was likely better than what I can write here, but
> I'll add-on a few things,
>
>
> >How do you write such code using taskflow?
> >
> >  @asyncio.coroutine
> >  def foo(self):
> >  result = yield from some_async_op(...)
> >  return do_stuff(result)
>
> The idea (at a very high level) is that users don't write this;
>
> What users do write is a workflow, maybe the following (pseudocode):
>
> # Define the pieces of your workflow.
>
> TaskA():
>   def execute():
>   # Do whatever some_async_op did here.
>
>   def revert():
>   # If execute had any side-effects undo them here.
>
> TaskFoo():
>...
>
> # Compose them together
>
> flow = linear_flow.Flow("my-stuff").add(TaskA("my-task-a"),
> TaskFoo("my-foo"))
>

I wouldn't consider this composition very user-friendly.


> # Submit the workflow to an engine, let the engine do the work to execute
> it (and transfer any state between tasks as needed).
>
> The idea here is that when things like this are declaratively specified
> the only thing that matters is that the engine respects that declaration;
> not whether it uses asyncio, eventlet, pigeons, threads, remote
> workers[1]. It also adds some things that are not (imho) possible with
> co-routines (in part since they are at such a low level) like stopping the
> engine after 'my-task-a' runs and shutting off the software, upgrading it,
> restarting it and then picking back up at 'my-foo'.
>

It's absolutely possible with coroutines and might provide even clearer
view of what's going on. Like this:

@asyncio.coroutine
def my_workflow(ctx, ...):
project = yield from ctx.run_task(create_project())
# Hey, we don't want to be linear. How about parallel tasks?
volume, network = yield from asyncio.gather(
ctx.run_task(create_volume(project)),
ctx.run_task(create_network(project)),
)
# We can put anything here - why not branch a bit?
if create_one_vm:
yield from ctx.run_task(create_vm(project, network))
else:
# Or even loops - why not?
for i in range(network.num_ips()):
yield from ctx.run_task(create_vm(project, network))

There's no limit to coroutine usage. The only problem is the library that
would bind everything together.
In my example run_task will have to be really smart, keeping track of all
started tasks, results of all finished ones, skipping all tasks that have
already been done (and substituting already generated results).
But all of this is doable. And I find this way of declaring workflows way
more understandable than whatever it would look like with Flow.add's
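To sketch what I mean (all names are hypothetical, and the journal would need
to live in durable storage for restart-resumption to work):

  import asyncio

  class WorkflowContext(object):
      def __init__(self, journal):
          # journal: dict-like; would need durable storage to survive
          # a process restart.
          self.journal = journal

      @asyncio.coroutine
      def run_task(self, key, coro):
          if key in self.journal:
              coro.close()              # already done: don't re-run,
              return self.journal[key]  # substitute the saved result
          result = yield from coro
          self.journal[key] = result    # record before moving on
          return result

The hard part is hiding in the key argument: something has to reliably
identify "the same task" across runs, which is exactly where run_task has to
be smart.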

> Hope that helps make it a little more understandable :)
>
> -Josh
>

PS: I've just found all your emails in this thread in the Spam folder, so it's
probable that not everybody has read them.

-- 

Kind regards, Yuriy.


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-08 Thread Joshua Harlow

>>This is the part that I really wonder about. Since asyncio isn't just a
>>drop-in replacement for eventlet (which hid the async part under its
>>*black magic*), I very much wonder how the community will respond to this
>>kind of mindset change (along with its new *black magic*).
>
>I disagree with you, asyncio is not "black magic". It's well defined. The
>execution of a coroutine is complex, but it doesn't use magic. IMO
>eventlet
>task switching is more black magic, it's not possible to guess it just by
>reading the code.
>
>Sorry but asyncio is nothing new :-( It's just a fresh design based on
>previous projects.
>
>Python has had Twisted for more than 10 years. More recently, Tornado came.
>Both
>support coroutines using "yield" (see @inlineCallbacks and toro). Thanks
>to
>these two projects, there are already libraries which have an async API,
>using
>coroutines or callbacks.
>
>These are just the two major projects; there are many more projects, but
>they are smaller and younger.

I agree that the idea is nothing new; my observation/question/thought was
that the paradigm and larger architectural switch for openstack (and parts
of the larger python community) is new (even if the concept is not new),
and that means for those unaccustomed to it (or for those without
experience with node.js, go, twisted, tornado or other...) it will
appear similarly black-magic-like (complex things appear as magic
until they exist for long enough to be understood, at which point they are
no longer magical). Eventlet has the one benefit that it has been
around longer (although of course some people will still call it magical,
for the previously stated reasons), for better or worse.

Hope that makes more sense now.

-Josh





Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-08 Thread Joshua Harlow
I think Clint's response was likely better than what I can write here, but
I'll add-on a few things,


>How do you write such code using taskflow?
>
>  @asyncio.coroutine
>  def foo(self):
>  result = yield from some_async_op(...)
>  return do_stuff(result)

The idea (at a very high level) is that users don't write this;

What users do write is a workflow, maybe the following (pseudocode):

# Define the pieces of your workflow.

TaskA():
  def execute():
  # Do whatever some_async_op did here.

  def revert():
  # If execute had any side-effects undo them here.
  
TaskFoo():
   ...

# Compose them together

flow = linear_flow.Flow("my-stuff").add(TaskA("my-task-a"),
TaskFoo("my-foo"))

# Submit the workflow to an engine, let the engine do the work to execute
it (and transfer any state between tasks as needed).

The idea here is that when things like this are declaratively specified
the only thing that matters is that the engine respects that declaration;
not whether it uses asyncio, eventlet, pigeons, threads, remote
workers[1]. It also adds some things that are not (imho) possible with
co-routines (in part since they are at such a low level) like stopping the
engine after 'my-task-a' runs and shutting off the software, upgrading it,
restarting it and then picking back up at 'my-foo'.

Hope that helps make it a little more understandable :)

-Josh

[1] http://docs.openstack.org/developer/taskflow/workers.html




Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-08 Thread Joshua Harlow
Thanks clint, that was the gist of what I was getting at with the (likely
to long) email.

-Josh

-Original Message-
From: Clint Byrum 
Reply-To: "OpenStack Development Mailing List (not for usage questions)"

Date: Tuesday, July 8, 2014 at 11:43 AM
To: openstack-dev 
Subject: Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

>Excerpts from Victor Stinner's message of 2014-07-08 05:47:36 -0700:
>> Hi Joshua,
>> 
>> You asked a lot of questions. I will try to answer.
>> 
>> On Monday, July 7, 2014 at 17:41:34, Joshua Harlow wrote:
>> > * Why focus on a replacement low level execution model integration
>>instead
>> > of higher level workflow library or service (taskflow, mistral...
>>other)
>> > integration?
>> 
>> I don't know taskflow, I cannot answer this question.
>> 
>> How do you write such code using taskflow?
>> 
>>   @asyncio.coroutine
>>   def foo(self):
>>   result = yield from some_async_op(...)
>>   return do_stuff(result)
>> 
>
>Victor, this is a low level piece of code, which highlights the problem
>that taskflow's higher level structure is meant to address. In writing
>OpenStack, we want to accomplish tasks based on a number of events. Users,
>errors, etc. We don't explicitly want to run coroutines, we want to
>attach volumes, spawn vms, and store files.
>
>See this:
>
>http://docs.openstack.org/developer/taskflow/examples.html
>
>The result is consumed in the next task in the flow. Meanwhile we get
>a clear definition of work-flow and very clear methods for resumption,
>retry, etc. So the expression is not as tightly bound as the code above,
>but that is the point, because we want to break things up into tasks
>which are clearly defined and then be able to resume each one
>individually.
>
>So what I think Josh is getting at, is that we could add asyncio support
>into taskflow as an abstraction for tasks that want to be non-blocking,
>and then we can focus on refactoring the code around high level work-flow
>expression rather than low level asyncio and coroutines.
>
>> > * Was the heat (asyncio-like) execution model[1] examined and learned
>>from
>> > before considering moving to asyncio?
>> 
>> I looked at Heat coroutines, but it has a design very different from
>>asyncio.
>> 
>> In short, asyncio uses an event loop running somewhere in the background,
>> whereas Heat explicitly schedules the execution of some tasks (with
>> "TaskRunner"), blocks until it gets the result and then stops completely
>> its "event loop". It's possible to implement that with asyncio; there is
>> for example a run_until_complete() method stopping the event loop when a
>> future is done. But the asyncio event loop is designed to run "forever",
>> so various projects can run tasks "at the same time", not only a very
>> specific section of the code to run a set of tasks.
>> 
>> asyncio is not only designed to schedule callbacks, it's also designed
>> to manage file descriptors (especially sockets). It can also spawn and
>> manage subprocesses. This is not supported by the Heat scheduler.
>> 
>> IMO the Heat scheduler is too specific; it cannot be used widely in
>> OpenStack.
>> 
>
>This is sort of backwards to what Josh was suggesting. Heat can't continue
>with the current approach, which is coroutine based, because we need
>the execution stack to not be in RAM on a single engine. We are going
>to achieve even more concurrency than we have now through an even higher
>level of task abstraction as part of the move to a convergence model. We
>will likely use task-flow to express these tasks so that they are more
>resumable and generally resilient to failure.
>
>> > Along a related question,
>> > seeing that openstack needs to support py2.x and py3.x will this mean
>>that
>> > trollius will be required to be used in 3.x (as it is the least common
>> > denominator, not new syntax like 'yield from' that won't exist in
>>2.x).
>> > Does this mean that libraries that will now be required to change
>>will be
>> > required to use trollius (the pulsar[6] framework seemed to mesh
>>these two
>> > nicely); is this understood by those authors?
>> 
>> It *is* possible to write code working on asyncio and trollius:
>> 
>>http://trollius.readthedocs.org/#write-code-working-on-trollius-and-tulip
>> 
>> There are different options for that. There are already projects
>> supporting asyncio and trollius.

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-08 Thread Clint Byrum
Excerpts from Victor Stinner's message of 2014-07-08 05:47:36 -0700:
> Hi Joshua,
> 
> You asked a lot of questions. I will try to answer.
> 
> On Monday, July 7, 2014 at 17:41:34, Joshua Harlow wrote:
> > * Why focus on a replacement low level execution model integration instead
> > of higher level workflow library or service (taskflow, mistral... other)
> > integration?
> 
> I don't know taskflow, I cannot answer this question.
> 
> How do you write such code using taskflow?
> 
>   @asyncio.coroutine
>   def foo(self):
>   result = yield from some_async_op(...)
>   return do_stuff(result)
> 

Victor, this is a low level piece of code, which highlights the problem
that taskflow's higher level structure is meant to address. In writing
OpenStack, we want to accomplish tasks based on a number of events. Users,
errors, etc. We don't explicitly want to run coroutines, we want to
attach volumes, spawn vms, and store files.

See this:

http://docs.openstack.org/developer/taskflow/examples.html

The result is consumed in the next task in the flow. Meanwhile we get
a clear definition of work-flow and very clear methods for resumption,
retry, etc. So the expression is not as tightly bound as the code above,
but that is the point, because we want to break things up into tasks
which are clearly defined and then be able to resume each one
individually.

So what I think Josh is getting at, is that we could add asyncio support
into taskflow as an abstraction for tasks that want to be non-blocking,
and then we can focus on refactoring the code around high level work-flow
expression rather than low level asyncio and coroutines.
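As a rough illustration of that direction (purely hypothetical; taskflow had
no such helper, and the details would matter):

  import asyncio
  from taskflow import task

  class CoroutineTask(task.Task):
      """Drive an asyncio coroutine from a declarative taskflow task."""

      def __init__(self, name, coro_func):
          super(CoroutineTask, self).__init__(name=name)
          self._coro_func = coro_func

      def execute(self, *args, **kwargs):
          # Run the coroutine to completion on the current event loop
          # (asyncio proper, or greenio during a transition).
          loop = asyncio.get_event_loop()
          return loop.run_until_complete(self._coro_func(*args, **kwargs))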

> > * Was the heat (asyncio-like) execution model[1] examined and learned from
> > before considering moving to asyncio?
> 
> I looked at Heat coroutines, but it has a design very different from asyncio.
> 
> In short, asyncio uses an event loop running somewhere in the background, 
> whereas Heat explicitly schedules the execution of some tasks (with 
> "TaskRunner"), blocks until it gets the result and then stop completly its 
> "event loop". It's possible to implement that with asyncio, there is for 
> example a run_until_complete() method stopping the event loop when a future 
> is 
> done. But asyncio event loop is designed to run "forever", so various 
> projects 
> can run tasks "at the same time", not only a very specific section of the 
> code 
> to run a set of tasks.
> 
> asyncio is not only designed to schedule callbacks, it's also designed to 
> manage file descriptors (especially sockets). It can also spawn and manage 
> subprocesses. This is not supported by the Heat scheduler.
> 
> IMO the Heat scheduler is too specific; it cannot be used widely in OpenStack.
> 

This is sort of backwards to what Josh was suggesting. Heat can't continue
with the current approach, which is coroutine based, because we need
the execution stack to not be in RAM on a single engine. We are going
to achieve even more concurrency than we have now through an even higher
level of task abstraction as part of the move to a convergence model. We
will likely use task-flow to express these tasks so that they are more
resumable and generally resilient to failure.

> > Along a related question,
> > seeing that openstack needs to support py2.x and py3.x will this mean that
> > trollius will be required to be used in 3.x (as it is the least common
> > denominator, not new syntax like 'yield from' that won't exist in 2.x).
> > Does this mean that libraries that will now be required to change will be
> > required to use trollius (the pulsar[6] framework seemed to mesh these two
> > nicely); is this understood by those authors?
> 
> It *is* possible to write code working on asyncio and trollius:
> http://trollius.readthedocs.org/#write-code-working-on-trollius-and-tulip
> 
> There are different options for that. There are already projects supporting 
> asyncio and trollius.
> 
> > Is this the direction we
> > want to go down (if we stay focused on ensuring py3.x compatible, then why
> > not just jump to py3.x in the first place)?
> 
> FYI OpenStack does not support Python 3 right now. I'm working on porting 
> OpenStack to Python 3, we made huge progress, but it's not done yet.
> 
> Anyway, the new RHEL 7 release doesn't provide Python 3.3 in the default 
> system; you have to enable the SCL repository (which provides Python 3.3). 
> And Python 2.7 or even 2.6 is still used in production.
> 
> I would also prefer to use "yield from" directly and "just" drop Python 2 
> support. But dropping Python 2 support is not going to happen for at least 
> 2 years.
> 

Long term porting is important; however, we have immediate needs for
improvements in resilience and scalability. We cannot hang _any_ of that
on Python 3.



Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-08 Thread Gordon Sim

On 07/07/2014 07:18 PM, Mark McLoughlin wrote:

I'd expect us to add e.g.

   @asyncio.coroutine
   def call_async(self, ctxt, method, **kwargs):
   ...

to RPCClient. Perhaps we'd need to add an AsyncRPCClient in a separate
module and only add the method there - I don't have a good sense of it
yet.

However, the key thing is that I don't anticipate us needing to change
the current API in a backwards incompatible way.


Agreed, and that is a good thing. You would be *adding* to the API to 
support async behaviour, though right? Perhaps it would be worth being 
more explicit about the asynchronicity in that case, e.g. return a 
promise/future or allow an on-completion callback?




Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-08 Thread Victor Stinner
On Monday, July 7, 2014 at 19:14:45, Angus Salkeld wrote:
> 4) retraining of OpenStack developers/reviews to understand the new
>event loop. (eventlet has warts, but a lot of devs know about them).

Wait, what?

Sorry if it was not clear, but the *whole* point of replacing eventlet with 
asyncio is to solve the most critical eventlet issue: eventlet may switch to 
another greenthread *anywhere* which causes dangerous and very complex bugs.

I asked different OpenStack developers: almost nobody in OpenStack is able to 
understand these issues or fix them. Most OpenStack developers are just not 
aware of the issue. A colleague told me that he's alone in fixing eventlet 
bugs, and nobody else cares because he's fixing them...
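A contrived example of the class of bug I mean:

  import eventlet
  eventlet.monkey_patch()

  connections = {}

  def get_connection(host):
      if host not in connections:
          # eventlet may switch greenthreads inside connect(), so two
          # greenthreads can both pass the check above and create
          # duplicate connections for the same host.
          connections[host] = eventlet.connect((host, 80))
      return connections[host]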

Read also the "What's wrong with eventlet ?" section of my article:
http://techs.enovance.com/6562/asyncio-openstack-python3

eventlet does not support Python 3 yet. They made some progress, but it is not 
working yet (at least in Oslo Incubator). eventlet is now the dependency 
blocking most OpenStack projects from being ported to Python 3:
https://wiki.openstack.org/wiki/Python3#Core_OpenStack_projects

> Once we are in "asyncio heaven", would we look back and say "it
> would have been more valuable to focus on X", where X could have
> been say ease-of-upgrades or general-stability?

I would like to work on asyncio; I don't force anyone to work on it. It's not 
like there are only 2 developers working on OpenStack :-) OpenStack already 
evolves fast enough (maybe too fast according to some people!).

It looks like other developers are interested in helping me, probably because 
they want to get rid of eventlet for the reason I just explained.

Victor



Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-08 Thread Victor Stinner
Hi Joshua,

You asked a lot of questions. I will try to answer.

On Monday, July 7, 2014 at 17:41:34, Joshua Harlow wrote:
> * Why focus on a replacement low level execution model integration instead
> of higher level workflow library or service (taskflow, mistral... other)
> integration?

I don't know taskflow, I cannot answer this question.

How do you write such code using taskflow?

  @asyncio.coroutine
  def foo(self):
  result = yield from some_async_op(...)
  return do_stuff(result)


> * Was the heat (asyncio-like) execution model[1] examined and learned from
> before considering moving to asyncio?

I looked at Heat coroutines, but it has a design very different from asyncio.

In short, asyncio uses an event loop running somewhere in the background, 
whereas Heat explicitly schedules the execution of some tasks (with 
"TaskRunner"), blocks until it gets the result and then stop completly its 
"event loop". It's possible to implement that with asyncio, there is for 
example a run_until_complete() method stopping the event loop when a future is 
done. But asyncio event loop is designed to run "forever", so various projects 
can run tasks "at the same time", not only a very specific section of the code 
to run a set of tasks.

asyncio is not only designed to schedule callbacks, it's also designed to 
manage file descriptors (especially sockets). It can also spawn and manage 
subprocesses. This is not supported by the Heat scheduler.

IMO the Heat scheduler is too specific; it cannot be used widely in OpenStack.


>   * A side-question, how do asyncio and/or trollius support debugging, do
> they support tracing individual co-routines? What about introspecting the
> state a coroutine has associated with it? Eventlet at least has
> http://eventlet.net/doc/modules/debug.html (which is better than nothing);
> does an equivalent exist?

asyncio documentation has a section dedicated to debug:
https://docs.python.org/dev/library/asyncio-dev.html

asyncio.Task has get_stack() and print_stack() methods that can be used to get the 
current state of a task. I don't know if it is what you are looking for.

I recently modified asyncio's Task and Handle to save the traceback where they 
were created; it's now much easier to debug code. asyncio now also logs "slow 
callbacks" which may block the event loop (reducing reactivity).

Read: We are making progress to ease the development and debugging of asyncio code.

I don't know exactly what you expect for debugging. However, there is no 
"tracing" feature" in asyncio yet.


> This is the part that I really wonder about. Since asyncio isn't just a
> drop-in replacement for eventlet (which hid the async part under its
> *black magic*), I very much wonder how the community will respond to this
> kind of mindset change (along with its new *black magic*).

I disagree with you, asyncio is not "black magic". It's well defined. The 
execution of a coroutine is complex, but it doesn't use magic. IMO eventlet 
task switching is more black magic: it's not possible to guess it just by 
reading the code.

Sorry but asyncio is nothing new :-( It's just a fresh design based on 
previous projects.

Python has had Twisted for more than 10 years. More recently, Tornado came. Both 
support coroutines using "yield" (see @inlineCallbacks and toro). Thanks to 
these two projects, there are already libraries which have an async API, using 
coroutines or callbacks.

These are just the two major projects; there are many more projects, but they 
are smaller and younger.

Other programming languages are also moving to asynchronous programming. Read 
for example this article which lists some of them:
https://glyph.twistedmatrix.com/2014/02/unyielding.html


> * Is the larger python community ready for this?
> 
> Seeing other responses for supporting libraries that aren't asyncio
> compatible it doesn't inspire confidence that this path is ready to be
> headed down. Partially this is due to the fact that it's a completely new
> programming model and a lot of underlying libraries will be forced to
> change to accommodate this (sqlalchemy, others listed in [5]...).

The design of the asyncio core is really simple; it's mostly based on 
callbacks. Callbacks are an old concept, and many libraries already accept a 
callback to send the result when they are done.

In asyncio, you can use loop.call_soon(callback) to schedule the execution of 
a callback, and future.add_done_callback() to connect a future to a callback.
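A tiny illustration of those two hooks:

  import asyncio

  loop = asyncio.get_event_loop()
  fut = asyncio.Future()

  def on_done(f):
      print('got result: %s' % (f.result(),))
      loop.stop()

  fut.add_done_callback(on_done)      # connect a future to a callback
  loop.call_soon(fut.set_result, 42)  # schedule a plain callback
  loop.run_forever()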

Oh, by the way, the core of Twisted and Tornado also uses callbacks. Using 
callbacks makes you compatible with Twisted, Tornado and asyncio.


asyncio already has projects compatible with it, see this list:
https://code.google.com/p/tulip/wiki/ThirdParty

There are for example Redis, PostgreSQL and MongoDB drivers. There is also an 
AMQP implementation.


You are not forced to make your code async. You can run blocking code in 
threads (a pool of threads) using loop.run_in_executor(). For example, DNS 
resolution is already run in a thread pool by the default event loop.
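A minimal illustration (None selects the event loop's default thread pool):

  import asyncio
  import socket

  @asyncio.coroutine
  def lookup(host):
      loop = asyncio.get_event_loop()
      # socket.gethostbyname blocks; run it in a worker thread instead
      # of blocking the event loop.
      return (yield from loop.run_in_executor(
          None, socket.gethostbyname, host))

  loop = asyncio.get_event_loop()
  print(loop.run_until_complete(lookup('localhost')))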

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Clint Byrum
Excerpts from Joshua Harlow's message of 2014-07-07 10:41:34 -0700:
> So I've been thinking how to respond to this email, and here goes (shields
> up!),
> 
> First things first; thanks mark and victor for the detailed plan and
> making it visible to all. It's very nicely put together and the amount of
> thought put into it is great to see. I always welcome an effort to move
> toward a new structured & explicit programming model (which asyncio
> clearly helps make possible and strongly encourages/requires).
> 

I too appreciate the level of detail in the proposal. I think I
understand where it wants to go.

> So now to some questions that I've been thinking about how to
> address/raise/ask (if any of these appear as FUD, they were not meant to
> be):
> 
> * Why focus on a replacement low level execution model integration instead
> of higher level workflow library or service (taskflow, mistral... other)
> integration?
> 
> Since pretty much all of openstack is focused around workflows that get
> triggered by some API activated by some user/entity having a new execution
> model (asyncio) IMHO doesn't seem to be shifting the needle in the
> direction that improves the scalability, robustness and crash-tolerance of
> those workflows (and the associated projects those workflows are currently
> defined & reside in). I *mostly* understand why we want to move to asyncio
> (py3, getting rid of eventlet, better performance? new awesomeness...) but
> it doesn't feel that important to actually accomplish seeing the big holes
> that openstack has right now with scalability, robustness... Let's imagine
> a different view on this; if all openstack projects declaratively define
> the workflows there APIs trigger (nova is working on task APIs, cinder is
> getting there to...), and in the future when the projects are *only*
> responsible for composing those workflows and handling the API inputs &
> responses then the need for asyncio or other technology can move out from
> the individual projects and into something else (possibly something that
> is being built & used as we speak). With this kind of approach the
> execution model can be an internal implementation detail of the workflow
> 'engine/processor' (it will also be responsible for fault-tolerant, robust
> and scalable execution). If this seems reasonable, then why not focus on
> integrating said thing into openstack and move the projects to a model
> that is independent of eventlet, asyncio (or the next greatest thing)
> instead? This seems to push the needle in the right direction and IMHO
> (and hopefully others opinions) has a much bigger potential to improve the
> various projects than just switching to a new underlying execution model.
> 
> * Was the heat (asyncio-like) execution model[1] examined and learned from
> before considering moving to asyncio?
> 
> I will try not to put words into the heat developers' mouths (I can't do it
> justice anyway, hopefully they can chime in here) but I believe that heat
> has a system that is very similar to asyncio and coroutines right now and
> they are actively moving to a different model due to problems in part due
> to using that coroutine model in heat. So if they are moving somewhat away
> from that model (to a more declaratively workflow model that can be
> interrupted and converged upon [2]) why would it be beneficial for other
> projects to move toward the model they are moving away from (instead of
> repeating the issues the heat team had with coroutines, ex, visibility
> into stack/coroutine state, scale limitations, interruptibility...)?
> 

I'd like to hear Zane's opinions as he developed the rather lightweight
code that we use. It has been quite a learning curve for me but I do
understand how to use the task scheduler we have in Heat now.

Heat's model is similar to asyncio, but is entirely limited in scope. I
think it has stayed relatively manageable because it is really only used
for a few explicit tasks where a high degree of concurrency makes a lot
of sense. We are not using it for I/O concurrency (eventlet still does
that) but rather for request concurrency. So we tell nova to boot 100
servers with 100 coroutines that have 100 other coroutines to block
further execution until those servers are active. We are by no means
using it as a general purpose concurrency programming model.

That said, as somebody working on the specification to move toward a
more taskflow-like (perhaps even entirely taskflow-based) model in Heat,
I think that is the way to go. The fact that we already have an event
loop that doesn't need to be explicit except at the very lowest levels
makes me want to keep that model. And we clearly need help with how to
define workflows, which something like taskflow will do nicely.

>   
>   * A side-question, how do asyncio and/or trollius support debugging, do
> they support tracing individual co-routines? What about introspecting the
> state a coroutine has associated with it? Eventlet at least has
> http://eventlet.net/doc/modules/debug.html (which is better than nothing);
> does an equivalent exist?

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Angus Salkeld

On 07/07/14 08:28, Mark McLoughlin wrote:
> On Mon, 2014-07-07 at 18:11 +, Angus Salkeld wrote:
>> On 03/07/14 05:30, Mark McLoughlin wrote:
>>> Hey
>>>
>>> This is an attempt to summarize a really useful discussion that Victor,
>>> Flavio and I have been having today. At the bottom are some background
>>> links - basically what I have open in my browser right now thinking
>>> through all of this.
>>>
>>> We're attempting to take baby-steps towards moving completely from
>>> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
>>> first victim.
>>
>> Has this been widely agreed on? It seems to me like we are mixing two
>> issues:
>> 1) we need to move to py3
>> 2) some people want to move from eventlet (I am not convinced that the
>>volume of code changes warrants the end goal - and review load)
>>
>> To achieve "1)" in a lower risk change, shouldn't we rather run eventlet
>> on top of asyncio? - i.e. not require widespread code changes.
>>
>> So we can maintain the main loop API but move to py3. I am not sure on
>> the feasibility, but seems to me like a more contained change.
> 
> Right - it's important that we see these as orthogonal questions,
> particularly now that it appears eventlet is likely to be available for
> Python 3 soon.

Awesome (I didn't know that), how about we just use that?
Relax and enjoy py3 :-)

Can we afford the code churn that the move to asyncio requires?
In terms of:
1) introduced bugs from the large code changes
2) redirected developers (that could be solving more pressing issues)
3) the problem of not being able to easily backport patches to stable
   (the code has diverged)
4) retraining of OpenStack developers/reviews to understand the new
   event loop. (eventlet has warts, but a lot of devs know about them).

> 
> For example, if it was generally agreed that we all want to end up on
> Python 3 with asyncio in the long term, you could imagine deploying

I am questioning whether we should be using asyncio directly (yield);
instead we keep using eventlet (the new one with py3 support) and it
runs the appropriate main loop depending on py2/3.

I don't want to derail this effort, I just want to suggest what I see
as an obvious alternative that requires a fraction of the work (or none).

The question is: "is the effort worth the outcome"?

Once we are in "asyncio heaven", would we look back and say "it
would have been more valuable to focus on X", where X could have
been say ease-of-upgrades or general-stability?


-Angus

> (picking random examples) Glance with Python 3 and eventlet, but
> Ceilometer with Python 2 and asyncio/trollius.
> 
> However, I don't have a good handle on how your suggestion of switching
> to the asyncio event loop without widespread code changes would work.
> 
> Mark.
> 
> 




Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Joshua Harlow
So just to clear this up, my understanding is that asyncio and replacing
RPC calls with taskflow's job concept are two very different things. The
asyncio change would be retaining the RPC layer while the job concept[1]
would be something entirely different. I'm not a ceilometer expert though
so my understanding might be incorrect.

Overall the taskflow job mechanism is a lot like RQ[2] in concept: it is
an abstraction around jobs, and doesn't mandate RPC, redis, zookeeper
or ... for how a job is performed. My biased
not-so-knowledgeable-about-ceilometer opinion is that a job mechanism
suits ceilometer more than an RPC one does (and since a job processing
mechanism is higher level abstraction it hopefully is more flexible with
regards to asyncio or other...).

[1] http://docs.openstack.org/developer/taskflow/jobs.html
[2] http://python-rq.org/

-Original Message-
From: Eoghan Glynn 
Reply-To: "OpenStack Development Mailing List (not for usage questions)"

Date: Sunday, July 6, 2014 at 6:28 AM
To: "OpenStack Development Mailing List (not for usage questions)"

Subject: Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

>
>
>> This is an attempt to summarize a really useful discussion that Victor,
>> Flavio and I have been having today. At the bottom are some background
>> links - basically what I have open in my browser right now thinking
>> through all of this.
>
>Thanks for the detailed summary, it puts more flesh on the bones
>than a brief conversation on the fringes of the Paris mid-cycle.
>
>Just a few clarifications and suggestions inline to add into the
>mix.
>
>> We're attempting to take baby-steps towards moving completely from
>> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
>> first victim.
>
>First beneficiary, I hope :)
> 
>> Ceilometer's code is run in response to various I/O events like REST API
>> requests, RPC calls, notifications received, etc. We eventually want the
>> asyncio event loop to be what schedules Ceilometer's code in response to
>> these events. Right now, it is eventlet doing that.
>
>Yes.
>
>And there is one other class of stimulus, also related to eventlet,
>that is very important for triggering the execution of ceilometer
>logic. That would be the timed tasks that drive polling of:
>
> * REST APIs provided by other openstack services
> * the local hypervisor running on each compute node
> * the SNMP daemons running at host-level etc.
>
>and also trigger periodic alarm evaluation.
>
>IIUC these tasks are all mediated via the oslo threadgroup's
>usage of eventlet.greenpool[1]. Would this logic also be replaced
>as part of this effort?
>
>> Now, because we're using eventlet, the code that is run in response to
>> these events looks like synchronous code that makes a bunch of
>> synchronous calls. For example, the code might do some_sync_op() and
>> that will cause a context switch to a different greenthread (within the
>> same native thread) where we might handle another I/O event (like a REST
>> API request)
>
>Just to make the point that most of the agents in the ceilometer
>zoo tend to react to just a single type of stimulus, as opposed
>to a mix of dispatching from both message bus and the REST API.
>
>So to classify, we'd have:
>
> * compute-agent: timer tasks for polling
> * central-agent: timer tasks for polling
> * notification-agent: dispatch of "external" notifications from
>   the message bus
> * collector: dispatch of "internal" metering messages from the
>   message bus
> * api-service: dispatch of REST API calls
> * alarm-evaluator: timer tasks for alarm evaluation
> * alarm-notifier: dispatch of "internal" alarm notifications
>
>IIRC, the only case where there's a significant mix of trigger
>styles is the partitioned alarm evaluator, where assignments of
>alarm subsets for evaluation is driven over RPC, whereas the
>actual thresholding is triggered by a timer.
>
>> Porting from eventlet's implicit async approach to asyncio's explicit
>> async API will be seriously time consuming and we need to be able to do
>> it piece-by-piece.
>
>Yes, I agree, a step-wise approach is the key here.
>
>So I'd love to have some sense of the time horizon for this
>effort. It clearly feels like a multi-cycle effort, so the main
>question in my mind right now is whether we should be targeting
>the first deliverables for juno-3?
>
>That would provide a proof-point in advance of the K* summit,
>where I presume the task would be to get wider buy-in for the idea.
>
>If it makes sense to go ahead and aim the first baby steps at juno-3...

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Mark McLoughlin
On Mon, 2014-07-07 at 18:11 +, Angus Salkeld wrote:
> On 03/07/14 05:30, Mark McLoughlin wrote:
> > Hey
> > 
> > This is an attempt to summarize a really useful discussion that Victor,
> > Flavio and I have been having today. At the bottom are some background
> > links - basically what I have open in my browser right now thinking
> > through all of this.
> > 
> > We're attempting to take baby-steps towards moving completely from
> > eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
> > first victim.
> 
> Has this been widely agreed on? It seems to me like we are mixing two
> issues:
> 1) we need to move to py3
> 2) some people want to move from eventlet (I am not convinced that the
>volume of code changes warrants the end goal - and review load)
> 
> To achieve "1)" in a lower risk change, shouldn't we rather run eventlet
> on top of asyncio? - i.e. not require widespread code changes.
> 
> So we can maintain the main loop API but move to py3. I am not sure on
> the feasibility, but seems to me like a more contained change.

Right - it's important that we see these as orthogonal questions,
particularly now that it appears eventlet is likely to be available for
Python 3 soon.

For example, if it was generally agreed that we all want to end up on
Python 3 with asyncio in the long term, you could imagine deploying
(picking random examples) Glance with Python 3 and eventlet, but
Ceilometer with Python 2 and asyncio/trollius.

However, I don't have a good handle on how your suggestion of switching
to the asyncio event loop without widespread code changes would work.

Mark.




Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Chris Behrens

On Jul 7, 2014, at 11:11 AM, Angus Salkeld  wrote:

> 
> On 03/07/14 05:30, Mark McLoughlin wrote:
>> Hey
>> 
>> This is an attempt to summarize a really useful discussion that Victor,
>> Flavio and I have been having today. At the bottom are some background
>> links - basically what I have open in my browser right now thinking
>> through all of this.
>> 
>> We're attempting to take baby-steps towards moving completely from
>> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
>> first victim.
> 
> Has this been widely agreed on? It seems to me like we are mixing two
> issues:

Right. Does someone have a pointer to where this was decided?

- Chris





Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Mark McLoughlin
On Mon, 2014-07-07 at 15:53 +0100, Gordon Sim wrote:
> On 07/07/2014 03:12 PM, Victor Stinner wrote:
> > The first step is to patch endpoints to add @trollius.coroutine to the 
> > methods,
> > and add yield From(...) on asynchronous tasks.
> 
> What are the 'endpoints' here? Are these internal to the oslo.messaging 
> library, or external to it?

The callback functions we dispatch to are called 'endpoint methods' -
e.g. they are methods on the 'endpoints' objects passed to
get_rpc_server().

> > Later we may modify Oslo Messaging to be able to call an RPC method
> > asynchronously, a method which would return a Trollius coroutine or task
> > directly. The problem is that Oslo Messaging currently hides 
> > "implementation"
> > details like eventlet.
> 
> I guess my question is how effectively does it hide it? If the answer to 
> the above is that this change can be contained within the oslo.messaging 
> implementation itself, then that would suggest its hidden reasonably well.
> 
> If, as I first understood (perhaps wrongly) it required changes to every 
> use of the oslo.messaging API, then it wouldn't really be hidden.
> 
> > Returning a Trollius object means that Oslo Messaging
> > will use explicitly Trollius. I'm not sure that OpenStack is ready for that
> > today.
> 
> The oslo.messaging API could evolve/expand to include explicitly 
> asynchronous methods that did not directly expose Trollius.

I'd expect us to add e.g.

  @asyncio.coroutine
  def call_async(self, ctxt, method, **kwargs):
  ...

to RPCClient. Perhaps we'd need to add an AsyncRPCClient in a separate
module and only add the method there - I don't have a good sense of it
yet.

However, the key thing is that I don't anticipate us needing to change
the current API in a backwards incompatible way.
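For illustration, a caller might then consume such an additive API like this
(a sketch; the client object and the method name are made up):

  import asyncio

  @asyncio.coroutine
  def resize_instance(client, ctxt, instance_id):
      # client is whatever object grows the call_async() method above
      result = yield from client.call_async(
          ctxt, 'resize_instance', instance_id=instance_id)
      return result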

Mark.




Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Mark McLoughlin
On Sun, 2014-07-06 at 09:28 -0400, Eoghan Glynn wrote:
> 
> > This is an attempt to summarize a really useful discussion that Victor,
> > Flavio and I have been having today. At the bottom are some background
> > links - basically what I have open in my browser right now thinking
> > through all of this.
> 
> Thanks for the detailed summary, it puts more flesh on the bones
> than a brief conversation on the fringes of the Paris mid-cycle.
> 
> Just a few clarifications and suggestions inline to add into the
> mix.
> 
> > We're attempting to take baby-steps towards moving completely from
> > eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
> > first victim.
> 
> First beneficiary, I hope :)
>  
> > Ceilometer's code is run in response to various I/O events like REST API
> > requests, RPC calls, notifications received, etc. We eventually want the
> > asyncio event loop to be what schedules Ceilometer's code in response to
> > these events. Right now, it is eventlet doing that.
> 
> Yes.
> 
> And there is one other class of stimulus, also related to eventlet,
> that is very important for triggering the execution of ceilometer
> logic. That would be the timed tasks that drive polling of:
> 
>  * REST APIs provided by other openstack services 
>  * the local hypervisor running on each compute node
>  * the SNMP daemons running at host-level etc.
> 
> and also trigger periodic alarm evaluation.
> 
> IIUC these tasks are all mediated via the oslo threadgroup's
> usage of eventlet.greenpool[1]. Would this logic also be replaced
> as part of this effort?

As part of the broader "switch from eventlet to asyncio" effort, yes
absolutely.

At the core of any event loop is code to do select() (or equivalents)
waiting for file descriptors to become readable or writable, or timers
to expire. We want to switch from the eventlet event loop to the asyncio
event loop.

The ThreadGroup abstraction from oslo-incubator is an interface to the
eventlet event loop. When you do:

  self.tg.add_timer(interval, self._evaluate_assigned_alarms)

You're saying "run evaluate_assigned_alarms() every $interval seconds,
using select() to sleep between executions".

When you do:

  self.tg.add_thread(self.start_udp)

you're saying "run some code which will either run to completion or set
wait for fd or timer events using select()".

The asyncio versions of those will be:

  event_loop.call_later(delay, callback)
  event_loop.call_soon(callback)

where the supplied callbacks will be asyncio 'coroutines' which rather
than doing:

  def foo(...):
  buf = read(fd)

and rely on eventlet's monkey patch to cause us to enter the event
loop's select() when the read() blocks, we instead do:

  @asyncio.coroutine
  def foo(...):
  buf = yield from read(fd)

which shows exactly where we might yield to the event loop.

The challenge is that porting code like the foo() function above is
pretty invasive and we can't simply port an entire service at once. So,
we need to be able to support a service using both eventlet-reliant code
and asyncio coroutines.

In your example of the openstack.common.threadgroup API - we would
initially need to add support for scheduling asyncio coroutine callback
arguments as eventlet greenthreads in add_timer() and add_thread(), and
later we would port threadgroup itself to rely completely on asyncio.
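One naive shape for that transitional support might be (purely illustrative;
the real greenio integration details may differ):

  import asyncio

  def adapt_callback(callback):
      # Plain callables pass through untouched; coroutine functions
      # are wrapped so eventlet can schedule them like any callable,
      # driving them on the (greenio-backed) event loop for now.
      if not asyncio.iscoroutinefunction(callback):
          return callback

      def runner(*args, **kwargs):
          loop = asyncio.get_event_loop()
          return loop.run_until_complete(callback(*args, **kwargs))

      return runner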

> > Now, because we're using eventlet, the code that is run in response to
> > these events looks like synchronous code that makes a bunch of
> > synchronous calls. For example, the code might do some_sync_op() and
> > that will cause a context switch to a different greenthread (within the
> > same native thread) where we might handle another I/O event (like a REST
> > API request)
> 
> Just to make the point that most of the agents in the ceilometer
> zoo tend to react to just a single type of stimulus, as opposed
> to a mix of dispatching from both message bus and the REST API.
> 
> So to classify, we'd have:
> 
>  * compute-agent: timer tasks for polling
>  * central-agent: timer tasks for polling
>  * notification-agent: dispatch of "external" notifications from
>the message bus
>  * collector: dispatch of "internal" metering messages from the
>message bus
>  * api-service: dispatch of REST API calls
>  * alarm-evaluator: timer tasks for alarm evaluation
>  * alarm-notifier: dispatch of "internal" alarm notifications
> 
> IIRC, the only case where there's a significant mix of trigger
> styles is the partitioned alarm evaluator, where assignments of
> alarm subsets for evaluation is driven over RPC, whereas the
> actual thresholding is triggered by a timer.

Cool, that's helpful. I think the key thing is deciding which stimulus
(and hence agent) we should start with.

> > Porting from eventlet's implicit async approach to asyncio's explicit
> > async API will be seriously time consuming and we need to be able to do
> > it piece-by-piece.
> 
Yes, I agree, a step-wise approach is the key here.

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Angus Salkeld

On 03/07/14 05:30, Mark McLoughlin wrote:
> Hey
> 
> This is an attempt to summarize a really useful discussion that Victor,
> Flavio and I have been having today. At the bottom are some background
> links - basically what I have open in my browser right now thinking
> through all of this.
> 
> We're attempting to take baby-steps towards moving completely from
> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
> first victim.

Has this been widely agreed on? It seems to me like we are mixing two
issues:
1) we need to move to py3
2) some people want to move from eventlet (I am not convinced that the
   volume of code changes warrants the end goal - and review load)

To achieve "1)" in a lower risk change, shouldn't we rather run eventlet
on top of asyncio? - i.e. not require widespread code changes.

So we can maintain the main loop API but move to py3. I am not sure on
the feasibility, but seems to me like a more contained change.

-Angus

> 
> Ceilometer's code is run in response to various I/O events like REST API
> requests, RPC calls, notifications received, etc. We eventually want the
> asyncio event loop to be what schedules Ceilometer's code in response to
> these events. Right now, it is eventlet doing that.
> 
> Now, because we're using eventlet, the code that is run in response to
> these events looks like synchronous code that makes a bunch of
> synchronous calls. For example, the code might do some_sync_op() and
> that will cause a context switch to a different greenthread (within the
> same native thread) where we might handle another I/O event (like a REST
> API request) while we're waiting for some_sync_op() to return:
> 
>   def foo(self):
>   result = some_sync_op()  # this may yield to another greenlet
>   return do_stuff(result)
> 
> Eventlet's infamous monkey patching is what makes this magic happen.
> 
> When we switch to asyncio's event loop, all of this code needs to be
> ported to asyncio's explicitly asynchronous approach. We might do:
> 
>   @asyncio.coroutine
>   def foo(self):
>   result = yield from some_async_op(...)
>   return do_stuff(result)
> 
> or:
> 
>   @asyncio.coroutine
>   def foo(self):
>   fut = Future()
>   some_async_op(callback=fut.set_result)
>   ...
>   result = yield from fut
>   return do_stuff(result)
> 
> Porting from eventlet's implicit async approach to asyncio's explicit
> async API will be seriously time consuming and we need to be able to do
> it piece-by-piece.
> 
> The question then becomes what do we need to do in order to port a
> single oslo.messaging RPC endpoint method in Ceilometer to asyncio's
> explicit async approach?
> 
> The plan is:
> 
>   - we stick with eventlet; everything gets monkey patched as normal
> 
>   - we register the greenio event loop with asyncio - this means that 
> e.g. when you schedule an asyncio coroutine, greenio runs it in a 
> greenlet using eventlet's event loop
> 
>   - oslo.messaging will need a new variant of eventlet executor which 
> knows how to dispatch an asyncio coroutine. For example:
> 
> while True:
> incoming = self.listener.poll()
> method = dispatcher.get_endpoint_method(incoming)
> if asyncio.iscoroutinefunc(method):
> result = method()
> self._greenpool.spawn_n(incoming.reply, result)
> else:
> self._greenpool.spawn_n(method)
> 
> it's important that even with a coroutine endpoint method, we send 
> the reply in a greenthread so that the dispatch greenthread doesn't
> get blocked if the incoming.reply() call causes a greenlet context
> switch
> 
>   - when all of ceilometer has been ported over to asyncio coroutines, 
> we can stop monkey patching, stop using greenio and switch to the 
> asyncio event loop
> 
>   - when we make this change, we'll want a completely native asyncio 
> oslo.messaging executor. Unless the oslo.messaging drivers support 
> asyncio themselves, that executor will probably need a separate
> native thread to poll for messages and send replies.
> 
> If you're confused, that's normal. We had to take several breaks to get
> even this far because our brains kept getting fried.
> 
> HTH,
> Mark.
> 
> Victor's excellent docs on asyncio and trollius:
> 
>   https://docs.python.org/3/library/asyncio.html
>   http://trollius.readthedocs.org/
> 
> Victor's proposed asyncio executor:
> 
>   https://review.openstack.org/70948
> 
> The case for adopting asyncio in OpenStack:
> 
>   https://wiki.openstack.org/wiki/Oslo/blueprints/asyncio
> 
> A previous email I wrote about an asyncio executor:
> 
>  http://lists.openstack.org/pipermail/openstack-dev/2013-June/009934.html
> 
> The mock-up of an asyncio executor I wrote:
> 
>   
> https://github.com/markmc/oslo-incubator/blob/8509b8b/openstack/common/messaging/_executors/impl_tulip.py
>
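For reference, the executor variant sketched in the quoted plan above could
be fleshed out roughly like this (listener, dispatcher and incoming are
hypothetical stand-ins; note the real predicate is spelled
asyncio.iscoroutinefunction):

  import asyncio
  import eventlet

  class AsyncioEventletExecutor(object):
      def __init__(self, listener, dispatcher):
          self.listener = listener
          self.dispatcher = dispatcher
          self._greenpool = eventlet.GreenPool()

      def run(self):
          while True:
              incoming = self.listener.poll()
              method = self.dispatcher.get_endpoint_method(incoming)
              if asyncio.iscoroutinefunction(method):
                  # greenio runs the coroutine in a greenlet; send the
                  # reply from a separate greenthread so the dispatch
                  # loop is never blocked by a greenlet context switch
                  # in incoming.reply() (error handling omitted).
                  t = asyncio.Task(method(incoming))
                  t.add_done_callback(
                      lambda done, incoming=incoming:
                          self._greenpool.spawn_n(
                              incoming.reply, done.result()))
              else:
                  self._greenpool.spawn_n(method, incoming)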

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Joshua Harlow
Seeing other responses for supporting libraries that aren't asyncio
compatible it doesn't inspire confidence that this path is ready to be
headed down. Partially this is due to the fact that it's a completely new
programming model and a lot of underlying libraries will be forced to
change to accommodate this (sqlalchemy, others listed in [5]...). Do
others feel it's appropriate to start this at this time, or does it feel
premature? Of course we have to start somewhere but I start to wonder if
effort is better spent elsewhere (see above). Along a related question,
seeing that openstack needs to support py2.x and py3.x will this mean that
trollius will be required to be used in 3.x (as it is the least common
denominator, not new syntax like 'yield from' that won't exist in 2.x).
Does this mean that libraries that will now be required to change will be
required to use trollius (the pulsar[6] framework seemed to mesh these two
nicely); is this understood by those authors? Is this the direction we
want to go down (if we stay focused on ensuring py3.x compatible, then why
not just jump to py3.x in the first place)?

Anyways just some things to think about & discuss (from an obviously
workflow-biased[7] viewpoint),

Thoughts?

-Josh

[1] https://github.com/openstack/heat/blob/master/heat/engine/scheduler.py
[2] https://review.openstack.org/#/c/95907/
[3] https://etherpad.openstack.org/p/heat-workflow-vs-convergence
[4] https://wiki.openstack.org/wiki/Governance/CoreDefinition
[5] https://github.com/openstack/requirements/blob/master/global-requirements.txt
[6] http://pythonhosted.org/pulsar/
[7] http://docs.openstack.org/developer/taskflow/


-Original Message-
From: Mark McLoughlin 
Reply-To: "OpenStack Development Mailing List (not for usage questions)"

Date: Thursday, July 3, 2014 at 8:27 AM
To: "openstack-dev@lists.openstack.org" 
Subject: [openstack-dev] [oslo] Asyncio and oslo.messaging

>Hey
>
>This is an attempt to summarize a really useful discussion that Victor,
>Flavio and I have been having today. At the bottom are some background
>links - basically what I have open in my browser right now thinking
>through all of this.
>
>We're attempting to take baby-steps towards moving completely from
>eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
>first victim.
>
>Ceilometer's code is run in response to various I/O events like REST API
>requests, RPC calls, notifications received, etc. We eventually want the
>asyncio event loop to be what schedules Ceilometer's code in response to
>these events. Right now, it is eventlet doing that.
>
>Now, because we're using eventlet, the code that is run in response to
>these events looks like synchronous code that makes a bunch of
>synchronous calls. For example, the code might do some_sync_op() and
>that will cause a context switch to a different greenthread (within the
>same native thread) where we might handle another I/O event (like a REST
>API request) while we're waiting for some_sync_op() to return:
>
>  def foo(self):
>  result = some_sync_op()  # this may yield to another greenlet
>  return do_stuff(result)
>
>Eventlet's infamous monkey patching is what makes this magic happen.
>
>When we switch to asyncio's event loop, all of this code needs to be
>ported to asyncio's explicitly asynchronous approach. We might do:
>
>  @asyncio.coroutine
>  def foo(self):
>  result = yield from some_async_op(...)
>  return do_stuff(result)
>
>or:
>
>  @asyncio.coroutine
>  def foo(self):
>  fut = Future()
>  some_async_op(callback=fut.set_result)
>  ...
>  result = yield from fut
>  return do_stuff(result)
>
>Porting from eventlet's implicit async approach to asyncio's explicit
>async API will be seriously time consuming and we need to be able to do
>it piece-by-piece.
>
>The question then becomes what do we need to do in order to port a
>single oslo.messaging RPC endpoint method in Ceilometer to asyncio's
>explicit async approach?
>
>The plan is:
>
>  - we stick with eventlet; everything gets monkey patched as normal
>
>  - we register the greenio event loop with asyncio - this means that
>e.g. when you schedule an asyncio coroutine, greenio runs it in a
>greenlet using eventlet's event loop
>
>  - oslo.messaging will need a new variant of eventlet executor which
>knows how to dispatch an asyncio coroutine. For example:
>
>    while True:
>        incoming = self.listener.poll()
>        method = dispatcher.get_endpoint_method(incoming)
>        if asyncio.iscoroutinefunction(method):
>            result = method()
>            self._greenpool.spawn_n(incoming.reply, result)
>        else:
>            self._greenpool.spawn_n(method)
>
>it's important that even with a coroutine endpoint method, we send
>the reply in a greenthread so that the dispatch greenthread doesn't
>get blocked if the incoming.reply() call causes a greenlet context
>switch

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Nikola Đipanov
On 07/07/2014 02:58 PM, Victor Stinner wrote:
> Hi,
> 
> Le lundi 7 juillet 2014, 12:48:59 Nikola Đipanov a écrit :
>> When I read all of this stuff and got my head around it (took some time
>> :) ), a glaring drawback of such an approach, and as I mentioned on the
>> spec proposing it [1] is that we would not really be doing asyncio, we
>> would just be pretending we are by using a subset of its APIs, and
>> having all of the really important stuff for overall design of the code
>> (code that needs to do IO in the callbacks for example) and ultimately -
>> performance, completely unavailable to us when porting.
> 
> The global plan is to:
> 
> 1. use asyncio API
> 2. detect code relying on implicit scheduling and patch it to use explicit 
> scheduling (use the coroutine syntax with yield)
> 3. "just" change the event loop from greenio to a classic "select" event loop 
> (select, poll, epoll, kqueue, etc.) of Trollius
> 
> I see asyncio as an API: it doesn't really matter which event loop is used, 
> but I want to get rid of eventlet :-)
> 

Well this is kind of a misrepresentation since with how greenio is
proposed now in the spec, we are not actually running the asyncio
eventloop, we are running the eventlet eventloop (that uses greenlet API
to switch green threads). More precisely - we will only run the
asyncio/trollius BaseEventLoop._run_once method in a green thread that
is scheduled by eventlet hub as any other.

Correct me if I'm wrong there, it's not exactly straightforward :)

And asyncio may be just an API, but it is a lower level and
fundamentally different API than what we deal with when running with
eventlet, so we can't just pretend we are not missing the code that
bridges this gap, since that's where the real 'meat' of the porting
effort lies, IMHO.

>> So in Mark's example above:
>>
>>   @asyncio.coroutine
>>   def foo(self):
>>       result = yield from some_async_op(...)
>>       return do_stuff(result)
>>
>> A developer would not need to do anything that asyncio requires like
>> make sure that some_async_op() registers a callback with the eventloop
>> (...)
> 
> It's not possible to break the world right now, some people will complain :-)
> 
> The idea is to have a smooth transition. We will write tools to detect 
> implicit scheduling and fix code. I don't know the best option for that right 
> now (monkey-patch eventlet, greenio or trollius?).
> 
>> So I hacked up together a small POC of a different approach. In short -
>> we actually use a real asyncio selector eventloop in a separate thread,
>> and dispatch stuff to it when we figure out that our callback is in fact
>> a coroutine.
> 
> See my previous attempt: the asyncio executor runs the asyncio event loop in 
> a dedicated thread:
> https://review.openstack.org/#/c/70948/
> 

Yes I spent a good chunk of time looking at that patch, that's where I
got some ideas for my attempt at it
(https://github.com/djipko/eventlet-asyncio). I left some comments there
but forgot to post them (fixed now).

The bit you miss is how to actually communicate back the result of the
dispatched methods.

> I'm not sure that it's possible to use it in OpenStack right now because the 
> whole Python standard library is monkey patched, including the threading 
> module.
> 

Like I said on the review - we unpatch threading in the libvirt driver
in Nova for example, so it's not like it's beyond us :), and eventlet
gives you relatively good APIs for dealing with what gets patched and
when - so greening a single endpoint and a listener is very much
feasible I would say - and this is what we would need to have the
'separation between the worlds' (so to speak :) ).
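
(For the record, a minimal sketch of that unpatching trick - eventlet
keeps references to the original modules, so code can get at a real OS
thread even inside a fully monkey-patched process; spawn_native is a
made-up helper name:)

  import eventlet
  eventlet.monkey_patch()

  from eventlet import patcher

  # patcher.original() returns the pristine, unpatched stdlib module
  native_threading = patcher.original('threading')

  def spawn_native(target, *args):
      # a real OS thread, invisible to the green scheduler
      t = native_threading.Thread(target=target, args=args)
      t.daemon = True
      t.start()
      return t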

> The issue is also to switch the control flow between the event loop thread
> and the main thread. There is no explicit event loop in the main thread.
> The most obvious solution for that is to schedule tasks using eventlet...
> 
> That's exactly the purpose of greenio: glue between asyncio and greenlet.
> And using greenio, there is no need of running a new event loop in a
> thread, which makes the code simpler.
> 
>> (..) we would probably not be 'greening the world' but rather
>> importing patched
>> non-ported modules when we need to dispatch to them. This may sound like
>> a big deal, and it is, but it is critical to actually running ported
>> code in a real asyncio evenloop.
> 
> It will probably require a lot of work to get rid of eventlet. The greenio
> approach is more realistic because projects can be patched one by one,
> file by file. The goal is also to run projects unmodified with the
> greenio executor.
> 

All of this would be true with the other approach as well.

>> Another interesting problem is (as I have briefly mentioned in [1]) -
>> what happens when we need to synchronize between eventlet-run and
>> asyncio-run callbacks while we are in the process of porting.
> 
> Such issue is solved by greenio. As I wrote, it's not a good idea to have two 
> event loops in the same process.

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Mike Bayer

On 7/4/14, 4:45 AM, Julien Danjou wrote:
> On Thu, Jul 03 2014, Mark McLoughlin wrote:
>
>> We're attempting to take baby-steps towards moving completely from
>> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
>> first victim.
> Thumbs up for the plan, that sounds like a good approach from what I
> got. I just think there's a lot of things that are going to be
> synchronous anyway because not everything provide a asynchronous
> alternative (i.e. SQLAlchemy or requests don't yet AFAIK). It doesn't
> worry me much as there nothing we can do on our side, except encourage
> people to stop writing synchronous API¹.
>
> And big +1 for using Ceilometer as a test bed. :)
Allowing SQLAlchemy to be fully compatible with an explicit async
programming approach (which, note, is distinctly different from allowing
SQLAlchemy to run efficiently within an application that uses explicit
async) has been studied, and as of yet it does not seem possible without
ruining the performance of the library (including Core-only),
invalidating the ORM entirely, and of course doing a rewrite of almost
the whole thing (see
http://stackoverflow.com/questions/16491564/how-to-make-sqlalchemy-in-tornado-to-be-async/16503103#16503103,
http://python-notes.curiousefficiency.org/en/latest/pep_ideas/async_programming.html#gevent-and-pep-3156).


But before you even look at database abstraction layers, you need a
database driver.  What's the explicitly async-compatible driver for
MySQL?  Googling around I found
https://github.com/eliast/async-MySQL-python, but not much else.  Note
that for explicit async, a driver that allows monkeypatching is no
longer enough.  You need an API like psycopg2's asynchronous support:
http://initd.org/psycopg/docs/advanced.html#async-support.  Note that
psycopg2's API is entirely an extension to the Python DBAPI:
http://legacy.python.org/dev/peps/pep-0249/.  So an all-explicit async
approach necessitates throwing this out as well; as an alternative,
here is Twisted's adbapi extension to pep-249's API:
https://twistedmatrix.com/documents/current/core/howto/rdbms.html.  I'm
not sure if Twisted provides an explicit async API for MySQL.
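
(For reference, psycopg2's extension looks roughly like this; the wait
loop below is adapted straight from psycopg2's async documentation:)

  import select
  import psycopg2
  import psycopg2.extensions

  def wait(conn):
      # drive the connection forward until the pending operation completes
      while True:
          state = conn.poll()
          if state == psycopg2.extensions.POLL_OK:
              break
          elif state == psycopg2.extensions.POLL_WRITE:
              select.select([], [conn.fileno()], [])
          elif state == psycopg2.extensions.POLL_READ:
              select.select([conn.fileno()], [], [])
          else:
              raise psycopg2.OperationalError("poll() returned %s" % state)

  conn = psycopg2.connect("dbname=test", async=1)  # asynchronous connection
  wait(conn)
  cur = conn.cursor()
  cur.execute("SELECT 1")
  wait(cur.connection)
  print(cur.fetchall())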

If you are writing an application that runs in an explicit, or even an
implicitly async system, and your database driver isn't compatible with
that, your application will perform terribly - because you've given up
regular old threads, and your app now serializes most of what it does
through a single, blocking pipe.  That is the current status of all
OpenStack apps that rely heavily on MySQLdb and eventlet at the same
time.  Explicitly asyncing it will help in that we won't get
hard-to-predict context switches that deadlock against the DB driver
(also solvable just by using an appropriate patchable driver), but it
won't help performance until that is solved.

Nick's post points the way towards letting everyone have what they
want - which is that once we get a MySQL database adapter that is
implicit-async-patch-capable, the explicit async parts of OpenStack call
into database routines that use implicit async via a gevent-like
approach.  That way SQLAlchemy's source code does not have to multiply
its function call count by an order of magnitude, or be rewritten, and
the ORM-like features that folks like to complain about as they continue
to use them like crazy (e.g. lazy loading) can remain intact.

If we are in fact considering going down this latest rabbit hole, which
claims that program code cannot possibly be efficient or trusted unless
all blocking operations are written literally by humans, yielding all
the way down to the system calls, I would ask that we make a concerted
effort to face just exactly what that means we'd be giving up.  Because
the cost of how much has to be thrown away may be considerably higher
than people might realize.  For those parts of an app that make sense
as explicit async, we should be doing so.  However, we should ensure
that those code sections more appropriate as implicit async remain
first-class citizens as well.


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Gordon Sim

On 07/07/2014 03:12 PM, Victor Stinner wrote:

The first step is to patch endpoints to add @trollius.coroutine to the methods,
and add yield From(...) on asynchronous tasks.


What are the 'endpoints' here? Are these internal to the oslo.messaging 
library, or external to it?



Later we may modify Oslo Messaging to be able to call an RPC method
asynchronously, a method which would return a Trollius coroutine or task
directly. The problem is that Oslo Messaging currently hides "implementation"
details like eventlet.


I guess my question is how effectively does it hide it? If the answer to 
the above is that this change can be contained within the oslo.messaging 
implementation itself, then that would suggest it's hidden reasonably well.


If, as I first understood (perhaps wrongly) it required changes to every 
use of the oslo.messaging API, then it wouldn't really be hidden.



Returning a Trollius object means that Oslo Messaging
will use Trollius explicitly. I'm not sure that OpenStack is ready for that
today.


The oslo.messaging API could evolve/expand to include explicitly 
asynchronous methods that did not directly expose Trollius.






Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Victor Stinner
Le lundi 7 juillet 2014, 11:26:27 Gordon Sim a écrit :
> > When we switch to asyncio's event loop, all of this code needs to be
> > ported to asyncio's explicitly asynchronous approach. We might do:
> >    @asyncio.coroutine
> >    def foo(self):
> >        result = yield from some_async_op(...)
> >        return do_stuff(result)
> > 
> > or:
> >    @asyncio.coroutine
> >    def foo(self):
> >        fut = Future()
> >        some_async_op(callback=fut.set_result)
> >        ...
> >        result = yield from fut
> >        return do_stuff(result)
> > 
> > Porting from eventlet's implicit async approach to asyncio's explicit
> > async API will be seriously time consuming and we need to be able to do
> > it piece-by-piece.
> 
> Am I right in saying that this implies a change to the effective API for
> oslo.messaging[1]? I.e. every invocation on the library, e.g. a call or
> a cast, will need to be changed to be explicitly asynchronous?
>
> [1] Not necessarily a change to the signature of functions, but a change
> to the manner in which they are invoked.

The first step is to patch endpoints to add @trollius.coroutine to the methods, 
and add yield From(...) on asynchronous tasks.

Later we may modify Oslo Messaging to be able to call an RPC method 
asynchronously, a method which would return a Trollius coroutine or task 
directly. The problem is that Oslo Messaging currently hides "implementation" 
details like eventlet. Returning a Trollius object means that Oslo Messaging 
will use Trollius explicitly. I'm not sure that OpenStack is ready for that 
today.
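
(To illustrate that first step, a before/after sketch of a single
endpoint method - the names here are invented for the example, not real
Ceilometer code:)

  # before: implicit scheduling under eventlet
  def record_metering_data(self, ctxt, data):
      self.storage.record(data)  # may switch greenthreads during I/O

  # after: explicit scheduling with Trollius
  import trollius
  from trollius import From

  @trollius.coroutine
  def record_metering_data(self, ctxt, data):
      # record_async is a hypothetical coroutine-returning variant
      yield From(self.storage.record_async(data))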

Victor



Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Victor Stinner
Hi,

Le lundi 7 juillet 2014, 12:48:59 Nikola Đipanov a écrit :
> When I read all of this stuff and got my head around it (took some time
> :) ), a glaring drawback of such an approach, and as I mentioned on the
> spec proposing it [1] is that we would not really be doing asyncio, we
> would just be pretending we are by using a subset of its APIs, and
> having all of the really important stuff for overall design of the code
> (code that needs to do IO in the callbacks for example) and ultimately -
> performance, completely unavailable to us when porting.

The global plan is to:

1. use asyncio API
2. detect code relying on implicit scheduling and patch it to use explicit 
scheduling (use the coroutine syntax with yield)
3. "just" change the event loop from greenio to a classic "select" event loop 
(select, poll, epoll, kqueue, etc.) of Trollius

I see asyncio as an API: it doesn't really matter which event loop is used, 
but I want to get rid of eventlet :-)

> So in Mark's example above:
> 
>   @asyncio.coroutine
>   def foo(self):
>       result = yield from some_async_op(...)
>       return do_stuff(result)
> 
> A developer would not need to do anything that asyncio requires like
> make sure that some_async_op() registers a callback with the eventloop
> (...)

It's not possible to break the world right now, some people will complain :-)

The idea is to have a smooth transition. We will write tools to detect 
implicit scheduling and fix code. I don't know the best option for that right 
now (monkey-patch eventlet, greenio or trollius?).

> So I hacked up together a small POC of a different approach. In short -
> we actually use a real asyncio selector eventloop in a separate thread,
> and dispatch stuff to it when we figure out that our callback is in fact
> a coroutine.

See my previous attempt: the asyncio executor runs the asyncio event loop in 
a dedicated thread:
https://review.openstack.org/#/c/70948/

I'm not sure that it's possible to use it in OpenStack right now because the 
whole Python standard library is monkey patched, including the threading 
module.

The issue is also to switch the control flow between the event loop thread and 
the main thread. There is no explicit event loop in the main thread. The most 
obvious solution for that is to schedule tasks using eventlet...

That's exactly the purpose of greenio: glue between asyncio and greenlet. And 
using greenio, there is no need of running a new event loop in a thread, which 
makes the code simpler.

> (..) we would probably not be 'greening the world' but rather
> importing patched
> non-ported modules when we need to dispatch to them. This may sound like
> a big deal, and it is, but it is critical to actually running ported
> code in a real asyncio evenloop.

It will probably require a lot of work to get rid of eventlet. The greenio 
approach is more realistic because projects can be patched one by one,
file by file. The goal is also to run projects unmodified with the greenio 
executor.

> Another interesting problem is (as I have briefly mentioned in [1]) -
> what happens when we need to synchronize between eventlet-run and
> asyncio-run callbacks while we are in the process of porting.

Such issue is solved by greenio. As I wrote, it's not a good idea to have two 
event loops in the same process.

Victor



Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Nikola Đipanov
On 07/03/2014 05:27 PM, Mark McLoughlin wrote:
> Hey
> 
> This is an attempt to summarize a really useful discussion that Victor,
> Flavio and I have been having today. At the bottom are some background
> links - basically what I have open in my browser right now thinking
> through all of this.
> 
> We're attempting to take baby-steps towards moving completely from
> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
> first victim.
>
> 
>
> When we switch to asyncio's event loop, all of this code needs to be
> ported to asyncio's explicitly asynchronous approach. We might do:
> 
>   @asyncio.coroutine
>   def foo(self):
>       result = yield from some_async_op(...)
>       return do_stuff(result)
> 
> or:
> 
>   @asyncio.coroutine
>   def foo(self):
>       fut = Future()
>       some_async_op(callback=fut.set_result)
>       ...
>       result = yield from fut
>       return do_stuff(result)
> 
> Porting from eventlet's implicit async approach to asyncio's explicit
> async API will be seriously time consuming and we need to be able to do
> it piece-by-piece.
> 
> The question then becomes what do we need to do in order to port a
> single oslo.messaging RPC endpoint method in Ceilometer to asyncio's
> explicit async approach?
> 
> The plan is:
> 
>   - we stick with eventlet; everything gets monkey patched as normal
> 
>   - we register the greenio event loop with asyncio - this means that 
> e.g. when you schedule an asyncio coroutine, greenio runs it in a 
> greenlet using eventlet's event loop
> 
>   - oslo.messaging will need a new variant of eventlet executor which 
> knows how to dispatch an asyncio coroutine. For example:
> 
>     while True:
>         incoming = self.listener.poll()
>         method = dispatcher.get_endpoint_method(incoming)
>         if asyncio.iscoroutinefunction(method):
>             result = method()
>             self._greenpool.spawn_n(incoming.reply, result)
>         else:
>             self._greenpool.spawn_n(method)
> 
> it's important that even with a coroutine endpoint method, we send 
> the reply in a greenthread so that the dispatch greenthread doesn't
> get blocked if the incoming.reply() call causes a greenlet context
> switch
> 
>   - when all of ceilometer has been ported over to asyncio coroutines, 
> we can stop monkey patching, stop using greenio and switch to the 
> asyncio event loop
> 
>   - when we make this change, we'll want a completely native asyncio 
> oslo.messaging executor. Unless the oslo.messaging drivers support 
> asyncio themselves, that executor will probably need a separate
> native thread to poll for messages and send replies.
> 
> If you're confused, that's normal. We had to take several breaks to get
> even this far because our brains kept getting fried.
> 

Thanks Mark for putting this all together in an approachable way. This
is really interesting work, and I wish I found out about all of this
sooner :).

When I read all of this stuff and got my head around it (took some time
:) ), a glaring drawback of such an approach, and as I mentioned on the
spec proposing it [1] is that we would not really be doing asyncio, we
would just be pretending we are by using a subset of its APIs, and
having all of the really important stuff for overall design of the code
(code that needs to do IO in the callbacks for example) and ultimately -
performance, completely unavailable to us when porting.

So in Mark's example above:

  @asyncio.coroutine
  def foo(self):
      result = yield from some_async_op(...)
      return do_stuff(result)

A developer would not need to do anything that asyncio requires like
make sure that some_async_op() registers a callback with the eventloop
(using for example event_loop.add_reader/writer methods) you could just
simply make it use a 'greened' call and things would continue working
happily. I have a feeling this will in turn have a lot of people writing
code that they don't understand, and as library writers - we are not
doing an excellent job at that point.

Now porting an OpenStack project to another IO library with a completely
different design is a huge job and there is unlikely to be a single 'right'
way to do it, so treat this as a discussion starter, that will hopefully
give us a better understanding of the problem we are trying to tackle.

So I hacked up together a small POC of a different approach. In short -
we actually use a real asyncio selector eventloop in a separate thread,
and dispatch stuff to it when we figure out that our callback is in fact
a coroutine. More will be clear from the code, so:

(Warning - hacky code ahead): [2]

I will probably be updating it - but if you just clone the repo, all the
history is there. I wrote it without the oslo.messaging abstractions
like listener and dispatcher, but it is relatively easy to see which
bits of code would go in those.
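
(The heart of the idea, condensed - treat this as pseudocode for the
repo above rather than a faithful excerpt: a real selector loop runs in
a native thread, and anything that turns out to be a coroutine is
shipped over to it, with a concurrent future carrying the result back
to the eventlet side:)

  import asyncio
  import threading
  from concurrent import futures

  loop = asyncio.new_event_loop()
  loop_thread = threading.Thread(target=loop.run_forever)
  loop_thread.daemon = True
  loop_thread.start()

  def dispatch(callback, *args):
      if not asyncio.iscoroutinefunction(callback):
          return callback(*args)  # plain callback: run on the eventlet side
      done = futures.Future()
      coro = callback(*args)
      def _schedule():
          task = asyncio.async(coro, loop=loop)
          # error handling elided; a failure would need done.set_exception()
          task.add_done_callback(lambda t: done.set_result(t.result()))
      loop.call_soon_threadsafe(_schedule)
      return done  # caller decides how to wait without blocking the hub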

Several things worth noting as you read the above. First

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Gordon Sim

On 07/03/2014 04:27 PM, Mark McLoughlin wrote:

Ceilometer's code is run in response to various I/O events like REST API
requests, RPC calls, notifications received, etc. We eventually want the
asyncio event loop to be what schedules Ceilometer's code in response to
these events. Right now, it is eventlet doing that.

Now, because we're using eventlet, the code that is run in response to
these events looks like synchronous code that makes a bunch of
synchronous calls. For example, the code might do some_sync_op() and
that will cause a context switch to a different greenthread (within the
same native thread) where we might handle another I/O event (like a REST
API request) while we're waiting for some_sync_op() to return:

   def foo(self):
   result = some_sync_op()  # this may yield to another greenlet
   return do_stuff(result)

Eventlet's infamous monkey patching is what makes this magic happen.

When we switch to asyncio's event loop, all of this code needs to be
ported to asyncio's explicitly asynchronous approach. We might do:

   @asyncio.coroutine
   def foo(self):
   result = yield from some_async_op(...)
   return do_stuff(result)

or:

   @asyncio.coroutine
   def foo(self):
   fut = Future()
   some_async_op(callback=fut.set_result)
   ...
   result = yield from fut
   return do_stuff(result)

Porting from eventlet's implicit async approach to asyncio's explicit
async API will be seriously time consuming and we need to be able to do
it piece-by-piece.


Am I right in saying that this implies a change to the effective API for 
oslo.messaging[1]? I.e. every invocation on the library, e.g. a call or 
a cast, will need to be changed to be explicitly asynchronous?


[1] Not necessarily a change to the signature of functions, but a change 
to the manner in which they are invoked.






Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-06 Thread Eoghan Glynn


> This is an attempt to summarize a really useful discussion that Victor,
> Flavio and I have been having today. At the bottom are some background
> links - basically what I have open in my browser right now thinking
> through all of this.

Thanks for the detailed summary, it puts more flesh on the bones
than a brief conversation on the fringes of the Paris mid-cycle.

Just a few clarifications and suggestions inline to add into the
mix.

> We're attempting to take baby-steps towards moving completely from
> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
> first victim.

First beneficiary, I hope :)
 
> Ceilometer's code is run in response to various I/O events like REST API
> requests, RPC calls, notifications received, etc. We eventually want the
> asyncio event loop to be what schedules Ceilometer's code in response to
> these events. Right now, it is eventlet doing that.

Yes.

And there is one other class of stimulus, also related to eventlet,
that is very important for triggering the execution of ceilometer
logic. That would be the timed tasks that drive polling of:

 * REST APIs provided by other openstack services 
 * the local hypervisor running on each compute node
 * the SNMP daemons running at host-level etc.

and also trigger periodic alarm evaluation.

IIUC these tasks are all mediated via the oslo threadgroup's
usage of eventlet.greenpool[1]. Would this logic also be replaced
as part of this effort?
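
(Concretely, the pattern in question boils down to something like this,
sketched with bare eventlet rather than the threadgroup wrapper;
poll_hypervisor is an illustrative stand-in:)

  import eventlet

  def periodic(interval, poll_fn):
      # what the threadgroup timer amounts to: a greenthread that polls,
      # then sleeps (yielding to the hub) for 'interval' seconds
      while True:
          poll_fn()
          eventlet.sleep(interval)

  eventlet.spawn_n(periodic, 600, poll_hypervisor)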

> Now, because we're using eventlet, the code that is run in response to
> these events looks like synchronous code that makes a bunch of
> synchronous calls. For example, the code might do some_sync_op() and
> that will cause a context switch to a different greenthread (within the
> same native thread) where we might handle another I/O event (like a REST
> API request)

Just to make the point that most of the agents in the ceilometer
zoo tend to react to just a single type of stimulus, as opposed
to a mix of dispatching from both the message bus and the REST API.

So to classify, we'd have:

 * compute-agent: timer tasks for polling
 * central-agent: timer tasks for polling
 * notification-agent: dispatch of "external" notifications from
   the message bus
 * collector: dispatch of "internal" metering messages from the
   message bus
 * api-service: dispatch of REST API calls
 * alarm-evaluator: timer tasks for alarm evaluation
 * alarm-notifier: dispatch of "internal" alarm notifications

IIRC, the only case where there's a significant mix of trigger
styles is the partitioned alarm evaluator, where assignment of
alarm subsets for evaluation is driven over RPC, whereas the
actual thresholding is triggered by a timer.

> Porting from eventlet's implicit async approach to asyncio's explicit
> async API will be seriously time consuming and we need to be able to do
> it piece-by-piece.

Yes, I agree, a step-wise approach is the key here.

So I'd love to have some sense of the time horizon for this
effort. It clearly feels like a multi-cycle effort, so the main
question in my mind right now is whether we should be targeting
the first deliverables for juno-3?

That would provide a proof-point in advance of the K* summit,
where I presume the task would be get wider buy-in for the idea.

If it makes sense to go ahead and aim the first baby steps for
juno-3, then we'd need to have a ceilometer-spec detailing these
changes. This would need to be proposed by say EoW and then
landed before the spec acceptance deadline for juno (~July 21st).

We could use this spec proposal to dig into the perceived benefits
of this effort:

 * the obvious win around getting rid of the eventlet black-magic
 * plus possibly other benefits such as code clarity and ease of
   maintenance

and OTOH get a heads-up on the risks:

 * possible immaturity in the new framework?
 * overhead involved in contributors getting to grips with the
   new coroutine model

> The question then becomes what do we need to do in order to port a
> single oslo.messaging RPC endpoint method in Ceilometer to asyncio's
> explicit async approach?

One approach would be to select one well-defined area of ceilometer
as an initial test-bed for these ideas.

And one potential candidate for that would be the partitioned alarm
evaluator, which uses:

 1. fan-out RPC for the heartbeats underpinning master-slave
coordination
 2. RPC calls for alarm allocations and assignments

I spoke to Cyril Roelandt at the mid-cycle, who is interested in:

 * replacing #1 with the tooz distributed co-ordination library[2]
 * and also possibly replacing #2 with taskflow

The benefit of using taskflow for "sticky" task assignments isn't
100% clear, so it may actually make better sense to just use tooz
for the leadership election, and the new asyncio model for #2.
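
(A loose sketch of what replacing #1 with tooz might look like - I'm
going from the tooz docs here, so treat the details as illustrative:)

  from tooz import coordination

  coord = coordination.get_coordinator('memcached://127.0.0.1:11211',
                                       b'alarm-evaluator-1')
  coord.start()
  coord.create_group(b'alarm-evaluators').get()
  coord.join_group(b'alarm-evaluators').get()

  # called periodically, in place of the hand-rolled fan-out RPC heartbeats
  coord.heartbeat()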

Starting there would have the advantage of being out on the side
of the main ceilometer pipeline.

However, if we do decide to go ahead with taskflow, then we could
find another good starting point there.

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-04 Thread victor stinner
Hi,

I promised a status of my work on Trollius and greenio to Mark, but it's not 
easy to summarize because there are still a few pending patches to implement 
the final greenio executor. There are different parts: asyncio, Trollius, 
greenio, Oslo Messaging.


The design of asyncio is PEP 3156 (*), which was accepted and implemented in 
Python 3.4, released 4 months ago. After the release of Python 3.4, many bugs 
were fixed in asyncio. The API is stable; it didn't change (and it cannot 
change, because backward compatibility matters in Python, even if the module 
is still tagged as "provisional" in Python 3.4).

   http://legacy.python.org/dev/peps/pep-3156/


Since January, I have regularly released new versions of Trollius. The 
Trollius API is the same as the asyncio API, except for the syntax of 
coroutines:

   http://trollius.readthedocs.org/#differences-between-trollius-and-tulip

The next Trollius release will probably be version 1.0, because I consider 
that the API is now stable. The last incompatible changes were made to make 
Trollius look closer to asyncio, and to ease the transition from Trollius to 
asyncio. I also renamed the module from "asyncio" to "trollius" to support 
Python 3.4 (which already has an "asyncio" module in the standard library) and 
to make it more explicit that Trollius coroutines are different from asyncio 
coroutines.


The greenio project was written for asyncio and it is available on PyPI. 
greenio only supports a few features of asyncio; in short, it only supports 
executing coroutines. But we only need this feature in Oslo Messaging. I sent a 
pull request to port greenio to Trollius:

   https://github.com/1st1/greenio/pull/5/

The pull request requires a new "task factory": I sent a patch to asyncio for 
that.


For Oslo Messaging, my change to poll with a timeout has been merged. (I just 
sent a fix because my change didn't work with RabbitMQ.) I will work on the 
greenio executor when the other pending patches are merged. We talked with 
Mark about this greenio executor. It will be based on the eventlet executor, 
with a few lines to support Trollius coroutines. We also have to modify the 
notifier to support passing an optional "execute" function which executes the 
endpoint function, which may be a coroutine. According to Mark, this change is 
short and acceptable in Oslo Messaging: thanks to the "execute" function, it 
will be possible to restrict code using greenio to the greenio executor (no 
need to put greenio or trollius everywhere in Oslo Messaging).
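
(One possible shape for that "execute" hook - purely illustrative, since
the patches aren't merged yet:)

  import trollius

  def greenio_execute(endpoint_method, *args, **kwargs):
      # called by the executor from a greenthread; if the endpoint is a
      # Trollius coroutine, run it to completion on the (greenio) event
      # loop, otherwise just call it as before
      result = endpoint_method(*args, **kwargs)
      if trollius.iscoroutine(result):
          result = trollius.get_event_loop().run_until_complete(result)
      return result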


I listed a lot of projects and pending patches, but I expect that all pieces of 
the puzzle will be done before the end of the month. We are very close to 
having a working greenio executor in Oslo Messaging ;-)


Victor


- Mail original -
> De: "Mark McLoughlin" 
> À: openstack-dev@lists.openstack.org
> Envoyé: Jeudi 3 Juillet 2014 17:27:58
> Objet: [openstack-dev] [oslo] Asyncio and oslo.messaging
> 
> Hey
> 
> This is an attempt to summarize a really useful discussion that Victor,
> Flavio and I have been having today. At the bottom are some background
> links - basically what I have open in my browser right now thinking
> through all of this.
> 
> We're attempting to take baby-steps towards moving completely from
> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
> first victim.
> 
> Ceilometer's code is run in response to various I/O events like REST API
> requests, RPC calls, notifications received, etc. We eventually want the
> asyncio event loop to be what schedules Ceilometer's code in response to
> these events. Right now, it is eventlet doing that.
> 
> Now, because we're using eventlet, the code that is run in response to
> these events looks like synchronous code that makes a bunch of
> synchronous calls. For example, the code might do some_sync_op() and
> that will cause a context switch to a different greenthread (within the
> same native thread) where we might handle another I/O event (like a REST
> API request) while we're waiting for some_sync_op() to return:
> 
>   def foo(self):
>       result = some_sync_op()  # this may yield to another greenlet
>       return do_stuff(result)
> 
> Eventlet's infamous monkey patching is what makes this magic happen.
> 
> When we switch to asyncio's event loop, all of this code needs to be
> ported to asyncio's explicitly asynchronous approach. We might do:
> 
>   @asyncio.coroutine
>   def foo(self):
>       result = yield from some_async_op(...)
>       return do_stuff(result)
> 
> or:
> 
>   @asyncio.coroutine
>   def foo(self):
>       fut = Future()
>       some_async_op(callback=fut.set_result)
>       ...
>       result = yield from fut
>       return do_stuff(result)

Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-04 Thread Julien Danjou
On Thu, Jul 03 2014, Mark McLoughlin wrote:

> We're attempting to take baby-steps towards moving completely from
> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
> first victim.

Thumbs up for the plan, that sounds like a good approach from what I
got. I just think there's a lot of things that are going to be
synchronous anyway because not everything provides an asynchronous
alternative (i.e. SQLAlchemy or requests don't yet AFAIK). It doesn't
worry me much as there's nothing we can do on our side, except encourage
people to stop writing synchronous APIs¹.

And big +1 for using Ceilometer as a test bed. :)


¹  I'm sure you're familiar with Xlib vs XCB in this regard ;)

-- 
Julien Danjou
;; Free Software hacker
;; http://julien.danjou.info




Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-03 Thread Doug Hellmann
On Thu, Jul 3, 2014 at 11:27 AM, Mark McLoughlin  wrote:
> Hey
>
> This is an attempt to summarize a really useful discussion that Victor,
> Flavio and I have been having today. At the bottom are some background
> links - basically what I have open in my browser right now thinking
> through all of this.
>
> We're attempting to take baby-steps towards moving completely from
> eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
> first victim.
>
> Ceilometer's code is run in response to various I/O events like REST API
> requests, RPC calls, notifications received, etc. We eventually want the
> asyncio event loop to be what schedules Ceilometer's code in response to
> these events. Right now, it is eventlet doing that.
>
> Now, because we're using eventlet, the code that is run in response to
> these events looks like synchronous code that makes a bunch of
> synchronous calls. For example, the code might do some_sync_op() and
> that will cause a context switch to a different greenthread (within the
> same native thread) where we might handle another I/O event (like a REST
> API request) while we're waiting for some_sync_op() to return:
>
>   def foo(self):
>       result = some_sync_op()  # this may yield to another greenlet
>       return do_stuff(result)
>
> Eventlet's infamous monkey patching is what makes this magic happen.
>
> When we switch to asyncio's event loop, all of this code needs to be
> ported to asyncio's explicitly asynchronous approach. We might do:
>
>   @asyncio.coroutine
>   def foo(self):
>       result = yield from some_async_op(...)
>       return do_stuff(result)
>
> or:
>
>   @asyncio.coroutine
>   def foo(self):
>       fut = Future()
>       some_async_op(callback=fut.set_result)
>       ...
>       result = yield from fut
>       return do_stuff(result)
>
> Porting from eventlet's implicit async approach to asyncio's explicit
> async API will be seriously time consuming and we need to be able to do
> it piece-by-piece.
>
> The question then becomes what do we need to do in order to port a
> single oslo.messaging RPC endpoint method in Ceilometer to asyncio's
> explicit async approach?
>
> The plan is:
>
>   - we stick with eventlet; everything gets monkey patched as normal
>
>   - we register the greenio event loop with asyncio - this means that
> e.g. when you schedule an asyncio coroutine, greenio runs it in a
> greenlet using eventlet's event loop
>
>   - oslo.messaging will need a new variant of eventlet executor which
> knows how to dispatch an asyncio coroutine. For example:
>
>     while True:
>         incoming = self.listener.poll()
>         method = dispatcher.get_endpoint_method(incoming)
>         if asyncio.iscoroutinefunction(method):
>             result = method()
>             self._greenpool.spawn_n(incoming.reply, result)
>         else:
>             self._greenpool.spawn_n(method)
>
> it's important that even with a coroutine endpoint method, we send
> the reply in a greenthread so that the dispatch greenthread doesn't
> get blocked if the incoming.reply() call causes a greenlet context
> switch
>
>   - when all of ceilometer has been ported over to asyncio coroutines,
> we can stop monkey patching, stop using greenio and switch to the
> asyncio event loop
>
>   - when we make this change, we'll want a completely native asyncio
> oslo.messaging executor. Unless the oslo.messaging drivers support
> asyncio themselves, that executor will probably need a separate
> native thread to poll for messages and send replies.

We tried to keep eventlet out of the drivers. Does it make sense to do
the same for asyncio?

Does this change have any effect on the WSGI services, and the WSGI
container servers we can use to host them?

> If you're confused, that's normal. We had to take several breaks to get
> even this far because our brains kept getting fried.

I won't claim to understand all of the nuances, but it seems like a
good way to stage the changes. Thanks to everyone involved for working
it out!

>
> HTH,
> Mark.
>
> Victor's excellent docs on asyncio and trollius:
>
>   https://docs.python.org/3/library/asyncio.html
>   http://trollius.readthedocs.org/
>
> Victor's proposed asyncio executor:
>
>   https://review.openstack.org/70948
>
> The case for adopting asyncio in OpenStack:
>
>   https://wiki.openstack.org/wiki/Oslo/blueprints/asyncio
>
> A previous email I wrote about an asyncio executor:
>
>  http://lists.openstack.org/pipermail/openstack-dev/2013-June/009934.html
>
> The mock-up of an asyncio executor I wrote:
>
>   
> https://github.com/markmc/oslo-incubator/blob/8509b8b/openstack/common/messaging/_executors/impl_tulip.py
>
> My blog post on async I/O and Python:
>
>   http://blogs.gnome.org/markmc/2013/06/04/async-io-and-python/
>
> greenio - greenlets support for asyncio:
>
>   https://github.com/1st1/greenio/
>
>

[openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-03 Thread Mark McLoughlin
Hey

This is an attempt to summarize a really useful discussion that Victor,
Flavio and I have been having today. At the bottom are some background
links - basically what I have open in my browser right now thinking
through all of this.

We're attempting to take baby-steps towards moving completely from
eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
first victim.

Ceilometer's code is run in response to various I/O events like REST API
requests, RPC calls, notifications received, etc. We eventually want the
asyncio event loop to be what schedules Ceilometer's code in response to
these events. Right now, it is eventlet doing that.

Now, because we're using eventlet, the code that is run in response to
these events looks like synchronous code that makes a bunch of
synchronous calls. For example, the code might do some_sync_op() and
that will cause a context switch to a different greenthread (within the
same native thread) where we might handle another I/O event (like a REST
API request) while we're waiting for some_sync_op() to return:

  def foo(self):
      result = some_sync_op()  # this may yield to another greenlet
      return do_stuff(result)

Eventlet's infamous monkey patching is what makes this magic happen.
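
(Concretely, that patching is the standard eventlet incantation - a
sketch, not Ceilometer's actual startup code:)

  import eventlet

  # replace socket, time, threading, etc. with green versions, so code
  # that looks blocking actually yields to other greenthreads at every
  # I/O point
  eventlet.monkey_patch()

  import time  # now the patched module: time.sleep() yields to the hub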

When we switch to asyncio's event loop, all of this code needs to be
ported to asyncio's explicitly asynchronous approach. We might do:

  @asyncio.coroutine
  def foo(self):
      result = yield from some_async_op(...)
      return do_stuff(result)

or:

  @asyncio.coroutine
  def foo(self):
      fut = Future()
      some_async_op(callback=fut.set_result)
      ...
      result = yield from fut
      return do_stuff(result)

Porting from eventlet's implicit async approach to asyncio's explicit
async API will be seriously time consuming and we need to be able to do
it piece-by-piece.

The question then becomes what do we need to do in order to port a
single oslo.messaging RPC endpoint method in Ceilometer to asyncio's
explicit async approach?

The plan is:

  - we stick with eventlet; everything gets monkey patched as normal

  - we register the greenio event loop with asyncio - this means that 
e.g. when you schedule an asyncio coroutine, greenio runs it in a 
greenlet using eventlet's event loop

  - oslo.messaging will need a new variant of eventlet executor which 
knows how to dispatch an asyncio coroutine. For example:

    while True:
        incoming = self.listener.poll()
        method = dispatcher.get_endpoint_method(incoming)
        if asyncio.iscoroutinefunction(method):
            result = method()
            self._greenpool.spawn_n(incoming.reply, result)
        else:
            self._greenpool.spawn_n(method)

it's important that even with a coroutine endpoint method, we send 
the reply in a greenthread so that the dispatch greenthread doesn't
get blocked if the incoming.reply() call causes a greenlet context
switch

  - when all of ceilometer has been ported over to asyncio coroutines, 
we can stop monkey patching, stop using greenio and switch to the 
asyncio event loop

  - when we make this change, we'll want a completely native asyncio 
oslo.messaging executor. Unless the oslo.messaging drivers support 
asyncio themselves, that executor will probably need a separate
native thread to poll for messages and send replies.
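
(A very rough sketch of what that native executor might look like -
invented names, no error handling, just to make the "separate native
thread" idea concrete:)

  import asyncio
  import functools
  import threading

  class NativeAsyncioExecutor(object):
      def __init__(self, listener, dispatcher, loop):
          self.listener = listener
          self.dispatcher = dispatcher
          self.loop = loop

      def _poll_loop(self):
          # native thread: the driver's blocking I/O stays off the loop
          while True:
              incoming = self.listener.poll()
              method = self.dispatcher.get_endpoint_method(incoming)
              coro = self._dispatch(incoming, method)
              self.loop.call_soon_threadsafe(
                  functools.partial(asyncio.async, coro, loop=self.loop))

      @asyncio.coroutine
      def _dispatch(self, incoming, method):
          result = yield from method()
          # replying may block too, so push it out to a worker thread
          yield from self.loop.run_in_executor(None, incoming.reply, result)

      def start(self):
          t = threading.Thread(target=self._poll_loop)
          t.daemon = True
          t.start()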

If you're confused, that's normal. We had to take several breaks to get
even this far because our brains kept getting fried.

HTH,
Mark.

Victor's excellent docs on asyncio and trollius:

  https://docs.python.org/3/library/asyncio.html
  http://trollius.readthedocs.org/

Victor's proposed asyncio executor:

  https://review.openstack.org/70948

The case for adopting asyncio in OpenStack:

  https://wiki.openstack.org/wiki/Oslo/blueprints/asyncio

A previous email I wrote about an asyncio executor:

 http://lists.openstack.org/pipermail/openstack-dev/2013-June/009934.html

The mock-up of an asyncio executor I wrote:

  
https://github.com/markmc/oslo-incubator/blob/8509b8b/openstack/common/messaging/_executors/impl_tulip.py

My blog post on async I/O and Python:

  http://blogs.gnome.org/markmc/2013/06/04/async-io-and-python/

greenio - greenlets support for asyncio:

  https://github.com/1st1/greenio/

