Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-12-01 Thread Zane Bitter

On 13/11/14 13:59, Clint Byrum wrote:

I'm not sure we have the same understanding of AMQP, so hopefully we can
clarify here. This stackoverflow answer echoes my understanding:

http://stackoverflow.com/questions/17841843/rabbitmq-does-one-consumer-block-the-other-consumers-of-the-same-queue

Not ack'ing just means they might get retransmitted if we never ack. It
doesn't block other consumers. And as the link above quotes from the
AMQP spec, when there are multiple consumers, FIFO is not guaranteed.
Other consumers get other messages.


Thanks, obviously my recollection of how AMQP works was coloured too 
much by oslo.messaging.



So just add the ability for a consumer to read, work, ack to
oslo.messaging, and this is mostly handled via AMQP. Of course that
also likely means no zeromq for Heat without accepting that messages
may be lost if workers die.

Basically we need to add something that is not RPC but instead
jobqueue that mimics this:

http://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo/messaging/rpc/dispatcher.py#n131

I've always been suspicious of this bit of code, as it basically means
that if anything fails between that call, and the one below it, we have
lost contact, but as long as clients are written to re-send when there
is a lack of reply, there shouldn't be a problem. But, for a job queue,
there is no reply, and so the worker would dispatch, and then
acknowledge after the dispatched call had returned (including having
completed the step where new messages are added to the queue for any
newly-possible children).
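
To make that read/work/ack pattern concrete, here is a minimal sketch of such a consumer using kombu directly; the queue name and the dispatch() handler are hypothetical stand-ins for Heat's real job handling:

    from kombu import Connection, Queue

    jobs = Queue('heat-jobs', durable=True)

    def on_job(body, message):
        dispatch(body)   # do the work first (may enqueue child jobs)...
        message.ack()    # ...and acknowledge only after it has completed

    with Connection('amqp://guest:guest@localhost//') as conn:
        # prefetch_count=1 limits us to one unacked job at a time; other
        # consumers keep receiving other messages, so nobody is blocked.
        with conn.Consumer(jobs, callbacks=[on_job], prefetch_count=1):
            while True:
                conn.drain_events()

If the worker dies before the ack, the broker redelivers the message to another consumer, which is the at-least-once behaviour a job queue wants.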


I'm curious how people are deploying Rabbit at the moment. Are they 
setting up multiple brokers and writing messages to disk before 
accepting them? I assume yes on the former but no on the latter, since 
there's no particular point in having e.g. 5 nines durability in the 
queue when the overall system is as weak as your flakiest node.


OTOH if we were to add what you're proposing, then we would need folks 
to deploy Rabbit that way (at least for Heat), since waiting for Acks on 
receipt is insufficient to make messaging reliable if the broker can 
easily outright lose the message.
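
For comparison, making the broker side durable mostly comes down to durable queues, persistent messages and publisher confirms; a minimal pika sketch, with the queue name and payload hypothetical:

    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
    channel = conn.channel()

    # durable=True: the queue definition survives a broker restart.
    channel.queue_declare(queue='heat-jobs', durable=True)

    # Publisher confirms: publishing fails loudly unless the broker has
    # taken responsibility for the message.
    channel.confirm_delivery()

    channel.basic_publish(
        exchange='',
        routing_key='heat-jobs',
        body='{"stack_id": "...", "action": "check_timeout"}',
        # delivery_mode=2 asks the broker to write the message to disk.
        properties=pika.BasicProperties(delivery_mode=2))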


I think all of the proposed approaches would benefit from this feature, 
but I'm concerned about any increased burden on deployers too.


cheers,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-14 Thread Joshua Harlow
Sounds like this is tooz[1] ;)

The api for tooz (soon to be an oslo library @ 
https://review.openstack.org/#/c/122439/) is around coordination and 
'service-group' like behavior, so I hope we don't end up with a duplicate of 
this in 'oslo.healthcheck' instead of just using/contributing to tooz.

https://github.com/stackforge/tooz/blob/master/tooz/coordination.py#L63

CoordinationDriver
- watch_join_group
- unwatch_join_group
- join_group
- get_members
- ...

Tooz has backends that use [redis, zookeeper, memcache] to achieve the above 
API (it also supports distributed locks).
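
A minimal sketch of that API in use; the backend URL and member/group names are hypothetical, and zake:// is tooz's in-memory test backend:

    from tooz import coordination

    # A real deployment would use e.g. 'zookeeper://127.0.0.1:2181'.
    coord = coordination.get_coordinator('zake://', b'heat-engine-1')
    coord.start()

    group = b'heat-engines'
    try:
        coord.create_group(group).get()  # tooz calls return async results
    except coordination.GroupAlreadyExist:
        pass
    coord.join_group(group).get()

    # Peers can now list (and watch) membership to detect failed engines.
    print(coord.get_members(group).get())

    coord.stop()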

Feel free to jump on #openstack-state-management if u want more info (jd, the 
enovance guys, and I developed that library for this kind of purpose).

-josh

On Nov 13, 2014, at 10:58 PM, Jastrzebski, Michal 
michal.jastrzeb...@intel.com wrote:

 Also, in the 'Common approach to HA' session we moved something like 
 oslo.healthcheck (or whatever it will be called), a common lib for 
 service-group like behavior. In my opinion it's pointless to implement 
 zookeeper management in every project separately (it's already in nova...). 
 Might be worth looking closely into this topic.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-14 Thread Joshua Harlow
Arg, sorry for the spam, mail.app was still trying to send it multiple times 
for some reason...
-Josh
 
 From: Joshua Harlow harlo...@outlook.com
 To: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org 
 Sent: Friday, November 14, 2014 11:45 AM
 Subject: Re: [openstack-dev] [Heat] Using Job Queues for timeout ops
   
Sounds like this is tooz[1] ;)

The api for tooz (soon to be an oslo library @ 
https://review.openstack.org/#/c/122439/) is around coordination and 
'service-group' like behavior, so I hope we don't end up with a duplicate of 
this in 'oslo.healthcheck' instead of just using/contributing to tooz.

https://github.com/stackforge/tooz/blob/master/tooz/coordination.py#L63

CoordinationDriver
- watch_join_group
- unwatch_join_group
- join_group
- get_members
- ...

Tooz has backends that use [redis, zookeeper, memcache] to achieve the above 
API (it also supports distributed locks).

Feel free to jump on #openstack-state-management if u want more info (jd, the 
enovance guys, and I developed that library for this kind of purpose).

-josh



On Nov 13, 2014, at 10:58 PM, Jastrzebski, Michal 
michal.jastrzeb...@intel.com wrote:

 Also, in the 'Common approach to HA' session we moved something like 
 oslo.healthcheck (or whatever it will be called), a common lib for 
 service-group like behavior. In my opinion it's pointless to implement 
 zookeeper management in every project separately (it's already in nova...). 
 Might be worth looking closely into this topic.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


   ___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Joshua Harlow
A question:

How is using something like celery in heat vs taskflow in heat (or at least 
the concept [1]) 'too many code changes'?

Both seem like changes of a similar level ;-)

What was your metric for determining how much code change either would 
require (out of curiosity)?

Perhaps u should look at [2], although I'm unclear on what the desired 
functionality is here.

Do u want the single engine to transfer its work to another engine when it 
'goes down'? If so, then the jobboard model + zookeeper inherently does this.

Or maybe u want something else? I'm probably confused because u seem to be 
asking for resource timeouts + recovery from engine failure (which seems like a 
liveness issue and not a resource timeout one); those 2 things seem separable.

[1] http://docs.openstack.org/developer/taskflow/jobs.html

[2] 
http://docs.openstack.org/developer/taskflow/examples.html#jobboard-producer-consumer-simple
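
For reference, a rough sketch of the jobboard pattern from [2]; the board name and ZooKeeper address are hypothetical:

    import contextlib

    from taskflow.jobs import backends as job_backends

    conf = {'board': 'zookeeper', 'hosts': ['127.0.0.1:2181']}

    with contextlib.closing(job_backends.fetch('heat-jobs', conf)) as board:
        board.connect()

        # Producer: post a job. If the engine that claims it later dies,
        # its ephemeral ZooKeeper lock lapses and another engine can claim.
        board.post('stack-timeout', details={'stack_id': 'some-stack-uuid'})

        # Consumer (normally a different engine/process):
        for job in board.iterjobs(ensure_fresh=True, only_unclaimed=True):
            board.claim(job, 'engine-1')
            # ... perform the timeout check here ...
            board.consume(job, 'engine-1')  # done; remove from the board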

On Nov 13, 2014, at 12:29 AM, Murugan, Visnusaran visnusaran.muru...@hp.com 
wrote:

 Hi all,
  
 Convergence-POC distributes stack operations by sending resource actions over 
 RPC for any heat-engine to execute. Entire stack lifecycle will be controlled 
 by worker/observer notifications. This distributed model has its own 
 advantages and disadvantages.
  
 Any stack operation has a timeout and a single engine will be responsible for 
 it. If that engine goes down, timeout is lost along with it. So a traditional 
 way is for other engines to recreate timeout from scratch. Also a missed 
 resource action notification will be detected only when stack operation 
 timeout happens.
  
 To overcome this, we will need the following capability:
 1.   Resource timeout (can be used for retry)
 2.   Recover from engine failure (loss of stack timeout, resource action 
 notification)
  
  
 Suggestion:
 1.   Use task queue like celery to host timeouts for both stack and 
 resource.
 2.   Poll database for engine failures and restart timers/ retrigger 
 resource retry (IMHO this would be the traditional approach, and it weighs 
 heavy)
 3.   Migrate heat to use TaskFlow. (Too many code changes)
  
 I am not suggesting we use Task Flow. Using celery will have very minimal 
 code change. (decorate appropriate functions)
  
  
 Your thoughts.
  
 -Vishnu
 IRC: ckmvishnu
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Murugan, Visnusaran
Hi,

The intention is not to transfer the workload of a failed engine onto an active 
one. The convergence implementation that we are working on will be able to 
recover from a failure, provided a timeout notification hits a heat-engine. All 
I want is a safe holding area for my timeout tasks. A timeout can be a stack 
timeout or a resource timeout.

By code change :) I meant that posting to a job queue will be a matter of 
decorating the timeout method and firing it for delayed execution. Felt that we 
need not use taskflow just for posting a delayed execution (a timer, in our 
case).
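
A sketch of that decorate-and-fire approach with celery; the broker URL, task and handler names are hypothetical, and acks_late is what makes redelivery on worker death possible:

    from celery import Celery

    app = Celery('heat', broker='amqp://guest@localhost//')

    @app.task(acks_late=True)
    def stack_timeout(stack_id):
        # acks_late: the message is acknowledged only after the task body
        # returns, so a worker dying mid-task leads to redelivery.
        handle_stack_timeout(stack_id)  # hypothetical heat-side handler

    # Posting the timer is a single call; countdown delays execution.
    stack_timeout.apply_async(args=('some-stack-uuid',), countdown=3600)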

Correct me if I'm wrong.

-Vishnu

From: Joshua Harlow [mailto:harlo...@outlook.com]
Sent: Thursday, November 13, 2014 2:15 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

A question:

How is using something like celery in heat vs taskflow in heat (or at least 
the concept [1]) 'too many code changes'?

Both seem like changes of a similar level ;-)

What was your metric for determining how much code change either would 
require (out of curiosity)?

Perhaps u should look at [2], although I'm unclear on what the desired 
functionality is here.

Do u want the single engine to transfer its work to another engine when it 
'goes down'? If so, then the jobboard model + zookeeper inherently does this.

Or maybe u want something else? I'm probably confused because u seem to be 
asking for resource timeouts + recovery from engine failure (which seems like a 
liveness issue and not a resource timeout one); those 2 things seem separable.

[1] http://docs.openstack.org/developer/taskflow/jobs.html

[2] 
http://docs.openstack.org/developer/taskflow/examples.html#jobboard-producer-consumer-simple

On Nov 13, 2014, at 12:29 AM, Murugan, Visnusaran 
visnusaran.muru...@hp.com wrote:


Hi all,

Convergence-POC distributes stack operations by sending resource actions over 
RPC for any heat-engine to execute. Entire stack lifecycle will be controlled 
by worker/observer notifications. This distributed model has its own advantages 
and disadvantages.

Any stack operation has a timeout and a single engine will be responsible for 
it. If that engine goes down, timeout is lost along with it. So a traditional 
way is for other engines to recreate timeout from scratch. Also a missed 
resource action notification will be detected only when stack operation timeout 
happens.

To overcome this, we will need the following capability:
1.   Resource timeout (can be used for retry)
2.   Recover from engine failure (loss of stack timeout, resource action 
notification)


Suggestion:
1.   Use task queue like celery to host timeouts for both stack and 
resource.
2.   Poll database for engine failures and restart timers/ retrigger 
resource retry (IMHO this would be the traditional approach, and it weighs 
heavy)
3.   Migrate heat to use TaskFlow. (Too many code changes)

I am not suggesting we use Task Flow. Using celery will have very minimal code 
change. (decorate appropriate functions)


Your thoughts.

-Vishnu
IRC: ckmvishnu
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Angus Salkeld
On Thu, Nov 13, 2014 at 6:29 PM, Murugan, Visnusaran 
visnusaran.muru...@hp.com wrote:

  Hi all,



 Convergence-POC distributes stack operations by sending resource actions
 over RPC for any heat-engine to execute. Entire stack lifecycle will be
 controlled by worker/observer notifications. This distributed model has its
 own advantages and disadvantages.



 Any stack operation has a timeout and a single engine will be responsible
 for it. If that engine goes down, timeout is lost along with it. So a
 traditional way is for other engines to recreate timeout from scratch. Also
 a missed resource action notification will be detected only when stack
 operation timeout happens.



 To overcome this, we will need the following capability:

 1.   Resource timeout (can be used for retry)

We will shortly have a worker job, can't we have a job that just sleeps
that gets started in parallel with the job that is doing the work?
It gets to the end of the sleep and runs a check.
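
As a sketch, the sleeping job's body would be little more than this (check_timeout() being a hypothetical handler dispatched like any other worker job):

    import time

    def timeout_job(stack_id, timeout_secs):
        # A 'job' whose only work is to wait out the timeout period and
        # then verify that the real job finished.
        time.sleep(timeout_secs)
        check_timeout(stack_id)  # hypothetical: retrigger/fail if needed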

  2.   Recover from engine failure (loss of stack timeout, resource
 action notification)




My suggestion above could catch failures as long as it was run in a
different process.

-Angus




 Suggestion:

 1.   Use task queue like celery to host timeouts for both stack and
 resource.

 2.   Poll database for engine failures and restart timers/ retrigger
 resource retry (IMHO this would be the traditional approach, and it weighs
 heavy)

 3.   Migrate heat to use TaskFlow. (Too many code changes)



 I am not suggesting we use Task Flow. Using celery will have very minimal
 code change. (decorate appropriate functions)





 Your thoughts.



 -Vishnu

 IRC: ckmvishnu

 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Murugan, Visnusaran
Parallel worker was what I initially thought. But what to do if the engine 
hosting that worker goes down?

-Vishnu

From: Angus Salkeld [mailto:asalk...@mirantis.com]
Sent: Thursday, November 13, 2014 5:22 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

On Thu, Nov 13, 2014 at 6:29 PM, Murugan, Visnusaran 
visnusaran.muru...@hp.com wrote:
Hi all,

Convergence-POC distributes stack operations by sending resource actions over 
RPC for any heat-engine to execute. Entire stack lifecycle will be controlled 
by worker/observer notifications. This distributed model has its own advantages 
and disadvantages.

Any stack operation has a timeout and a single engine will be responsible for 
it. If that engine goes down, timeout is lost along with it. So a traditional 
way is for other engines to recreate timeout from scratch. Also a missed 
resource action notification will be detected only when stack operation timeout 
happens.

To overcome this, we will need the following capability:

1.   Resource timeout (can be used for retry)
We will shortly have a worker job, can't we have a job that just sleeps that 
gets started in parallel with the job that is doing the work?
It gets to the end of the sleep and runs a check.

2.   Recover from engine failure (loss of stack timeout, resource action 
notification)


My suggestion above could catch failures as long as it was run in a different 
process.
-Angus


Suggestion:

1.   Use task queue like celery to host timeouts for both stack and 
resource.

2.   Poll database for engine failures and restart timers/ retrigger 
resource retry (IMHO this would be the traditional approach, and it weighs 
heavy)

3.   Migrate heat to use TaskFlow. (Too many code changes)

I am not suggesting we use Task Flow. Using celery will have very minimal code 
change. (decorate appropriate functions)


Your thoughts.

-Vishnu
IRC: ckmvishnu

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Zane Bitter

On 13/11/14 06:52, Angus Salkeld wrote:

On Thu, Nov 13, 2014 at 6:29 PM, Murugan, Visnusaran
visnusaran.muru...@hp.com wrote:

Hi all,

Convergence-POC distributes stack operations by sending resource
actions over RPC for any heat-engine to execute. Entire stack
lifecycle will be controlled by worker/observer notifications. This
distributed model has its own advantages and disadvantages.

Any stack operation has a timeout and a single engine will be
responsible for it. If that engine goes down, timeout is lost along
with it. So a traditional way is for other engines to recreate
timeout from scratch. Also a missed resource action notification
will be detected only when stack operation timeout happens.

To overcome this, we will need the following capability:

1. Resource timeout (can be used for retry)

We will shortly have a worker job, can't we have a job that just sleeps
that gets started in parallel with the job that is doing the work?
It gets to the end of the sleep and runs a check.


What if that worker dies too? There's no guarantee that it'd even be a 
different worker. In fact, there's not even a guarantee that we'd have 
multiple workers.


BTW Steve Hardy's suggestion, which I have more or less come around to, 
is that the engines themselves should be the workers in convergence, to 
save operators deploying two types of processes. (The observers will 
still be a separate process though, in phase 2.)





2. Recover from engine failure (loss of stack timeout, resource
action notification)


My suggestion above could catch failures as long as it was run in a
different process.

-Angus


Suggestion:

1. Use task queue like celery to host timeouts for both stack and
resource.

2. Poll database for engine failures and restart timers/
retrigger resource retry (IMHO this would be the traditional approach,
and it weighs heavy)

3. Migrate heat to use TaskFlow. (Too many code changes)

I am not suggesting we use Task Flow. Using celery will have very
minimal code change. (decorate appropriate functions)


Your thoughts.

-Vishnu

IRC: ckmvishnu


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Zane Bitter

On 13/11/14 03:29, Murugan, Visnusaran wrote:

Hi all,

Convergence-POC distributes stack operations by sending resource actions
over RPC for any heat-engine to execute. Entire stack lifecycle will be
controlled by worker/observer notifications. This distributed model has
its own advantages and disadvantages.

Any stack operation has a timeout and a single engine will be
responsible for it. If that engine goes down, timeout is lost along with
it. So a traditional way is for other engines to recreate timeout from
scratch. Also a missed resource action notification will be detected
only when stack operation timeout happens.

To overcome this, we will need the following capability:

1.Resource timeout (can be used for retry)


I don't believe this is strictly needed for phase 1 (essentially we 
don't have it now, so nothing gets worse).


For phase 2, yes, we'll want it. One thing we haven't discussed much is 
that if we used Zaqar for this then the observer could claim a message 
but not acknowledge it until it had processed it, so we could have 
guaranteed delivery.
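
A sketch of what that observer loop could look like with python-zaqarclient; the endpoint, queue name and handler are hypothetical:

    from zaqarclient.queues import client

    cli = client.Client('http://zaqar.example.com:8888', version=2)
    queue = cli.queue('convergence-notifications')

    # claim() hides messages from other observers for ttl seconds
    # without removing them; delete() is the acknowledgement.
    claim = queue.claim(ttl=300, grace=60)
    for message in claim:
        process(message.body)  # hypothetical handler
        message.delete()       # ack only after processing succeeded
    # Anything not deleted reappears once the claim expires, giving
    # at-least-once delivery.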



2.Recover from engine failure (loss of stack timeout, resource action
notification)

Suggestion:

1.Use task queue like celery to host timeouts for both stack and resource.


I believe Celery is more or less a non-starter as an OpenStack 
dependency because it uses Kombu directly to talk to the queue, vs. 
oslo.messaging which is an abstraction layer over Kombu, Qpid, ZeroMQ 
and maybe others in the future. i.e. requiring Celery means that some 
users would be forced to install Rabbit for the first time.


One option would be to fork Celery and replace Kombu with oslo.messaging 
as its abstraction layer. Good luck getting that maintained though, 
since Celery _invented_ Kombu to be its abstraction layer.



2.Poll database for engine failures and restart timers/ retrigger
resource retry (IMHO this would be the traditional approach, and it weighs
heavy)

3.Migrate heat to use TaskFlow. (Too many code changes)


If it's just handling timed triggers (maybe this is closer to #2) and 
not migrating the whole code base, then I don't see why it would be a 
big change (or even a change at all - it's basically new functionality). 
I'm not sure if TaskFlow has something like this already. If not we 
could also look at what Mistral is doing with timed tasks and see if we 
could spin some of it out into an Oslo library.


cheers,
Zane.


I am not suggesting we use Task Flow. Using celery will have very
minimal code change. (decorate appropriate functions)

Your thoughts.

-Vishnu

IRC: ckmvishnu



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Murugan, Visnusaran
Zane,

We do follow shardy's suggestion of having the worker/observer as eventlets in 
heat-engine. No new process. The timer will be executed under an engine's 
worker.

Question:
1. heat-engine processing a resource-action fails (process killed)
2. heat-engine processing a timeout for a stack fails (process killed)

In the above mentioned cases, I thought celery tasks would come to our rescue.

The convergence-poc implementation can recover from an error and retry if there 
is a notification available.


-Vishnu

-Original Message-
From: Zane Bitter [mailto:zbit...@redhat.com] 
Sent: Thursday, November 13, 2014 7:05 PM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

On 13/11/14 06:52, Angus Salkeld wrote:
 On Thu, Nov 13, 2014 at 6:29 PM, Murugan, Visnusaran 
visnusaran.muru...@hp.com wrote:

 Hi all,

 Convergence-POC distributes stack operations by sending resource
 actions over RPC for any heat-engine to execute. Entire stack
 lifecycle will be controlled by worker/observer notifications. This
 distributed model has its own advantages and disadvantages.

 Any stack operation has a timeout and a single engine will be
 responsible for it. If that engine goes down, timeout is lost along
 with it. So a traditional way is for other engines to recreate
 timeout from scratch. Also a missed resource action notification
 will be detected only when stack operation timeout happens.

 To overcome this, we will need the following capability:

 1. Resource timeout (can be used for retry)

 We will shortly have a worker job, can't we have a job that just
 sleeps that gets started in parallel with the job that is doing the work?
 It gets to the end of the sleep and runs a check.

What if that worker dies too? There's no guarantee that it'd even be a 
different worker. In fact, there's not even a guarantee that we'd have multiple 
workers.

BTW Steve Hardy's suggestion, which I have more or less come around to, is that 
the engines themselves should be the workers in convergence, to save operators 
deploying two types of processes. (The observers will still be a separate 
process though, in phase 2.)

 

 2. Recover from engine failure (loss of stack timeout, resource
 action notification)


 My suggestion above could catch failures as long as it was run in a
 different process.

 -Angus


 Suggestion:

 1. Use task queue like celery to host timeouts for both stack and
 resource.

 2. Poll database for engine failures and restart timers/
 retrigger resource retry (IMHO this would be the traditional approach,
 and it weighs heavy)

 3. Migrate heat to use TaskFlow. (Too many code changes)

 I am not suggesting we use Task Flow. Using celery will have very
 minimal code change. (decorate appropriate functions)

 Your thoughts.

 -Vishnu

 IRC: ckmvishnu


 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Jastrzebski, Michal
Guys, I don't think we want to get into this cluster management mud. You say 
let's make an observer... and what if the observer dies? Do we do observer to 
observer? And then there is split brain. I'm an observer, I've lost connection 
to a worker. Should I restart the worker?
Maybe I'm the one who lost connection to the rest of the world? Should I resume 
the task and risk duplicate workload?

And then there is another problem. If a timeout is caused by a limit on worker 
resources, and we restart the whole workload after the timeout, we will stretch 
these resources even further, and in turn we'll get more timeouts (...) - a 
great way to kill the whole setup.

So we get to horizontal scalability. Or a total lack of it. Any stack that is 
too complicated for a single engine to process will be impossible to process at 
all. We should find a way to distribute workloads in an active-active, 
stateless (as much as possible) manner.

Regards,
Michał inc0 Jastrzębski   

 -Original Message-
 From: Murugan, Visnusaran [mailto:visnusaran.muru...@hp.com]
 Sent: Thursday, November 13, 2014 2:59 PM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [Heat] Using Job Queues for timeout ops
 
 Zane,
 
 We do follow shardy's suggestion of having the worker/observer as eventlets in
 heat-engine. No new process. The timer will be executed under an engine's
 worker.
 
 Question:
 1. heat-engine processing a resource-action fails (process killed)
 2. heat-engine processing a timeout for a stack fails (process killed)
 
 In the above mentioned cases, I thought celery tasks would come to our
 rescue.
 
 The convergence-poc implementation can recover from an error and retry if
 there is a notification available.
 
 
 -Vishnu
 
 -Original Message-
 From: Zane Bitter [mailto:zbit...@redhat.com]
 Sent: Thursday, November 13, 2014 7:05 PM
 To: openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [Heat] Using Job Queues for timeout ops
 
 On 13/11/14 06:52, Angus Salkeld wrote:
  On Thu, Nov 13, 2014 at 6:29 PM, Murugan, Visnusaran
  visnusaran.muru...@hp.com wrote:
 
  Hi all,
 
  Convergence-POC distributes stack operations by sending resource
  actions over RPC for any heat-engine to execute. Entire stack
  lifecycle will be controlled by worker/observer notifications. This
  distributed model has its own advantages and disadvantages.
 
  Any stack operation has a timeout and a single engine will be
  responsible for it. If that engine goes down, timeout is lost along
  with it. So a traditional way is for other engines to recreate
  timeout from scratch. Also a missed resource action notification
  will be detected only when stack operation timeout happens.
 
  To overcome this, we will need the following capability:
 
  1. Resource timeout (can be used for retry)
 
  We will shortly have a worker job, can't we have a job that just
  sleeps that gets started in parallel with the job that is doing the work?
  It gets to the end of the sleep and runs a check.
 
 What if that worker dies too? There's no guarantee that it'd even be a
 different worker. In fact, there's not even a guarantee that we'd have
 multiple workers.
 
 BTW Steve Hardy's suggestion, which I have more or less come around to, is
 that the engines themselves should be the workers in convergence, to save
 operators deploying two types of processes. (The observers will still be a
 separate process though, in phase 2.)
 
  
 
  2. Recover from engine failure (loss of stack timeout, resource
  action notification)
 
 
  My suggestion above could catch failures as long as it was run in a
  different process.
 
  -Angus
 
 
  Suggestion:
 
  1. Use task queue like celery to host timeouts for both stack and
  resource.
 
  2. Poll database for engine failures and restart timers/
  retrigger resource retry (IMHO this would be the traditional approach,
  and it weighs heavy)
 
  3. Migrate heat to use TaskFlow. (Too many code changes)
 
  I am not suggesting we use Task Flow. Using celery will have very
  minimal code change. (decorate appropriate functions)
 
  Your thoughts.
 
  -Vishnu
 
  IRC: ckmvishnu
 
 
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 
 
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Zane Bitter

On 13/11/14 09:31, Jastrzebski, Michal wrote:

Guys, I don't think we want to get into this cluster management mud. You say 
let's make an observer... and what if the observer dies? Do we do observer to 
observer? And then there is split brain. I'm an observer, I've lost connection 
to a worker. Should I restart the worker?
Maybe I'm the one who lost connection to the rest of the world? Should I resume 
the task and risk duplicate workload?


I think you're misinterpreting what we mean by observer. See 
https://wiki.openstack.org/wiki/Heat/ConvergenceDesign


- ZB

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Clint Byrum
Excerpts from Zane Bitter's message of 2014-11-13 05:54:03 -0800:
 On 13/11/14 03:29, Murugan, Visnusaran wrote:
  Hi all,
 
  Convergence-POC distributes stack operations by sending resource actions
  over RPC for any heat-engine to execute. Entire stack lifecycle will be
  controlled by worker/observer notifications. This distributed model has
  its own advantages and disadvantages.
 
  Any stack operation has a timeout and a single engine will be
  responsible for it. If that engine goes down, timeout is lost along with
  it. So a traditional way is for other engines to recreate timeout from
  scratch. Also a missed resource action notification will be detected
  only when stack operation timeout happens.
 
  To overcome this, we will need the following capability:
 
  1.Resource timeout (can be used for retry)
 
 I don't believe this is strictly needed for phase 1 (essentially we 
 don't have it now, so nothing gets worse).
 

We do have a stack timeout, and it stands to reason that we won't have a
single box with a timeout greenthread after this, so a strategy is
needed.

 For phase 2, yes, we'll want it. One thing we haven't discussed much is 
 that if we used Zaqar for this then the observer could claim a message 
 but not acknowledge it until it had processed it, so we could have 
 guaranteed delivery.


Frankly, if oslo.messaging doesn't support reliable delivery then we
need to add it. Zaqar should have nothing to do with this and is, IMO, a
poor choice at this stage, though I like the idea of using it in the
future so that we can make Heat more of an outside-the-cloud app.

  2.Recover from engine failure (loss of stack timeout, resource action
  notification)
 
  Suggestion:
 
  1.Use task queue like celery to host timeouts for both stack and resource.
 
 I believe Celery is more or less a non-starter as an OpenStack 
 dependency because it uses Kombu directly to talk to the queue, vs. 
 oslo.messaging which is an abstraction layer over Kombu, Qpid, ZeroMQ 
 and maybe others in the future. i.e. requiring Celery means that some 
 users would be forced to install Rabbit for the first time.

 One option would be to fork Celery and replace Kombu with oslo.messaging 
 as its abstraction layer. Good luck getting that maintained though, 
  since Celery _invented_ Kombu to be its abstraction layer.
 

A slight side point here: Kombu supports Qpid and ZeroMQ. Oslo.messaging
is more about having a unified API than a set of magic backends. It
actually boggles my mind why we didn't just use kombu (cue 20 reactions
with people saying it wasn't EXACTLY right), but I think we're committed
to oslo.messaging now. Anyway, celery would need no such refactor, as
kombu would be able to access the same bus as everything else just fine.

  2.Poll database for engine failures and restart timers/ retrigger
  resource retry (IMHO this would be the traditional approach, and it
  weighs heavy)
 
  3.Migrate heat to use TaskFlow. (Too many code changes)
 
 If it's just handling timed triggers (maybe this is closer to #2) and 
 not migrating the whole code base, then I don't see why it would be a 
 big change (or even a change at all - it's basically new functionality). 
 I'm not sure if TaskFlow has something like this already. If not we 
 could also look at what Mistral is doing with timed tasks and see if we 
 could spin some of it out into an Oslo library.
 

I feel like it boils down to something running periodically checking for
scheduled tasks that are due to run but have not run yet. I wonder if we
can actually look at Ironic for how they do this, because Ironic polls
power state of machines constantly, and uses a hash ring to make sure
only one conductor is polling any one machine at a time. If we broke
stacks up into a hash ring like that for the purpose of singleton tasks
like timeout checking, that might work out nicely.
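
A toy sketch of such a ring; the engine names are hypothetical, and real membership would have to come from a liveness mechanism like the ones discussed elsewhere in this thread:

    import bisect
    import hashlib

    def _hash(key):
        return int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)

    class HashRing(object):
        """Maps each stack to exactly one engine via consistent hashing."""

        def __init__(self, engines, replicas=100):
            # Each engine gets `replicas` points on the ring so the load
            # stays roughly even as engines join and leave.
            self._ring = sorted((_hash('%s-%d' % (e, i)), e)
                                for e in engines for i in range(replicas))
            self._hashes = [h for h, _ in self._ring]

        def owner(self, stack_id):
            # The first ring point at or after the stack's hash owns
            # singleton tasks (like timeout checking) for that stack.
            idx = bisect.bisect(self._hashes, _hash(stack_id))
            return self._ring[idx % len(self._ring)][1]

    ring = HashRing(['engine-1', 'engine-2', 'engine-3'])
    print(ring.owner('some-stack-uuid'))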

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Jastrzebski, Michal
By observer I mean the process which will actually notify about stack timeouts. 
Maybe it was a poor choice of words. Anyway, something will need to check which 
stacks have timed out, and that's a new single point of failure.

 -Original Message-
 From: Zane Bitter [mailto:zbit...@redhat.com]
 Sent: Thursday, November 13, 2014 3:49 PM
 To: openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [Heat] Using Job Queues for timeout ops
 
 On 13/11/14 09:31, Jastrzebski, Michal wrote:
  Guys, I don't think we want to get into this cluster management mud.
  You say let's make an observer... and what if the observer dies? Do we do
  observer to observer? And then there is split brain. I'm an observer, I've lost
 connection to a worker. Should I restart the worker?
  Maybe I'm the one who lost connection to the rest of the world? Should I
  resume the task and risk duplicate workload?
 
 I think you're misinterpreting what we mean by observer. See
 https://wiki.openstack.org/wiki/Heat/ConvergenceDesign
 
 - ZB
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Clint Byrum
Excerpts from Joshua Harlow's message of 2014-11-13 00:45:07 -0800:
 A question:
 
 How is using something like celery in heat vs taskflow in heat (or at least 
 the concept [1]) 'too many code changes'?
 
 Both seem like changes of a similar level ;-)
 

I've tried a few times to dive into refactoring some things to use
TaskFlow at a shallow level, and have always gotten confused and
frustrated.

The amount of lines that are changed probably is the same. But the
massive shift in thinking is not an easy one to make. It may be worth some
thinking on providing a shorter bridge to TaskFlow adoption, because I'm
a huge fan of the idea and would _start_ something with it in a heartbeat,
but refactoring things to use it feels really weird to me.

 What was your metric for determining how much code change either would 
 require (out of curiosity)?
 
 Perhaps u should look at [2], although I'm unclear on what the desired 
 functionality is here.
 
 Do u want the single engine to transfer its work to another engine when it 
 'goes down'? If so, then the jobboard model + zookeeper inherently does this.
 
 Or maybe u want something else? I'm probably confused because u seem to be 
 asking for resource timeouts + recovery from engine failure (which seems like 
 a liveness issue and not a resource timeout one); those 2 things seem 
 separable.
 

I agree with you on this. It is definitely a liveness problem. The
resource timeout isn't something I've seen discussed before. We do have
a stack timeout, and we need to keep on honoring that, but we can do
that with a job that sleeps for the stack timeout if we have a liveness
guarantee that will resurrect the job (with the sleep shortened by the
time since stack-update-time) somewhere else if the original engine
can't complete the job.

 [1] http://docs.openstack.org/developer/taskflow/jobs.html
 
 [2] 
 http://docs.openstack.org/developer/taskflow/examples.html#jobboard-producer-consumer-simple
 
 On Nov 13, 2014, at 12:29 AM, Murugan, Visnusaran visnusaran.muru...@hp.com 
 wrote:
 
  Hi all,
   
  Convergence-POC distributes stack operations by sending resource actions 
  over RPC for any heat-engine to execute. Entire stack lifecycle will be 
  controlled by worker/observer notifications. This distributed model has its 
  own advantages and disadvantages.
   
  Any stack operation has a timeout and a single engine will be responsible 
  for it. If that engine goes down, timeout is lost along with it. So a 
  traditional way is for other engines to recreate timeout from scratch. Also 
  a missed resource action notification will be detected only when stack 
  operation timeout happens.
   
  To overcome this, we will need the following capability:
  1.   Resource timeout (can be used for retry)
  2.   Recover from engine failure (loss of stack timeout, resource 
  action notification)
   
   
  Suggestion:
  1.   Use task queue like celery to host timeouts for both stack and 
  resource.
  2.   Poll database for engine failures and restart timers/ retrigger 
  resource retry (IMHO this would be the traditional approach, and it weighs 
  heavy)
  3.   Migrate heat to use TaskFlow. (Too many code changes)
   
  I am not suggesting we use Task Flow. Using celery will have very minimal 
  code change. (decorate appropriate functions)
   
   
  Your thoughts.
   
  -Vishnu
  IRC: ckmvishnu
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Ryan Brown
On 11/13/2014 09:58 AM, Clint Byrum wrote:
 Excerpts from Zane Bitter's message of 2014-11-13 05:54:03 -0800:
 On 13/11/14 03:29, Murugan, Visnusaran wrote:

 [snip]

 3.Migrate heat to use TaskFlow. (Too many code changes)

 If it's just handling timed triggers (maybe this is closer to #2) and 
 not migrating the whole code base, then I don't see why it would be a 
 big change (or even a change at all - it's basically new functionality). 
 I'm not sure if TaskFlow has something like this already. If not we 
 could also look at what Mistral is doing with timed tasks and see if we 
 could spin some of it out into an Oslo library.

 
 I feel like it boils down to something running periodically checking for
 scheduled tasks that are due to run but have not run yet. I wonder if we
 can actually look at Ironic for how they do this, because Ironic polls
 power state of machines constantly, and uses a hash ring to make sure
 only one conductor is polling any one machine at a time. If we broke
 stacks up into a hash ring like that for the purpose of singleton tasks
 like timeout checking, that might work out nicely.

+1

Using a hash ring is a great way to shard tasks. I think the most
sensible way to add this would be to make timeout polling a
responsibility of the Observer instead of the engine.

-- 
Ryan Brown / Software Engineer, Openstack / Red Hat, Inc.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Zane Bitter

On 13/11/14 09:58, Clint Byrum wrote:

Excerpts from Zane Bitter's message of 2014-11-13 05:54:03 -0800:

On 13/11/14 03:29, Murugan, Visnusaran wrote:

Hi all,

Convergence-POC distributes stack operations by sending resource actions
over RPC for any heat-engine to execute. Entire stack lifecycle will be
controlled by worker/observer notifications. This distributed model has
its own advantages and disadvantages.

Any stack operation has a timeout and a single engine will be
responsible for it. If that engine goes down, timeout is lost along with
it. So a traditional way is for other engines to recreate timeout from
scratch. Also a missed resource action notification will be detected
only when stack operation timeout happens.

To overcome this, we will need the following capability:

1.Resource timeout (can be used for retry)


I don't believe this is strictly needed for phase 1 (essentially we
don't have it now, so nothing gets worse).



We do have a stack timeout, and it stands to reason that we won't have a
single box with a timeout greenthread after this, so a strategy is
needed.


Right, that was 2, but I was talking specifically about the resource 
retry. I think we agree on both points.



For phase 2, yes, we'll want it. One thing we haven't discussed much is
that if we used Zaqar for this then the observer could claim a message
but not acknowledge it until it had processed it, so we could have
guaranteed delivery.



Frankly, if oslo.messaging doesn't support reliable delivery then we
need to add it.


That is straight-up impossible with AMQP. Either you ack the message and 
risk losing it if the worker dies before processing is complete, or you 
don't ack the message until it's processed and you become a blocker for 
every other worker trying to pull jobs off the queue. It works fine when 
you have only one worker; otherwise not so much. This is the crux of the 
whole "why isn't Zaqar just Rabbit" debate.


Most stuff in OpenStack gets around this by doing synchronous calls 
across oslo.messaging, where there is an end-to-end ack. We don't want 
that here though. We'll probably have to make do with having ways to 
recover after a failure (kick off another update with the same data is 
always an option). The hard part is that if something dies we don't 
really want to wait until the stack timeout to start recovering.



Zaqar should have nothing to do with this and is, IMO, a
poor choice at this stage, though I like the idea of using it in the
future so that we can make Heat more of an outside-the-cloud app.


I'm inclined to agree that it would be hard to force operators to deploy 
Zaqar in order to be able to deploy Heat, and that we should probably be 
cautious for that reason.


That said, from a purely technical point of view it's not a poor choice 
at all - it has *exactly* the semantics we want (unlike AMQP), and at 
least to the extent that the operator wants to offer Zaqar to users 
anyway it completely eliminates a whole backend that they would 
otherwise have to deploy. It's a tragedy that all of OpenStack has not 
been designed to build upon itself in this way and it causes me physical 
pain to know that we're about to perpetuate it.



2.Recover from engine failure (loss of stack timeout, resource action
notification)

Suggestion:

1.Use task queue like celery to host timeouts for both stack and resource.


I believe Celery is more or less a non-starter as an OpenStack
dependency because it uses Kombu directly to talk to the queue, vs.
oslo.messaging which is an abstraction layer over Kombu, Qpid, ZeroMQ
and maybe others in the future. i.e. requiring Celery means that some
users would be forced to install Rabbit for the first time.

One option would be to fork Celery and replace Kombu with oslo.messaging
as its abstraction layer. Good luck getting that maintained though,
since Celery _invented_ Kombu to be its abstraction layer.



A slight side point here: Kombu supports Qpid and ZeroMQ. Oslo.messaging


You're right about Kombu supporting Qpid, it appears they added it. I 
don't see ZeroMQ on the list though:


http://kombu.readthedocs.org/en/latest/userguide/connections.html#transport-comparison


is more about having a unified API than a set of magic backends. It
actually boggles my mind why we didn't just use kombu (cue 20 reactions
with people saying it wasn't EXACTLY right), but I think we're committed


Well, we also have to take into account the fact that Qpid support was 
added only during the last 9 months, whereas oslo.messaging was 
implemented 3 years ago and time travel hasn't been invented yet (for 
any definition of 'yet').



to oslo.messaging now. Anyway, celery would need no such refactor, as
kombu would be able to access the same bus as everything else just fine.


Interesting, so that would make it easier to get Celery added to the 
global requirements, although we'd likely still have headaches to deal 
with around configuration.



2.Poll database for engine 

Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Nandavar, Divakar Padiyar
 Most stuff in OpenStack gets around this by doing synchronous calls across 
 oslo.messaging, where there is an end-to-end ack. We don't want that here  
 though. We'll probably have to make do with having ways to recover after a 
 failure (kick off another update with the same data is always an option). The 
 hard part is that if something dies we don't really want to wait until the 
 stack timeout to start recovering.

We should be able to address this in convergence without having to wait for the 
stack timeout.  This scenario would be similar to initiating a stack update 
while another large stack update is still in progress.  We are looking into 
addressing this scenario.

Thanks,
Divakar

-Original Message-
From: Zane Bitter [mailto:zbit...@redhat.com] 
Sent: Thursday, November 13, 2014 11:26 PM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

On 13/11/14 09:58, Clint Byrum wrote:
 Excerpts from Zane Bitter's message of 2014-11-13 05:54:03 -0800:
 On 13/11/14 03:29, Murugan, Visnusaran wrote:
 Hi all,

 Convergence-POC distributes stack operations by sending resource 
 actions over RPC for any heat-engine to execute. Entire stack 
 lifecycle will be controlled by worker/observer notifications. This 
 distributed model has its own advantages and disadvantages.

 Any stack operation has a timeout and a single engine will be 
 responsible for it. If that engine goes down, timeout is lost along 
 with it. So a traditional way is for other engines to recreate 
 timeout from scratch. Also a missed resource action notification 
 will be detected only when stack operation timeout happens.

 To overcome this, we will need the following capability:

 1.Resource timeout (can be used for retry)

 I don't believe this is strictly needed for phase 1 (essentially we 
 don't have it now, so nothing gets worse).


 We do have a stack timeout, and it stands to reason that we won't have 
 a single box with a timeout greenthread after this, so a strategy is 
 needed.

Right, that was 2, but I was talking specifically about the resource retry. I 
think we agree on both points.

 For phase 2, yes, we'll want it. One thing we haven't discussed much 
 is that if we used Zaqar for this then the observer could claim a 
 message but not acknowledge it until it had processed it, so we could 
 have guaranteed delivery.


 Frankly, if oslo.messaging doesn't support reliable delivery then we 
 need to add it.

That is straight-up impossible with AMQP. Either you ack the message and risk 
losing it if the worker dies before processing is complete, or you don't ack 
the message until it's processed and you become a blocker for every other 
worker trying to pull jobs off the queue. It works fine when you have only one 
worker; otherwise not so much. This is the crux of the whole "why isn't Zaqar 
just Rabbit" debate.

Most stuff in OpenStack gets around this by doing synchronous calls across 
oslo.messaging, where there is an end-to-end ack. We don't want that here 
though. We'll probably have to make do with having ways to recover after a 
failure (kick off another update with the same data is always an option). The 
hard part is that if something dies we don't really want to wait until the 
stack timeout to start recovering.



 Zaqar should have nothing to do with this and is, IMO, a poor choice 
 at this stage, though I like the idea of using it in the future so 
 that we can make Heat more of an outside-the-cloud app.

I'm inclined to agree that it would be hard to force operators to deploy Zaqar 
in order to be able to deploy Heat, and that we should probably be cautious for 
that reason.

That said, from a purely technical point of view it's not a poor choice at all 
- it has *exactly* the semantics we want (unlike AMQP), and at least to the 
extent that the operator wants to offer Zaqar to users anyway it completely 
eliminates a whole backend that they would otherwise have to deploy. It's a 
tragedy that all of OpenStack has not been designed to build upon itself in 
this way and it causes me physical pain to know that we're about to perpetuate 
it.

 2.Recover from engine failure (loss of stack timeout, resource 
 action
 notification)

 Suggestion:

 1.Use task queue like celery to host timeouts for both stack and resource.

 I believe Celery is more or less a non-starter as an OpenStack 
 dependency because it uses Kombu directly to talk to the queue, vs.
 oslo.messaging which is an abstraction layer over Kombu, Qpid, ZeroMQ 
 and maybe others in the future. i.e. requiring Celery means that some 
 users would be forced to install Rabbit for the first time.

 One option would be to fork Celery and replace Kombu with 
 oslo.messaging as its abstraction layer. Good luck getting that 
 maintained though, since Celery _invented_ Kombu to be its abstraction 
 layer.


 A slight side point here: Kombu supports Qpid and ZeroMQ. 
 Oslo.messaging

You're right about

Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Clint Byrum
Excerpts from Zane Bitter's message of 2014-11-13 09:55:43 -0800:
 On 13/11/14 09:58, Clint Byrum wrote:
  Excerpts from Zane Bitter's message of 2014-11-13 05:54:03 -0800:
  On 13/11/14 03:29, Murugan, Visnusaran wrote:
  Hi all,
 
  Convergence-POC distributes stack operations by sending resource actions
  over RPC for any heat-engine to execute. Entire stack lifecycle will be
  controlled by worker/observer notifications. This distributed model has
  its own advantages and disadvantages.
 
  Any stack operation has a timeout and a single engine will be
  responsible for it. If that engine goes down, timeout is lost along with
  it. So a traditional way is for other engines to recreate timeout from
  scratch. Also a missed resource action notification will be detected
  only when stack operation timeout happens.
 
  To overcome this, we will need the following capability:
 
  1.Resource timeout (can be used for retry)
 
  I don't believe this is strictly needed for phase 1 (essentially we
  don't have it now, so nothing gets worse).
 
 
  We do have a stack timeout, and it stands to reason that we won't have a
  single box with a timeout greenthread after this, so a strategy is
  needed.
 
 Right, that was 2, but I was talking specifically about the resource 
 retry. I think we agree on both points.
 
  For phase 2, yes, we'll want it. One thing we haven't discussed much is
  that if we used Zaqar for this then the observer could claim a message
  but not acknowledge it until it had processed it, so we could have
  guaranteed delivery.
 
 
  Frankly, if oslo.messaging doesn't support reliable delivery then we
  need to add it.
 
 That is straight-up impossible with AMQP. Either you ack the message and 
 risk losing it if the worker dies before processing is complete, or you 
 don't ack the message until it's processed and you become a blocker for 
 every other worker trying to pull jobs off the queue. It works fine when 
 you have only one worker; otherwise not so much. This is the crux of the 
 whole "why isn't Zaqar just Rabbit" debate.
 

I'm not sure we have the same understanding of AMQP, so hopefully we can
clarify here. This stackoverflow answer echoes my understanding:

http://stackoverflow.com/questions/17841843/rabbitmq-does-one-consumer-block-the-other-consumers-of-the-same-queue

Not ack'ing just means they might get retransmitted if we never ack. It
doesn't block other consumers. And as the link above quotes from the
AMQP spec, when there are multiple consumers, FIFO is not guaranteed.
Other consumers get other messages.

So just add the ability for a consumer to read, work, ack to
oslo.messaging, and this is mostly handled via AMQP. Of course that
also likely means no zeromq for Heat without accepting that messages
may be lost if workers die.

Basically we need to add something that is not RPC but instead
jobqueue that mimics this:

http://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo/messaging/rpc/dispatcher.py#n131

I've always been suspicious of this bit of code, as it basically means
that if anything fails between that call, and the one below it, we have
lost contact, but as long as clients are written to re-send when there
is a lack of reply, there shouldn't be a problem. But, for a job queue,
there is no reply, and so the worker would dispatch, and then
acknowledge after the dispatched call had returned (including having
completed the step where new messages are added to the queue for any
newly-possible children).
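
(For what it's worth, the consume side of that read/work/ack pattern is
already expressible with plain Kombu today. A rough, untested sketch - not
an oslo.messaging API; the 'heat-jobs' queue name and do_work() handler are
made up:)

from kombu import Connection

def do_work(job):
    # Hypothetical handler: run the resource action and enqueue any
    # newly-possible child jobs *before* we ack.
    pass

with Connection('amqp://guest:guest@localhost//') as conn:
    queue = conn.SimpleQueue('heat-jobs')
    message = queue.get(block=True, timeout=10)
    do_work(message.payload)
    # Ack only now; if this worker had died first, the broker would
    # redeliver the message to another consumer instead.
    message.ack()
    queue.close()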

Just to be clear, I believe what Zaqar adds is the ability to peek at
a specific message ID and not affect it in the queue, which is entirely
different than ACK'ing the ones you've already received in your session.
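
(And the claim-but-don't-ack flow Zane described maps onto Zaqar roughly
like this - written from memory of python-zaqarclient, so treat the exact
module path and signatures as approximate:)

from zaqarclient.queues.v1 import client

cli = client.Client('http://localhost:8888', version=1.1)
queue = cli.queue('heat-jobs')

# A claim leases messages to this worker; nobody else sees them while
# the claim is live, but they reappear on the queue if we die and the
# claim's TTL expires.
claim = queue.claim(ttl=300, grace=60)
for message in claim:
    do_work(message.body)  # hypothetical handler
    message.delete()       # the 'ack': only after processing succeeds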

 Most stuff in OpenStack gets around this by doing synchronous calls 
 across oslo.messaging, where there is an end-to-end ack. We don't want 
 that here though. We'll probably have to make do with having ways to 
 recover after a failure (kick off another update with the same data is 
 always an option). The hard part is that if something dies we don't 
 really want to wait until the stack timeout to start recovering.


I fully agree. Josh's point about using a coordination service like
Zookeeper to maintain liveness is an interesting one here. If we just
make sure that all the workers that have claimed work off the queue are
alive, that should be sufficient to prevent a hanging stack situation
like you describe above.

  Zaqar should have nothing to do with this and is, IMO, a
  poor choice at this stage, though I like the idea of using it in the
  future so that we can make Heat more of an outside-the-cloud app.
 
 I'm inclined to agree that it would be hard to force operators to deploy 
 Zaqar in order to be able to deploy Heat, and that we should probably be 
 cautious for that reason.
 
 That said, from a purely technical point of view it's not a poor choice 
 at all - it has *exactly* the semantics we want (unlike AMQP), and at 
 least to the 


Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Joshua Harlow
On Nov 13, 2014, at 7:10 AM, Clint Byrum cl...@fewbar.com wrote:

 Excerpts from Joshua Harlow's message of 2014-11-13 00:45:07 -0800:
 A question;
 
 How is using something like celery in heat vs taskflow in heat (or at least 
 concept [1]) 'too many code change'.
 
 Both seem like change of similar levels ;-)
 
 
 I've tried a few times to dive into refactoring some things to use
 TaskFlow at a shallow level, and have always gotten confused and
 frustrated.
 
 The amount of lines that are changed probably is the same. But the
 massive shift in thinking is not an easy one to make. It may be worth some
 thinking on providing a shorter bridge to TaskFlow adoption, because I'm
 a huge fan of the idea and would _start_ something with it in a heartbeat,
 but refactoring things to use it feels really weird to me.

I wonder how I can make that better...

Were the concepts that new/different? Maybe I just have more of a functional 
programming background and the way taskflow gets you to create tasks that are 
later executed, order them ahead of time, and then *later* run them is still a 
foreign concept for folks that have not done things with non-procedural 
languages. What were the confusion points, how may I help address them? More 
docs maybe, more examples, something else?
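
(To make the declare-now, run-later shape concrete, a tiny sketch - the
task names are made up, but this is the basic taskflow usage:)

from taskflow import engines, task
from taskflow.patterns import linear_flow

class CreateResource(task.Task):
    def execute(self):
        print("creating resource")  # a real task would call the plugin

class NotifyObservers(task.Task):
    def execute(self):
        print("notifying observers")

# Declare and order the work ahead of time...
flow = linear_flow.Flow('stack-update')
flow.add(CreateResource('create'), NotifyObservers('notify'))

# ...and only *later* hand it to an engine to actually run it.
engines.run(flow)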

I would agree that the jobboard[0] concept is different than the other parts of 
taskflow, but it could be useful here:

Basically at its core it's an application of zookeeper where 'jobs' are posted to 
a directory (using sequenced nodes in zookeeper, so that ordering is retained). 
Entities then acquire ephemeral locks on those 'jobs' (these locks will be 
released if the owner process disconnects, or fails...) and then work on the 
contents of that job (where contents can be pretty much arbitrary). This 
creates a highly available job queue (queue-like due to the node 
sequencing[1]), and it sounds pretty similar to what zaqar could provide in 
theory (except the zookeeper one is proven, battle-hardened, works and 
exists...). But we should of course continue being scared of zookeeper, because 
u know, who wants to use a tool where it would fit, haha (this is a joke).

[0] 
https://github.com/openstack/taskflow/blob/master/taskflow/jobs/jobboard.py#L25 

[1] 
http://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Sequence+Nodes+--+Unique+Naming
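
(A bare-bones sketch of that claim pattern straight against kazoo - the
paths and payload are made up and error handling is elided, just to show
the shape:)

import json

from kazoo.client import KazooClient
from kazoo.exceptions import NodeExistsError

zk = KazooClient(hosts='127.0.0.1:2181')
zk.start()

# Producer: sequenced nodes preserve posting order.
zk.create('/jobs/job-', json.dumps({'action': 'stuff'}).encode(),
          sequence=True, makepath=True)

# Consumer: claim a job with an ephemeral lock; if this process dies or
# disconnects, the lock vanishes and another worker can claim the job.
for name in sorted(zk.get_children('/jobs')):
    try:
        zk.create('/locks/' + name, ephemeral=True, makepath=True)
    except NodeExistsError:
        continue  # someone else owns it
    data, _ = zk.get('/jobs/' + name)
    do_work(json.loads(data.decode()))  # hypothetical handler
    zk.delete('/jobs/' + name)
    zk.delete('/locks/' + name)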

 
 What was your metric for determining the code change either would have (out 
 of curiosity)?
 
 Perhaps u should look at [2], although I'm unclear on what the desired 
 functionality is here.
 
 Do u want the single engine to transfer its work to another engine when it 
  'goes down'? If so then the jobboard model + zookeeper inherently does this.
 
 Or maybe u want something else? I'm probably confused because u seem to be 
 asking for resource timeouts + recover from engine failure (which seems like 
 a liveness issue and not a resource timeout one), those 2 things seem 
 separable.
 
 
 I agree with you on this. It is definitely a liveness problem. The
 resource timeout isn't something I've seen discussed before. We do have
 a stack timeout, and we need to keep on honoring that, but we can do
 that with a job that sleeps for the stack timeout if we have a liveness
 guarantee that will resurrect the job (with the sleep shortened by the
 time since stack-update-time) somewhere else if the original engine
 can't complete the job.
 
 [1] http://docs.openstack.org/developer/taskflow/jobs.html
 
 [2] 
 http://docs.openstack.org/developer/taskflow/examples.html#jobboard-producer-consumer-simple
 
 On Nov 13, 2014, at 12:29 AM, Murugan, Visnusaran 
 visnusaran.muru...@hp.com wrote:
 
 Hi all,
 
 Convergence-POC distributes stack operations by sending resource actions 
 over RPC for any heat-engine to execute. Entire stack lifecycle will be 
 controlled by worker/observer notifications. This distributed model has its 
 own advantages and disadvantages.
 
 Any stack operation has a timeout and a single engine will be responsible 
 for it. If that engine goes down, timeout is lost along with it. So a 
 traditional way is for other engines to recreate timeout from scratch. Also 
 a missed resource action notification will be detected only when stack 
 operation timeout happens.
 
 To overcome this, we will need the following capability:
 1.   Resource timeout (can be used for retry)
 2.   Recover from engine failure (loss of stack timeout, resource 
 action notification)
 
 
 Suggestion:
 1.   Use task queue like celery to host timeouts for both stack and 
 resource.
  2.   Poll database for engine failures and restart timers / retrigger 
  resource retry (IMHO: this would be the traditional approach, and weighs heavy)
 3.   Migrate heat to use TaskFlow. (Too many code change)
 
  I am not suggesting we use TaskFlow. Using celery will have very minimal 
  code change. (decorate appropriate functions)
 
 
 Your thoughts.
 
 -Vishnu
 IRC: ckmvishnu
 

Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Clint Byrum
Excerpts from Joshua Harlow's message of 2014-11-13 14:01:14 -0800:
 On Nov 13, 2014, at 7:10 AM, Clint Byrum cl...@fewbar.com wrote:
 
  Excerpts from Joshua Harlow's message of 2014-11-13 00:45:07 -0800:
  A question;
  
  How is using something like celery in heat vs taskflow in heat (or at 
  least concept [1]) 'too many code change'.
  
  Both seem like change of similar levels ;-)
  
  
  I've tried a few times to dive into refactoring some things to use
  TaskFlow at a shallow level, and have always gotten confused and
  frustrated.
  
  The amount of lines that are changed probably is the same. But the
  massive shift in thinking is not an easy one to make. It may be worth some
  thinking on providing a shorter bridge to TaskFlow adoption, because I'm
  a huge fan of the idea and would _start_ something with it in a heartbeat,
  but refactoring things to use it feels really weird to me.
 
 I wonder how I can make that better...
 
 Were the concepts that new/different? Maybe I just have more of a functional 
 programming background and the way taskflow gets you to create tasks that are 
 later executed, order them ahead of time, and then *later* run them is still 
 a foreign concept for folks that have not done things with non-procedural 
 languages. What were the confusion points, how may I help address them? More 
 docs maybe, more examples, something else?

My feeling is that it is hard to let go of the language constructs that
_seem_ to solve the problems TaskFlow does, even though in fact they are
the problem because they're using the stack for control-flow where we
want that control-flow to yield to TaskFlow.

I also kind of feel like the Twisted folks answered a similar question
with inline callbacks and made things easier but more complex in
doing so. If I had a good answer I would give it to you though. :)

 
 I would agree that the jobboard[0] concept is different than the other parts 
 of taskflow, but it could be useful here:
 
 Basically at its core it's an application of zookeeper where 'jobs' are posted 
 to a directory (using sequenced nodes in zookeeper, so that ordering is 
 retained). Entities then acquire ephemeral locks on those 'jobs' (these locks 
 will be released if the owner process disconnects, or fails...) and then work 
 on the contents of that job (where contents can be pretty much arbitrary). 
 This creates a highly available job queue (queue-like due to the node 
 sequencing[1]), and it sounds pretty similar to what zaqar could provide in 
 theory (except the zookeeper one is proven, battle-hardened, works and 
 exists...). But we should of course continue being scared of zookeeper, 
 because u know, who wants to use a tool where it would fit, haha (this is a 
 joke).
 

So ordering is a distraction from the task at hand. But the locks that
indicate liveness of the workers is very interesting to me. Since we
don't actually have requirements of ordering on the front-end of the task
(we do on the completion of certain tasks, but we can use a DB for that),
I wonder if we can just get the same effect with a durable queue that uses
a reliable messaging pattern where we don't ack until we're done. That
would achieve the goal of liveness.
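
(Roughly, with a durable Rabbit queue and manual acks - a sketch with raw
pika; the queue name and do_work() handler are hypothetical:)

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
# Durable queue, so jobs survive a broker restart.
channel.queue_declare(queue='heat-jobs', durable=True)
channel.basic_qos(prefetch_count=1)  # at most one unacked job per worker

def on_job(ch, method, properties, body):
    do_work(body)  # hypothetical; includes enqueueing any child jobs
    # Ack only when done; if the worker dies first, Rabbit redelivers the
    # job to a live consumer - which is exactly the liveness we're after.
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue='heat-jobs', on_message_callback=on_job)
channel.start_consuming()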

 [0] 
 https://github.com/openstack/taskflow/blob/master/taskflow/jobs/jobboard.py#L25
  
 
 [1] 
 http://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Sequence+Nodes+--+Unique+Naming
 
  
  What was your metric for determining the code change either would have 
  (out of curiosity)?
  
  Perhaps u should look at [2], although I'm unclear on what the desired 
  functionality is here.
  
  Do u want the single engine to transfer its work to another engine when it 
  'goes down'? If so then the jobboard model + zookeeper inherently does this.
  
  Or maybe u want something else? I'm probably confused because u seem to be 
  asking for resource timeouts + recover from engine failure (which seems 
  like a liveness issue and not a resource timeout one), those 2 things seem 
  separable.
  
  
  I agree with you on this. It is definitely a liveness problem. The
  resource timeout isn't something I've seen discussed before. We do have
  a stack timeout, and we need to keep on honoring that, but we can do
  that with a job that sleeps for the stack timeout if we have a liveness
  guarantee that will resurrect the job (with the sleep shortened by the
  time since stack-update-time) somewhere else if the original engine
  can't complete the job.
  
  [1] http://docs.openstack.org/developer/taskflow/jobs.html
  
  [2] 
  http://docs.openstack.org/developer/taskflow/examples.html#jobboard-producer-consumer-simple
  
  On Nov 13, 2014, at 12:29 AM, Murugan, Visnusaran 
  visnusaran.muru...@hp.com wrote:
  
  Hi all,
  
  Convergence-POC distributes stack operations by sending resource actions 
  over RPC for any heat-engine to execute. Entire stack lifecycle will be 
  controlled by worker/observer notifications. This distributed model has 
  

Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

2014-11-13 Thread Joshua Harlow
On Nov 13, 2014, at 4:08 PM, Clint Byrum cl...@fewbar.com wrote:

 Excerpts from Joshua Harlow's message of 2014-11-13 14:01:14 -0800:
 On Nov 13, 2014, at 7:10 AM, Clint Byrum cl...@fewbar.com wrote:
 
 Excerpts from Joshua Harlow's message of 2014-11-13 00:45:07 -0800:
 A question;
 
 How is using something like celery in heat vs taskflow in heat (or at 
  least concept [1]) 'too many code change'.
 
 Both seem like change of similar levels ;-)
 
 
 I've tried a few times to dive into refactoring some things to use
 TaskFlow at a shallow level, and have always gotten confused and
 frustrated.
 
 The amount of lines that are changed probably is the same. But the
 massive shift in thinking is not an easy one to make. It may be worth some
 thinking on providing a shorter bridge to TaskFlow adoption, because I'm
 a huge fan of the idea and would _start_ something with it in a heartbeat,
 but refactoring things to use it feels really weird to me.
 
 I wonder how I can make that better...
 
 Were the concepts that new/different? Maybe I just have more of a 
 functional programming background and the way taskflow gets you to create 
 tasks that are later executed, order them ahead of time, and then *later* 
 run them is still a foreign concept for folks that have not done things with 
 non-procedural languages. What were the confusion points, how may I help 
 address them? More docs maybe, more examples, something else?
 
 My feeling is that it is hard to let go of the language constructs that
 _seem_ to solve the problems TaskFlow does, even though in fact they are
 the problem because they're using the stack for control-flow where we
 want that control-flow to yield to TaskFlow.
 

U know u want to let go!

 I also kind of feel like the Twisted folks answered a similar question
 with inline callbacks and made things easier but more complex in
 doing so. If I had a good answer I would give it to you though. :)
 
 
 I would agree that the jobboard[0] concept is different than the other parts 
 of taskflow, but it could be useful here:
 
 Basically at its core it's an application of zookeeper where 'jobs' are posted 
 to a directory (using sequenced nodes in zookeeper, so that ordering is 
 retained). Entities then acquire ephemeral locks on those 'jobs' (these 
 locks will be released if the owner process disconnects, or fails...) and 
 then work on the contents of that job (where contents can be pretty much 
 arbitrary). This creates a highly available job queue (queue-like due to the 
 node sequencing[1]), and it sounds pretty similar to what zaqar could 
 provide in theory (except the zookeeper one is proven, battle-hardened, 
 works and exists...). But we should of course continue being scared of 
 zookeeper, because u know, who wants to use a tool where it would fit, haha 
 (this is a joke).
 
 
 So ordering is a distraction from the task at hand. But the locks that
 indicate liveness of the workers is very interesting to me. Since we
 don't actually have requirements of ordering on the front-end of the task
 (we do on the completion of certain tasks, but we can use a DB for that),
 I wonder if we can just get the same effect with a durable queue that uses
 a reliable messaging pattern where we don't ack until we're done. That
 would achieve the goal of liveness.
 

Possibly, it depends on what the message broker is doing with the message when 
the message hasn't been acked. With zookeeper being used as a queue of jobs, 
the job actually has an owner (the thing with the ephemeral lock on the job) so 
the job won't get 'taken over' by someone else unless that ephemeral lock drops 
off (due to owner dying or disconnecting...); this is where I'm not sure what 
message brokers do (varies by message broker?).

A little example taskflow program that I made that u can also run (replace my 
zookeeper server with your own).

http://paste.ubuntu.com/8995861/

You can then run like:

$ python jb.py  'producer'

And for a worker (start many of these if u want),

$ python jb.py 'c1'

Then you can see the work being produced/consumed, and u can ctrl-c 'c1' and 
then another worker will take over the work...

Something like the following should be output (by workers):

$ python jb.py 'c2'
INFO:kazoo.client:Connecting to buildingbuild.corp.yahoo.com:2181
INFO:kazoo.client:Zookeeper connection established, state: CONNECTED
Waiting for jobs to appear...
Running {u'action': u'stuff', u'id': 1}
Waiting for jobs to appear...
Running {u'action': u'stuff', u'id': 3}

For producers:

$ python jb.py  'producer'
INFO:kazoo.client:Connecting to buildingbuild.corp.yahoo.com:2181
INFO:kazoo.client:Zookeeper connection established, state: CONNECTED
Posting work item {'action': 'stuff', 'id': 0}
Posting work item {'action': 'stuff', 'id': 1}
Posting work item {'action': 'stuff', 'id': 2}
Posting work item {'action': 'stuff', 'id': 3}
Posting work item {'action': 'stuff', 'id': 4}
Posting work item {'action': 'stuff', 'id': 5}

Now you 
