Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

Murugan, Visnusaran Thu, 13 Nov 2014 05:01:43 -0800

A parallel worker was what I initially had in mind. But what happens if the
engine hosting that worker goes down?

-Vishnu

From: Angus Salkeld [mailto:asalk...@mirantis.com]
Sent: Thursday, November 13, 2014 5:22 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Heat] Using Job Queues for timeout ops

On Thu, Nov 13, 2014 at 6:29 PM, Murugan, Visnusaran 
<visnusaran.muru...@hp.com> wrote:
Hi all,

Convergence-POC distributes stack operations by sending resource actions over 
RPC for any heat-engine to execute. The entire stack lifecycle will be 
controlled by worker/observer notifications. This distributed model has its own 
advantages and disadvantages.

Any stack operation has a timeout, and a single engine is responsible for it. 
If that engine goes down, the timeout is lost along with it. The traditional 
approach is for another engine to recreate the timeout from scratch. Also, a 
missed resource action notification will be detected only when the stack 
operation times out.

To overcome this, we will need the following capabilities:

1. Resource timeout (can be used for retry)

We will shortly have a worker job. Can't we have a job that just sleeps and is 
started in parallel with the job that is doing the work? When it reaches the 
end of the sleep, it runs a check.

2. Recover from engine failure (loss of stack timeout, resource action 
notification)

My suggestion above could catch failures as long as it was run in a different 
process.
-Angus
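
Angus's suggestion of a sleeping job that runs alongside the worker can be
sketched as below. This is only an illustration, not Heat code: the names
worker_job, watchdog_job, run_with_watchdog and the in-memory resource_state
dict are hypothetical, and threads stand in for what would need to be separate
processes (or separate engines) for the check to survive a crash of the
worker's host.

```python
import threading
import time

# Hypothetical stand-in for resource state that a real engine would keep in
# the database so that other processes could see it.
resource_state = {}
state_lock = threading.Lock()


def worker_job(resource_id, work_seconds):
    """Pretend to perform the resource action, then record completion."""
    time.sleep(work_seconds)
    with state_lock:
        # Do not overwrite the state if the watchdog already timed us out.
        if resource_state.get(resource_id) == "IN_PROGRESS":
            resource_state[resource_id] = "COMPLETE"


def watchdog_job(resource_id, timeout_seconds):
    """Sleep for the timeout, then check whether the work has finished."""
    time.sleep(timeout_seconds)
    with state_lock:
        if resource_state.get(resource_id) == "IN_PROGRESS":
            # A real engine could trigger a retry here instead of failing.
            resource_state[resource_id] = "TIMED_OUT"


def run_with_watchdog(resource_id, work_seconds, timeout_seconds):
    """Start the worker and the sleeping watchdog in parallel."""
    with state_lock:
        resource_state[resource_id] = "IN_PROGRESS"
    jobs = [
        threading.Thread(target=worker_job, args=(resource_id, work_seconds)),
        threading.Thread(target=watchdog_job, args=(resource_id, timeout_seconds)),
    ]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
    with state_lock:
        return resource_state[resource_id]


if __name__ == "__main__":
    print(run_with_watchdog("fast-resource", 0.01, 0.2))
    print(run_with_watchdog("slow-resource", 0.3, 0.05))
```

The sketch does not answer the follow-up question at the top of this thread:
if the watchdog dies together with its host engine, nothing runs the check.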

Suggestion:

1. Use a task queue like celery to host timeouts for both stacks and 
resources.

2. Poll the database for engine failures and restart timers / retrigger 
resource retries. (IMHO this is the traditional approach and is heavyweight.)

3. Migrate heat to use TaskFlow. (Too many code changes.)

I am not suggesting we use TaskFlow. Using celery would require very minimal 
code changes (decorating the appropriate functions).
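
For comparison, the database-polling option (2) might look roughly like the
sketch below. Everything in it is an assumption made for illustration: the
engine and stack_op tables, the HEARTBEAT_GRACE value, and the function names
are hypothetical and are not Heat's schema or API. The point it tries to show
is that if the absolute deadline is persisted, a surviving engine can resume a
timer with the remaining time rather than recreating it from scratch.

```python
import sqlite3

# Seconds without a heartbeat before an engine is presumed dead (assumed value).
HEARTBEAT_GRACE = 30


def make_db():
    """Create a throwaway in-memory database with the hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE engine (id TEXT PRIMARY KEY, last_heartbeat REAL)")
    db.execute(
        "CREATE TABLE stack_op "
        "(stack_id TEXT PRIMARY KEY, engine_id TEXT, deadline REAL)"
    )
    return db


def find_orphaned_ops(db, now):
    """Return (stack_id, seconds_left) for ops whose engine stopped heartbeating."""
    rows = db.execute(
        "SELECT s.stack_id, s.deadline FROM stack_op s "
        "JOIN engine e ON e.id = s.engine_id "
        "WHERE e.last_heartbeat < ? ORDER BY s.stack_id",
        (now - HEARTBEAT_GRACE,),
    ).fetchall()
    # The deadline is absolute, so the remaining time is simply deadline - now.
    return [(stack_id, max(0.0, deadline - now)) for stack_id, deadline in rows]


def adopt(db, stack_id, new_engine_id):
    """Reassign an orphaned operation to a live engine, which restarts its timer."""
    db.execute(
        "UPDATE stack_op SET engine_id = ? WHERE stack_id = ?",
        (new_engine_id, stack_id),
    )
```

A live engine would call find_orphaned_ops periodically, adopt each result,
and start a local timer for seconds_left. The periodic polling is the cost
referred to above as heavyweight.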

Your thoughts?

-Vishnu
IRC: ckmvishnu

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
