Excerpts from Zane Bitter's message of 2014-11-13 05:54:03 -0800:
> On 13/11/14 03:29, Murugan, Visnusaran wrote:
> > Hi all,
> >
> > Convergence-POC distributes stack operations by sending resource actions
> > over RPC for any heat-engine to execute. The entire stack lifecycle will
> > be controlled by worker/observer notifications. This distributed model
> > has its own advantages and disadvantages.
> >
> > Any stack operation has a timeout, and a single engine will be
> > responsible for it. If that engine goes down, the timeout is lost along
> > with it. So a traditional way is for other engines to recreate the
> > timeout from scratch. Also, a missed resource action notification will
> > be detected only when the stack operation timeout happens.
> >
> > To overcome this, we will need the following capabilities:
> >
> > 1. Resource timeout (can be used for retry)
>
> I don't believe this is strictly needed for phase 1 (essentially we
> don't have it now, so nothing gets worse).
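To make the first capability concrete, here is a minimal sketch (all names hypothetical, not Heat code) of a per-resource timeout that doubles as a retry boundary, so a hung attempt fails fast instead of silently consuming the whole stack's timeout budget:

```python
import time


def run_with_timeout(action, timeout=60.0, max_retries=3):
    """Hypothetical sketch: retry a resource action, each attempt
    bounded by its own deadline, instead of relying on the single
    stack-wide timeout to notice a stuck resource."""
    for attempt in range(1, max_retries + 1):
        deadline = time.time() + timeout
        try:
            return action(deadline)
        except TimeoutError:
            if attempt == max_retries:
                raise  # exhausted retries; surface the failure


def create_volume(deadline):
    # A real action would poll the backend until done or past deadline;
    # here we just succeed immediately for illustration.
    if time.time() > deadline:
        raise TimeoutError('resource action timed out')
    return 'CREATE_COMPLETE'


print(run_with_timeout(create_volume))
```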
We do have a stack timeout, and it stands to reason that we won't have a
single box with a timeout greenthread after this, so a strategy is needed.

> For phase 2, yes, we'll want it. One thing we haven't discussed much is
> that if we used Zaqar for this then the observer could claim a message
> but not acknowledge it until it had processed it, so we could have
> guaranteed delivery.

Frankly, if oslo.messaging doesn't support reliable delivery then we need
to add it. Zaqar should have nothing to do with this and is, IMO, a poor
choice at this stage, though I like the idea of using it in the future so
that we can make Heat more of an outside-the-cloud app.

> > 2. Recover from engine failure (loss of stack timeout, resource action
> > notification)
> >
> > Suggestion:
> >
> > 1. Use a task queue like Celery to host timeouts for both stack and
> > resource.
>
> I believe Celery is more or less a non-starter as an OpenStack
> dependency because it uses Kombu directly to talk to the queue, vs.
> oslo.messaging which is an abstraction layer over Kombu, Qpid, ZeroMQ
> and maybe others in the future. i.e. requiring Celery means that some
> users would be forced to install Rabbit for the first time.
>
> One option would be to fork Celery and replace Kombu with oslo.messaging
> as its abstraction layer. Good luck getting that maintained though,
> since Celery _invented_ Kombu to be its abstraction layer.

A slight side point here: Kombu supports Qpid and ZeroMQ. oslo.messaging
is more about having a unified API than a set of magic backends. It
actually boggles my mind why we didn't just use Kombu (cue 20 reactions
with people saying it wasn't EXACTLY right), but I think we're committed
to oslo.messaging now. Anyway, Celery would need no such refactor, as
Kombu would be able to access the same bus as everything else just fine.
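For what it's worth, the claim-then-acknowledge pattern being described is easy to sketch in the abstract. This toy in-memory queue (not the Zaqar or oslo.messaging API, just an illustration of the semantics) shows why it gives at-least-once delivery: a claimed message is invisible to other consumers until its claim expires, and only an explicit ack deletes it, so an observer that dies mid-processing has its message redelivered.

```python
import time


class ClaimQueue:
    """Toy queue illustrating claim/acknowledge semantics. Hypothetical
    API; real systems add durability, multiple consumers, etc."""

    def __init__(self, claim_ttl=30.0):
        self.claim_ttl = claim_ttl
        self._messages = {}   # msg_id -> (body, claimed_until)
        self._next_id = 0

    def post(self, body):
        self._messages[self._next_id] = (body, 0.0)
        self._next_id += 1

    def claim(self):
        """Return (msg_id, body) for one unclaimed or claim-expired
        message, marking it claimed; None if nothing is available."""
        now = time.time()
        for msg_id, (body, claimed_until) in self._messages.items():
            if claimed_until <= now:
                self._messages[msg_id] = (body, now + self.claim_ttl)
                return msg_id, body
        return None

    def ack(self, msg_id):
        """Delete a message only once it has actually been processed."""
        self._messages.pop(msg_id, None)


q = ClaimQueue(claim_ttl=30.0)
q.post({'resource': 'my_server', 'event': 'CREATE_COMPLETE'})

msg_id, body = q.claim()   # observer takes the message...
assert q.claim() is None   # ...and nobody else can claim it meanwhile

# process the notification, then acknowledge:
q.ack(msg_id)
assert q.claim() is None   # message is gone for good
```

If the observer crashes before `ack`, the claim TTL elapses and another observer's `claim` picks the message up again; that re-delivery is the "guaranteed delivery" property under discussion.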
> > 2. Poll the database for engine failures and restart timers/retrigger
> > resource retry (IMHO this would be traditional and weighs heavy)
> >
> > 3. Migrate Heat to use TaskFlow. (Too many code changes)
>
> If it's just handling timed triggers (maybe this is closer to #2) and
> not migrating the whole code base, then I don't see why it would be a
> big change (or even a change at all - it's basically new functionality).
> I'm not sure if TaskFlow has something like this already. If not we
> could also look at what Mistral is doing with timed tasks and see if we
> could spin some of it out into an Oslo library.

I feel like it boils down to something running periodically that checks
for scheduled tasks that are due to run but have not run yet.

I wonder if we can actually look at Ironic for how they do this, because
Ironic polls the power state of machines constantly, and uses a hash ring
to make sure only one conductor is polling any one machine at a time. If
we broke stacks up into a hash ring like that for the purpose of
singleton tasks like timeout checking, that might work out nicely.

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
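A minimal sketch of the hash-ring idea (hypothetical names, not Ironic's implementation): every engine builds the same ring from the list of live engines, so they all agree, without coordination, on which single engine owns the timeout checks for a given stack.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring in the style described above: each engine
    gets many virtual points on the ring, and a stack is owned by the
    engine whose point follows the stack's hash. Illustration only."""

    def __init__(self, engines, replicas=100):
        self._ring = []   # sorted list of (point, engine)
        for engine in engines:
            for i in range(replicas):
                self._ring.append((self._hash('%s-%d' % (engine, i)), engine))
        self._ring.sort()
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def owner(self, stack_id):
        """The engine responsible for singleton tasks (e.g. timeout
        checks) for this stack."""
        idx = bisect.bisect(self._points, self._hash(stack_id))
        return self._ring[idx % len(self._ring)][1]


engines = ['engine-1', 'engine-2', 'engine-3']
ring = HashRing(engines)

# The periodic singleton-task loop each engine would run, conceptually:
my_name = 'engine-1'
for stack_id in ['stack-abc', 'stack-def', 'stack-123']:
    if ring.owner(stack_id) == my_name:
        pass  # check this stack's timers; every other engine skips it
```

The nice property is that when an engine dies and the ring is rebuilt without it, only the stacks that engine owned get reassigned; the rest keep their owners, so timers don't all migrate at once.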