On 13/11/14 06:52, Angus Salkeld wrote:
On Thu, Nov 13, 2014 at 6:29 PM, Murugan, Visnusaran
<[email protected]> wrote:

    Hi all,

    Convergence-POC distributes stack operations by sending resource
    actions over RPC for any heat-engine to execute. The entire stack
    lifecycle will be controlled by worker/observer notifications. This
    distributed model has its own advantages and disadvantages.

    Any stack operation has a timeout, and a single engine is
    responsible for it. If that engine goes down, the timeout is lost
    along with it. The traditional approach is for other engines to
    recreate the timeout from scratch. Also, a missed resource action
    notification will be detected only when the stack operation times
    out.

    To overcome this, we will need the following capabilities:

    1. Resource timeout (can be used for retry)

We will shortly have a worker job. Can't we have a job that just sleeps,
started in parallel with the job that is doing the work? When it gets to
the end of the sleep, it runs a check.
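
To make the idea concrete, here is a minimal in-process sketch in Python. It is not Heat code: `run_with_watchdog`, `work` and `check` are hypothetical names, and a thread stands in for the second job. It shows a sleeper started alongside the work that runs a check only if the work has not finished by the time the sleep ends.

```python
import threading
import time


def run_with_watchdog(work, check, timeout):
    """Run work() while a parallel sleeper waits `timeout` seconds.

    Hypothetical helper: `work` stands in for the resource action and
    `check` for whatever verification the sleeping job would perform.
    """
    done = threading.Event()

    def watchdog():
        # Event.wait() returns False when the timeout elapses before
        # the event is set, i.e. the work is still running.
        if not done.wait(timeout):
            check()

    watcher = threading.Thread(target=watchdog, daemon=True)
    watcher.start()
    try:
        return work()
    finally:
        # Wake the sleeper early so it exits without running the check.
        done.set()
        watcher.join()
```

Because both threads live in one process, this sketch does not cover the case where the process itself dies; for that, the sleeper would have to run somewhere else.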

What if that worker dies too? There's no guarantee that it'd even be a different worker. In fact, there's not even a guarantee that we'd have multiple workers.

BTW Steve Hardy's suggestion, which I have more or less come around to, is that the engines themselves should be the workers in convergence, to save operators deploying two types of processes. (The observers will still be a separate process though, in phase 2.)

    2. Recover from engine failure (loss of stack timeout, resource
    action notification)


My suggestion above could catch failures as long as it was run in a
different process.

-Angus

    Suggestions:

    1. Use a task queue like Celery to host timeouts for both stack and
    resource.

    2. Poll the database for engine failures and restart timers/
    retrigger resource retry. (IMHO: this would be the traditional
    approach and is heavyweight.)

    3. Migrate Heat to use TaskFlow. (Too many code changes.)

    I am not suggesting we use TaskFlow. Using Celery would require
    minimal code changes (decorating the appropriate functions).


    Your thoughts?

    -Vishnu
    IRC: ckmvishnu


    _______________________________________________
    OpenStack-dev mailing list
    [email protected]
    <mailto:[email protected]>
    http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



