Gotcha. So the way this might work is, for example, when a run_instance fails on a compute node, that node would publish a "run_instance for uuid=<blah> failed" event. There would be a subscriber associated with the scheduler listening for such events; when it receives one, it would check the capacity table and update it to reflect the failure. Does that sound about right?
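To make sure we're picturing the same thing, here is a rough sketch of that flow in Python. Everything in it is hypothetical: EventBus, CapacityTable, the "run_instance.failed" event name, and the payload layout are made up for illustration and are not the actual nova scheduler or notifier code. It only shows the shape of the idea: the scheduler claims a slot up front, and a subscriber gives the slot back when a failure event arrives.

```python
from collections import defaultdict


class EventBus:
    """Toy in-process publish/subscribe bus (stand-in for a real message queue)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Deliver the payload to every handler registered for this event type.
        for handler in self._subscribers[event_type]:
            handler(payload)


class CapacityTable:
    """Hypothetical capacity table: free slots per host plus outstanding claims."""

    def __init__(self, free_slots):
        self.free_slots = dict(free_slots)
        self.claims = {}  # instance uuid -> host the slot was claimed on

    def claim(self, instance_uuid, host):
        # The scheduler claims the slot as soon as it selects the host.
        if self.free_slots[host] <= 0:
            raise ValueError("no capacity left on %s" % host)
        self.free_slots[host] -= 1
        self.claims[instance_uuid] = host

    def release(self, instance_uuid):
        # Give the slot back; a repeated or unknown uuid is ignored.
        host = self.claims.pop(instance_uuid, None)
        if host is None:
            return False
        self.free_slots[host] += 1
        return True


def make_failure_handler(capacity):
    """Build the scheduler-side subscriber for failure events."""

    def on_run_instance_failed(payload):
        capacity.release(payload["uuid"])

    return on_run_instance_failed


bus = EventBus()
capacity = CapacityTable({"compute-1": 2})
bus.subscribe("run_instance.failed", make_failure_handler(capacity))

# Scheduler picks a host and claims capacity immediately.
capacity.claim("uuid-1234", "compute-1")

# Compute node reports the failure; the subscriber reclaims the slot.
bus.publish("run_instance.failed", {"uuid": "uuid-1234"})
```

In this sketch, release ignores a uuid it has no claim for, so a duplicated failure event would not return the same slot twice.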
"Sandy Walsh" <[email protected]> said: > Sure, the problem I'm immediately facing is reclaiming resources from the > Capacity > table when something fails. (we claim them immediately in the scheduler when > the > host is selected to lessen the latency). > > The other situation is Orchestration needs it for retries, rescheduling, > rollbacks > and cross-service timeouts. > > I think it's needed core functionality. I like Fail-Fast for the same > reasons, but > it can get in the way. > > -S > > ________________________________________ > From: [email protected] > [[email protected]] on behalf of > Mark Washenberger [[email protected]] > Sent: Wednesday, December 07, 2011 11:53 AM > To: [email protected] > Subject: Re: [Openstack] [Orchestration] Handling error events ... explicit > > vs. implicit > > Can you talk a little more about how you want to apply this failure > notification? > That is, what is the case where you are going to use the information that an > operation failed? In my head I have an idea of getting code simplicity > dividends > from an "everything succeeds" approach to some of our operations. But it > might not > really apply to the case you're working on. > _______________________________________________ Mailing list: https://launchpad.net/~openstack Post to : [email protected] Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp

