Yeah, this is really sticky when it comes to Auto Scaling, because you probably want both an "Oh, we've got more load than expected, scale up a bit" policy and an "OMG! REDDIT FRONT PAGE!" policy.
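A minimal sketch of the delta-policy arithmetic that comes up below (hypothetical illustration only, not Rackspace Auto Scale or Heat code — `apply_policies` and its signature are invented for this example): when every policy is a signed change relative to current capacity, concurrently triggered policies compose without ambiguity.

```python
def apply_policies(current, deltas):
    """Apply a batch of concurrently-triggered delta scaling policies.

    `deltas` are signed adjustments to group size, e.g. +2 for the
    "a bit more load" policy and +5 for the "Reddit front page" policy.
    """
    target = current + sum(deltas)
    return max(target, 0)  # never scale below zero instances


# An n+2 policy and an n+5 policy triggered together become n+7:
print(apply_policies(10, [+2, +5]))  # 17
# An n-2 policy banged up against an n+5 policy is still unambiguous:
print(apply_policies(10, [-2, +5]))  # 13
```

The ambiguity discussed in this thread only appears once signals are no longer simple commutative deltas (e.g. a scaling webhook arriving mid-deploy).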
The right thing is probably to be prepared to execute multiple concurrent signals simultaneously. That's what Rackspace Auto Scale does. And, yes, based on everything I've seen from customers using it in production, signals must not error unless your entire infrastructure is scattering pieces of itself in disorderly heaps all over the floor. Especially in the public cloud space, people just want a webhook URL, and even doing OpenStack auth is too much trouble.

OTOH, I'm pretty sure that I'm fixated on the palatable and concrete. An n+2 policy and an n+5 policy become an n+7 policy when you trigger both. The right answer is "Convergence" :) Even an n-2 policy banged up against an n+5 policy can be explained without much ambiguity. But when we start to talk about concurrently signalling different types of things, this model falls apart. A scaling webhook triggers in the middle of a deploy, for example.

I suspect that the user's interest is properly represented by the "execute in parallel if you can, queue if you can't" logic. Except for the Ouroboros bros. (pun intended) How long until feature creep gives us a chain of events that turns that logic into a nice deadlock? Either signals need to be user-facing only (which breaks the idea that user requests are handled just like proxied user requests from other systems), or all y'all get to flash back to the joys of reference-counted garbage collection.

Yeah. This could use some deep thought. Rest assured, our users will trigger interesting behaviors. :)

________________________________________
From: Clint Byrum [[email protected]]
Sent: Monday, July 07, 2014 11:52 AM
To: openstack-dev
Subject: [openstack-dev] [Heat] [Marconi] Heat and concurrent signal processing needs some deep thought

I just noticed this review: https://review.openstack.org/#/c/90325/

And gave it some real thought. This will likely break any large-scale usage of signals, and I think it breaks user expectations.
Nobody expects to get a failure for a signal. It is one of those things that you fire and forget: "I'm done, deal with it." If we start returning errors, or 409s or 503s, I don't think users are writing their in-instance initialization tooling to retry. I think we need to accept the signal and reliably deliver it.

Does anybody have any good ideas for how to go forward with this? I'd much rather borrow a solution from some other project than try to invent something for Heat. I've added Marconi because I suspect there has already been some thought put into how a user-facing set of tools would send messages.

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
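The "always accept the signal, execute in parallel if you can, queue if you can't" idea from this thread could be sketched roughly like this (a hypothetical illustration, not Heat, Marconi, or Rackspace Auto Scale code — `SignalDispatcher`, `accept`, and the resource keys are all invented names): signals targeting different resources run immediately, while a signal that hits a resource with an action already in flight is queued rather than answered with a 409 or 503.

```python
import threading


class SignalDispatcher:
    """Run signals for different resources in parallel; queue signals
    that target a resource whose action is already in flight. The
    caller always gets an acceptance, never an error."""

    def __init__(self):
        self._lock = threading.Lock()
        self._busy = set()    # resources with an action in flight
        self._backlog = {}    # resource -> list of queued actions

    def accept(self, resource, action):
        with self._lock:
            if resource in self._busy:
                # Conflict: queue instead of erroring (fire-and-forget
                # from the signaller's point of view).
                self._backlog.setdefault(resource, []).append(action)
                return "queued"
            self._busy.add(resource)
        self._run(resource, action)
        return "executed"

    def _run(self, resource, action):
        try:
            action()
        finally:
            with self._lock:
                pending = self._backlog.pop(resource, [])
                if not pending:
                    self._busy.discard(resource)
                    return
                nxt, rest = pending[0], pending[1:]
                if rest:
                    self._backlog[resource] = rest
            self._run(resource, nxt)  # drain queued signals in order
```

Note this sketch deliberately sidesteps the deadlock worry raised above: if a queued action itself signals the resource it is queued behind (the Ouroboros case), it simply queues again, which is exactly the kind of chain that needs deeper thought.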
