Qiming Teng <teng...@linux.vnet.ibm.com> wrote on 07/02/2014 03:02:14 AM:
> Just some random thoughts below ...
>
> On Tue, Jul 01, 2014 at 03:47:03PM -0400, Mike Spreitzer wrote:
> > ...
> > I have not found design discussion of this; have I missed something?
> >
> > I suppose the natural answer for OpenStack would be centered around
> > webhooks...
>
> Well, I would suggest we generalize this into an event messaging or
> signaling solution, instead of just 'webhooks'. The reason is that
> webhooks as implemented today do not carry a payload of useful
> information -- I'm referring to the alarms in Ceilometer.

OK, this is great (and Steve Hardy provided more details in his reply); I
did not know about the existing abilities to carry a payload. However,
Ceilometer alarms are still deficient in that way, right? A Ceilometer
alarm's action list is simply a list of URLs, right? I would be happy to
say let's generalize Ceilometer alarms to allow a payload in an action.

> There are other cases as well. A member failure could be caused by a
> temporary communication problem, which means the member may reappear
> quickly, when a replacement is already being created. It may mean that
> we have to respond to an 'online' event in addition to an 'offline'
> event?
> ...
> The problem here today is about the recovery of an SG member. If it is
> a compute instance, we can 'reboot', 'rebuild', 'evacuate', or
> 'migrate' it, just to name a few options. The most brutal way to do
> this is what HARestarter does today -- a delete followed by a create.

We could get into arbitrary subtlety, and maybe eventually will do
better, but I think we can start with a simple solution that is widely
applicable.
The simple solution is this: once the decision has been made to do
convergence on a member (note that this is distinct from merely detecting
and noting a divergence), it is carried out regardless of whether the
doomed member later appears to have recovered, and the convergence action
for a scaling group member is to delete the old member and create a
replacement (not in that order).

> > When the member is a nested stack and Ceilometer exists, it could be
> > the member stack's responsibility to include a Ceilometer alarm that
> > detects the member stack's death and hits the member stack's deletion
> > webhook.
>
> This is difficult. A '(nested) stack' is a Heat-specific abstraction --
> recall that we have to annotate a nova server resource in its metadata
> to record which stack the server belongs to. Besides the 'visible'
> resources specified in a template, Heat may create internal data
> structures and/or resources (e.g. users) for a stack. I am not quite
> sure a stack's death can be easily detected from outside Heat. It would
> be at least cumbersome to have Heat notify Ceilometer that a stack is
> dead, and then have Ceilometer send back a signal.

A (nested) stack is not only a Heat-specific abstraction; its semantics
and failure modes are specific to the stack (at least, to its template).
I think we have no practical choice but to let the template author
declare how failure is detected. It could be as simple as creating
Ceilometer alarms that detect the death of one or more resources in the
nested stack; it could be more complicated Ceilometer stuff; it could be
based on something other than, or in addition to, Ceilometer. If today
there are not enough sensors to detect failures of all kinds of
resources, I consider that a gap in telemetry (and think it is small
enough that we can proceed usefully today, and should plan on filling
that gap over time).
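To make that concrete, here is a rough sketch of what I mean by "the
template author declares how failure is detected" (all resource and
parameter names below are hypothetical, and the choice of meter and
thresholds is just an example, not a recommendation): the member template
declares its own death detector and wires it to a deletion webhook passed
in as a parameter.

```yaml
heat_template_version: 2013-05-23

parameters:
  member_delete_hook:
    type: string
    description: URL to hit when this member is judged dead

resources:
  # The template author decides what "failure" means for this member.
  # Here it is "fewer than one cpu_util sample in each of two
  # consecutive 60-second periods", i.e. the member stopped reporting.
  death_alarm:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: count
      period: 60
      evaluation_periods: 2
      threshold: 1
      comparison_operator: lt
      alarm_actions:
        - {get_param: member_delete_hook}
```

A different member template could define failure entirely differently --
that is the point of leaving the declaration to the author.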
> > There is a small matter of how the author of the template used to
> > create the member stack writes some template snippet that creates a
> > Ceilometer alarm specific to a member stack that does not exist yet.
>
> How about just one signal responder per ScalingGroup? An SG is supposed
> to be in a better position to make the judgement: do I have to recreate
> a failed member? Am I recreating it right now, or waiting a few
> seconds? Maybe I should recreate the member in some specific AZs?

That is conflating two issues. The thing that is new here is making the
scaling group recognize member failure; the primary reaction is to update
its accounting of members (which, in the current code, must be done by
making sure the failed member is deleted). Recovery of the other scaling
group aspects is fairly old hat; it is analogous to the problems the
scaling group already solves when asked to increase its size.

> ...
> > I suppose we could stipulate that if the member template includes a
> > parameter with name "member_name" and type "string" then the SG takes
> > care of supplying the correct value of that parameter; as illustrated
> > in the asg_of_stacks.yaml of https://review.openstack.org/#/c/97366/ ,
> > a member template can use a template parameter to tag Ceilometer data
> > for querying. The URL of the member stack's deletion webhook could be
> > passed to the member template via the same sort of convention.
>
> I am not in favor of the per-member webhook design. But I vote for an
> additional *implicit* parameter to a nested stack of any group. It
> could be an index or a name.

Right, I was elaborating on a particular formulation of "implicit
parameter". In particular, I suggested an "implicit parameter value" for
an optional explicit parameter.
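Concretely, the convention I had in mind looks something like the
following. Only the "member_name" parameter name comes from the
stipulation above; the alarm itself and the metadata key are a
hypothetical sketch of how a member template might use the supplied
value to tag and query Ceilometer data.

```yaml
parameters:
  member_name:
    type: string
    # Under the proposed convention the scaling group supplies this
    # value; the member template never has to know its own name.

resources:
  member_alarm:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 50
      comparison_operator: gt
      # Restrict the alarm to samples tagged with this member's name.
      matching_metadata:
        metadata.user_metadata.member: {get_param: member_name}
```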
We could make the parameter declaration implicit, but that (1) is a bit
irregular (reminiscent of "modes") if we only do it for stacks that are
scaling group members, and (2) is equivalent to the existing concept of
pseudo-parameters if we do it for all stacks. I would be content with
adding a pseudo-parameter, for all stacks, that is the UUID of the stack.
The index of the member in the group could be problematic, as indices are
re-used; the UUID is not re-used. Names also have uniqueness issues.

> > When Ceilometer does not exist, it is less obvious to me what could
> > usefully be done. Are there any useful SG member types besides
> > Compute instances and nested stacks? Note that a nested stack could
> > also pass its member deletion webhook to a load balancer (that is
> > willing to accept such a thing, of course), so we get a lot of unity
> > of mechanism between the case of detection by infrastructure vs.
> > application-level detection.
>
> I'm a little bit concerned about passing the member deletion webhook to
> the LB. Maybe we need to rethink this: do we really want to bring
> application-level design considerations down to the infrastructure
> level?

I look at it this way: do we want two completely independent loops of
detection and response, or shall we share a common response mechanism
between two different levels of detection? I think both want the same
response, and so I recommend a shared response mechanism.

> Some of the detection work might be covered by the observer engine spec
> that is under review. My doubt about it is how to make it "listen only
> to what it needs to know while ignoring everything else".

I am not sure what you mean by that. If this is about the case of the
group members being nested stacks, I go back to the idea that it must be
up to the nested template author to define failure (by declaring how to
detect it).

> > I am not entirely happy with the idea of a webhook per member.
> > If I understand correctly, generating webhooks is a somewhat
> > expensive and problematic process. What would be the alternative?
>
> My understanding is that the webhooks' problem is not about cost; it is
> more about authentication and flexibility. Steve Hardy and Thomas Herve
> are already looking into the authentication problem.

I was not disagreeing; I was including those in "problematic".

Thanks,
Mike
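PS: to make the pseudo-parameter suggestion above concrete -- if such a
pseudo-parameter existed (call it, hypothetically, "OS::stack_id", by
analogy with the existing "OS::stack_name"), a member template could tag
and query its telemetry by stack UUID with no new parameter convention
at all (alarm details below are likewise just a sketch):

```yaml
resources:
  member_alarm:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 50
      comparison_operator: gt
      # Keyed on the (proposed) stack-UUID pseudo-parameter; UUIDs,
      # unlike member indices, are never re-used.
      matching_metadata:
        metadata.user_metadata.stack: {get_param: "OS::stack_id"}
```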
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev