On Thu, Apr 9, 2015 at 8:03 AM, Miguel Grinberg <[email protected]> wrote:
> Zane, replies inline.
>
> On Wed, Apr 8, 2015 at 3:46 PM, Zane Bitter <[email protected]> wrote:
>
>> On 07/04/15 22:02, Miguel Grinberg wrote:
>>
>>> Hi,
>>>
>>> The OS::Heat::AutoScalingGroup resource is somewhat limited at this
>>> time, because when a scaling event occurs it does not notify dependent
>>> resources, such as a load balancer, that the pool of instances has
>>> changed.
>>
>> As Thomas mentioned, the 'approved' way to solve this is to make your
>> scaled unit a stack, and include a Neutron PoolMember resource in it.
>
> LBAAS is an optional, now even external component, not part of the Neutron
> API. Many installations don't have it. Allowing the use of custom load
> balancers is a desirable option, in my opinion, more so while LBAAS is not
> core neutron functionality.
>
>>> The AWS::AutoScaling::AutoScalingGroup resource, on the other side, has
>>> a LoadBalancerNames property that takes a list of
>>> AWS::ElasticLoadBalancing::LoadBalancer resources that get updated
>>> anytime the size of the ASG changes.
>>
>> Which is an appalling hack.
>
> Yes. This is hacky, but it seems it models the AWS load balancing APIs,
> so there isn't much that can be done here, right?
>
>> (If it called the Neutron LBaaS API, like the equivalent in
>> CloudFormation does with ELB, it would be OK. But in reality, as you know,
>> it's a hack that makes calls directly to another resource plugin within
>> Heat.)
>>
>>> I'm trying to implement this notification mechanism for HOT templates,
>>> but there are a few aspects that I hope to do better.
>>>
>>> 1. A HOT template can have get_attr function calls that invoke
>>> attributes of the ASG. None of these update when the ASG resizes at this
>>> time; a scaling event does a partial update that only affects the ASG
>>> resource. I would like to address this.
>>
>> In the medium-term I think this is something that I believe Convergence
>> will be able to solve for us.
>> I'm not sure that it's worth putting in a
>> short-term work-around for.
>
> Here is where we disagree. In my opinion this is broken functionality.
> After a scaling event there are resources that go stale because they are
> never told that the ASG resized. This to me is clearly a bug that deserves
> fixing, even if in the future a better/nicer fix can be crafted.

So the problem is that the result of get_attr is dynamic, and we do not
support triggering stack updates when that result changes. As Zane
suggested, you should think of autoscaling as being in a different service.

A possible solution: you have a top-level template that has a
StackUpdatePolicy (a new thing), and it gets triggered by a Ceilometer Alarm
based on the following notification:

https://github.com/openstack/heat/blob/master/heat/engine/resources/aws/autoscaling/autoscaling_group.py#L338-L344

This then runs an update to refresh the stack.

-Angus

>>> 2. The AWS solution relies on the well known LoadBalancer resource, but
>>> often load balancers are just regular instances that get loaded with a
>>> load balancer such as haproxy in a custom way. I'd like custom load
>>> balancers to also update when the ASG resizes.
>>
>> TBH the correct answer for load balancers specifically is use the Neutron
>> LBaaS API, end of story.
>
> This does not help me, as I don't have LBAAS. But as I said above, even if
> I had it, I may want to use my own load balancer, why not let me use my own
> if that is what I need for my project? Or what if I had another resource
> type that is not a load balancer, maybe a custom resource from a plugin
> that wants to be notified when the ASG resizes? If this can be done for
> regular stack updates, my opinion is that it should also work for these
> special signal-triggered updates to the ASG.
>
>> But you're right that there are many uses for a more generic notification
>> mechanism.
>> (For example, in OpenShift we need to notify the controller when
>> we add or remove nodes.) The design goal for ASG was always that we would
>> have an arbitrary scaled unit (defined by a template) and an arbitrary
>> non-scaled unit that could receive notifications about changes to the
>> scaling group. So far we have delivered on only the first part of that
>> promise.
>>
>> My vision for the second part has always been that we'd use hooks, the
>> initial implementation of which Tomas has landed in Kilo. We'll need to
>> implement some more hook types to do it - post-create, post-update and
>> pre-delete at a minimum. We also need some way of notifying the user
>> asynchronously about when the hooks are triggered, so that they can take
>> whatever action (e.g. add to load balancer) before calling the API to clear
>> the hook. (At the moment the only way to find out when your hook should run
>> is by polling the Heat API.)
>
> I'm not really sure I understand how this would work. If I have a resource
> that sets one of its properties to { get_attr: [my_asg, size] }, then on a
> stack-update I don't need a hook to update my resource, it automatically
> updates. On an alarm-triggered resize it will not, only because the update
> is partial in that case. If I add a post-update hook to that, then I may be
> able to get the resource to update on a resize event, but on a regular
> stack-update now the update will happen twice, once due to the normal
> update process, then again with the hook.
>
> To make this work I would have to not use get_attr, and somehow get this
> resource to obtain whatever attribute it needs from the ASG using some
> other way, like maybe the Heat API. Which is all fine, but get_attr is a
> valid option I have as a stack developer, and it is currently broken.
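
To make the scenario above concrete, here is a minimal sketch of the kind of
template being discussed. It is only an illustration under assumptions:
My::Custom::LoadBalancer and its pool_size property are hypothetical (a
stand-in for any plugin resource), the image and flavor names are
placeholders, and I am assuming the group's current_size attribute is what
"size" refers to above.

```yaml
heat_template_version: 2014-10-16

resources:
  my_asg:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 5
      resource:
        type: OS::Nova::Server
        properties:
          image: my-image        # placeholder image name
          flavor: m1.small       # placeholder flavor name

  scale_up:
    type: OS::Heat::ScalingPolicy
    properties:
      auto_scaling_group_id: { get_resource: my_asg }
      adjustment_type: change_in_capacity
      scaling_adjustment: 1

  my_lb:
    # Hypothetical plugin resource type, not part of Heat.
    type: My::Custom::LoadBalancer
    properties:
      # Resolved when the stack is created or fully updated. A resize
      # triggered by signalling scale_up only touches my_asg, so this
      # value is not re-resolved and goes stale.
      pool_size: { get_attr: [my_asg, current_size] }
```

With a template like this, a stack-update that changes the group size would
also update my_lb, while a signal to scale_up would not.
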
>
> I know you disagree with my view, but in my opinion the problem, as I
> mentioned before, is that the resize event of the ASG does a partial
> update, which leaves the stack in an inconsistent state.
>
>> In my ideal world, the notification mechanism (or at least one of them)
>> is a message to a Zaqar queue/topic (whatever you want to call it)
>> specified by the user. So someone e.g. running their own HAProxy (don't do
>> this ;) could put a little micro-daemon on the same box that listened to
>> Zaqar for notifications and updated the HAProxy config appropriately.
>>
>> Also in my ideal world, a Mistral workflow could be triggered (and seeded
>> with parameter values) by the exact same message queue, so that the user
>> can run any action that Mistral can support without having to have a server
>> around to run it. And we'd use the same system for e.g. Ceilometer alarms
>> talking to scaling policies, so that one could also insert a Mistral
>> workflow into the process. Things are actually pretty awesome in my ideal
>> world.
>
> I really have no objection to this, sounds pretty good and I would likely
> use it when it is available. But this is future looking, and I'm trying to
> address a very specific problem in current releases.
>
>>> The ResourceGroup is an interesting resource. It is much simpler than
>>> the ASG. In particular, the only way to scale the ResourceGroup is by
>>> issuing a stack-update with a new size. This indirectly solves #1 and #2
>>> above, because when a full update is issued any references to the
>>> ResourceGroup get updated as well.
>>
>> It doesn't really solve the problem, because you could still manually
>> update the nested stack that the ResourceGroup manages. It just entirely
>> lacks the feature that makes it easy to run into the problem. And not in a
>> good way.
>
> Not sure I understand this. You have a list of nested stacks, as many as
> the size property of the resource group dictates.
> You can update them and
> that's fine. I guess you can delete one and that is probably not fine, in
> the same way you can delete instances from the ASG pool without the ASG
> resource knowing, or actually modify or delete any native entities without
> the heat resource that owns it knowing. That still does not cancel the fact
> that if you play by the rules, the ResourceGroup is much more reliable than
> the ASG because it can only be updated in a stack-update operation.
>
>>> In my opinion, the best way to address #1 and #2 above so that they work
>>> for the ASG as they work for the RG, is to change what happens when
>>> there is a scaling event. When the ScalingPolicy resource gets a signal,
>>> it reaches directly to the ASG by calling asg.adjust() (or in the near
>>> future by sending a signal to it, when a currently proposed patch
>>> merges) with the new size. This bypasses the update mechanism, so only a
>>> partial update occurs, just the ASG resource itself is updated. I would
>>> like this to be a full stack update, so that all references get updated
>>> with the new ASG size. This will address #1 and #2.
>>
>> -1
>
> This I disagree with. The partial update leaves the stack in an
> inconsistent state. It's a bug that should be straightforward to fix,
> without altering any plans for the future that can make the use of load
> balancers more friendly to users.
>
>> The way to think about autoscaling is as a separate service that
>> delegates the creation and deletion of its members to, and maintains its
>> state in, a Heat stack. It *isn't* of course, but nor will it ever be if
>> people continue to think about it as a resource plugin that is free to
>> reach in to its parent stack and start messing with other things.
>>
>> Apart from being a layering violation, anything that relies on updating
>> the parent stack *after* a scaling operation is complete simply doesn't
>> work.
>> When scaling down, you want the changes to be made *before* updating
>> the scaling group. In the general case - a batched rolling update - there
>> are multiple changes that need to be made mostly *during* the scaling group
>> update.
>>
>>> But there is an alternative to this. I guess we could copy the update
>>> mechanism used on the AWS side, which is also partial, but at least
>>
>> -2! This is what we most wanted to avoid in the native resources.
>
> I'm fine with this, I don't really like the solution myself that much.
>
>>> covers the load balancers, given in the LoadBalancerNames property. We
>>> can have a "load_balancer_names" equivalent property for the
>>> OS::Heat::ASG resource, and we can then trigger the updates of the load
>>> balancer(s) exactly like the AWS side does it. For this option, I would
>>> like to extend the load balancer update mechanism to work on custom load
>>> balancers, as it currently works with the well known load balancer
>>> resources. I have implemented this approach and it is currently up for
>>> review: https://review.openstack.org/#/c/170634/. I honestly prefer the
>>> full update, seems cleaner to me.
>>>
>>> Anyway, sorry for the long email. If you can provide guidance on which
>>> of the approaches are preferred, or if you have other ideas, I would
>>> appreciate it.
>>
>> Long emails are good, thanks for writing this up :)
>>
>> cheers,
>> Zane.
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: [email protected]?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
