Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
On 09/19/2013 04:35 AM, Mike Spreitzer wrote:

> I'd like to try to summarize this discussion, if nothing else than to see whether I have correctly understood it. There is a lot of consensus, but I haven't heard from Adrian Otto since he wrote some objections. I'll focus on trying to describe the consensus; Adrian's concerns are already collected in a single message. Or maybe this is already written in some one place?
>
> The consensus is that there should be an autoscaling (AS) service that is accessible via its own API. This autoscaling service can scale anything describable by a snippet of Heat template (it's not clear to me exactly what sort of syntax this is; is it written up anywhere?). The autoscaling service is stimulated into action by a webhook call. The user has the freedom to arrange calls on that webhook in any way she wants. It is anticipated that a common case will be alarms raised by Ceilometer. For more specialized or complicated logic, the user is free to wire up anything she wants to call the webhook.
>
> An instance of the autoscaling service maintains an integer variable, which is the current number of copies of the thing being autoscaled. Does the webhook call provide a new number, or a +1/-1 signal, or ...? There was some discussion of a way to indicate which individuals to remove, in the case of decreasing the multiplier. I suppose that would be an option in the webhook, and one that will not be exercised by Ceilometer alarms. (It seems to me that there is not much auto in this autoscaling service --- it is really a scaling service driven by an external controller. This is not a criticism; I think this is a good factoring --- but maybe not the best naming.)
>
> The autoscaling service does its job by multiplying the Heat template snippet (the thing to be autoscaled) by the current number of copies and passing this derived template to Heat to make it so. As the desired number of copies changes, the AS service changes the derived template that it hands to Heat. Most commentators argue that the consistency and non-redundancy of making the AS service use Heat outweigh the extra path-length compared to a more direct solution.
>
> Heat will have a resource type, analogous to AWS::AutoScaling::AutoScalingGroup, through which the template author can request usage of the AS service. OpenStack in general, and Heat in particular, need to be much better at traceability and debuggability; the AS service should be good at these too.
>
> Have I got this right?
>
> Thanks,
> Mike

Mike,

The key contention to a separate API is that Heat already provides all of this today. It is unclear to me how separating a specially designed autoscaling service from Heat would be of much benefit, because we still need the launch configuration and properties of the autoscaling group to be specified. A separate service may specify this in REST API calls, whereas Heat specifies it in a template, but really, this isn't much of a difference from a user's view: the user still has to pass the same data set in some way. Then there is the issue of duplicated code for at least handling the creation and removal of the server instances themselves, and the bootstrapping that occurs in the process.

Your thread suggests we remove the auto from the scaling. These two concepts seem tightly integrated to me, and my personal opinion is that doing so is just a way to work around the need to pass all of the necessary autoscaling parameters in API calls. IMO there is no real benefit in a simple scaling service that is directed by a third-party software component (in the proposed case Heat, activated on Ceilometer alarms). It just feels like it doesn't do enough to warrant an entire OpenStack program. There is significant overhead in each OpenStack program added, and I don't see the gain for the pain.

I think these are the main points of contention at this point, with no clear consensus. An alternate point in favor of a separate autoscaling component, not mentioned in your post, is that an API produces a more composable[1] system, which brings many advantages.

Regards,
-steve

[1] http://en.wikipedia.org/wiki/Composability

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
I'd like to try to summarize this discussion, if nothing else than to see whether I have correctly understood it. There is a lot of consensus, but I haven't heard from Adrian Otto since he wrote some objections. I'll focus on trying to describe the consensus; Adrian's concerns are already collected in a single message. Or maybe this is already written in some one place?

The consensus is that there should be an autoscaling (AS) service that is accessible via its own API. This autoscaling service can scale anything describable by a snippet of Heat template (it's not clear to me exactly what sort of syntax this is; is it written up anywhere?). The autoscaling service is stimulated into action by a webhook call. The user has the freedom to arrange calls on that webhook in any way she wants. It is anticipated that a common case will be alarms raised by Ceilometer. For more specialized or complicated logic, the user is free to wire up anything she wants to call the webhook.

An instance of the autoscaling service maintains an integer variable, which is the current number of copies of the thing being autoscaled. Does the webhook call provide a new number, or a +1/-1 signal, or ...? There was some discussion of a way to indicate which individuals to remove, in the case of decreasing the multiplier. I suppose that would be an option in the webhook, and one that will not be exercised by Ceilometer alarms. (It seems to me that there is not much auto in this autoscaling service --- it is really a scaling service driven by an external controller. This is not a criticism; I think this is a good factoring --- but maybe not the best naming.)

The autoscaling service does its job by multiplying the Heat template snippet (the thing to be autoscaled) by the current number of copies and passing this derived template to Heat to make it so. As the desired number of copies changes, the AS service changes the derived template that it hands to Heat. Most commentators argue that the consistency and non-redundancy of making the AS service use Heat outweigh the extra path-length compared to a more direct solution.

Heat will have a resource type, analogous to AWS::AutoScaling::AutoScalingGroup, through which the template author can request usage of the AS service. OpenStack in general, and Heat in particular, need to be much better at traceability and debuggability; the AS service should be good at these too.

Have I got this right?

Thanks,
Mike
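[Editorial note: the "multiply the snippet" mechanism Mike describes can be sketched in a few lines of Python. This is a hypothetical illustration of the idea only; the function name, the `name-index` copy-naming scheme, and the dict layout are assumptions, not the actual AS/Heat code.]

```python
import copy

def derive_template(snippet, count):
    """Expand a one-member resource snippet into a derived template
    containing `count` copies of each resource (e.g. web-0 .. web-2).

    Hypothetical sketch of the mechanism described above; the real
    service would hand the derived template to Heat to "make it so".
    """
    derived = {"Resources": {}}
    for name, resource in snippet["Resources"].items():
        for i in range(count):
            # Deep-copy so later per-copy tweaks don't alias each other.
            derived["Resources"]["%s-%d" % (name, i)] = copy.deepcopy(resource)
    return derived
```

As the desired count changes, the AS service would regenerate this derived template and ask Heat to update the stack to match.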
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
Hi Michael! Thanks for this summary. There were some minor inaccuracies, but I appreciate you at least trying when I should have summarized it earlier. I'll give some feedback inline.

First, though, I have recently worked a lot on the wiki page for the blueprint. It's available here: https://wiki.openstack.org/wiki/Heat/AutoScaling

It still might need a little bit more cleaning up and probably a more holistic example, but it should be pretty close now. I will say that I changed it to specify the Heat resources for using autoscale instead of the APIs of the AS API, mostly for convenience, because they're easily specifiable. The AS API should be derived pretty obviously from the resources.

On Thu, Sep 19, 2013 at 6:35 AM, Mike Spreitzer mspre...@us.ibm.com wrote:

> I'd like to try to summarize this discussion, if nothing else than to see whether I have correctly understood it. There is a lot of consensus, but I haven't heard from Adrian Otto since he wrote some objections. I'll focus on trying to describe the consensus; Adrian's concerns are already collected in a single message. Or maybe this is already written in some one place?

Yeah. Sorry I didn't link that wiki page earlier; it was in a pretty raw and chaotic form.

> The consensus is that there should be an autoscaling (AS) service that is accessible via its own API. This autoscaling service can scale anything describable by a snippet of Heat template (it's not clear to me exactly what sort of syntax this is; is it written up anywhere?).

Yes. See the wiki page above; it's basically just a mapping exactly like the Resources section in a typical Heat template, e.g. {..., "Resources": {"mywebserver": {"Type": "OS::Nova::Server"}, ...}}

> The autoscaling service is stimulated into action by a webhook call. The user has the freedom to arrange calls on that webhook in any way she wants. It is anticipated that a common case will be alarms raised by Ceilometer. For more specialized or complicated logic, the user is free to wire up anything she wants to call the webhook.

This is accurate.

> An instance of the autoscaling service maintains an integer variable, which is the current number of copies of the thing being autoscaled. Does the webhook call provide a new number, or a +1/-1 signal, or ...?

The webhook provides no parameters. The amount of change is encoded into the policy that the webhook is associated with. Policies can change it the same way they can in current AWS-based autoscaling: +/- a fixed number, or +/- a percent, or setting it to a specific number directly.

> There was some discussion of a way to indicate which individuals to remove, in the case of decreasing the multiplier. I suppose that would be an option in the webhook, and one that will not be exercised by Ceilometer alarms.

I don't think the webhook is the right place to do that. That should probably be a specific thing in the AS API.

> (It seems to me that there is not much auto in this autoscaling service --- it is really a scaling service driven by an external controller. This is not a criticism, I think this is a good factoring --- but maybe not the best naming.)

I think the policies are what qualify it for the auto term. You can have webhook policies or schedule-based policies (and maybe more policies in the future). The policies determine how to change the group.

> The autoscaling service does its job by multiplying the heat template snippet (the thing to be autoscaled) by the current number of copies and passing this derived template to Heat to make it so. As the desired number of copies changes, the AS service changes the derived template that it hands to Heat. Most commentators argue that the consistency and non-redundancy of making the AS service use Heat outweigh the extra path-length compared to a more direct solution.

Agreed.

> Heat will have a resource type, analogous to AWS::AutoScaling::AutoScalingGroup, through which the template author can request usage of the AS service.

Yes.

> OpenStack in general, and Heat in particular, need to be much better at traceability and debuggability; the AS service should be good at these too.

Agreed.

> Have I got this right?

Pretty much! Thanks for the summary :-)

--
IRC: radix
Christopher Armstrong
Rackspace
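[Editorial note: the three AWS-style adjustment types Christopher mentions (+/- a fixed number, +/- a percent, or setting the size directly) can be sketched as follows. The adjustment-type names mirror the AWS API; the function itself is an illustrative sketch, not the actual AS implementation.]

```python
def apply_policy(current_size, policy, min_size=0, max_size=None):
    """Compute the new group size for one policy execution.

    Supports the three AWS-style adjustment types discussed above:
    ChangeInCapacity (+/- fixed), PercentChangeInCapacity (+/- percent),
    and ExactCapacity (set directly). Hypothetical sketch only.
    """
    kind = policy["adjustment_type"]
    amount = policy["adjustment"]
    if kind == "ChangeInCapacity":
        new_size = current_size + amount
    elif kind == "PercentChangeInCapacity":
        new_size = current_size + int(round(current_size * amount / 100.0))
    elif kind == "ExactCapacity":
        new_size = amount
    else:
        raise ValueError("unknown adjustment type: %s" % kind)
    # Clamp to the group's configured bounds.
    new_size = max(new_size, min_size)
    if max_size is not None:
        new_size = min(new_size, max_size)
    return new_size
```

Note that because the webhook carries no parameters, all of this information lives in the policy itself; hitting the webhook simply executes whatever adjustment the policy encodes.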
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
radix, thanks. How exactly does the cooldown work?

Thanks,
Mike
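[Editorial note: for readers following along with Mike's question, cooldown in AWS-style scaling groups is essentially a timestamp check: a policy execution is ignored if the group was adjusted too recently. A simplified sketch, assuming a timestamp-based check (not the actual Heat code):]

```python
import time

class CooldownGuard:
    """Hypothetical sketch of cooldown enforcement.

    A trigger is dropped if the previous accepted trigger happened
    less than `cooldown` seconds ago. `clock` is injectable so the
    behavior can be tested without sleeping.
    """
    def __init__(self, cooldown, clock=time.time):
        self.cooldown = cooldown
        self.clock = clock
        self.last_executed = None

    def try_execute(self):
        now = self.clock()
        if (self.last_executed is not None
                and now - self.last_executed < self.cooldown):
            return False  # still cooling down; drop this trigger
        self.last_executed = now
        return True
```

So during the cooldown window, repeated alarm webhooks are simply no-ops rather than queued.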
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
On Thu, Sep 12, 2013 at 04:15:39AM +0000, Joshua Harlow wrote:

> Ah, thx keith, that seems to make a little more sense with that context. Maybe that different instance will be doing other stuff also? Is that the general heat 'topology' that should be/is recommended for trove?
>
> For say autoscaling trove, will trove emit a set of metrics via ceilometer that heat (or a separate autoscaling thing) will use to analyze if autoscaling should occur? I suppose nova would also emit its own set and it will be up to the autoscaler to merge those together (as trove instances are nova instances).
>
> It's a very interesting set of problems to make an autoscaling entity that works well without making that autoscaling entity too aware of the internals of the various projects. Make it too aware and the whole system is fragile, but don't make it aware enough and it will not do its job very well.

No, this is not how things work now that we're integrated with Ceilometer (*alarms*, not raw metrics). Previously Heat did do basic metric evaluation internally, but now we rely on Ceilometer to do all of that for us, so we just pass a webhook URL to Ceilometer, which gets hit when an alarm happens (in Ceilometer).

So Trove, Nova, or whatever just need to get metrics into Ceilometer; then you can set up a Ceilometer alarm via Heat, associated with the autoscaling resource. This has a number of advantages: in particular, it removes any coupling between Heat and specific metrics or internals, and it provides a very flexible interface if people want to drive Heat autoscaling via something other than Ceilometer (e.g. the autoscale API under discussion here).

Steve
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
Cool, thanks for the explanation and clarification :)

Sent from my really tiny device...

On Sep 12, 2013, at 12:41 AM, Steven Hardy sha...@redhat.com wrote:

> On Thu, Sep 12, 2013 at 04:15:39AM +0000, Joshua Harlow wrote:
>> Ah, thx keith, that seems to make a little more sense with that context. Maybe that different instance will be doing other stuff also? Is that the general heat 'topology' that should be/is recommended for trove?
>>
>> For say autoscaling trove, will trove emit a set of metrics via ceilometer that heat (or a separate autoscaling thing) will use to analyze if autoscaling should occur? I suppose nova would also emit its own set and it will be up to the autoscaler to merge those together (as trove instances are nova instances).
>>
>> It's a very interesting set of problems to make an autoscaling entity that works well without making that autoscaling entity too aware of the internals of the various projects. Make it too aware and the whole system is fragile, but don't make it aware enough and it will not do its job very well.
>
> No, this is not how things work now that we're integrated with Ceilometer (*alarms*, not raw metrics). Previously Heat did do basic metric evaluation internally, but now we rely on Ceilometer to do all of that for us, so we just pass a webhook URL to Ceilometer, which gets hit when an alarm happens (in Ceilometer).
>
> So Trove, Nova, or whatever just need to get metrics into Ceilometer; then you can set up a Ceilometer alarm via Heat, associated with the autoscaling resource. This has a number of advantages: in particular, it removes any coupling between Heat and specific metrics or internals, and it provides a very flexible interface if people want to drive Heat autoscaling via something other than Ceilometer (e.g. the autoscale API under discussion here).
>
> Steve
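[Editorial note: the webhook flow Steve describes -- Ceilometer hits a pre-shared URL when an alarm fires, and the receiving side executes whatever policy that URL is bound to -- can be pictured as a small dispatcher. Everything here (class and method names, return codes) is a hypothetical sketch, not the actual Heat/Ceilometer interface.]

```python
class WebhookDispatcher:
    """Hypothetical sketch of webhook-driven policy execution.

    Ceilometer (or anything else) POSTs to an opaque webhook URL; the
    receiver maps the URL's secret id to a registered policy callable
    and runs it. The call carries no parameters -- the policy itself
    encodes what change to make.
    """
    def __init__(self):
        self._policies = {}

    def register(self, webhook_id, policy_fn):
        self._policies[webhook_id] = policy_fn

    def handle(self, webhook_id):
        policy_fn = self._policies.get(webhook_id)
        if policy_fn is None:
            return 404  # unknown or revoked webhook
        policy_fn()
        return 202  # accepted; scaling proceeds asynchronously
```

This is why the coupling argument works: the alarm source only needs the URL, and the scaling side only needs a policy, so either end can be swapped out independently.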
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
I apologize that this mail will appear at the incorrect position in the thread, but I somehow got unsubscribed from openstack-dev due to bounces and didn't receive the original email.

On 9/11/13 03:15 UTC, Adrian Otto adrian.o...@rackspace.com wrote:

> So, I don't intend to argue the technical minutia of each design point, but I challenge you to make sure that we (1) arrive at a simple system that any OpenStack user can comprehend,

I think there is tension between simplicity of the stack and simplicity of the components in that stack. We're making sure that the components will be simple, self-contained, and easy to understand, and the stack will need to plug them together in an interesting way.

> (2) responds quickly to alarm stimulus,

Like Zane, I don't really buy the argument that the API calls to Heat will make any significant impact on the speed of autoscaling. There are MUCH bigger wins in e.g. improving the ability for people to use cached, pre-configured images vs. a couple of API calls. Even once you've optimized all of that, booting an instance still takes much, much longer than running the control code.

> (3) is unlikely to fail,

I know this isn't exactly what you mentioned, but I have some things to say not about resilience but instead about reaction to failures. The traceability and debuggability of errors is something that unfortunately plagues all of OpenStack, both for developers and end-users. It is fortunate that OpenStack components make good use of each other, but unfortunate that:

1. there's no central, filterable logging facility (without investing significant ops effort to deploy a third-party one yourself);
2. there's not enough consistent tagging of requests throughout the system to allow operators looking at logs to understand how a user's original request led to some ultimate error;
3. there are no ubiquitous mechanisms for propagating errors between service APIs in a way that ultimately leads back to the consumer of the service;
4. many services don't even report detailed information with errors that happen internally.

I believe we'll have to do what we can, especially in #3 and #4, to make sure that the users of autoscaling and Heat have good visibility into the system when errors occur.

> (4) can be easily customized with user-supplied logic that controls how the scaling happens, and under what conditions.

I think this is a good argument for using Heat for the scaling resources instead of doing it separately. One of the biggest new features that the new AS design provides is the ability to scale *any* resource, not just AWS::EC2::Instance. This means you can write your own custom resource with custom logic and scale it trivially. Doing it in terms of resources instead of launch configurations provides a lot of flexibility, and a Resource implementation is a nice way to wrap up that custom logic. If we implemented this in the AS service without using Heat, we'd either be constrained to Nova instances again, or have to come up with our own API for customization.

As far as customizing the conditions under which scaling happens, that's provided at the lowest common denominator by a webhook trigger for scaling policies (on top of which convenient Ceilometer integration support will be implemented). Users will be able to provide their own logic and hit the webhook whenever they want to execute the policy.

> It would be better if we could explain Autoscale like this:
>
> Heat -> Autoscale -> Nova, etc.
> -or-
> User -> Autoscale -> Nova, etc.
>
> This approach allows use cases where (for whatever reason) the end user does not want to use Heat at all, but still wants something simple to be auto-scaled for them. Nobody would be scratching their heads wondering why things are going in circles.

The Heat behind Autoscale isn't something that the *consumer* of the service knows about, only the administrator. Granted, the API design that we're running with *does* currently require the user to provide snippets of Heat resource templates -- just specifying the individual resources that should be scaled -- but I think it would be trivial to support an alternative type of launch configuration that does the translation to Heat templates in the background, if we really want to hide all the Heatiness from a user who just wants the simplicity of knowing only about Nova and autoscaling.

To conclude, I'd like to just say I basically agree with what Clint, Keith, and Steven have said in other messages in this thread. It doesn't appear that the design of Heat autoscaling (informed by Zane, Clint, Angus, and others) fails to meet the criteria you've brought up.

--
IRC: radix
Christopher Armstrong
Rackspace
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
Steve, I think I see where I introduced some confusion... Below, when you draw:

User -> Trove -> (Heat -> Nova)

I come at it from a view that the version of Nova that Trove talks to (via Heat or not) is not necessarily a publicly available Nova endpoint (i.e. not in the catalog), although it *could* be. For example, there are reasons that Trove may provision to an internal-only Nova endpoint that is tricked out with a custom scheduler or virt driver (e.g. containers) or special DB-performant hardware, etc. This Nova endpoint would be different than the Nova endpoint in the end-user's catalog. But I realize that Trove could interact with the catalog endpoint for Nova as well. I'm sorry for the confusion I introduced by how I was thinking about that.

I guess this is one of those differences between a default OpenStack setup vs. how a service provider might want to run the system for scale and performance. The cool part is, I think Heat and all these general services can work in a variety of cool configurations!

-Keith

On 9/12/13 2:30 AM, Steven Hardy sha...@redhat.com wrote:

> On Thu, Sep 12, 2013 at 01:07:03AM +0000, Keith Bray wrote:
>> There is context missing here. heat => trove interaction is through the trove API. trove => heat interaction is a _different_ instance of Heat, internal to trove's infrastructure setup, potentially provisioning instances. Public Heat wouldn't be creating instances and then telling trove to make them into databases.
>
> Well, that's a deployer decision; you wouldn't need (or necessarily want) to run an additional heat service (if that's what you mean by instance in this case). What you may want is for the trove-owned stacks to be created in a different tenant (owned by the trove service user in the services tenant?)
>
> So the top-level view would be:
>
> User -> Trove -> (Heat -> Nova)
>
> Or if the user is interacting via a Trove Heat resource:
>
> User -> Heat -> Trove -> (Heat -> Nova)
>
> There is nothing circular here; Trove uses Heat as an internal implementation detail:
>
> * User defines a Heat template, and passes it to Heat
> * Heat parses the template and translates a Trove resource into API calls
> * Trove internally defines a stack, which it passes to Heat
>
> In the last step, although Trove *could* just pass on the user token it has from the top-level API interaction to Heat, you may not want it to, particularly in public cloud environments.
>
> Steve
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
Excerpts from Steven Hardy's message of 2013-09-11 05:59:02 -0700:

> On Wed, Sep 11, 2013 at 03:51:02AM +0000, Adrian Otto wrote:
>> It would be better if we could explain Autoscale like this:
>>
>> Heat -> Autoscale -> Nova, etc.
>> -or-
>> User -> Autoscale -> Nova, etc.
>>
>> This approach allows use cases where (for whatever reason) the end user does not want to use Heat at all, but still wants something simple to be auto-scaled for them. Nobody would be scratching their heads wondering why things are going in circles. From an implementation perspective, that means the auto-scale service needs at least a simple linear workflow capability in it that may trigger a Heat orchestration if there is a good reason for it. This way, the typical use cases don't have anything resembling circular dependencies. The source of truth for how many members are currently in an Autoscaling group should be the Autoscale service, not the Heat database. If you want to expose that in list-stack-resources output, then cause Heat to call out to the Autoscale service to fetch that figure as needed. It is irrelevant to orchestration. Code does not need to be duplicated. Both Autoscale and Heat can use the same exact source code files for the code that launches/terminates instances of resources.
>
> So I take issue with the circular dependencies statement: nothing proposed so far has anything resembling a circular dependency. I think it's better to consider traditional encapsulation, where two projects may very well make use of the same class from a library. Why is it less valid to consider code reuse via another interface (REST service)? The point of the arguments to date, AIUI, is to ensure orchestration actions and management of dependencies don't get duplicated in any AS service which is created. This is the crux of the reason that Heat should be involved.

In the driving analogy, Heat is not some boot perched on a ladder waiting for autoscaling to drop a bowling ball on it to turn the car. It is more like power steering. The driver puts an input into the system, and the power steering does the hard work. If you ever need the full power of power steering, then designing the system to be able to bypass power steering sometimes will make it _more_ complex, not less.

Meanwhile, as the problems that need to be solved become more complex, Heat will be there to simplify the solutions. If it is ever making system control more complex, that is Heat's failure and we need to make Heat simpler. What we should get out of the habit of is bypassing Heat and building new control systems because Heat doesn't yet do what we want it to do.

To any who would roll their own orchestration rather than let Heat do it: if Heat adds an unacceptable amount of latency, please file a bug. If Heat adds complexity, please file a bug. If you've already done that... I owe you a gold star. :)
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
Sure, I was thinking that since heat would do autoscaling per se, then heat would say ask trove to make more databases (autoscale policy here), and this would cause trove to actually call back into heat to make more instances. Just feels a little weird, idk. Why didn't heat just make those instances on behalf of trove to begin with and then tell trove to make these instances into databases? Then trove doesn't really need to worry about calling into heat to do the instance creation work, and trove can just worry about converting those blank instances into databases (for example). But maybe I am missing other context also :)

Sent from my really tiny device...

On Sep 11, 2013, at 8:04 AM, Clint Byrum cl...@fewbar.com wrote:

> Excerpts from Joshua Harlow's message of 2013-09-11 01:00:37 -0700:
>> +1
>>
>> The assertions are not just applicable to autoscaling but to software in general. I hope we can make autoscaling just simple enough to work. The circular heat => trove example is one of those that does worry me a little. It feels like something is not structured right if that is needed (rube goldberg like). I am not sure what could be done differently; just my gut feeling that something is off.
>
> Joshua, can you elaborate on the circular heat => trove example? I don't see Heat and Trove's relationship as circular. Heat has a Trove resource, and (soon? now?) Trove can use Heat to simplify its control of underlying systems. This is a stack, not a circle, or did I miss something?
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
Excerpts from Joshua Harlow's message of 2013-09-11 09:11:06 -0700:

> Sure, I was thinking that since heat would do autoscaling per se, then heat would say ask trove to make more databases (autoscale policy here), and this would cause trove to actually call back into heat to make more instances. Just feels a little weird, idk. Why didn't heat just make those instances on behalf of trove to begin with and then tell trove to make these instances into databases? Then trove doesn't really need to worry about calling into heat to do the instance creation work, and trove can just worry about converting those blank instances into databases (for example). But maybe I am missing other context also :)

That sort of optimization would violate encapsulation and make the system more complex. Heat doing Trove's provisioning and coordinating Trove's interaction with other pieces of the system is an implementation detail, safely hidden behind Trove. Interaction between other pieces of the end user's stack and Trove is limited to what Trove wants to expose.
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
I just have this idea that if u imagine a factory: Heat is the 'robot' in an assembly line that ensures the 'assembly line' is done correctly. At different stages heat makes sure the 'person/thing' putting a part on does it correctly, and heat verifies that the part is in the right place (for example, nova didn't put the wheel on backwards). The 'robot' then moves the partially completed part to the next person and repeats the same checks.

So to me, autoscaling, say, a database would be like going through the stages of that assembly line via a non-user-triggered system (the autoscaler), and then the final 'paint job' on the VMs would be done by the handoff from heat -> trove. Then trove doesn't need to call back into heat to make the VMs that it uses; heat does this for trove as part of the assembly line.

+2 for factory example, ha.

On 9/11/13 9:11 AM, Joshua Harlow harlo...@yahoo-inc.com wrote:

> Sure, I was thinking that since heat would do autoscaling per se, then heat would say ask trove to make more databases (autoscale policy here), and this would cause trove to actually call back into heat to make more instances. Just feels a little weird, idk. Why didn't heat just make those instances on behalf of trove to begin with and then tell trove to make these instances into databases? Then trove doesn't really need to worry about calling into heat to do the instance creation work, and trove can just worry about converting those blank instances into databases (for example). But maybe I am missing other context also :)
>
> Sent from my really tiny device...
>
> On Sep 11, 2013, at 8:04 AM, Clint Byrum cl...@fewbar.com wrote:
>
>> Excerpts from Joshua Harlow's message of 2013-09-11 01:00:37 -0700:
>>> +1
>>>
>>> The assertions are not just applicable to autoscaling but to software in general. I hope we can make autoscaling just simple enough to work. The circular heat => trove example is one of those that does worry me a little. It feels like something is not structured right if that is needed (rube goldberg like). I am not sure what could be done differently; just my gut feeling that something is off.
>>
>> Joshua, can you elaborate on the circular heat => trove example? I don't see Heat and Trove's relationship as circular. Heat has a Trove resource, and (soon? now?) Trove can use Heat to simplify its control of underlying systems. This is a stack, not a circle, or did I miss something?
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
There is context missing here. The heat->trove interaction is through the trove API. The trove->heat interaction is a _different_ instance of Heat, internal to trove's infrastructure setup, potentially provisioning instances. Public Heat wouldn't be creating instances and then telling trove to make them into databases. At least, that's what I understand from conversations with the Trove folks. I could be wrong here also. -Keith On 9/11/13 11:11 AM, Joshua Harlow harlo...@yahoo-inc.com wrote: [earlier messages quoted in full; snipped]
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
Ah, thx Keith, that seems to make a little more sense with that context. Maybe that different instance will be doing other stuff also? Is that the general heat 'topology' that should be/is recommended for trove? For, say, autoscaling trove, will trove emit a set of metrics via ceilometer that heat (or a separate autoscaling thing) will use to analyze whether autoscaling should occur? I suppose nova would also emit its own set, and it will be up to the autoscaler to merge those together (as trove instances are nova instances). It's a very interesting set of problems to make an autoscaling entity that works well without making that autoscaling entity too aware of the internals of the various projects. Make it too aware and the whole system is fragile, but don't make it aware enough and it will not do its job very well. On 9/11/13 6:07 PM, Keith Bray keith.b...@rackspace.com wrote: [earlier messages quoted in full; snipped]
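The metric-merging problem Joshua raises (an autoscaler consuming Ceilometer-style samples from both nova and trove, grouped by the underlying instance) can be sketched roughly as below. The sample shapes and field names are invented for illustration; they are not the real Ceilometer data model.

```python
# Illustrative only: merge per-service samples by backing instance id,
# since trove instances are nova instances. Field names are assumptions.
from collections import defaultdict

def merge_samples(nova_samples, trove_samples):
    """Group samples by instance id so a policy can see both views at once."""
    merged = defaultdict(dict)
    for s in nova_samples:
        merged[s["resource_id"]][s["meter"]] = s["value"]
    for s in trove_samples:
        # Assumed: trove samples carry the id of the backing nova instance.
        merged[s["instance_id"]][s["meter"]] = s["value"]
    return dict(merged)

nova = [{"resource_id": "i-1", "meter": "cpu_util", "value": 85.0}]
trove = [{"instance_id": "i-1", "meter": "db.connections", "value": 140}]
print(merge_samples(nova, trove))
# {'i-1': {'cpu_util': 85.0, 'db.connections': 140}}
```

The point of the sketch is only that the autoscaler needs a join key (the instance id) across services, without needing to understand either service's internals.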
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
I have a different point of view. First I will offer some assertions:

A-1) We need to keep it simple.
A-1.1) Systems that are hard to comprehend are hard to debug, and that's bad.
A-1.2) Complex systems tend to be much more brittle than simple ones.

A-2) Scale-up operations need to be as fast as possible.
A-2.1) Auto-scaling only works right if your new capacity is added quickly when your controller detects that you need more. If you spend a bunch of time goofing around before actually adding a new resource to a pool when it's under strain, the new capacity arrives too late to help.
A-2.2) The fewer network round trips between add-more-resources-now and resources-added, the better. Fewer = less brittle.

A-3) The control logic for scaling different applications varies.
A-3.1) What metrics are watched may differ between various use cases.
A-3.2) The data types that represent sensor data may vary.
A-3.3) The policy that's applied to the metrics (such as max, min, and cooldown period) varies between applications. Not only do the values vary, but so does the logic itself.
A-3.4) A scaling policy may not be just a handful of simple parameters. Ideally it allows configurable logic that the end user can control to some extent.

A-4) Auto-scale operations are usually not orchestrations. They are usually simple linear workflows.
A-4.1) The TaskFlow project[1] offers a simple way to do workflows and stable state management that can be integrated directly into Autoscale.
A-4.2) A task flow (workflow) can trigger a Heat orchestration if needed.

Now a mental tool to think about control policies: Auto-scaling is like steering a car. The control policy says that you want to drive equally between the two lane lines, and that if you drift off center, you gradually correct back toward center again. If the road bends, you try to remain in your lane as the lane lines curve. You try not to weave around in your lane, and you try not to drift out of the lane.
If your controller notices that you are about to drift out of your lane because the road is starting to bend, and you are distracted, or your hands slip off the wheel, you might drift out of your lane into nearby traffic. That's why you don't want a Rube Goldberg machine[2] between you and the steering wheel. See assertions A-1 and A-2. If you are driving an 18-wheel tractor/trailer truck, steering is different from driving a Fiat. You need to wait longer and steer toward the outside of curves so your trailer does not lag behind on the inside of the curve as you correct for a bend in the road. When you are driving the Fiat, you may want to aim for the middle of the lane at all times, possibly even apexing bends to reduce your driving distance, which is actually the opposite of what truck drivers need to do. Control policies apply to other parts of driving too. I want a different policy for braking than I use for steering. On some vehicles I go through a gear-shifting workflow, and on others I don't. See assertion A-3. So, I don't intend to argue the technical minutiae of each design point, but I challenge you to make sure that we arrive at a simple system that (1) any OpenStack user can comprehend, (2) responds quickly to alarm stimulus, (3) is unlikely to fail, and (4) can be easily customized with user-supplied logic that controls how the scaling happens, and under what conditions. It would be better if we could explain Autoscale like this: Heat -> Autoscale -> Nova, etc. -or- User -> Autoscale -> Nova, etc. This approach allows use cases where (for whatever reason) the end user does not want to use Heat at all, but still wants something simple to be auto-scaled for them. Nobody would be scratching their heads wondering why things are going in circles. From an implementation perspective, that means the auto-scale service needs at least a simple linear workflow capability in it that may trigger a Heat orchestration if there is a good reason for it.
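The combination of A-3 (user-tunable policy, including cooldown) and A-4 (scale-up as a simple linear workflow rather than an orchestration) can be sketched as below. Everything here — class names, the policy shape, the workflow steps — is invented for illustration; it is not TaskFlow's real API or any existing Autoscale code.

```python
# Illustrative only: a cooldown-guarded scaling policy driving a linear
# workflow of steps. Names and shapes are assumptions for this sketch.
import time

class ScalingPolicy:
    def __init__(self, min_size, max_size, cooldown_secs):
        self.min_size = min_size
        self.max_size = max_size
        self.cooldown_secs = cooldown_secs
        self._last_action = 0.0

    def decide(self, current_size, adjustment, now=None):
        """Return the new group size, or None if the policy vetoes the change."""
        now = time.time() if now is None else now
        if now - self._last_action < self.cooldown_secs:
            return None  # still cooling down from the last adjustment
        new_size = max(self.min_size, min(self.max_size, current_size + adjustment))
        if new_size == current_size:
            return None  # clamped to the same size; nothing to do
        self._last_action = now
        return new_size

def scale_up_workflow(boot_server, add_to_pool, count):
    """A linear workflow: boot each server, then register it with the pool."""
    added = []
    for _ in range(count):
        server = boot_server()
        add_to_pool(server)
        added.append(server)
    return added
```

The policy object is where per-application logic (A-3.3/A-3.4) would live; the workflow is deliberately a short linear sequence, which is the point of A-4.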
This way, the typical use cases don't have anything resembling circular dependencies. The source of truth for how many members are currently in an Autoscaling group should be the Autoscale service, not the Heat database. If you want to expose that in list-stack-resources output, then have Heat call out to the Autoscale service to fetch that figure as needed. It is irrelevant to orchestration. Code does not need to be duplicated: both Autoscale and Heat can use the same source code files for the code that launches/terminates instances of resources. References: [1] https://wiki.openstack.org/wiki/TaskFlow [2] http://en.wikipedia.org/wiki/Rube_Goldberg_machine Thanks, Adrian On Aug 16, 2013, at 11:36 AM, Zane Bitter zbit...@redhat.com wrote: On 16/08/13 00:50, Christopher Armstrong wrote: [quoted message snipped]
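Adrian's source-of-truth point — Heat fetching the current member count from the Autoscale service instead of caching it in its own database — might look something like the following. The resource class and client interface are invented for this sketch; only the delegation pattern matters.

```python
# Hypothetical: a Heat-side group resource that treats the Autoscale
# service as the source of truth for current size. The client interface
# is an assumption made for illustration, not a real Heat or AS API.
class AutoScalingGroupResource:
    def __init__(self, group_id, autoscale_client):
        self.group_id = group_id
        self.client = autoscale_client

    def current_size(self):
        # No cached copy in the Heat database: ask the AS service each time,
        # so list-stack-resources output can never drift out of date.
        return self.client.get_group(self.group_id)["current_size"]

class FakeAutoscaleClient:
    """Stand-in for the (hypothetical) autoscale API client."""
    def __init__(self, sizes):
        self._sizes = sizes

    def get_group(self, group_id):
        return {"id": group_id, "current_size": self._sizes[group_id]}
```

The trade-off is an extra round trip per query, which is acceptable here because the figure is, as Adrian says, irrelevant to orchestration.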
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
On Fri, Aug 16, 2013 at 1:35 PM, Clint Byrum cl...@fewbar.com wrote: Excerpts from Zane Bitter's message of 2013-08-16 09:36:23 -0700: On 16/08/13 00:50, Christopher Armstrong wrote: [introduction snipped] 1. Some users want a service like Amazon's Auto Scaling or Rackspace's Otter -- a simple API that doesn't really involve orchestration. 2. If such an API exists, it makes sense for Heat to take advantage of its functionality instead of reimplementing it. +1, obviously. But the other half of the story is that the API is likely to be implemented using Heat on the back end, amongst other reasons because that implementation already exists. (As you know, since you wrote it ;) So, just as we will have an RDS resource in Heat that calls Trove, and Trove will use Heat for orchestration: user -> [Heat ->] Trove -> Heat -> Nova there will be a similar workflow for Autoscaling: user -> [Heat ->] Autoscaling -> Heat -> Nova After a lot of consideration and an interesting IRC discussion, I think the point above makes it clear for me. Autoscaling will have a simpler implementation by making use of Heat's orchestration capabilities, but the fact that Heat will also use autoscaling is orthogonal to that. That does beg the question of why this belongs in Heat. Originally we had taken the stance that there must be only one control system, lest they have a policy-based battle royale.
If we only ever let autoscaled resources be controlled via Heat (via a nested stack produced by autoscaling), then there can be only one... control service (Heat). By enforcing that autoscaling always talks to the world via Heat, though, I think that reaffirms for me that autoscaling, while not really the same project (seems like it could happily live in its own code tree), will be best served by staying inside the OpenStack Orchestration program. The question of private RPC or driving it via the API is not all that interesting to me. I do prefer the SOA method and having things talk via their respective public APIs, as it keeps things loosely coupled and thus easier to fit into one's brain and debug/change. I agree with using only public APIs. I have managed to fit this model of autoscaling managing a completely independent Heat stack into my brain, and I am willing to take it and run with it. Thanks to Zane and Clint for hashing this out with me in a 2-hour IRC design discussion, it was incredibly helpful :-) -- IRC: radix Christopher Armstrong Rackspace
Re: [openstack-dev] [Heat] How the autoscale API should control scaling in Heat
On Thu, Aug 15, 2013 at 6:39 PM, Randall Burt randall.b...@rackspace.com wrote: On Aug 15, 2013, at 6:20 PM, Angus Salkeld asalk...@redhat.com wrote: On 15/08/13 17:50 -0500, Christopher Armstrong wrote: 2. There should be a new custom-built API for doing exactly what the autoscaling service needs on an InstanceGroup, named something unashamedly specific -- like instance-group-adjust. Pros: It'll do exactly what it needs to do for this use case; very little state management in autoscale API; it lets Heat do all the orchestration and only give very specific delegation to the external autoscale API. Cons: The API grows an additional method for a specific use case. I like this one above: adjust(new_size, victim_list=['i1','i7']) So if you are reducing the new_size, we look in the victim_list to choose those first. This should cover Clint's use case as well. -Angus We could just support victim_list=[1, 7], since these groups are collections of identical resources. Simple indexing should be sufficient, I would think. Perhaps separating the stimulus from the actions to take would let us design/build toward different policy implementations. Initially, we could have a HeatScalingPolicy that works with the signals that a scaling group can handle. When/if AS becomes an API outside of Heat, we can implement a fairly simple NovaScalingPolicy that includes the args to pass to nova boot. I don't agree with using indices. I'd rather use the actual resource IDs. For one, indices can change out from under you. Also, figuring out the index of the instance you want to kill is probably an additional step most of the time you actually care about destroying specific instances. 3. the autoscaling API should update the Size Property of the InstanceGroup resource in the stack that it is placed in. This would require the ability to PATCH a specific piece of a template (an operation isomorphic to update-stack).
I think a PATCH semantic for updates would be generally useful in terms of quality of life for API users. Not having to pass the complete state and param values for trivial updates would be quite nice regardless of its implications to AS. Agreed. -- IRC: radix Christopher Armstrong Rackspace
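The adjust(new_size, victim_list=...) semantics Angus proposes — using resource IDs rather than indices, as Chris prefers — could look roughly like this. The function name and the fall-back policy for under-specified victim lists are assumptions for the sketch, not a real Heat API.

```python
# Hypothetical sketch of instance-group-adjust selection logic: when
# shrinking, remove instances named in victim_list first, then fall back
# to the earliest-listed remaining members (an assumed tie-break policy).
def adjust(current_ids, new_size, victim_list=()):
    """Return (ids_to_keep, ids_to_remove) for an instance group."""
    if new_size >= len(current_ids):
        # Growing (or no-op): the caller would boot the missing instances;
        # nothing is removed.
        return list(current_ids), []
    to_remove = [i for i in victim_list if i in current_ids]
    for i in current_ids:
        if len(to_remove) >= len(current_ids) - new_size:
            break
        if i not in to_remove:
            to_remove.append(i)
    to_remove = to_remove[: len(current_ids) - new_size]
    keep = [i for i in current_ids if i not in to_remove]
    return keep, to_remove
```

For example, adjust(['i1','i2','i3','i7'], 2, victim_list=['i1','i7']) keeps ['i2','i3'] and removes ['i1','i7'] -- the victims go first, and only then does the fall-back ordering kick in.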
[openstack-dev] [Heat] How the autoscale API should control scaling in Heat
*Introduction and Requirements*

So there's kind of a perfect storm happening around autoscaling in Heat right now. It's making it really hard to figure out how I should compose this email. There are a lot of different requirements, a lot of different cool ideas, and a lot of projects that want to take advantage of autoscaling in one way or another: Trove, OpenShift, TripleO, just to name a few... I'll try to list the requirements from various people/projects that may be relevant to autoscaling or scaling in general.

1. Some users want a service like Amazon's Auto Scaling or Rackspace's Otter -- a simple API that doesn't really involve orchestration.
2. If such an API exists, it makes sense for Heat to take advantage of its functionality instead of reimplementing it.
3. If Heat integrates with that separate API, however, that API will need two ways to do its work: (1) native instance-launching functionality, for the simple use case; (2) a way to talk back to Heat to perform orchestration-aware scaling operations.
4. There may be things different from AWS::EC2::Instance that we would want to scale (I have personally been playing around with the concept of a ResourceGroup, which would maintain a nested stack of resources based on an arbitrary template snippet).
5. Some people would like to be able to perform manual operations on an instance group -- such as Clint Byrum's recent example of remove instance 4 from resource group A.

Please chime in with your additional requirements if you have any! Trove and TripleO people, I'm looking at you :-)

*TL;DR* Point 3.2 above is the main point of this email: exactly how should the autoscaling API talk back to Heat to tell it to add more instances? I included the other points so that we keep them in mind while considering a solution.

*Possible Solutions*

I have heard at least three possibilities so far:

1. The autoscaling API should maintain a full template of all the nodes in the autoscaled nested stack, manipulate it locally when it wants to add or remove instances, and post an update-stack to the nested stack associated with the InstanceGroup. Pros: it doesn't require any changes to Heat. Cons: it puts a lot of the burden of state management on the autoscale API, and it arguably spreads the responsibility for orchestration out to the autoscale API. Also arguable is that automated agents outside of Heat shouldn't be managing an internal template, since templates are typically developed by devops people and kept in version control.

2. There should be a new custom-built API for doing exactly what the autoscaling service needs on an InstanceGroup, named something unashamedly specific -- like instance-group-adjust. Pros: it'll do exactly what it needs to do for this use case; very little state management in the autoscale API; it lets Heat do all the orchestration and gives only very specific delegation to the external autoscale API. Cons: the API grows an additional method for a specific use case.

3. The autoscaling API should update the Size property of the InstanceGroup resource in the stack that it is placed in. This would require the ability to PATCH a specific piece of a template (an operation isomorphic to update-stack). Pros: the API modification is generic, simply a more optimized version of update-stack; very little state management required in the autoscale API. Cons: this would essentially require manipulating the user-provided template (unless we have a concept of private properties, which perhaps wouldn't appear in the template as provided by the user, but could be manipulated with such an update-stack operation?).

*Addenda*

Keep in mind that there are use cases which require other types of manipulation of the InstanceGroup -- not just the autoscaling API. For example, see Clint's #5 above.
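Option 1 above (and the "multiply the template snippet by the current number of copies" model discussed earlier in the thread) can be sketched roughly as follows. The snippet shape and the member-naming convention are invented for illustration; this is not how Heat actually represents templates internally.

```python
# Illustrative sketch of option 1: derive a nested-stack template by
# multiplying a resource snippet by the desired group size, then post
# the result as an update-stack. Naming scheme is an assumption.
import copy
import json

def derive_template(snippet, size):
    """Build a template with `size` copies of `snippet`, named member-0..N-1."""
    resources = {}
    for i in range(size):
        resources["member-%d" % i] = copy.deepcopy(snippet)
    return {"Resources": resources}

snippet = {"Type": "AWS::EC2::Instance",
           "Properties": {"ImageId": "my-image", "InstanceType": "m1.small"}}

# Scaling up or down is then just re-deriving the template for the new
# size and posting it to the nested stack.
print(json.dumps(derive_template(snippet, 2), indent=2))
```

This makes the cons concrete: the autoscale API ends up owning a whole derived template (state management), and that template is an artifact no devops person ever sees in version control.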
Also, about implementation: Andrew Plunk and I have begun work on Heat resources for Rackspace's Otter, which I think will be a really good proof of concept for how this stuff should work in the Heat-native autoscale API. I am trying to gradually work the design into the native Heat autoscaling design, and we will need to solve the autoscale-controlling-InstanceGroup issue soon. -- IRC: radix Christopher Armstrong Rackspace