Excerpts from Walls, Jeffrey Joel (Cloud OS R&D)'s message of 2014-01-13 06:54:11 -0800:
> > From: Jaromir Coufal [mailto:[email protected]]
> > On 2014/10/01 19:02, Dougal Matthews wrote:
> > > - If I remove some instances, do I as the administrator need to care
> > >   which are removed? Do we need to choose or be informed at the end?
> >
> > This is great question on which we have long debates. I am convinced
> > that I as administrator, do care which nodes I want to free up.
> >
> > But current TripleO approach is using heat template and there we can
> > just specify number of nodes of that specific role. So it means that I
> > decrease from 10 to 9 instances and app will take care for us for some
> > node to be removed (AFAIK heat removes the last added node).
> >
> > So what we can do at the moment (until there is some way to specify
> > which node to remove) is to inform user, which nodes were removed in
> > the end... at least.
> >
> > In the future, I'd like to enable user to have both ways available -
> > just decrease number and let system to decide which nodes are going to
> > be removed for him (but at least inform in advance which nodes are the
> > chosen ones). Or, let user to choose by himself.
>
> Should a defect be filed against Heat then? If I have a system that is
> currently running my app server and heat comes along and deprovisions it
> (simply because it happened to be running on the system that was spun up
> last), I'm going to be quite upset.
>

Yes indeed, there are a number of bugs in this area and I think they're so
fundamental to our problems in TripleO that I haven't even taken the time
to file these bugs, which I should probably do.

Basically what we need is two changes:

1) Check for an already deleted server before deleting any. This is related
   to stack convergence:

   https://blueprints.launchpad.net/heat/+spec/stack-convergence

   This will allow users to just delete a server they want to delete, and
   then update the template to reflect reality.

2) Allow resources to be marked as critical or disposable. Critical
   resources would not ever be deleted for scaling purposes or during
   updates. An update would fail if there were no disposable resources.
   Scaling down would just need to be retried at this point.

With those two things, TripleO can make the default "disposable" for
stateless resources, and "critical" for stateful resources. Tuskar would
just report on problems in managing the Heat stack. Admins can then control
any business cases for evacuations/retirement of workloads/etc for
automation purposes.

Eventually perhaps we could use Mistral to manage that, but for now, I
think just being able to protect and manually delete important nodes for
scale down is enough. Perhaps Tuskar could even pop up a dialog showing
them and allowing manual selection.

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
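
To make the two proposed changes from the reply above a bit more concrete, here is a rough sketch of how a scale-down decision might combine them. This is purely illustrative and not Heat code: `Server`, its `critical` flag, `ScaleDownError`, and `plan_scale_down` are all hypothetical names, and failing when there are too few disposable servers is one possible reading of "an update would fail if there were no disposable resources".

```python
from dataclasses import dataclass
from typing import List, Set


class ScaleDownError(Exception):
    """Raised when scaling down would require deleting a critical server."""


@dataclass
class Server:
    name: str
    # Hypothetical marker: stateful roles would default to critical=True,
    # stateless roles to critical=False (i.e. "disposable").
    critical: bool = False


def plan_scale_down(servers: List[Server], desired: int,
                    already_deleted: Set[str]) -> List[str]:
    """Return the names of servers to delete to reach `desired` servers.

    `servers` is assumed to be in creation order, oldest first.
    """
    # Change 1: servers the admin already deleted by hand count toward the
    # reduction, so nothing else is removed on their behalf.
    alive = [s for s in servers if s.name not in already_deleted]
    excess = len(alive) - desired
    if excess <= 0:
        return []

    # Change 2: only disposable servers are candidates for removal.
    disposable = [s for s in alive if not s.critical]
    if len(disposable) < excess:
        raise ScaleDownError(
            f"need to remove {excess} servers but only "
            f"{len(disposable)} are disposable"
        )

    # Prefer the most recently added servers, mirroring the "last added
    # node" behaviour described earlier in the thread.
    newest_first = list(reversed(disposable))
    return [s.name for s in newest_first[:excess]]
```

With four servers where the second and fourth are critical, asking for three servers selects the newest disposable one. If the admin has already deleted a server by hand, the same request selects nothing further.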
