I want to start a dialog around adding some usability features to Savanna, now
that I have had a chance to spend a fair amount of time provisioning, scaling
and changing clusters. Here is a list of items that I believe are important to
address; comments welcomed:
1. Changing an OpenStack flavor associated with a node-group template has a
ripple effect on both node-group and cluster templates. The reason is that
OpenStack does not support modifying flavors in place; when a flavor is
"modified" (RAM, CPU, root disk, etc.) it is actually deleted and recreated,
resulting in a new flavor id. The implication is that any node-group templates
referencing the old flavor id, and any cluster templates referencing the
affected node groups, become stale and unusable. A user then has to start from
scratch, creating new node-group and cluster templates for a simple flavor
change.
        a. A possible solution is to internally associate the flavor name
with the node group and look up the flavor id by name when provisioning
instances (a rough sketch of this lookup follows this item).
        b. At a minimum it should be possible to change the flavor id
associated with a node group. See #2.
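To illustrate 1a, here is a minimal sketch (not current Savanna code) of
resolving a stored flavor name to whatever flavor id currently carries that
name at provisioning time. The credentials, endpoint and flavor name are
illustrative; only flavors.find() from python-novaclient is assumed:

    from novaclient import client as nova_client

    def resolve_flavor_id(nova, flavor_name):
        # find() raises NotFound if no flavor carries that name anymore, so a
        # stale node group fails fast instead of provisioning against a dead id
        return nova.flavors.find(name=flavor_name).id

    # illustrative client setup and flavor name
    nova = nova_client.Client('2', 'user', 'password', 'tenant',
                              'http://keystone:5000/v2.0')
    flavor_id = resolve_flavor_id(nova, 'hadoop-worker')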
2. Cluster templates and node-group templates are immutable. This is more of
an issue at the node-group level, as I often want to change a node group and
have that change propagate to every cluster template that uses it. I see this
as a fairly common need (a hypothetical update request is sketched after this
item).
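Purely to show what 2 could look like from the client side, here is a
hypothetical update call; a PUT on node-group-templates does not exist today
(that is the request), and the endpoint, ids and token are placeholders:

    import json
    import requests

    SAVANNA = 'http://savanna:8386/v1.0/<tenant-id>'   # placeholder endpoint
    TEMPLATE_ID = '<node-group-template-id>'           # placeholder id

    # hypothetical: update the template in place; every cluster template
    # referencing TEMPLATE_ID would pick up the new flavor automatically
    resp = requests.put(
        '%s/node-group-templates/%s' % (SAVANNA, TEMPLATE_ID),
        headers={'X-Auth-Token': '<token>',
                 'Content-Type': 'application/json'},
        data=json.dumps({'flavor_id': '42'}))
    resp.raise_for_status()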
3. Before provisioning a cluster, quotas should be checked to make sure that
enough quota exists. I know this check cannot be made fully transactional
(quota may be consumed between the check and the spawn), but a basic check
would go a long way (see the sketch after this item).
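A basic pre-flight check along the lines of 3 might look like the sketch
below: compare what the cluster would consume against the tenant's remaining
absolute limits as reported by Nova. The limit names used here are the ones
Nova commonly reports and should be treated as an assumption:

    def check_capacity(nova, instances, vcpus, ram_mb):
        # absolute limits report both the quota ceilings and current usage
        limits = dict((l.name, l.value) for l in nova.limits.get().absolute)
        free_instances = limits['maxTotalInstances'] - limits['totalInstancesUsed']
        free_cores = limits['maxTotalCores'] - limits['totalCoresUsed']
        free_ram = limits['maxTotalRAMSize'] - limits['totalRAMUsed']
        if instances > free_instances or vcpus > free_cores or ram_mb > free_ram:
            raise RuntimeError('Not enough quota to provision cluster')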
4. Spawning a large cluster comes with some problems today, as Savanna will
abort if a single VM fails. When deploying large clusters (hundreds to
thousands of nodes), which will be commonplace, a single slave VM (e.g. a data
node) failing to spawn should not necessarily abort the entire deployment,
particularly during a scaling operation. Obviously the failing VM cannot be
one hosting a master service. This applies to both the plugin and the
controller, as both are involved in the deployment. A possible solution is an
error/fault policy that lets a user specify (perhaps optionally) a percentage
or a hard minimum number of nodes that must come up before the deployment is
aborted; a rough sketch of such a policy follows this item.
        a. This also applies to scaling the cluster by larger increments.
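To make 4 concrete, here is a sketch of how such an error/fault policy could
be evaluated after a provisioning or scaling pass; the policy keys min_count
and min_percent are invented for illustration:

    def should_abort(requested, active, policy):
        # requested - number of slave VMs asked for in this operation
        # active    - number that actually came up healthy
        # policy    - e.g. {'min_count': 95} or {'min_percent': 90}
        if 'min_count' in policy:
            return active < policy['min_count']
        if 'min_percent' in policy:
            return active * 100.0 / requested < policy['min_percent']
        # no policy: today's behaviour, any failure aborts
        return active < requested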
Just some thoughts based on my experience last week; comments welcomed.
Best,
Erik