On 10/13/2014 05:59 PM, Russell Bryant wrote:
Nice timing.  I was working on a blog post on this topic.

On 10/13/2014 05:40 PM, Fei Long Wang wrote:
I think Adam is talking about this bp:

For now, we're using Nagios probe/event to trigger the Nova evacuate
command, but I think it's possible to do that in Nova if we can find a
good way to define the trigger policy.

I actually think that's the right way to do it.

+1. Not everything needs to be built into Nova. This very much sounds like something that should be handled by PaaS-layer things that can react to a Nagios notification (or any other event) and take some sort of action, possibly using "administrative" commands like nova evacuate.

There are a couple of other things to consider:

1) An ideal solution also includes fencing.  When you evacuate, you want
to make sure you've fenced the original compute node.  You need to make
absolutely sure that the same VM can't be running more than once,
especially when the disks are backed by shared storage.

Because of the fencing requirement, another option would be to use
Pacemaker to orchestrate this whole thing.  Historically Pacemaker
hasn't been suitable to scale to the number of compute nodes an
OpenStack deployment might have, but Pacemaker has a new feature called
pacemaker_remote [1] that may be suitable.
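
To make the ordering constraint in point 1 concrete, here is a minimal sketch in Python of "fence first, then evacuate". It is not Nova or Pacemaker code: `fence`, `list_instances`, and `evacuate` are hypothetical hooks that an operator would supply (for example, wrappers around a power-fencing agent and the nova evacuate command), and the function name is made up for illustration.

```python
from typing import Callable, Iterable, List


def evacuate_failed_host(
    host: str,
    fence: Callable[[str], bool],
    list_instances: Callable[[str], Iterable[str]],
    evacuate: Callable[[str], None],
) -> List[str]:
    """Evacuate every instance on `host`, but only after fencing is confirmed.

    All three callables are hypothetical hooks (assumptions, not real APIs):
      fence(host)          -> True only if the node is confirmed powered off/isolated
      list_instances(host) -> IDs of the instances that were running on the node
      evacuate(instance)   -> rebuild the instance on another compute node
    """
    if not fence(host):
        # Without confirmed fencing the original VM may still be running,
        # so evacuating could start a second copy against the same disks.
        raise RuntimeError(f"fencing of {host} not confirmed; refusing to evacuate")

    moved = []
    for instance_id in list_instances(host):
        evacuate(instance_id)
        moved.append(instance_id)
    return moved
```

The point of the structure is only that a failed or unconfirmed fence aborts the whole operation before any instance is touched; how fencing is actually performed is left to whatever tool drives it.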

2) Looking forward, there is a lot of demand for doing this on a
per-instance basis.  We should decide on a best practice for allowing end
users to indicate whether they would like their VMs automatically
rescued by the infrastructure, or just left down in the case of a
failure.  It could be as simple as a special tag set on an instance [2].

Please note that server instance tagging (thanks for the shout-out, BTW) is intended only for user-defined tags, not system-defined metadata, which is what this sounds like...

Of course, one might implement some external polling/monitoring system using server instance tags, which might do a nova list --tag $TAG --host $FAILING_HOST, and initiate a migrate for each returned server instance...
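
A rough sketch of that selection step, in Python, assuming the monitoring system already has a listing of servers with their host and tags. The `Server` record, the `auto-rescue` tag name, and the `migrate` callback are all hypothetical placeholders, not part of any Nova API.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Server:
    """Hypothetical record; a real poller would build these from a server listing."""
    id: str
    host: str
    tags: FrozenSet[str]


def select_for_rescue(servers: Iterable[Server], tag: str, failing_host: str) -> List[Server]:
    # Same filter as the "list --tag $TAG --host $FAILING_HOST" idea:
    # keep only servers on the failing host whose owner set the tag.
    return [s for s in servers if s.host == failing_host and tag in s.tags]


def rescue_tagged(
    servers: Iterable[Server],
    tag: str,
    failing_host: str,
    migrate: Callable[[Server], None],
) -> List[str]:
    # `migrate` is a caller-supplied hook (an assumption), e.g. a wrapper
    # around whatever administrative command moves the instance.
    selected = select_for_rescue(servers, tag, failing_host)
    for server in selected:
        migrate(server)
    return [s.id for s in selected]


if __name__ == "__main__":
    fleet = [
        Server("vm-1", "compute-7", frozenset({"auto-rescue"})),
        Server("vm-2", "compute-7", frozenset()),
        Server("vm-3", "compute-9", frozenset({"auto-rescue"})),
    ]
    print(rescue_tagged(fleet, "auto-rescue", "compute-7", lambda s: None))
```

Untagged instances on the failing host are simply left alone, which matches the "just left down" choice described in point 2.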


[2] https://review.openstack.org/#/c/127281/

OpenStack-dev mailing list
