On Wed, Oct 15, 2014 at 7:20 PM, Russell Bryant <rbry...@redhat.com> wrote:
On 10/13/2014 05:59 PM, Russell Bryant wrote:
Nice timing. I was working on a blog post on this topic.
which is now here:
I am absolutely loving the fact that we are finally having a
discussion in earnest about this. I think this deserves a Design
If I may weigh in here, let me share what I've seen users do and what
can currently be done, and what may be supported in the future.
Problem: automatically ensure that a Nova guest continues to run, even
if its host fails.
(That's the general problem description and I don't need to go into
further details explaining the problem, because Russell has done that
beautifully in his blog post.)
Now, what are the options?

(1) Punt and leave it to the hypervisor.
This essentially means that you must use a hypervisor that already has
HA built in, such as VMware with the vCenter driver. In that scenario,
Nova itself neither deals with HA, nor exposes any HA switches to the
user. Obvious downside: it is not generic. It doesn't work with all
hypervisors and, most importantly, doesn't work with the most popular one
(2) Deploy Nova nodes in pairs/groups, and pretend that they are one node.
You can already do that by overriding "host" in nova-compute.conf,
setting resume_guests_state_on_host_boot, and using VIPs with
Corosync/Pacemaker. You can then group these hosts in host aggregates,
and the user's scheduler hint to point a newly scheduled guest to such
a host aggregate becomes, effectively, the "keep this guest running at
all times" flag. Upsides: no changes to Nova at all, and monitoring,
fencing and recovery come for free from Corosync/Pacemaker. Downside:
it requires vendors to automate Pacemaker configuration in deployment
tools (because you really don't want to do those things manually).
Additional downside: you either have some idle hardware, or you might
be overcommitting resources in case of failover.
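
For illustration, here is a rough sketch of the Nova-side settings for
one such pair, rendered from Python. The logical host name
"ha-compute-1" is made up, the target file depends on your deployment,
and the Corosync/Pacemaker and VIP side is not shown.

```python
import configparser
import io


def render_nova_conf(logical_host):
    """Render the config fragment shared by all nodes in one HA group.

    logical_host is a hypothetical name that every node in the group
    reports to Nova, so that the group looks like a single compute node.
    """
    parser = configparser.ConfigParser()
    parser["DEFAULT"] = {
        # Override the host name Nova would otherwise derive per node.
        "host": logical_host,
        # Restart guests that were running when the host went down.
        "resume_guests_state_on_host_boot": "true",
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


fragment = render_nova_conf("ha-compute-1")
print(fragment)
```

Every node in the pair would get the same fragment; which node actually
runs nova-compute at any given time is then Pacemaker's decision.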

(3) Automatic host evacuation.
Not supported in Nova right now, as Adam pointed out at the top of the
thread, and repeatedly shot down. If someone were to implement this,
it would *still* require that Corosync/Pacemaker be used for
monitoring and fencing of nodes, because re-implementing this from
scratch would be the reinvention of a wheel while painting a bikeshed.

(4) Per-guest HA.
This is the idea of just doing "nova boot --keep-this-running", i.e.
setting a per-guest flag that still means the machine is to be kept up
at all times. Again, not supported in Nova right now, and probably
even more complex to implement generically than (3), at the same or
I have a suggestion to tackle this that I *think* is reasonably
user-friendly while still bearable in terms of Nova development
(a) Define a well-known metadata key for a host aggregate, say "ha".
Define that any host aggregate that represents a highly available
group of compute nodes should have this metadata key set.
(b) Then define a flavor that sets extra_specs "ha=true".
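
To make the matching concrete, here is a toy model of how a flavor with
extra_specs "ha=true" would be steered to aggregates carrying the "ha"
metadata key. It is loosely modelled on what Nova's
AggregateInstanceExtraSpecsFilter does, but it is not Nova code; the
function names, aggregate names, host names, and the value "true" on
the aggregate are all assumptions for the example.

```python
# Toy model only. Nothing here calls into Nova.

def aggregate_matches(flavor_extra_specs, aggregate_metadata):
    """True if every flavor extra spec is present in the aggregate metadata."""
    for key, wanted in flavor_extra_specs.items():
        if aggregate_metadata.get(key) != wanted:
            return False
    return True


def eligible_hosts(flavor_extra_specs, aggregates):
    """Collect hosts from every aggregate whose metadata satisfies the flavor."""
    hosts = set()
    for aggregate in aggregates:
        if aggregate_matches(flavor_extra_specs, aggregate["metadata"]):
            hosts.update(aggregate["hosts"])
    return sorted(hosts)


# Hypothetical deployment: one HA aggregate, one ordinary aggregate.
aggregates = [
    {"name": "ha-pairs", "metadata": {"ha": "true"},
     "hosts": ["ha-compute-1", "ha-compute-2"]},
    {"name": "plain", "metadata": {},
     "hosts": ["compute-3", "compute-4"]},
]

ha_flavor = {"ha": "true"}   # flavor with extra_specs "ha=true"
plain_flavor = {}            # flavor without the key

print(eligible_hosts(ha_flavor, aggregates))
print(eligible_hosts(plain_flavor, aggregates))
```

Note that in this model a flavor without the key matches every host, so
if HA capacity is to be reserved for HA guests, that would have to be
enforced by other means.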
Granted, this places an additional burden on distro vendors to
integrate highly-available compute nodes into their deployment
infrastructure. But since practically all of them already include
Pacemaker, the additional scaffolding required is actually rather