Re: Some thoughts on enhancing High Availability in oVirt

Yaniv Kaul Tue, 14 Feb 2012 09:07:43 -0800

On 02/14/2012 06:31 PM, Adam Litke wrote:

On Thu, Feb 09, 2012 at 11:45:09AM -0500, Perry Myers wrote:

warning: tl;dr


Right now, HA in oVirt is limited to VM-level granularity.  Each VM
provides a heartbeat through vdsm back to the oVirt Engine.  If that
heartbeat is lost, the VM is terminated and (if the user has configured
it) the VM is relaunched.  If the host running that VM has lost its
heartbeat, the host is fenced (via a remote power operation) and all HA
VMs are restarted on an alternate host.
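
To make the current behaviour concrete, here is a minimal sketch of the decision logic described above. It is not taken from oVirt Engine or vdsm: the VmState type, the action names and the plan_remediation function are all hypothetical and only illustrate the two cases (lost VM heartbeat vs. lost host heartbeat).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class VmState:
    """Hypothetical view of one VM as the engine might see it."""
    name: str
    host: str
    heartbeat_ok: bool
    highly_available: bool  # "if the user has configured it"


def plan_remediation(vms: List[VmState],
                     host_heartbeat: Dict[str, bool]) -> List[Tuple[str, str]]:
    """Return (action, target) pairs for the behaviour described in the text.

    host_heartbeat maps a host name to True while the host still reports in.
    """
    actions = []
    fenced = set()
    # A host that lost its heartbeat is fenced (remote power operation).
    for host, alive in sorted(host_heartbeat.items()):
        if not alive:
            actions.append(("fence_host", host))
            fenced.add(host)
    for vm in vms:
        if vm.host in fenced:
            # Only HA VMs are restarted on an alternate host.
            if vm.highly_available:
                actions.append(("restart_on_other_host", vm.name))
        elif not vm.heartbeat_ok:
            # Lost VM heartbeat: terminate, then relaunch if configured.
            actions.append(("terminate_vm", vm.name))
            if vm.highly_available:
                actions.append(("relaunch_vm", vm.name))
    return actions
```

Note that nothing in this sketch looks inside the guest: a VM whose heartbeat is fine but whose services are down produces no action at all.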

Has anyone considered how live snapshots and live block copy will intersect HA
to provide a better end-user experience?  For example, will we be able to handle
a storage connection failure without power-cycling VMs by migrating storage to a
failover storage domain and/or live-migrating the VM to a host with functioning
storage connections?

I think migrating a paused VM (due to EIO) is something KVM is afraid to do - there might be in-flight data (already in the host) en route to the storage.

I'm not entirely sure how you migrate the storage when it has failed.
Y.

Also, the policies for controlling if/when a VM should be restarted are
somewhat limited and hardcoded.

So there are two things that we can improve here:

1. Provide introspection into VMs so that we can monitor the health of
    individual services and not just the VM

2. Provide a more configurable way of expressing policy for when a VM
    (and its services) should trigger remediation by the HA subsystem

We can tackle these two things in isolation, or we can try to combine
and solve them at the same time.
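
As a rough illustration of point 2, a configurable policy could be expressed as per-service rules attached to a VM. This is only a sketch under assumed names: ServiceRule, VmPolicy, the max_restarts and escalate_to_vm knobs, and the returned action strings are all made up for the example and do not exist in oVirt.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ServiceRule:
    """Hypothetical per-service knobs."""
    max_restarts: int = 3        # service restarts to try before escalating
    escalate_to_vm: bool = True  # whether escalation may restart the VM


@dataclass
class VmPolicy:
    """Hypothetical policy for one VM and the services monitored inside it."""
    vm: str
    rules: Dict[str, ServiceRule] = field(default_factory=dict)

    def decide(self, service: str, consecutive_failures: int) -> str:
        """Map a failure count for one service to a remediation action."""
        rule = self.rules.get(service)
        if rule is None or consecutive_failures <= 0:
            return "no_action"          # unmonitored service or healthy
        if consecutive_failures <= rule.max_restarts:
            return "restart_service"    # try the cheap fix first
        return "restart_vm" if rule.escalate_to_vm else "alert_only"
```

The point of the shape is that the thresholds live in data rather than in hardcoded engine logic, so a user could tune them per VM or per service.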

Some possible paths (not the only ones) might be:

I also want to mention Memory Overcommitment Manager.  It hasn't been included
in vdsm yet, but the patches will be hitting gerrit within the next couple of
days.  MOM will contribute a single-host policy which is useful for making
decisions about the condition of a host and applying remediation policies:
ballooning, ksm, cgroups, vm ejection (migrating to another host).  It is
lightweight and will integrate seamlessly with vdsm from an oVirt-engine
perspective.

* Leverage Pacemaker Cloud (http://pacemaker-cloud.org/)

Pacemaker Cloud works by providing a generic (read: virt mgmt system
agnostic) way of managing HA for virtual machines and their services.
At a high level the concept is that you define one or more virtual
machines to be in an application group, and pcmk-cloud spawns a process
to monitor that application group using either Matahari/QMF or direct
SSH access.

pcmk-cloud is not meant to be a user facing component, so integration
work would need to be done here to have oVirt consume the pcmk-cloud
REST API for specifying what the application groups (sets of VMs) are
and exposing that through the oVirt web UI.

pcmk-cloud at a high level has the following functions:
   + monitoring of services through Matahari/QMF/SSH
   + monitoring of VMs through Matahari/QMF/SSH/Deltacloud
   + control of services through Matahari/QMF/SSH
   + control of VMs through Deltacloud or the native provider (in this
     case oVirt Engine REST API)
   + policy engine/model (per application group) to make decisions about
     when to control services/VMs based on the monitoring input
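
The monitoring and policy functions listed above can be sketched for a single application group as follows. This is not the pcmk-cloud API: the probe callable stands in for whichever transport is used (Matahari/QMF/SSH), and the escalation rule in decide_actions is an assumption chosen purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical probe signature: (vm_name, service_name) -> True if healthy.
# A real deployment would back this with Matahari/QMF or SSH; here it is injected.
Probe = Callable[[str, str], bool]


def failed_services(group: Dict[str, List[str]],
                    probe: Probe) -> Dict[str, List[str]]:
    """Map each VM in the application group to its services that failed the probe."""
    failures = {}
    for vm, services in group.items():
        bad = [svc for svc in services if not probe(vm, svc)]
        if bad:
            failures[vm] = bad
    return failures


def decide_actions(group: Dict[str, List[str]],
                   failures: Dict[str, List[str]]) -> List[Tuple[str, str, str]]:
    """Illustrative policy (an assumption): restart the whole VM when all of its
    monitored services are down, otherwise restart only the failed services."""
    actions = []
    for vm, bad in failures.items():
        if len(bad) == len(group[vm]):
            actions.append(("restart_vm", vm, ""))
        else:
            actions.extend(("restart_service", vm, svc) for svc in bad)
    return actions
```

Keeping the probe separate from the policy mirrors the integration question below: the transport can change (QMF/SSH vs. vdsm/ovirt-guest-agent) without touching the decision step.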

Integration decisions:
   + should pcmk-cloud use existing transports for monitoring/control
     (QMF/SSH), or do we leverage a new transport via
     vdsm/ovirt-guest-agent?
   + pcmk-cloud could act as the core policy engine to determine VM
     placement in the oVirt datacenter/clusters or it could be used
     solely for the monitoring/remediation aspect


* Leverage guest monitoring agents w/ ovirt-guest-agent

This would be taking the Services Agent from Matahari (which is just a C
library) and utilizing it from the ovirt-guest-agent.  So oga would
set up recurring monitoring of services using this lib and use its
existing communication path with vdsm->oVirt Engine to report back
service events.  In turn, oVirt Engine would need to interpret these
events and then issue service control actions back to oga.
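
A minimal sketch of the guest-side half of this flow, assuming the agent reports only state changes rather than every poll. The poll_services function, the event dictionary layout and the injected is_running callable are hypothetical; they are not part of ovirt-guest-agent or the Matahari Services Agent library.

```python
from typing import Callable, Dict, Iterable, List


def poll_services(services: Iterable[str],
                  is_running: Callable[[str], bool],
                  last_state: Dict[str, bool]) -> List[dict]:
    """Check each service once and return events for those whose state changed.

    last_state is updated in place so the next poll only reports new changes.
    is_running stands in for whatever the services library would provide.
    """
    events = []
    for svc in services:
        running = is_running(svc)
        if last_state.get(svc) != running:
            events.append({"service": svc, "running": running})
            last_state[svc] = running
    return events
```

In the agent this would be called on a timer, with the returned events forwarded over the existing channel to vdsm; the engine side would then need a policy step to turn those events into service control actions.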

Conceptually this is very similar to using pcmk-cloud in the case where
pcmk-cloud utilizes information obtained through oga/vdsm through oVirt
Engine instead of communicating directly to Guests via QMF/SSH.  In
fact, taking this route would probably end up duplicating some effort
because effectively you'd need the pcmk-cloud concept of the Cloud
Application Policy Engine (formerly called DPE/Deployable Policy Engine)
built directly into oVirt Engine anyhow.

So part of looking at this is determining how much reuse/integration of
existing components makes sense vs. just re-implementing similar concepts.

I've cc'd folks from the HA community/pcmk-cloud and hopefully we can
have a bit of a discussion to determine the best path forward here.

Perry
_______________________________________________
Arch mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/arch

