Re: [DISCUSS][FS] Host HA for CloudStack

Rohit Yadav Wed, 22 Feb 2017 00:07:26 -0800

Here are some advantages of the proposed feature, which is implemented in two
parts: a resource-agnostic HA framework and an HA provider implementation
specific to a resource:


- The framework provides building blocks for health checks, degrading,
recovery and fencing operations on an HA resource. This allows several custom
provider implementations per resource type (host: hypervisor/storage etc.),
selectable per host, with cluster-level and global settings to fine-tune
timeouts, rounds/limits and failure threshold ratios.
- It separates policy from mechanism: health check investigation, fault
detection (suspicions, activity checks) and degrade/recovery/fencing
operations are driven by the threshold/ratio of failed rounds. Beyond the
existing abstraction, the framework adds liveness checks, recovery and
degrading operations, and runs every operation (health checks, activity
checks, degrade/recovery/fence operations) against configurable
rounds/thresholds of failure, with each task bound by a timeout for a
partition (such as cluster, global etc.).
- To reduce load on the database and perform at scale (10k+ hosts), it provides
in-memory task/queue management with configurable queue limits/sizes,
operation thresholds and timeouts, and operation counter management (for
decision making), all bound by an FSM.
- It also manages resource ownership across multiple management servers.
- It extends instrumentation so that we can write deeper marvin tests that
check internal state and state transitions using fault/state-injecting APIs.
- The Host HA feature specifically provides means to recover and fence a host.
With this feature, we've implemented an HA provider for KVM that reliably uses
the out-of-band management subsystem (i.e. IPMI) for recovery/fencing
(reboot/power-off), and uses a storage plugin approach to detect (disk)
activity on NFS storage pools across a number of rounds, with a failure
threshold ratio for decision making.
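
The round/threshold-ratio decision described in the list above can be
illustrated with a minimal sketch. The class name, the sliding-window approach
and both settings (maxRounds, failureThreshold) are assumptions made for
illustration; they are not taken from the FS or from the actual implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: keeps the last N activity-check rounds and decides on escalation. */
final class ActivityCheckWindow {
    private final int maxRounds;            // assumed setting: rounds considered per decision
    private final double failureThreshold;  // assumed setting: ratio of failed rounds that triggers action
    private final Deque<Boolean> results = new ArrayDeque<>();

    ActivityCheckWindow(int maxRounds, double failureThreshold) {
        if (maxRounds <= 0) {
            throw new IllegalArgumentException("maxRounds must be positive");
        }
        if (failureThreshold < 0.0 || failureThreshold > 1.0) {
            throw new IllegalArgumentException("failureThreshold must be within [0, 1]");
        }
        this.maxRounds = maxRounds;
        this.failureThreshold = failureThreshold;
    }

    /** Records one round; true means (disk) activity was detected for the host. */
    void record(boolean activityDetected) {
        if (results.size() == maxRounds) {
            results.removeFirst(); // drop the oldest round so the window stays bounded
        }
        results.addLast(activityDetected);
    }

    /** Ratio of rounds without activity among the recorded rounds. */
    double failureRatio() {
        if (results.isEmpty()) {
            return 0.0;
        }
        long failures = results.stream().filter(active -> !active).count();
        return (double) failures / results.size();
    }

    /** Escalate (e.g. towards recovery or fencing) only once a full window crosses the threshold. */
    boolean shouldEscalate() {
        return results.size() == maxRounds && failureRatio() >= failureThreshold;
    }
}
```

Waiting for a full window before escalating is one possible policy; it trades
detection latency for fewer false positives from a single failed round.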

I'll send the PR soon. Thanks.


Regards.

________________________________
From: Koushik Das <koushik....@accelerite.com>
Sent: 21 February 2017 14:17:19
To: dev@cloudstack.apache.org
Subject: Re: [DISCUSS][FS] Host HA for CloudStack

See inline.

Thanks,
Koushik

On 21/02/17, 11:47 AM, "Rohit Yadav" <rohit.ya...@shapeblue.com> wrote:

    Hi Koushik,


    Thanks for sharing your comments and questions.


1. Yes, the FS is divided into two parts: a general HA framework that makes
no assumption about the type of resource, and an HA provider implementation
that works on a specific type of resource (hypervisor, storage etc.).

[Koushik] Hmm, the heading is misleading then. I would like to see the details
of the generic HA framework that you are proposing for any resource type. Which
resource types can/need to be HA’ed? Also I would like to see a clear
definition of “storage HA”, “network HA” or “any resource HA” etc. before going
ahead with this generic framework. If this new framework ends up doing only
Host/VM HA then there is no point in doing all this.

Specifically, with this feature we want to solve the problem of HA-ing a host
reliably, using the out-of-band management subsystem (i.e. IPMI-based
status/reboot/power-off to investigate/recover/fence the host) in the HA
provider implementation. Yes, host HA should trigger VM HA, i.e. for the host
being fenced, move HA VMs to other hosts. This also reliably solves the issue
of disk corruption when the same HA VMs get started on multiple hosts.

[Koushik] If host HA implies doing HA on all VMs running in a host, I am not 
clear as to why host HA is needed separately when there is already VM HA 
available.

    2. The old VM HA implementation makes a lot of assumptions about the type
of resource (i.e. VM) it is HA-ing. It is tied to VM HA, which is why HA for
hosts could not be added in a straightforward way without regressions we could
not test. The new HA framework makes no assumption about the type of resource
and separates policy from mechanism. We also want to add deterministic tests
(using marvin tests and a simulator-based HA provider implementation) to
demonstrate the generic HA functionality. In future, HA for other resources
such as VM, storage and network can be added with this framework. As a first
step we want to get the framework in, with support for Host as a resource
type. We also want to reduce assumptions and dependencies, as VM HA and Host
HA are related (sequence etc.). The HAProvider interface would be something
every hypervisor can implement.
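
To make the provider idea concrete, here is a rough sketch of what a
per-resource provider contract and a simulator-style implementation could look
like. The interface name HaProviderSketch, its methods and the simulated
provider are hypothetical and only illustrate the separation of policy from
mechanism; they are not the HAProvider interface from the FS or the PR.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical provider contract: the framework drives policy, the provider supplies mechanism. */
interface HaProviderSketch<R> {
    boolean isHealthy(R resource);    // health check
    boolean hasActivity(R resource);  // activity check (e.g. disk activity)
    boolean recover(R resource);      // e.g. try a reboot
    boolean fence(R resource);        // e.g. power off
}

/** Hypothetical simulator-style provider with fault injection, useful for deterministic tests. */
final class SimulatedHostHaProvider implements HaProviderSketch<String> {
    private final Set<String> failedHosts = new HashSet<>();
    private final Set<String> fencedHosts = new HashSet<>();

    /** Fault-injection hook: marks a host as failed. */
    void injectFailure(String host) {
        failedHosts.add(host);
    }

    @Override
    public boolean isHealthy(String host) {
        return !failedHosts.contains(host) && !fencedHosts.contains(host);
    }

    @Override
    public boolean hasActivity(String host) {
        // In this simulation a host shows activity exactly when it is healthy.
        return isHealthy(host);
    }

    @Override
    public boolean recover(String host) {
        if (fencedHosts.contains(host)) {
            return false; // a fenced (powered-off) host cannot be recovered here
        }
        failedHosts.remove(host);
        return true;
    }

    @Override
    public boolean fence(String host) {
        fencedHosts.add(host);
        return true;
    }
}
```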

[Koushik] Again, please justify why host HA is needed when VM HA is already
there. If the question is about ease of writing automated tests, I have already
written simulator-based tests for the existing VM HA. Please refer to
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Writing+tests+leveraging+the+simulator+enhancements
 for the test details.

    3. While a (VM) HA framework already exists, it was safer to write new
code and demonstrate that it works for any general HA resource than to refactor
and implement this in the old framework, which could introduce serious
regressions leading to production issues. For the most part, we've avoided
altering anything in the old HA framework while making sure that old (VM) HA
works well with the new HA framework. The JIRA issue for the feature is in the
FS.

[Koushik] As mentioned in a previous comment, please define which resources
need to be HA’d and why. For example, there is RVR, which provides HA for the
network services provided by the VR. Also, other network plugins may have
native ways of achieving HA and may not need anything from the CS
perspective. I wanted to make sure that all these points are accounted for
before we proceed with a generic framework.


    4. Any HA operation can be blocking in nature. One of the things included
is a background polling manager that polls for changes, along with a
task/activity executor, since out-of-band operations can take time. Therefore,
all the health/activity/fencing/recovery operations have timeouts, limits and
specific queues. The existing framework does not provide any abstraction to
queue operations, restrict operation timeouts, or tie them to an FSM. The
existing framework is also hard to test, specifically to validate using
integration tests. We also wanted to avoid adding any regressions to the
existing/old VM HA. Lastly, the primary use of IPMI/out-of-band management in
performing host HA is not investigation but recovery (try a reboot) and
fencing (power off).
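
As an illustration of operations that are queued, limited and bound by a
timeout, a minimal sketch could look like this. The class name
BoundedHaTaskExecutor and its parameters (workers, queueLimit, timeoutMillis)
are hypothetical; the sketch uses only java.util.concurrent and does not
reflect the actual framework code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch: runs HA operations on a bounded in-memory queue with a per-task timeout. */
final class BoundedHaTaskExecutor implements AutoCloseable {
    private final ThreadPoolExecutor pool;
    private final long timeoutMillis;

    BoundedHaTaskExecutor(int workers, int queueLimit, long timeoutMillis) {
        // A full queue rejects new tasks instead of growing without bound.
        this.pool = new ThreadPoolExecutor(workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueLimit), new ThreadPoolExecutor.AbortPolicy());
        this.timeoutMillis = timeoutMillis;
    }

    /** Returns the task's result, or false if it was rejected, failed, or timed out. */
    boolean run(Callable<Boolean> task) {
        Future<Boolean> future;
        try {
            future = pool.submit(task);
        } catch (RejectedExecutionException e) {
            return false; // queue limit reached
        }
        try {
            return Boolean.TRUE.equals(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the slow operation
            return false;
        } catch (ExecutionException e) {
            return false; // the operation itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        }
    }

    @Override
    public void close() {
        pool.shutdownNow();
    }
}
```

Treating a timeout the same as a failed round keeps the decision logic simple:
a slow out-of-band call counts against the host instead of blocking the poller.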

[Koushik] A lot of the points you have raised here are not correct. There is
already polling of all the hosts to find out VM state changes, and there are
queues and time-outs in place for sending commands to hypervisors etc. Have you
evaluated the option of using IPMI in the existing KVM HA plugins?



    Hope this answers your questions. Please feel free to add more comments
and questions. Thanks.


    Regards.


    ________________________________
    From: Koushik Das <koushik....@accelerite.com>
    Sent: 20 February 2017 11:45
    To: dev@cloudstack.apache.org
    Subject: Re: [DISCUSS][FS] Host HA for CloudStack

    Rohit,

    Thanks for the effort you have put in writing the FS. I have some questions 
based on my initial reading of the FS.

    1. “Host HA” – In the FS you are talking about a generic HA framework but
it is not clear what “host HA” means. Is it something like all or some
VMs running on a host being started on other host(s) in case of a failure,
or is it something else? How is it different from the existing “VM HA” that is
already there?
    2. You have mentioned that “Cloudstack lacks a way to reliably fence host”.
Cloudstack considers the VM a first-class object and so provides fencing for
VMs instead of hosts. There are hypervisor-specific plugins that implement the
mechanism to fence a VM. I am not sure it makes sense to expose host fencing,
as the end user doesn’t care about it. The VM fencing implementation can
use something like “host fencing” internally.
    3. There is an existing HA framework which provides plugins for doing 
investigation if a VM is alive or not, host is alive or not, fencing of VM in 
case it is not alive. It will be good to understand the limitations of the 
existing framework and how the new framework helps in solving these problems. 
We also need to understand if the limitation is in the framework or some 
specific plugin implementation that is causing issues. Reference to JIRA issues 
would help.
    4. You have mentioned using IPMI to investigate host failure. I would like
to understand why the same can’t be used in the existing framework.

    Thanks,
    Koushik

    On 16/02/17, 4:48 PM, "Rohit Yadav" <rohit.ya...@shapeblue.com> wrote:

        All,


        I would like to start a discussion on a new feature - Host HA for
CloudStack.

        CloudStack lacks a way to reliably fence a host. The idea of the
host-ha feature is to provide a general-purpose HA framework and a
hypervisor-specific HA provider implementation that can use additional
mechanisms such as OOBM (IPMI-based power management) to reliably investigate,
recover and fence a host. This feature can handle scenarios associated with
server crashes, reliable fencing of hosts and HA of VMs. The first version will
have an HA provider implementation for KVM (and one for the simulator, to test
the framework implementation and write marvin tests that can validate the
feature on Travis and others).


        Please have a look at the FS here:

        https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA


        Looking forward to your comments and questions.


        Regards.

        rohit.ya...@shapeblue.com
        www.shapeblue.com
        53 Chandos Place, Covent Garden, London  WC2N 4HS UK
        @shapeblue








    DISCLAIMER
    ==========
    This e-mail may contain privileged and confidential information which is 
the property of Accelerite, a Persistent Systems business. It is intended only 
for the use of the individual or entity to which it is addressed. If you are 
not the intended recipient, you are not authorized to read, retain, copy, 
print, distribute or use this message. If you have received this communication 
in error, please notify the sender and delete all copies of this message. 
Accelerite, a Persistent Systems business does not accept any liability for 
virus infected mails.









