GabrielBrascher opened a new pull request #4978:
URL: https://github.com/apache/cloudstack/pull/4978


   ### Description
   
   Currently, the KVM HA implementation works only if the cluster has at least 
one primary storage pool served via NFS, because it relies on the NFS heartbeat 
script to check whether the host is healthy. This PR adds health checks that 
work regardless of the storage pool type. They are performed by a Java client 
that checks the Agent status via a web server.
   
   The additional web server exposes a simple JSON API that returns the list of 
virtual machines running on that host according to Libvirt. This way, KVM HA 
can verify the VM status reported by Libvirt with an HTTP call to this web 
server and determine whether the host is actually down or whether only the Java 
Agent has crashed.
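
As a rough illustration of the distinction described above, the following is a minimal sketch and not the code from this PR. The class name, the `HostState` values, and the endpoint URL are all hypothetical, and it assumes that any HTTP 200 answer from the helper means the host itself is still alive. It does not parse the JSON body, since the response format is not described here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical sketch: names and endpoint are illustrative, not the PR's actual API.
final class KvmHaHelperProbe {

    enum HostState { HEALTHY, AGENT_DOWN_HOST_UP, HOST_SUSPECTED_DOWN }

    // Returns true if the helper web server answers with HTTP 200 within the timeout.
    static boolean helperResponds(String url, Duration timeout) {
        try {
            HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
            HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                    .timeout(timeout)
                    .GET()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (InterruptedException e) {
            // Preserve the interrupt flag and treat the probe as failed.
            Thread.currentThread().interrupt();
            return false;
        } catch (IOException | IllegalArgumentException e) {
            // Connection refused, timeout, or malformed URL: treat as no response.
            return false;
        }
    }

    // Assumption: a connected Agent means healthy; otherwise the helper's answer
    // separates "only the Agent crashed" from "the host may be down".
    static HostState classify(boolean agentConnected, boolean helperResponded) {
        if (agentConnected) {
            return HostState.HEALTHY;
        }
        return helperResponded ? HostState.AGENT_DOWN_HOST_UP : HostState.HOST_SUSPECTED_DOWN;
    }
}
```

A caller could combine the two, for example `KvmHaHelperProbe.classify(false, KvmHaHelperProbe.helperResponds("http://kvm-host.example:8080/", Duration.ofSeconds(5)))`, where the host name and port are placeholders.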
   
   #### New KVM HA Helper component
   The following image shows how the new KVM-HA-Helper web service is 
integrated. The current NFS heartbeat execution flow is still used, alongside 
the new HA-Helper.
   
   <p align="center">
     <img width="460" height="300" 
src="https://user-images.githubusercontent.com/5025148/122809301-522bbe00-d2a4-11eb-9ebd-548d4f74b5fe.png">
   </p>
   
   #### High Availability Workflow
   
   Proposed workflow where the HA Check takes into account both **NFS 
Heartbeat** and the **KVM HA Helper** checks.
   
   **Note that**, to keep the diagram simple, it omits the whole [HA 
state machine](https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA). 
Even if both the NFS and HA Helper checks fail, the host is not necessarily 
recovered or fenced right away: depending on the HA configuration, the checks 
are repeated until the threshold of accepted failures is reached.
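
The re-check behaviour described in the note can be pictured with the following minimal sketch. It is not the CloudStack HA state machine: the class and method names are hypothetical, and it assumes that a check counts as failed only when both the NFS heartbeat and the HA Helper fail, that failures are counted consecutively, and that any successful check resets the counter.

```java
// Hypothetical sketch of a failure threshold; not the actual HA state machine.
final class HaFailureCounter {
    private final int maxFailures;
    private int consecutiveFailures;

    HaFailureCounter(int maxFailures) {
        if (maxFailures < 1) {
            throw new IllegalArgumentException("maxFailures must be at least 1");
        }
        this.maxFailures = maxFailures;
    }

    // Records one HA check round. Returns true once the number of consecutive
    // failed rounds reaches the threshold, i.e. recover/fence may be considered.
    boolean record(boolean nfsHeartbeatOk, boolean helperOk) {
        if (nfsHeartbeatOk || helperOk) {
            // Assumption: either check succeeding is enough to treat the host as alive.
            consecutiveFailures = 0;
            return false;
        }
        consecutiveFailures++;
        return consecutiveFailures >= maxFailures;
    }
}
```

With a threshold of 3, the first two failed rounds return `false` and only the third returns `true`. A round in which either check succeeds starts the count over.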
   
   <p align="center">
     <img width="400" height="500" 
src="https://user-images.githubusercontent.com/5025148/122818822-0a129880-d2b0-11eb-8085-226eb900a2f1.png">
   </p>
   
   ### Types of changes
   
   - [ ] Breaking change (fix or feature that would cause existing 
functionality to change)
   - [x] New feature (non-breaking change which adds functionality)
   - [ ] Bug fix (non-breaking change which fixes an issue)
   - [ ] Enhancement (improves an existing feature and functionality)
   - [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
   
   ### Feature/Enhancement Scale or Bug Severity
   
   #### Feature/Enhancement Scale
   
   - [x] Major
   - [ ] Minor
   
   <!-- ### How Has This Been Tested? -->

