rhtyd opened a new pull request #2984: kvm: reset KVM host on heartbeat failure URL: https://github.com/apache/cloudstack/pull/2984 On actual testing, I could see that kvmheartbeat.sh script fails on NFS server failure and stops the agent only. Any HA VMs could be launched in different hosts, and recovery of NFS server could lead to a state where a HA enabled VM runs on two hosts and can potentially cause disk corruptions. In most cases, VM disk corruption will be worse than VM downtime. I've kept the sleep interval between check/rounds but reduced it to 10s. The change in behaviour was introduced in #2722. ## Types of changes <!--- What types of changes does your code introduce? Put an `x` in all the boxes that apply: --> - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] New feature (non-breaking change which adds functionality) - [x] Bug fix (non-breaking change which fixes an issue) - [ ] Enhancement (improves an existing feature and functionality) - [ ] Cleanup (Code refactoring and cleanup, that may add test cases) ## How Has This Been Tested? See comment: https://github.com/apache/cloudstack/issues/2890#issuecomment-434207830
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services