Slair1 opened a new pull request #2722: CLOUDSTACK-10310 Fix KVM reboot on 
storage issue
URL: https://github.com/apache/cloudstack/pull/2722
 
 
   Rebase of #2472 onto 4.11
   
   If the KVM heartbeat file can't be written to, the host is rebooted, and 
thus taking down all VMs running on it. The code does try 5x times before the 
reboot, but the there is not a delay between the retires, so they are 5 
simultaneous retries, which doesn't help. Standard SAN storage HA operations or 
quick network blip could cause this reboot to occur.
   
   Some discussions on the dev mailing list revealed that some people are just 
commenting out the reboot line in their version of the CloudStack source.
   
   A better option would be have it sleep between tries so it isn't 5x almost 
simultaneous tries. Plus, instead of rebooting, the cloudstack-agent could just 
be stopped on the host instead. This will cause alerts to be issued and if the 
host is disconnected long-enough, depending on the HA code in use, VM HA could 
handle the host failure.
   
   The built-in reboot of the host seemed drastic

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to