[ https://issues.apache.org/jira/browse/CLOUDSTACK-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719171#comment-13719171 ]

Bryan Whitehead commented on CLOUDSTACK-3535:
---------------------------------------------

Comment for [~tqlogan]…

I've had the e1000 driver die / oops on me, which disconnected a host running CloudStack v3.0.2 (the scenario you described as unlikely). HA kicked in, but qemu-kvm was unable to get a lock on the qcow2 files to start the VMs on another host. I resolved this by shutting down the host that had lost its network, which forced all the qemu-kvm processes to shut down and release their locks. (Note: the shared filesystem was Gluster over InfiniBand/IPoIB.) No harm done. I'm not sure how other shared filesystems or block devices would react, though…

I'm not in favor of just alerting an admin if a host is down. I'd like to see the previous 3.0.2 behavior restored, with HA kicking in on a host disconnect (after a reasonable amount of time).

(FWIW, I've not experienced a host failure on 4.0.x, so I'm not sure whether that version also has this problem.)

> No HA actions are performed when a KVM host goes offline
> --------------------------------------------------------
>
>                 Key: CLOUDSTACK-3535
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3535
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Hypervisor Controller, KVM, Management Server
>    Affects Versions: 4.1.0, 4.1.1, 4.2.0
>         Environment: KVM (CentOS 6.3) with CloudStack 4.1
>            Reporter: Paul Angus
>            Priority: Blocker
>
> If a KVM host 'goes down', CloudStack does not perform HA for instances which are marked as HA enabled on that host (including system VMs).
> CloudStack does not show the host as disconnected.