[ https://issues.apache.org/jira/browse/CLOUDSTACK-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719171#comment-13719171 ]

Bryan Whitehead commented on CLOUDSTACK-3535:
---------------------------------------------
Comment for [~tqlogan]… I've had the e1000 driver die / oops on me, causing a
host to disconnect under CloudStack v3.0.2 (the scenario you described as
unlikely).

HA kicked in, but qemu-kvm was unable to get a lock on the qcow2 files to start
up the VMs on another host. I resolved this by shutting down the host that had
no network, which forced all of its qemu-kvm processes to shut down and release
their locks. (Note: the shared filesystem was Gluster over InfiniBand/IPoIB.)
No harm done. I'm not sure how other shared filesystems or block devices would
react, though…

I'm not in favor of just alerting an admin when a host is down. I'd like to see
the previous 3.0.2 behavior restored, with HA kicking in on a host disconnect
(after a reasonable amount of time). (FWIW, I've not experienced a host failure
on 4.0.x, so I'm not sure whether that version also has this problem.)

> No HA actions are performed when a KVM host goes offline
> --------------------------------------------------------
>
> Key: CLOUDSTACK-3535
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3535
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Hypervisor Controller, KVM, Management Server
> Affects Versions: 4.1.0, 4.1.1, 4.2.0
> Environment: KVM (CentOS 6.3) with CloudStack 4.1
> Reporter: Paul Angus
> Priority: Blocker
>
> If a KVM host 'goes down', CloudStack does not perform HA for instances that
> are marked as HA-enabled on that host (including system VMs).
> CloudStack does not show the host as disconnected.