[
https://issues.apache.org/jira/browse/CLOUDSTACK-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rohit Yadav reassigned CLOUDSTACK-8943:
---------------------------------------
Assignee: Rohit Yadav
> KVM HA is broken, let's fix it
> ------------------------------
>
> Key: CLOUDSTACK-8943
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-8943
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public (Anyone can view this level - this is the default.)
> Environment: Linux distros with KVM/libvirt
> Reporter: Nux
> Assignee: Rohit Yadav
>
> Currently KVM HA works by monitoring an NFS-based heartbeat file, and it can
> often fail whenever this network share becomes slow, causing the
> hypervisors to reboot.
> This can be particularly annoying when you have different kinds of primary
> storage in place that are working fine (people running Ceph, etc.).
> Having to wait for the affected hypervisor (HV) that triggered this to come
> back and declare it's not running VMs is a bad idea; this HV could require
> hours or days of maintenance!
> This is embarrassing. How can we fix it? Ideas, suggestions? How are other
> hypervisors doing it?
> Let's discuss, test, implement. :)