[
https://issues.apache.org/jira/browse/CLOUDSTACK-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747484#comment-13747484
]
Paul Angus commented on CLOUDSTACK-3535:
----------------------------------------
I've tested the HA functionality on KVM and found that it did not work.
CloudStack seems unable to 'stop' the VM that was on the failed host
because the host is unavailable. I waited an hour and the instance remained in
the 'Stopping' state. I then restarted the host and the instance stopped, but
5 hours later it still hasn't restarted.
2013-08-22 08:35:09,802 INFO [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-0:work-3) KVMInvestigator found VM[User|HA-Test1]to be alive? null
2013-08-22 08:35:09,802 DEBUG [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-0:work-3) Fencing off VM that we don't know the state of
2013-08-22 08:35:09,802 DEBUG [cloud.ha.XenServerFencer] (HA-Worker-0:work-3)
Don't know how to fence non XenServer hosts KVM
2013-08-22 08:35:09,803 INFO [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-0:work-3) Fencer null returned null
2013-08-22 08:35:09,807 DEBUG [agent.transport.Request] (HA-Worker-0:work-3)
Seq 2-1715210012: Sending { Cmd , MgmtId: 345049337494, via: 2, Ver: v1,
Flags: 100011,
[{"com.cloud.agent.api.FenceCommand":{"vmName":"i-2-42-VM","hostGuid":"fdf1e936-0373-389b-abef-a68e339ff910-LibvirtComputingResource","hostIp":"10.0.100.41","inSeq":false,"wait":0}}]
}
2013-08-22 08:35:09,905 DEBUG [agent.transport.Request]
(AgentManager-Handler-13:null) Seq 2-1715210012: Processing: { Ans: , MgmtId:
345049337494, via: 2, Ver: v1, Flags: 10,
[{"com.cloud.agent.api.FenceAnswer":{"result":true,"wait":0}}] }
2013-08-22 08:35:09,905 DEBUG [agent.transport.Request] (HA-Worker-0:work-3)
Seq 2-1715210012: Received: { Ans: , MgmtId: 345049337494, via: 2, Ver: v1,
Flags: 10, { FenceAnswer } }
2013-08-22 08:35:09,905 INFO [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-0:work-3) Fencer KVMFenceBuilder returned true
2013-08-22 08:35:09,911 DEBUG [cloud.capacity.CapacityManagerImpl]
(HA-Worker-0:work-3) VM state transitted from :Running to Stopping with event:
StopRequestedvm's original host id: 5 new host id: 5 host id before state
transition: 5
2013-08-22 08:35:09,916 WARN [cloud.vm.VirtualMachineManagerImpl]
(HA-Worker-0:work-3) Unable to stop vm, agent unavailable:
com.cloud.exception.AgentUnavailableException: Resource [Host:5] is
unreachable: Host 5: Host with specified id is not in the right state: Down
2013-08-22 08:35:09,916 WARN [cloud.vm.VirtualMachineManagerImpl]
(HA-Worker-0:work-3) Unable to actually stop VM[User|HA-Test1] but continue
with release because it's a force stop
2013-08-22 08:35:09,920 ERROR [cloud.ha.HighAvailabilityManagerImpl]
(HA-Worker-0:work-3) Terminating HAWork[3-HA-42-Running-Investigating]
com.cloud.utils.exception.CloudRuntimeException: Caught exception even though
it should be handled.
at
com.cloud.ha.HighAvailabilityManagerImpl.restart(HighAvailabilityManagerImpl.java:479)
at
com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.run(HighAvailabilityManagerImpl.java:831)
Caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:5] is
unreachable: Host 5: Host with specified id is not in the right state: Down
at
com.cloud.agent.manager.ClusteredAgentManagerImpl.getAttache(ClusteredAgentManagerImpl.java:540)
at
com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:479)
at
com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:439)
at
com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1220)
at
com.cloud.ha.HighAvailabilityManagerImpl.restart(HighAvailabilityManagerImpl.java:476)
... 1 more
> No HA actions are performed when a KVM host goes offline
> --------------------------------------------------------
>
> Key: CLOUDSTACK-3535
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3535
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Hypervisor Controller, KVM, Management Server
> Affects Versions: 4.1.0, 4.1.1, 4.2.0
> Environment: KVM (CentOS 6.3) with CloudStack 4.1
> Reporter: Paul Angus
> Assignee: edison su
> Priority: Blocker
> Fix For: 4.2.0
>
> Attachments: extract-management-server.log.2013-08-09,
> KVM-HA-4.1.1.2013-08-09-v1.patch, management-server.log.Agent
>
>
> If a KVM host 'goes down', CloudStack does not perform HA for instances that
> are marked as HA-enabled on that host (including system VMs).
> CloudStack does not show the host as disconnected.