[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747484#comment-13747484
 ] 

Paul Angus commented on CLOUDSTACK-3535:
----------------------------------------

I've tested the HA functionality on KVM and found that it did not work.

CloudStack seems unable to 'stop' the VM that was on the failed host, 
because the host is unavailable.  I waited an hour and the instance remained in 
the 'Stopping' state.  I then restarted the host and the instance stopped, but 
5 hours later it still has not restarted.


2013-08-22 08:35:09,802 INFO  [cloud.ha.HighAvailabilityManagerImpl] 
(HA-Worker-0:work-3) KVMInvestigator found VM[User|HA-Test1]to be alive? null
2013-08-22 08:35:09,802 DEBUG [cloud.ha.HighAvailabilityManagerImpl] 
(HA-Worker-0:work-3) Fencing off VM that we don't know the state of
2013-08-22 08:35:09,802 DEBUG [cloud.ha.XenServerFencer] (HA-Worker-0:work-3) 
Don't know how to fence non XenServer hosts KVM
2013-08-22 08:35:09,803 INFO  [cloud.ha.HighAvailabilityManagerImpl] 
(HA-Worker-0:work-3) Fencer null returned null
2013-08-22 08:35:09,807 DEBUG [agent.transport.Request] (HA-Worker-0:work-3) 
Seq 2-1715210012: Sending  { Cmd , MgmtId: 345049337494, via: 2, Ver: v1, 
Flags: 100011, 
[{"com.cloud.agent.api.FenceCommand":{"vmName":"i-2-42-VM","hostGuid":"fdf1e936-0373-389b-abef-a68e339ff910-LibvirtComputingResource","hostIp":"10.0.100.41","inSeq":false,"wait":0}}]
 }
2013-08-22 08:35:09,905 DEBUG [agent.transport.Request] 
(AgentManager-Handler-13:null) Seq 2-1715210012: Processing:  { Ans: , MgmtId: 
345049337494, via: 2, Ver: v1, Flags: 10, 
[{"com.cloud.agent.api.FenceAnswer":{"result":true,"wait":0}}] }
2013-08-22 08:35:09,905 DEBUG [agent.transport.Request] (HA-Worker-0:work-3) 
Seq 2-1715210012: Received:  { Ans: , MgmtId: 345049337494, via: 2, Ver: v1, 
Flags: 10, { FenceAnswer } }
2013-08-22 08:35:09,905 INFO  [cloud.ha.HighAvailabilityManagerImpl] 
(HA-Worker-0:work-3) Fencer KVMFenceBuilder returned true
2013-08-22 08:35:09,911 DEBUG [cloud.capacity.CapacityManagerImpl] 
(HA-Worker-0:work-3) VM state transitted from :Running to Stopping with event: 
StopRequestedvm's original host id: 5 new host id: 5 host id before state 
transition: 5
2013-08-22 08:35:09,916 WARN  [cloud.vm.VirtualMachineManagerImpl] 
(HA-Worker-0:work-3) Unable to stop vm, agent unavailable: 
com.cloud.exception.AgentUnavailableException: Resource [Host:5] is 
unreachable: Host 5: Host with specified id is not in the right state: Down
2013-08-22 08:35:09,916 WARN  [cloud.vm.VirtualMachineManagerImpl] 
(HA-Worker-0:work-3) Unable to actually stop VM[User|HA-Test1] but continue 
with release because it's a force stop
2013-08-22 08:35:09,920 ERROR [cloud.ha.HighAvailabilityManagerImpl] 
(HA-Worker-0:work-3) Terminating HAWork[3-HA-42-Running-Investigating]
com.cloud.utils.exception.CloudRuntimeException: Caught exception even though 
it should be handled.
        at 
com.cloud.ha.HighAvailabilityManagerImpl.restart(HighAvailabilityManagerImpl.java:479)
        at 
com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.run(HighAvailabilityManagerImpl.java:831)
Caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:5] is 
unreachable: Host 5: Host with specified id is not in the right state: Down
        at 
com.cloud.agent.manager.ClusteredAgentManagerImpl.getAttache(ClusteredAgentManagerImpl.java:540)
        at 
com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:479)
        at 
com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:439)
        at 
com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1220)
        at 
com.cloud.ha.HighAvailabilityManagerImpl.restart(HighAvailabilityManagerImpl.java:476)
        ... 1 more
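
The log above was hard-wrapped by the mail client, so a single entry spans several lines. As a reading aid only, here is a minimal sketch that joins the wrapped lines back into one record per entry and picks out the WARN/ERROR entries. The function names (join_wrapped_entries, problems) are made up for this illustration and are not part of CloudStack; the timestamp/level/logger layout is assumed from the excerpt above.

```python
import re

# Start of an entry as it appears in the excerpt above:
# "2013-08-22 08:35:09,802 INFO  [cloud.ha.HighAvailabilityManagerImpl] ..."
ENTRY_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+)\s+\[([^\]]+)\]\s*(.*)$"
)


def join_wrapped_entries(lines):
    """Merge mail-wrapped continuation lines into one dict per log entry."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_START.match(line)
        if match:
            time, level, logger, rest = match.groups()
            entries.append(
                {"time": time, "level": level, "logger": logger, "message": rest}
            )
        elif entries:
            # Anything without a timestamp (wrapped text, stack frames)
            # is appended to the entry that precedes it.
            previous = entries[-1]
            previous["message"] = (previous["message"] + " " + line).strip()
    return entries


def problems(entries):
    """Return only the WARN and ERROR entries."""
    return [e for e in entries if e["level"] in ("WARN", "ERROR")]
```

Feeding it the excerpt above (for example via open(path).readlines() on a saved copy) should leave the two "Unable to stop" warnings and the "Terminating HAWork" error, with the stack trace folded into the error's message.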

                
> No HA actions are performed when a KVM host goes offline
> --------------------------------------------------------
>
>                 Key: CLOUDSTACK-3535
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3535
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Hypervisor Controller, KVM, Management Server
>    Affects Versions: 4.1.0, 4.1.1, 4.2.0
>         Environment: KVM (CentOS 6.3) with CloudStack 4.1
>            Reporter: Paul Angus
>            Assignee: edison su
>            Priority: Blocker
>             Fix For: 4.2.0
>
>         Attachments: extract-management-server.log.2013-08-09, 
> KVM-HA-4.1.1.2013-08-09-v1.patch, management-server.log.Agent
>
>
> If a KVM host 'goes down', CloudStack does not perform HA for instances which 
> are marked as HA enabled on that host (including system VMs).
> CloudStack does not show the host as disconnected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
