[
https://issues.apache.org/jira/browse/CLOUDSTACK-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736736#comment-13736736
]
Milamber commented on CLOUDSTACK-3535:
--------------------------------------
I tried to backport this patch to the 4.1 branch (the patch file is attached).
I tested it against the 4.1.1 tag (4.1.1 plus this patch only), but KVM HA does
not work when I unplug the Ethernet cable.
Here are the logs:
2013-08-09 18:29:20,418 INFO [agent.manager.AgentMonitor] (Thread-6:null)
Found the following agents behind on ping: [10]
2013-08-09 18:29:20,430 INFO [agent.manager.AgentManagerImpl]
(AgentTaskPool-3:null) Investigating why host 10 has disconnected with event
PingTimeout
2013-08-09 18:30:20,423 INFO [agent.manager.AgentMonitor] (Thread-6:null)
Found the following agents behind on ping: [10]
2013-08-09 18:30:20,428 INFO [agent.manager.AgentManagerImpl]
(AgentTaskPool-4:null) Investigating why host 10 has disconnected with event
PingTimeout
2013-08-09 18:31:00,447 INFO [utils.exception.CSExceptionErrorCode]
(AgentTaskPool-3:null) Could not find exception:
com.cloud.exception.OperationTimedoutException in error code list for exceptions
2013-08-09 18:31:00,448 WARN [agent.manager.AgentAttache]
(AgentTaskPool-3:null) Seq 10-1119682615: Timed out on Seq 10-1119682615: {
Cmd , MgmtId: 158525531671, via: 10, Ver: v1, Flags: 100011,
[{"CheckHealthCommand":{"wait":50}}] }
2013-08-09 18:31:00,448 WARN [agent.manager.AgentManagerImpl]
(AgentTaskPool-3:null) Operation timed out: Commands 1119682615 to Host 10
timed out after 100
2013-08-09 18:31:12,722 WARN [agent.manager.AgentManagerImpl]
(AgentTaskPool-3:null) Unsupported Command: Unsupported command
issued:com.cloud.agent.api.CheckOnHostCommand. Are you sure you got the right
type of server?
2013-08-09 18:31:12,722 INFO [agent.manager.AgentManagerImpl]
(AgentTaskPool-3:null) The state determined is Up
2013-08-09 18:31:12,722 INFO [agent.manager.AgentManagerImpl]
(AgentTaskPool-3:null) Agent is determined to be up and running
2013-08-09 18:31:20,427 INFO [agent.manager.AgentMonitor] (Thread-6:null)
Found the following agents behind on ping: [10]
2013-08-09 18:31:20,431 INFO [agent.manager.AgentManagerImpl]
(AgentTaskPool-5:null) Investigating why host 10 has disconnected with event
PingTimeout
2013-08-09 18:32:00,433 INFO [utils.exception.CSExceptionErrorCode]
(AgentTaskPool-4:null) Could not find exception:
com.cloud.exception.OperationTimedoutException in error code list for exceptions
2013-08-09 18:32:00,434 WARN [agent.manager.AgentAttache]
(AgentTaskPool-4:null) Seq 10-1119682616: Timed out on Seq 10-1119682616: {
Cmd , MgmtId: 158525531671, via: 10, Ver: v1, Flags: 100011,
[{"CheckHealthCommand":{"wait":50}}] }
2013-08-09 18:32:00,434 WARN [agent.manager.AgentManagerImpl]
(AgentTaskPool-4:null) Operation timed out: Commands 1119682616 to Host 10
timed out after 100
2013-08-09 18:32:05,710 WARN [agent.manager.AgentManagerImpl]
(AgentTaskPool-4:null) Unsupported Command: Unsupported command
issued:com.cloud.agent.api.CheckOnHostCommand. Are you sure you got the right
type of server?
2013-08-09 18:32:05,711 INFO [agent.manager.AgentManagerImpl]
(AgentTaskPool-4:null) The state determined is Up
2013-08-09 18:32:05,711 INFO [agent.manager.AgentManagerImpl]
(AgentTaskPool-4:null) Agent is determined to be up and running
2013-08-09 18:32:20,431 INFO [agent.manager.AgentMonitor] (Thread-6:null)
Found the following agents behind on ping: [10]
2013-08-09 18:32:20,435 INFO [agent.manager.AgentManagerImpl]
(AgentTaskPool-6:null) Investigating why host 10 has disconnected with event
PingTimeout
After I plug the cable back in, the HA-enabled VMs are restarted on another host.
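
For anyone reading the wrapped log above, here is a minimal sketch (plain Python, not CloudStack code) that rejoins the mail-wrapped lines into entries and lists what each AgentTaskPool thread logged. The function names `join_wrapped` and `summarize` and the event labels are made up for illustration; the sample text is copied from the log above. It only shows the sequence visible in the log: the CheckHealthCommand times out, CheckOnHostCommand is reported as unsupported, and the state is still determined to be Up.

```python
import re

# A log entry starts with a timestamp such as "2013-08-09 18:31:00,448".
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")
# The worker thread is logged as "(AgentTaskPool-3:null)".
THREAD = re.compile(r"\((AgentTaskPool-\d+):")

# Sample copied from the log above (thread AgentTaskPool-3 only).
LOG = """\
2013-08-09 18:29:20,430 INFO [agent.manager.AgentManagerImpl]
(AgentTaskPool-3:null) Investigating why host 10 has disconnected with event
PingTimeout
2013-08-09 18:31:00,448 WARN [agent.manager.AgentAttache]
(AgentTaskPool-3:null) Seq 10-1119682615: Timed out on Seq 10-1119682615: {
Cmd , MgmtId: 158525531671, via: 10, Ver: v1, Flags: 100011,
[{"CheckHealthCommand":{"wait":50}}] }
2013-08-09 18:31:12,722 WARN [agent.manager.AgentManagerImpl]
(AgentTaskPool-3:null) Unsupported Command: Unsupported command
issued:com.cloud.agent.api.CheckOnHostCommand. Are you sure you got the right
type of server?
2013-08-09 18:31:12,722 INFO [agent.manager.AgentManagerImpl]
(AgentTaskPool-3:null) The state determined is Up
"""


def join_wrapped(lines):
    """Rejoin mail-wrapped lines into one string per log entry."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)          # a new timestamped entry
        else:
            entries[-1] += " " + line     # continuation of the previous entry
    return entries


def summarize(entries):
    """Map each worker thread to the ordered list of events it logged."""
    events = {}
    for entry in entries:
        match = THREAD.search(entry)
        if not match:
            continue
        if "Investigating why host" in entry:
            label = "investigate"
        elif "CheckHealthCommand" in entry and "Timed out" in entry:
            label = "CheckHealthCommand timed out"
        elif "Unsupported command" in entry:
            label = "CheckOnHostCommand unsupported"
        elif "The state determined is" in entry:
            label = "state " + entry.rsplit(" ", 1)[-1]
        else:
            continue                      # other messages are ignored here
        events.setdefault(match.group(1), []).append(label)
    return events


if __name__ == "__main__":
    for thread, steps in summarize(join_wrapped(LOG.splitlines())).items():
        print(thread, "->", ", ".join(steps))
```

Running it against the full management-server log instead of the sample only requires passing the file's lines to `join_wrapped`.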
> No HA actions are performed when a KVM host goes offline
> --------------------------------------------------------
>
> Key: CLOUDSTACK-3535
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3535
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Hypervisor Controller, KVM, Management Server
> Affects Versions: 4.1.0, 4.1.1, 4.2.0
> Environment: KVM (CentOS 6.3) with CloudStack 4.1
> Reporter: Paul Angus
> Assignee: edison su
> Priority: Blocker
> Fix For: 4.2.0
>
> Attachments: KVM-HA-4.1.1.2013-08-09-v1.patch,
> management-server.log.Agent
>
>
> If a KVM host 'goes down', CloudStack does not perform HA for instances which
> are marked as HA enabled on that host (including system VMs).
> CloudStack does not show the host as disconnected.