[
https://issues.apache.org/jira/browse/CLOUDSTACK-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826803#comment-13826803
]
Sangeetha Hariharan commented on CLOUDSTACK-2140:
-------------------------------------------------
I saw the same issue while testing the following use case:
Basic zone with 2 XenServer 6.2 hosts in a cluster.
In my case, both hosts rebooted at the same time (probably because of a
failed heartbeat to storage).
The management server continued to show the hosts in the "Up" state,
but the VMs were marked as "Stopped" by VM sync.
When I tried to start the VMs, they started successfully, but we were not
able to program any SG rules. The following exception was seen in the
management server logs:
2013-11-18 19:54:56,816 WARN [xen.resource.CitrixResourceBase]
(DirectAgent-278:null) callHostPlugin failed for cmd: network_rules with args
seqno: 4, vmIP: 10.223.50.223, deflated: true, secIps: 0:, vmID: 60, vmMAC:
06:d1:2c:00:00:1c, vmName: i-2-60-VM, rules:
eJzztMpMzi2w0jUEIQM9MNQ30PFzjQhR8LQqSS6wMrSyNDAwwJQrTcElBwBskRQZ, signature:
8ccf4bc05a5c732547c37605cb869041, due to There was a failure communicating
with the plugin.
2013-11-18 19:54:56,817 WARN [agent.manager.DirectAgentAttache]
(DirectAgent-278:null) Seq 4-891945632: Exception Caught while executing command
com.cloud.utils.exception.CloudRuntimeException: callHostPlugin failed for cmd:
network_rules with args seqno: 4, vmIP: 10.223.50.223, deflated: true, secIps:
0:, vmID: 60, vmMAC: 06:d1:2c:00:00:1c, vmName: i-2-60-VM, rules:
eJzztMpMzi2w0jUEIQM9MNQ30PFzjQhR8LQqSS6wMrSyNDAwwJQrTcElBwBskRQZ, signature:
8ccf4bc05a5c732547c37605cb869041, due to There was a failure communicating
with the plugin.
at
com.cloud.hypervisor.xen.resource.CitrixResourceBase.callHostPlugin(CitrixResourceBase.java:4181)
at
com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:5775)
at
com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:565)
at
com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:59)
at
com.cloud.hypervisor.xen.resource.XenServer610Resource.executeRequest(XenServer610Resource.java:106)
at
com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:186)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
2013-11-18 19:54:56,818 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-278:null) Seq 4-891945632: Response Received:
2013-11-18 19:54:56,818 DEBUG [agent.transport.Request] (DirectAgent-278:null)
Seq 4-891945632: Processing: { Ans: , MgmtId: 7261447522054, via: 4, Ver: v1,
Flags: 110,
[{"com.cloud.agent.api.Answer":{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException:
callHostPlugin failed for cmd: network_rules with args seqno: 4, vmIP:
10.223.50.223, deflated: true, secIps: 0:, vmID: 60, vmMAC: 06:d1:2c:00:00:1c,
vmName: i-2-60-VM, rules:
eJzztMpMzi2w0jUEIQM9MNQ30PFzjQhR8LQqSS6wMrSyNDAwwJQrTcElBwBskRQZ, signature:
8ccf4bc05a5c732547c37605cb869041, due to There was a failure communicating
with the plugin.","wait":0}}] }
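For anyone debugging this, the `rules` argument in the log above can probably be inspected directly. It is logged with `deflated: true` and starts with `eJ`, which is how a standard zlib header looks after base64 encoding. The sketch below assumes the value is a base64-encoded zlib stream; that is an inference from the log, not something confirmed in this thread. The function names `decode_rules` and `encode_rules` are hypothetical helpers, not CloudStack code.

```python
import base64
import zlib


def decode_rules(payload: str) -> str:
    """Base64-decode and zlib-inflate a 'rules' value (assumed encoding)."""
    raw = base64.b64decode(payload)
    return zlib.decompress(raw).decode("utf-8")


def encode_rules(rules: str) -> str:
    """Inverse of decode_rules; only here to demonstrate a round trip."""
    compressed = zlib.compress(rules.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


if __name__ == "__main__":
    # "example-rule-text" is a made-up placeholder, not a real rule set.
    sample = encode_rules("example-rule-text")
    print(decode_rules(sample))
```

To look at the payload from the log, pass the `rules` string to `decode_rules`. If the string was altered by mail line wrapping, `base64.b64decode` or `zlib.decompress` will raise an error rather than return text.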
The workaround in this case was to force reconnect the hosts and then stop
and start all the VMs again.
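The workaround could in principle be scripted against the CloudStack API. The following is a minimal sketch, assuming the `reconnectHost`, `stopVirtualMachine` and `startVirtualMachine` API commands and the usual HMAC-SHA1 request signing. The helper names `sign_query` and `workaround_commands`, and any IDs and keys passed to them, are hypothetical. It only builds signed query strings and sends nothing. These commands are asynchronous, so a real script would also have to wait for each job to finish before issuing the next one.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_query(params: dict, api_key: str, secret_key: str) -> str:
    """Build a signed CloudStack API query string (HMAC-SHA1 scheme)."""
    full = {**params, "apiKey": api_key, "response": "json"}
    # URL-encode the values, then sort the pairs by lower-cased field name.
    pairs = sorted(
        ((k, quote(str(v), safe="")) for k, v in full.items()),
        key=lambda kv: kv[0].lower(),
    )
    query = "&".join(f"{k}={v}" for k, v in pairs)
    # The signature is computed over the lower-cased query string.
    digest = hmac.new(
        secret_key.encode("utf-8"),
        query.lower().encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="")
    return f"{query}&signature={signature}"


def workaround_commands(host_ids, vm_ids):
    """Order the calls as in the workaround: reconnect hosts, stop VMs, start VMs."""
    cmds = [{"command": "reconnectHost", "id": h} for h in host_ids]
    cmds += [{"command": "stopVirtualMachine", "id": v} for v in vm_ids]
    cmds += [{"command": "startVirtualMachine", "id": v} for v in vm_ids]
    return cmds
```

Each dictionary returned by `workaround_commands` can be passed to `sign_query` and appended to the management server's API endpoint URL.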
> Host is still marked as being in "Up" state when the host is shutdown (when
> there are no more hosts in the cluster)
> -------------------------------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-2140
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-2140
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.2.0, 4.2.1
> Environment: build from master
> Reporter: Sangeetha Hariharan
> Assignee: Koushik Das
> Fix For: Future
>
> Attachments: management-server.rar
>
>
> Host is still marked as being in "Up" state when the host is shut down (when
> there are no more hosts in the cluster).
> Set up:
> Advanced zone.
> 3 hosts in a cluster (in my case, host IDs 7, 8 and 9).
> I did not have any problems when host 8 and host 9 were shut down.
> When I tried to shut down host 7, the host remained in the "Up" state,
> even after the management server detected that it was not able to connect to
> this host.
> The following exception was seen in the management server logs:
> 2013-04-22 14:48:18,350 DEBUG [xen.resource.XenServerConnectionPool]
> (DirectAgent-350:null) localLogout has problem Failed to read server's
> response: connect timed out
> 2013-04-22 14:48:18,350 WARN [xen.resource.CitrixResourceBase]
> (DirectAgent-350:null) Unable to stop i-3-45-VM due to
> com.cloud.utils.exception.CloudRuntimeException: Unable to reset master of
> slave 10.223.59.4 to 10.223.59.2 due to org.apache.xmlrpc.XmlRpcException:
> Failed to read server's response: connect timed out
> at
> com.cloud.hypervisor.xen.resource.XenServerConnectionPool.PoolEmergencyResetMaster(XenServerConnectionPool.java:443)
> at
> com.cloud.hypervisor.xen.resource.XenServerConnectionPool.connect(XenServerConnectionPool.java:661)
> at
> com.cloud.hypervisor.xen.resource.CitrixResourceBase.getConnection(CitrixResourceBase.java:5583)
> at
> com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:3728)
> at
> com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:474)
> at
> com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:73)
> at
> com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:186)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:679)
> 2013-04-22 14:48:18,364 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-350:null) Seq 9-72160431: Response Received:
> 2013-04-22 14:48:18,370 DEBUG [agent.transport.Request]
> (DirectAgent-350:null) Seq 9-72160431: Processing: { Ans: , MgmtId:
> 7508777239729, via: 9, Ver: v1, Flags: 110,
> [{"StopAnswer":{"result":false,"details":"Exception:
> com.cloud.utils.exception.CloudRuntimeException\nMessage: Unable to reset
> master of slave 10.223.59.4 to 10.223.59.2 due to
> org.apache.xmlrpc.XmlRpcException: Failed to read server's response: connect
> timed out\nStack: com.cloud.utils.exception.CloudRuntimeException: Unable to
> reset master of slave 10.223.59.4 to 10.223.59.2 due to
> org.apache.xmlrpc.XmlRpcException: Failed to read server's response: connect
> timed out\n\tat
> com.cloud.hypervisor.xen.resource.XenServerConnectionPool.PoolEmergencyResetMaster(XenServerConnectionPool.java:443)\n\tat
>
> com.cloud.hypervisor.xen.resource.XenServerConnectionPool.connect(XenServerConnectionPool.java:661)\n\tat
>
> com.cloud.hypervisor.xen.resource.CitrixResourceBase.getConnection(CitrixResourceBase.java:5583)\n\tat
>
> com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:3728)\n\tat
>
> com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:474)\n\tat
>
> com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:73)\n\tat
>
> com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:186)\n\tat
>
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)\n\tat
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)\n\tat
> java.util.concurrent.FutureTask.run(FutureTask.java:166)\n\tat
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)\n\tat
>
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)\n\tat
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)\n\tat
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)\n\tat
> java.lang.Thread.run(Thread.java:679)\n","wait":0}}] }
> 2013-04-22 14:48:18,370 DEBUG [agent.transport.Request]
> (DirectAgent-276:null) Seq 9-72160431: Received: { Ans: , MgmtId:
> 7508777239729, via: 9, Ver: v1, Flags: 110, { StopAnswer } }
> 2013-04-22 14:48:18,370 WARN [cloud.vm.VirtualMachineManagerImpl]
> (DirectAgent-276:null) Unable to actually stop VM[User|anan5] but continue
> with release because it's a force stop
> 2013-04-22 14:48:18,371 WARN [agent.manager.DirectAgentAttache]
> (DirectAgent-276:null) Seq 7-1177944069: Exception caught
> com.cloud.utils.exception.CloudRuntimeException: Unable to stop the virtual
> machine due to Exception: com.cloud.utils.exception.CloudRuntimeException
> Host entries in DB:
> | 7 | Rack3Host17.lab.vmops.com |
> ea8a5618-3e10-4a02-a6ea-9a8e10e7efc7 | Up | Routing |
> 10.223.59.2 | 255.255.255.192 | bc:30:5b:d4:1c:36 | 10.223.59.2
> | 255.255.255.192 | bc:30:5b:d4:1c:36 | NULL | NULL
> | NULL | 6 | 10.223.59.2 |
> 255.255.255.192 | bc:30:5b:d4:1c:36 | NULL | 1 | 4 |
> 4 | 2261 | iqn.2005-03.org.open-iscsi:7ad5ccd9c587 |
> NULL | XenServer | 6.0.2 | 16190149248 |
> com.cloud.hypervisor.xen.resource.XenServer602Resource | 4.2.0-SNAPSHOT |
> NULL | NULL | xen-3.0-x86_64 ,
> xen-3.0-x86_32p , hvm-3.0-x86_32 , hvm-3.0-x86_32p , hvm-3.0-x86_64 |
> fb549fb6-82b6-4b74-a468-c511d744b238 | 1 |
> 1 | 0 | 1334307227 | 7508777239729 | 2013-04-19 00:20:22 |
> 2013-04-18 21:22:16 | NULL | 6 | Enabled | NULL | NULL
> | Disabled |
> | 8 | Rack3Host18.lab.vmops.com |
> df1afd25-4871-41f7-bccb-44d8f9ce193d | Down | Routing |
> 10.223.59.3 | 255.255.255.192 | bc:30:5b:d4:23:54 | 10.223.59.3
> | 255.255.255.192 | bc:30:5b:d4:23:54 | NULL | NULL
> | NULL | 6 | 10.223.59.3 |
> 255.255.255.192 | bc:30:5b:d4:23:54 | NULL | 1 | 4 |
> 4 | 2261 | iqn.2005-03.org.open-iscsi:8191f9f922ef |
> NULL | XenServer | 6.0.2 | 16190149248 |
> com.cloud.hypervisor.xen.resource.XenServer602Resource | 4.2.0-SNAPSHOT |
> NULL | NULL | xen-3.0-x86_64 ,
> xen-3.0-x86_32p , hvm-3.0-x86_32 , hvm-3.0-x86_32p , hvm-3.0-x86_64 |
> b127d031-d3bc-4859-bc52-633962ba61a9 | 1 |
> 1 | 0 | 1334635374 | NULL | 2013-04-19 00:20:22 |
> 2013-04-18 21:23:10 | NULL | 90 | Enabled | NULL | NULL
> | Disabled |
> | 9 | Rack3Host19.lab.vmops.com |
> 565dff5f-e17a-4216-a20a-6283e2bab0bf | Down | Routing |
> 10.223.59.4 | 255.255.255.192 | bc:30:5b:d4:15:d2 | 10.223.59.4
> | 255.255.255.192 | bc:30:5b:d4:15:d2 | NULL | NULL
> | NULL | 6 | 10.223.59.4 |
> 255.255.255.192 | bc:30:5b:d4:15:d2 | NULL | 1 | 4 |
> 4 | 2261 | iqn.2005-03.org.open-iscsi:f3aa69c5a08c |
> NULL | XenServer | 6.0.2 | 3701658240 |
> com.cloud.hypervisor.xen.resource.XenServer602Resource | 4.2.0-SNAPSHOT |
> NULL | NULL | xen-3.0-x86_64 ,
> xen-3.0-x86_32p , hvm-3.0-x86_32 , hvm-3.0-x86_32p , hvm-3.0-x86_64 |
> bae94938-b22d-4874-8f18-1e9615338b3c | 1 |
> 1 | 0 | 1334637141 | NULL | 2013-04-19 00:20:22 |
> 2013-04-18 21:23:22 | NULL | 67 | Enabled