Sangeetha Hariharan created CLOUDSTACK-5470:
-----------------------------------------------

             Summary: XenServer - Host continues to remain in "Up" state when 
powered off due to exception - "Unable to reset master of slave"
                 Key: CLOUDSTACK-5470
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5470
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server
    Affects Versions: 4.3.0
         Environment: Build from 4.3
            Reporter: Sangeetha Hariharan
             Fix For: 4.3.0


Setup:

Advanced zone with 2 XenServer 6.2 hosts.

Each host had about 5 VMs.
Hourly snapshots were scheduled for the ROOT volumes of all the VMs.

Shut down the master host (I powered off the host machine using IPMI).

The host continues to show up in the "Up" state. Both hosts show the status 
"Up" in the cloud platform, and the following exception is seen in the logs:

2013-12-04 18:13:15,510 WARN [c.c.a.m.DirectAgentAttache] 
(DirectAgent-59:ctx-0bb6ad1e) Seq 1-2071592963: Exception Caught while 
executing command
com.cloud.utils.exception.CloudRuntimeException: Unable to reset master of 
slave 10.223.59.66 to 10.223.59.67 due to org.apache.xmlrpc.XmlRpcException: 
Failed to read server's response: connect timed out
at 
com.cloud.hypervisor.xen.resource.XenServerConnectionPool.PoolEmergencyResetMaster(XenServerConnectionPool.java:443)
at 
com.cloud.hypervisor.xen.resource.XenServerConnectionPool.connect(XenServerConnectionPool.java:661)
at 
com.cloud.hypervisor.xen.resource.CitrixResourceBase.getConnection(CitrixResourceBase.java:5985)
at 
com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:8248)
at 
com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:587)
at 
com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:59)
at 
com.cloud.hypervisor.xen.resource.XenServer610Resource.executeRequest(XenServer610Resource.java:106)
at 
com.cloud.agent.manager.DirectAgentAttache$Task.runInContext(DirectAgentAttache.java:216)
at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
2013-12-04 18:13:15,511 DEBUG [c.c.a.m.DirectAgentAttache] 
(DirectAgent-59:ctx-0bb6ad1e) Seq 1-2071592963: Response Received:
2013-12-04 18:13:15,511 DEBUG [c.c.a.t.Request] (DirectAgent-59:ctx-0bb6ad1e) 
Seq 1-2071592963: Processing: { Ans: , MgmtId: 112516401760401, via: 1, Ver: 
v1, Flags: 10, 
[{"com.cloud.agent.api.Answer":{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException:
 Unable to reset master of slave 10.223.59.66 to 10.223.59.67 due to 
org.apache.xmlrpc.XmlRpcException: Failed to read server's response: connect 
timed out","wait":0}}] }

Also, all the snapshots that were sent to the host that is currently down fail 
with the following exception:

2013-12-04 18:10:24,856 DEBUG [o.a.c.s.s.SnapshotServiceImpl] 
(Job-Executor-27:ctx-6cb3f72a ctx-dedf771a) create snapshot 
TestVM-1_ROOT-3_20131204231014 failed: 
com.cloud.utils.exception.CloudRuntimeException: Unable to reset master of 
slave 10.223.59.66 to 10.223.59.67 due to org.apache.xmlrpc.XmlRpcException: 
Failed to read server's response: connect timed out
2013-12-04 18:10:24,867 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy] 
(Job-Executor-27:ctx-6cb3f72a ctx-dedf771a) Failed to take snapshot: 
com.cloud.utils.exception.CloudRuntimeException: Unable to reset master of 
slave 10.223.59.66 to 10.223.59.67 due to org.apache.xmlrpc.XmlRpcException: 
Failed to read server's response: connect timed out
2013-12-04 18:10:24,872 DEBUG [c.c.s.s.SnapshotManagerImpl] 
(Job-Executor-27:ctx-6cb3f72a ctx-dedf771a) Failed to create snapshot
com.cloud.utils.exception.CloudRuntimeException: 
com.cloud.utils.exception.CloudRuntimeException: Unable to reset master of 
slave 10.223.59.66 to 10.223.59.67 due to org.apache.xmlrpc.XmlRpcException: 
Failed to read server's response: connect timed out
at 
org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:281)
at 
com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:951)
at sun.reflect.GeneratedMethodAccessor334.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at 
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at 
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy160.takeSnapshot(Unknown Source)
at 
org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1342)
at 
com.cloud.storage.VolumeApiServiceImpl.takeSnapshot(VolumeApiServiceImpl.java:1402)
at sun.reflect.GeneratedMethodAccessor333.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at 
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at 
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy232.takeSnapshot(Unknown Source)
at 
org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:181)
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
at 
com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
at 
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:520)
at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
2013-12-04 18:10:24,883 DEBUG [o.a.c.s.v.VolumeServiceImpl] 
(Job-Executor-27:ctx-6cb3f72a ctx-dedf771a) Take snapshot: 3 failed
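
For triage, the failures above can be tallied from a management server log 
with a small script. This is only a sketch: the function name 
count_reset_failures and the sample text are made up for illustration, and 
the only thing taken from the logs is the message wording "Unable to reset 
master of slave X to Y". It is not part of CloudStack.

```python
import re
from collections import Counter

# Matches the message text seen in the logs above. Whitespace between words is
# matched loosely (\s+) so that messages wrapped across lines are still found.
_IP = r"\d{1,3}(?:\.\d{1,3}){3}"
RESET_FAILURE = re.compile(
    r"Unable\s+to\s+reset\s+master\s+of\s+slave\s+"
    r"(?P<slave>" + _IP + r")\s+to\s+(?P<master>" + _IP + r")"
)


def count_reset_failures(log_text):
    """Count reset-master failures per (slave, master) address pair."""
    return Counter(
        (m.group("slave"), m.group("master"))
        for m in RESET_FAILURE.finditer(log_text)
    )


if __name__ == "__main__":
    # Hypothetical sample; in practice, read the management server log file.
    sample = (
        "create snapshot failed: CloudRuntimeException: Unable to \n"
        "reset master of slave 10.223.59.66 to 10.223.59.67 due to ...\n"
        "Failed to take snapshot: CloudRuntimeException: Unable to reset "
        "master of slave 10.223.59.66 to 10.223.59.67 due to ...\n"
    )
    for (slave, master), n in count_reset_failures(sample).items():
        print(f"slave {slave} -> master {master}: {n} failure(s)")
```

A high count for a single pair while both hosts are still reported as "Up" 
would match the behaviour described in this issue.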



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
