Hi Makrand,

Yes, this rings a bell. First off, I would advise you to tread very carefully: this is most likely an issue with the underlying XAPI DB on your pool master, so there is a risk of further problems.
We have seen this in the past with a couple of clients, and I think we found XS servers still in MM in XenCenter (unbeknownst to CloudStack), but we then had some problems getting the hosts out of MM again from the Xen side. We have also seen situations where taking one host out of MM in XenCenter puts another host into MM, which is odd. I know that on one occasion we ended up removing / rebuilding / re-adding the stubborn MM host. Unfortunately we never found the actual root cause.

Hopefully your issue is something simpler: have you checked that all SRs are plugged on all hosts?

Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue

On 22/02/2018, 10:32, "Makrand" <[email protected]> wrote:

Hi All,

A couple of days back we had an iSCSI issue and all the LUNs were disconnected from the XenServer hosts. After the issue was fixed and all LUNs were back online, we put one of the compute nodes into Maintenance Mode from CloudStack for some BIOS checks. It took longer than usual to go into MM (it was stuck in PrepareForMaintenance), but it got there eventually. Now whenever we try to cancel its MM, it just fails:

Command failed due to Internal Server Error.
The logs show the following:

2018-02-16 09:44:24,291 INFO [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-27:ctx-1e865550 job-72477) Add job-72477 into job monitoring
2018-02-16 09:44:24,292 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-27:ctx-1e865550 job-72477) Executing AsyncJobVO {id:72477, userId: 2, accountId: 2, instanceType: Host, instanceId: 26, cmd: org.apache.cloudstack.api.command.admin.host.CancelMaintenanceCmd, cmdInfo: {"id":"4bca233d-0e61-495c-a522-43800fe311fc","response":"json","sessionkey":"ZxtGyco2RuYHil/VnglSOgguw5c\u003d","ctxDetails":"{\"com.cloud.host.Host\":\"4bca233d-0e61-495c-a522-43800fe311fc\"}","cmdEventType":"MAINT.CANCEL","ctxUserId":"2","httpmethod":"GET","_":"1518774059073","uuid":"4bca233d-0e61-495c-a522-43800fe311fc","ctxAccountId":"2","ctxStartEventId":"51924"}, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 16143068278473, completeMsid: null, lastUpdated: null, lastPolled: null, created: null}
2018-02-16 09:44:24,301 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-27:ctx-1e865550 job-72477) Unexpected exception while executing org.apache.cloudstack.api.command.admin.host.CancelMaintenanceCmd
java.lang.NullPointerException
    at com.cloud.resource.ResourceManagerImpl.doCancelMaintenance(ResourceManagerImpl.java:2083)
    at com.cloud.resource.ResourceManagerImpl.cancelMaintenance(ResourceManagerImpl.java:2140)
    at com.cloud.resource.ResourceManagerImpl.cancelMaintenance(ResourceManagerImpl.java:1127)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
    at com.sun.proxy.$Proxy147.cancelMaintenance(Unknown Source)
    at org.apache.cloudstack.api.command.admin.host.CancelMaintenanceCmd.execute(CancelMaintenanceCmd.java:102)
    at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:141)
    at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108)
    at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:503)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
    at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:460)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2018-02-16 09:44:24,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-27:ctx-1e865550 job-72477) Complete async job-72477, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530}
2018-02-16 09:44:24,320 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] (DirectAgent-303:ctx-d1ac93ce) Done with process of VM state report. host: 1
2018-02-16 09:44:24,322 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (CapacityChecker:ctx-038e67bd) MessageBus message: host reserved capacity released for VM: 1, checking if host reservation can be released for host:1
2018-02-16 09:44:24,329 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (CapacityChecker:ctx-038e67bd) Cannot release reservation, Found 7 VMs Running on host 1
2018-02-16 09:44:24,340 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (CapacityChecker:ctx-038e67bd) MessageBus message: host reserved capacity released for VM: 1, checking if host reservation can be released for host:1
2018-02-16 09:44:24,347 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (CapacityChecker:ctx-038e67bd) Cannot release reservation, Found 7 VMs Running on host 1
2018-02-16 09:44:24,352 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-27:ctx-1e865550 job-72477) Done executing org.apache.cloudstack.api.command.admin.host.CancelMaintenanceCmd for job-72477
2018-02-16 09:44:24,352 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (CapacityChecker:ctx-038e67bd) MessageBus message: host reserved capacity released for VM: 1, checking if host reservation can be released for host:1
2018-02-16 09:44:24,356 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (CapacityChecker:ctx-038e67bd) Cannot release reservation, Found 7 VMs Running on host 1
2018-02-16 09:44:24,357 DEBUG [c.c.c.CapacityManagerImpl] (CapacityChecker:ctx-038e67bd) No need to calibrate cpu capacity, host:1 usedCpu: 13500 reservedCpu: 0
2018-02-16 09:44:24,357 DEBUG [c.c.c.CapacityManagerImpl] (CapacityChecker:ctx-038e67bd) No need to calibrate memory capacity, host:1 usedMem: 21881683968 reservedMem: 0
2018-02-16 09:44:24,363 INFO [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-27:ctx-1e865550 job-72477) Remove job-72477 from job monitoring

So far we have tried:

1) Rebooted the compute node multiple times, but that did not help.
2) Edited the DB and marked the resource state as Enabled, then forcefully reconnected the host. Everything looked OK. We then put it back into MM and tried taking it out again; it is FAILING with the same error.

Not sure what the issue is here. Anyone?

--
Makrand

[email protected]
www.shapeblue.com
53 Chandos Place, Covent Garden, London WC2N 4HS UK
@shapeblue
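
The two checks suggested in the reply (are all SRs plugged on all hosts, and does XenServer itself still think a host is in MM) could be scripted roughly as below. This is only a sketch under assumptions, not a tested procedure: it assumes the `xe` CLI is on the PATH where the script runs, that `pbd-list` and `host-list` accept the `currently-attached=false` and `enabled=false` filters together with `--minimal`, and that Python 3.7+ is available. The function names are made up for illustration. It only reads state and changes nothing on the pool.

```python
import subprocess
from typing import Callable, List

# Type of the command runner: takes xe arguments, returns stdout as text.
Runner = Callable[[List[str]], str]


def run_xe(args: List[str]) -> str:
    """Run the `xe` CLI (assumed to be on PATH) and return its stdout."""
    result = subprocess.run(
        ["xe", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def parse_minimal(output: str) -> List[str]:
    """Split the comma-separated UUID list printed by `xe ... --minimal`."""
    return [item for item in output.strip().split(",") if item]


def unplugged_pbds(run: Runner = run_xe) -> List[str]:
    """UUIDs of PBDs (host-to-SR connections) that are not attached."""
    return parse_minimal(
        run(["pbd-list", "currently-attached=false", "--minimal"])
    )


def disabled_hosts(run: Runner = run_xe) -> List[str]:
    """UUIDs of hosts XenServer reports as disabled (e.g. still in MM)."""
    return parse_minimal(run(["host-list", "enabled=false", "--minimal"]))


def report(run: Runner = run_xe) -> str:
    """Build a short human-readable summary of both checks."""
    pbds = unplugged_pbds(run)
    hosts = disabled_hosts(run)
    lines = [
        "Unplugged PBDs: " + (", ".join(pbds) if pbds else "none"),
        "Disabled hosts: " + (", ".join(hosts) if hosts else "none"),
    ]
    return "\n".join(lines)
```

Calling `print(report())` on the pool master would list any unplugged PBDs and any hosts that are disabled on the Xen side. Given the warning about the XAPI DB, anything it reports is probably best investigated by hand rather than fixed automatically.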
