Hi Dag,

1) Thanks for the reply. I was talking about canceling MM from CloudStack.
There are no issues taking the host in and out of MM at the
XenServer/XenCenter level. In the normal scenario, one first puts the host
in MM from CloudStack, then from XenCenter, does the reboot, exits MM in
XenCenter, and finally exits MM in CloudStack.
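
For reference, the sequence above can be sketched as a small Python helper
that only builds the commands in order and does not run anything. The
assumptions here are mine: that the legacy cloudmonkey CLI is used for the
CloudStack side (`prepare hostformaintenance` / `cancel hostmaintenance`),
that the `xe` CLI is used in place of XenCenter (XenCenter's maintenance mode
is roughly host-disable plus host-evacuate), and that the function name and
the placeholder IDs are hypothetical.

```python
from typing import List


def maintenance_cycle(cs_host_id: str, xs_host_uuid: str) -> List[List[str]]:
    """Return the commands for one maintenance cycle, in the order described.

    Assumption: cloudmonkey runs wherever it is configured against the
    management server, and xe runs on the XenServer pool master.
    """
    return [
        # 1. Enter maintenance mode from CloudStack first.
        ["cloudmonkey", "prepare", "hostformaintenance", f"id={cs_host_id}"],
        # 2. Enter maintenance mode on the XenServer side.
        ["xe", "host-disable", f"uuid={xs_host_uuid}"],
        ["xe", "host-evacuate", f"uuid={xs_host_uuid}"],
        # 3. Reboot (wait for the host to come back before the next step).
        ["xe", "host-reboot", f"uuid={xs_host_uuid}"],
        # 4. Exit maintenance mode on the XenServer side.
        ["xe", "host-enable", f"uuid={xs_host_uuid}"],
        # 5. Exit maintenance mode from CloudStack (the step that fails for us).
        ["cloudmonkey", "cancel", "hostmaintenance", f"id={cs_host_id}"],
    ]


if __name__ == "__main__":
    # Placeholder IDs; substitute the real CloudStack host ID and XenServer UUID.
    for cmd in maintenance_cycle("<cloudstack-host-id>", "<xenserver-host-uuid>"):
        print(" ".join(cmd))
```

In our case steps 1 to 4 behave normally; only the last command fails.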

2) When you said rebuild, did you mean ejecting the host from the pool and
reinstalling the OS? Also, I have yet to try deleting the host from
CloudStack and adding it back. Should I try that? Do you think it will work?

3) I also found this: https://issues.apache.org/jira/browse/CLOUDSTACK-8210
I know it is for KVM, but we are using CloudStack 4.4.

BTW, on a broader view, this zone has some funky stuff happening. It's
CloudStack 4.4.x and XenServer 6.2.
We have noticed that VRs go into reboot loops once we reboot the storage.
VMs get stuck on XenServer in the startup stages. Sometimes we can't shut
down VMs. Sometimes we can't migrate VMs between hosts. We have also found
"dead beef" on the XenServers (whatever that means; one of our engineers
told me). Let me dig out some logs for these things and I will try to share
them.

I am seriously thinking of reinstalling everything here. But I just need to
justify this to senior management.

--
Makrand


On Thu, Feb 22, 2018 at 6:14 PM, Dag Sonstebo <dag.sonst...@shapeblue.com>
wrote:

> Hi Makrand,
>
> Yes, this rings a bell – first off, I would advise you to tread very
> carefully – this is most likely an issue with the underlying XAPI db on
> your poolmaster, so there is a risk of further problems.
>
> We have seen this in the past with a couple of clients – and I think we
> found XS servers still in MM in XenCenter (unbeknownst to CloudStack) – but
> we then had some problems getting the hosts out of MM again from the
> Xen side. We have also seen situations where taking one host out of MM in
> XenCenter puts another host into MM, which is odd. I know that on one
> occasion we ended up removing / rebuilding / re-adding the stubborn MM host.
> Unfortunately we never found the actual root cause.
>
> Hopefully your issue is something simpler – have you checked that all SRs
> are plugged on all hosts?
>
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
> On 22/02/2018, 10:32, "Makrand" <makrandsa...@gmail.com> wrote:
>
>     Hi All,
>
>     A couple of days back we had an iSCSI issue and all the LUNs were
>     disconnected from the XenServer hosts. After the issue was fixed and
>     all LUNs were back online, we put one of the compute nodes in
>     Maintenance Mode from CloudStack for some BIOS checks. It took longer
>     than usual to go into MM (it was stuck in PrepareForMaintenance), but
>     it got there eventually. Now whenever we try to cancel its MM, it just
>     fails with: Command failed due to Internal Server Error.
>
>     The logs show the following:
>
>     2018-02-16 09:44:24,291 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>     (API-Job-Executor-27:ctx-1e865550 job-72477) Add job-72477 into job
>     monitoring
>     2018-02-16 09:44:24,292 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (API-Job-Executor-27:ctx-1e865550 job-72477) Executing AsyncJobVO
>     {id:72477, userId: 2, accountId: 2,
>      instanceType: Host, instanceId: 26, cmd:
>     org.apache.cloudstack.api.command.admin.host.CancelMaintenanceCmd,
> cmdInfo:
>     {"id":"4bca233d-0e61-495c-a522-43800fe311fc","r
>     esponse":"json","sessionkey":"ZxtGyco2RuYHil/VnglSOgguw5c\
> u003d","ctxDetails":"{\"com.cloud.host.Host\":\"4bca233d-
> 0e61-495c-a522-43800fe311fc\"}","cmdEventType":"MA
>     INT.CANCEL","ctxUserId":"2","httpmethod":"GET","_":"
> 1518774059073","uuid":"4bca233d-0e61-495c-a522-
> 43800fe311fc","ctxAccountId":"2","ctxStartEventId":"51924"},
>     cmdVe
>     rsion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result:
>     null, initMsid: 16143068278473, completeMsid: null, lastUpdated: null,
>     lastPolled: null, crea
>     ted: null}
>     2018-02-16 09:44:24,301 ERROR [c.c.a.ApiAsyncJobDispatcher]
>     (API-Job-Executor-27:ctx-1e865550 job-72477) Unexpected exception
> while
>     executing org.apache.cloudstack.a
>     pi.command.admin.host.CancelMaintenanceCmd
>     java.lang.NullPointerException
>             at
>     com.cloud.resource.ResourceManagerImpl.doCancelMaintenance(
> ResourceManagerImpl.java:2083)
>             at
>     com.cloud.resource.ResourceManagerImpl.cancelMaintenance(
> ResourceManagerImpl.java:2140)
>             at
>     com.cloud.resource.ResourceManagerImpl.cancelMaintenance(
> ResourceManagerImpl.java:1127)
>             at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>             at
>     sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:57)
>             at
>     sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
>             at java.lang.reflect.Method.invoke(Method.java:606)
>             at
>     org.springframework.aop.support.AopUtils.
> invokeJoinpointUsingReflection(AopUtils.java:317)
>             at
>     org.springframework.aop.framework.ReflectiveMethodInvocation.
> invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>             at
>     org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(
> ReflectiveMethodInvocation.java:150)
>             at
>     org.springframework.aop.interceptor.ExposeInvocationInterceptor.
> invoke(ExposeInvocationInterceptor.java:91)
>             at
>     org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(
> ReflectiveMethodInvocation.java:172)
>             at
>     org.springframework.aop.framework.JdkDynamicAopProxy.
> invoke(JdkDynamicAopProxy.java:204)
>             at com.sun.proxy.$Proxy147.cancelMaintenance(Unknown Source)
>     at
>     org.apache.cloudstack.api.command.admin.host.
> CancelMaintenanceCmd.execute(CancelMaintenanceCmd.java:102)
>             at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:
> 141)
>             at
>     com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:
> 108)
>             at
>     org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.
> runInContext(AsyncJobManagerImpl.java:503)
>             at
>     org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(
> ManagedContextRunnable.java:49)
>             at
>     org.apache.cloudstack.managed.context.impl.
> DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>             at
>     org.apache.cloudstack.managed.context.impl.DefaultManagedContext.
> callWithContext(DefaultManagedContext.java:103)
>             at
>     org.apache.cloudstack.managed.context.impl.DefaultManagedContext.
> runWithContext(DefaultManagedContext.java:53)
>             at
>     org.apache.cloudstack.managed.context.ManagedContextRunnable.run(
> ManagedContextRunnable.java:46)
>             at
>     org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(
> AsyncJobManagerImpl.java:460)
>             at
>     java.util.concurrent.Executors$RunnableAdapter.
> call(Executors.java:471)
>             at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>             at
>     java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1145)
>             at
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:615)
>             at java.lang.Thread.run(Thread.java:745)
>     2018-02-16 09:44:24,305 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (API-Job-Executor-27:ctx-1e865550 job-72477) Complete async job-72477,
>     jobStatus: FAILED, resultCode: 530, result:
>     org.apache.cloudstack.api.response.ExceptionResponse/
> null/{"uuidList":[],"errorcode":530}
>     2018-02-16 09:44:24,320 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
>     (DirectAgent-303:ctx-d1ac93ce) Done with process of VM state report.
> host: 1
>     2018-02-16 09:44:24,322 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>     (CapacityChecker:ctx-038e67bd) MessageBus message: host reserved
> capacity
>     released for VM: 1, checking if host reservation can be released for
> host:1
>     2018-02-16 09:44:24,329 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>     (CapacityChecker:ctx-038e67bd) Cannot release reservation, Found 7 VMs
>     Running on host 1
>     2018-02-16 09:44:24,340 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>     (CapacityChecker:ctx-038e67bd) MessageBus message: host reserved
> capacity
>     released for VM: 1, checking if host reservation can be released for
> host:1
>     2018-02-16 09:44:24,347 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>     (CapacityChecker:ctx-038e67bd) Cannot release reservation, Found 7 VMs
>     Running on host 1
>     2018-02-16 09:44:24,352 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>     (API-Job-Executor-27:ctx-1e865550 job-72477) Done executing
>     org.apache.cloudstack.api.command.admin.host.CancelMaintenanceCmd for
>     job-72477
>     2018-02-16 09:44:24,352 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>     (CapacityChecker:ctx-038e67bd) MessageBus message: host reserved
> capacity
>     released for VM: 1, checking if host reservation can be released for
> host:1
>     2018-02-16 09:44:24,356 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
>     (CapacityChecker:ctx-038e67bd) Cannot release reservation, Found 7 VMs
>     Running on host 1
>     2018-02-16 09:44:24,357 DEBUG [c.c.c.CapacityManagerImpl]
>     (CapacityChecker:ctx-038e67bd) No need to calibrate cpu capacity,
> host:1
>     usedCpu: 13500 reservedCpu: 0
>     2018-02-16 09:44:24,357 DEBUG [c.c.c.CapacityManagerImpl]
>     (CapacityChecker:ctx-038e67bd) No need to calibrate memory capacity,
> host:1
>     usedMem: 21881683968 reservedMem: 0
>     2018-02-16 09:44:24,363 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>     (API-Job-Executor-27:ctx-1e865550 job-72477) Remove job-72477 from job
>     monitoring
>
>     So far we have tried:
>     1) Rebooted the compute node multiple times, but it did not help.
>     2) Edited the DB and marked the resource state as Enabled, then
>     forcefully reconnected the host. All looked OK. We put it back in MM
>     and tried taking it out again; it fails with the same error.
>
>     Not sure what the issue is here. Anyone?
>
>
>
>     --
>     Makrand