[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangeetha Hariharan closed CLOUDSTACK-5430.
-------------------------------------------


Tested with latest build from 4.3:

1. Deploy a few VMs on each of the hosts with a 10 GB ROOT volume size, so we 
start with 10 VMs.
2. Create snapshots of the ROOT volumes.
3. While the snapshot is still in progress, make the primary storage 
unavailable for 10 minutes.

This causes the KVM hosts to reboot.

But the reboot of the KVM host is not successful. It gets stuck trying to 
unmount the NFS mount points. This is tracked in CLOUDSTACK-5429.

Stop and start the KVM hosts manually to work around this problem.

The KVM host is now marked as being in the "Up" state, even though the primary 
store is down and the agent was not able to create a mount point.

At this point all the VMs are marked as "Stopped" in CloudStack.

4. Now make the primary store available. 

There is no way for the mount points to be recreated at this point.

5. Force reconnect the host.

This results in the primary store mount point being recreated on the host.

After this I am able to start new VMs successfully. I am also able to take new 
snapshots successfully.
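
The force reconnect in step 5 can also be issued through the CloudStack API 
instead of the UI. The following is only a minimal sketch, assuming the 
standard signed-request scheme (HMAC-SHA1 over the sorted, lower-cased query 
string) and the reconnectHost command; the endpoint, host id, and keys are 
placeholders, not values from this setup.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_query(params, api_key, secret_key):
    """Build a signed CloudStack API query string (sketch, not verified
    against this particular 4.3 build)."""
    all_params = dict(params, apiKey=api_key, response="json")
    # Sort by parameter name (case-insensitive) and URL-encode each value.
    pairs = sorted(all_params.items(), key=lambda kv: kv[0].lower())
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in pairs)
    # Signature: HMAC-SHA1 over the lower-cased query, Base64-encoded.
    digest = hmac.new(secret_key.encode(), query.lower().encode(),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode()
    return f"{query}&signature={quote(signature, safe='')}"


def reconnect_host_url(endpoint, host_id, api_key, secret_key):
    """Return the full URL that asks the management server to reconnect
    the given host. All argument values are supplied by the caller."""
    params = {"command": "reconnectHost", "id": host_id}
    return endpoint + "?" + signed_query(params, api_key, secret_key)
```

The returned URL can be fetched with any HTTP client. The call only asks the 
management server to reconnect the agent, so the mount point should still be 
checked on the host afterwards.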




> KVM - Primary store down - Not able to start VMs/take snapshots after the 
> primary store is brought down and brought back up again.
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5430
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5430
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Management Server
>    Affects Versions: 4.3.0
>         Environment: Build from 4.3
>            Reporter: Sangeetha Hariharan
>            Assignee: Marcus Sorensen
>            Priority: Critical
>             Fix For: 4.3.0
>
>         Attachments: psdown.rar
>
>
> KVM - Primary store down - Not able to start VMs/take snapshots after the 
> primary store is brought down and brought back up again.
> Set up:
> Advanced zone with KVM (RHEL 6.3) hosts.
> Steps to reproduce the problem:
> 1. Deploy a few VMs on each of the hosts with a 10 GB ROOT volume size, so we 
> start with 10 VMs.
> 2. Create snapshots of the ROOT volumes.
> 3. While the snapshot is still in progress, make the primary storage 
> unavailable for 10 minutes.
> This causes the KVM hosts to reboot.
> But the reboot of the KVM host is not successful. It gets stuck trying to 
> unmount the NFS mount points. This is tracked in CLOUDSTACK-5429.
> Stop and start the KVM hosts manually to work around this problem.
> At this point all the VMs are marked as "Stopped" in CloudStack.
> 4. Now make the primary store available.
> 5. Attempt to start the VM.
> It fails to start with the following exception:
> 2013-12-09 20:35:55,891 DEBUG [c.c.a.t.Request] (AgentManager-Handler-2:null) Seq 2-1983250480: Processing:  { Ans: , MgmtId: 82324189320212, via: 2, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"java.lang.NullPointerException\n\tat com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:2488)\n\tat com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1260)\n\tat com.cloud.agent.Agent.processRequest(Agent.java:498)\n\tat com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:806)\n\tat com.cloud.utils.nio.Task.run(Task.java:83)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)\n\tat java.lang.Thread.run(Thread.java:679)\n","wait":0}}] }
> 2013-12-09 20:35:55,891 DEBUG [c.c.a.t.Request] (StatsCollector-3:ctx-f0d35c47) Seq 2-1983250480: Received:  { Ans: , MgmtId: 82324189320212, via: 2, Ver: v1, Flags: 10, { Answer } }
> 2013-12-09 20:35:56,939 DEBUG [c.c.a.ApiServlet] (catalina-exec-13:ctx-35adede4) ===START===  10.216.50.147 -- GET  command=queryAsyncJobResult&jobId=489806e9-96f9-4940-9ea0-6bd9516aabb0&response=json&sessionkey=qRSeXYRCfc1PSAXcomRT8ue1f%2BE%3D&_=1386639381768
> 2013-12-09 20:35:56,953 DEBUG [c.c.a.ApiServlet] (catalina-exec-13:ctx-35adede4 ctx-065180b8) ===END===  10.216.50.147 -- GET  command=queryAsyncJobResult&jobId=489806e9-96f9-4940-9ea0-6bd9516aabb0&response=json&sessionkey=qRSeXYRCfc1PSAXcomRT8ue1f%2BE%3D&_=1386639381768
> 2013-12-09 20:35:59,322 DEBUG [c.c.a.t.Request] (AgentManager-Handler-14:null) Seq 1-539557989: Processing:  { Ans: , MgmtId: 82324189320212, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"java.lang.NullPointerException\n\tat com.cloud.hypervisor.kvm.storage.KVMStoragePoolManager.disconnectPhysicalDisksViaVmSpec(KVMStoragePoolManager.java:181)\n\tat com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:3672)\n\tat com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1282)\n\tat com.cloud.agent.Agent.processRequest(Agent.java:498)\n\tat com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:806)\n\tat com.cloud.utils.nio.Task.run(Task.java:83)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)\n\tat java.lang.Thread.run(Thread.java:679)\n","wait":0}},{"com.cloud.agent.api.Answer":{"result":false,"details":"Stopped by previous failure","wait":0}},{"com.cloud.agent.api.Answer":{"result":false,"details":"Stopped by previous failure","wait":0}},{"com.cloud.agent.api.Answer":{"result":false,"details":"Stopped by previous failure","wait":0}},{"com.cloud.agent.api.Answer":{"result":false,"details":"Stopped by previous failure","wait":0}},{"com.cloud.agent.api.Answer":{"result":false,"details":"Stopped by previous failure","wait":0}}] }
> 2013-12-09 20:35:59,322 DEBUG [c.c.a.t.Request] (Job-Executor-26:ctx-0382e21d ctx-d8f9d323) Seq 1-539557989: Received:  { Ans: , MgmtId: 82324189320212, via: 1, Ver: v1, Flags: 10, { Answer, Answer, Answer, Answer, Answer, Answer } }
> 6. Attempting to take snapshots also fails with the following exception:
> 2013-12-09 20:54:10,509 DEBUG [c.c.a.t.Request] (AgentManager-Handler-10:null) Seq 2-1983250525: Processing:  { Ans: , MgmtId: 82324189320212, via: 2, Ver: v1, Flags: 10, [{"org.apache.cloudstack.storage.command.CreateObjectAnswer":{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException: java.lang.NullPointerException","wait":0}}] }
> 2013-12-09 20:54:10,509 DEBUG [c.c.a.t.Request] (Job-Executor-34:ctx-eb237191 ctx-20bb478f) Seq 2-1983250525: Received:  { Ans: , MgmtId: 82324189320212, via: 2, Ver: v1, Flags: 10, { CreateObjectAnswer } }
> 2013-12-09 20:54:10,509 DEBUG [o.a.c.s.s.SnapshotServiceImpl] (Job-Executor-34:ctx-eb237191 ctx-20bb478f) create snapshot TestVM-tiny-host-0ps-0-4_ROOT-49_20131210014410 failed: com.cloud.utils.exception.CloudRuntimeException: java.lang.NullPointerException
> 2013-12-09 20:54:10,519 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy] (Job-Executor-34:ctx-eb237191 ctx-20bb478f) Failed to take snapshot: com.cloud.utils.exception.CloudRuntimeException: java.lang.NullPointerException
> 2013-12-09 20:54:10,536 DEBUG [c.c.s.s.SnapshotManagerImpl] (Job-Executor-34:ctx-eb237191 ctx-20bb478f) Failed to create snapshot
> com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: java.lang.NullPointerException
>         at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:281)
>         at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:951)
>         at sun.reflect.GeneratedMethodAccessor230.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>         at $Proxy161.takeSnapshot(Unknown Source)
>         at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1341)
>         at com.cloud.storage.VolumeApiServiceImpl.takeSnapshot(VolumeApiServiceImpl.java:1461)
>         at sun.reflect.GeneratedMethodAccessor229.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>         at $Proxy233.takeSnapshot(Unknown Source)
>         at org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:181)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
>         at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
>         at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
>         at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:520)
>         at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:722)
> 2013-12-09 20:54:10,544 DEBUG [o.a.c.s.v.VolumeServiceImpl] (Job-Executor-34:ctx-eb237191 ctx-20bb478f) Take snapshot: 49 failed
> com.cloud.utils.exception.CloudRuntimeException: Failed to create snapshot



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
