manasaveloori created CLOUDSTACK-5792:
-----------------------------------------

             Summary: All the VMs are shown as running even after the host is put in maintenance. NPE in logs
                 Key: CLOUDSTACK-5792
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5792
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Hypervisor Controller
    Affects Versions: 4.3.0
         Environment: upgraded from 2.2.16 to 4.3
            Reporter: manasaveloori
            Priority: Critical
             Fix For: 4.3.0
         Attachments: management-server.rar

Steps:

1. Deploy CS 2.2 X.16 using Xen5.6 sp2 HV.
2. Add the External firewall SRX to CS.
3. Set the GC parameter firewall.rule.ui.enabled to "true".
4. Now acquire the IP and configure firewall and PF rules.
5. Upgrade the CS to 4.3.
6. Stop and start all the System VMs and router VMs so that the new template is 
upgraded.
7. Now perform a network restart on the network on which the firewall and PF rules are configured.
An issue was hit here: CLOUDSTACK-5747.
8. Enable maintenance mode on the host (CLOUDSTACK-5706).
9. Now unmanage the cluster.

Observation:

Even after the host is in maintenance, CS shows the VMs as running, and the 
following NPE appears continuously in the logs:



2014-01-06 21:15:40,767 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-80:ctx-531f0e5c) Seq 1-1135935494: Executing request
2014-01-06 21:15:40,934 WARN  [c.c.h.x.r.CitrixResourceBase] (DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped i-2-11-VM
2014-01-06 21:15:40,934 WARN  [c.c.h.x.r.CitrixResourceBase] (DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped r-6-VM
2014-01-06 21:15:40,935 WARN  [c.c.h.x.r.CitrixResourceBase] (DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped s-10-VM
2014-01-06 21:15:40,935 WARN  [c.c.h.x.r.CitrixResourceBase] (DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped r-12-VM
2014-01-06 21:15:40,935 WARN  [c.c.h.x.r.CitrixResourceBase] (DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped r-13-VM
2014-01-06 21:15:40,936 WARN  [c.c.h.x.r.CitrixResourceBase] (DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped r-4-VM
2014-01-06 21:15:40,936 WARN  [c.c.h.x.r.CitrixResourceBase] (DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped v-7-VM
2014-01-06 21:15:40,936 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-80:ctx-531f0e5c) Seq 1-1135935494: Response Received:
2014-01-06 21:15:40,938 DEBUG [c.c.a.t.Request] (DirectAgent-80:ctx-531f0e5c) Seq 1-1135935494: Processing:  { Ans: , MgmtId: 7588401905746, via: 1, Ver: v1, Flags: 10, 
[{"com.cloud.agent.api.ClusterSyncAnswer":{"_clusterId":1,"_newStates":{"i-2-11-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"r-6-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"s-10-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"r-12-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"r-13-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"r-4-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"v-7-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"}},"_isExecuted":false,"result":true,"wait":0}}]
 }
2014-01-06 21:15:40,938 DEBUG [c.c.a.m.AgentAttache] (DirectAgent-80:ctx-531f0e5c) Seq 1-1135935494: Unable to find listener.
2014-01-06 21:15:40,965 DEBUG [c.c.v.VirtualMachineManagerImpl] (DirectAgent-80:ctx-531f0e5c) VM r-4-VM: cs state = Running and realState = Stopped
2014-01-06 21:15:40,965 DEBUG [c.c.v.VirtualMachineManagerImpl] (DirectAgent-80:ctx-531f0e5c) VM r-4-VM: cs state = Running and realState = Stopped
2014-01-06 21:15:40,965 DEBUG [c.c.h.HighAvailabilityManagerImpl] (DirectAgent-80:ctx-531f0e5c) VM does not require investigation so I'm marking it as Stopped: VM[DomainRouter|r-4-VM]
2014-01-06 21:15:40,965 WARN  [c.c.a.m.DirectAgentAttache] (DirectAgent-80:ctx-531f0e5c) Seq 1-1135935494: Exception caught
java.lang.NullPointerException
 at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1268)
 at com.cloud.ha.HighAvailabilityManagerImpl.scheduleRestart(HighAvailabilityManagerImpl.java:346)
 at com.cloud.vm.VirtualMachineManagerImpl.compareState(VirtualMachineManagerImpl.java:2719)
 at com.cloud.vm.VirtualMachineManagerImpl.deltaSync(VirtualMachineManagerImpl.java:2353)
 at com.cloud.vm.VirtualMachineManagerImpl.processAnswers(VirtualMachineManagerImpl.java:2830)
 at com.cloud.agent.manager.AgentManagerImpl.notifyAnswersToMonitors(AgentManagerImpl.java:301)
 at com.cloud.agent.manager.AgentAttache.processAnswers(AgentAttache.java:306)
 at com.cloud.agent.manager.ClusteredDirectAgentAttache.processAnswers(ClusteredDirectAgentAttache.java:65)
 at com.cloud.agent.manager.DirectAgentAttache$Task.runInContext(DirectAgentAttache.java:242)
 at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
 at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
 at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
 at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
 at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)
2014-01-06 21:15:45,667 INFO  [c.c.a.m.AgentManagerImpl] (AgentMonitor-1:ctx-c19a3a0d) Found the following agents behind on ping: [7, 10]
2014-01-06 21:15:45,670 DEBUG [c.c.h.Status] (AgentMonitor-1:ctx-c19a3a0d) Ping timeout for host 7, do invstigation
2014-01-06 21:15:45,672 INFO  [c.c.a.m.AgentManagerImpl] (AgentTaskPool-14:ctx-e3a1c19a) Investigating why host 7 has disconnected with event PingTimeout
2014-01-06 21:15:45,673 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-14:ctx-e3a1c19a) checking if agent (7) is alive
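
For anyone triaging this, the ClusterSyncAnswer payload in the "Processing:" 
entry above can be summarized with a small script. This is only a sketch: the 
helper name reported_states is hypothetical, the sample entry is shortened to 
two of the seven VMs, and it assumes the "u" field of each _newStates value 
holds the reported state, as the log suggests.

```python
import json

def reported_states(log_entry: str) -> dict:
    """Return {vm_name: state} from a 'Processing:' log entry (hypothetical helper)."""
    # The JSON array starts at the first '[{' and ends at the last '}]'.
    start = log_entry.index('[{')
    end = log_entry.rindex('}]') + 2
    payload = json.loads(log_entry[start:end])
    answer = payload[0]["com.cloud.agent.api.ClusterSyncAnswer"]
    # Assumption: each value is a pair whose "u" field is the reported state.
    return {vm: pair["u"] for vm, pair in answer["_newStates"].items()}

# Shortened copy of the entry from the log (two VMs instead of seven).
entry = (
    'Seq 1-1135935494: Processing:  { Ans: , MgmtId: 7588401905746, via: 1, '
    'Ver: v1, Flags: 10, [{"com.cloud.agent.api.ClusterSyncAnswer":'
    '{"_clusterId":1,"_newStates":{'
    '"r-4-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},'
    '"v-7-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"}},'
    '"_isExecuted":false,"result":true,"wait":0}}] }'
)

states = reported_states(entry)
for vm, state in states.items():
    print(f"{vm}: hypervisor reports {state}")
```

Every VM in the full payload is reported as Stopped, while the 
VirtualMachineManagerImpl entries still show "cs state = Running" for r-4-VM.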


Attaching the MS logs:
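
A rough sketch for locating each NPE in the attached log once it has been 
extracted to plain text. The function name npe_contexts and the sample lines 
are illustrative only; the timestamp pattern is assumed from the entries quoted 
above.

```python
import re

# Start of a log record, e.g. "2014-01-06 21:15:40,965 WARN  [...] ..."
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")

def npe_contexts(lines):
    """Return the record header that precedes each NullPointerException line."""
    last_header = None
    hits = []
    for line in lines:
        if RECORD_START.match(line):
            last_header = line.rstrip()
        elif line.strip() == "java.lang.NullPointerException":
            hits.append(last_header)
    return hits

# Sample lines taken from the log quoted in this report.
sample = [
    "2014-01-06 21:15:40,965 WARN  [c.c.a.m.DirectAgentAttache] "
    "(DirectAgent-80:ctx-531f0e5c) Seq 1-1135935494: Exception caught",
    "java.lang.NullPointerException",
    " at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1268)",
    "2014-01-06 21:15:45,667 INFO  [c.c.a.m.AgentManagerImpl] "
    "(AgentMonitor-1:ctx-c19a3a0d) Found the following agents behind on ping: [7, 10]",
]

hits = npe_contexts(sample)
for header in hits:
    print(header)
```

To run it against a real file, pass the open file object (for example 
npe_contexts(open(path)), where path is wherever the log was extracted).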



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
