manasaveloori created CLOUDSTACK-5792:
-----------------------------------------
Summary: All the VMs are shown as running even after the host is
put in maintenance; NPE in logs
Key: CLOUDSTACK-5792
URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5792
Project: CloudStack
Issue Type: Bug
Security Level: Public (Anyone can view this level - this is the default.)
Components: Hypervisor Controller
Affects Versions: 4.3.0
Environment: upgraded from 2.2.16 to 4.3
Reporter: manasaveloori
Priority: Critical
Fix For: 4.3.0
Attachments: management-server.rar
Steps:
1. Deploy CS 2.2.16 using the Xen 5.6 SP2 hypervisor.
2. Add the external firewall (SRX) to CS.
3. Set the global configuration parameter firewall.rule.ui.enabled to "true".
4. Acquire an IP and configure firewall and PF rules.
5. Upgrade CS to 4.3.
6. Stop and start all the system VMs and router VMs so that they are
upgraded to the new template.
7. Restart the network on which the firewall and PF rules are configured.
This step hits the issue CLOUDSTACK-5747.
8. Enable maintenance mode on the host (see CLOUDSTACK-5706).
9. Unmanage the cluster.
Observation:
Even after the host is in maintenance, CS still shows the VMs as running, and
the following NPE appears continuously in the logs:
2014-01-06 21:15:40,767 DEBUG [c.c.a.m.DirectAgentAttache]
(DirectAgent-80:ctx-531f0e5c) Seq 1-1135935494: Executing request
2014-01-06 21:15:40,934 WARN [c.c.h.x.r.CitrixResourceBase]
(DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped
i-2-11-VM
2014-01-06 21:15:40,934 WARN [c.c.h.x.r.CitrixResourceBase]
(DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped r-6-VM
2014-01-06 21:15:40,935 WARN [c.c.h.x.r.CitrixResourceBase]
(DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped
s-10-VM
2014-01-06 21:15:40,935 WARN [c.c.h.x.r.CitrixResourceBase]
(DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped
r-12-VM
2014-01-06 21:15:40,935 WARN [c.c.h.x.r.CitrixResourceBase]
(DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped
r-13-VM
2014-01-06 21:15:40,936 WARN [c.c.h.x.r.CitrixResourceBase]
(DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped r-4-VM
2014-01-06 21:15:40,936 WARN [c.c.h.x.r.CitrixResourceBase]
(DirectAgent-80:ctx-531f0e5c) The VM is now missing marking it as Stopped v-7-VM
2014-01-06 21:15:40,936 DEBUG [c.c.a.m.DirectAgentAttache]
(DirectAgent-80:ctx-531f0e5c) Seq 1-1135935494: Response Received:
2014-01-06 21:15:40,938 DEBUG [c.c.a.t.Request] (DirectAgent-80:ctx-531f0e5c)
Seq 1-1135935494: Processing: { Ans: , MgmtId: 7588401905746, via: 1, Ver: v1,
Flags: 10,
[{"com.cloud.agent.api.ClusterSyncAnswer":{"_clusterId":1,"_newStates":{"i-2-11-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"r-6-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"s-10-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"r-12-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"r-13-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"r-4-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},"v-7-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"}},"_isExecuted":false,"result":true,"wait":0}}]
}
2014-01-06 21:15:40,938 DEBUG [c.c.a.m.AgentAttache]
(DirectAgent-80:ctx-531f0e5c) Seq 1-1135935494: Unable to find listener.
2014-01-06 21:15:40,965 DEBUG [c.c.v.VirtualMachineManagerImpl]
(DirectAgent-80:ctx-531f0e5c) VM r-4-VM: cs state = Running and realState =
Stopped
2014-01-06 21:15:40,965 DEBUG [c.c.v.VirtualMachineManagerImpl]
(DirectAgent-80:ctx-531f0e5c) VM r-4-VM: cs state = Running and realState =
Stopped
2014-01-06 21:15:40,965 DEBUG [c.c.h.HighAvailabilityManagerImpl]
(DirectAgent-80:ctx-531f0e5c) VM does not require investigation so I'm marking
it as Stopped: VM[DomainRouter|r-4-VM]
2014-01-06 21:15:40,965 WARN [c.c.a.m.DirectAgentAttache]
(DirectAgent-80:ctx-531f0e5c) Seq 1-1135935494: Exception caught
java.lang.NullPointerException
at com.cloud.vm.VirtualMachineManagerImpl.advanceStop(VirtualMachineManagerImpl.java:1268)
at com.cloud.ha.HighAvailabilityManagerImpl.scheduleRestart(HighAvailabilityManagerImpl.java:346)
at com.cloud.vm.VirtualMachineManagerImpl.compareState(VirtualMachineManagerImpl.java:2719)
at com.cloud.vm.VirtualMachineManagerImpl.deltaSync(VirtualMachineManagerImpl.java:2353)
at com.cloud.vm.VirtualMachineManagerImpl.processAnswers(VirtualMachineManagerImpl.java:2830)
at com.cloud.agent.manager.AgentManagerImpl.notifyAnswersToMonitors(AgentManagerImpl.java:301)
at com.cloud.agent.manager.AgentAttache.processAnswers(AgentAttache.java:306)
at com.cloud.agent.manager.ClusteredDirectAgentAttache.processAnswers(ClusteredDirectAgentAttache.java:65)
at com.cloud.agent.manager.DirectAgentAttache$Task.runInContext(DirectAgentAttache.java:242)
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
2014-01-06 21:15:45,667 INFO [c.c.a.m.AgentManagerImpl]
(AgentMonitor-1:ctx-c19a3a0d) Found the following agents behind on ping: [7, 10]
2014-01-06 21:15:45,670 DEBUG [c.c.h.Status] (AgentMonitor-1:ctx-c19a3a0d) Ping
timeout for host 7, do invstigation
2014-01-06 21:15:45,672 INFO [c.c.a.m.AgentManagerImpl]
(AgentTaskPool-14:ctx-e3a1c19a) Investigating why host 7 has disconnected with
event PingTimeout
2014-01-06 21:15:45,673 DEBUG [c.c.a.m.AgentManagerImpl]
(AgentTaskPool-14:ctx-e3a1c19a) checking if agent (7) is alive
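For triage, the state mismatch visible in the log can be checked offline from the ClusterSyncAnswer payload. The following is a minimal Python sketch, not CloudStack code: `cs_states` and `find_mismatches` are hypothetical names, the payload is shortened to two of the seven VMs, and the "Running" value for v-7-VM is an assumption based on the observation (the log only confirms it for r-4-VM).

```python
import json

# Payload shape copied from the ClusterSyncAnswer log line, shortened to two VMs.
answer_json = (
    '[{"com.cloud.agent.api.ClusterSyncAnswer":{"_clusterId":1,"_newStates":{'
    '"r-4-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"},'
    '"v-7-VM":{"t":"aef4c88e-59df-46df-8441-4985a6481ee8","u":"Stopped"}},'
    '"_isExecuted":false,"result":true,"wait":0}}]'
)

# Hypothetical: VM states as recorded by the management server (assumed values).
cs_states = {"r-4-VM": "Running", "v-7-VM": "Running"}


def find_mismatches(payload, recorded):
    """Return {vm_name: (cs_state, real_state)} for VMs whose states differ."""
    mismatches = {}
    for entry in json.loads(payload):
        answer = entry.get("com.cloud.agent.api.ClusterSyncAnswer")
        if answer is None:
            continue  # not a cluster sync answer, skip it
        for vm_name, info in answer["_newStates"].items():
            real_state = info["u"]  # "u" carries the state reported by the host
            cs_state = recorded.get(vm_name)
            if cs_state is not None and cs_state != real_state:
                mismatches[vm_name] = (cs_state, real_state)
    return mismatches


mismatches = find_mismatches(answer_json, cs_states)
for vm_name, (cs_state, real_state) in sorted(mismatches.items()):
    print(f"VM {vm_name}: cs state = {cs_state} and realState = {real_state}")
```

Feeding the full payload from the log and the VM states exported from the database into `find_mismatches` would list every VM that CS still considers running while the hypervisor reports it as stopped.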
The MS logs are attached (management-server.rar).