Hi Dag,

Yes, I did recreate the new system VMs. The version is "Cloudstack release 4.11.0".
Thanks!

Chen

On Fri, Feb 23, 2018 at 9:27 AM, Dag Sonstebo <dag.sonst...@shapeblue.com> wrote:

> Hi Chen,
>
> You say you just upgraded to 4.11 – did you destroy your system VMs and
> let them recreate after the upgrade?
>
> Can you also check what version a “cat /etc/cloudstack-release” shows up
> with on your SSVM/CPVM?
>
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
> On 23/02/2018, 14:00, "Chen Zhang" <iamczh...@gmail.com> wrote:
>
> Hello,
>
> I am new to the list and I am stuck with a very annoying issue on
> CPVM/SSVM.
>
> When I start cloudstack-management, everything is good. After around
> 3-4 hours, the agent state of the CPVM/SSVM automatically turns to
> "Disconnected" and the secondary storage goes to "0kb/0kb", but the VM
> state is still "running". After I manually reboot the CPVM/SSVM, the
> agent state turns back to "up" and the secondary storage comes back as
> well. After another 3-4 hours, the issue repeats.
>
> Here is the log when the SSVM/CPVM goes down:
>
> ----
> 2018-02-21 15:57:47,517 INFO [c.c.a.m.AgentManagerImpl] (AgentMonitor-1:ctx-81471e1e) (logid:d0bdac05) Found the following agents behind on ping: [3]
> 2018-02-21 15:57:47,521 WARN [c.c.a.m.AgentManagerImpl] (AgentMonitor-1:ctx-81471e1e) (logid:d0bdac05) Disconnect agent for CPVM/SSVM due to physical connection close. host: 3
> 2018-02-21 15:57:47,522 INFO [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Host 3 is disconnecting with event ShutdownRequested
> 2018-02-21 15:57:47,524 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) The next status of agent 3is Disconnected, current status is Up
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Deregistering link for 3 with state Disconnected
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Remove Agent : 3
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.ConnectedAgentAttache] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Processing Disconnect.
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentAttache] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Seq 3-906630899985023222: Sending disconnect to class com.cloud.agent.manager.SynchronousListener
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.hypervisor.xenserver.discoverer.XcpServerDiscoverer
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.hypervisor.hyperv.discoverer.HypervServerDiscoverer
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.storage.listener.StoragePoolMonitor
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: org.apache.cloudstack.engine.orchestration.NetworkOrchestrator
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.storage.secondary.SecondaryStorageListener
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.network.security.SecurityGroupListener
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentAttache] (StatsCollector-4:ctx-410838d0) (logid:4efe8dd2) Seq 3-906630899985023222: Waiting some more time because this is the current command
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.deploy.DeploymentPlanningManagerImpl
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.vm.ClusteredVirtualMachineManagerImpl
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.network.SshKeysDistriMonitor
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.network.router.VirtualNetworkApplianceManagerImpl
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.consoleproxy.ConsoleProxyListener
> 2018-02-21 15:57:47,525 DEBUG [c.c.a.m.AgentAttache] (StatsCollector-4:ctx-410838d0) (logid:4efe8dd2) Seq 3-906630899985023222: Waiting some more time because this is the current command
> 2018-02-21 15:57:47,526 INFO [c.c.u.e.CSExceptionErrorCode] (StatsCollector-4:ctx-410838d0) (logid:4efe8dd2) Could not find exception: com.cloud.exception.OperationTimedoutException in error code list for exceptions
> 2018-02-21 15:57:47,526 WARN [c.c.a.m.AgentAttache] (StatsCollector-4:ctx-410838d0) (logid:4efe8dd2) Seq 3-906630899985023222: Timed out on null
> 2018-02-21 15:57:47,526 DEBUG [c.c.a.m.AgentAttache] (StatsCollector-4:ctx-410838d0) (logid:4efe8dd2) Seq 3-906630899985023222: Cancelling.
> 2018-02-21 15:57:47,526 DEBUG [o.a.c.s.RemoteHostEndPoint] (StatsCollector-4:ctx-410838d0) (logid:4efe8dd2) Failed to send command, due to Agent:3, com.cloud.exception.OperationTimedoutException: Commands 906630899985023222 to Host 3 timed out after 3600
> 2018-02-21 15:57:47,526 ERROR [c.c.s.StatsCollector] (StatsCollector-4:ctx-410838d0) (logid:4efe8dd2) Error trying to retrieve storage stats
> com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:3, com.cloud.exception.OperationTimedoutException: Commands 906630899985023222 to Host 3 timed out after 3600
>     at org.apache.cloudstack.storage.RemoteHostEndPoint.sendMessage(RemoteHostEndPoint.java:133)
>     at com.cloud.server.StatsCollector$StorageCollector.runInContext(StatsCollector.java:985)
>     at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>     at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>     at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2018-02-21 15:57:47,527 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.network.NetworkUsageManagerImpl$DirectNetworkStatsListener
> 2018-02-21 15:57:47,527 DEBUG [c.c.n.NetworkUsageManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Disconnected called on 3 with status Disconnected
> 2018-02-21 15:57:47,527 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.agent.manager.AgentManagerImpl$BehindOnPingListener
> 2018-02-21 15:57:47,527 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.agent.manager.AgentManagerImpl$SetHostParamsListener
> 2018-02-21 15:57:47,527 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.capacity.StorageCapacityListener
> 2018-02-21 15:57:47,527 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.capacity.ComputeCapacityListener
> 2018-02-21 15:57:47,527 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.network.SshKeysDistriMonitor
> 2018-02-21 15:57:47,527 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl
> 2018-02-21 15:57:47,527 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.storage.LocalStoragePoolListener
> 2018-02-21 15:57:47,527 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.storage.upload.UploadListener
> 2018-02-21 15:57:47,527 DEBUG [c.c.a.m.AgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Sending Disconnect to listener: com.cloud.storage.download.DownloadListener
> 2018-02-21 15:57:47,527 DEBUG [c.c.h.Status] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Transition:[Resource state = Enabled, Agent event = ShutdownRequested, Host id = 3, name = s-1-VM]
> 2018-02-21 15:57:47,620 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentTaskPool-7:ctx-67ec16e3) (logid:d6a36e24) Notifying other nodes of to disconnect
> ----
>
> When the issue arises, all instances, hosts, and other resources are
> running fine. I just updated cloudstack-management and cloudstack-agent
> to 4.11, but the problem is still there. Any ideas?
>
> Thanks!
>
> Chen
>
> dag.sonst...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London WC2N 4HS UK
> @shapeblue