Short update. Was able to 'solve' this problem in DB, changeing the state from 'Maintenance' to 'Enabled'. Afterwards the host came back online like a charm. Nevertheless i had the same problem on all our hosts. Will be interesting to see what happens on the next CS Upgrade then...
Regards, Chris Am Mo., 26. Sept. 2022 um 23:22 Uhr schrieb vas...@gmx.de <vas...@gmx.de>: > Hi everyone, > > after performing the regular security updates provided for ubuntu server > 20.04.5 and a proper restart of the host, the CS Agent doesn't come up > properly and is know in a "bootloop". > Current CS Version 4.17.0.1 > > Process for upgradeing the host: > 1. Enable "maintenance mode" in CS Management for host -> Successfull > 2. Stop CS Agent on host --> Successfull > 3. Update ubutu with newest availeable security updates > 4. Restart the host > > On reboot CS Agent starts as expected but the state shown in gui only > changes into 'connecting' - 'alert' - 'connecting' - 'alert' .... > > The logs from the management server: > > 2022-09-26 20:02:32,282 DEBUG [o.a.c.c.p.RootCACustomTrustManager] > (pool-12-thread-1:null) (logid:) Client/agent connection from ip=172.17.0.2 > has been validated and trusted. > 2022-09-26 20:02:32,405 DEBUG [o.a.c.h.HAManagerImpl] > (BackgroundTaskPollManager-3:ctx-47ba4f29) (logid:829bd5fc) HA health check > task is running... > 2022-09-26 20:02:32,423 DEBUG [c.c.a.t.Request] > (AgentManager-Handler-7:null) (logid:) Seq 0-0: Scheduling the first > command { Cmd , MgmtId: -1, via: 0, Ver: v1, Flags: 1, > [{"com.cloud.agent.api.StartupRoutingCommand":{"cpuSockets":"1","cpus":"48","speed":"2650","memory":"268725788672","dom0MinMemory":"1073741824","poolSync":"false","supportsClonedVolumes":"false","caps":"hvm,snapshot","pool":"/root","hypervisorType":"KVM","hostDetails":{"Host.OS.Kernel.Version":"5.4.0-126-generic","com.cloud.network.Networks.RouterPrivateIpStrategy":"HostLocal","Host.OS.Version":"20.04","secured":"true","Host.OS":"Ubuntu"},"hostTags":[],"groupDetails":{},"type":"Routing","dataCenter":"1","pod":"1","cluster":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","iqn":"iqn.1993-08.org.debian:01:e0741deca62","privateIpAddress":"172.17.0.2","privateMacAddress":"b0:7b:25:c0:1a:8b","privateNetmask":"255.255.255.192","storageIpAddress":"172.17.0.2","storageNetmask":"255.255.255.192","storageMacAddress":"b0:7b:25:c0:1a:8b","resourceName":"LibvirtComputingResource","gatewayIpAddress":"172.16.2.1","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}},{"com.cloud.agent.api.StartupStorageCommand":{"totalSize":"(0 > bytes) > 0","poolInfo":{"uuid":"5603c980-a676-4b0f-8e00-b8bbc7ef740a","host":"172.17.0.2","localPath":"/var/lib/libvirt/images","hostPath":"/var/lib/libvirt/images","poolType":"Filesystem","capacityBytes":"(438.55 > GB) 470886195200","availableBytes":"(423.36 GB) > 454582964224"},"resourceType":"STORAGE_POOL","hostDetails":{},"type":"Storage","dataCenter":"1","pod":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","resourceName":"LibvirtComputingResource","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}}] > } > 2022-09-26 20:02:32,424 DEBUG [c.c.a.t.Request] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Seq 0-0: Processing > the first command { Cmd , MgmtId: -1, via: 0, Ver: v1, Flags: 1, > [{"com.cloud.agent.api.StartupRoutingCommand":{"cpuSockets":"1","cpus":"48","speed":"2650","memory":"268725788672","dom0MinMemory":"1073741824","poolSync":"false","supportsClonedVolumes":"false","caps":"hvm,snapshot","pool":"/root","hypervisorType":"KVM","hostDetails":{"Host.OS.Kernel.Version":"5.4.0-126-generic","com.cloud.network.Networks.RouterPrivateIpStrategy":"HostLocal","Host.OS.Version":"20.04","secured":"true","Host.OS":"Ubuntu"},"hostTags":[],"groupDetails":{},"type":"Routing","dataCenter":"1","pod":"1","cluster":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","iqn":"iqn.1993-08.org.debian:01:e0741deca62","privateIpAddress":"172.17.0.2","privateMacAddress":"b0:7b:25:c0:1a:8b","privateNetmask":"255.255.255.192","storageIpAddress":"172.17.0.2","storageNetmask":"255.255.255.192","storageMacAddress":"b0:7b:25:c0:1a:8b","resourceName":"LibvirtComputingResource","gatewayIpAddress":"172.16.2.1","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}},{"com.cloud.agent.api.StartupStorageCommand":{"totalSize":"(0 > bytes) > 0","poolInfo":{"uuid":"5603c980-a676-4b0f-8e00-b8bbc7ef740a","host":"172.17.0.2","localPath":"/var/lib/libvirt/images","hostPath":"/var/lib/libvirt/images","poolType":"Filesystem","capacityBytes":"(438.55 > GB) 470886195200","availableBytes":"(423.36 GB) > 454582964224"},"resourceType":"STORAGE_POOL","hostDetails":{},"type":"Storage","dataCenter":"1","pod":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","resourceName":"LibvirtComputingResource","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}}] > } > 2022-09-26 20:02:32,430 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to CiscoVnmcElement > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to BareMetalDiscoverer > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to NetscalerElement > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to > JuniperSRXExternalFirewallElement > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to HypervServerDiscoverer > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to BaremetalPxeManagerImpl > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to XcpServerDiscoverer > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to NiciraNvp > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to BrocadeVcsElement > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to Ovm3Discoverer > 2022-09-26 20:02:32,431 DEBUG [c.c.h.o.r.Ovm3Discoverer] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) > createHostVOForConnectedAgent: Host {"id": "1", "name": > "srv-xxxxxx.xxxxxxxxxxxxxxxxx", "uuid": > "883f7f62-55f2-42f4-8242-16b6847c4204", "type"="Routing"} > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to LxcServerDiscoverer > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to NetworkUsageManagerImpl > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to > PremiumSecondaryStorageManagerImpl > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to Ovs > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to VmwareServerDiscoverer > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to ConsoleProxyManagerImpl > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to OvmDiscoverer > 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource > state event CREATE_HOST_VO_FOR_CONNECTED to KvmServerDiscoverer > 2022-09-26 20:02:32,459 DEBUG [c.c.r.ResourceState] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Resource state > update: [id = 1; name = srv-xxxxxx.xxxxxxxxxxxxxxxxx; old state = > Maintenance; event = InternalCreated; new state = Maintenance] > 2022-09-26 20:02:32,459 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Transition:[Resource > state = Maintenance, Agent event = AgentConnected, Host id = 1, name = > srv-xxxxxx.xxxxxxxxxxxxxxxxx] > 2022-09-26 20:02:32,472 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Agent's management > server host list is not up to date, sending list update:[172.17.1.2] > 2022-09-26 20:02:32,472 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) create > ClusteredAgentAttache for 1 > 2022-09-26 20:02:32,473 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Connect to > listener: XcpServerDiscoverer > 2022-09-26 20:02:32,473 DEBUG [c.c.h.x.d.XcpServerDiscoverer] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Not XenServer so > moving on. > 2022-09-26 20:02:32,473 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Connect to > listener: HypervServerDiscoverer > 2022-09-26 20:02:32,474 DEBUG [c.c.h.h.d.HypervServerDiscoverer] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Not Hyper-V > hypervisor, so moving on. > 2022-09-26 20:02:32,474 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Connect to > listener: StoragePoolMonitor > 2022-09-26 20:02:32,479 DEBUG [c.c.s.l.StoragePoolMonitor] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Host 1 connected, > connecting host to shared pool id 1 and sending storage pool information ... > 2022-09-26 20:02:32,480 DEBUG [c.c.s.StorageManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Adding pool > root_storage to host 1 > 2022-09-26 20:02:32,484 DEBUG [c.c.a.t.Request] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Seq > 1-605171199927910401: Sending { Cmd , MgmtId: 90520740759984, via: > 1(srv-xxxxxx.xxxxxxxxxxxxxxxxx), Ver: v1, Flags: 100011, > [{"com.cloud.agent.api.ModifyStoragePoolCommand":{"add":"true","pool":{"id":"1","uuid":"8bc1feaa-a099-3590-aa2b-2d12df0e242b","host":"172.17.2.6","path":"/volume4/primary_storage","port":"2049","type":"NetworkFilesystem"},"localPath":"/mnt//8bc1feaa-a099-3590-aa2b-2d12df0e242b","wait":"0","bypassHostMaintenance":"false"}}] > } > 2022-09-26 20:02:32,558 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-9:null) (logid:) Ping from Routing host > 1(srv-xxxxxx.xxxxxxxxxxxxxxxxx) > 2022-09-26 20:02:32,558 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] > (AgentManager-Handler-9:null) (logid:) Process host VM state report from > ping process. host: 1 > 2022-09-26 20:02:32,558 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] > (AgentManager-Handler-9:null) (logid:) Process VM state report. host: 1, > number of records in report: 0 > 2022-09-26 20:02:32,560 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] > (AgentManager-Handler-9:null) (logid:) Done with process of VM state > report. host: 1 > 2022-09-26 20:02:32,562 INFO [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-9:null) (logid:) PingMap for agent: 1 will not be > updated because agent is no longer in the PingMap > 2022-09-26 20:02:32,564 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-9:null) (logid:) Not processing PingRoutingCommand > for agent id=0; can't find the host in the DB > 2022-09-26 20:02:32,568 DEBUG [c.c.a.t.Request] > (AgentManager-Handler-1:null) (logid:) Seq 1-605171199927910401: > Processing: { Ans: , MgmtId: 90520740759984, via: 1, Ver: v1, Flags: 10, > [{"com.cloud.agent.api.ModifyStoragePoolAnswer":{"poolInfo":{"host":"172.17.2.6","localPath":"/mnt//8bc1feaa-a099-3590-aa2b-2d12df0e242b","hostPath":"/volume4/primary_storage","poolType":"NetworkFilesystem","capacityBytes":"(3.4759 > TB) 3821825818624","availableBytes":"(1.6431 TB) > 1806561574912"},"templateInfo":{},"datastoreClusterChildren":[],"result":"true","wait":"0","bypassHostMaintenance":"false"}}] > } > 2022-09-26 20:02:32,568 DEBUG [c.c.a.t.Request] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Seq > 1-605171199927910401: Received: { Ans: , MgmtId: 90520740759984, via: > 1(srv-xxxxxx.xxxxxxxxxxxxxxxxx), Ver: v1, Flags: 10, { > ModifyStoragePoolAnswer } } > 2022-09-26 20:02:32,576 INFO [o.a.c.s.d.p.DefaultHostListener] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) *Connection > established between storage pool > org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@3d21695c and > host 1* > 2022-09-26 20:02:32,587 DEBUG [c.c.n.NetworkModelImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Service > SecurityGroup is not supported in the network id=214 > *2022-09-26 20:02:32,593 WARN [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Resource [Host:1] is > unreachable: Host 1: Unable to send class > com.cloud.agent.api.SetupPersistentNetworkCommand because agent > srv-xxxxxx.xxxxxxxxxxxxxxxxx is in maintenance mode* > *2022-09-26 20:02:32,594 INFO [c.c.u.e.CSExceptionErrorCode] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Could not find > exception: com.cloud.exception.ConnectionException in error code list for > exceptions* > *2022-09-26 20:02:32,594 WARN [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Monitor > StoragePoolMonitor says there is an error in the connect process for 1 due > to Unable to connect host 1 to storage pool id 1 due to > com.cloud.utils.exception.CloudRuntimeException: Unable to get answer to > the setup persistent network command 214* > 2022-09-26 20:02:32,595 INFO [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Host 1 is > disconnecting with event AgentDisconnected > 2022-09-26 20:02:32,596 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) The next status of > agent 1is Alert, current status is Connecting > 2022-09-26 20:02:32,597 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Deregistering link > for 1 with state Alert > 2022-09-26 20:02:32,598 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Remove Agent : 1 > 2022-09-26 20:02:32,598 DEBUG [c.c.a.m.ConnectedAgentAttache] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Processing > Disconnect. > 2022-09-26 20:02:32,598 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.hypervisor.xenserver.discoverer.XcpServerDiscoverer > 2022-09-26 20:02:32,598 DEBUG [c.c.u.n.NioConnection] > (AgentManager-NioConnectionHandler-1:null) (logid:) Closing socket > Socket[addr=/172.17.0.2,port=49552,localport=8250] > 2022-09-26 20:02:32,598 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.hypervisor.hyperv.discoverer.HypervServerDiscoverer > 2022-09-26 20:02:32,599 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.storage.listener.StoragePoolMonitor > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.hypervisor.vmware.manager.VmwareManagerImpl > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.deploy.DeploymentPlanningManagerImpl > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.network.security.SecurityGroupListener > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.storage.secondary.SecondaryStorageListener > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: org.apache.cloudstack.engine.orchestration.NetworkOrchestrator > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.vm.ClusteredVirtualMachineManagerImpl > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: > com.cloud.network.NetworkUsageManagerImpl$DirectNetworkStatsListener > 2022-09-26 20:02:32,602 DEBUG [c.c.n.NetworkUsageManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Disconnected called > on 1 with status Alert > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.storage.upload.UploadListener > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.capacity.StorageCapacityListener > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.capacity.ComputeCapacityListener > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.network.SshKeysDistriMonitor > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.network.router.VirtualNetworkApplianceManagerImpl > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.agent.manager.AgentManagerImpl$BehindOnPingListener > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.agent.manager.AgentManagerImpl$SetHostParamsListener > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.storage.download.DownloadListener > 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.consoleproxy.ConsoleProxyListener > 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.network.SshKeysDistriMonitor > 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl > 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.storage.LocalStoragePoolListener > 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.hypervisor.kvm.discoverer.LxcServerDiscoverer > 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect > to listener: com.cloud.hypervisor.kvm.discoverer.KvmServerDiscoverer > 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Transition:[Resource > state = Maintenance, Agent event = AgentDisconnected, Host id = 1, name = > srv-xxxxxx.xxxxxxxxxxxxxxxxx] > 2022-09-26 20:02:32,607 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Notifying other > nodes of to disconnect > > *2022-09-26 20:02:32,610 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Failed to handle > host connection: com.cloud.exception.ConnectionException: Unable to connect > host 1 to storage pool id 1 due to > com.cloud.utils.exception.CloudRuntimeException: Unable to get answer to > the setup persistent network command 214* > at > com.cloud.storage.listener.StoragePoolMonitor.processConnect(StoragePoolMonitor.java:138) > at > com.cloud.agent.manager.AgentManagerImpl.notifyMonitorsOfConnection(AgentManagerImpl.java:569) > at > com.cloud.agent.manager.AgentManagerImpl.handleConnectedAgent(AgentManagerImpl.java:1133) > at > com.cloud.agent.manager.AgentManagerImpl$HandleAgentConnectTask.runInContext(AgentManagerImpl.java:1217) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > Caused by: com.cloud.utils.exception.CloudRuntimeException: Unable to get > answer to the setup persistent network command 214 > at > org.apache.cloudstack.storage.datastore.provider.DefaultHostListener.setupPersistentNetwork(DefaultHostListener.java:245) > at > org.apache.cloudstack.storage.datastore.provider.DefaultHostListener.createPersistentNetworkResourcesOnHost(DefaultHostListener.java:99) > at > org.apache.cloudstack.storage.datastore.provider.DefaultHostListener.hostConnect(DefaultHostListener.java:162) > ... 13 more > 2022-09-26 20:02:32,611 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Can not send command > com.cloud.agent.api.ReadyCommand due to Host 1 is not up > 2022-09-26 20:02:32,688 DEBUG [o.a.c.c.p.RootCACustomTrustManager] > (pool-13-thread-1:null) (logid:) A client/agent attempting connection from > address=172.17.0.2 has presented these certificate(s): > Certificate [1] : > > And from the agent: > > 2022-09-26 20:02:32,351 INFO [kvm.storage.LibvirtStorageAdaptor] > (Agent-Handler-1:null) (logid:) Attempting to create storage pool > 5603c980-a676-4b0f-8e00-b8bbc7ef740a (Filesystem) in libvirt > 2022-09-26 20:02:32,351 DEBUG [kvm.resource.LibvirtConnection] > (Agent-Handler-1:null) (logid:) Looking for libvirtd connection at: > qemu:///system > 2022-09-26 20:02:32,355 ERROR [kvm.resource.LibvirtConnection] > (Agent-Handler-1:null) (logid:) Connection with libvirtd is broken: invalid > connection pointer in virConnectGetVersion > 2022-09-26 20:02:32,355 DEBUG [kvm.resource.LibvirtConnection] > (Agent-Handler-1:null) (logid:) Opening a new libvirtd connection to: > qemu:///system > 2022-09-26 20:02:32,357 INFO [kvm.storage.LibvirtStorageAdaptor] > (Agent-Handler-1:null) (logid:) Found existing defined storage pool > 5603c980-a676-4b0f-8e00-b8bbc7ef740a, using it. > 2022-09-26 20:02:32,358 INFO [kvm.storage.LibvirtStorageAdaptor] > (Agent-Handler-1:null) (logid:) Trying to fetch storage pool > 5603c980-a676-4b0f-8e00-b8bbc7ef740a from libvirt > 2022-09-26 20:02:32,358 DEBUG [kvm.resource.LibvirtConnection] > (Agent-Handler-1:null) (logid:) Looking for libvirtd connection at: > qemu:///system > 2022-09-26 20:02:32,369 DEBUG [kvm.storage.LibvirtStorageAdaptor] > (Agent-Handler-1:null) (logid:) Successfully refreshed pool > 5603c980-a676-4b0f-8e00-b8bbc7ef740a Capacity: (438.55 GB) 470886195200 > Used: (15.18 GB) 16303230976 Available: (423.36 GB) 454582964224 > 2022-09-26 20:02:32,371 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null) > (logid:) Executing: hostname > 2022-09-26 20:02:32,372 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null) > (logid:) Executing while with timeout : 500 > 2022-09-26 20:02:32,373 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null) > (logid:) Execution is successful. > 2022-09-26 20:02:32,373 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null) > (logid:) Executing: hostname > 2022-09-26 20:02:32,374 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null) > (logid:) Executing while with timeout : 500 > 2022-09-26 20:02:32,374 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null) > (logid:) Execution is successful. > 2022-09-26 20:02:32,397 INFO [cloud.serializer.GsonHelper] > (Agent-Handler-1:null) (logid:) Default Builder inited. > 2022-09-26 20:02:32,415 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null) > (logid:) Sending Startup: Seq 0-0: { Cmd , MgmtId: -1, via: 0, Ver: v1, > Flags: 1, > [{"com.cloud.agent.api.StartupRoutingCommand":{"cpuSockets":"1","cpus":"48","speed":"2650","memory":"268725788672","dom0MinMemory":"1073741824","poolSync":"false","supportsClonedVolumes":"false","caps":"hvm,snapshot","pool":"/root","hypervisorType":"KVM","hostDetails":{"Host.OS.Kernel.Version":"5.4.0-126-generic","com.cloud.network.Networks.RouterPrivateIpStrategy":"HostLocal","Host.OS.Version":"20.04","secured":"true","Host.OS":"Ubuntu"},"hostTags":[],"groupDetails":{},"type":"Routing","dataCenter":"1","pod":"1","cluster":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-2-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","iqn":"iqn.1993-08.org.debian:01:e0741deca62","privateIpAddress":"172.17.0.2","privateMacAddress":"b0:7b:25:c0:1a:8b","privateNetmask":"255.255.255.192","storageIpAddress":"172.17.0.2","storageNetmask":"255.255.255.192","storageMacAddress":"b0:7b:25:c0:1a:8b","resourceName":"LibvirtComputingResource","gatewayIpAddress":"172.16.2.1","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}},{"com.cloud.agent.api.StartupStorageCommand":{"totalSize":"(0 > bytes) > 0","poolInfo":{"uuid":"5603c980-a676-4b0f-8e00-b8bbc7ef740a","host":"172.17.0.2","localPath":"/var/lib/libvirt/images","hostPath":"/var/lib/libvirt/images","poolType":"Filesystem","capacityBytes":"(438.55 > GB) 470886195200","availableBytes":"(423.36 GB) > 454582964224"},"resourceType":"STORAGE_POOL","hostDetails":{},"type":"Storage","dataCenter":"1","pod":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-2-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","resourceName":"LibvirtComputingResource","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}}] > } > 2022-09-26 20:02:32,415 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null) > (logid:) Startup task created > 2022-09-26 20:02:32,433 DEBUG [cloud.agent.Agent] (Agent-Handler-2:null) > (logid:) Received response: Seq 0-0: { Ans: , MgmtId: 90520740759984, via: > -1, Ver: v1, Flags: 100000, > [{"com.cloud.agent.api.StartupAnswer":{"hostId":"0","pingInterval":"60","result":"true","wait":"0","bypassHostMaintenance":"false"}}] > } > 2022-09-26 20:02:32,433 DEBUG [cloud.agent.Agent] (Agent-Handler-2:null) > (logid:) Startup task cancelled > 2022-09-26 20:02:32,433 INFO [cloud.agent.Agent] (Agent-Handler-2:null) > (logid:) Process agent startup answer, agent id = 0 > 2022-09-26 20:02:32,433 INFO [cloud.agent.Agent] (Agent-Handler-2:null) > (logid:) Set agent id 0 > 2022-09-26 20:02:32,439 DEBUG [cloud.agent.Agent] (Agent-Handler-2:null) > (logid:) Adding a watch list > 2022-09-26 20:02:32,439 INFO [cloud.agent.Agent] (Agent-Handler-2:null) > (logid:) Startup Response Received: agent id = 0 > 2022-09-26 20:02:32,443 DEBUG [kvm.resource.LibvirtComputingResource] > (UgentTask-1:null) (logid:) Executing: > /usr/share/cloudstack-common/scripts/vm/network/security_group.py > get_rule_logs_for_vms > 2022-09-26 20:02:32,444 DEBUG [kvm.resource.LibvirtComputingResource] > (UgentTask-1:null) (logid:) Executing while with timeout : 1800000 > 2022-09-26 20:02:32,494 DEBUG [cloud.agent.Agent] > (agentRequest-Handler-2:null) (logid:929b8bd0) Request:Seq > 1-605171199927910401: { Cmd , MgmtId: 90520740759984, via: 1, Ver: v1, > Flags: 100011, > [{"com.cloud.agent.api.ModifyStoragePoolCommand":{"add":"true","pool":{"id":"1","uuid":"8bc1feaa-a099-3590-aa2b-2d12df0e242b","host":"172.17.2.6","path":"/volume4/primary_storage","port":"2049","type":"NetworkFilesystem"},"localPath":"/mnt//8bc1feaa-a099-3590-aa2b-2d12df0e242b","wait":"0","bypassHostMaintenance":"false"}}] > } > 2022-09-26 20:02:32,495 DEBUG [cloud.agent.Agent] > (agentRequest-Handler-2:null) (logid:929b8bd0) Processing command: > com.cloud.agent.api.ModifyStoragePoolCommand > 2022-09-26 20:02:32,504 DEBUG [kvm.resource.LibvirtComputingResource] > (UgentTask-1:null) (logid:) Execution is successful. > 2022-09-26 20:02:32,509 DEBUG [kvm.resource.LibvirtConnection] > (UgentTask-1:null) (logid:) Looking for libvirtd connection at: > qemu:///system > 2022-09-26 20:02:32,512 DEBUG [cloud.agent.Agent] (UgentTask-1:null) > (logid:) Sending ping: Seq 0-1: { Cmd , MgmtId: -1, via: 0, Ver: v1, > Flags: 11, > [{"com.cloud.agent.api.PingRoutingWithNwGroupsCommand":{"newGroupStates":{},"_hostVmStateReport":{},"_gatewayAccessible":"true","_vnetAccessible":"true","hostType":"Routing","hostId":"0","wait":"0","bypassHostMaintenance":"false"}}] > } > 2022-09-26 20:02:32,555 INFO [kvm.storage.LibvirtStorageAdaptor] > (agentRequest-Handler-2:null) (logid:929b8bd0) Attempting to create storage > pool 8bc1feaa-a099-3590-aa2b-2d12df0e242b (NetworkFilesystem) in libvirt > 2022-09-26 20:02:32,555 DEBUG [kvm.resource.LibvirtConnection] > (agentRequest-Handler-2:null) (logid:929b8bd0) Looking for libvirtd > connection at: qemu:///system > 2022-09-26 20:02:32,556 INFO [kvm.storage.LibvirtStorageAdaptor] > (agentRequest-Handler-2:null) (logid:929b8bd0) Found existing defined > storage pool 8bc1feaa-a099-3590-aa2b-2d12df0e242b, using it. > 2022-09-26 20:02:32,556 INFO [kvm.storage.LibvirtStorageAdaptor] > (agentRequest-Handler-2:null) (logid:929b8bd0) Trying to fetch storage pool > 8bc1feaa-a099-3590-aa2b-2d12df0e242b from libvirt > 2022-09-26 20:02:32,556 DEBUG [kvm.resource.LibvirtConnection] > (agentRequest-Handler-2:null) (logid:929b8bd0) Looking for libvirtd > connection at: qemu:///system > 2022-09-26 20:02:32,560 DEBUG [kvm.storage.LibvirtStorageAdaptor] > (agentRequest-Handler-2:null) (logid:929b8bd0) Successfully refreshed pool > 8bc1feaa-a099-3590-aa2b-2d12df0e242b Capacity: (3.4759 TB) 3821825818624 > Used: (1.8329 TB) 2015264243712 Available: (1.6431 TB) 1806561574912 > 2022-09-26 20:02:32,565 DEBUG [cloud.agent.Agent] > (agentRequest-Handler-2:null) (logid:929b8bd0) Seq 1-605171199927910401: { > Ans: , MgmtId: 90520740759984, via: 1, Ver: v1, Flags: 10, > [{"com.cloud.agent.api.ModifyStoragePoolAnswer":{"poolInfo":{"host":"172.17.2.6","localPath":"/mnt//8bc1feaa-a099-3590-aa2b-2d12df0e242b","hostPath":"/volume4/primary_storage","poolType":"NetworkFilesystem","capacityBytes":"(3.4759 > TB) 3821825818624","availableBytes":"(1.6431 TB) > 1806561574912"},"templateInfo":{},"datastoreClusterChildren":[],"result":"true","wait":"0","bypassHostMaintenance":"false"}}] > } > 2022-09-26 20:02:32,568 DEBUG [cloud.agent.Agent] (Agent-Handler-4:null) > (logid:) Received response: Seq 0-1: { Ans: , MgmtId: 90520740759984, via: > 1, Ver: v1, Flags: 100010, > [{"com.cloud.agent.api.PingAnswer":{"_command":{"hostType":"Routing","hostId":"0","wait":"0","bypassHostMaintenance":"false"},"result":"true","wait":"0","bypassHostMaintenance":"false"}}] > } > > > > > *2022-09-26 20:02:32,605 DEBUG [utils.nio.NioConnection] > (Agent-NioConnectionHandler-1:null) (logid:) Location 1: Socket > Socket[addr=/172.17.1.2 <http://172.17.1.2>,port=8250,localport=49552] > closed on read. Probably -1 returned: Connection closed with -1 on reading > size.2022-09-26 20:02:32,606 DEBUG [utils.nio.NioConnection] > (Agent-NioConnectionHandler-1:null) (logid:) Closing socket > Socket[addr=/172.17.1.2 > <http://172.17.1.2>,port=8250,localport=49552]2022-09-26 20:02:32,608 DEBUG > [cloud.agent.Agent] (Agent-Handler-5:null) (logid:) Clearing watch list: > 12022-09-26 20:02:32,610 INFO [cloud.agent.Agent] (Agent-Handler-5:null) > (logid:) Lost connection to host: 172.17.1.2. Attempting reconnection while > we still have 0 commands in progress.2022-09-26 20:02:32,611 INFO > [utils.nio.NioClient] (Agent-Handler-5:null) (logid:) NioClient connection > closed* > 2022-09-26 20:02:32,611 INFO [cloud.agent.Agent] (Agent-Handler-5:null) > (logid:) Reconnecting to host:172.17.1.2 > 2022-09-26 20:02:32,611 INFO [utils.nio.NioClient] (Agent-Handler-5:null) > (logid:) Connecting to 172.17.1.2:8250 > 2022-09-26 20:02:32,612 INFO [utils.nio.Link] (Agent-Handler-5:null) > (logid:) Conf file found: /etc/cloudstack/agent/agent.properties > 2022-09-26 20:02:32,692 INFO [utils.nio.NioClient] (Agent-Handler-5:null) > (logid:) SSL: Handshake done > 2022-09-26 20:02:32,692 INFO [utils.nio.NioClient] (Agent-Handler-5:null) > (logid:) Connected to 172.17.1.2:8250 > > Maybe someone can point out what the main problem is=? > I currently only 'see' a command that fails due to the host is in > maintenance-mode which then leads to the conclusion that the host can' > connect to the storage pool... but before that the connection was > established??? > > > Regards, > Chris > >