Short update.

Was able to 'solve' this problem in DB, changeing the state from
'Maintenance' to 'Enabled'.
Afterwards the host came back online like a charm.
Nevertheless i had the same problem on all our hosts. Will be interesting
to see what happens on the next CS Upgrade then...

Regards,
Chris

Am Mo., 26. Sept. 2022 um 23:22 Uhr schrieb vas...@gmx.de <vas...@gmx.de>:

> Hi everyone,
>
> after performing the regular security updates provided for ubuntu server
> 20.04.5 and a proper restart of the host, the CS Agent doesn't come up
> properly and is know in a "bootloop".
> Current CS Version 4.17.0.1
>
> Process for upgradeing the host:
> 1. Enable "maintenance mode" in CS Management for host -> Successfull
> 2. Stop CS Agent on host --> Successfull
> 3. Update ubutu with newest availeable security updates
> 4. Restart the host
>
> On reboot CS Agent starts as expected but the state shown in gui only
> changes into 'connecting' - 'alert' - 'connecting' - 'alert' ....
>
> The logs from the management server:
>
> 2022-09-26 20:02:32,282 DEBUG [o.a.c.c.p.RootCACustomTrustManager]
> (pool-12-thread-1:null) (logid:) Client/agent connection from ip=172.17.0.2
> has been validated and trusted.
> 2022-09-26 20:02:32,405 DEBUG [o.a.c.h.HAManagerImpl]
> (BackgroundTaskPollManager-3:ctx-47ba4f29) (logid:829bd5fc) HA health check
> task is running...
> 2022-09-26 20:02:32,423 DEBUG [c.c.a.t.Request]
> (AgentManager-Handler-7:null) (logid:) Seq 0-0: Scheduling the first
> command  { Cmd , MgmtId: -1, via: 0, Ver: v1, Flags: 1,
> [{"com.cloud.agent.api.StartupRoutingCommand":{"cpuSockets":"1","cpus":"48","speed":"2650","memory":"268725788672","dom0MinMemory":"1073741824","poolSync":"false","supportsClonedVolumes":"false","caps":"hvm,snapshot","pool":"/root","hypervisorType":"KVM","hostDetails":{"Host.OS.Kernel.Version":"5.4.0-126-generic","com.cloud.network.Networks.RouterPrivateIpStrategy":"HostLocal","Host.OS.Version":"20.04","secured":"true","Host.OS":"Ubuntu"},"hostTags":[],"groupDetails":{},"type":"Routing","dataCenter":"1","pod":"1","cluster":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","iqn":"iqn.1993-08.org.debian:01:e0741deca62","privateIpAddress":"172.17.0.2","privateMacAddress":"b0:7b:25:c0:1a:8b","privateNetmask":"255.255.255.192","storageIpAddress":"172.17.0.2","storageNetmask":"255.255.255.192","storageMacAddress":"b0:7b:25:c0:1a:8b","resourceName":"LibvirtComputingResource","gatewayIpAddress":"172.16.2.1","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}},{"com.cloud.agent.api.StartupStorageCommand":{"totalSize":"(0
> bytes)
> 0","poolInfo":{"uuid":"5603c980-a676-4b0f-8e00-b8bbc7ef740a","host":"172.17.0.2","localPath":"/var/lib/libvirt/images","hostPath":"/var/lib/libvirt/images","poolType":"Filesystem","capacityBytes":"(438.55
> GB) 470886195200","availableBytes":"(423.36 GB)
> 454582964224"},"resourceType":"STORAGE_POOL","hostDetails":{},"type":"Storage","dataCenter":"1","pod":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","resourceName":"LibvirtComputingResource","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}}]
> }
> 2022-09-26 20:02:32,424 DEBUG [c.c.a.t.Request]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Seq 0-0: Processing
> the first command  { Cmd , MgmtId: -1, via: 0, Ver: v1, Flags: 1,
> [{"com.cloud.agent.api.StartupRoutingCommand":{"cpuSockets":"1","cpus":"48","speed":"2650","memory":"268725788672","dom0MinMemory":"1073741824","poolSync":"false","supportsClonedVolumes":"false","caps":"hvm,snapshot","pool":"/root","hypervisorType":"KVM","hostDetails":{"Host.OS.Kernel.Version":"5.4.0-126-generic","com.cloud.network.Networks.RouterPrivateIpStrategy":"HostLocal","Host.OS.Version":"20.04","secured":"true","Host.OS":"Ubuntu"},"hostTags":[],"groupDetails":{},"type":"Routing","dataCenter":"1","pod":"1","cluster":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","iqn":"iqn.1993-08.org.debian:01:e0741deca62","privateIpAddress":"172.17.0.2","privateMacAddress":"b0:7b:25:c0:1a:8b","privateNetmask":"255.255.255.192","storageIpAddress":"172.17.0.2","storageNetmask":"255.255.255.192","storageMacAddress":"b0:7b:25:c0:1a:8b","resourceName":"LibvirtComputingResource","gatewayIpAddress":"172.16.2.1","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}},{"com.cloud.agent.api.StartupStorageCommand":{"totalSize":"(0
> bytes)
> 0","poolInfo":{"uuid":"5603c980-a676-4b0f-8e00-b8bbc7ef740a","host":"172.17.0.2","localPath":"/var/lib/libvirt/images","hostPath":"/var/lib/libvirt/images","poolType":"Filesystem","capacityBytes":"(438.55
> GB) 470886195200","availableBytes":"(423.36 GB)
> 454582964224"},"resourceType":"STORAGE_POOL","hostDetails":{},"type":"Storage","dataCenter":"1","pod":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","resourceName":"LibvirtComputingResource","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}}]
> }
> 2022-09-26 20:02:32,430 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to CiscoVnmcElement
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to BareMetalDiscoverer
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to NetscalerElement
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to
> JuniperSRXExternalFirewallElement
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to HypervServerDiscoverer
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to BaremetalPxeManagerImpl
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to XcpServerDiscoverer
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to NiciraNvp
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to BrocadeVcsElement
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to Ovm3Discoverer
> 2022-09-26 20:02:32,431 DEBUG [c.c.h.o.r.Ovm3Discoverer]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0)
> createHostVOForConnectedAgent: Host {"id": "1", "name":
> "srv-xxxxxx.xxxxxxxxxxxxxxxxx", "uuid":
> "883f7f62-55f2-42f4-8242-16b6847c4204", "type"="Routing"}
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to LxcServerDiscoverer
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to NetworkUsageManagerImpl
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to
> PremiumSecondaryStorageManagerImpl
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to Ovs
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to VmwareServerDiscoverer
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to ConsoleProxyManagerImpl
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to OvmDiscoverer
> 2022-09-26 20:02:32,431 DEBUG [c.c.r.ResourceManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Dispatching resource
> state event CREATE_HOST_VO_FOR_CONNECTED to KvmServerDiscoverer
> 2022-09-26 20:02:32,459 DEBUG [c.c.r.ResourceState]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Resource state
> update: [id = 1; name = srv-xxxxxx.xxxxxxxxxxxxxxxxx; old state =
> Maintenance; event = InternalCreated; new state = Maintenance]
> 2022-09-26 20:02:32,459 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Transition:[Resource
> state = Maintenance, Agent event = AgentConnected, Host id = 1, name =
> srv-xxxxxx.xxxxxxxxxxxxxxxxx]
> 2022-09-26 20:02:32,472 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Agent's management
> server host list is not up to date, sending list update:[172.17.1.2]
> 2022-09-26 20:02:32,472 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) create
> ClusteredAgentAttache for 1
> 2022-09-26 20:02:32,473 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Connect to
> listener: XcpServerDiscoverer
> 2022-09-26 20:02:32,473 DEBUG [c.c.h.x.d.XcpServerDiscoverer]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Not XenServer so
> moving on.
> 2022-09-26 20:02:32,473 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Connect to
> listener: HypervServerDiscoverer
> 2022-09-26 20:02:32,474 DEBUG [c.c.h.h.d.HypervServerDiscoverer]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Not Hyper-V
> hypervisor, so moving on.
> 2022-09-26 20:02:32,474 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Connect to
> listener: StoragePoolMonitor
> 2022-09-26 20:02:32,479 DEBUG [c.c.s.l.StoragePoolMonitor]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Host 1 connected,
> connecting host to shared pool id 1 and sending storage pool information ...
> 2022-09-26 20:02:32,480 DEBUG [c.c.s.StorageManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Adding pool
> root_storage to  host 1
> 2022-09-26 20:02:32,484 DEBUG [c.c.a.t.Request]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Seq
> 1-605171199927910401: Sending  { Cmd , MgmtId: 90520740759984, via:
> 1(srv-xxxxxx.xxxxxxxxxxxxxxxxx), Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.ModifyStoragePoolCommand":{"add":"true","pool":{"id":"1","uuid":"8bc1feaa-a099-3590-aa2b-2d12df0e242b","host":"172.17.2.6","path":"/volume4/primary_storage","port":"2049","type":"NetworkFilesystem"},"localPath":"/mnt//8bc1feaa-a099-3590-aa2b-2d12df0e242b","wait":"0","bypassHostMaintenance":"false"}}]
> }
> 2022-09-26 20:02:32,558 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentManager-Handler-9:null) (logid:) Ping from Routing host
> 1(srv-xxxxxx.xxxxxxxxxxxxxxxxx)
> 2022-09-26 20:02:32,558 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> (AgentManager-Handler-9:null) (logid:) Process host VM state report from
> ping process. host: 1
> 2022-09-26 20:02:32,558 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> (AgentManager-Handler-9:null) (logid:) Process VM state report. host: 1,
> number of records in report: 0
> 2022-09-26 20:02:32,560 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> (AgentManager-Handler-9:null) (logid:) Done with process of VM state
> report. host: 1
> 2022-09-26 20:02:32,562 INFO  [c.c.a.m.AgentManagerImpl]
> (AgentManager-Handler-9:null) (logid:) PingMap for agent: 1 will not be
> updated because agent is no longer in the PingMap
> 2022-09-26 20:02:32,564 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentManager-Handler-9:null) (logid:) Not processing PingRoutingCommand
> for agent id=0; can't find the host in the DB
> 2022-09-26 20:02:32,568 DEBUG [c.c.a.t.Request]
> (AgentManager-Handler-1:null) (logid:) Seq 1-605171199927910401:
> Processing:  { Ans: , MgmtId: 90520740759984, via: 1, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.ModifyStoragePoolAnswer":{"poolInfo":{"host":"172.17.2.6","localPath":"/mnt//8bc1feaa-a099-3590-aa2b-2d12df0e242b","hostPath":"/volume4/primary_storage","poolType":"NetworkFilesystem","capacityBytes":"(3.4759
> TB) 3821825818624","availableBytes":"(1.6431 TB)
> 1806561574912"},"templateInfo":{},"datastoreClusterChildren":[],"result":"true","wait":"0","bypassHostMaintenance":"false"}}]
> }
> 2022-09-26 20:02:32,568 DEBUG [c.c.a.t.Request]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Seq
> 1-605171199927910401: Received:  { Ans: , MgmtId: 90520740759984, via:
> 1(srv-xxxxxx.xxxxxxxxxxxxxxxxx), Ver: v1, Flags: 10, {
> ModifyStoragePoolAnswer } }
> 2022-09-26 20:02:32,576 INFO  [o.a.c.s.d.p.DefaultHostListener]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) *Connection
> established between storage pool
> org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@3d21695c and
> host 1*
> 2022-09-26 20:02:32,587 DEBUG [c.c.n.NetworkModelImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Service
> SecurityGroup is not supported in the network id=214
> *2022-09-26 20:02:32,593 WARN  [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Resource [Host:1] is
> unreachable: Host 1: Unable to send class
> com.cloud.agent.api.SetupPersistentNetworkCommand because agent
> srv-xxxxxx.xxxxxxxxxxxxxxxxx is in maintenance mode*
> *2022-09-26 20:02:32,594 INFO  [c.c.u.e.CSExceptionErrorCode]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Could not find
> exception: com.cloud.exception.ConnectionException in error code list for
> exceptions*
> *2022-09-26 20:02:32,594 WARN  [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Monitor
> StoragePoolMonitor says there is an error in the connect process for 1 due
> to Unable to connect host 1 to storage pool id 1 due to
> com.cloud.utils.exception.CloudRuntimeException: Unable to get answer to
> the setup persistent network command 214*
> 2022-09-26 20:02:32,595 INFO  [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Host 1 is
> disconnecting with event AgentDisconnected
> 2022-09-26 20:02:32,596 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) The next status of
> agent 1is Alert, current status is Connecting
> 2022-09-26 20:02:32,597 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Deregistering link
> for 1 with state Alert
> 2022-09-26 20:02:32,598 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Remove Agent : 1
> 2022-09-26 20:02:32,598 DEBUG [c.c.a.m.ConnectedAgentAttache]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Processing
> Disconnect.
> 2022-09-26 20:02:32,598 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.hypervisor.xenserver.discoverer.XcpServerDiscoverer
> 2022-09-26 20:02:32,598 DEBUG [c.c.u.n.NioConnection]
> (AgentManager-NioConnectionHandler-1:null) (logid:) Closing socket
> Socket[addr=/172.17.0.2,port=49552,localport=8250]
> 2022-09-26 20:02:32,598 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.hypervisor.hyperv.discoverer.HypervServerDiscoverer
> 2022-09-26 20:02:32,599 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.storage.listener.StoragePoolMonitor
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.hypervisor.vmware.manager.VmwareManagerImpl
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.deploy.DeploymentPlanningManagerImpl
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.network.security.SecurityGroupListener
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.storage.secondary.SecondaryStorageListener
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: org.apache.cloudstack.engine.orchestration.NetworkOrchestrator
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.vm.ClusteredVirtualMachineManagerImpl
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener:
> com.cloud.network.NetworkUsageManagerImpl$DirectNetworkStatsListener
> 2022-09-26 20:02:32,602 DEBUG [c.c.n.NetworkUsageManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Disconnected called
> on 1 with status Alert
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.storage.upload.UploadListener
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.capacity.StorageCapacityListener
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.capacity.ComputeCapacityListener
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.network.SshKeysDistriMonitor
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.network.router.VirtualNetworkApplianceManagerImpl
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.agent.manager.AgentManagerImpl$BehindOnPingListener
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.agent.manager.AgentManagerImpl$SetHostParamsListener
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.storage.download.DownloadListener
> 2022-09-26 20:02:32,602 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.consoleproxy.ConsoleProxyListener
> 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.network.SshKeysDistriMonitor
> 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl
> 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.storage.LocalStoragePoolListener
> 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.hypervisor.kvm.discoverer.LxcServerDiscoverer
> 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Sending Disconnect
> to listener: com.cloud.hypervisor.kvm.discoverer.KvmServerDiscoverer
> 2022-09-26 20:02:32,603 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Transition:[Resource
> state = Maintenance, Agent event = AgentDisconnected, Host id = 1, name =
> srv-xxxxxx.xxxxxxxxxxxxxxxxx]
> 2022-09-26 20:02:32,607 DEBUG [c.c.a.m.ClusteredAgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Notifying other
> nodes of to disconnect
>
> *2022-09-26 20:02:32,610 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Failed to handle
> host connection: com.cloud.exception.ConnectionException: Unable to connect
> host 1 to storage pool id 1 due to
> com.cloud.utils.exception.CloudRuntimeException: Unable to get answer to
> the setup persistent network command 214*
> at
> com.cloud.storage.listener.StoragePoolMonitor.processConnect(StoragePoolMonitor.java:138)
> at
> com.cloud.agent.manager.AgentManagerImpl.notifyMonitorsOfConnection(AgentManagerImpl.java:569)
> at
> com.cloud.agent.manager.AgentManagerImpl.handleConnectedAgent(AgentManagerImpl.java:1133)
> at
> com.cloud.agent.manager.AgentManagerImpl$HandleAgentConnectTask.runInContext(AgentManagerImpl.java:1217)
> at
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
> at
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: com.cloud.utils.exception.CloudRuntimeException: Unable to get
> answer to the setup persistent network command 214
> at
> org.apache.cloudstack.storage.datastore.provider.DefaultHostListener.setupPersistentNetwork(DefaultHostListener.java:245)
> at
> org.apache.cloudstack.storage.datastore.provider.DefaultHostListener.createPersistentNetworkResourcesOnHost(DefaultHostListener.java:99)
> at
> org.apache.cloudstack.storage.datastore.provider.DefaultHostListener.hostConnect(DefaultHostListener.java:162)
> ... 13 more
> 2022-09-26 20:02:32,611 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentConnectTaskPool-6:ctx-5ce58719) (logid:929b8bd0) Can not send command
> com.cloud.agent.api.ReadyCommand due to Host 1 is not up
> 2022-09-26 20:02:32,688 DEBUG [o.a.c.c.p.RootCACustomTrustManager]
> (pool-13-thread-1:null) (logid:) A client/agent attempting connection from
> address=172.17.0.2 has presented these certificate(s):
> Certificate [1] :
>
> And from the agent:
>
> 2022-09-26 20:02:32,351 INFO  [kvm.storage.LibvirtStorageAdaptor]
> (Agent-Handler-1:null) (logid:) Attempting to create storage pool
> 5603c980-a676-4b0f-8e00-b8bbc7ef740a (Filesystem) in libvirt
> 2022-09-26 20:02:32,351 DEBUG [kvm.resource.LibvirtConnection]
> (Agent-Handler-1:null) (logid:) Looking for libvirtd connection at:
> qemu:///system
> 2022-09-26 20:02:32,355 ERROR [kvm.resource.LibvirtConnection]
> (Agent-Handler-1:null) (logid:) Connection with libvirtd is broken: invalid
> connection pointer in virConnectGetVersion
> 2022-09-26 20:02:32,355 DEBUG [kvm.resource.LibvirtConnection]
> (Agent-Handler-1:null) (logid:) Opening a new libvirtd connection to:
> qemu:///system
> 2022-09-26 20:02:32,357 INFO  [kvm.storage.LibvirtStorageAdaptor]
> (Agent-Handler-1:null) (logid:) Found existing defined storage pool
> 5603c980-a676-4b0f-8e00-b8bbc7ef740a, using it.
> 2022-09-26 20:02:32,358 INFO  [kvm.storage.LibvirtStorageAdaptor]
> (Agent-Handler-1:null) (logid:) Trying to fetch storage pool
> 5603c980-a676-4b0f-8e00-b8bbc7ef740a from libvirt
> 2022-09-26 20:02:32,358 DEBUG [kvm.resource.LibvirtConnection]
> (Agent-Handler-1:null) (logid:) Looking for libvirtd connection at:
> qemu:///system
> 2022-09-26 20:02:32,369 DEBUG [kvm.storage.LibvirtStorageAdaptor]
> (Agent-Handler-1:null) (logid:) Successfully refreshed pool
> 5603c980-a676-4b0f-8e00-b8bbc7ef740a Capacity: (438.55 GB) 470886195200
> Used: (15.18 GB) 16303230976 Available: (423.36 GB) 454582964224
> 2022-09-26 20:02:32,371 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:) Executing: hostname
> 2022-09-26 20:02:32,372 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:) Executing while with timeout : 500
> 2022-09-26 20:02:32,373 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:) Execution is successful.
> 2022-09-26 20:02:32,373 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:) Executing: hostname
> 2022-09-26 20:02:32,374 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:) Executing while with timeout : 500
> 2022-09-26 20:02:32,374 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:) Execution is successful.
> 2022-09-26 20:02:32,397 INFO  [cloud.serializer.GsonHelper]
> (Agent-Handler-1:null) (logid:) Default Builder inited.
> 2022-09-26 20:02:32,415 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:) Sending Startup: Seq 0-0:  { Cmd , MgmtId: -1, via: 0, Ver: v1,
> Flags: 1,
> [{"com.cloud.agent.api.StartupRoutingCommand":{"cpuSockets":"1","cpus":"48","speed":"2650","memory":"268725788672","dom0MinMemory":"1073741824","poolSync":"false","supportsClonedVolumes":"false","caps":"hvm,snapshot","pool":"/root","hypervisorType":"KVM","hostDetails":{"Host.OS.Kernel.Version":"5.4.0-126-generic","com.cloud.network.Networks.RouterPrivateIpStrategy":"HostLocal","Host.OS.Version":"20.04","secured":"true","Host.OS":"Ubuntu"},"hostTags":[],"groupDetails":{},"type":"Routing","dataCenter":"1","pod":"1","cluster":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-2-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","iqn":"iqn.1993-08.org.debian:01:e0741deca62","privateIpAddress":"172.17.0.2","privateMacAddress":"b0:7b:25:c0:1a:8b","privateNetmask":"255.255.255.192","storageIpAddress":"172.17.0.2","storageNetmask":"255.255.255.192","storageMacAddress":"b0:7b:25:c0:1a:8b","resourceName":"LibvirtComputingResource","gatewayIpAddress":"172.16.2.1","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}},{"com.cloud.agent.api.StartupStorageCommand":{"totalSize":"(0
> bytes)
> 0","poolInfo":{"uuid":"5603c980-a676-4b0f-8e00-b8bbc7ef740a","host":"172.17.0.2","localPath":"/var/lib/libvirt/images","hostPath":"/var/lib/libvirt/images","poolType":"Filesystem","capacityBytes":"(438.55
> GB) 470886195200","availableBytes":"(423.36 GB)
> 454582964224"},"resourceType":"STORAGE_POOL","hostDetails":{},"type":"Storage","dataCenter":"1","pod":"1","guid":"dcb7e9d3-b26a-3da5-b91b-10dd1e28d97a-LibvirtComputingResource","name":"srv-2-xxxxxx.xxxxxxxxxxxxxxxxx","id":"0","version":"4.17.0.1","resourceName":"LibvirtComputingResource","msHostList":"172.17.1.2@static","wait":"0","bypassHostMaintenance":"false"}}]
> }
> 2022-09-26 20:02:32,415 DEBUG [cloud.agent.Agent] (Agent-Handler-1:null)
> (logid:) Startup task created
> 2022-09-26 20:02:32,433 DEBUG [cloud.agent.Agent] (Agent-Handler-2:null)
> (logid:) Received response: Seq 0-0:  { Ans: , MgmtId: 90520740759984, via:
> -1, Ver: v1, Flags: 100000,
> [{"com.cloud.agent.api.StartupAnswer":{"hostId":"0","pingInterval":"60","result":"true","wait":"0","bypassHostMaintenance":"false"}}]
> }
> 2022-09-26 20:02:32,433 DEBUG [cloud.agent.Agent] (Agent-Handler-2:null)
> (logid:) Startup task cancelled
> 2022-09-26 20:02:32,433 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> (logid:) Process agent startup answer, agent id = 0
> 2022-09-26 20:02:32,433 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> (logid:) Set agent id 0
> 2022-09-26 20:02:32,439 DEBUG [cloud.agent.Agent] (Agent-Handler-2:null)
> (logid:) Adding a watch list
> 2022-09-26 20:02:32,439 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
> (logid:) Startup Response Received: agent id = 0
> 2022-09-26 20:02:32,443 DEBUG [kvm.resource.LibvirtComputingResource]
> (UgentTask-1:null) (logid:) Executing:
> /usr/share/cloudstack-common/scripts/vm/network/security_group.py
> get_rule_logs_for_vms
> 2022-09-26 20:02:32,444 DEBUG [kvm.resource.LibvirtComputingResource]
> (UgentTask-1:null) (logid:) Executing while with timeout : 1800000
> 2022-09-26 20:02:32,494 DEBUG [cloud.agent.Agent]
> (agentRequest-Handler-2:null) (logid:929b8bd0) Request:Seq
> 1-605171199927910401:  { Cmd , MgmtId: 90520740759984, via: 1, Ver: v1,
> Flags: 100011,
> [{"com.cloud.agent.api.ModifyStoragePoolCommand":{"add":"true","pool":{"id":"1","uuid":"8bc1feaa-a099-3590-aa2b-2d12df0e242b","host":"172.17.2.6","path":"/volume4/primary_storage","port":"2049","type":"NetworkFilesystem"},"localPath":"/mnt//8bc1feaa-a099-3590-aa2b-2d12df0e242b","wait":"0","bypassHostMaintenance":"false"}}]
> }
> 2022-09-26 20:02:32,495 DEBUG [cloud.agent.Agent]
> (agentRequest-Handler-2:null) (logid:929b8bd0) Processing command:
> com.cloud.agent.api.ModifyStoragePoolCommand
> 2022-09-26 20:02:32,504 DEBUG [kvm.resource.LibvirtComputingResource]
> (UgentTask-1:null) (logid:) Execution is successful.
> 2022-09-26 20:02:32,509 DEBUG [kvm.resource.LibvirtConnection]
> (UgentTask-1:null) (logid:) Looking for libvirtd connection at:
> qemu:///system
> 2022-09-26 20:02:32,512 DEBUG [cloud.agent.Agent] (UgentTask-1:null)
> (logid:) Sending ping: Seq 0-1:  { Cmd , MgmtId: -1, via: 0, Ver: v1,
> Flags: 11,
> [{"com.cloud.agent.api.PingRoutingWithNwGroupsCommand":{"newGroupStates":{},"_hostVmStateReport":{},"_gatewayAccessible":"true","_vnetAccessible":"true","hostType":"Routing","hostId":"0","wait":"0","bypassHostMaintenance":"false"}}]
> }
> 2022-09-26 20:02:32,555 INFO  [kvm.storage.LibvirtStorageAdaptor]
> (agentRequest-Handler-2:null) (logid:929b8bd0) Attempting to create storage
> pool 8bc1feaa-a099-3590-aa2b-2d12df0e242b (NetworkFilesystem) in libvirt
> 2022-09-26 20:02:32,555 DEBUG [kvm.resource.LibvirtConnection]
> (agentRequest-Handler-2:null) (logid:929b8bd0) Looking for libvirtd
> connection at: qemu:///system
> 2022-09-26 20:02:32,556 INFO  [kvm.storage.LibvirtStorageAdaptor]
> (agentRequest-Handler-2:null) (logid:929b8bd0) Found existing defined
> storage pool 8bc1feaa-a099-3590-aa2b-2d12df0e242b, using it.
> 2022-09-26 20:02:32,556 INFO  [kvm.storage.LibvirtStorageAdaptor]
> (agentRequest-Handler-2:null) (logid:929b8bd0) Trying to fetch storage pool
> 8bc1feaa-a099-3590-aa2b-2d12df0e242b from libvirt
> 2022-09-26 20:02:32,556 DEBUG [kvm.resource.LibvirtConnection]
> (agentRequest-Handler-2:null) (logid:929b8bd0) Looking for libvirtd
> connection at: qemu:///system
> 2022-09-26 20:02:32,560 DEBUG [kvm.storage.LibvirtStorageAdaptor]
> (agentRequest-Handler-2:null) (logid:929b8bd0) Successfully refreshed pool
> 8bc1feaa-a099-3590-aa2b-2d12df0e242b Capacity: (3.4759 TB) 3821825818624
> Used: (1.8329 TB) 2015264243712 Available: (1.6431 TB) 1806561574912
> 2022-09-26 20:02:32,565 DEBUG [cloud.agent.Agent]
> (agentRequest-Handler-2:null) (logid:929b8bd0) Seq 1-605171199927910401:  {
> Ans: , MgmtId: 90520740759984, via: 1, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.ModifyStoragePoolAnswer":{"poolInfo":{"host":"172.17.2.6","localPath":"/mnt//8bc1feaa-a099-3590-aa2b-2d12df0e242b","hostPath":"/volume4/primary_storage","poolType":"NetworkFilesystem","capacityBytes":"(3.4759
> TB) 3821825818624","availableBytes":"(1.6431 TB)
> 1806561574912"},"templateInfo":{},"datastoreClusterChildren":[],"result":"true","wait":"0","bypassHostMaintenance":"false"}}]
> }
> 2022-09-26 20:02:32,568 DEBUG [cloud.agent.Agent] (Agent-Handler-4:null)
> (logid:) Received response: Seq 0-1:  { Ans: , MgmtId: 90520740759984, via:
> 1, Ver: v1, Flags: 100010,
> [{"com.cloud.agent.api.PingAnswer":{"_command":{"hostType":"Routing","hostId":"0","wait":"0","bypassHostMaintenance":"false"},"result":"true","wait":"0","bypassHostMaintenance":"false"}}]
> }
>
>
>
>
> *2022-09-26 20:02:32,605 DEBUG [utils.nio.NioConnection]
> (Agent-NioConnectionHandler-1:null) (logid:) Location 1: Socket
> Socket[addr=/172.17.1.2 <http://172.17.1.2>,port=8250,localport=49552]
> closed on read.  Probably -1 returned: Connection closed with -1 on reading
> size.2022-09-26 20:02:32,606 DEBUG [utils.nio.NioConnection]
> (Agent-NioConnectionHandler-1:null) (logid:) Closing socket
> Socket[addr=/172.17.1.2
> <http://172.17.1.2>,port=8250,localport=49552]2022-09-26 20:02:32,608 DEBUG
> [cloud.agent.Agent] (Agent-Handler-5:null) (logid:) Clearing watch list:
> 12022-09-26 20:02:32,610 INFO  [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:) Lost connection to host: 172.17.1.2. Attempting reconnection while
> we still have 0 commands in progress.2022-09-26 20:02:32,611 INFO
>  [utils.nio.NioClient] (Agent-Handler-5:null) (logid:) NioClient connection
> closed*
> 2022-09-26 20:02:32,611 INFO  [cloud.agent.Agent] (Agent-Handler-5:null)
> (logid:) Reconnecting to host:172.17.1.2
> 2022-09-26 20:02:32,611 INFO  [utils.nio.NioClient] (Agent-Handler-5:null)
> (logid:) Connecting to 172.17.1.2:8250
> 2022-09-26 20:02:32,612 INFO  [utils.nio.Link] (Agent-Handler-5:null)
> (logid:) Conf file found: /etc/cloudstack/agent/agent.properties
> 2022-09-26 20:02:32,692 INFO  [utils.nio.NioClient] (Agent-Handler-5:null)
> (logid:) SSL: Handshake done
> 2022-09-26 20:02:32,692 INFO  [utils.nio.NioClient] (Agent-Handler-5:null)
> (logid:) Connected to 172.17.1.2:8250
>
> Maybe someone can point out what the main problem is=?
> I currently only 'see' a command that fails due to the host is in
> maintenance-mode which then leads to the conclusion that the host can'
> connect to the storage pool... but before that the connection was
> established???
>
>
> Regards,
> Chris
>
>

Reply via email to