Good evening ladies and gents,

So I'm still trying to set up both management and agent on the same server
running qemu-kvm on Alma Linux 8.4 (basically CentOS). My agent keeps
crashing, either for good or briefly while host addition is going on. In
the latter case it starts up again when I tell it to, only to crash when I
try to add the host in the UI again.

Some log entries from various times the agent crashed:
===============
Received unknown parameters for command addHost. Unknown parameters :
clustertype
 can't setup agent, due to com.cloud.utils.exception.CloudRuntimeException
===============
(AgentShutdownThread:) (logid:) Stopping the agent: Reason = sig.kill
 (AgentManager-Handler-15:null) (logid:) Host 1 has informed us that it is
shutting down with reason sig.kill and detail null
 (AgentTaskPool-1:ctx-04086948) (logid:18b54886) Host 1 is disconnecting
with event ShutdownRequested
 cloudstack-agent.service: Main process exited, code=exited, status=143/n/a
 cloudstack-agent.service: Failed with result 'exit-code'.
===============
ERROR [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:)
org.libvirt.LibvirtException: XML error: missing pool source name element
===============
ERROR [c.c.u.n.Link] (AgentManager-SSLHandshakeHandler-1:null) (logid:) SSL
error caught during unwrap data: Received fatal alert: bad_certificate, for
local address=/127.0.0.1:8250, remote address=/127.0.0.1:57550.
ERROR [utils.nio.Link] (Agent-Handler-2:) (logid:) SSL error caught during
unwrap data: Received fatal alert: bad_certificate, for local address=/
127.0.0.1:57550, remote address=localhost/127.0.0.1:8250. The client may
have invalid ca-certificates.
===============
ERROR [kvm.resource.LibvirtConnection] (Agent-Handler-1:) (logid:)
Connection with libvirtd is broken: invalid connection pointer in
virConnectGetVersion
===============
java.io.IOException: keystore password was incorrect
===============
ERROR [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:)
org.libvirt.LibvirtException:
XML error: missing pool source name element
===============
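
A rough sketch, in Python, of how ERROR entries shaped like the excerpts
above could be tallied per logger, to see which component fails most often.
The function name is made up for illustration, and the regex only assumes
the "ERROR [logger]" prefix visible in the pasted lines:

```python
import re
from collections import Counter

# Matches the prefix seen in the excerpts above, e.g.
#   ERROR [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:) (logid:)
ERROR_LINE = re.compile(r"ERROR \[(?P<logger>[^\]]+)\]")


def count_errors_by_logger(lines):
    """Count ERROR entries per logger name (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            counts[match.group("logger")] += 1
    return counts
```

Passing an open log file (any iterable of lines) and printing
counts.most_common() would list the noisiest loggers first.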

When the UI fails to add the host, it shows this error:
Something went wrong; please correct the following:
TypeError: Cannot read property 'value' of undefined

Note: I'm not using SSL.
Previously I didn't have any problem with the agent detecting the bridge
network adapter, but for the past 6 tries or so it has failed to detect the
adapter and crashed on startup. I had to specify it manually in
/etc/cloudstack/agent/agent.properties for the agent to start, only for it
to crash with one of the errors above. With some of those, especially the
SSL error or the keystore password error, it just crashes again. Most of
the time I use ReaR to restore the server to the state after the
update/install/config file changes, so the only things missing are the
cloudstack-management and cloudstack-agent installs, and try again. Other
times I reinstall the entire thing. I'm not sure why this is so hard to get
right...
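
For reference, a minimal Python sketch of how one might check whether the
network device entries are actually set in agent.properties before starting
the agent. The three key names are my assumption about what the agent
reads, and both helper names are hypothetical, so check them against your
own file:

```python
def read_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# Assumed key names; verify against your own agent.properties.
DEVICE_KEYS = (
    "private.network.device",
    "guest.network.device",
    "public.network.device",
)


def missing_device_keys(props, keys=DEVICE_KEYS):
    """Return the keys that are absent or have an empty value."""
    return [k for k in keys if not props.get(k)]
```

Reading /etc/cloudstack/agent/agent.properties into a string and passing it
through both functions would return whichever of the assumed keys are unset.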

I should also mention that even though the host addition fails, the host
still shows up under hosts. If I quit the wizard, delete the
host/pod/zone/network and try again, no problem. If I fix the errors and
try re-adding, there will be 2 hosts, one up and one disconnected, and I
can't remove the disconnected one. Clearing out the db and redoing it
doesn't seem to 'clean up' enough and often results in a 503, so I just
restore with ReaR at that point.

For the KVM packages, I installed qemu-kvm, libvirt, libvirt-python3,
libguestfs-tools, virt-install, and python-virtinst.
The network bridge configuration is basically copied from the CloudStack
install guide, with the appropriate IP address values and
NM_CONTROLLED=yes instead of no.

Any pointers would be appreciated. Thank you.
