GitHub user bradh352 added a comment to the discussion: Degraded cloudstack 
agent

@TadiosAbebe my environment matches yours .... ubuntu 24.04, hypercoverged 
ceph.  And the reported behavior also matches.

That said, using the force reconnect never did work for me as a resolution.  
And when a host got in that state, just restarting the agents wouldn't work for 
me and I'm not exactly sure why.   I never did fully isolate it.  Sometimes I 
had to restart the management servers which was really concerning, I think I 
tried restarting libvirtd too.

I set up a cron job to restart the agent daily at 5 min intervals across hosts: 
https://github.com/bradh352/ansible-role-service-cloudstack/commit/fdb14d92f115a5845d95880a05f467b42a84aff1

This has helped considerably.  That said, I just went through to "test" 
everything right now to see if everything was healthy, which entails trying to 
open the console on one vm per host, and I found a host in a bad state ... ugh. 
 So restarting the agent daily is definitely not a resolution.  Out of 
curiosity I decided to restart libvirtd, and that *did* fix it on the bad host.




GitHub link: 
https://github.com/apache/cloudstack/discussions/12450#discussioncomment-15515824

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]

Reply via email to