GitHub user bradh352 added a comment to the discussion: Degraded cloudstack agent
@TadiosAbebe my environment matches yours .... ubuntu 24.04, hypercoverged ceph. And the reported behavior also matches. That said, using the force reconnect never did work for me as a resolution. And when a host got in that state, just restarting the agents wouldn't work for me and I'm not exactly sure why. I never did fully isolate it. Sometimes I had to restart the management servers which was really concerning, I think I tried restarting libvirtd too. I set up a cron job to restart the agent daily at 5 min intervals across hosts: https://github.com/bradh352/ansible-role-service-cloudstack/commit/fdb14d92f115a5845d95880a05f467b42a84aff1 This has helped considerably. That said, I just went through to "test" everything right now to see if everything was healthy, which entails trying to open the console on one vm per host, and I found a host in a bad state ... ugh. So restarting the agent daily is definitely not a resolution. Out of curiosity I decided to restart libvirtd, and that *did* fix it on the bad host. GitHub link: https://github.com/apache/cloudstack/discussions/12450#discussioncomment-15515824 ---- This is an automatically sent email for [email protected]. To unsubscribe, please send an email to: [email protected]
