vgarcia-linube commented on issue #11141:
URL: https://github.com/apache/cloudstack/issues/11141#issuecomment-3292786693

   The problem seems to be a leak in the handler threads while checking storage 
usage. The more agent threads you configure in agent.properties and the shorter 
the interval you configure for retrieving volume usage metrics, the worse it 
gets and the faster it happens.
   
   On a fresh start of the agent, you get the 'Trying to fetch storage pool 
xxxx from libvirt' message whenever the usage service fetches updated 
metrics. Those requests seem to be either leaking or not getting cleaned up in 
time. Over time they start to overlap, and you end up seeing the same request 
to the same primary storage tens or hundreds of times. The only way to recover 
is to restart the agent. To mitigate it, limit the number of agent threads and 
read the usage metrics at longer intervals. I think the default is 10 minutes 
or something like that; setting it to once every two hours helps a bit, just 
enough that you don't have to restart the agent every few hours to keep it 
from hogging the KVM node's CPU.
   
   Here's a log with the storage UUIDs redacted so it's easier to read (take a 
look at the timestamps):
   
[storage-log-no-uuids.log](https://github.com/user-attachments/files/22346459/storage-log-no-uuids.log)
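
   To see the pile-up in numbers rather than by eye, something like the 
following rough sketch could count how often the fetch message appears per 
minute and per pool. It assumes each log line starts with a 
`YYYY-MM-DD HH:MM:SS,mmm` timestamp and contains the message quoted above; the 
function names and the threshold are made up for illustration and are not part 
of CloudStack.

```python
import re
from collections import Counter

# Assumption: lines start with "YYYY-MM-DD HH:MM:SS,mmm" and contain the
# 'Trying to fetch storage pool <id> from libvirt' message from the agent log.
FETCH_RE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\S*"
    r".*Trying to fetch storage pool (?P<pool>\S+) from libvirt"
)


def count_fetches_per_minute(lines):
    """Count fetch messages, keyed by (minute, pool id)."""
    counts = Counter()
    for line in lines:
        match = FETCH_RE.search(line)
        if match:
            counts[(match["minute"], match["pool"])] += 1
    return counts


def suspicious_minutes(counts, threshold=10):
    """Return ((minute, pool), count) pairs at or above a hypothetical threshold."""
    return sorted((key, n) for key, n in counts.items() if n >= threshold)
```

   Feeding it the lines of the agent log (for example 
`count_fetches_per_minute(open(path))`) and then calling `suspicious_minutes` 
on the result should show whether the count per pool keeps growing between 
agent restarts.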
   
   This has been happening since at least CloudStack 4.19.

