GitHub user TadiosAbebe added a comment to the discussion: Degraded cloudstack agent
> I had a test all-in-one ACS on Ubuntu 24.04 with libvirt 10.0.0, but I couldn't reproduce the issue I'm seeing in the production environment. I repeatedly ran your test script:
>
> ```
> for i in `seq 1 20`; do
>   cmk deploy virtualmachine name=L2-wei-test-$i serviceofferingid=xxx zoneid=xxx templateid=xxx networkids=xxx >/dev/null &
>   sleep 2
> done
> ```
>
> to generate load, and the results were consistently fast:
>
> ```
> mysql> select id,name,created,update_time,(update_time-created) from vm_instance where removed is null and name like "L2-wei%";
> +-----+----------------+---------------------+---------------------+-----------------------+
> | id  | name           | created             | update_time         | (update_time-created) |
> +-----+----------------+---------------------+---------------------+-----------------------+
> | 191 | L2-wei-test-1  | 2025-11-25 11:22:07 | 2025-11-25 11:22:14 |                     7 |
> | 192 | L2-wei-test-2  | 2025-11-25 11:22:09 | 2025-11-25 11:22:16 |                     7 |
> | 193 | L2-wei-test-3  | 2025-11-25 11:22:11 | 2025-11-25 11:22:17 |                     6 |
> | 194 | L2-wei-test-4  | 2025-11-25 11:22:13 | 2025-11-25 11:22:22 |                     9 |
> | 195 | L2-wei-test-5  | 2025-11-25 11:22:15 | 2025-11-25 11:22:20 |                     5 |
> | 196 | L2-wei-test-6  | 2025-11-25 11:22:17 | 2025-11-25 11:22:23 |                     6 |
> | 197 | L2-wei-test-7  | 2025-11-25 11:22:19 | 2025-11-25 11:22:26 |                     7 |
> | 198 | L2-wei-test-8  | 2025-11-25 11:22:21 | 2025-11-25 11:22:27 |                     6 |
> | 199 | L2-wei-test-9  | 2025-11-25 11:22:23 | 2025-11-25 11:22:29 |                     6 |
> | 200 | L2-wei-test-10 | 2025-11-25 11:22:25 | 2025-11-25 11:22:31 |                     6 |
> | 201 | L2-wei-test-11 | 2025-11-25 11:22:27 | 2025-11-25 11:22:34 |                     7 |
> | 202 | L2-wei-test-12 | 2025-11-25 11:22:29 | 2025-11-25 11:22:36 |                     7 |
> | 203 | L2-wei-test-13 | 2025-11-25 11:22:31 | 2025-11-25 11:22:38 |                     7 |
> | 204 | L2-wei-test-14 | 2025-11-25 11:22:33 | 2025-11-25 11:22:41 |                     8 |
> | 205 | L2-wei-test-15 | 2025-11-25 11:22:35 | 2025-11-25 11:22:42 |                     7 |
> | 206 | L2-wei-test-16 | 2025-11-25 11:22:37 | 2025-11-25 11:22:45 |                     8 |
> | 207 | L2-wei-test-17 | 2025-11-25 11:22:39 | 2025-11-25 11:22:48 |                     9 |
> | 208 | L2-wei-test-18 | 2025-11-25 11:22:41 | 2025-11-25 11:22:49 |                     8 |
> | 209 | L2-wei-test-19 | 2025-11-25 11:22:43 | 2025-11-25 11:22:51 |                     8 |
> | 210 | L2-wei-test-20 | 2025-11-25 11:22:45 | 2025-11-25 11:22:55 |                    10 |
> +-----+----------------+---------------------+---------------------+-----------------------+
> ```
>
> I'll try to create a full test environment consisting of Ceph and multiple KVM hosts later this week, to see if I can replicate the issue there and whether libvirt 10.6.0 fixes it.

My update: after yesterday's test I left the all-in-one ACS sitting for a while without restarting libvirtd or cloudstack-agent. Initially the resource utilization of the java process was:

```
CPU: 0.5% MEM: 2.4% FD: 253 Threads: 83 Conn: 1
```

After about 6 or 7 hours, with no interaction on the host beyond the 20 small cirros VMs launched earlier, CPU had crept up to about 1.1%. I then simulated some workload by creating and destroying those 20 instances in a loop; after about 10 hours, CPU utilization had increased to about 4.5%:

```
CPU: 4.5% MEM: 2.6% FD: 253 Threads: 78 Conn: 1
```
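For reference, the create/destroy churn was along the lines of the sketch below (not the exact script; the ids are placeholders, and parsing `cmk`'s default JSON output with `grep` is only illustrative):

```
# Churn loop sketch: deploy 20 small VMs, then destroy+expunge them, forever.
# serviceofferingid/zoneid/templateid/networkids are placeholders.
while true; do
  for i in `seq 1 20`; do
    cmk deploy virtualmachine name=L2-wei-test-$i serviceofferingid=xxx zoneid=xxx templateid=xxx networkids=xxx >/dev/null &
    sleep 2
  done
  wait   # let all deploy jobs finish before tearing down
  for id in $(cmk list virtualmachines keyword=L2-wei-test filter=id | grep -o '"id": *"[^"]*"' | cut -d'"' -f4); do
    cmk destroy virtualmachine id=$id expunge=true >/dev/null
  done
done
```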
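The CPU/MEM/FD/Threads/Conn snapshots come from standard /proc tooling; a minimal sketch of how such a snapshot can be taken (assuming a single java process matches the agent, and root access for `/proc/<pid>/fd`):

```
# Snapshot of the agent JVM's resource usage.
PID=$(pgrep -of 'cloudstack-agent')                 # oldest matching process
read CPU MEM NLWP < <(ps -p "$PID" -o %cpu=,%mem=,nlwp=)
FD=$(ls /proc/"$PID"/fd | wc -l)                    # open file descriptors
CONN=$(ss -tnp 2>/dev/null | grep -c "pid=$PID,")   # established TCP conns
echo "CPU: $CPU% MEM: $MEM% FD: $FD Threads: $NLWP Conn: $CONN"
```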
One thing I noticed when looking at the java processes in htop: 5 or 6 of them in our production cluster show a very high accumulated CPU time; for example, on one of our compute hosts, about 4 of them have a CPU time of around 53:54:30. Is this normal?

If I restart libvirtd, those entries with high CPU time go away and the maximum becomes 0:05:26.
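As a side note on the htop observation: htop lists userland threads as separate rows by default (toggle with `H`), so the "java processes" with high CPU time are most likely long-lived threads inside a single agent JVM rather than separate processes. Which threads accumulate the time can be checked with GNU ps, e.g. (the pgrep pattern is an assumption):

```
# Show the agent JVM's threads, sorted by accumulated CPU time.
PID=$(pgrep -of 'cloudstack-agent')
ps -L -p "$PID" -o lwp,time,comm --sort=-time | head -n 10
```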
I'll now replace libvirt with 10.6.0 on the all-in-one instance and see what changes.

GitHub link: https://github.com/apache/cloudstack/discussions/12450#discussioncomment-15515843