GitHub user akrasnov-drv closed a discussion: CloudStack starts misbehaving 
under minimal load

_I have opened several other discussions/issues about my scaling problems with 
CloudStack (partially solved). Now that I have managed to simplify the 
environment, I want to start clean with a new discussion._
<br>
### My env
```
CloudStack 4.20.0.0
on Ubuntu 22.04.5 LTS (amd64)
with libvirt 8.0.0-1ubuntu7.10
OpenJDK 64-Bit Server VM (build 17.0.13+11-Ubuntu-2ubuntu122.04, mixed mode, 
sharing)
```
<br>
I configured a basic shared network; IPs come from my internal /24 subnet. The 
VRouter serves as DHCP, metadata, and DNS server, but VMs are directly 
accessible from the network and have direct outside access (via the subnet 
gateway).
I left a single host to run VMs, separate from the CloudStack management host.
The VM host can run about 20 VMs.
The network is 1 Gbps.
Primary and secondary storage are on NFS, served from another host in the same 
subnet.
<br>

### The task
I'm trying to start 20 VMs from the same qcow image and the same compute 
offering, using local disk (RAID-0 of 2 SSDs).
There is a delay of several seconds between VM deployments.
<br>

### The problem
The first 2-3 VMs start fine. Then VMs in the `starting` state begin to 
accumulate:
2 running + 1 starting
3 running + 2 starting
Eventually I get about 7 running and 13 starting, and that state persists: 
even after some 15 minutes, the starting ones are still starting.
To double-check, I looked at the virsh output and see only the running ones 
there. None of the "starting" ones have even been created.
When I tried to start more (hoping to see them in virsh), they just joined the 
pool of "starting" ones (at the moment I see about 28 in the "starting" state 
on a host that can run just 20).
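
For anyone who wants to repeat the cross-check between what CloudStack reports and what virsh shows, here is a minimal Python sketch. It is only an illustration under assumptions: the function names are hypothetical, the VM list is assumed to be a list of dicts with `state` and `instancename` keys (as an admin `listVirtualMachines` response would typically provide), and the virsh text is assumed to be the plain table printed by `virsh list --all`.

```python
from collections import Counter


def parse_virsh_list(output: str) -> dict[str, str]:
    """Map domain name -> state from the table printed by `virsh list --all`."""
    domains = {}
    for line in output.splitlines():
        # Columns are Id, Name, State; the state may contain a space ("shut off").
        parts = line.split(None, 2)
        # Skip the header row, the dashed separator, and blank lines.
        if len(parts) < 3 or parts[0] == "Id":
            continue
        _dom_id, name, state = parts
        domains[name] = state.strip()
    return domains


def compare_with_virsh(cloudstack_vms: list[dict], virsh_output: str):
    """Return (state counts as seen by CloudStack, 'Starting' VMs unknown to virsh).

    `cloudstack_vms` is assumed to hold dicts with "state" and "instancename"
    keys; this shape is an assumption, not something confirmed in this thread.
    """
    domains = parse_virsh_list(virsh_output)
    counts = Counter(vm["state"] for vm in cloudstack_vms)
    missing = sorted(
        vm["instancename"]
        for vm in cloudstack_vms
        if vm["state"] == "Starting" and vm["instancename"] not in domains
    )
    return counts, missing
```

Feeding it the VM list from the management server and the virsh output from the host gives the per-state counts plus the names of VMs that CloudStack considers "Starting" but that libvirt has never defined, which is the situation described above.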

But that's only part of the problem.
The second part is that of those 7 running VMs only 3-5 started properly; the 
rest did not get a hostname from the VR metadata server.
Sometimes I also see VMs that do not get an IP, but that's rarer. For those, 
reboot/reset does not help, but reinstalling via the UI button usually fixes 
the problem.

Logs on the agent side are quite idle. I only periodically see messages like 
this:
```
2025-02-17 18:03:09,959 WARN  [kvm.resource.LibvirtKvmAgentHook] 
(agentRequest-Handler-2:[]) (logid:402faf9c) Groovy script 
'/etc/cloudstack/agent/hooks/libvirt-vm-state-change.groovy' is not available. 
Transformations will not be applied.
2025-02-17 18:03:09,959 WARN  [kvm.resource.LibvirtKvmAgentHook] 
(agentRequest-Handler-2:[]) (logid:402faf9c) Groovy scripting engine is not 
initialized. Data transformation skipped.
```
The management log is active.
In these tests I did not notice the UI slowdown I had when more hosts were 
connected with advanced networking. Nevertheless, I was not able to properly 
start the VMs I wanted.
<br>

### My questions
- Lately I've got the impression that recent CloudStack versions are 
imperfect. It would be great if someone could share a positive experience of 
running dozens of VMs with recent versions.
- If you are happy with some older CloudStack version, please share that as 
well.
- If the default configuration is not suitable for dozens of VMs, please advise 
on config changes. If there is any document on this, I'd be glad to work 
through it.

I'm continuing to experiment and can share logs if required.
Any useful information is appreciated.

Thanks,
Alex.




GitHub link: https://github.com/apache/cloudstack/discussions/10414

