GitHub user akrasnov-drv closed a discussion: CloudStack starts misbehaving under minimal load
_I opened several other discussions/issues about my scale problems with CloudStack (partially solved). I have now managed to simplify the environment and want to start clean with a new discussion._

### My env

```
CloudStack 4.20.0.0 on Ubuntu 22.04.5 LTS (amd64) with libvirt 8.0.0-1ubuntu7.10
OpenJDK 64-Bit Server VM (build 17.0.13+11-Ubuntu-2ubuntu122.04, mixed mode, sharing)
```

I configured a basic shared network; IPs come from my internal /24 subnet. The VRouter serves as DHCP, metadata, and DNS server, but VMs are directly reachable from the network and have direct outside access (via the subnet gateway).
I left a single host to run VMs, separate from the CloudStack management host. The VM host can run about 20 VMs.
The network is 1 Gbps.
Primary and secondary storage are on NFS, served from another host in the same subnet.

### The task

I'm trying to start 20 VMs from the same qcow image and the same compute offering, using local disk (RAID-0 of 2 SSDs).
There is a delay of several seconds between VM deployments.

### The problem

The first 2-3 VMs start fine. Then VMs in the `starting` state begin to accumulate:

- 2 running + 1 starting
- 3 running + 2 starting

Finally I end up with about 7 running and 13 starting, and that state persists: even after some 15 minutes, the starting ones are still starting.
To make sure, I checked the virsh output and see only the running ones there. None of the "starting" VMs has even been created.
When I tried to start more (hoping to see them in virsh), they just joined the pool of "starting" ones (at the moment I see about 28 in the "starting" state on a host that can run just 20).

But that is only part of the problem. The second part is that, of those 7 running VMs, just 3-5 started properly; the rest did not get a hostname from the VR metadata server.
Sometimes I also see VMs that do not get an IP, but that is rarer. For those, reboot/reset does not help, but reinstalling via the UI button usually fixes the problem.

Logs on the agent side are quite idle.
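The cross-check described above (VMs that CloudStack reports as `Starting` but that never show up in virsh) can be sketched as a small script. This is only an illustration under stated assumptions: the helper names `parse_virsh_list` and `starting_without_domain` are hypothetical, the instance names and sample text are made up, and the CloudStack-side states are assumed to have been collected separately (for example from the UI or the API) into a plain dictionary.

```python
from typing import Dict, List


def parse_virsh_list(output: str) -> Dict[str, str]:
    """Parse the text printed by `virsh list --all` into {domain name: state}."""
    domains: Dict[str, str] = {}
    for line in output.splitlines():
        # Data rows have three columns: Id, Name, State ("shut off" has a space,
        # so split at most twice and keep the remainder as the state).
        parts = line.split(None, 2)
        if len(parts) < 3 or parts[0] == "Id":
            continue  # skips the header, the dashed separator, and blank lines
        _dom_id, name, state = parts
        domains[name] = state.strip()
    return domains


def starting_without_domain(cloudstack_states: Dict[str, str],
                            domains: Dict[str, str]) -> List[str]:
    """Return instances CloudStack shows as Starting that libvirt does not know."""
    return sorted(
        name
        for name, state in cloudstack_states.items()
        if state == "Starting" and name not in domains
    )


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the environment described here.
    sample_virsh = (
        " Id   Name        State\n"
        "---------------------------\n"
        " 1    i-2-10-VM   running\n"
        " 2    i-2-11-VM   running\n"
    )
    reported = {
        "i-2-10-VM": "Running",
        "i-2-11-VM": "Running",
        "i-2-12-VM": "Starting",
    }
    print(starting_without_domain(reported, parse_virsh_list(sample_virsh)))
```

Feeding it the real `virsh list --all` text from the VM host, together with the states shown by CloudStack, gives the list of instances that are stuck before any libvirt domain was defined.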
The only agent-side messages that show up periodically look like this:

```
2025-02-17 18:03:09,959 WARN  [kvm.resource.LibvirtKvmAgentHook] (agentRequest-Handler-2:[]) (logid:402faf9c) Groovy script '/etc/cloudstack/agent/hooks/libvirt-vm-state-change.groovy' is not available. Transformations will not be applied.
2025-02-17 18:03:09,959 WARN  [kvm.resource.LibvirtKvmAgentHook] (agentRequest-Handler-2:[]) (logid:402faf9c) Groovy scripting engine is not initialized. Data transformation skipped.
```

The management log is active.
In these tests I did not notice the UI slowdown I had seen when more hosts were connected with advanced networking. Nevertheless, I was not able to properly start the VMs I wanted.

### My questions

- Lately I've got the impression that the latest CloudStack versions are imperfect. It would be great if someone could share a positive experience of running dozens of VMs with recent versions.
- If you are happy with some older CloudStack version, please share that too.
- If the default configuration is not suitable for dozens of VMs, please advise on config changes. If there is any document on this, I'd be glad to work through it.

I am continuing to experiment and can share logs if required.
All useful information is appreciated.

Thanks,
Alex.

GitHub link: https://github.com/apache/cloudstack/discussions/10414

----
This is an automatically sent email for users@cloudstack.apache.org.