Hi Juergen,

I tried to reproduce the issue you described but wasn’t able to observe the same behavior. Here’s what I did:
* Created 3 pods, each requesting a persistent volume, using the pod.yaml and pvc.yaml examples from https://github.com/leaseweb/cloudstack-csi-driver/tree/master/examples/k8s, renaming the pod and claim names accordingly (a rough sketch of the manifests is appended after the quoted message below).
* This resulted in the worker node (VM) having disks vda, vdb, vdc, and vdd listed in its libvirt XML.
* Then I deleted one of the PVs (by first deleting the corresponding pod, then the PVC). After that, the node had vda, vdb, and vdd.
* Next, I created a new pod with a PV, and it attached as vdc and started successfully.

So in my test, CloudStack correctly reused the freed device name (vdc), and I couldn’t reproduce the inconsistent state you mentioned. I ran this on ACS 4.21.0, but I don’t think the version should affect this behavior. Could you please confirm whether there are any differences in your setup?

Regards,
Pearl
________________________________
From: Jürgen Gotteswinter <[email protected]>
Sent: October 2, 2025 4:46 AM
To: [email protected] <[email protected]>
Subject: ACS Blockvolumes, Leaseweb cloudstack-csi

Hi!

I am facing some issues with the CloudStack CSI driver (Leaseweb fork). In general it works pretty well, but when draining a Kubernetes node, which triggers a lot of detach and attach operations, something randomly goes wrong: I end up in an inconsistent state and can't attach any more devices to the affected instance.

Scenario:
* Instance A has a few block volumes, requested by the CSI driver. vda, vdb, vdc, vdd, and vde show up in the libvirt XML.
* vdd gets detached from instance A.
* Instance A now has vda, vdb, vdc, and vde in its libvirt XML.
* The CSI driver requests a new block volume for instance A and tries to attach it as vde (which is still in use) instead of using vdd, which has meanwhile become free.

From that point on, no more devices can be attached to the instance. The management server shows this:

2025-10-01 11:00:52,702 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-85:[ctx-ee10aa59, job-629270]) (logid:5018a3b3) Unexpected exception while executing org.apache.cloudstack.api.command.user.volume.AttachVolumeCmd com.cloud.utils.exception.CloudRuntimeException: Failed to attach volume pvc-xxxx-b45f-4324-a85b-xxxx to VM kubetest-1; org.libvirt.LibvirtException: XML error: target 'vde' duplicated for disk sources '/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx' and '/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx'

2025-10-01 11:00:52,702 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-85:[ctx-ee10aa59, job-629270]) (logid:5018a3b3) Complete async job-629270, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed to attach volume pvc-xxxx-b45f-4324-a85b-xxxx to VM kubetest-1; org.libvirt.LibvirtException: XML error: target 'vde' duplicated for disk sources '/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx' and '/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx'"}

If ACS tried to attach the new device as vdd (which became free), things would work, I guess.

After a shutdown/reboot of the affected VM, everything starts working again and new block devices can be attached.

We are currently on ACS 4.20.1.0 on Ubuntu 24.04.

Cheers,
Juergen
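
P.S. For reference, a minimal sketch of the kind of PVC and pod manifests used in the first step above (based on the examples/k8s directory; the resource names and the StorageClass name "cloudstack-csi" here are placeholders, so substitute whatever StorageClass your driver deployment actually defines):

# PersistentVolumeClaim backed by the CloudStack CSI driver.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc-1            # renamed per pod: test-pvc-2, test-pvc-3, ...
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: cloudstack-csi   # placeholder: use your driver's StorageClass
  resources:
    requests:
      storage: 1Gi
---
# Pod that mounts the claim, triggering a volume attach on its node.
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-1            # renamed per pod: test-pod-2, test-pod-3, ...
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: test-pvc-1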
