Hi Juergen,
I tried to reproduce the issue you described but wasn’t able to observe the 
same behavior.
Here’s what I did:

  *   Created 3 pods, each requesting a persistent volume, using the pod.yaml and pvc.yaml examples from https://github.com/leaseweb/cloudstack-csi-driver/tree/master/examples/k8s, renaming the pod and claim names accordingly (roughly the commands sketched below).
  *   This resulted in the worker node (VM) having disks vda, vdb, vdc, and vdd listed in its libvirt XML.
  *   Then I deleted one of the PVs (by first deleting the corresponding pod, then the PVC). After that, the node had vda, vdb, and vdd.
  *   Next, I created a new pod with a PV, and it attached as vdc and started successfully.
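
For reference, here are roughly the commands involved; the manifest file names, pod/claim names, and the libvirt domain name are placeholders from my test setup:

  # Create three pods, each bound to its own PVC (manifests adapted from
  # the examples linked above, with the pod and claim names changed)
  kubectl apply -f pvc-1.yaml -f pod-1.yaml
  kubectl apply -f pvc-2.yaml -f pod-2.yaml
  kubectl apply -f pvc-3.yaml -f pod-3.yaml

  # On the KVM host: list the disk targets in the worker VM's libvirt XML
  virsh dumpxml k8s-worker-1 | grep 'target dev'

  # Delete one of the volumes: pod first, then the PVC
  kubectl delete pod pod-2
  kubectl delete pvc pvc-2

  # Attach a new volume and re-check the targets; the freed name
  # (vdc in my test) should be reused
  kubectl apply -f pvc-4.yaml -f pod-4.yaml
  virsh dumpxml k8s-worker-1 | grep 'target dev'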

So in my test, CloudStack correctly reused the freed device name (vdc), and I 
couldn’t reproduce the inconsistent state you mentioned.
I ran this on ACS 4.21.0, but I don’t think the version should affect this 
behavior.
Could you please confirm whether there are any differences in your setup?

Regards,
Pearl


________________________________
From: Jürgen Gotteswinter <[email protected]>
Sent: October 2, 2025 4:46 AM
To: [email protected] <[email protected]>
Subject: ACS Blockvolumes, Leaseweb cloudstack-csi

Hi!

I am facing some issues with the CloudStack CSI driver (Leaseweb fork). In 
general it works pretty well, but, for example, when draining a Kubernetes 
node, which triggers a lot of detach/attach operations, something randomly 
goes wrong and I end up in an inconsistent state where I can't attach devices 
to the affected instance anymore.

Scenario…


  *   Instance a has a few block volumes, requested by the CSI driver: vda, 
vdb, vdc, vdd, and vde show up in the libvirt XML.
  *   vdd gets detached from instance a.
  *   Instance a now has vda, vdb, vdc, and vde in its libvirt XML.
  *   The CSI driver requests a new block volume for instance a, and CloudStack 
tries to attach it as vde instead of using vdd, which has meanwhile become 
free (see the check below).
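
A quick way to see this on the hypervisor (the domain name here matches the VM from the logs below):

  # List the disk target names currently present in the domain XML
  virsh dumpxml kubetest-1 | grep 'target dev'
  # before the detach: vda vdb vdc vdd vde
  # after detaching vdd: vda vdb vdc vde  -> vdd is free, yet the next
  # attach is attempted as vde again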

From that point on, no more devices can be attached to the instance. The 
management server shows this:

2025-10-01 11:00:52,702 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-85:[ctx-ee10aa59, job-629270]) (logid:5018a3b3) Unexpected exception while executing org.apache.cloudstack.api.command.user.volume.AttachVolumeCmd com.cloud.utils.exception.CloudRuntimeException: Failed to attach volume pvc-xxxx-b45f-4324-a85b-xxxx to VM kubetest-1; org.libvirt.LibvirtException: XML error: target 'vde' duplicated for disk sources '/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx' and '/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx'

2025-10-01 11:00:52,702 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-85:[ctx-ee10aa59, job-629270]) (logid:5018a3b3) Complete async job-629270, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed to attach volume pvc-xxxx-b45f-4324-a85b-xxxx to VM kubetest-1; org.libvirt.LibvirtException: XML error: target 'vde' duplicated for disk sources '/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx' and '/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx'"}

If ACS tried to attach the new volume as vdd (which has become free), things 
would work, I guess. After a shutdown/reboot of the affected VM, everything 
starts working again and new block devices can be attached.
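
For what it's worth, the device slots CloudStack has recorded for the instance can be checked with CloudMonkey (assuming cmk is configured; the VM id is a placeholder):

  # Show which device IDs CloudStack believes are in use on the VM
  cmk list volumes virtualmachineid=<vm-uuid> filter=name,deviceid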

We are currently on ACS 4.20.1.0 on Ubuntu 24.04.

Cheers,

Juergen
