Hello Pearl,
thanks for looking into this. We saw this kind of problem during high-frequency
volume operations (container-based CI pipelines); the agent logs this when it
happens:
Agent log
2025-05-26 00:03:06,141 WARN [kvm.storage.KVMStorageProcessor]
(agentRequest-Handler-5:null) (logid:xxx) Failed to attach device to
i-55-xxx-VM: XML error: target 'vdf' duplicated for disk sources
'/mnt/xxx-387c-3f14-aea7-0d19104d92dd/xxx-b61b-49cf-abfb-f4d9b3db0c03' and
'/mnt/xxx-387c-3f14-aea7-0d19104d92dd/xxx-804f-471a-a7c7-ef6d14fedacf'
Management server
DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-98:[ctx-xxx,
job-xxx]) (logid:xxx) Complete async job-xxx, jobStatus: FAILED, resultCode:
530, result:
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed
to attach volume pvc-xxx-b45f-4324-a85b-a518de1516b2 to VM testnode-1;
org.libvirt.LibvirtException: XML error: target 'vde' duplicated for disk
sources '/mnt/xxx-387c-3f14-aea7-0d19104d92dd/xxx-c659-4699-8885-5e951c4af3fa'
and '/mnt/xxx-387c-3f14-aea7-0d19104d92dd/xxx-c659-4699-8885-5e951c4af3fa'"}
In such situations, vdf or vde is indeed already in use, but a lower device
name would still be free.
I was never able to reproduce this via manual operations or, for example, with
Terraform. It feels like some kind of race condition caused by parallel volume
operations. For now, we have replaced it with a different Kubernetes storage
solution.
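If it helps with reproduction: the pattern in our CI is basically many
attach/detach jobs overlapping on the same instance. A rough, untested sketch
of that kind of load, here using the "cs" Python client against the CloudStack
API (endpoint, keys, and the volume/VM IDs below are just placeholders):

import time
import concurrent.futures
from cs import CloudStack  # pip install cs

api = CloudStack(endpoint="https://acs.example.com/client/api",
                 key="API_KEY", secret="SECRET_KEY")

VM_ID = "vm-uuid-placeholder"
VOLUME_IDS = ["vol-uuid-1", "vol-uuid-2", "vol-uuid-3"]  # pre-created data disks

def wait(job):
    # poll the async job until it leaves the pending state (jobstatus 0)
    while True:
        result = api.queryAsyncJobResult(jobid=job["jobid"])
        if result.get("jobstatus", 0) != 0:
            return result
        time.sleep(1)

def cycle(volume_id):
    # attach and detach again right away, like a pod being rescheduled
    wait(api.attachVolume(id=volume_id, virtualmachineid=VM_ID))
    wait(api.detachVolume(id=volume_id))

# run the cycles in parallel so attach/detach jobs overlap on the same VM
with concurrent.futures.ThreadPoolExecutor(max_workers=len(VOLUME_IDS)) as pool:
    for _ in range(20):
        list(pool.map(cycle, VOLUME_IDS))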
Cheers,
Juergen
On 14.10.25, 19:58, "Pearl d'Silva" <[email protected]> wrote:
Hi Juergen,
I tried to reproduce the issue you described but wasn’t able to observe the
same behavior.
Here’s what I did:
* Created 3 pods, each requesting a persistent volume, using the pod.yaml and
pvc.yaml examples from
https://github.com/leaseweb/cloudstack-csi-driver/tree/master/examples/k8s,
renaming the pod and claim names accordingly.
* This resulted in the worker node (VM) having disks vda, vdb, vdc, and vdd
listed in its libvirt XML.
* Then I deleted one of the PVs (by first deleting the corresponding pod, then
the PVC). After that, the node had vda, vdb, and vdd.
* Next, I created a new pod with a PV, and it attached as vdc and started
successfully.
So in my test, CloudStack correctly reused the freed device name (vdc), and I
couldn’t reproduce the inconsistent state you mentioned.
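In case it is useful for comparing on your side, here is a small sketch of how
the disk targets can be listed on the KVM host between steps like the ones
above, using libvirt-python (the domain name below is a placeholder):

import libvirt  # pip install libvirt-python; run on the KVM host
import xml.etree.ElementTree as ET

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("i-2-123-VM")  # placeholder instance name

# parse the live domain XML and collect the <target dev="..."> of each disk
root = ET.fromstring(dom.XMLDesc(0))
targets = sorted(t.get("dev") for t in root.findall("./devices/disk/target"))
print(targets)  # e.g. ['vda', 'vdb', 'vdc', 'vdd']

conn.close()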
I ran this on ACS 4.21.0, but I don’t think the version should affect this
behavior.
Could you please confirm whether there are any differences in your setup?
Regards,
Pearl
________________________________
From: Jürgen Gotteswinter <[email protected]>
Sent: October 2, 2025 4:46 AM
To: [email protected]
Subject: ACS Blockvolumes, Leaseweb cloudstack-csi
Hi!
I am facing some issues with the CloudStack CSI driver (Leaseweb fork). In
general it works pretty well, but, for example, when draining a Kubernetes
node, which triggers a lot of detach/attach operations, something randomly
goes wrong and I end up in an inconsistent state where I can't attach any more
devices to the affected instance.
Scenario…
* Instance a has a few block volumes, requested by the CSI driver. vda, vdb,
vdc, vdd, and vde show up in the libvirt XML.
* vdd gets detached from instance a.
* Instance a now has vda, vdb, vdc, and vde in its libvirt XML.
* The CSI driver requests a new block volume for instance a, and ACS tries to
attach it as vde instead of using vdd, which has meanwhile become free.
From that point on, no more devices can be attached to the instance. The
management server shows this:
2025-10-01 11:00:52,702 ERROR [c.c.a.ApiAsyncJobDispatcher]
(API-Job-Executor-85:[ctx-ee10aa59, job-629270]) (logid:5018a3b3) Unexpected
exception while executing
org.apache.cloudstack.api.command.user.volume.AttachVolumeCmd
com.cloud.utils.exception.CloudRuntimeException: Failed to attach volume
pvc-xxxx-b45f-4324-a85b-xxxx to VM kubetest-1; org.libvirt.LibvirtException:
XML error: target 'vde' duplicated for disk sources
'/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx and
'/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx
2025-10-01 11:00:52,702 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-85:[ctx-ee10aa59, job-629270]) (logid:5018a3b3) Complete
async job-629270, jobStatus: FAILED, resultCode: 530, result:
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed
to attach volume pvc-xxxx-b45f-4324-a85b-xxxx to VM kubetest -1;
org.libvirt.LibvirtException: XML error: target 'vde' duplicated for disk
sources /mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx ' and
'/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx "}
If ACS tried to attach the new volume as vdd (which has become free), things
would work, I guess. After a shutdown/reboot of the affected VM, everything
starts working again and new block devices can be attached.
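What I would expect is basically "take the lowest vdX that is not in the live
domain XML". Just to illustrate the expectation (this is only a sketch, not how
ACS actually picks the device name):

import string

def next_free_target(in_use):
    # pick the lowest vdX device name that the domain is not using yet
    for letter in string.ascii_lowercase:
        candidate = "vd" + letter
        if candidate not in in_use:
            return candidate
    raise RuntimeError("no free vdX target left")

# with vdd detached, the next attach should land on vdd again, not vde
print(next_free_target({"vda", "vdb", "vdc", "vde"}))  # -> vdd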
We are currently on ACS 4.20.1.0 on Ubuntu 24.04.
Cheers,
Juergen