Hello Pearl,
thanks for looking into this. We saw this kind of problem during high-frequency
volume operations (container-based CI pipelines); the agent logs this when it
happens:
Agent log
2025-05-26 00:03:06,141 WARN [kvm.storage.KVMStorageProcessor]
(agentRequest-Handler-5:null) (logid:xxx) Failed to attach device to
i-55-xxx-VM: XML error: target 'vdf' duplicated for disk sources
'/mnt/xxx-387c-3f14-aea7-0d19104d92dd/xxx-b61b-49cf-abfb-f4d9b3db0c03' and
'/mnt/xxx-387c-3f14-aea7-0d19104d92dd/xxx-804f-471a-a7c7-ef6d14fedacf'
Management server
DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-98:[ctx-xxx,
job-xxx]) (logid:xxx) Complete async job-xxx, jobStatus: FAILED, resultCode:
530, result:
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed
to attach volume pvc-xxx-b45f-4324-a85b-a518de1516b2 to VM testnode-1;
org.libvirt.LibvirtException: XML error: target 'vde' duplicated for disk
sources '/mnt/xxx-387c-3f14-aea7-0d19104d92dd/xxx-c659-4699-8885-5e951c4af3fa'
and '/mnt/xxx-387c-3f14-aea7-0d19104d92dd/xxx-c659-4699-8885-5e951c4af3fa'"}
In such situations, vdf or vde is indeed already in use, but a lower device
name would still be free.
I was never able to reproduce this via manual operations or, for example, with
Terraform. It feels like some kind of race condition caused by parallel volume
operations. For now, we have replaced it with a different Kubernetes storage
solution.
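If it helps with reproduction: the pattern in our CI is basically many
attach/detach jobs overlapping on the same instance. A rough, untested sketch
of that kind of load, here using the "cs" Python client against the CloudStack
API (endpoint, keys, and the volume/VM IDs below are just placeholders):

import time
import concurrent.futures
from cs import CloudStack  # pip install cs

api = CloudStack(endpoint="https://acs.example.com/client/api",
                 key="API_KEY", secret="SECRET_KEY")

VM_ID = "vm-uuid-placeholder"
VOLUME_IDS = ["vol-uuid-1", "vol-uuid-2", "vol-uuid-3"]  # pre-created data disks

def wait(job):
    # poll the async job until it leaves the pending state (jobstatus 0)
    while True:
        result = api.queryAsyncJobResult(jobid=job["jobid"])
        if result.get("jobstatus", 0) != 0:
            return result
        time.sleep(1)

def cycle(volume_id):
    # attach and detach again right away, like a pod being rescheduled
    wait(api.attachVolume(id=volume_id, virtualmachineid=VM_ID))
    wait(api.detachVolume(id=volume_id))

# run the cycles in parallel so attach/detach jobs overlap on the same VM
with concurrent.futures.ThreadPoolExecutor(max_workers=len(VOLUME_IDS)) as pool:
    for _ in range(20):
        list(pool.map(cycle, VOLUME_IDS))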
Cheers,
Juergen
On 14.10.25, 19:58, "Pearl d'Silva" <[email protected]> wrote:
Hi Juergen,
I tried to reproduce the issue you described but wasn’t able to observe the
same behavior.
Here’s what I did:
* Created 3 pods, each requesting a persistent volume, using the pod.yaml and
pvc.yaml examples from
https://github.com/leaseweb/cloudstack-csi-driver/tree/master/examples/k8s,
renaming the pod and claim names accordingly.
* This resulted in the worker node (VM) having disks vda, vdb, vdc, and vdd
listed in its libvirt XML.
* Then I deleted one of the PVs (by first deleting the corresponding pod, then
the PVC). After that, the node had vda, vdb, and vdd.
* Next, I created a new pod with a PV, and it attached as vdc and started
successfully.
So in my test, CloudStack correctly reused the freed device name (vdc), and I
couldn’t reproduce the inconsistent state you mentioned.
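In case it is useful for comparing on your side, here is a small sketch of how
the disk targets can be listed on the KVM host between steps like the ones
above, using libvirt-python (the domain name below is a placeholder):

import libvirt  # pip install libvirt-python; run on the KVM host
import xml.etree.ElementTree as ET

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("i-2-123-VM")  # placeholder instance name

# parse the live domain XML and collect the <target dev="..."> of each disk
root = ET.fromstring(dom.XMLDesc(0))
targets = sorted(t.get("dev") for t in root.findall("./devices/disk/target"))
print(targets)  # e.g. ['vda', 'vdb', 'vdc', 'vdd']

conn.close()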
I ran this on ACS 4.21.0, but I don’t think the version should affect this
behavior.
Could you please confirm whether there are any differences in your setup?
Regards,
Pearl
________________________________
From: Jürgen Gotteswinter <[email protected]>
Sent: October 2, 2025 4:46 AM
To: [email protected]
Subject: ACS Blockvolumes, Leaseweb cloudstack-csi
Hi!
I am facing some issues with the CloudStack CSI driver (Leaseweb fork). In
general it works pretty well, but, for example, when draining a Kubernetes
node, which triggers a lot of detach/attach operations, something randomly
goes wrong and I end up in an inconsistent state where I can't attach any more
devices to the affected instance.
Scenario…
* Instance a has a few block volumes, requested by the CSI driver. vda, vdb,
vdc, vdd, and vde show up in the libvirt XML.
* vdd gets detached from instance a.
* Instance a now has vda, vdb, vdc, and vde in its libvirt XML.
* The CSI driver requests a new block volume for instance a, and ACS tries to
attach it as vde instead of using vdd, which has meanwhile become free.
From that point on, no more devices can be attached to the instance. The
management server shows this:
2025-10-01 11:00:52,702 ERROR [c.c.a.ApiAsyncJobDispatcher]
(API-Job-Executor-85:[ctx-ee10aa59, job-629270]) (logid:5018a3b3) Unexpected
exception while executing
org.apache.cloudstack.api.command.user.volume.AttachVolumeCmd
com.cloud.utils.exception.CloudRuntimeException: Failed to attach volume
pvc-xxxx-b45f-4324-a85b-xxxx to VM kubetest-1; org.libvirt.LibvirtException:
XML error: target 'vde' duplicated for disk sources
'/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx and
'/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx
2025-10-01 11:00:52,702 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(API-Job-Executor-85:[ctx-ee10aa59, job-629270]) (logid:5018a3b3) Complete
async job-629270, jobStatus: FAILED, resultCode: 530, result:
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed
to attach volume pvc-xxxx-b45f-4324-a85b-xxxx to VM kubetest -1;
org.libvirt.LibvirtException: XML error: target 'vde' duplicated for disk
sources /mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx ' and
'/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx "}
If ACS tried to attach the new volume as vdd (which has become free), things
would work, I guess. After a shutdown/reboot of the affected VM, everything
starts working again and new block devices can be attached.
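What I would expect is basically "take the lowest vdX that is not in the live
domain XML". Just to illustrate the expectation (this is only a sketch, not how
ACS actually picks the device name):

import string

def next_free_target(in_use):
    # pick the lowest vdX device name that the domain is not using yet
    for letter in string.ascii_lowercase:
        candidate = "vd" + letter
        if candidate not in in_use:
            return candidate
    raise RuntimeError("no free vdX target left")

# with vdd detached, the next attach should land on vdd again, not vde
print(next_free_target({"vda", "vdb", "vdc", "vde"}))  # -> vdd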
We are currently on ACS 4.20.1.0 on Ubuntu 24.04.
Cheers,
Juergen