StepBee commented on issue #11425: URL: https://github.com/apache/cloudstack/issues/11425#issuecomment-3206906765
We actually experience two errors in this scenario and cannot live-migrate VMs on L2 networks with ConfigDrive capability at all, not even on the second try.

- CloudStack 4.20.1 (Management Server and KVM Agents)
- KVM based
- QEMU emulator version 8.2.2
- virsh 10.0.0
- Ubuntu 24.04.2 LTS
- Ceph Primary Storage
- NFS Secondary Storage

VM Details:

```
# virsh domblklist i-2-3196-VM
 Target   Source
------------------------------------------------------------------------
 vda      rbd_cs_md/7d39e71d-9d74-43e8-b773-199c21e833ea
 hdc      -
 hdd      /mnt/31282614-0b57-3a61-8c76-bc5cc028a19d/i-2-3196-VM.iso
```

```
# virsh dumpxml i-2-3196-VM
...
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <source file='/mnt/31282614-0b57-3a61-8c76-bc5cc028a19d/i-2-3196-VM.iso' index='1'/>
  <backingStore/>
  <target dev='hdd' bus='ide'/>
  <readonly/>
  <alias name='ide0-1-1'/>
  <address type='drive' controller='0' bus='1' target='0' unit='1'/>
</disk>
...
```

**1st Error**

This occurs only if the VM has user-data provided (and the generated ISO file is attached).

Agent Log from the KVM Hypervisor:

```
2025-08-20 15:12:55,164 DEBUG [kvm.resource.LibvirtVMDef] (AgentRequest-Handler-5:[]) (logid:) Using informed label [hdc] for volume [null].
2025-08-20 15:12:55,164 DEBUG [kvm.resource.LibvirtVMDef] (AgentRequest-Handler-5:[]) (logid:) Using informed label [hdd] for volume [/mnt/31282614-0b57-3a61-8c76-bc5cc028a19d/i-2-3196-VM.iso].
2025-08-20 15:12:55,164 DEBUG [kvm.resource.LibvirtComputingResource] (AgentRequest-Handler-5:[]) (logid:) Detaching ConfigDrive ISO of the VM i-2-3196-VM, at path /mnt/31282614-0b57-3a61-8c76-bc5cc028a19d/i-2-3196-VM.iso
2025-08-20 15:12:55,164 DEBUG [kvm.resource.LibvirtVMDef] (AgentRequest-Handler-5:[]) (logid:) Using device ID [4] to define the label [hdd] for volume [null].
2025-08-20 15:12:55,166 DEBUG [kvm.resource.LibvirtComputingResource] (AgentRequest-Handler-5:[]) (logid:) Attaching device: <disk device='cdrom' type='file'>
<driver name='qemu' type='raw' />
<source file=''/>
<target dev='hdd' bus='ide'/>
</disk>
2025-08-20 15:12:57,258 DEBUG [kvm.resource.LibvirtVMDef] (AgentRequest-Handler-5:[]) (logid:) Using informed label [hdc] for volume [null].
2025-08-20 15:12:57,258 DEBUG [kvm.resource.LibvirtVMDef] (AgentRequest-Handler-5:[]) (logid:) Using informed label [hdd] for volume [].
2025-08-20 15:12:57,258 DEBUG [kvm.resource.LibvirtComputingResource] (AgentRequest-Handler-5:[]) (logid:) Unable to clean up disk with null path (perhaps empty cdrom drive):<disk device='cdrom' type='file'>
<driver name='qemu' type='raw' />
<source file=''/>
<target dev='hdd' bus='ide'/>
</disk>
2025-08-20 15:12:57,258 DEBUG [kvm.resource.LibvirtComputingResource] (AgentRequest-Handler-5:[]) (logid:) Attaching ConfigDrive ISO of the VM i-2-3196-VM, at path /mnt/31282614-0b57-3a61-8c76-bc5cc028a19d/i-2-3196-VM.iso
2025-08-20 15:12:57,258 DEBUG [kvm.storage.KVMStoragePoolManager] (AgentRequest-Handler-5:[]) (logid:) Get storage pool by uri: /mnt/31282614-0b57-3a61-8c76-bc5cc028a19d
2025-08-20 15:12:57,261 WARN [cloud.agent.Agent] (AgentRequest-Handler-5:[]) (logid:) Caught: java.lang.NullPointerException: Cannot invoke "com.cloud.storage.Storage$StoragePoolType.toString()" because "type" is null
    at com.cloud.hypervisor.kvm.storage.LibvirtStorageAdaptor.createStoragePool(LibvirtStorageAdaptor.java:736)
    at com.cloud.hypervisor.kvm.storage.KVMStoragePoolManager.createStoragePool(KVMStoragePoolManager.java:393)
    at com.cloud.hypervisor.kvm.storage.KVMStoragePoolManager.getStoragePoolByURI(KVMStoragePoolManager.java:339)
    at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.attachOrDetachISO(LibvirtComputingResource.java:3513)
    at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.detachAndAttachConfigDriveISO(LibvirtComputingResource.java:3492)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.execute(LibvirtMigrateCommandWrapper.java:232)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.execute(LibvirtMigrateCommandWrapper.java:88)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
    at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1958)
    at com.cloud.agent.Agent.processRequest(Agent.java:779)
    at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1194)
    at com.cloud.utils.nio.Task.call(Task.java:83)
    at com.cloud.utils.nio.Task.call(Task.java:29)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
2025-08-20 15:12:57,263 DEBUG [cloud.agent.Agent] (AgentRequest-Handler-5:[]) (logid:) Seq 3986-3832844757869032974: { Ans: , MgmtId: 33830401474176, via: 3986, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":"false","details":"java.lang.NullPointerException: Cannot invoke "com.cloud.storage.Storage$StoragePoolType.toString()" because "type" is null
    at com.cloud.hypervisor.kvm.storage.LibvirtStorageAdaptor.createStoragePool(LibvirtStorageAdaptor.java:736)
    at com.cloud.hypervisor.kvm.storage.KVMStoragePoolManager.createStoragePool(KVMStoragePoolManager.java:393)
    at com.cloud.hypervisor.kvm.storage.KVMStoragePoolManager.getStoragePoolByURI(KVMStoragePoolManager.java:339)
    at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.attachOrDetachISO(LibvirtComputingResource.java:3513)
    at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.detachAndAttachConfigDriveISO(LibvirtComputingResource.java:3492)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.execute(LibvirtMigrateCommandWrapper.java:232)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.execute(LibvirtMigrateCommandWrapper.java:88)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
    at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1958)
    at com.cloud.agent.Agent.processRequest(Agent.java:779)
    at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1194)
    at com.cloud.utils.nio.Task.call(Task.java:83)
    at com.cloud.utils.nio.Task.call(Task.java:29)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
","wait":"0","bypassHostMaintenance":"false"}}] }
```

After the migration attempt, the ISO is unmounted:

```
# virsh domblklist i-2-3196-VM
 Target   Source
------------------------------------------------------------------------
 vda      rbd_cs_md/7d39e71d-9d74-43e8-b773-199c21e833ea
 hdc      -
 hdd      -
```

**2nd Error**

This occurs if the migration is executed a second time, or if user-data is empty (and no ISO is attached).

```
2025-08-20 15:18:03,040 DEBUG [cloud.agent.Agent] (AgentRequest-Handler-3:[]) (logid:) Processing command: com.cloud.agent.api.MigrateCommand
2025-08-20 15:18:03,074 DEBUG [resource.wrapper.LibvirtMigrateCommandWrapper] (AgentRequest-Handler-3:[]) (logid:) Trying to migrate VM [i-2-3196-VM] to destination host: [qemu+tls://10.145.36.37/system].
2025-08-20 15:18:03,074 DEBUG [agent.properties.AgentPropertiesFileHandler] (AgentRequest-Handler-3:[]) (logid:) Property [hypervisor.uri] has empty or null value. Using default value [null].
2025-08-20 15:18:03,074 DEBUG [kvm.resource.LibvirtConnection] (AgentRequest-Handler-3:[]) (logid:) Looking for libvirtd connection at: qemu:///system
2025-08-20 15:18:03,093 DEBUG [kvm.resource.LibvirtVMDef] (AgentRequest-Handler-3:[]) (logid:) Using informed label [hdc] for volume [null].
2025-08-20 15:18:03,093 DEBUG [kvm.resource.LibvirtVMDef] (AgentRequest-Handler-3:[]) (logid:) Using informed label [hdd] for volume [].
2025-08-20 15:18:03,102 DEBUG [kvm.resource.LibvirtVMDef] (AgentRequest-Handler-3:[]) (logid:) Using informed label [hdc] for volume [null].
2025-08-20 15:18:03,102 DEBUG [kvm.resource.LibvirtVMDef] (AgentRequest-Handler-3:[]) (logid:) Using informed label [hdd] for volume [].
2025-08-20 15:18:03,102 DEBUG [resource.wrapper.LibvirtMigrateCommandWrapper] (AgentRequest-Handler-3:[]) (logid:) Found domain with name [i-2-3196-VM]. Starting VM migration to host [qemu+tls://<kvm-destination-host-ip>/system].
2025-08-20 15:18:03,110 DEBUG [resource.wrapper.LibvirtMigrateCommandWrapper] (AgentRequest-Handler-3:[]) (logid:) VM [i-2-3196-VM] with XML configuration [<domain type='kvm'>
  <name>i-2-3196-VM</name>
  <uuid>193fc957-4206-4feb-a551-67741459cdb7</uuid>
  <description>Ubuntu 20.04 LTS</description>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <cputune>
    <shares>435</shares>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>Apache Software Foundation</entry>
      <entry name='product'>CloudStack KVM Hypervisor</entry>
      <entry name='serial'>193fc957-4206-4feb-a551-67741459cdb7</entry>
      <entry name='uuid'>193fc957-4206-4feb-a551-67741459cdb7</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-i440fx-8.2'>hvm</type>
    <boot dev='cdrom'/>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-model' check='partial'>
    <topology sockets='2' dies='1' cores='1' threads='1'/>
  </cpu>
  <clock offset='utc'>
    <timer name='kvmclock'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <auth username='cloudstack-poc'>
        <secret type='ceph' uuid='<secret-ceph-uuid>'/>
      </auth>
      <source protocol='rbd' name='rbd_cs_md/7d39e71d-9d74-43e8-b773-199c21e833ea'>
        <host name='<ceph-host-ip>'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <serial>7d39e71d9d7443e8b773</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdd' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' target='0' unit='1'/>
    </disk>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='02:04:01:61:00:01'/>
      <source bridge='brbond0-777'/>
      <bandwidth>
        <inbound average='1280000' peak='1280000'/>
        <outbound average='1280000' peak='1280000'/>
      </bandwidth>
      <model type='virtio'/>
      <link state='up'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/i-2-3196-VM.org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='<consolproxy-ip>' passwd='<vnc-password>'>
      <listen type='address' address='<consolproxy-ip>'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <watchdog model='i6300esb' action='none'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </watchdog>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
</domain>
] will be migrated to host [10.145.36.37].
2025-08-20 15:18:03,111 DEBUG [resource.wrapper.LibvirtMigrateCommandWrapper] (AgentRequest-Handler-3:[]) (logid:) Replaced the VNC IP address [<graphics type='vnc' port='-1' autoport='yes' listen='<consolproxy-ip>' passwd='<vnc-password>'>
      <listen type='address' address='<consolproxy-ip>'/>
    </graphics>] with [<graphics type='vnc' port='-1' autoport='yes' listen='<consolproxy-ip>' passwd='<vnc-password>'>
      <listen type='address' address='<consolproxy-ip>'/>
    </graphics>] in VM [i-2-3196-VM].
2025-08-20 15:18:03,111 DEBUG [kvm.storage.KVMStoragePoolManager] (AgentRequest-Handler-3:[]) (logid:) Get storage pool by uri: nfs://<nfs-ip>/cloudstack-poc/configdrive
2025-08-20 15:18:03,111 INFO [kvm.storage.LibvirtStorageAdaptor] (AgentRequest-Handler-3:[]) (logid:) Attempting to create storage pool 31282614-0b57-3a61-8c76-bc5cc028a19d (NetworkFilesystem) in libvirt
2025-08-20 15:18:03,111 DEBUG [kvm.resource.LibvirtConnection] (AgentRequest-Handler-3:[]) (logid:) Looking for libvirtd connection at: qemu:///system
2025-08-20 15:18:03,118 INFO [kvm.storage.LibvirtStorageAdaptor] (AgentRequest-Handler-3:[]) (logid:) Found existing defined storage pool 31282614-0b57-3a61-8c76-bc5cc028a19d, using it.
2025-08-20 15:18:03,121 DEBUG [utils.script.Script] (AgentRequest-Handler-3:[]) (logid:) Executing command [/bin/bash -c mountpoint -q /mnt/31282614-0b57-3a61-8c76-bc5cc028a19d ].
2025-08-20 15:18:03,129 DEBUG [utils.script.Script] (AgentRequest-Handler-3:[]) (logid:) Successfully executed process [2226180] for command [/bin/bash -c mountpoint -q /mnt/31282614-0b57-3a61-8c76-bc5cc028a19d ].
2025-08-20 15:18:03,129 INFO [kvm.storage.LibvirtStorageAdaptor] (AgentRequest-Handler-3:[]) (logid:) Trying to fetch storage pool 31282614-0b57-3a61-8c76-bc5cc028a19d from libvirt
2025-08-20 15:18:03,129 DEBUG [kvm.resource.LibvirtConnection] (AgentRequest-Handler-3:[]) (logid:) Looking for libvirtd connection at: qemu:///system
2025-08-20 15:18:03,169 DEBUG [resource.wrapper.LibvirtMigrateCommandWrapper] (AgentRequest-Handler-3:[]) (logid:) Editing mount path of iso from null to /mnt/31282614-0b57-3a61-8c76-bc5cc028a19d/i-2-3196-VM.iso
2025-08-20 15:18:03,173 WARN [cloud.agent.Agent] (AgentRequest-Handler-3:[]) (logid:) Caught: java.lang.NullPointerException: Cannot invoke "org.w3c.dom.Node.getNodeValue()" because "sourceNodeAttribute" is null
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.findSourceNode(LibvirtMigrateCommandWrapper.java:902)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.findDiskNode(LibvirtMigrateCommandWrapper.java:886)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.replaceDiskSourceFile(LibvirtMigrateCommandWrapper.java:873)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.execute(LibvirtMigrateCommandWrapper.java:174)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.execute(LibvirtMigrateCommandWrapper.java:88)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
    at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1958)
    at com.cloud.agent.Agent.processRequest(Agent.java:779)
    at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1194)
    at com.cloud.utils.nio.Task.call(Task.java:83)
    at com.cloud.utils.nio.Task.call(Task.java:29)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
2025-08-20 15:18:03,174 DEBUG [cloud.agent.Agent] (AgentRequest-Handler-3:[]) (logid:) Seq 3986-3832844757869032987: { Ans: , MgmtId: 33830401474176, via: 3986, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":"false","details":"java.lang.NullPointerException: Cannot invoke "org.w3c.dom.Node.getNodeValue()" because "sourceNodeAttribute" is null
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.findSourceNode(LibvirtMigrateCommandWrapper.java:902)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.findDiskNode(LibvirtMigrateCommandWrapper.java:886)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.replaceDiskSourceFile(LibvirtMigrateCommandWrapper.java:873)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.execute(LibvirtMigrateCommandWrapper.java:174)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtMigrateCommandWrapper.execute(LibvirtMigrateCommandWrapper.java:88)
    at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
    at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1958)
    at com.cloud.agent.Agent.processRequest(Agent.java:779)
    at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1194)
    at com.cloud.utils.nio.Task.call(Task.java:83)
    at com.cloud.utils.nio.Task.call(Task.java:29)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
","wait":"0","bypassHostMaintenance":"false"}}] }
```

I have to admit, I am not sure whether the two errors are related or separate.

The first error appears to originate in this area: [https://github.com/apache/cloudstack/blob/c61a5eb430a8ced2b10d7cec9948a8704c2c52ac/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/storage/KVMStoragePoolManager.java#L327](https://github.com/apache/cloudstack/blob/c61a5eb430a8ced2b10d7cec9948a8704c2c52ac/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/storage/KVMStoragePoolManager.java#L327). There, the `protocol` variable may never be set in the following `if` condition, so it keeps its initial value of `null`. That `null` is then handed to the next function calls, which seem to tolerate it, until it hits the `.toString()` at [https://github.com/apache/cloudstack/blob/c61a5eb430a8ced2b10d7cec9948a8704c2c52ac/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/storage/LibvirtStorageAdaptor.java#L736](https://github.com/apache/cloudstack/blob/c61a5eb430a8ced2b10d7cec9948a8704c2c52ac/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/storage/LibvirtStorageAdaptor.java#L736).

For the second error, I haven't fully checked the details yet.
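
To make the hypothesis about the first error concrete, here is a minimal, self-contained Java sketch. It is not CloudStack code: `PoolTypeSketch`, `PoolType`, `resolvePoolType` and `describePool` are hypothetical names, the scheme-to-type mapping is an assumption, and `192.0.2.10` is a placeholder address. It only illustrates that `java.net.URI.getScheme()` returns `null` for a plain path such as the `/mnt/...` value logged before the first NPE, while an `nfs://` URI like the one logged in the second attempt does carry a scheme. It also shows how an explicit check could fail with a readable message instead of an NPE at `toString()`.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Optional;

class PoolTypeSketch {
    // Hypothetical stand-in for a storage pool type enum; not the CloudStack class.
    enum PoolType { NetworkFilesystem, Filesystem }

    // Derives a pool type from the URI scheme (assumed mapping).
    // Returns empty when the URI has no scheme or an unrecognized one.
    static Optional<PoolType> resolvePoolType(String uri) {
        final String scheme;
        try {
            scheme = new URI(uri).getScheme();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Malformed storage URI: " + uri, e);
        }
        if (scheme == null) {
            // A plain path such as /mnt/<uuid> parses as a relative URI without a scheme.
            return Optional.empty();
        }
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "nfs":
                return Optional.of(PoolType.NetworkFilesystem);
            case "filesystem":
                return Optional.of(PoolType.Filesystem);
            default:
                return Optional.empty();
        }
    }

    // Fails early with a descriptive message instead of calling toString() on null.
    static String describePool(String uri) {
        PoolType type = resolvePoolType(uri).orElseThrow(
            () -> new IllegalArgumentException("Cannot determine storage pool type from URI: " + uri));
        return type.toString();
    }

    public static void main(String[] args) {
        // URI with a scheme, similar in shape to the one in the second log.
        System.out.println(describePool("nfs://192.0.2.10/cloudstack-poc/configdrive"));
        try {
            // Plain mount path, as logged right before the first NPE.
            describePool("/mnt/31282614-0b57-3a61-8c76-bc5cc028a19d");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether the real fix belongs in the caller (passing a proper URI or pool type) or in a null check further down is a question for the maintainers; the sketch only demonstrates the failure mode.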