Steps to reproduce on Jammy
---
Stop libvirt systemd units
sudo systemctl stop 'libvirtd*'
Start libvirt in GDB
sudo gdb \
-iex 'set confirm off' \
-iex 'set pagination off' \
-iex 'set debuginfod enabled on' \
-iex 'set debuginfod urls https://debuginfod.ubuntu.com' \
-ex 'set non-stop on' \
-ex 'handle SIGTERM nostop noprint pass' \
-ex 'add-symbol-file /usr/sbin/libvirtd' \
-ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt.so.0' \
-ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt-qemu.so.0' \
-ex 'add-symbol-file /usr/lib/x86_64-linux-gnu/libvirt/connection-driver/libvirt_driver_qemu.so' \
/usr/sbin/libvirtd
Add breakpoints for qemu driver cleanup and device deleted event
b qemuStateCleanup
b processDeviceDeletedEvent
run
Start a test VM with a USB mouse device
cat <<-EOF >test-vm.xml
<domain type='qemu'>
<name>test-vm</name>
<os>
<type>hvm</type>
</os>
<memory unit='MiB'>32</memory>
<vcpu>1</vcpu>
<devices>
<input type='mouse' bus='usb'/>
</devices>
</domain>
EOF
virsh define test-vm.xml
virsh start test-vm
$ virsh list
Id Name State
-------------------------
1 test-vm running
Delete the USB mouse device
DEVICE_ID=$(virsh qemu-monitor-command test-vm --hmp 'info qtree' |
grep 'dev: usb-mouse' | cut -d'"' -f2)
virsh qemu-monitor-command test-vm --hmp "device_del $DEVICE_ID"
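The pipeline above depends on `info qtree` printing the device as a line of the form `dev: usb-mouse, id "input0"`. A minimal sketch of just that parsing step, run against a hard-coded sample rather than a live monitor; the sample text and the function names `sample_qtree` and `extract_usb_mouse_id` are illustrative assumptions, and only the `input0` alias is taken from the GDB output in these steps:

```shell
#!/bin/sh
# Sample shaped like 'info qtree' output; illustrative, not captured from a real VM.
sample_qtree() {
cat <<'EOF'
bus: usb.0
  type usb-bus
  dev: usb-mouse, id "input0"
EOF
}

# Print the quoted ID from the first 'dev: usb-mouse' line
# (same grep/cut approach as the command above).
extract_usb_mouse_id() {
    grep 'dev: usb-mouse' | head -n 1 | cut -d'"' -f2
}

DEVICE_ID=$(sample_qtree | extract_usb_mouse_id)
echo "$DEVICE_ID"
```

If `DEVICE_ID` comes back empty on a real VM, the `device_del` command would be sent without an ID, so it may be worth checking the variable before running it.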
Back to GDB
Thread 25 "qemu-event" hit Breakpoint 2, 0x00007f6179ed20a7 in
processDeviceDeletedEvent (devAlias=<optimized out>, vm=0x7f61842f1020,
driver=0x7f6184035e20) at ../../src/qemu/qemu_driver.c:3536
Add breakpoint to domain status XML save, and continue the thread above
b virDomainObjSave
t 25
c
Thread 25 "qemu-event" hit Breakpoint 3, virDomainObjSave
(obj=0x7f61842f1020, xmlopt=0x7f6184028010, statusDir=0x7f6184035460
"/run/libvirt/qemu") at ../../src/conf/domain_conf.c:28879
Check the backtrace of the domain status XML save function; it is reached
from the device deleted event
(gdb) bt
#0 virDomainObjSave (obj=0x7f61842f1020, xmlopt=0x7f6184028010,
statusDir=0x7f6184035460 "/run/libvirt/qemu") at
../../src/conf/domain_conf.c:28879
#1 0x00007f6179eb68c3 in qemuDomainObjSaveStatus
(driver=0x7f6184035e20, obj=0x7f61842f1020) at ../../src/qemu/qemu_domain.c:5801
#2 0x00007f6179ed2159 in processDeviceDeletedEvent
(devAlias=0x7f617c0073e0 "input0", vm=0x7f61842f1020, driver=0x7f6184035e20) at
../../src/qemu/qemu_driver.c:3557
#3 qemuProcessEventHandler (data=0x7f617c0072b0,
opaque=0x7f6184035e20) at ../../src/qemu/qemu_driver.c:4184
#4 0x00007f61974fc983 in virThreadPoolWorker (opaque=<optimized out>)
at ../../src/util/virthreadpool.c:164
#5 0x00007f61974fb4d9 in virThreadHelper (data=<optimized out>) at
../../src/util/virthread.c:241
#6 0x00007f6196e64ac3 in start_thread (arg=<optimized out>) at
./nptl/pthread_create.c:442
#7 0x00007f6196ef6850 in clone3 () at
../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Leave the thread at this point
Let's trigger the shutdown path
First, increase the shutdown timer (the default 30 seconds is too short
to step through by hand; use 30 minutes)
(gdb) b virEventAddTimeout
$ sudo kill $(pidof libvirtd)
Thread 1 "libvirtd" hit Breakpoint 4, virEventAddTimeout
(timeout=30000, cb=0x7f61975bbbc0 <virNetDaemonFinishTimer>,
opaque=0x55aec684a020, ff=0x0) at ../../src/util/virevent.c:148
t 1
set $rdi = 30 * 60 * 1000
(gdb) i r $rdi
rdi 0x1b7740 1800000
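The value written to `$rdi` is 30 minutes expressed in milliseconds, the same unit as the original `timeout=30000` (30 seconds). A small sketch of the arithmetic, printing the hex and decimal forms in the order `i r $rdi` shows them; the variable name `TIMEOUT_MS` is only for illustration:

```shell
#!/bin/sh
# 30 minutes in milliseconds: minutes * seconds-per-minute * ms-per-second.
TIMEOUT_MS=$((30 * 60 * 1000))

# Print hex and decimal, matching the register dump layout.
printf '0x%x %d\n' "$TIMEOUT_MS" "$TIMEOUT_MS"
```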
Now, skip the qemu driver shutdown wait path, to force the (unexpected)
scenario that allows the race condition:
b qemuStateShutdownWait
c
Thread 26 "daemon-shutdown" hit Breakpoint 5,
qemuStateShutdownWait () at ../../src/qemu/qemu_driver.c:1055
t 26
ret
c
Thread 1 "libvirtd" hit Breakpoint 1, qemuStateCleanup () at
../../src/qemu/qemu_driver.c:1070
Check there are 2 threads: cleanup and domain status XML save
(gdb) i th
Id Target Id Frame
1 Thread 0x7f6193934ac0 (LWP 2544) "libvirtd" qemuStateCleanup
() at ../../src/qemu/qemu_driver.c:1070
18 Thread 0x7f616a7fc640 (LWP 2563) "gmain" (running)
19 Thread 0x7f6169ffb640 (LWP 2564) "gdbus" (running)
20 Thread 0x7f61697fa640 (LWP 2565) "udev-event" (running)
24 Thread 0x7f616affd640 (LWP 2641) "vm-test-vm" (running)
25 Thread 0x7f61687f8640 (LWP 2660) "qemu-event" virDomainObjSave
(obj=0x7f61842f1020, xmlopt=0x7f6184028010, statusDir=0x7f6184035460
"/run/libvirt/qemu") at ../../src/conf/domain_conf.c:28879
Confirm the qemu driver's domain XML formatter/options object is still
set/referenced:
t 25
(gdb) p xmlopt.privateData.format
$1 = (virDomainXMLPrivateDataFormatFunc) 0x7f6179eb1da0
<qemuDomainObjPrivateXMLFormat>
(gdb) p xmlopt.parent.parent_instance
$2 = {g_type_instance = {g_class = 0x7f6184053290}, ref_count = 1,
qdata = 0x0}
Let the cleanup function and shutdown path finish
t 1
c &
Check the formatter/options again; it is *NO* longer referenced:
(gdb) p xmlopt.privateData.format
$3 = (virDomainXMLPrivateDataFormatFunc) 0x7f6179eb1da0
<qemuDomainObjPrivateXMLFormat>
(gdb) p xmlopt.parent.parent_instance
$4 = {g_type_instance = {g_class = 0x0}, ref_count = 0, qdata = 0x0}
Unlike in Focal, in Jammy the object data is _not_ zeroed on the last
unreference, but it may still be overwritten, as this really is
a use-after-free (another thread might get and reuse that memory).
So, let's simulate that.
set xmlopt.privateData.format = 0
(gdb) p xmlopt.privateData.format
$5 = (virDomainXMLPrivateDataFormatFunc) 0x0
Check the VM status XML *before* the save function finishes:
$ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
<domstatus state='running' reason='booted' pid='2638'>
<monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
<domain type='qemu' id='1'>
Let the save function continue, so that libvirt finishes shutting down:
(gdb) c
Continuing.
...
[Inferior 1 (process 2544) exited normally]
Check the VM status XML *after*:
$ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' /run/libvirt/qemu/test-vm.xml
<domstatus state='running' reason='booted' pid='2638'>
<domain type='qemu' id='1'>
It no longer has the 'monitor path' tag/field.
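The before/after comparison can be turned into a yes/no check. The sketch below is an illustration only: `has_monitor_path` is a hypothetical helper name, and the two temporary files simply reproduce the grep output shown in these steps rather than a full status XML.

```shell
#!/bin/sh
# Exit status 0 when the file still contains a <monitor path=...> element.
has_monitor_path() {
    grep -q '<monitor path=' "$1"
}

# Throwaway files mirroring the "before" and "after" grep output.
tmpdir=$(mktemp -d)
printf '%s\n' \
    "<domstatus state='running' reason='booted' pid='2638'>" \
    "<monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>" \
    "<domain type='qemu' id='1'>" > "$tmpdir/before.xml"
printf '%s\n' \
    "<domstatus state='running' reason='booted' pid='2638'>" \
    "<domain type='qemu' id='1'>" > "$tmpdir/after.xml"

for f in before after; do
    if has_monitor_path "$tmpdir/$f.xml"; then
        echo "$f: monitor path present"
    else
        echo "$f: monitor path MISSING"
    fi
done
rm -rf "$tmpdir"
```

On an affected host the same function would be pointed at /run/libvirt/qemu/test-vm.xml, which the steps here read with sudo.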
Now, the next time libvirtd starts, it fails to parse that XML:
$ sudo systemctl start libvirtd.service
$ journalctl -b -u libvirtd.service | tail
...
... libvirtd[2789]: internal error: no monitor path
... libvirtd[2789]: Failed to load config for domain 'test-vm'
Now libvirt does not see the domain as running, and cannot manage it:
$ virsh list
Id Name State
--------------------
$ virsh list --all
Id Name State
--------------------------
- test-vm shut off
Even though it is still running:
$ pgrep -af qemu-system-x86_64 | cut -d, -f1
2638 /usr/bin/qemu-system-x86_64 -name guest=test-vm,
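To spot this mismatch without reading the process list by eye, the guest name can be pulled out of the QEMU command line and compared with what virsh reports. A rough sketch, assuming the `-name guest=NAME,...` form shown above; `guest_names` is a hypothetical helper, and the input line is the one from this report rather than live `pgrep` output:

```shell
#!/bin/sh
# Print the guest name from lines containing '-name guest=NAME,...'.
guest_names() {
    sed -n 's/.*-name guest=\([^, ]*\).*/\1/p'
}

# Sample line copied from the pgrep output in this report.
echo '2638 /usr/bin/qemu-system-x86_64 -name guest=test-vm,' | guest_names
```

Any name printed here that is absent from `virsh list --name` would indicate a running QEMU process that libvirt is no longer tracking.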
--
https://bugs.launchpad.net/bugs/2059272
Title:
libvirt domain is not listed/managed after libvirt restart with
messages "internal error: no monitor path" and "Failed to load config
for domain"