Verification done on focal-proposed, following comments 23, 24, 25, 26.
This comment includes a few key snippets from each test/comment.
---
Environment
---
LXD virtual machine
lxc launch --vm ubuntu:focal lp2059272-focal
lxc exec lp2059272-focal -- su - ubuntu
Enable proposed & debug symbols
cat <<EOF | sudo tee /etc/apt/sources.list.d/proposed.list
deb http://archive.ubuntu.com/ubuntu focal-proposed main universe
deb http://ddebs.ubuntu.com focal-proposed main universe
EOF
cat <<EOF | sudo tee /etc/apt/preferences.d/proposed
Package: *
Pin: release a=focal-proposed
Pin-Priority: 400
EOF
sudo apt install --yes --no-install-recommends gdb qemu-system-x86 \
  ubuntu-dbgsym-keyring
sudo apt update
sudo apt install --yes --no-install-recommends -t focal-proposed \
  libvirt{0,-daemon{,-driver-qemu,-system}}{,-dbgsym} libvirt-clients
$ apt-cache policy libvirt-daemon-driver-qemu
libvirt-daemon-driver-qemu:
  Installed: 6.0.0-0ubuntu8.20
  Candidate: 6.0.0-0ubuntu8.20
  Version table:
 *** 6.0.0-0ubuntu8.20 400
        400 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     6.0.0-0ubuntu8.19 500
        500 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages
     6.0.0-0ubuntu8 500
        500 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages
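The version check above can also be scripted. A minimal sketch, assuming the usual `Installed:` line in the `apt-cache policy` output; the helper names `installed_version` and `check_installed` are hypothetical and not part of the original test plan:

```shell
# Print the "Installed:" version from `apt-cache policy <pkg>` output on stdin.
installed_version() {
    awk '$1 == "Installed:" { print $2; exit }'
}

# Hypothetical wrapper: fail if the package is not at the expected version.
check_installed() {
    pkg=$1
    expected=$2
    actual=$(apt-cache policy "$pkg" | installed_version)
    if [ "$actual" != "$expected" ]; then
        echo "$pkg: installed '$actual', expected '$expected'" >&2
        return 1
    fi
}
```

For example, `check_installed libvirt-daemon-driver-qemu 6.0.0-0ubuntu8.20` would return non-zero if the package from focal-proposed were not the installed one.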
newgrp libvirt # or logout/login
Libvirtd debug logging
cat <<-EOF | sudo tee -a /etc/libvirt/libvirtd.conf
log_filters="1:qemu 1:libvirt"
log_outputs="3:syslog:libvirtd 1:file:/var/log/libvirt/libvirtd-debug.log"
EOF
---
Steps with test packages on Focal (normal restarts)
---
<...>
for SLEEP in $(seq 0.1 0.1 2.0); do
<...>
All VMs are still managed by libvirt:
$ virsh list
Id Name State
----------------------------
1 test-vm-1 running
2 test-vm-2 running
3 test-vm-3 running
4 test-vm-4 running
5 test-vm-5 running
6 test-vm-6 running
7 test-vm-7 running
8 test-vm-8 running
9 test-vm-9 running
10 test-vm-10 running
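When repeating the restart loop many times, counting the running domains may be quicker than reading the table. A rough sketch, assuming the `virsh list` layout shown above (state in the last column); `count_running` is a hypothetical helper name:

```shell
# Count rows of `virsh list` output (stdin) whose last column is "running".
# The header and separator rows do not match and are ignored.
count_running() {
    awk '$NF == "running" { n++ } END { print n + 0 }'
}
```

With the ten test VMs above, `[ "$(virsh list | count_running)" -eq 10 ]` should succeed after each restart.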
---
Steps with test packages on Focal (shutdown-on-init)
---
Scenario 1) Shutdown wins race against XML update (i.e., shutdown happens first)
<...>
Now let the qemuProcessReconnect thread continue. It will not update the XML
file, because 'quit' is set (i.e., a shutdown is in progress):
(gdb) t 20
(gdb) p ((virNetDaemonPtr)anyobj)->quit
$2 = true
$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml
(gdb) c &
$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml
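The before/after `ls -l` comparison only shows the timestamp to the minute. A small sketch that compares the modification time in seconds instead, assuming GNU `stat` (as shipped on Ubuntu); `mtime_of` and `unchanged_since` are hypothetical helper names:

```shell
# Print a file's modification time as seconds since the epoch (GNU stat).
mtime_of() {
    stat -c %Y "$1"
}

# Succeed only if the file's mtime still equals a previously recorded value.
unchanged_since() {
    [ "$(mtime_of "$1")" = "$2" ]
}
```

One possible use: record `before=$(mtime_of /run/libvirt/qemu/test-vm.xml)` ahead of `c &` in gdb, then run `unchanged_since /run/libvirt/qemu/test-vm.xml "$before"` afterwards.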
<...>
$ sudo grep 'Leaving the update of .* domain status XML' \
  /var/log/libvirt/libvirtd-debug.log
2024-04-24 12:08:40.054+0000: 3770: info : qemuProcessReconnect:8157 : Leaving the update of 'test-vm' domain status XML for the next initialization (shutdown detected on this initialization).
<...>
$ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' \
  /run/libvirt/qemu/test-vm.xml
<domstatus state='running' reason='booted' pid='3726'>
<monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
<domain type='qemu' id='1'>
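The same grep can be turned into a pass/fail check on the status XML. This is only a text-level sketch (it does not parse the XML, and libvirtd's own parser on the next start remains the real test); `status_xml_has_monitor` is a hypothetical helper name:

```shell
# Succeed if the status file contains both a <domstatus> element and a
# <monitor path=...> element, the two pieces checked by hand above.
status_xml_has_monitor() {
    grep -q '<domstatus ' "$1" && grep -q '<monitor path=' "$1"
}
```

Run it as root (for example via `sudo sh -c`) since the status file is only readable by root.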
Scenario 2) Shutdown loses race against XML update (i.e., update happens first)
<...>
Instead, let the qemuProcessReconnect thread take the lock and update
the XML file, but stop it before it unlocks:
<...>
$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10189 Apr 24 12:02 /run/libvirt/qemu/test-vm.xml
(gdb) b virObjectUnlock thread 20 if anyobj == $ptr
(gdb) c
$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10189 Apr 24 12:14 /run/libvirt/qemu/test-vm.xml
<...>
$ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' \
  /run/libvirt/qemu/test-vm.xml
<domstatus state='running' reason='booted' pid='3726'>
<monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
<domain type='qemu' id='1'>
Scenario 3) Shutdown happens during QEMU monitor calls (i.e., calls don't finish)
<...>
The XML was not updated, as expected:
$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10189 Apr 24 12:14 /run/libvirt/qemu/test-vm.xml
$ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' \
  /run/libvirt/qemu/test-vm.xml
<domstatus state='running' reason='booted' pid='3726'>
<monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
<domain type='qemu' id='1'>
<...>
Now, the next time libvirtd starts, it correctly parses that XML:
$ sudo systemctl start libvirtd.service
$ journalctl -b -u libvirtd.service | grep -A1 error
$
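A narrower check than `grep error` is to look only for the two messages quoted in the bug title. A sketch under that assumption; `no_load_errors` is a hypothetical helper name:

```shell
# Read log text on stdin. Print any line containing one of the two messages
# from the bug title and return non-zero if at least one was found.
no_load_errors() {
    if grep -e 'no monitor path' -e 'Failed to load config for domain'; then
        return 1
    fi
    return 0
}
```

For instance, `journalctl -b -u libvirtd.service | no_load_errors` should print nothing and succeed on the fixed packages.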
And libvirt is aware of the domain, and can manage it:
$ virsh list
Id Name State
-------------------------
1 test-vm running
$ virsh destroy test-vm
Domain test-vm destroyed
$ virsh undefine test-vm
Domain test-vm has been undefined
---
Steps with test packages on Focal (shutdown-on-runtime)
---
<...>
Check the formatter/options again; it is *STILL* referenced, not 0x0 anymore:
(gdb) t 20
(gdb) p xmlopt.privateData.format
$3 = (virDomainXMLPrivateDataFormatFunc) 0x7fd08c3437c0 <qemuDomainObjPrivateXMLFormat>
(gdb) p/x xmlopt.parent
$4 = {u = {dummy_align1 = 0x1cafe0026, dummy_align2 = 0x1cafe0026, s = {magic = 0xcafe0026, refs = 0x1}}, klass = 0x7fd080043170}
Let the save function continue, and libvirt finishes shutting down:
<...>
Check the VM status XML *after*:
$ ls -l /run/libvirt/qemu/test-vm.xml
-rw------- 1 root root 10251 Apr 24 12:28 /run/libvirt/qemu/test-vm.xml
$ sudo grep -e '<domstatus' -e '<domain' -e 'monitor path' \
  /run/libvirt/qemu/test-vm.xml
<domstatus state='running' reason='booted' pid='4055'>
<monitor path='/var/lib/libvirt/qemu/domain-1-test-vm/monitor.sock' type='unix'/>
<domain type='qemu' id='1'>
Now, the next time libvirtd starts, it correctly parses that XML:
$ sudo systemctl start libvirtd.service
$ journalctl -b -u libvirtd.service | grep -A1 error
$
And libvirt is aware of the domain, and can manage it:
$ virsh list
Id Name State
-------------------------
1 test-vm running
$ virsh destroy test-vm
Domain test-vm destroyed
$ virsh undefine test-vm
Domain test-vm has been undefined
** Tags removed: verification-needed verification-needed-focal
** Tags added: verification-done verification-done-focal
https://bugs.launchpad.net/bugs/2059272
Title:
libvirt domain is not listed/managed after libvirt restart with
messages "internal error: no monitor path" and "Failed to load config
for domain"