This is a 22.04 install upgraded to 24.04. I never once saw this behaviour
before the upgrade (I always use this VM; it is my development environment,
so I would have noticed).
However, I was not using Ubuntu kernels with 22.04; I was using Liquorix.
Currently I am on a stock Ubuntu 24.04 kernel.
The ten-second delay is not needed, it seems. I was just being very
cautious (or silly). I removed it.
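If a wait does turn out to be needed on some other system, a bounded poll for the device nodes might be less arbitrary than a fixed sleep. A minimal Python sketch, not tested against libvirt itself; the function name, the renderD* pattern and the timeout values are my own assumptions, not anything libvirt or systemd ships:

```python
import glob
import os
import time


def wait_for_nodes(directory="/dev/dri", pattern="renderD*",
                   timeout=10.0, interval=0.2):
    """Poll until a node matching `pattern` exists in `directory`.

    Returns True as soon as a match appears, False once `timeout`
    seconds have passed without one. All defaults are illustrative.
    """
    deadline = time.monotonic() + timeout
    while True:
        # glob returns an empty list if the directory does not exist yet
        if glob.glob(os.path.join(directory, pattern)):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Possible use from a wrapper script (hypothetical):
#   sys.exit(0 if wait_for_nodes() else 1)
```

Something like this could be run from an ExecStartPre= line in a drop-in, in place of /bin/sleep 10. It returns as soon as a node appears and gives up after the timeout, so systems without a GPU would not block forever.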
It is not passthrough; it is an Ubuntu guest using libvirt graphics, shared
memory and 3D acceleration.
The hardware is a PC with an AMD Ryzen 7 7700X and one AMD graphics card.
This is the case without the fix, when autostart failed:
journalctl -b 0 | grep -i -e "\[drm\] Initialized" -e "Starting libvirtd.service"
Jul 18 20:10:20 black kernel: [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0
Jul 18 20:10:23 black systemd[1]: Starting libvirtd.service - libvirt legacy monolithic daemon...
Jul 18 20:10:25 black kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:03:00.0 on minor 1
Jul 18 20:10:25 black kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:6d:00.0 on minor 0
A second run:
tim@black:~$ journalctl -b 0 | grep -i -e "\[drm\] Initialized" -e "Starting libvirtd.service"
Jul 18 20:19:54 black kernel: [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0
Jul 18 20:19:57 black systemd[1]: Starting libvirtd.service - libvirt legacy monolithic daemon...
Jul 18 20:19:59 black kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:03:00.0 on minor 1
Jul 18 20:19:59 black kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:6d:00.0 on minor 0
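Both runs show the same order: libvirtd starts before either amdgpu device is initialized. As a quick way to check that from the grep output, here is a small Python sketch. The function name and the regular expression are mine, and it assumes the lines arrive in chronological order, as journalctl prints them:

```python
import re

# Matches kernel lines such as
# "[drm] Initialized amdgpu 3.57.0 20150101 for 0000:03:00.0 on minor 1"
DRM_RE = re.compile(r"\[drm\] Initialized (\S+) \S+ \d+ for (\S+) on minor \d+")
LIBVIRTD_MARK = "Starting libvirtd.service"


def late_drm_devices(journal_lines):
    """Return (driver, device) pairs initialized after libvirtd began starting."""
    libvirtd_seen = False
    late = []
    for line in journal_lines:
        if LIBVIRTD_MARK in line:
            libvirtd_seen = True
            continue
        match = DRM_RE.search(line)
        if match and libvirtd_seen:
            late.append(match.groups())
    return late


# The first run from this mail, with the wrapped lines joined.
SAMPLE = [
    "Jul 18 20:10:20 black kernel: [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0",
    "Jul 18 20:10:23 black systemd[1]: Starting libvirtd.service - libvirt legacy monolithic daemon...",
    "Jul 18 20:10:25 black kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:03:00.0 on minor 1",
    "Jul 18 20:10:25 black kernel: [drm] Initialized amdgpu 3.57.0 20150101 for 0000:6d:00.0 on minor 0",
]

for driver, device in late_drm_devices(SAMPLE):
    print(f"{driver} on {device} came up after libvirtd started")
```

On a system where all DRM drivers are up before libvirtd starts, the function returns an empty list.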
DRI
tim@black:~$ ll /dev/dri
total 0
drwxr-xr-x 3 root root 140 Jul 18 20:19 ./
drwxr-xr-x 22 root root 6680 Jul 18 20:20 ../
drwxr-xr-x 2 root root 120 Jul 18 20:19 by-path/
crw-rw----+ 1 root video 226, 0 Jul 18 20:19 card0
crw-rw----+ 1 root video 226, 1 Jul 18 20:19 card1
crw-rw----+ 1 root render 226, 128 Jul 18 20:19 renderD128
crw-rw----+ 1 root render 226, 129 Jul 18 20:19 renderD129
$ systemctl list-units --all --type device | grep dri
<nothing> [same as yours]
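For reference, this is how I understand systemd would name a device unit for a given node, if udev tagged it for systemd at all (which, per the empty output above, it does not here). A simplified Python sketch of the path escaping; the function name is mine, and it does not collapse repeated slashes the way systemd-escape --path does:

```python
def device_unit_name(path):
    """Approximate the systemd device unit name for a /dev path.

    Simplified version of systemd's path escaping: "/" becomes "-",
    and anything other than ASCII alphanumerics, ":", "_" and a
    non-leading "." is written as a \\xXX escape.
    """
    trimmed = path.strip("/")
    if not trimmed:
        return "-.device"
    out = []
    for index, byte in enumerate(trimmed.encode("utf-8")):
        char = chr(byte)
        if char == "/":
            out.append("-")
        elif (char.isascii() and char.isalnum()) or char in ":_" \
                or (char == "." and index > 0):
            out.append(char)
        else:
            out.append(f"\\x{byte:02x}")
    return "".join(out) + ".device"


print(device_unit_name("/dev/dri/card0"))
print(device_unit_name("/dev/dri/by-path/pci-0000:03:00.0-card"))
```

So a unit for a single card would be dev-dri-card0.device, while dev-dri.device would correspond to the /dev/dri directory itself, which is not a device node.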
VM spec:
<domain type="kvm">
<name>ubuntu24.04</name>
<uuid>50a13bea-17be-440d-88cf-63086538e1a5</uuid>
<metadata>
<libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
<libosinfo:os id="http://ubuntu.com/ubuntu/24.04"/>
</libosinfo:libosinfo>
</metadata>
<memory unit="KiB">16777216</memory>
<currentMemory unit="KiB">16777216</currentMemory>
<memoryBacking>
<hugepages>
<page size="2048" unit="KiB"/>
</hugepages>
<source type="memfd"/>
<access mode="shared"/>
</memoryBacking>
<vcpu placement="static">16</vcpu>
<os firmware="efi">
<type arch="x86_64" machine="pc-q35-8.2">hvm</type>
<firmware>
<feature enabled="yes" name="enrolled-keys"/>
<feature enabled="yes" name="secure-boot"/>
</firmware>
<loader readonly="yes" secure="yes"
type="pflash">/usr/share/OVMF/OVMF_CODE_4M.ms.fd</loader>
<nvram
template="/usr/share/OVMF/OVMF_VARS_4M.ms.fd">/var/lib/libvirt/qemu/nvram/ubuntu24.04_VARS.fd</nvram>
<boot dev="hd"/>
</os>
<features>
<acpi/>
<apic/>
<vmport state="off"/>
<smm state="on"/>
</features>
<cpu mode="host-passthrough" check="none" migratable="on"/>
<clock offset="utc">
<timer name="rtc" tickpolicy="catchup"/>
<timer name="pit" tickpolicy="delay"/>
<timer name="hpet" present="no"/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<pm>
<suspend-to-mem enabled="no"/>
<suspend-to-disk enabled="no"/>
</pm>
<devices>
<emulator>/usr/bin/qemu-system-x86_64</emulator>
<disk type="file" device="disk">
<driver name="qemu" type="qcow2"/>
<source file="/opt/virtual_machines/ubuntu-dev-2024.qcow2"/>
<target dev="vda" bus="virtio"/>
<address type="pci" domain="0x0000" bus="0x04" slot="0x00"
function="0x0"/>
</disk>
<controller type="usb" index="0" model="qemu-xhci" ports="15">
<address type="pci" domain="0x0000" bus="0x02" slot="0x00"
function="0x0"/>
</controller>
<controller type="pci" index="0" model="pcie-root"/>
<controller type="pci" index="1" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="1" port="0x10"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02"
function="0x0" multifunction="on"/>
</controller>
<controller type="pci" index="2" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="2" port="0x11"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02"
function="0x1"/>
</controller>
<controller type="pci" index="3" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="3" port="0x12"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02"
function="0x2"/>
</controller>
<controller type="pci" index="4" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="4" port="0x13"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02"
function="0x3"/>
</controller>
<controller type="pci" index="5" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="5" port="0x14"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02"
function="0x4"/>
</controller>
<controller type="pci" index="6" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="6" port="0x15"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02"
function="0x5"/>
</controller>
<controller type="pci" index="7" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="7" port="0x16"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02"
function="0x6"/>
</controller>
<controller type="pci" index="8" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="8" port="0x17"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x02"
function="0x7"/>
</controller>
<controller type="pci" index="9" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="9" port="0x18"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x03"
function="0x0" multifunction="on"/>
</controller>
<controller type="pci" index="10" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="10" port="0x19"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x03"
function="0x1"/>
</controller>
<controller type="pci" index="11" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="11" port="0x1a"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x03"
function="0x2"/>
</controller>
<controller type="pci" index="12" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="12" port="0x1b"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x03"
function="0x3"/>
</controller>
<controller type="pci" index="13" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="13" port="0x1c"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x03"
function="0x4"/>
</controller>
<controller type="pci" index="14" model="pcie-root-port">
<model name="pcie-root-port"/>
<target chassis="14" port="0x1d"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x03"
function="0x5"/>
</controller>
<controller type="sata" index="0">
<address type="pci" domain="0x0000" bus="0x00" slot="0x1f"
function="0x2"/>
</controller>
<controller type="virtio-serial" index="0">
<address type="pci" domain="0x0000" bus="0x03" slot="0x00"
function="0x0"/>
</controller>
<filesystem type="mount" accessmode="passthrough">
<driver type="virtiofs"/>
<source dir="/home/tim/Downloads"/>
<target dir="host_downloads"/>
<address type="pci" domain="0x0000" bus="0x07" slot="0x00"
function="0x0"/>
</filesystem>
<interface type="network">
<mac address="52:54:00:9c:2a:df"/>
<source network="br0"/>
<model type="virtio"/>
<address type="pci" domain="0x0000" bus="0x01" slot="0x00"
function="0x0"/>
</interface>
<serial type="pty">
<target type="isa-serial" port="0">
<model name="isa-serial"/>
</target>
</serial>
<console type="pty">
<target type="serial" port="0"/>
</console>
<channel type="unix">
<target type="virtio" name="org.qemu.guest_agent.0"/>
<address type="virtio-serial" controller="0" bus="0" port="1"/>
</channel>
<channel type="spicevmc">
<target type="virtio" name="com.redhat.spice.0"/>
<address type="virtio-serial" controller="0" bus="0" port="2"/>
</channel>
<input type="tablet" bus="usb">
<address type="usb" bus="0" port="1"/>
</input>
<input type="mouse" bus="ps2"/>
<input type="keyboard" bus="ps2"/>
<tpm model="tpm-crb">
<backend type="emulator" version="2.0"/>
</tpm>
<graphics type="spice">
<listen type="none"/>
<image compression="off"/>
<gl enable="yes"/>
</graphics>
<sound model="ich9">
<address type="pci" domain="0x0000" bus="0x00" slot="0x1b"
function="0x0"/>
</sound>
<audio id="1" type="spice"/>
<video>
<model type="virtio" heads="1" primary="yes">
<acceleration accel3d="yes"/>
</model>
<address type="pci" domain="0x0000" bus="0x00" slot="0x01"
function="0x0"/>
</video>
<redirdev bus="usb" type="spicevmc">
<address type="usb" bus="0" port="2"/>
</redirdev>
<redirdev bus="usb" type="spicevmc">
<address type="usb" bus="0" port="3"/>
</redirdev>
<watchdog model="itco" action="reset"/>
<memballoon model="virtio">
<address type="pci" domain="0x0000" bus="0x05" slot="0x00"
function="0x0"/>
</memballoon>
<rng model="virtio">
<backend model="random">/dev/urandom</backend>
<address type="pci" domain="0x0000" bus="0x06" slot="0x00"
function="0x0"/>
</rng>
</devices>
</domain>
On Thu, 18 Jul 2024 at 17:05, Christian Ehrhardt <
[email protected]> wrote:
> Hi again Tim,
> this is an even more interesting case, as it depends on kernel module
> loading, which will be different between systems. It will differ in:
> 1. Need: this only matters if GPU passthrough or GL render nodes are
> configured
> 2. Availability: systems might have more than one GPU, so which one do we
> wait on? Or they have none, and this path is never populated
>
> So this is an interesting effort for a sysadmin: I have configured this
> and have the need (#1), and I know the device is available (#2), so how do
> I tune my system to cope with that?
>
> But at the same time it is tricky for a generic fix not to negatively
> affect those that do not need it, or have systems which never have it.
>
> I have no solution yet, but some thoughts and questions.
>
>
> ## 1 Ordering
>
> What you describe is, thankfully and AFAIK, the uncommon case: you are
> describing that libvirt starts and then starts the guest before the kernel
> has initialized DRI/DRM.
>
> On the systems I could quickly check that was not the case, I have
> always seen things like:
>
> $ journalctl -b 0 | grep -i -e "\[drm\] Initialized" -e "Starting libvirtd.service"
> Mai 21 08:11:03 Keschdeichel kernel: [drm] Initialized simpledrm 1.0.0 20200625 for simple-framebuffer.0 on minor 0
> Mai 21 08:11:05 Keschdeichel kernel: [drm] Initialized i915 1.6.0 20230929 for 0000:00:02.0 on minor 1
> Mai 21 08:11:05 Keschdeichel kernel: [drm] Initialized evdi 1.14.4 20240410 for evdi.0 on minor 0
> Mai 21 08:11:05 Keschdeichel kernel: [drm] Initialized evdi 1.14.4 20240410 for evdi.1 on minor 2
> Mai 21 08:11:05 Keschdeichel kernel: [drm] Initialized evdi 1.14.4 20240410 for evdi.2 on minor 3
> Mai 21 08:11:05 Keschdeichel kernel: [drm] Initialized evdi 1.14.4 20240410 for evdi.3 on minor 4
> Mai 21 08:11:10 Keschdeichel systemd[1]: Starting libvirtd.service - libvirt legacy monolithic daemon...
>
>
> I'm just curious how to map this failure onto your case.
> How does this look for you?
> I assume that without your change you get libvirt starting before (all)
> drm is initialized - is that what you see?
>
>
> ## 2 Waiting
>
> [Service]
> ExecStartPre=/bin/sleep 10
>
>
> I understand that this fixes your issue; keep it until we've found
> something better.
> But that cannot be a generic change we'd apply.
> It is 10 seconds for you, maybe someone needs 12 or 123456 - we could
> never set this right and would slow down everyone who does not even need
> it, for nothing.
>
> Yet, as a documented workaround it is nice.
>
>
> ## 3 The Unit
>
> [Unit]
> After=multi-user.target dev-dri.device
>
> This looks much better, but AFAICS it should do nothing.
> My system has dri entries
>
> $ ll /dev/dri/
> total 0
> drwxr-xr-x 3 root root 180 Mai 21 08:11 ./
> drwxr-xr-x 22 root root 6060 Jul 18 02:07 ../
> drwxr-xr-x 2 root root 160 Mai 21 08:11 by-path/
> crw-rw----+ 1 root video 226, 0 Mai 24 03:46 card0
> crw-rw----+ 1 root video 226, 1 Mai 24 03:46 card1
> crw-rw----+ 1 root video 226, 2 Mai 24 03:46 card2
> crw-rw----+ 1 root video 226, 3 Mai 24 03:46 card3
> crw-rw----+ 1 root video 226, 4 Mai 24 03:46 card4
> crw-rw----+ 1 root render 226, 128 Mai 24 03:46 renderD128
>
> But there are no such devices defined
>
> $ systemctl list-units --all --type device | grep dri
> <nothing>
>
> That is because the closest to a matching udev rule is
> $ cat /lib/udev/rules.d/60-drm.rules
> # do not edit this file, it will be overwritten on update
>
> ACTION!="remove", SUBSYSTEM=="drm", SUBSYSTEMS=="pci|usb|platform", IMPORT{builtin}="path_id"
>
> # by-path
> KERNEL=="card*", ENV{ID_PATH}=="?*", SYMLINK+="dri/by-path/$env{ID_PATH}-card"
> KERNEL=="card*", ENV{ID_PATH_WITH_USB_REVISION}=="?*", SYMLINK+="dri/by-path/$env{ID_PATH_WITH_USB_REVISION}-card"
> KERNEL=="controlD*", ENV{ID_PATH}=="?*", SYMLINK+="dri/by-path/$env{ID_PATH}-control"
> KERNEL=="controlD*", ENV{ID_PATH_WITH_USB_REVISION}=="?*", SYMLINK+="dri/by-path/$env{ID_PATH_WITH_USB_REVISION}-control"
> KERNEL=="renderD*", ENV{ID_PATH}=="?*", SYMLINK+="dri/by-path/$env{ID_PATH}-render"
> KERNEL=="renderD*", ENV{ID_PATH_WITH_USB_REVISION}=="?*", SYMLINK+="dri/by-path/$env{ID_PATH_WITH_USB_REVISION}-render"
>
> And there is nothing that would create dev-dri.device.
> The rules would need a TAG+="systemd" entry, similar to the discussions
> in [1][2].
> But even if we had that, AFAIU there would be dev-dri-card0.device but
> not just dev-dri.device.
> And even if we had that, a particular guest config might need
> dev-dri-card7.device, which initializes even later - so just waiting on
> DRI as a whole sounds neat.
>
> Yet on the other hand, just like selecting the right timeout on the
> sleep - this is dependent on the system config, hardware and needs :-/
>
>
> ## 4 What now?
>
> I'd appreciate if you could:
> - share the initialization order your system really has
> - explain if you found something about dev-dri.device that I do not know
> yet
> - explain in more detail what the hardware in general and the GPUs you wait on are
> - share how you configured the guest (is it passthrough, is it gl
> rendering ...?)
>
> That would help us to understand the situation a bit better, if we are
> lucky it might even allow to recreate it.
>
> Still, my expectation is that with all that we will eventually need to
> reach out to the project at [3] or [4].
> Due to the "at system config file time we'd never know if and what we need
> to wait on" problem described above, I'd expect that this might need
> something completely different, like libvirt internally (it knows what a
> guest needs, as it knows its definition) waiting for that if and only as
> needed.
> Or I might be overlooking something obvious which the subject matter
> experts there might know and share.
>
> But for now, I'd appreciate if you could help my curiosity by providing
> the above.
>
> [1]: https://github.com/systemd/systemd/issues/25408
> [2]: https://github.com/joukewitteveen/xlogin/issues/15
> [3]: https://gitlab.com/libvirt/libvirt
> [4]: https://listman.redhat.com/mailman/listinfo/libvir-list
>
> P.S. @Sergio who usually looks at these - sorry for stealing those
> interesting cases in my morning, bad timezone luck for you :-P. Do not
> be concerned, you might deal with this long enough down the road :-)
>
> ** Bug watch added: github.com/systemd/systemd/issues #25408
> https://github.com/systemd/systemd/issues/25408
>
> ** Bug watch added: github.com/joukewitteveen/xlogin/issues #15
> https://github.com/joukewitteveen/xlogin/issues/15
>
> ** Changed in: libvirt (Ubuntu)
> Status: New => Incomplete
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/2073442
>
> Title:
> Failed to autostart VM: cannot open directory '/dev/dri': No such file
> or directory
>
> To manage notifications about this bug go to:
>
> https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/2073442/+subscriptions
>
>
--
Tim Richardson