Subject: ovmf: Potential regression: ovmf package update from Debian bookworm to trixie breaks AMD-SEV
Package: ovmf
X-Debbugs-Cc: [email protected]
Version: 2025.02-6
Severity: normal

After upgrading from bookworm to trixie (ovmf package upgraded from 2022.11-6+deb12u2 to 2025.02-6), my SEV-encrypted VMs can no longer boot. The OVMF bootloader hangs, with the kvm process at 100% CPU and nothing printed on the console. Nothing changes until the VM is destroyed manually. Other VMs are not affected. If I disable SEV by just removing the launchSecurity section from the VM config, the VM boots successfully.

After more testing, I narrowed the root cause down to the ovmf package, and more precisely determined that the regression was introduced between 2024.02-2 (last working version) and 2024.05-1 (first broken one).

Steps to reproduce:

- Start with a bookworm or trixie system. Initially, version [2024.02-2](https://snapshot.debian.org/archive/debian/20240604T203040Z/pool/main/e/edk2/ovmf_2024.02-2_all.deb) of the `ovmf` package is installed on the system.

```
# dpkg -i ovmf_2024.02-2_all.deb
```

- Create a SEV-encrypted VM. The following is a minimal reproducing config, based on [this example](https://github.com/AMDESE/AMDSEV/blob/master/xmls/sample-sev.xml):

```
# cat v-testsev.xml
<domain type='kvm'>
  <name>v-testsev</name>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <memoryBacking>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-q35-9.2'>hvm</type>
    <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE_4M.ms.fd</loader>
    <nvram template='/usr/share/OVMF/OVMF_VARS_4M.ms.fd'>/var/lib/libvirt/qemu/nvram/v-testsev_VARS.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
  </features>
  <cpu mode='host-passthrough' check='none' migratable='on'>
    <cache mode='passthrough'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <controller type='usb' index='0' model='none'/>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <audio id='1' type='none'/>
    <memballoon model='virtio'/>
    <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
    </rng>
  </devices>
  <launchSecurity type='sev'>
    <policy>0x0003</policy>
    <cbitpos>47</cbitpos>
    <reducedPhysBits>1</reducedPhysBits>
  </launchSecurity>
</domain>
# virsh define v-testsev.xml
```

- Start the VM:

```
# virsh start --console v-testsev
Domain 'v-testsev' started
Connected to domain 'v-testsev'
Escape character is ^] (Ctrl + ])
BdsDxe: No bootable option or device was found.
BdsDxe: Press any key to enter the Boot Manager Menu.
...
Standard PC (Q35 + ICH9, 2009) pc-q35-9.2 2.00 GHz 2024.02-2 2048 MB RAM
...
```

The loader starts and successfully boots into the setup utility, as expected.

- Destroy the VM, delete the NVRAM varfile, and upgrade ovmf to [2024.05-1](https://snapshot.debian.org/archive/debian/20240604T203040Z/pool/main/e/edk2/ovmf_2024.05-1_all.deb):

```
# rm /var/lib/libvirt/qemu/nvram/v-testsev_VARS.fd
# dpkg -i ovmf_2024.05-1_all.deb
```

- Try starting the VM again:

```
# virsh start --console v-testsev
Domain 'v-testsev' started
Connected to domain 'v-testsev'
Escape character is ^] (Ctrl + ])
<hangs>

# top
...
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  21343 libvirt+  20   0 2497576   2.1g  60444 S 100.3   1.7   0:28.85 kvm
...
# strace -p 21343,21351,21354,21355
strace: Process 21343 attached
strace: Process 21351 attached
strace: Process 21354 attached
strace: Process 21355 attached
[pid 21355] ioctl(18, KVM_RUN <unfinished ...>
[pid 21354] ppoll([{fd=15, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17, events=POLLIN}], 3, NULL, NULL, 8 <unfinished ...>
[pid 21351] futex(0x5557bd5c84a8, FUTEX_WAIT, 4294967295, NULL <unfinished ...>
[pid 21343] ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=78, events=POLLIN}], 5, {tv_sec=27702, tv_nsec=581671146}, NULL, 8
```

The VM remains unresponsive: nothing appears on the console and the kvm process stays at 100% CPU. The issue is reproduced.

- Destroy the VM, remove the launchSecurity section from the config, and restart it:

```
# virsh destroy v-testsev
Domain 'v-testsev' destroyed
# virsh edit v-testsev
<remove launchSecurity section>
# virsh start --console v-testsev
Domain 'v-testsev' started
Connected to domain 'v-testsev'
Escape character is ^] (Ctrl + ])
BdsDxe: No bootable option or device was found.
BdsDxe: Press any key to enter the Boot Manager Menu.
...
Standard PC (Q35 + ICH9, 2009) pc-q35-9.2 2.00 GHz 2025.02-6 2048 MB RAM
...
```

The loader starts successfully and boots into the menu. This shows that the issue only happens when SEV is configured.

-- System Information:
Debian Release: 13.0
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.1.0-37-amd64 (SMP w/16 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

ovmf depends on no packages.

ovmf recommends no packages.

Versions of packages ovmf suggests:
ii  qemu-system-x86  1:10.0.0+ds-2

-- no debconf information
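
For anyone who wants to script the "does the firmware print anything?" check from the reproduction steps (for example, when testing further ovmf versions), here is a minimal sketch. It assumes the domain's serial console is redirected to a log file, which the config in this report does not do. The function name `wait_for_console_output` and the log path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of libvirt or ovmf.
# Returns 0 if FILE becomes non-empty within TIMEOUT seconds, 1 otherwise.
wait_for_console_output() {
    file=$1
    timeout=$2
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # -s is true when the file exists and has a size greater than zero
        if [ -s "$file" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # Final check after the last sleep
    [ -s "$file" ]
}

# Usage sketch (assumes a serial log at this hypothetical path):
#   virsh start v-testsev
#   if wait_for_console_output /var/log/libvirt/qemu/v-testsev-serial.log 30; then
#       echo "firmware produced console output"
#   else
#       echo "no console output: possible hang"
#   fi
```

A silent console within the timeout only suggests the hang described above; checking the kvm process CPU usage, as done with `top` in the steps, is still needed to confirm it.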

