Hi Nick, we had it in mind and had already done it on the PPA builds. But you are right: we had not spelled it out in the description, so it is fair to ask about it. I have updated the description to cover that as well in the test considerations.
Hopefully it is thereby ready to be accepted - thanks for reconsidering.

** Description changed:

  [ Impact ]

- * Windows 11 released a Windows update several months back that broke it on AMD Processors on KVM when using host-passthrough CPU mode. Users can no longer run
+ * Windows 11 released a Windows update several months back that broke it on AMD Processors on KVM when using host-passthrough CPU mode. Users can no longer run
  the Windows 11 VMs.

  [ Test Plan ]

- * There are several ways to test this fix, one way is to run Windows 11 VM
- but it is a little bit complex. We can test this fix by checking if the
- faulty behavior that causes Windows 11 stop working has been fixed.
+ * Before the "for this case" tests we will (actually have on the PPA, but we
+ will repeat on the final accepted build) the regular qemu regression test
+ suite that we also run on e.g. major merges. This is focusing on Ubuntu delta
+ as upstream tests really well history has shown we more likely find something
+ where they do not look too closely. That has historically been a lot of
+ things and thereby it covers cross release migrations, and cpu model handling
+ which is exactly where these changes take place and regressions might happen.
+ https://git.launchpad.net/~ubuntu-server/ubuntu/+source/qemu-migration-test

- * The faulty behavior is about to always enabling (emulate) arch-capabilities
- on AMD CPUs even if AMD CPUs do not have this feature.
-
- * The right behavior is : when users do not explicitly specify the arch-
- capabilities feature, it should ONLY be enabled when the CPU has it.
+ * There are several ways to test this fix, one way is to run Windows 11 VM
+ but it is a little bit complex. We can test this fix by checking if the
+ faulty behavior that causes Windows 11 stop working has been fixed.

- * Here are the tests steps:
- - on AMD CPU without arch-capabilities (for example : AMD EPYC Genoa - but
- you can verify if your CPU has it by using cpuid).
+ * The faulty behavior is about to always enabling (emulate) arch-capabilities
+ on AMD CPUs even if AMD CPUs do not have this feature.

- - First run a VM:
- $ qemu-system-x86_64 -enable-kvm -cpu host -vga none -monitor stdio -vnc :999 -qmp unix:/tmp/qmp.sock,server,nowait
+ * The right behavior is : when users do not explicitly specify the arch-
+ capabilities feature, it should ONLY be enabled when the CPU has it.

- - Second verify that the VM does not have "arch-capabilities" set to
+ * Here are the tests steps:
+ - on AMD CPU without arch-capabilities (for example : AMD EPYC Genoa - but
+ you can verify if your CPU has it by using cpuid).
+
+ - First run a VM:
+ $ qemu-system-x86_64 -enable-kvm -cpu host -vga none -monitor stdio -vnc :999 -qmp unix:/tmp/qmp.sock,server,nowait
+
+ - Second verify that the VM does not have "arch-capabilities" set to
  true
- $ ( echo '{"execute":"qmp_capabilities"}'; echo
+ $ ( echo '{"execute":"qmp_capabilities"}'; echo
  '{"execute":"query-cpu-model-expansion","arguments":{"type":"full","model":{"name":"host"}}}'; ) | sudo socat - UNIX-CONNECT:/tmp/qmp.sock | grep arch-capabilities

  [ Where problems could occur ]

- * Since we modify the CPU features on an old Ubuntu releases, this breaks the
- migration of existing VMs to newer Ubuntu releases.
+ * Since we modify the CPU features on an old Ubuntu releases, this breaks the
+ migration of existing VMs to newer Ubuntu releases.

- To avoid that we add new Ubuntu machine type and enable the fix only for this
- new machine type. We also make this new machine type the default one so that
- users can benefit from the fix transparently.
+ To avoid that we add new Ubuntu machine type and enable the fix only for this
+ new machine type. We also make this new machine type the default one so that
+ users can benefit from the fix transparently.

- This ensure the migration to happen correctly. However, we might want to keep
- in mind that if there will be any regression caused by this SRU, it would be likely related to the migration feature.
-
+ This ensure the migration to happen correctly. However, we might want to keep
+ in mind that if there will be any regression caused by this SRU, it would be likely related to the migration feature.

  [ Other Info ]

- None
-
+ None

  Original bug report
  ---------------------

  Issue has been discovered by HPE :
  ---------------------------------------------------------
  Got some updates on this. Firstly, we are using Ubuntu 24.04 LTS Base as 26.04 is not out yet. This means QEMU version is 8.2.2.

  I tried to use Windows 11 25H2 ISO instead and this sort of worked. We got past install and into the setup wizard. But as soon as Windows Updates were downloaded and kicked off, the system went back to KMODE EXCEPTION (0x1E)

  I believe this only occurs if Secure Boot is enabled (UEFI Non secure with SW TPM 2.0 seems to work ok last I checked).
  ----------------------------------------------------------

  There are several articles on it online but short summary
  https://forum.proxmox.com/threads/amd-bsod-unsupported-processor-since-windows-build-26100-4202-update-kb5060842-its-preview-kb5058499.166828/

  Windows 11 released a Windows update several months back that broke it on AMD Processors on KVM when using host-passthrough CPU mode. The solutions suggested out there are often to switch cpu model to x86-64-v4. We noticed this only happens when Secure Boot is used (UEFI non secure is fine).

  We are trying to dig farther we have it tracked in JIRA as PCCP-7972 but were curious if you all hand any insight here or could help us find a solution as this is something inside kvm/qemu .
  Thanks, David Estes
  ----------------------------------

  This specific observation is subject of a QEMU issue
  Windows 11 24H2 (KB5063060) fails to boot with 'UNSUPPORTED PROCESSOR' on multi-core QEMU guests with AMD EPYC host (#3001) · Issue · qemu-project/qemu

  The root cause and QEMU patch
  https://lore.kernel.org/qemu-devel/[email protected]/

  The last update in the QEMU issue tracker from a week ago details the release intercept plans for the fix. Fixed in 10.0.4, though the fixed caused a regression that was fixed in 10.0.6, and confirms 10.1.x has the fix.

** Description changed:

  [ Impact ]

  * Windows 11 released a Windows update several months back that broke it on AMD Processors on KVM when using host-passthrough CPU mode. Users can no longer run
  the Windows 11 VMs.

  [ Test Plan ]

- * Before the "for this case" tests we will (actually have on the PPA, but we
- will repeat on the final accepted build) the regular qemu regression test
- suite that we also run on e.g. major merges. This is focusing on Ubuntu delta
- as upstream tests really well history has shown we more likely find something
- where they do not look too closely. That has historically been a lot of
- things and thereby it covers cross release migrations, and cpu model handling
- which is exactly where these changes take place and regressions might happen.
- https://git.launchpad.net/~ubuntu-server/ubuntu/+source/qemu-migration-test
+ * Before the "for this case" tests we will (actually have on the PPA, but we
+ will repeat on the final accepted build) the regular qemu regression test
+ suite that we also run on e.g. major merges. This is focusing on Ubuntu delta
+ as upstream tests really well history has shown we more likely find something
+ where they do not look too closely. That has historically been a lot of
+ things and thereby it covers cross release migrations, and cpu model handling
+ which is exactly where these changes take place and regressions might happen.
+ https://git.launchpad.net/~ubuntu-server/ubuntu/+source/qemu-migration-test

- * There are several ways to test this fix, one way is to run Windows 11 VM
- but it is a little bit complex. We can test this fix by checking if the
- faulty behavior that causes Windows 11 stop working has been fixed.
+ * Then to the actual case and fix, there are several ways to test this fix,
+ one way is to run Windows 11 VM but it is a little bit complex. We can test
+ this fix by checking if the faulty behavior that causes Windows 11 stop
+ working has been fixed.

  * The faulty behavior is about to always enabling (emulate) arch-capabilities
  on AMD CPUs even if AMD CPUs do not have this feature.

  * The right behavior is : when users do not explicitly specify the arch-
  capabilities feature, it should ONLY be enabled when the CPU has it.

  * Here are the tests steps:
  - on AMD CPU without arch-capabilities (for example : AMD EPYC Genoa - but
  you can verify if your CPU has it by using cpuid).

  - First run a VM:
  $ qemu-system-x86_64 -enable-kvm -cpu host -vga none -monitor stdio -vnc :999 -qmp unix:/tmp/qmp.sock,server,nowait

  - Second verify that the VM does not have "arch-capabilities" set to
  true
  $ ( echo '{"execute":"qmp_capabilities"}'; echo
  '{"execute":"query-cpu-model-expansion","arguments":{"type":"full","model":{"name":"host"}}}'; ) | sudo socat - UNIX-CONNECT:/tmp/qmp.sock | grep arch-capabilities

  [ Where problems could occur ]

  * Since we modify the CPU features on an old Ubuntu releases, this breaks the
  migration of existing VMs to newer Ubuntu releases.

  To avoid that we add new Ubuntu machine type and enable the fix only for this
  new machine type. We also make this new machine type the default one so that
  users can benefit from the fix transparently.

  This ensure the migration to happen correctly. However, we might want to keep
  in mind that if there will be any regression caused by this SRU, it would be likely related to the migration feature.
  [ Other Info ]

  None

  Original bug report
  ---------------------

  Issue has been discovered by HPE :
  ---------------------------------------------------------
  Got some updates on this. Firstly, we are using Ubuntu 24.04 LTS Base as 26.04 is not out yet. This means QEMU version is 8.2.2.

  I tried to use Windows 11 25H2 ISO instead and this sort of worked. We got past install and into the setup wizard. But as soon as Windows Updates were downloaded and kicked off, the system went back to KMODE EXCEPTION (0x1E)

  I believe this only occurs if Secure Boot is enabled (UEFI Non secure with SW TPM 2.0 seems to work ok last I checked).
  ----------------------------------------------------------

  There are several articles on it online but short summary
  https://forum.proxmox.com/threads/amd-bsod-unsupported-processor-since-windows-build-26100-4202-update-kb5060842-its-preview-kb5058499.166828/

  Windows 11 released a Windows update several months back that broke it on AMD Processors on KVM when using host-passthrough CPU mode. The solutions suggested out there are often to switch cpu model to x86-64-v4. We noticed this only happens when Secure Boot is used (UEFI non secure is fine).

  We are trying to dig farther we have it tracked in JIRA as PCCP-7972 but were curious if you all hand any insight here or could help us find a solution as this is something inside kvm/qemu .

  Thanks, David Estes
  ----------------------------------

  This specific observation is subject of a QEMU issue
  Windows 11 24H2 (KB5063060) fails to boot with 'UNSUPPORTED PROCESSOR' on multi-core QEMU guests with AMD EPYC host (#3001) · Issue · qemu-project/qemu

  The root cause and QEMU patch
  https://lore.kernel.org/qemu-devel/[email protected]/

  The last update in the QEMU issue tracker from a week ago details the release intercept plans for the fix. Fixed in 10.0.4, though the fixed caused a regression that was fixed in 10.0.6, and confirms 10.1.x has the fix.
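
As a side note on the second test step: the plain grep prints the whole (long) QMP reply line, so the result can be hard to read. A small helper like the one below might make the outcome unambiguous. This is only a sketch: the function names arch_caps_state and check_vm are made up for illustration, and the pattern assumes the reply contains the property in the form "arch-capabilities": true or "arch-capabilities": false. The socket path and the two QMP commands are the ones from the test plan.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of QEMU or the SRU.

# Read QMP output on stdin and print "enabled", "disabled" or "unknown"
# depending on the value reported for the "arch-capabilities" property.
arch_caps_state() {
    # Keep only the key/value pair, e.g. "arch-capabilities": false
    prop=$(grep -o '"arch-capabilities": *[a-z]*' | head -n 1) || true
    case "$prop" in
        *true)  echo "enabled" ;;
        *false) echo "disabled" ;;
        *)      echo "unknown" ;;
    esac
}

# Send the same two QMP commands as in the test plan to the given socket
# (default: /tmp/qmp.sock) and classify the reply.
check_vm() {
    sock="${1:-/tmp/qmp.sock}"
    ( echo '{"execute":"qmp_capabilities"}'
      echo '{"execute":"query-cpu-model-expansion","arguments":{"type":"full","model":{"name":"host"}}}'
    ) | sudo socat - "UNIX-CONNECT:$sock" | arch_caps_state
}
```

With the VM from the first step running, calling check_vm on an AMD host without the feature would be expected to print "disabled" on a fixed build, matching the "right behavior" described in the test plan.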
-- 
https://bugs.launchpad.net/bugs/2131822

Title:
  Known Windows 11 KVM Issue AMD KMODE Exception
