[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
** Tags removed: server-todo

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2048517

Title:
  EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/2048517/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
Hi Giuseppe,

That's good news; thanks for testing!

Yes, that's correct. I'll review the patch for SRU considerations (e.g., potential side-effects/regressions for existing users, not always directly clear from the code changes) and proceed if all is OK.

** Also affects: nova (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: qemu (Ubuntu)
       Status: Incomplete => Invalid

** Changed in: nova (Ubuntu)
       Status: New => Triaged

** Changed in: nova (Ubuntu)
   Importance: Undecided => Medium

** Changed in: nova (Ubuntu)
     Assignee: (unassigned) => Mauricio Faria de Oliveira (mfo)

** Changed in: qemu (Ubuntu)
     Assignee: Sergio Durigan Junior (sergiodj) => (unassigned)
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
Hello Mauricio,

thanks for providing the nova patch. It worked as expected.

I installed it on a node with xsaves enabled and updated nova.conf:

    # dpkg -l | grep nova
    ii  nova-api-metadata     2:21.2.4-0ubuntu2.7~ppa3  all  OpenStack Compute - metadata API frontend
    ii  nova-common           2:21.2.4-0ubuntu2.7~ppa3  all  OpenStack Compute - common files
    ii  nova-compute          2:21.2.4-0ubuntu2.7~ppa3  all  OpenStack Compute - compute node base
    ii  nova-compute-kvm      2:21.2.4-0ubuntu2.7~ppa3  all  OpenStack Compute - compute node (KVM)
    ii  nova-compute-libvirt  2:21.2.4-0ubuntu2.7~ppa3  all  OpenStack Compute - compute node libvirt support
    ii  python3-nova          2:21.2.4-0ubuntu2.7~ppa3  all  OpenStack Compute Python 3 libraries

    # grep cpu_model /etc/nova/nova.conf
    cpu_model = EPYC-Rome
    cpu_model_extra_flags = -xsaves

Then I restarted nova-compute.

I had the following VM running, which was using xsaves:

    # virsh dumpxml instance-001499b6 | grep xsaves

I stopped/started the VM and the xsaves feature was disabled after that:

    # virsh dumpxml instance-001499b6 | grep xsaves

That allowed me to migrate the VM to a node with xsaves disabled (newer kernel).

I think the next step should be to start the SRU process for this patch, correct?

Thanks,
Giuseppe
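The before/after check described above (grepping the domain XML for xsaves) could also be done programmatically. The following is only a minimal sketch: `cpu_feature_policy` is a hypothetical helper, and `SAMPLE_DOMAIN_XML` is an invented example of libvirt domain XML, not the output from the VM in this report.

```python
import xml.etree.ElementTree as ET


def cpu_feature_policy(domain_xml, feature):
    """Return the policy set for a CPU feature in a libvirt domain XML
    document (e.g. 'require' or 'disable'), or None if it is not listed."""
    root = ET.fromstring(domain_xml)
    for node in root.findall("./cpu/feature"):
        if node.get("name") == feature:
            return node.get("policy")
    return None


# Invented sample for illustration only; real input would come from
# `virsh dumpxml <instance>`.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>instance-example</name>
  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>EPYC-Rome</model>
    <feature policy='disable' name='xsaves'/>
  </cpu>
</domain>
"""
```

With the sample above, `cpu_feature_policy(SAMPLE_DOMAIN_XML, "xsaves")` returns `'disable'`; a guest that still needs the flag would be expected to report `'require'` instead.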
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
Hi Giuseppe,

Please test nova version 2:21.2.4-0ubuntu2.7~ppa3 in the PPA. (It's finished building and should be published in some time.)

The config change should be something along the lines of `cpu_model_extra_flags = -xsaves`

I couldn't test the package/functionality yet, but all of the build-time unit tests have passed (just like official packages).

Please let us know how it goes.

Thanks,
Mauricio

** Changed in: qemu (Ubuntu)
       Status: Confirmed => Incomplete
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
Hmm, sorry, please hold; the unit tests caught an error. I'll check that on Monday.
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
Hi Giuseppe,

You're right, libvirt checks the specified model against its known models.

However, the EPYC-Rome (not-v4) model doesn't specify 'xsaves', only EPYC-Milan does, so it _seems_ the feature came from the default cpu_mode, host-model, which perhaps found the EPYC-Milan model closer to the host flags for some reason and used that CPU model file instead, which resulted in 'xsaves' being required.

Since CPU feature flag changes have become more frequent recently (e.g., the 'xsaves' change you mentioned and also the PKRU/xsave changes/regressions in kernel 5.15.0-85 per bug 2032164 comment 5), and these apparently may continue to grow over time as errata, security vulnerabilities, and other issues come up, maybe it's better not to rely on fixes that require CPU model updates in packages (especially two of them: qemu and libvirt).

So, in Yoga and later, Nova extends the 'libvirt.cpu_model_extra_flags' [1] parsing with '+' and '-', so you can disable specific flags with '-' [2], e.g., '-xsaves'. This can be used with the unpatched QEMU, as it can use existing CPU models.

I backported the patch to Focal/Ussuri's nova (it's already present in Jammy/Yoga), and built it in a PPA [3].

Could you please test it, and see how it goes? More details in [1] and [2].

Thanks!

[1] https://docs.openstack.org/nova/yoga/admin/cpu-models.html#cpu-feature-flags
[2] https://opendev.org/openstack/nova/commit/bcd6b42047ea9422a58a4273d831e23f2ea27092
[3] https://launchpad.net/~mfo/+archive/ubuntu/lp2048517
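To illustrate the '+'/'-' prefix convention described above, here is a minimal sketch of how such a flag list could be split into flags to enable and flags to disable. It is not Nova's actual implementation (see [2] for that); `split_extra_flags` is a hypothetical name, and treating an unprefixed flag as "enable" is an assumption of this sketch.

```python
def split_extra_flags(raw):
    """Split a comma-separated CPU flag list into (enabled, disabled) sets.

    A leading '-' marks a flag to disable; a leading '+' or no prefix
    (assumption of this sketch) marks a flag to enable.
    """
    enabled, disabled = set(), set()
    for item in raw.split(","):
        flag = item.strip().lower()
        if not flag:
            continue  # ignore empty entries such as a trailing comma
        if flag.startswith("-"):
            disabled.add(flag[1:])
        else:
            enabled.add(flag.lstrip("+"))
    return enabled, disabled
```

For example, `split_extra_flags("-xsaves")` yields an empty enabled set and `{"xsaves"}` as the disabled set, which corresponds to the configuration tested in this bug.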
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
One thing I notice is that the libvirt XML for EPYC-Rome at /usr/share/libvirt/cpu_map/x86_EPYC-Rome.xml is, of course, not updated, but a new one for -v4 is not provided either. That means I can't instruct nova to use EPYC-Rome-v4 with

    # grep -i epyc /etc/nova/nova.conf
    cpu_model = EPYC-Rome-v4

because nova will stop working with the error

    2024-04-16 07:30:04.939 4168598 ERROR oslo_service.service nova.exception.InvalidCPUInfo: Configured CPU model: EPYC-Rome-v4 is not correct, or your host CPU arch does not support this model. Please correct your config and try again.

Do we need a patch for libvirt as well to have this working?
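A quick way to see whether a given model has a definition file in the libvirt CPU map directory is sketched below. The `x86_<model>.xml` naming is inferred from the path quoted above, `model_file_present` is a hypothetical helper, and the presence of a file is only a rough heuristic for whether libvirt knows the model.

```python
from pathlib import Path


def model_file_present(cpu_map_dir, model):
    """Return True if <cpu_map_dir>/x86_<model>.xml exists.

    The naming scheme is an assumption based on the
    x86_EPYC-Rome.xml path mentioned in this report.
    """
    return (Path(cpu_map_dir) / f"x86_{model}.xml").is_file()
```

On the host discussed here, `model_file_present("/usr/share/libvirt/cpu_map", "EPYC-Rome-v4")` would be expected to return `False`, matching the InvalidCPUInfo error.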
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
Hello Sergio,

thanks for providing the patch. I have installed qemu on my test machine. The new EPYC-Rome-v4 model is now listed by qemu:

```
# qemu-system-x86_64 -enable-kvm -cpu help| grep -i epyc-rome-v
x86 EPYC-Rome-v1 AMD EPYC-Rome Processor
x86 EPYC-Rome-v2 AMD EPYC-Rome Processor
x86 EPYC-Rome-v3 AMD EPYC-Rome-v3 Processor
x86 EPYC-Rome-v4 AMD EPYC-Rome-v4 Processor (no XSAVES)
```

But I am not able to get the VM to drop xsaves. I have stopped/started the VM running on my host, but it still requires the xsaves feature. The same goes for newly created VMs on the host: they still require the xsaves feature, which means they can't be migrated to hosts that have already dropped the xsaves CPU flag.

I have also tried restarting nova-compute and libvirtd, but the result is still the same:

```
# virsh dumpxml instance-005ac280 | grep xsaves
```

I believe this patch is not enough to allow migrations between hosts with and without xsaves, or I am missing some steps.
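To tell which hosts have already dropped the xsaves CPU flag, the `flags` line of /proc/cpuinfo can be inspected. The sketch below assumes the usual x86 Linux layout of that file; `host_has_flag` is a hypothetical helper and `SAMPLE_CPUINFO` is an invented, heavily shortened example.

```python
def host_has_flag(cpuinfo_text, flag):
    """Return True if `flag` appears in the first 'flags' line of
    /proc/cpuinfo-style text, False otherwise."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return flag in value.split()
    return False


# Invented, shortened sample; real input would be the contents of
# /proc/cpuinfo on the host.
SAMPLE_CPUINFO = "processor\t: 0\nflags\t\t: fpu xsave xsaveopt xsavec xgetbv1\n"
```

Because the flag list is split into whole words, `host_has_flag(SAMPLE_CPUINFO, "xsaves")` is `False` for this sample even though `xsave` is present. On a real host the text could be read with `open("/proc/cpuinfo").read()`.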
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
Hello Giuseppe,

Thank you. I went ahead and backported the necessary patches. In total, I had to backport the following upstream commits:

fb00aa61267c8b9c57a2d1a1fa1e336d02e3bcd1
d7c72735f618a7ee27ee109d8b1468193734606a
cca0a000d06f897411a8af4402e5d0522bbe450b

I uploaded a version of QEMU with the patches to:

https://launchpad.net/~sergiodj/+archive/ubuntu/qemu/+packages

The version is 1:4.2-3ubuntu6.29~ppa1.

Could you take it for a spin, please? I'm interested in seeing the results of the scenarios suggested by Jan (namely, migrating VMs between hosts that have differing states for XSAVES).

Thank you.
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
** Changed in: qemu (Ubuntu)
     Assignee: (unassigned) => Sergio Durigan Junior (sergiodj)

** Tags added: server-todo
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
Hello Sergio,

I can help with testing. I have access to EPYC-Rome CPU machines.

Thanks
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
Hello Jan,

Thank you for reporting this bug, and for providing a good initial analysis of the problem.

Would you have access to a host with an EPYC-Rome CPU where you can run some tests? This is something that needs to be done in order to proceed here, especially if we indeed decide to SRU this change into Focal.

Thanks.
[Bug 2048517] Re: EPYC-Rome model without XSAVES may break live migration since the removal of the flag on the physical CPU
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: qemu (Ubuntu)
       Status: New => Confirmed