Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
Sorry, just piecing this together and looking at things that have probably already been looked at!

Looking at the libvirt CPU XML files, it's interesting that both x86_EPYC-Milan.xml <https://github.com/libvirt/libvirt/blob/master/src/cpu_map/x86_EPYC-Milan.xml> and x86_EPYC-Rome.xml <https://github.com/libvirt/libvirt/blob/master/src/cpu_map/x86_EPYC-Rome.xml> have 'npt'. I guess the Ubuntu kernel on 18.04 doesn't support npt; you'd see the difference under the host XML in the output of the 'virsh capabilities' command.

This would be similar to the 'vmx' flag for nested virtualization. You won't find the 'vmx' capability in any of the CPU XML files; however, if you enable it via the kvm module parameter the VM gets it, and then you can't migrate to non-vmx hosts even with the same CPU.

If something like this were happening, though, I'd still expect to see 'npt' in the source VM XML and on its qemu command line, unless it's a similar but not quite identical issue.

On Mon, Dec 13, 2021 at 10:32 AM Marcus wrote:

> That does sound like some sort of libvirt issue, then. I don't know why it
> would fail to transfer with "unknown CPU feature" when the source VM XML is
> not calling for it or a model that would include it.
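The host-level comparison suggested above (looking at the host CPU section of 'virsh capabilities' on both hosts) could be scripted roughly as follows. This is only a sketch: the two XML snippets are hypothetical, heavily trimmed stand-ins for real 'virsh capabilities' output, and the helper names `host_cpu_features` and `only_on_source` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed host sections of `virsh capabilities` output.
# Real output contains many more elements; the feature lists are illustrative.
CAPS_2004 = """
<capabilities><host><cpu>
  <arch>x86_64</arch>
  <model>EPYC-IBPB</model>
  <feature name='ibpb'/>
  <feature name='npt'/>
  <feature name='nrip-save'/>
</cpu></host></capabilities>
"""

CAPS_1804 = """
<capabilities><host><cpu>
  <arch>x86_64</arch>
  <model>EPYC-IBPB</model>
  <feature name='ibpb'/>
</cpu></host></capabilities>
"""


def host_cpu_features(caps_xml):
    """Return the set of feature names listed under <host><cpu>."""
    root = ET.fromstring(caps_xml.strip())
    return {f.get("name") for f in root.findall("./host/cpu/feature")}


def only_on_source(source_caps, target_caps):
    """Features the source host reports that the target host does not."""
    return sorted(host_cpu_features(source_caps) - host_cpu_features(target_caps))


print(only_on_source(CAPS_2004, CAPS_1804))  # ['npt', 'nrip-save']
```

On real hosts the two strings would come from running 'virsh capabilities' on each side; a non-empty result only shows that the hosts advertise different features, not that a given guest actually uses them.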
Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
That does sound like some sort of libvirt issue, then. I don't know why it would fail to transfer with "unknown CPU feature" when the source VM XML is not calling for it or a model that would include it.

On Sat, Dec 11, 2021 at 3:32 AM Wido den Hollander wrote:

> On 11-12-2021 at 00:52, Marcus wrote:
> > Just for clarity - Wido, you mention that you tried using a common CPU
> > model across the platforms (which presumably doesn't contain npt) but
> > migration still fails on npt missing. That does seem like a bug of some
> > sort; I would expect that the following should work:
>
> Indeed, that failed.
>
> > * Update cloudstack agent configs to use 'EPYC-IBPB' common identical
> >   model, restart agent
> > * Stop VM on source host (ubuntu 20.04)
> > * Start VM on source host (ubuntu 20.04) - at this point you should not
> >   have a feature 'npt' in the XML of the running VM. If you do then
> >   there's something wrong with the EPYC-IBPB or libvirt's interpretation
> > * Attempt to migrate to destination host (ubuntu 18.04)
> >
> > Is this process failing? Just want to ensure the source VM was restarted
> > and does not contain npt in the XML (and also on the resulting qemu
> > command line), but still the migration complains about missing that
> > feature.
>
> I tried with EPYC-IBPB as well and restarted the VM prior to the migration.
>
> 20.04 -> 18.04 fails even though the IBPB model in libvirt is exactly
> the same between 18 and 20.
>
> It complains about the npt feature lacking and thus the migration fails.
>
> > I'm also making an assumption here that /proc/cpuinfo on an Epyc 7552
> > does not have npt, but an Epyc 7662 does. Is that correct?
>
> Correct.
Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
On 11-12-2021 at 00:52, Marcus wrote:
> Just for clarity - Wido, you mention that you tried using a common CPU
> model across the platforms (which presumably doesn't contain npt) but
> migration still fails on npt missing. That does seem like a bug of some
> sort; I would expect that the following should work:

Indeed, that failed.

> * Update cloudstack agent configs to use 'EPYC-IBPB' common identical
>   model, restart agent
> * Stop VM on source host (ubuntu 20.04)
> * Start VM on source host (ubuntu 20.04) - at this point you should not
>   have a feature 'npt' in the XML of the running VM. If you do then
>   there's something wrong with the EPYC-IBPB or libvirt's interpretation
> * Attempt to migrate to destination host (ubuntu 18.04)
>
> Is this process failing? Just want to ensure the source VM was restarted
> and does not contain npt in the XML (and also on the resulting qemu
> command line), but still the migration complains about missing that
> feature.

I tried with EPYC-IBPB as well and restarted the VM prior to the migration.

20.04 -> 18.04 fails even though the IBPB model in libvirt is exactly the same between 18 and 20.

It complains about the npt feature lacking and thus the migration fails.

> I'm also making an assumption here that /proc/cpuinfo on an Epyc 7552
> does not have npt, but an Epyc 7662 does. Is that correct?

Correct.
Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
Just for clarity - Wido, you mention that you tried using a common CPU model across the platforms (which presumably doesn't contain npt) but migration still fails on npt missing. That does seem like a bug of some sort; I would expect that the following should work:

* Update cloudstack agent configs to use 'EPYC-IBPB' common identical model, restart agent
* Stop VM on source host (ubuntu 20.04)
* Start VM on source host (ubuntu 20.04) - at this point you should not have a feature 'npt' in the XML of the running VM. If you do then there's something wrong with the EPYC-IBPB or libvirt's interpretation
* Attempt to migrate to destination host (ubuntu 18.04)

Is this process failing? Just want to ensure the source VM was restarted and does not contain npt in the XML (and also on the resulting qemu command line), but still the migration complains about missing that feature.

I'm also making an assumption here that /proc/cpuinfo on an Epyc 7552 does not have npt, but an Epyc 7662 does. Is that correct?

On Tue, Dec 7, 2021 at 6:46 AM Gabriel Bräscher wrote:

> Paul, I confused the issues then.
>
> The one I mentioned fits only with what Wido reported in this thread.
> The CPU flags match the ones raised in that bug: flags like *npt* and
> *nrip-save*, which are present when SVM is enabled.
> They are therefore affected by kernel commit 52297436199d ("kvm: svm:
> Update svm_xsaves_supported").
> Additionally, the OS/Qemu versions also fit with what is reported in
> Ubuntu's qemu package "bug #1887490".
>
> Regards
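The check described above (the restarted VM should carry 'npt' neither in its XML nor on its qemu command line) might be automated along these lines. This is a rough sketch under stated assumptions: the domain XML and the command line below are invented samples, the helper names are hypothetical, and only the `+flag`, `-flag` and `flag=on/off` spellings of the `-cpu` argument are handled.

```python
import shlex
import xml.etree.ElementTree as ET

# Hypothetical samples; on a real host these would come from
# `virsh dumpxml <vm>` and from the running qemu process's arguments.
DOMAIN_XML = """
<domain type='kvm'>
  <name>i-2-10-VM</name>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>EPYC-IBPB</model>
    <feature policy='require' name='x2apic'/>
  </cpu>
</domain>
"""
QEMU_CMDLINE = (
    "qemu-system-x86_64 -name guest=i-2-10-VM "
    "-cpu EPYC-IBPB,x2apic=on,npt=off -m 2048"
)


def domain_cpu_features(domain_xml):
    """Return {feature name: policy} from the <cpu> element of a domain XML."""
    root = ET.fromstring(domain_xml.strip())
    return {f.get("name"): f.get("policy") for f in root.findall("./cpu/feature")}


def qemu_cpu_flags(cmdline):
    """Return (model, {flag: enabled}) parsed from the -cpu argument."""
    args = shlex.split(cmdline)
    if "-cpu" not in args or args.index("-cpu") + 1 >= len(args):
        return None, {}
    spec = args[args.index("-cpu") + 1].split(",")
    flags = {}
    for item in spec[1:]:
        if item.startswith("+"):
            flags[item[1:]] = True
        elif item.startswith("-"):
            flags[item[1:]] = False
        elif item.endswith("=on") or item.endswith("=off"):
            name, value = item.rsplit("=", 1)
            flags[name] = value == "on"
        # Other key=value properties are ignored in this sketch.
    return spec[0], flags


def guest_uses_feature(domain_xml, cmdline, feature):
    """True if the feature is required in the XML or enabled on the command line."""
    policy = domain_cpu_features(domain_xml).get(feature)
    _, flags = qemu_cpu_flags(cmdline)
    return policy in ("require", "force") or flags.get(feature, False)


print(guest_uses_feature(DOMAIN_XML, QEMU_CMDLINE, "npt"))
```

If this reports that the guest does not use the feature and the migration still complains about it, that would support the suspicion that the problem lies in how the CPU model itself is expanded on each side.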
Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
Paul, I confused the issues then.

The one I mentioned fits only with what Wido reported in this thread.
The CPU flags match the ones raised in that bug: flags like *npt* and *nrip-save*, which are present when SVM is enabled.
They are therefore affected by kernel commit 52297436199d ("kvm: svm: Update svm_xsaves_supported").
Additionally, the OS/Qemu versions also fit with what is reported in Ubuntu's qemu package "bug #1887490".

Regards

On Tue, Dec 7, 2021 at 12:10 PM Paul Angus wrote:

> The qemu-ev 2.10 bug was first reported a year or two ago in the mailing
> lists.
RE: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
The qemu-ev 2.10 bug was first reported a year or two ago in the mailing lists.

-----Original Message-----
From: Gabriel Bräscher
Sent: Tuesday, December 7, 2021 9:41 AM
To: dev
Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

Just adding to the "qemu-ev 2.10" & "qemu-ev 2.12" point.

> migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is definitely
> a bug in my point of view.

In comment 53 (at "bug #1887490"):

> It seems *one of the patches also introduced a regression*:
> * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch
> adds various SVM-related flags. Specifically *npt and nrip-save are
> now expected to be present by default* as shown in the updated testdata.
> This however breaks migration from instances using EPYC or EPYC-IBPB
> CPU models started with libvirt versions prior to this one because the
> instance on the target host has these extra flags

From the tests reported there, it fails in both ways.
1. From *older* qemu package to *newer*:
   the *source* host does not map the CPU flag; however, the *target* host expects the flag to be there by default.
2. From *newer* qemu package to *older*:
   the instance "domain.xml" on the *source* host has a CPU flag that is not mapped by qemu on the *target* host.
Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
Just adding to the "qemu-ev 2.10" & "qemu-ev 2.12" point.

> migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is definitely a
> bug in my point of view.

In comment 53 (at "bug #1887490"):

> It seems *one of the patches also introduced a regression*:
> * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch
> adds various SVM-related flags. Specifically *npt and nrip-save are now
> expected to be present by default* as shown in the updated testdata.
> This however breaks migration from instances using EPYC or EPYC-IBPB CPU
> models started with libvirt versions prior to this one because the instance
> on the target host has these extra flags

From the tests reported there, it fails in both ways.
1. From *older* qemu package to *newer*:
   the *source* host does not map the CPU flag; however, the *target* host expects the flag to be there by default.
2. From *newer* qemu package to *older*:
   the instance "domain.xml" on the *source* host has a CPU flag that is not mapped by qemu on the *target* host.

On Tue, Dec 7, 2021 at 10:22 AM Sven Vogel wrote:

> Let me check. We had the same problem on RHEL/CentOS but I am not sure if
> this is a bug. What I know is that there was a change in the XML. Let me
> ask one of my colleagues in my team.
>
> Sven Vogel
> Senior Manager Research and Development - Cloud and Infrastructure
> EWERK DIGITAL GmbH
>
> From: Gabriel Bräscher
> Date: Tuesday, 7 December 2021 at 09:57
> To: dev
> Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
>
> Wei, I agree.
> This is not necessarily a bug per se.
>
> The main point here is: the issue we are seeing is the "bug #1887490"
> raised in Ubuntu's qemu package.
> CPU features were added in the newer releases, which caused the
> compatibility issue when (live) migrating VMs between compatible hardware
> but different qemu packages.
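The two failure directions listed above (older to newer, newer to older) can be modelled as two set differences. The sketch below is a simplified illustration, not how libvirt or qemu actually validate a migration; the function name, the three input sets and the sample values are all assumptions made for the example.

```python
def migration_blockers(guest_flags, target_known, target_model_defaults):
    """Classify CPU-flag mismatches for one migration attempt.

    guest_flags           -- flags the running guest was started with (source side)
    target_known          -- flags the target's CPU map knows about at all
    target_model_defaults -- flags the target expects by default for the CPU model
    All three inputs are hypothetical simplifications.
    """
    guest = set(guest_flags)
    return {
        # Case 2 (newer -> older): the guest carries a flag the target cannot map.
        "unknown_on_target": sorted(guest - set(target_known)),
        # Case 1 (older -> newer): the target expects flags the guest lacks.
        "expected_by_target": sorted(set(target_model_defaults) - guest),
    }


# Illustrative values only; real flag sets are much larger.
OLD_MAP = {"ibpb", "svm"}
NEW_MAP = {"ibpb", "svm", "npt", "nrip-save"}

# Newer source, older target: guest has npt and nrip-save, target map lacks them.
print(migration_blockers(NEW_MAP, OLD_MAP, OLD_MAP))
# Older source, newer target: target defaults include npt and nrip-save.
print(migration_blockers(OLD_MAP, NEW_MAP, NEW_MAP))
```

In both directions the same two flags show up, only in different buckets, which matches the observation that the migration fails both ways.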
Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
Wei, I agree. This is not necessarily a bug per se. The main point here is: the issue we are seeing is the "bug #1887490" raised in Ubuntu's qemu package. CPU features were added on the newer releases, which caused the compatibility issue when (live) migrating VMs between compatible hardware but different qemu packages. On Tue, Dec 7, 2021 at 9:26 AM Wei ZHOU wrote: > Hi Gabriel, > > In my opinion, migration should work from lower version to higher version, > but no guarantee from higher version to lower version, like we upgrade > cloudstack. > Therefore, migrate should work from ubuntu 18.04 to ubuntu 20.04. But it is > not a bug if migration fails from ubuntu 20.04 to ubuntu 18.04. > > As Paul said, migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is > definitely a bug in my point of view. > > -Wei > > On Mon, 6 Dec 2021 at 16:05, Gabriel Bräscher > wrote: > > > Hi Paul (& all), > > > > I strongly believe that this is a bug in QEMU. > > I was looking for bugs and found something that looks related to what we > > are seeing. Precisely at Ubuntu's bug #*1887490* > > <https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490>: > > https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490 > > > > In the link above, there was the following comment: > > https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490/comments/53 > > > > It seems one of the patches also introduced a regression:* > > lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patchadds various > > SVM-related flags. 
Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
Hi Gabriel,

In my opinion, migration should work from a lower version to a higher version, but there is no guarantee from a higher version to a lower version, just as when we upgrade CloudStack.
Therefore, migration should work from Ubuntu 18.04 to Ubuntu 20.04, but it is not a bug if migration fails from Ubuntu 20.04 to Ubuntu 18.04.

As Paul said, migration fails from qemu-ev 2.10 to qemu-ev 2.12; this is definitely a bug from my point of view.

-Wei

On Mon, 6 Dec 2021 at 16:05, Gabriel Bräscher wrote:

> Hi Paul (& all),
>
> I strongly believe that this is a bug in QEMU.
> I was looking for bugs and found something that looks related to what we
> are seeing. Precisely at Ubuntu's bug #*1887490*
> <https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490>:
> https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490
>
> In the link above, there was the following comment:
> https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490/comments/53
>
> It seems one of the patches also introduced a regression:
> * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch adds various
> SVM-related flags. Specifically npt and nrip-save are now expected to be
> present by default as shown in the updated testdata. This however breaks
> migration from instances using *EPYC* or *EPYC-IBPB* CPU models started
> with libvirt versions prior to this one because the instance on the target
> host has these extra flags
>
>
> More about #*1887490*
> <https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490> can be found
> at the mail
> https://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg5842376.html.
> We can see that the specific bug was addressed in "linux (5.4.0-49.53)
> focal".
>
> linux (5.4.0-49.53) focal; urgency=medium
>
> * Add/Backport EPYC-v3 and EPYC-Rome CPU model (LP: #1887490)
> - kvm: svm: Update svm_xsaves_supported
>
>
> Regards,
> Gabriel.
>
> On Fri, Dec 3, 2021 at 10:59 AM Paul Angus wrote:
>
> > Which version(s) of QEMU are you using Wido?
Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
Hi Paul (& all),

I strongly believe that this is a bug in QEMU.
I was looking for bugs and found something that looks related to what we are seeing, namely Ubuntu's bug #1887490:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490

In the link above, there is the following comment:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490/comments/53

> It seems one of the patches also introduced a regression:
> * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch adds various
> SVM-related flags. Specifically npt and nrip-save are now expected to be
> present by default as shown in the updated testdata. This however breaks
> migration from instances using *EPYC* or *EPYC-IBPB* CPU models started
> with libvirt versions prior to this one because the instance on the target
> host has these extra flags

More about #1887490 can be found in the mail at
https://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg5842376.html.
We can see that the specific bug was addressed in "linux (5.4.0-49.53) focal":

linux (5.4.0-49.53) focal; urgency=medium

  * Add/Backport EPYC-v3 and EPYC-Rome CPU model (LP: #1887490)
    - kvm: svm: Update svm_xsaves_supported

Regards,
Gabriel.

On Fri, Dec 3, 2021 at 10:59 AM Paul Angus wrote:

> Which version(s) of QEMU are you using Wido?
>
> We've just be upgrading CentOS 7.6 to 7.9
> Most 7.6 hosts had qemu-ev 2.10 on it (the buggy one). 2.12 was on the
> new hosts.
> We were getting errors complaining that the ibpb CPU feature wasn't
> available when migrating to the updated OS hosts (even though identical
> hardware).
>
> Upgrading qemu-ev to 2.12 on the originating host, then stopping and
> starting the VMs, then allowed us to migrate. We couldn't find any
> solution that didn't involve stopping and starting the VMs.
>
> Paul.
RE: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
Which version(s) of QEMU are you using, Wido?

We've just been upgrading CentOS 7.6 to 7.9. Most 7.6 hosts had qemu-ev 2.10 on them (the buggy one); 2.12 was on the new hosts.
We were getting errors complaining that the ibpb CPU feature wasn't available when migrating to the updated OS hosts (even though the hardware was identical).

Upgrading qemu-ev to 2.12 on the originating host, then stopping and starting the VMs, allowed us to migrate. We couldn't find any solution that didn't involve stopping and starting the VMs.

Paul.

-----Original Message-----
From: Wido den Hollander
Sent: Monday, November 29, 2021 7:57 AM
To: dev@cloudstack.apache.org; Wei ZHOU
Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

On 11/24/21 10:36 PM, Wei ZHOU wrote:
> Hi Wido,
>
> I think it is not good to run an environment with two ubuntu/qemu versions.
> It always happens that some cpu features are supported in the higher
> version but not supported in the older version.
> From my experience, the migration from older version to higher version
> works like a charm, but there were many issues in migration from
> higher version to older version.
>

I understand. But with a large number of hosts and working your way through upgrades you sometimes run into these situations. Therefore it would be welcome if it works.

> I do not have a solution for you. I have tried to hack
> /etc/libvirt/hooks/qemu but it didn't work.
> Have you tried with other cpu models like x86_Opteron_G5 ? you can
> find the cpu features of each cpu model in /usr/share/libvirt/cpu_map/
>

I have not tried that yet, but I can see if that works.

The EPYC-IBPB CPU model is identical on 18.04 and 20.04, but even using that model we can't seem to migrate as it complains about the 'npt' feature.

Wido

> Anyway, even if the vm migration succeeds, you do not know if vm works
> fine. I believe the best solution is upgrading all hosts to the same
> OS version.
Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
On 11/24/21 10:36 PM, Wei ZHOU wrote:
> Hi Wido,
>
> I think it is not good to run an environment with two ubuntu/qemu versions.
> It always happens that some cpu features are supported in the higher
> version but not supported in the older version.
> From my experience, the migration from older version to higher version
> works like a charm, but there were many issues in migration from higher
> version to older version.
>

I understand. But with a large number of hosts and working your way through upgrades you sometimes run into these situations. Therefore it would be welcome if it works.

> I do not have a solution for you. I have tried to hack
> /etc/libvirt/hooks/qemu but it didn't work.
> Have you tried with other cpu models like x86_Opteron_G5 ? you can find the
> cpu features of each cpu model in /usr/share/libvirt/cpu_map/
>

I have not tried that yet, but I can see if that works.

The EPYC-IBPB CPU model is identical on 18.04 and 20.04, but even when using that model we can't seem to migrate, as it complains about the 'npt' feature.

Wido

> Anyway, even if the vm migration succeeds, you do not know if vm works
> fine. I believe the best solution is upgrading all hosts to the same OS
> version.
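For anyone who wants to check whether a model definition really is the same on two hosts, one rough approach is to diff the `<feature>` names listed for that model in each host's libvirt CPU map. The sketch below is only an illustration under stated assumptions: the helper names `model_features` and `diff_models` are made up for this example, the two XML snippets are trimmed placeholder data and not the real contents of any Ubuntu package, and nested `<model>` inheritance (which some libvirt versions use) is not followed.

```python
import xml.etree.ElementTree as ET

# Illustrative, trimmed sample data (assumption: NOT real file contents).
OLD_MAP = """<cpus>
  <model name='EPYC-IBPB'>
    <vendor name='AMD'/>
    <feature name='aes'/>
    <feature name='ibpb'/>
    <feature name='svm'/>
  </model>
</cpus>"""

NEW_MAP = """<cpus>
  <model name='EPYC-IBPB'>
    <vendor name='AMD'/>
    <feature name='aes'/>
    <feature name='ibpb'/>
    <feature name='npt'/>
    <feature name='nrip-save'/>
    <feature name='svm'/>
  </model>
</cpus>"""


def model_features(cpu_map_xml, model_name):
    """Return the feature names listed directly under one <model> element.

    Limitation: inherited features from a nested <model> are not resolved.
    """
    root = ET.fromstring(cpu_map_xml)
    for model in root.iter("model"):
        if model.get("name") == model_name:
            return {f.get("name") for f in model.findall("feature")}
    raise KeyError(f"model {model_name!r} not found")


def diff_models(old_xml, new_xml, model_name):
    """Return (features only in new, features only in old), both sorted."""
    old = model_features(old_xml, model_name)
    new = model_features(new_xml, model_name)
    return sorted(new - old), sorted(old - new)


if __name__ == "__main__":
    added, removed = diff_models(OLD_MAP, NEW_MAP, "EPYC-IBPB")
    print("only in newer map:", added)
    print("only in older map:", removed)
```

In practice the two strings would be read from the CPU map files copied from each hypervisor. An empty diff would support the observation that the model itself is identical, which would point at something other than the model definition adding 'npt'.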
Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
Can't help, but I've seen this exact issue (if I'm not mistaken): a CPU flag that DOES exist on the destination KVM host, but libvirt complains that it doesn't. I would guess some kernel issue, as I've seen those.

On Wed, 24 Nov 2021 at 22:36, Wei ZHOU wrote:

> Hi Wido,
>
> I think it is not good to run an environment with two ubuntu/qemu versions.
> It always happens that some cpu features are supported in the higher
> version but not supported in the older version.
> From my experience, the migration from older version to higher version
> works like a charm, but there were many issues in migration from higher
> version to older version.
>
> I do not have a solution for you. I have tried to hack
> /etc/libvirt/hooks/qemu but it didn't work.
> Have you tried with other cpu models like x86_Opteron_G5 ? you can find the
> cpu features of each cpu model in /usr/share/libvirt/cpu_map/
>
> Anyway, even if the vm migration succeeds, you do not know if vm works
> fine. I believe the best solution is upgrading all hosts to the same OS
> version.
--
Andrija Panić
Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
Hi Wido,

I think it is not good to run an environment with two Ubuntu/qemu versions. It always happens that some CPU features are supported in the higher version but not in the older version.
From my experience, migration from an older version to a higher version works like a charm, but there were many issues in migration from a higher version to an older version.

I do not have a solution for you. I have tried to hack /etc/libvirt/hooks/qemu but it didn't work.
Have you tried other CPU models like x86_Opteron_G5? You can find the CPU features of each CPU model in /usr/share/libvirt/cpu_map/

Anyway, even if the VM migration succeeds, you do not know whether the VM works fine. I believe the best solution is upgrading all hosts to the same OS version.

-Wei

On Tue, 23 Nov 2021 at 16:31, Wido den Hollander wrote:

> Hi,
>
> I'm trying to debug an issue with live migrations between Ubuntu 18.04
> and 20.04 machines each with different CPUs:
>
> - Ubuntu 18.04 with AMD Epyc 7552 (Rome)
> - Ubuntu 20.04 with AMD Epyc 7662 (Milan)
>
> We are currently using this setting:
>
> guest.cpu.mode=custom
> guest.cpu.model=EPYC
>
> This does not allow for live migrations:
>
> Ubuntu 20.04 with Epyc 7662 to Ubuntu 18.04 with Epyc 7552 fails
>
> "ExecutionException : org.libvirt.LibvirtException: unsupported
> configuration: unknown CPU feature: npt"
>
> So we tried to define a set of features manually:
>
> guest.cpu.features=3dnowprefetch abm adx aes apic arat avx avx2 bmi1
> bmi2 clflush clflushopt cmov cr8legacy cx16 cx8 de f16c fma fpu fsgsbase
> fxsr fxsr_opt lahf_lm lm mca mce misalignsse mmx mmxext monitor movbe
> msr mtrr nx osvw pae pat pclmuldq pdpe1gb pge pni popcnt pse pse36
> rdrand rdseed rdtscp sep sha-ni smap smep sse sse2 sse4.1 sse4.2 sse4a
> ssse3 svm syscall tsc vme xgetbv1 xsave xsavec xsaveopt -npt -x2apic
> -hypervisor -topoext -nrip-save
>
> This results in this going into the XML:
>
>
>
> You would say that works, but then the target host (18.04 with the 7552)
> says it doesn't support the feature 'npt' and the migration still fails.
>
> Now we could of course use the kvm64 CPU from Qemu, but that's lacking so
> many features that for example TLS offloading isn't available.
>
> I also tried to set 'EPYC-Rome' on the Ubuntu 20.04 hypervisor, but it
> then complains on the Ubuntu 18.04 hypervisor that the CPU 'EPYC-Rome'
> is unknown as the 18.04 hypervisor doesn't have that profile.
>
> Any ideas on how to get this working?
>
> Wido
>
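As a side note on the `guest.cpu.features` line quoted above: the value is a space-separated list in which a leading `-` marks a feature to switch off. The following Python sketch shows one way such a string could be split and turned into libvirt-style `<feature>` elements. It is an illustration only, not a reconstruction of the XML snippet missing from the quoted mail. The function names `split_features` and `feature_elements` are hypothetical, and the mapping of plain names to `policy='require'` and `-name` to `policy='disable'` is an assumption about how the agent translates the setting.

```python
def split_features(value):
    """Split a guest.cpu.features style string into (required, disabled)."""
    require, disable = [], []
    for token in value.split():
        if token.startswith("-"):
            # Leading '-' is assumed to mean "disable this feature".
            disable.append(token[1:])
        else:
            require.append(token)
    return require, disable


def feature_elements(value):
    """Render libvirt <feature> elements for the given feature string.

    Assumption: plain names map to policy='require' and '-name' maps to
    policy='disable'; feature names are simple tokens needing no escaping.
    """
    require, disable = split_features(value)
    lines = [f"<feature policy='require' name='{n}'/>" for n in require]
    lines += [f"<feature policy='disable' name='{n}'/>" for n in disable]
    return lines


if __name__ == "__main__":
    for line in feature_elements("svm xsave -npt -nrip-save"):
        print(line)
```

Comparing output like this with the `<cpu>` block of the running domain XML on the source host is one way to confirm whether 'npt' is actually marked as disabled before a migration is attempted.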