Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-12-13 Thread Marcus
Sorry, just piecing this together and looking at things that have probably
already been looked at!

Looking at the libvirt CPU XML files, it's interesting that both
x86_EPYC-Milan.xml
<https://github.com/libvirt/libvirt/blob/master/src/cpu_map/x86_EPYC-Milan.xml>
and x86_EPYC-Rome.xml
<https://github.com/libvirt/libvirt/blob/master/src/cpu_map/x86_EPYC-Rome.xml>
have 'npt'. I guess the Ubuntu kernel on 18.04 doesn't support npt; you'd see
the difference under the host XML in the output of 'virsh capabilities'.

This would be similar to the 'vmx' flag for nested virtualization. You
won't find the 'vmx' capability in any of the CPU XML files; however, if you
enable it via the kvm module parameter the VM gets it, and then you can't
migrate to non-vmx hosts even with the same CPU. If something like this
were happening, though, I'd still expect to see 'npt' in the source VM XML
and on its qemu command line, unless it's a similar but not quite identical
issue.
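
To make the 'virsh capabilities' comparison concrete, here is a minimal Python sketch that lists the feature names under the host CPU element. It assumes you have saved the output of 'virsh capabilities' on each host; the sample XML below is a made-up, trimmed-down stand-in for that output, not data from the hosts in this thread.

```python
import xml.etree.ElementTree as ET


def host_cpu_features(capabilities_xml):
    """Return the set of feature names listed under <host><cpu>."""
    root = ET.fromstring(capabilities_xml)
    return {f.get("name") for f in root.findall("./host/cpu/feature")}


# Illustrative sample only (assumption): a heavily trimmed capabilities document.
SAMPLE_CAPABILITIES = """
<capabilities>
  <host>
    <cpu>
      <arch>x86_64</arch>
      <model>EPYC-IBPB</model>
      <vendor>AMD</vendor>
      <feature name='npt'/>
      <feature name='nrip-save'/>
    </cpu>
  </host>
</capabilities>
"""

if __name__ == "__main__":
    features = host_cpu_features(SAMPLE_CAPABILITIES)
    print("npt listed for host CPU:", "npt" in features)
```

Running this against the saved output from the 18.04 and 20.04 hosts and comparing the two sets should show whether 'npt' is reported on only one side.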

On Mon, Dec 13, 2021 at 10:32 AM Marcus  wrote:

> That does sound like some sort of libvirt, then. I don't know why it would
> fail to transfer with " unknown CPU feature" when the source VM XML is
> not calling for it or a model that would include it.
>
> On Sat, Dec 11, 2021 at 3:32 AM Wido den Hollander  wrote:
>
>>
>>
>> Op 11-12-2021 om 00:52 schreef Marcus:
>> > Just for clarity - Wido you mention that you tried using a common CPU
>> model
>> > across the platforms (which presumably doesn't contain npt) but
>> migration
>> > still fails on npt missing. That does seem like a bug of some sort, I
>> would
>> > expect that the the following should work:
>> >
>>
>> Indeed, that failed.
>>
>> > * Update cloudstack agent configs to use 'EPYC-IBPB' common identical
>> > model, restart agent
>> > * Stop VM on source host (ubuntu 20.04)
>> > * Start VM on source host (ubuntu 20.04) - at this point you should not
>> > have a feature 'npt' in the XML of the running VM. If you do then
>> there's
>> > something wrong with the EPYC-IBPB or libvirt's interpretation
>> > * Attempt to migrate to destination host (ubuntu 18.04)
>> >
>> > Is this process failing? Just want to ensure the source VM was restarted
>> > and does not contain npt in the XML (and also on the resulting qemu
>> command
>> > line), but still the migration complains about missing that feature.
>> >
>>
>> I tried with EPYC-IBPB as well and restarted the VM prior to the
>> migration.
>>
>> 20.04 -> 18.04 fails even though the IBPB model in libvirt is exactly
>> the same between 18 and 20.
>>
>> It complains about the npt feature lacking and thus the migration fails.
>>
>> > I'm also making an assumption here that /proc/cpuinfo on an Epyc 7552
>> does
>> > not have npt, but an Epyc 7662 does. Is that correct?
>> >
>>
>> Correct.
>>
>> > On Tue, Dec 7, 2021 at 6:46 AM Gabriel Bräscher 
>> > wrote:
>> >
>> >> Paul, I confused the issues then.
>> >>
>> >> The one I mentioned fits only with what Wido reported in this thread.
>> >> The CPU flag matches with the ones raised on that bug. Flags like
>> *npt* &
>> >> *nrip-save* which are present when SVM is enabled.
>> >> Therefore, affected by kernel commit -- 52297436199d ("kvm: svm: Update
>> >> svm_xsaves_supported").
>> >> Additionally, the OS/Qemu versions also do fit with what is reported on
>> >> Ubuntu' qemu package "bug #1887490".
>> >>
>> >> Regards
>> >>
>> >> On Tue, Dec 7, 2021 at 12:10 PM Paul Angus 
>> >> wrote:
>> >>
>> >>> The qemu-ev 2.10 bug was first reported a year or two ago in the
>> mailing
>> >>> lists.
>> >>>
>> >>> -Original Message-
>> >>> From: Gabriel Bräscher 
>> >>> Sent: Tuesday, December 7, 2021 9:41 AM
>> >>> To: dev 
>> >>> Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and
>> 20.04
>> >>>
>> >>> Just adding to the "qemu-ev 2.10" & "qemu-ev 2.12" point.
>> >>>
>> >>>> migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is definitely
>> >>>> a bug in my point of view.
>> >>>>
>> >>>
>> >>> On the comment 53 (at "bug #1887490"):
>> >>>
>> >>>> It seems *one of the patches also introduced a regression*:

Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-12-13 Thread Marcus
That does sound like some sort of libvirt issue, then. I don't know why it
would fail to transfer with "unknown CPU feature" when the source VM XML is
not calling for it or for a model that would include it.

On Sat, Dec 11, 2021 at 3:32 AM Wido den Hollander  wrote:

>
>
> Op 11-12-2021 om 00:52 schreef Marcus:
> > Just for clarity - Wido you mention that you tried using a common CPU
> model
> > across the platforms (which presumably doesn't contain npt) but migration
> > still fails on npt missing. That does seem like a bug of some sort, I
> would
> > expect that the the following should work:
> >
>
> Indeed, that failed.
>
> > * Update cloudstack agent configs to use 'EPYC-IBPB' common identical
> > model, restart agent
> > * Stop VM on source host (ubuntu 20.04)
> > * Start VM on source host (ubuntu 20.04) - at this point you should not
> > have a feature 'npt' in the XML of the running VM. If you do then there's
> > something wrong with the EPYC-IBPB or libvirt's interpretation
> > * Attempt to migrate to destination host (ubuntu 18.04)
> >
> > Is this process failing? Just want to ensure the source VM was restarted
> > and does not contain npt in the XML (and also on the resulting qemu
> command
> > line), but still the migration complains about missing that feature.
> >
>
> I tried with EPYC-IBPB as well and restarted the VM prior to the migration.
>
> 20.04 -> 18.04 fails even though the IBPB model in libvirt is exactly
> the same between 18 and 20.
>
> It complains about the npt feature lacking and thus the migration fails.
>
> > I'm also making an assumption here that /proc/cpuinfo on an Epyc 7552
> does
> > not have npt, but an Epyc 7662 does. Is that correct?
> >
>
> Correct.
>
> > On Tue, Dec 7, 2021 at 6:46 AM Gabriel Bräscher 
> > wrote:
> >
> >> Paul, I confused the issues then.
> >>
> >> The one I mentioned fits only with what Wido reported in this thread.
> >> The CPU flag matches with the ones raised on that bug. Flags like *npt*
> &
> >> *nrip-save* which are present when SVM is enabled.
> >> Therefore, affected by kernel commit -- 52297436199d ("kvm: svm: Update
> >> svm_xsaves_supported").
> >> Additionally, the OS/Qemu versions also do fit with what is reported on
> >> Ubuntu' qemu package "bug #1887490".
> >>
> >> Regards
> >>
> >> On Tue, Dec 7, 2021 at 12:10 PM Paul Angus 
> >> wrote:
> >>
> >>> The qemu-ev 2.10 bug was first reported a year or two ago in the
> mailing
> >>> lists.
> >>>
> >>> -Original Message-
> >>> From: Gabriel Bräscher 
> >>> Sent: Tuesday, December 7, 2021 9:41 AM
> >>> To: dev 
> >>> Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
> >>>
> >>> Just adding to the "qemu-ev 2.10" & "qemu-ev 2.12" point.
> >>>
> >>>> migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is definitely
> >>>> a bug in my point of view.
> >>>>
> >>>
> >>> On the comment 53 (at "bug #1887490"):
> >>>
> >>>> It seems *one of the patches also introduced a regression*:
> >>>> * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch
> >>>> adds various SVM-related flags. Specifically *npt and nrip-save are
> >>>> now expected to be present by default* as shown in the updated
> >> testdata.
> >>>> This however breaks migration from instances using EPYC or EPYC-IBPB
> >>>> CPU models started with libvirt versions prior to this one because the
> >>>> instance on the target host has these extra flags
> >>>
> >>>
> >>>  From the tests reported there, it fails in both ways.
> >>> 1. From *older* qemu package to *newer*:
> >>>  *source* host does not map the CPU flag; however, *target* host
> >>> expects the flag to be there, by default.
> >>> 2. From *newer* qemu package to *older*:
> >>>  the instance "domain.xml" in the *source* host has a CPU flag
> that is
> >>> not mapped by qemu in the *target* host.
> >>>
> >>>
> >>>
> >>> On Tue, Dec 7, 2021 at 10:22 AM Sven Vogel  wrote:
> >>>
> >>>> Let me check. We had the same problem on RHEL/CentOS but I am not sure
> >>>> if this a bug. What I know there was a chang

Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-12-11 Thread Wido den Hollander




On 11-12-2021 at 00:52, Marcus wrote:
> Just for clarity - Wido you mention that you tried using a common CPU model
> across the platforms (which presumably doesn't contain npt) but migration
> still fails on npt missing. That does seem like a bug of some sort, I would
> expect that the following should work:

Indeed, that failed.

> * Update cloudstack agent configs to use 'EPYC-IBPB' common identical
> model, restart agent
> * Stop VM on source host (ubuntu 20.04)
> * Start VM on source host (ubuntu 20.04) - at this point you should not
> have a feature 'npt' in the XML of the running VM. If you do then there's
> something wrong with the EPYC-IBPB or libvirt's interpretation
> * Attempt to migrate to destination host (ubuntu 18.04)
>
> Is this process failing? Just want to ensure the source VM was restarted
> and does not contain npt in the XML (and also on the resulting qemu command
> line), but still the migration complains about missing that feature.

I tried with EPYC-IBPB as well and restarted the VM prior to the migration.

20.04 -> 18.04 fails even though the IBPB model in libvirt is exactly
the same between 18 and 20.

It complains about the npt feature lacking and thus the migration fails.

> I'm also making an assumption here that /proc/cpuinfo on an Epyc 7552 does
> not have npt, but an Epyc 7662 does. Is that correct?

Correct.


> On Tue, Dec 7, 2021 at 6:46 AM Gabriel Bräscher
> wrote:
>
>> Paul, I confused the issues then.
>>
>> The one I mentioned fits only with what Wido reported in this thread.
>> The CPU flag matches with the ones raised on that bug. Flags like *npt* &
>> *nrip-save* which are present when SVM is enabled.
>> Therefore, affected by kernel commit -- 52297436199d ("kvm: svm: Update
>> svm_xsaves_supported").
>> Additionally, the OS/Qemu versions also do fit with what is reported on
>> Ubuntu' qemu package "bug #1887490".
>>
>> Regards
>>
>> On Tue, Dec 7, 2021 at 12:10 PM Paul Angus
>> wrote:
>>
>>> The qemu-ev 2.10 bug was first reported a year or two ago in the mailing
>>> lists.
>>>
>>> -Original Message-
>>> From: Gabriel Bräscher
>>> Sent: Tuesday, December 7, 2021 9:41 AM
>>> To: dev
>>> Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
>>>
>>> Just adding to the "qemu-ev 2.10" & "qemu-ev 2.12" point.
>>>
>>>> migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is definitely
>>>> a bug in my point of view.
>>>
>>> On the comment 53 (at "bug #1887490"):
>>>
>>>> It seems *one of the patches also introduced a regression*:
>>>> * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch
>>>> adds various SVM-related flags. Specifically *npt and nrip-save are
>>>> now expected to be present by default* as shown in the updated testdata.
>>>> This however breaks migration from instances using EPYC or EPYC-IBPB
>>>> CPU models started with libvirt versions prior to this one because the
>>>> instance on the target host has these extra flags
>>>
>>> From the tests reported there, it fails in both ways.
>>> 1. From *older* qemu package to *newer*:
>>>  *source* host does not map the CPU flag; however, *target* host
>>> expects the flag to be there, by default.
>>> 2. From *newer* qemu package to *older*:
>>>  the instance "domain.xml" in the *source* host has a CPU flag that is
>>> not mapped by qemu in the *target* host.
>>>
>>> On Tue, Dec 7, 2021 at 10:22 AM Sven Vogel  wrote:
>>>
>>>> Let me check. We had the same problem on RHEL/CentOS but I am not sure
>>>> if this a bug. What I know there was a change in the XML. Let me ask
>>>> one on my colleges in my team.
>>>>
>>>> Sven Vogel
>>>> Senior Manager Research and Development - Cloud and Infrastructure
>>>> EWERK DIGITAL GmbH

Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-12-10 Thread Marcus
Just for clarity - Wido, you mention that you tried using a common CPU model
across the platforms (which presumably doesn't contain npt) but migration
still fails on npt missing. That does seem like a bug of some sort; I would
expect that the following should work:

* Update cloudstack agent configs to use 'EPYC-IBPB' common identical
model, restart agent
* Stop VM on source host (ubuntu 20.04)
* Start VM on source host (ubuntu 20.04) - at this point you should not
have a feature 'npt' in the XML of the running VM. If you do then there's
something wrong with the EPYC-IBPB or libvirt's interpretation
* Attempt to migrate to destination host (ubuntu 18.04)

Is this process failing? Just want to ensure the source VM was restarted
and does not contain npt in the XML (and also on the resulting qemu command
line), but still the migration complains about missing that feature.
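
As a rough illustration of the check described above, the following Python sketch pulls the CPU model and feature list out of a running VM's XML (as printed by 'virsh dumpxml') and splits a qemu '-cpu' argument into model and flags. The sample XML, the VM name and the '-cpu' string are invented for the example and are not taken from the affected hosts.

```python
import xml.etree.ElementTree as ET


def domain_cpu(domain_xml):
    """Return (model, {feature_name: policy}) from domain XML."""
    cpu = ET.fromstring(domain_xml).find("./cpu")
    if cpu is None:
        return None, {}
    model = cpu.findtext("model")
    features = {f.get("name"): f.get("policy") for f in cpu.findall("feature")}
    return model, features


def qemu_cpu_arg(cpu_arg):
    """Split a qemu '-cpu' value into (model, [flag settings])."""
    model, *flags = cpu_arg.split(",")
    return model, flags


# Illustrative sample only (assumption): a trimmed domain definition.
SAMPLE_DOMAIN = """
<domain type='kvm'>
  <name>example-vm</name>
  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>EPYC-IBPB</model>
    <feature policy='require' name='x2apic'/>
  </cpu>
</domain>
"""

if __name__ == "__main__":
    model, features = domain_cpu(SAMPLE_DOMAIN)
    print("model:", model)
    print("npt in domain XML:", "npt" in features)
    print(qemu_cpu_arg("EPYC-IBPB,x2apic=on"))
```

If 'npt' shows up in neither place on the source host, the VM itself is not asking for the feature.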

I'm also making an assumption here that /proc/cpuinfo on an Epyc 7552 does
not have npt, but an Epyc 7662 does. Is that correct?
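
One way to answer that without eyeballing the output is a small Python sketch that extracts the 'flags' line from /proc/cpuinfo text and diffs two hosts. The two sample strings below are shortened, invented flag lists used only to show the comparison; they are not real output from either CPU.

```python
def cpuinfo_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Illustrative samples only (assumption): invented, shortened flag lists.
HOST_A = "processor\t: 0\nflags\t\t: fpu svm ibpb\n"
HOST_B = "processor\t: 0\nflags\t\t: fpu svm ibpb npt nrip_save\n"

if __name__ == "__main__":
    a, b = cpuinfo_flags(HOST_A), cpuinfo_flags(HOST_B)
    print("only on host B:", sorted(b - a))
    print("only on host A:", sorted(a - b))
```

On real hosts you would read /proc/cpuinfo on each machine and feed the two texts to cpuinfo_flags().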

On Tue, Dec 7, 2021 at 6:46 AM Gabriel Bräscher 
wrote:

> Paul, I confused the issues then.
>
> The one I mentioned fits only with what Wido reported in this thread.
> The CPU flag matches with the ones raised on that bug. Flags like *npt* &
> *nrip-save* which are present when SVM is enabled.
> Therefore, affected by kernel commit -- 52297436199d ("kvm: svm: Update
> svm_xsaves_supported").
> Additionally, the OS/Qemu versions also do fit with what is reported on
> Ubuntu' qemu package "bug #1887490".
>
> Regards
>
> On Tue, Dec 7, 2021 at 12:10 PM Paul Angus 
> wrote:
>
> > The qemu-ev 2.10 bug was first reported a year or two ago in the mailing
> > lists.
> >
> > -Original Message-
> > From: Gabriel Bräscher 
> > Sent: Tuesday, December 7, 2021 9:41 AM
> > To: dev 
> > Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
> >
> > Just adding to the "qemu-ev 2.10" & "qemu-ev 2.12" point.
> >
> > > migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is definitely
> > > a bug in my point of view.
> > >
> >
> > On the comment 53 (at "bug #1887490"):
> >
> > > It seems *one of the patches also introduced a regression*:
> > > * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch
> > > adds various SVM-related flags. Specifically *npt and nrip-save are
> > > now expected to be present by default* as shown in the updated
> testdata.
> > > This however breaks migration from instances using EPYC or EPYC-IBPB
> > > CPU models started with libvirt versions prior to this one because the
> > > instance on the target host has these extra flags
> >
> >
> > From the tests reported there, it fails in both ways.
> > 1. From *older* qemu package to *newer*:
> > *source* host does not map the CPU flag; however, *target* host
> > expects the flag to be there, by default.
> > 2. From *newer* qemu package to *older*:
> > the instance "domain.xml" in the *source* host has a CPU flag that is
> > not mapped by qemu in the *target* host.
> >
> >
> >
> > On Tue, Dec 7, 2021 at 10:22 AM Sven Vogel  wrote:
> >
> > > Let me check. We had the same problem on RHEL/CentOS but I am not sure
> > > if this a bug. What I know there was a change in the XML. Let me ask
> > > one on my colleges in my team.
> > >
> > > 
> > >
> > >
> > > __
> > >
> > > Sven Vogel
> > > Senior Manager Research and Development - Cloud and Infrastructure
> > >
> > > EWERK DIGITAL GmbH
> > > Brühl 24, D-04109 Leipzig
> > > P +49 341 42649 - 99
> > > F +49 341 42649 - 98
> > > s.vo...@ewerk.com
> > > www.ewerk.com
> > >
> > > Geschäftsführer:
> > > Dr. Erik Wende, Hendrik Schubert, Tassilo Möschke
> > > Registergericht: Leipzig HRB 9065
> > >
> > > Support:
> > > +49 341 42649 555
> > >
> > > Zertifiziert nach:
> > > ISO/IEC 27001:2013
> > > DIN EN ISO 9001:2015
> > > DIN ISO/IEC 2-1:2018
> > >
> > > ISAE 3402 Typ II Assessed
> > >
> > > EWERK-Blog<https://blog.ewerk.com/> | LinkedIn<
> > > https://www.linkedin.com/company/ewerk-group> | Xing<
> > > https://www.xing.com/company/ewerk> | Twitter<
> > > https://twitter.com/EWERK_Group> | Facebook<
> > > https://de-de.facebook.com/EWERK.Group/>
> > >
> > >
> > > Auskünfte und Angebote per Mail sind freibleibend und un

Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-12-07 Thread Gabriel Bräscher
Paul, I confused the issues then.

The one I mentioned fits only with what Wido reported in this thread.
The CPU flags match the ones raised in that bug: flags like *npt* and
*nrip-save*, which are present when SVM is enabled.
They are therefore affected by kernel commit 52297436199d ("kvm: svm: Update
svm_xsaves_supported").
Additionally, the OS/QEMU versions also fit with what is reported in
Ubuntu's qemu package "bug #1887490".

Regards

On Tue, Dec 7, 2021 at 12:10 PM Paul Angus 
wrote:

> The qemu-ev 2.10 bug was first reported a year or two ago in the mailing
> lists.
>
> -Original Message-
> From: Gabriel Bräscher 
> Sent: Tuesday, December 7, 2021 9:41 AM
> To: dev 
> Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
>
> Just adding to the "qemu-ev 2.10" & "qemu-ev 2.12" point.
>
> > migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is definitely
> > a bug in my point of view.
> >
>
> On the comment 53 (at "bug #1887490"):
>
> > It seems *one of the patches also introduced a regression*:
> > * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch
> > adds various SVM-related flags. Specifically *npt and nrip-save are
> > now expected to be present by default* as shown in the updated testdata.
> > This however breaks migration from instances using EPYC or EPYC-IBPB
> > CPU models started with libvirt versions prior to this one because the
> > instance on the target host has these extra flags
>
>
> From the tests reported there, it fails in both ways.
> 1. From *older* qemu package to *newer*:
> *source* host does not map the CPU flag; however, *target* host
> expects the flag to be there, by default.
> 2. From *newer* qemu package to *older*:
> the instance "domain.xml" in the *source* host has a CPU flag that is
> not mapped by qemu in the *target* host.
>
>
>
> On Tue, Dec 7, 2021 at 10:22 AM Sven Vogel  wrote:
>
> > Let me check. We had the same problem on RHEL/CentOS but I am not sure
> > if this a bug. What I know there was a change in the XML. Let me ask
> > one on my colleges in my team.
> >
> > From: Gabriel Bräscher
> > Date: Tuesday, 7 December 2021 at 09:57
> > To: dev
> > Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
> >
> > Wei, I agree.
> > This is not necessarily a bug per se.
> >
> > The main point here is: the issue we are seeing is the "bug #1887490"
> > raised in Ubuntu's qemu package.
> > CPU features were added on the newer releases, which caused the
> > compatibility issue when (live) migrating VMs between

RE: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-12-07 Thread Paul Angus
The qemu-ev 2.10 bug was first reported a year or two ago in the mailing lists.

-Original Message-
From: Gabriel Bräscher  
Sent: Tuesday, December 7, 2021 9:41 AM
To: dev 
Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

Just adding to the "qemu-ev 2.10" & "qemu-ev 2.12" point.

> migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is definitely 
> a bug in my point of view.
>

On the comment 53 (at "bug #1887490"):

> It seems *one of the patches also introduced a regression*:
> * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch
> adds various SVM-related flags. Specifically *npt and nrip-save are 
> now expected to be present by default* as shown in the updated testdata.
> This however breaks migration from instances using EPYC or EPYC-IBPB 
> CPU models started with libvirt versions prior to this one because the 
> instance on the target host has these extra flags


From the tests reported there, it fails in both ways.
1. From *older* qemu package to *newer*:
*source* host does not map the CPU flag; however, *target* host expects the 
flag to be there, by default.
2. From *newer* qemu package to *older*:
the instance "domain.xml" in the *source* host has a CPU flag that is not 
mapped by qemu in the *target* host.



On Tue, Dec 7, 2021 at 10:22 AM Sven Vogel  wrote:

> Let me check. We had the same problem on RHEL/CentOS but I am not sure 
> if this a bug. What I know there was a change in the XML. Let me ask 
> one on my colleges in my team.
>
> From: Gabriel Bräscher
> Date: Tuesday, 7 December 2021 at 09:57
> To: dev
> Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
>
> Wei, I agree.
> This is not necessarily a bug per se.
>
> The main point here is: the issue we are seeing is the "bug #1887490"
> raised in Ubuntu's qemu package.
> CPU features were added on the newer releases, which caused the 
> compatibility issue when (live) migrating VMs between compatible 
> hardware but different qemu packages.
>
>
> On Tue, Dec 7, 2021 at 9:26 AM Wei ZHOU  wrote:
>
> > Hi Gabriel,
> >
> > In my opinion, migration should work from lower version to higher
> version,
> > but no guarantee from higher version to lower version, like we 
> > upgrade cloudstack.
> > Therefore, migrate should work from ubuntu 18.04 to ubuntu 20.04. 
> > But it
> is
> > not a bug if migration fails from ubuntu 20.04 to ubuntu 18.04.
> >
> > As Paul said, migration fails from qemu-ev 2.10 to qemu-ev 2.12, 
> > this is definitely a bug in my point of view.
> >
> > -Wei
> >
> > On Mon, 6 Dec 2021 at 16:05, Gabriel Bräscher 
> > wrote:
> >
> > > Hi Paul (& all),
> > >
> > > I strongly believe that this is a bug in QEMU.
> > > I was looking for bugs and found something that looks related to 
> > > what
> we

Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-12-07 Thread Gabriel Bräscher
Just adding to the "qemu-ev 2.10" & "qemu-ev 2.12" point.

> migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is definitely a
> bug in my point of view.
>

On the comment 53 (at "bug #1887490"):

> It seems *one of the patches also introduced a regression*:
> * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch
> adds various SVM-related flags. Specifically *npt and nrip-save are now
> expected to be present by default* as shown in the updated testdata.
> This however breaks migration from instances using EPYC or EPYC-IBPB CPU
> models started with libvirt versions prior to this one because the instance
> on the target host has these extra flags


From the tests reported there, it fails in both ways.
1. From *older* qemu package to *newer*:
*source* host does not map the CPU flag; however, *target* host expects
the flag to be there, by default.
2. From *newer* qemu package to *older*:
the instance "domain.xml" in the *source* host has a CPU flag that is
not mapped by qemu in the *target* host.
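
The two directions can be pictured with a toy Python model based on plain set differences. This is only a sketch of the reasoning above, not how libvirt or qemu actually validate a migration; the function name, its parameters and the flag sets are all hypothetical.

```python
def migration_blockers(source_domain_flags, target_known_flags, target_model_defaults):
    """Toy model: return (flags unknown to the target, flags the target expects
    by default but the source instance does not carry)."""
    source = set(source_domain_flags)
    unknown_on_target = source - set(target_known_flags)
    missing_on_source = set(target_model_defaults) - source
    return unknown_on_target, missing_on_source


if __name__ == "__main__":
    old_flags = {"ibpb"}                      # older package: no SVM flags mapped
    new_flags = {"ibpb", "npt", "nrip-save"}  # newer package: SVM flags on by default

    # Case 1: older -> newer. The target expects flags the source never set.
    print(migration_blockers(old_flags, new_flags, new_flags))

    # Case 2: newer -> older. The source carries flags the target does not map.
    print(migration_blockers(new_flags, old_flags, old_flags))
```

In this simplified picture, either non-empty set is enough to explain a failed migration in the corresponding direction.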



On Tue, Dec 7, 2021 at 10:22 AM Sven Vogel  wrote:

> Let me check. We had the same problem on RHEL/CentOS but I am not sure if
> this a bug. What I know there was a change in the XML. Let me ask one on my
> colleges in my team.
>
> From: Gabriel Bräscher
> Date: Tuesday, 7 December 2021 at 09:57
> To: dev
> Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
>
> Wei, I agree.
> This is not necessarily a bug per se.
>
> The main point here is: the issue we are seeing is the "bug #1887490"
> raised in Ubuntu's qemu package.
> CPU features were added on the newer releases, which caused the
> compatibility issue when (live) migrating VMs between compatible hardware
> but different qemu packages.
>
>
> On Tue, Dec 7, 2021 at 9:26 AM Wei ZHOU  wrote:
>
> > Hi Gabriel,
> >
> > In my opinion, migration should work from lower version to higher
> version,
> > but no guarantee from higher version to lower version, like we upgrade
> > cloudstack.
> > Therefore, migrate should work from ubuntu 18.04 to ubuntu 20.04. But it
> is
> > not a bug if migration fails from ubuntu 20.04 to ubuntu 18.04.
> >
> > As Paul said, migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is
> > definitely a bug in my point of view.
> >
> > -Wei
> >
> > On Mon, 6 Dec 2021 at 16:05, Gabriel Bräscher 
> > wrote:
> >
> > > Hi Paul (& all),
> > >
> > > I strongly believe that this is a bug in QEMU.
> > > I was looking for bugs and found something that looks related to what
> we
> > > are seeing. Precisely at Ubuntu's bug #*1887490*
> > > <https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490>:
> > > https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490
> > >
> > > In the link above, there was the following commen

Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-12-07 Thread Gabriel Bräscher
Wei, I agree.
This is not necessarily a bug per se.

The main point here is: the issue we are seeing is the "bug #1887490"
raised in Ubuntu's qemu package.
CPU features were added on the newer releases, which caused the
compatibility issue when (live) migrating VMs between compatible hardware
but different qemu packages.


On Tue, Dec 7, 2021 at 9:26 AM Wei ZHOU  wrote:

> Hi Gabriel,
>
> In my opinion, migration should work from lower version to higher version,
> but no guarantee from higher version to lower version, like we upgrade
> cloudstack.
> Therefore, migrate should work from ubuntu 18.04 to ubuntu 20.04. But it is
> not a bug if migration fails from ubuntu 20.04 to ubuntu 18.04.
>
> As Paul said, migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is
> definitely a bug in my point of view.
>
> -Wei
>
> On Mon, 6 Dec 2021 at 16:05, Gabriel Bräscher 
> wrote:
>
> > Hi Paul (& all),
> >
> > I strongly believe that this is a bug in QEMU.
> > I was looking for bugs and found something that looks related to what we
> > are seeing. Precisely at Ubuntu's bug #*1887490*
> > <https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490>:
> > https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490
> >
> > In the link above, there was the following comment:
> > https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490/comments/53
> >
> > > It seems one of the patches also introduced a regression:
> > > lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch adds various
> > > SVM-related flags. Specifically npt and nrip-save are now expected to be
> > > present by default as shown in the updated testdata. This however breaks
> > > migration from instances using *EPYC* or *EPYC-IBPB* CPU models started
> > > with libvirt versions prior to this one because the instance on the
> > > target host has these extra flags
> >
> >
> > More about #*1887490*
> > <https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490> can be
> found
> > at the mail
> >
> https://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg5842376.html.
> > We can see that the specific bug was addressed in "linux (5.4.0-49.53)
> > focal".
> >
> > linux (5.4.0-49.53) focal; urgency=medium
> >
> >   * Add/Backport EPYC-v3 and EPYC-Rome CPU model (LP: #1887490)
> > - kvm: svm: Update svm_xsaves_supported
> >
> >
> > Regards,
> > Gabriel.
> >
> > On Fri, Dec 3, 2021 at 10:59 AM Paul Angus 
> > wrote:
> >
> > > Which version(s) of QEMU are you using Wido?
> > >
> > > > We've just been upgrading CentOS 7.6 to 7.9
> > > Most 7.6 hosts had qemu-ev 2.10 on it  (the buggy one). 2.12 was on the
> > > new hosts.
> > > We were getting errors complaining that the ibpb CPU feature wasn't
> > > available when migrating to the updated OS hosts (even though identical
> > > hardware).
> > >
> > > Upgrading qemu-ev to 2.12 on the originating host, then stopping and
> > > starting the VMs, then allowed us to migrate.  We couldn't find any
> > > solution that didn't involve stopping and starting the VMs.
> > >
> > > Paul.
> > >
> > > -Original Message-
> > > From: Wido den Hollander 
> > > Sent: Monday, November 29, 2021 7:57 AM
> > > To: dev@cloudstack.apache.org; Wei ZHOU 
> > > Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
> > >
> > >
> > >
> > > On 11/24/21 10:36 PM, Wei ZHOU wrote:
> > > > Hi Wido,
> > > >
> > > > I think it is not good to run an environment with two ubuntu/qemu
> > > versions.
> > > > It always happens that some cpu features are supported in the higher
> > > > version but not supported in the older version.
> > > > From my experience, the migration from older version to higher
> version
> > > > works like a charm, but there were many issues in migration from
> > > > higher version to older version.
> > > >
> > >
> > > I understand. But with a large amount of hosts and working your way
> > > > through upgrades you sometimes run into these situations. Therefore it
> > would
> > > be welcome if it works.
> > >
> > > > I do not have a solution for you. I have tried to hack
> > > > /etc/libvirt/hooks/qemu but it didn't work.
> > > > Have you tried with other cpu models like x86_Opteron_G5 ? you can
> > > > find the cpu features of each cpu model in
> /usr/share/lib

Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-12-07 Thread Wei ZHOU
Hi Gabriel,

In my opinion, migration should work from a lower version to a higher version,
but there is no guarantee from a higher version to a lower one, just as when
we upgrade CloudStack.
Therefore, migration should work from Ubuntu 18.04 to Ubuntu 20.04. But it is
not a bug if migration fails from Ubuntu 20.04 to Ubuntu 18.04.

As Paul said, migration fails from qemu-ev 2.10 to qemu-ev 2.12; that is
definitely a bug from my point of view.

-Wei

On Mon, 6 Dec 2021 at 16:05, Gabriel Bräscher  wrote:

> Hi Paul (& all),
>
> I strongly believe that this is a bug in QEMU.
> I was looking for bugs and found something that looks related to what we
> are seeing. Precisely at Ubuntu's bug #*1887490*
> <https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490>:
> https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490
>
> In the link above, there was the following comment:
> https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490/comments/53
>
> It seems one of the patches also introduced a regression:
> lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch adds various
> SVM-related flags. Specifically npt and nrip-save are now expected to be
> present by default as shown in the updated testdata. This however breaks
> migration from instances using *EPYC* or *EPYC-IBPB* CPU models started
> with libvirt versions prior to this one because the instance on the target
> host has these extra flags
>
>
> More about #*1887490*
> <https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490> can be found
> at the mail
> https://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg5842376.html.
> We can see that the specific bug was addressed in "linux (5.4.0-49.53)
> focal".
>
> linux (5.4.0-49.53) focal; urgency=medium
>
>   * Add/Backport EPYC-v3 and EPYC-Rome CPU model (LP: #1887490)
> - kvm: svm: Update svm_xsaves_supported
>
>
> Regards,
> Gabriel.
>
> On Fri, Dec 3, 2021 at 10:59 AM Paul Angus 
> wrote:
>
> > Which version(s) of QEMU are you using Wido?
> >
> > > We've just been upgrading CentOS 7.6 to 7.9
> > Most 7.6 hosts had qemu-ev 2.10 on it  (the buggy one). 2.12 was on the
> > new hosts.
> > We were getting errors complaining that the ibpb CPU feature wasn't
> > available when migrating to the updated OS hosts (even though identical
> > hardware).
> >
> > Upgrading qemu-ev to 2.12 on the originating host, then stopping and
> > starting the VMs, then allowed us to migrate.  We couldn't find any
> > solution that didn't involve stopping and starting the VMs.
> >
> > Paul.
> >
> > -Original Message-
> > From: Wido den Hollander 
> > Sent: Monday, November 29, 2021 7:57 AM
> > To: dev@cloudstack.apache.org; Wei ZHOU 
> > Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
> >
> >
> >
> > On 11/24/21 10:36 PM, Wei ZHOU wrote:
> > > Hi Wido,
> > >
> > > I think it is not good to run an environment with two ubuntu/qemu
> > versions.
> > > It always happens that some cpu features are supported in the higher
> > > version but not supported in the older version.
> > > From my experience, the migration from older version to higher version
> > > works like a charm, but there were many issues in migration from
> > > higher version to older version.
> > >
> >
> > I understand. But with a large amount of hosts and working your way
> > > through upgrades you sometimes run into these situations. Therefore it
> would
> > be welcome if it works.
> >
> > > I do not have a solution for you. I have tried to hack
> > > /etc/libvirt/hooks/qemu but it didn't work.
> > > Have you tried with other cpu models like x86_Opteron_G5 ? you can
> > > find the cpu features of each cpu model in /usr/share/libvirt/cpu_map/
> > >
> >
> > I have not tried that yet, but I can see if that works.
> >
> > The EPYC-IBPB CPU model is identical on 18.04 and 20.04, but even using
> > that model we can't seem to migrate as it complains about the 'npt'
> feature.
> >
> > Wido
> >
> > > Anyway, even if the vm migration succeeds, you do not know if vm works
> > > fine. I believe the best solution is upgrading all hosts to the same
> > > OS version.
> > >
> > > -Wei
> > >
> > > On Tue, 23 Nov 2021 at 16:31, Wido den Hollander 
> wrote:
> > >
> > >> Hi,
> > >>
> > >> I'm trying to debug an issue with live migrations between Ubuntu
> > >> 18.04 and 20.04 machines each with different CPUs:
> > >>
> > >&g

Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-12-06 Thread Gabriel Bräscher
Hi Paul (& all),

I strongly believe that this is a bug in QEMU.
I was looking for bugs and found one that looks related to what we
are seeing, namely Ubuntu's bug #1887490:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490

In the link above, there was the following comment:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490/comments/53

It seems one of the patches also introduced a regression:
lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch adds various
SVM-related flags. Specifically npt and nrip-save are now expected to be
present by default as shown in the updated testdata. This however breaks
migration from instances using *EPYC* or *EPYC-IBPB* CPU models started
with libvirt versions prior to this one because the instance on the target
host has these extra flags
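
One way to check whether a given instance actually carries these flags is to
look at the <cpu> element of its domain XML (for example the output of
"virsh dumpxml"). The following is only a minimal sketch: the helper name
cpu_summary and the sample XML are made up for illustration, and the sample is
not taken from any host in this thread.

```python
import xml.etree.ElementTree as ET


def cpu_summary(domain_xml):
    """Return (model_name, {feature_name: policy}) from a libvirt domain XML string."""
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    if cpu is None:
        return None, {}
    model = cpu.find("model")
    name = model.text.strip() if model is not None and model.text else None
    features = {f.get("name"): f.get("policy") for f in cpu.findall("feature")}
    return name, features


# Hypothetical domain XML, for illustration only.
SAMPLE = """
<domain type='kvm'>
  <name>example-vm</name>
  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>EPYC-IBPB</model>
    <feature policy='require' name='npt'/>
    <feature policy='require' name='nrip-save'/>
  </cpu>
</domain>
"""

model, features = cpu_summary(SAMPLE)
# The two flags named in the Launchpad comment quoted above.
suspects = sorted(n for n in ("npt", "nrip-save") if n in features)
print(model, suspects)
```

Running the same check against the XML of the VM on the source host would show
whether npt or nrip-save is listed explicitly there, or only appears once the
target host expands the CPU model.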


More about #*1887490*
<https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490> can be found
at the mail
https://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg5842376.html.
We can see that the specific bug was addressed in "linux (5.4.0-49.53)
focal".

linux (5.4.0-49.53) focal; urgency=medium

  * Add/Backport EPYC-v3 and EPYC-Rome CPU model (LP: #1887490)
- kvm: svm: Update svm_xsaves_supported


Regards,
Gabriel.

On Fri, Dec 3, 2021 at 10:59 AM Paul Angus 
wrote:

> Which version(s) of QEMU are you using Wido?
>
> We've just been upgrading CentOS 7.6 to 7.9
> Most 7.6 hosts had qemu-ev 2.10 on it  (the buggy one). 2.12 was on the
> new hosts.
> We were getting errors complaining that the ibpb CPU feature wasn't
> available when migrating to the updated OS hosts (even though identical
> hardware).
>
> Upgrading qemu-ev to 2.12 on the originating host, then stopping and
> starting the VMs, then allowed us to migrate.  We couldn't find any
> solution that didn't involve stopping and starting the VMs.
>
> Paul.
>
> -Original Message-
> From: Wido den Hollander 
> Sent: Monday, November 29, 2021 7:57 AM
> To: dev@cloudstack.apache.org; Wei ZHOU 
> Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
>
>
>
> On 11/24/21 10:36 PM, Wei ZHOU wrote:
> > Hi Wido,
> >
> > I think it is not good to run an environment with two ubuntu/qemu
> versions.
> > It always happens that some cpu features are supported in the higher
> > version but not supported in the older version.
> > From my experience, the migration from older version to higher version
> > works like a charm, but there were many issues in migration from
> > higher version to older version.
> >
>
> I understand. But with a large amount of hosts and working your way
> through upgrades you sometimes run into these situations. Therefore it would
> be welcome if it works.
>
> > I do not have a solution for you. I have tried to hack
> > /etc/libvirt/hooks/qemu but it didn't work.
> > Have you tried with other cpu models like x86_Opteron_G5 ? you can
> > find the cpu features of each cpu model in /usr/share/libvirt/cpu_map/
> >
>
> I have not tried that yet, but I can see if that works.
>
> The EPYC-IBPB CPU model is identical on 18.04 and 20.04, but even using
> that model we can't seem to migrate as it complains about the 'npt' feature.
>
> Wido
>
> > Anyway, even if the vm migration succeeds, you do not know if vm works
> > fine. I believe the best solution is upgrading all hosts to the same
> > OS version.
> >
> > -Wei
> >
> > On Tue, 23 Nov 2021 at 16:31, Wido den Hollander  wrote:
> >
> >> Hi,
> >>
> >> I'm trying to debug an issue with live migrations between Ubuntu
> >> 18.04 and 20.04 machines each with different CPUs:
> >>
> >> - Ubuntu 18.04 with AMD Epyc 7552 (Rome)
> >> - Ubuntu 20.04 with AMD Epyc 7662 (Milan)
> >>
> >> We are currently using this setting:
> >>
> >> guest.cpu.mode=custom
> >> guest.cpu.model=EPYC
> >>
> >> This does not allow for live migrations:
> >>
> >> Ubuntu 20.04 with Epyc 7662 to Ubuntu 18.04 with Epyc 7552 fails
> >>
> >> "ExecutionException : org.libvirt.LibvirtException: unsupported
> >> configuration: unknown CPU feature: npt"
> >>
> >> So we tried to define a set of features manually:
> >>
> >> guest.cpu.features=3dnowprefetch abm adx aes apic arat avx avx2 bmi1
> >> bmi2 clflush clflushopt cmov cr8legacy cx16 cx8 de f16c fma fpu
> >> fsgsbase fxsr fxsr_opt lahf_lm lm mca mce misalignsse mmx mmxext
> >> monitor movbe msr mtrr nx osvw pae pat pclmuldq pdpe1gb pge pni
> >> popcnt pse pse36 rdrand rd

RE: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-12-03 Thread Paul Angus
Which version(s) of QEMU are you using Wido?

We've just been upgrading CentOS 7.6 to 7.9.
Most 7.6 hosts had qemu-ev 2.10 on them (the buggy one); 2.12 was on the new
hosts.
We were getting errors complaining that the ibpb CPU feature wasn't available
when migrating to the updated OS hosts (even though the hardware was identical).

Upgrading qemu-ev to 2.12 on the originating host, then stopping and starting 
the VMs, then allowed us to migrate.  We couldn't find any solution that didn't 
involve stopping and starting the VMs.

Paul.

-----Original Message-----
From: Wido den Hollander 
Sent: Monday, November 29, 2021 7:57 AM
To: dev@cloudstack.apache.org; Wei ZHOU 
Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04



On 11/24/21 10:36 PM, Wei ZHOU wrote:
> Hi Wido,
>
> I think it is not good to run an environment with two ubuntu/qemu versions.
> It always happens that some cpu features are supported in the higher
> version but not supported in the older version.
> From my experience, the migration from older version to higher version
> works like a charm, but there were many issues in migration from
> higher version to older version.
>

I understand. But with a large amount of hosts and working your way through 
upgrades you sometimes run into these situations. Therefore it would be welcome
if it works.

> I do not have a solution for you. I have tried to hack
> /etc/libvirt/hooks/qemu but it didn't work.
> Have you tried with other cpu models like x86_Opteron_G5 ? you can
> find the cpu features of each cpu model in /usr/share/libvirt/cpu_map/
>

I have not tried that yet, but I can see if that works.

The EPYC-IBPB CPU model is identical on 18.04 and 20.04, but even using that 
model we can't seem to migrate as it complains about the 'npt' feature.

Wido

> Anyway, even if the vm migration succeeds, you do not know if vm works
> fine. I believe the best solution is upgrading all hosts to the same
> OS version.
>
> -Wei
>
> On Tue, 23 Nov 2021 at 16:31, Wido den Hollander  wrote:
>
>> Hi,
>>
>> I'm trying to debug an issue with live migrations between Ubuntu
>> 18.04 and 20.04 machines each with different CPUs:
>>
>> - Ubuntu 18.04 with AMD Epyc 7552 (Rome)
>> - Ubuntu 20.04 with AMD Epyc 7662 (Milan)
>>
>> We are currently using this setting:
>>
>> guest.cpu.mode=custom
>> guest.cpu.model=EPYC
>>
>> This does not allow for live migrations:
>>
>> Ubuntu 20.04 with Epyc 7662 to Ubuntu 18.04 with Epyc 7552 fails
>>
>> "ExecutionException : org.libvirt.LibvirtException: unsupported
>> configuration: unknown CPU feature: npt"
>>
>> So we tried to define a set of features manually:
>>
>> guest.cpu.features=3dnowprefetch abm adx aes apic arat avx avx2 bmi1
>> bmi2 clflush clflushopt cmov cr8legacy cx16 cx8 de f16c fma fpu
>> fsgsbase fxsr fxsr_opt lahf_lm lm mca mce misalignsse mmx mmxext
>> monitor movbe msr mtrr nx osvw pae pat pclmuldq pdpe1gb pge pni
>> popcnt pse pse36 rdrand rdseed rdtscp sep sha-ni smap smep sse sse2
>> sse4.1 sse4.2 sse4a
>> ssse3 svm syscall tsc vme xgetbv1 xsave xsavec xsaveopt -npt -x2apic
>> -hypervisor -topoext -nrip-save
>>
>> This results in this going into the XML:
>>
>> 
>>
>> You would say that works, but then the target host (18.04 with the
>> 7552) says it doesn't support the feature 'npt' and the migration still 
>> fails.
>>
>> Now we could of course use the kvm64 CPU from Qemu, but that's lacking
>> so many features that for example TLS offloading isn't available.
>>
>> I also tried to set 'EPYC-Rome' on the Ubuntu 20.04 hypervisor, but
>> it then complains on the Ubuntu 18.04 hypervisor that the CPU 'EPYC-Rome'
>> is unknown as the 18.04 hypervisor doesn't have that profile.
>>
>> Any ideas on how to get this working?
>>
>> Wido
>>
>


Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-11-28 Thread Wido den Hollander



On 11/24/21 10:36 PM, Wei ZHOU wrote:
> Hi Wido,
> 
> I think it is not good to run an environment with two ubuntu/qemu versions.
> It always happens that some cpu features are supported in the higher
> version but not supported in the older version.
> From my experience, the migration from older version to higher version
> works like a charm, but there were many issues in migration from higher
> version to older version.
> 

I understand. But with a large number of hosts, working your way
through upgrades, you sometimes run into these situations. Therefore it
would be welcome if it worked.

> I do not have a solution for you. I have tried to hack
> /etc/libvirt/hooks/qemu but it didn't work.
> Have you tried with other cpu models like x86_Opteron_G5 ? you can find the
> cpu features of each cpu model in /usr/share/libvirt/cpu_map/
> 

I have not tried that yet, but I can see if that works.

The EPYC-IBPB CPU model is identical on 18.04 and 20.04, but even using
that model we can't seem to migrate as it complains about the 'npt' feature.

Wido

> Anyway, even if the vm migration succeeds, you do not know if vm works
> fine. I believe the best solution is upgrading all hosts to the same OS
> version.
> 
> -Wei
> 
> On Tue, 23 Nov 2021 at 16:31, Wido den Hollander  wrote:
> 
>> Hi,
>>
>> I'm trying to debug an issue with live migrations between Ubuntu 18.04
>> and 20.04 machines each with different CPUs:
>>
>> - Ubuntu 18.04 with AMD Epyc 7552 (Rome)
>> - Ubuntu 20.04 with AMD Epyc 7662 (Milan)
>>
>> We are currently using this setting:
>>
>> guest.cpu.mode=custom
>> guest.cpu.model=EPYC
>>
>> This does not allow for live migrations:
>>
>> Ubuntu 20.04 with Epyc 7662 to Ubuntu 18.04 with Epyc 7552 fails
>>
>> "ExecutionException : org.libvirt.LibvirtException: unsupported
>> configuration: unknown CPU feature: npt"
>>
>> So we tried to define a set of features manually:
>>
>> guest.cpu.features=3dnowprefetch abm adx aes apic arat avx avx2 bmi1
>> bmi2 clflush clflushopt cmov cr8legacy cx16 cx8 de f16c fma fpu fsgsbase
>> fxsr fxsr_opt lahf_lm lm mca mce misalignsse mmx mmxext monitor movbe
>> msr mtrr nx osvw pae pat pclmuldq pdpe1gb pge pni popcnt pse pse36
>> rdrand rdseed rdtscp sep sha-ni smap smep sse sse2 sse4.1 sse4.2 sse4a
>> ssse3 svm syscall tsc vme xgetbv1 xsave xsavec xsaveopt -npt -x2apic
>> -hypervisor -topoext -nrip-save
>>
>> This results in this going into the XML:
>>
>> 
>>
>> You would say that works, but then the target host (18.04 with the 7552)
>> says it doesn't support the feature 'npt' and the migration still fails.
>>
>> Now we could of course use the kvm64 CPU from Qemu, but that's lacking so
>> many features that for example TLS offloading isn't available.
>>
>> I also tried to set 'EPYC-Rome' on the Ubuntu 20.04 hypervisor, but it
>> then complains on the Ubuntu 18.04 hypervisor that the CPU 'EPYC-Rome'
>> is unknown as the 18.04 hypervisor doesn't have that profile.
>>
>> Any ideas on how to get this working?
>>
>> Wido
>>
> 


Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-11-26 Thread Andrija Panic
Can't help, but I've seen this exact issue (if I'm not mistaken): a CPU flag
that DOES exist on the destination KVM host, but libvirt complains that it
doesn't. I would guess some kernel issue, as I've seen those.
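
To separate "the host kernel does not expose the flag" from "libvirt does not
know the feature name", it can help to look at the flags line of /proc/cpuinfo
on the destination host. The sketch below assumes an x86 host; the function
name cpuinfo_flags and the sample text are illustrative only. Note that the
kernel spells some flags with underscores (nrip_save) where libvirt uses
hyphens (nrip-save).

```python
def cpuinfo_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Shortened, made-up sample; a real /proc/cpuinfo lists many more flags.
SAMPLE = (
    "processor\t: 0\n"
    "vendor_id\t: AuthenticAMD\n"
    "flags\t\t: fpu vme svm npt nrip_save\n"
)

flags = cpuinfo_flags(SAMPLE)
print("npt" in flags, "nrip_save" in flags)
```

On a real host the input would be read with open("/proc/cpuinfo").read(). If
the flag is present there but libvirt still reports it as unknown, the CPU map
shipped with libvirt is the more likely place to look.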

On Wed, 24 Nov 2021 at 22:36, Wei ZHOU  wrote:

> Hi Wido,
>
> I think it is not good to run an environment with two ubuntu/qemu versions.
> It always happens that some cpu features are supported in the higher
> version but not supported in the older version.
> From my experience, the migration from older version to higher version
> works like a charm, but there were many issues in migration from higher
> version to older version.
>
> I do not have a solution for you. I have tried to hack
> /etc/libvirt/hooks/qemu but it didn't work.
> Have you tried with other cpu models like x86_Opteron_G5 ? you can find the
> cpu features of each cpu model in /usr/share/libvirt/cpu_map/
>
> Anyway, even if the vm migration succeeds, you do not know if vm works
> fine. I believe the best solution is upgrading all hosts to the same OS
> version.
>
> -Wei
>
> On Tue, 23 Nov 2021 at 16:31, Wido den Hollander  wrote:
>
> > Hi,
> >
> > I'm trying to debug an issue with live migrations between Ubuntu 18.04
> > and 20.04 machines each with different CPUs:
> >
> > - Ubuntu 18.04 with AMD Epyc 7552 (Rome)
> > - Ubuntu 20.04 with AMD Epyc 7662 (Milan)
> >
> > We are currently using this setting:
> >
> > guest.cpu.mode=custom
> > guest.cpu.model=EPYC
> >
> > This does not allow for live migrations:
> >
> > Ubuntu 20.04 with Epyc 7662 to Ubuntu 18.04 with Epyc 7552 fails
> >
> > "ExecutionException : org.libvirt.LibvirtException: unsupported
> > configuration: unknown CPU feature: npt"
> >
> > So we tried to define a set of features manually:
> >
> > guest.cpu.features=3dnowprefetch abm adx aes apic arat avx avx2 bmi1
> > bmi2 clflush clflushopt cmov cr8legacy cx16 cx8 de f16c fma fpu fsgsbase
> > fxsr fxsr_opt lahf_lm lm mca mce misalignsse mmx mmxext monitor movbe
> > msr mtrr nx osvw pae pat pclmuldq pdpe1gb pge pni popcnt pse pse36
> > rdrand rdseed rdtscp sep sha-ni smap smep sse sse2 sse4.1 sse4.2 sse4a
> > ssse3 svm syscall tsc vme xgetbv1 xsave xsavec xsaveopt -npt -x2apic
> > -hypervisor -topoext -nrip-save
> >
> > This results in this going into the XML:
> >
> > 
> >
> > You would say that works, but then the target host (18.04 with the 7552)
> > says it doesn't support the feature 'npt' and the migration still fails.
> >
> > Now we could of course use the kvm64 CPU from Qemu, but that's lacking so
> > many features that for example TLS offloading isn't available.
> >
> > I also tried to set 'EPYC-Rome' on the Ubuntu 20.04 hypervisor, but it
> > then complains on the Ubuntu 18.04 hypervisor that the CPU 'EPYC-Rome'
> > is unknown as the 18.04 hypervisor doesn't have that profile.
> >
> > Any ideas on how to get this working?
> >
> > Wido
> >
>


-- 

Andrija Panić


Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

2021-11-24 Thread Wei ZHOU
Hi Wido,

I think it is not good to run an environment with two ubuntu/qemu versions.
It always happens that some cpu features are supported in the higher
version but not supported in the older version.
From my experience, the migration from older version to higher version
works like a charm, but there were many issues in migration from higher
version to older version.

I do not have a solution for you. I have tried to hack
/etc/libvirt/hooks/qemu but it didn't work.
Have you tried other CPU models like x86_Opteron_G5? You can find the
CPU features of each CPU model in /usr/share/libvirt/cpu_map/
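
As a rough illustration of how those model definitions can be compared between
two hosts, the sketch below reads the <feature name='...'/> entries of one
<model> from CPU map XML and diffs two versions. The function name
model_features and both sample maps are assumptions made up for this example;
they are heavily shortened and do not reproduce the files shipped by any
Ubuntu release. It also only looks at features listed directly under the
model, not at features inherited from a referenced base model.

```python
import xml.etree.ElementTree as ET


def model_features(cpu_map_xml, model_name):
    """Return the set of feature names listed directly under one <model>."""
    root = ET.fromstring(cpu_map_xml)
    # A <model> nested inside another <model> is a reference to a base model,
    # not a definition, so remember those elements and skip them.
    nested = {id(child) for m in root.iter("model") for child in m.findall("model")}
    for model in root.iter("model"):
        if id(model) not in nested and model.get("name") == model_name:
            return {f.get("name") for f in model.findall("feature")}
    raise KeyError(model_name)


# Hypothetical, shortened CPU map snippets for an "older" and a "newer" host.
OLD_MAP = """
<cpus>
  <model name='EPYC'>
    <vendor name='AMD'/>
    <feature name='svm'/>
    <feature name='sse4a'/>
  </model>
</cpus>
"""

NEW_MAP = """
<cpus>
  <model name='EPYC'>
    <vendor name='AMD'/>
    <feature name='svm'/>
    <feature name='sse4a'/>
    <feature name='npt'/>
    <feature name='nrip-save'/>
  </model>
</cpus>
"""

added = model_features(NEW_MAP, "EPYC") - model_features(OLD_MAP, "EPYC")
print(sorted(added))
```

Features that show up only on the newer side are candidates for the "unknown
CPU feature" error when migrating towards the older host.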

Anyway, even if the vm migration succeeds, you do not know if vm works
fine. I believe the best solution is upgrading all hosts to the same OS
version.

-Wei

On Tue, 23 Nov 2021 at 16:31, Wido den Hollander  wrote:

> Hi,
>
> I'm trying to debug an issue with live migrations between Ubuntu 18.04
> and 20.04 machines each with different CPUs:
>
> - Ubuntu 18.04 with AMD Epyc 7552 (Rome)
> - Ubuntu 20.04 with AMD Epyc 7662 (Milan)
>
> We are currently using this setting:
>
> guest.cpu.mode=custom
> guest.cpu.model=EPYC
>
> This does not allow for live migrations:
>
> Ubuntu 20.04 with Epyc 7662 to Ubuntu 18.04 with Epyc 7552 fails
>
> "ExecutionException : org.libvirt.LibvirtException: unsupported
> configuration: unknown CPU feature: npt"
>
> So we tried to define a set of features manually:
>
> guest.cpu.features=3dnowprefetch abm adx aes apic arat avx avx2 bmi1
> bmi2 clflush clflushopt cmov cr8legacy cx16 cx8 de f16c fma fpu fsgsbase
> fxsr fxsr_opt lahf_lm lm mca mce misalignsse mmx mmxext monitor movbe
> msr mtrr nx osvw pae pat pclmuldq pdpe1gb pge pni popcnt pse pse36
> rdrand rdseed rdtscp sep sha-ni smap smep sse sse2 sse4.1 sse4.2 sse4a
> ssse3 svm syscall tsc vme xgetbv1 xsave xsavec xsaveopt -npt -x2apic
> -hypervisor -topoext -nrip-save
>
> This results in this going into the XML:
>
> 
>
> You would say that works, but then the target host (18.04 with the 7552)
> says it doesn't support the feature 'npt' and the migration still fails.
>
> Now we could of course use the kvm64 CPU from Qemu, but that's lacking so
> many features that for example TLS offloading isn't available.
>
> I also tried to set 'EPYC-Rome' on the Ubuntu 20.04 hypervisor, but it
> then complains on the Ubuntu 18.04 hypervisor that the CPU 'EPYC-Rome'
> is unknown as the 18.04 hypervisor doesn't have that profile.
>
> Any ideas on how to get this working?
>
> Wido
>