Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Babu Moger
Responding to Eduardo's question. Some emails are not comming to my
mailbox for some reason. Responding git send-email --in-reply-to.


>> > > > I understood that each die in EPYC chip is a numa node, which encodes
>> > > > NUMA node ID (system wide) in CPUID_Fn801E_ECX, that's why I
>> > > > wrote earlier that EPYC makes -numa non optional.
>> > > 
>> > > AFAIK, that isnt a hard requirement.  In bare metal EPYC machine I
>> > > have used, the BIOS lets you choose whether the dies are exposed as
>> > > 1, 2 or 4 NUMA nodes. So there's no fixed  die == numa node mapping
>> > > that I see.
>> > 
>> > If you change that setting, will all CPUID bits be kept the same,
>> > or the die topology seen by the OS will change?
>> 
>> I don't know offhand, and don't currently have access to the hardware.
>> All I know is that I was able to change between 1, 2 and 4 NUMA nodes
>> and that was reflected in numactl display, I didn't check the CPUID
>> when I was testing previously.
>
>Babu, do you know the answer here?
>
>If CPUID is kept the same with 1, 2 and 4 NUMA nodes, then having
>NUMA configured is not a requirement at all.

Yes. The CPUID are kept the same with 1, 2 and 4 NUMA nodes.
So, having numa configered in not a requirement. Following are the
details of NPS2 and NPS4. Seing the same behaviour with NPS1 also.

NPS2:

#lscpu
Architecture:x86_64
CPU op-mode(s):  32-bit, 64-bit
Byte Order:  Little Endian
CPU(s):  256
On-line CPU(s) list: 0-255
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):   2
NUMA node(s):4
Vendor ID:   AuthenticAMD
CPU family:  23
Model:   49
Model name:  AMD EPYC 7742 64-Core Processor
Stepping:0
CPU MHz: 1785.033
CPU max MHz: 2250.
CPU min MHz: 1500.
BogoMIPS:4491.51
Virtualization:  AMD-V
L1d cache:   32K
L1i cache:   32K
L2 cache:512K
L3 cache:16384K
NUMA node0 CPU(s):   0-31,128-159
NUMA node1 CPU(s):   32-63,160-191
NUMA node2 CPU(s):   64-95,192-223
NUMA node3 CPU(s):   96-127,224-255

#cpuid -l 0x801e -r

CPU 0:
   0x801e 0x00: eax=0x ebx=0x0100 ecx=0x edx=0x
CPU 1:
   0x801e 0x00: eax=0x0002 ebx=0x0101 ecx=0x edx=0x
CPU 2:
   0x801e 0x00: eax=0x0004 ebx=0x0102 ecx=0x edx=0x
CPU 3:
   0x801e 0x00: eax=0x0006 ebx=0x0103 ecx=0x edx=0x
CPU 4:
   0x801e 0x00: eax=0x0008 ebx=0x0104 ecx=0x edx=0x
CPU 5:
   0x801e 0x00: eax=0x000a ebx=0x0105 ecx=0x edx=0x
CPU 6:
   0x801e 0x00: eax=0x000c ebx=0x0106 ecx=0x edx=0x
CPU 7:
   0x801e 0x00: eax=0x000e ebx=0x0107 ecx=0x edx=0x
CPU 8:
   0x801e 0x00: eax=0x0010 ebx=0x0108 ecx=0x edx=0x
CPU 9:
   0x801e 0x00: eax=0x0012 ebx=0x0109 ecx=0x edx=0x
CPU 10:
   0x801e 0x00: eax=0x0014 ebx=0x010a ecx=0x edx=0x
CPU 11:
   0x801e 0x00: eax=0x0016 ebx=0x010b ecx=0x edx=0x
CPU 12:
   0x801e 0x00: eax=0x0018 ebx=0x010c ecx=0x edx=0x
CPU 13:
   0x801e 0x00: eax=0x001a ebx=0x010d ecx=0x edx=0x
CPU 14:
   0x801e 0x00: eax=0x001c ebx=0x010e ecx=0x edx=0x
CPU 15:
   0x801e 0x00: eax=0x001e ebx=0x010f ecx=0x edx=0x
CPU 16:
   0x801e 0x00: eax=0x0020 ebx=0x0110 ecx=0x edx=0x
CPU 17:
   0x801e 0x00: eax=0x0022 ebx=0x0111 ecx=0x edx=0x
CPU 18:
   0x801e 0x00: eax=0x0024 ebx=0x0112 ecx=0x edx=0x
CPU 19:
   0x801e 0x00: eax=0x0026 ebx=0x0113 ecx=0x edx=0x
CPU 20:
   0x801e 0x00: eax=0x0028 ebx=0x0114 ecx=0x edx=0x
CPU 21:
   0x801e 0x00: eax=0x002a ebx=0x0115 ecx=0x edx=0x
CPU 22:
   0x801e 0x00: eax=0x002c ebx=0x0116 ecx=0x edx=0x
CPU 23:
   0x801e 0x00: eax=0x002e ebx=0x0117 ecx=0x edx=0x
CPU 24:
   0x801e 0x00: eax=0x0030 ebx=0x0118 ecx=0x edx=0x
CPU 25:
   0x801e 0x00: eax=0x0032 ebx=0x0119 ecx=0x edx=0x
CPU 26:
   0x801e 0x00: eax=0x0034 ebx=0x011a ecx=0x edx=0x
CPU 27:
   0x801e 0x00: eax=0x0036 ebx=0x011b ecx=0x edx=0x
CPU 28:
   0x801e 0x00: eax=0x0038 ebx=0x011c ecx=0x edx=0x
CPU 29:
   0x801e 0x00: eax=0x003a ebx=0x011d ecx=0x edx=0x
CPU 30:
   0x801e 0x00: eax=0x003c ebx=0x011e ecx=0x edx=0x
CPU 31:
   0x801e 0x00: eax=0x003e ebx=0x011f ecx=0x 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Eduardo Habkost
On Fri, Aug 28, 2020 at 05:32:51PM +0100, Daniel P. Berrangé wrote:
> On Fri, Aug 28, 2020 at 12:29:31PM -0400, Eduardo Habkost wrote:
> > On Fri, Aug 28, 2020 at 09:55:33AM +0100, Daniel P. Berrangé wrote:
> > > On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> > > > On Thu, 27 Aug 2020 15:07:52 -0400
> > > > Eduardo Habkost  wrote:
> > > > 
> > > > > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > > > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > > > > Daniel P. Berrangé  wrote:
> > > > > > 
> > > > > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > > > > Daniel P. Berrangé  wrote:
> > > > > > > > 
> > > > > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > > > > Daniel P. Berrangé  wrote:
> > > > > > > > > >   
> > > > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov 
> > > > > > > > > > > wrote:  
> > > > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > > > > Babu Moger  wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > > To support some of the complex topology, we 
> > > > > > > > > > > > > introduced EPYC mode apicid decode.
> > > > > > > > > > > > > But, EPYC mode decode is running into problems. Also 
> > > > > > > > > > > > > it can become quite a
> > > > > > > > > > > > > maintenance problem in the future. So, it was decided 
> > > > > > > > > > > > > to remove that code and
> > > > > > > > > > > > > use the generic decode which works for majority of 
> > > > > > > > > > > > > the topology. Most of the
> > > > > > > > > > > > > SPECed configuration would work just fine. With some 
> > > > > > > > > > > > > non-SPECed user inputs,
> > > > > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > > > > > > > > > > > 
> > > > > > > > > > > > > This series removes all the EPYC mode specific apicid 
> > > > > > > > > > > > > changes and use the generic
> > > > > > > > > > > > > apicid decode.
> > > > > > > > > > > > 
> > > > > > > > > > > > the main difference between EPYC and all other CPUs is 
> > > > > > > > > > > > that
> > > > > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > > > > so we need an extra patch on top of this series to 
> > > > > > > > > > > > enfoce that, i.e:
> > > > > > > > > > > > 
> > > > > > > > > > > >  if (epyc && !numa) 
> > > > > > > > > > > > error("EPYC cpu requires numa to be configured")
> > > > > > > > > > > 
> > > > > > > > > > > Please no. This will break 90% of current usage of the 
> > > > > > > > > > > EPYC CPU in
> > > > > > > > > > > real world QEMU deployments. That is way too user hostile 
> > > > > > > > > > > to introduce
> > > > > > > > > > > as a requirement.
> > > > > > > > > > > 
> > > > > > > > > > > Why do we need to force this ?  People have been 
> > > > > > > > > > > successfuly using
> > > > > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > > > > 
> > > > > > > > > > > It might not match behaviour of bare metal silicon, but 
> > > > > > > > > > > that hasn't
> > > > > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > > > > (resulting performance might be suboptimal), but I haven't 
> > > > > > > > > > seen
> > > > > > > > > > anyone reporting crashes yet.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > What other options do we have?
> > > > > > > > > > Perhaps we can turn on strict check for new machine types 
> > > > > > > > > > only,
> > > > > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > > > > while new ones would require -numa and produce correct 
> > > > > > > > > > topology.  
> > > > > > > > > 
> > > > > > > > > No, tieing this to machine types is not viable either. That 
> > > > > > > > > is still
> > > > > > > > > going to break essentially every single management 
> > > > > > > > > application that
> > > > > > > > > exists today using QEMU.
> > > > > > > > for that we have deprecation process, so users could switch to 
> > > > > > > > new CLI
> > > > > > > > that would be required.
> > > > > > > 
> > > > > > > We could, but I don't find the cost/benefit tradeoff is 
> > > > > > > compelling.
> > > > > > > 
> > > > > > > There are so many places where we diverge from what bare metal 
> > > > > > > would
> > > > > > > do, that I don't see a good reason to introduce this breakage, 
> > > > > > > even
> > > > > > > if we notify users via a deprecation message. 
> > > > > > I find (3) and (4) good enough reasons to use deprecation.
> > > > > > 
> > > > > > > If QEMU wants to require NUMA for EPYC, 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Daniel P . Berrangé
On Fri, Aug 28, 2020 at 12:29:31PM -0400, Eduardo Habkost wrote:
> On Fri, Aug 28, 2020 at 09:55:33AM +0100, Daniel P. Berrangé wrote:
> > On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> > > On Thu, 27 Aug 2020 15:07:52 -0400
> > > Eduardo Habkost  wrote:
> > > 
> > > > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > > > Daniel P. Berrangé  wrote:
> > > > > 
> > > > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > > > Daniel P. Berrangé  wrote:
> > > > > > > 
> > > > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > > > Daniel P. Berrangé  wrote:
> > > > > > > > >   
> > > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov 
> > > > > > > > > > wrote:  
> > > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > > > Babu Moger  wrote:
> > > > > > > > > > > 
> > > > > > > > > > > > To support some of the complex topology, we introduced 
> > > > > > > > > > > > EPYC mode apicid decode.
> > > > > > > > > > > > But, EPYC mode decode is running into problems. Also it 
> > > > > > > > > > > > can become quite a
> > > > > > > > > > > > maintenance problem in the future. So, it was decided 
> > > > > > > > > > > > to remove that code and
> > > > > > > > > > > > use the generic decode which works for majority of the 
> > > > > > > > > > > > topology. Most of the
> > > > > > > > > > > > SPECed configuration would work just fine. With some 
> > > > > > > > > > > > non-SPECed user inputs,
> > > > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > > > > > > > > > > 
> > > > > > > > > > > > This series removes all the EPYC mode specific apicid 
> > > > > > > > > > > > changes and use the generic
> > > > > > > > > > > > apicid decode.
> > > > > > > > > > > 
> > > > > > > > > > > the main difference between EPYC and all other CPUs is 
> > > > > > > > > > > that
> > > > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > > > so we need an extra patch on top of this series to enfoce 
> > > > > > > > > > > that, i.e:
> > > > > > > > > > > 
> > > > > > > > > > >  if (epyc && !numa) 
> > > > > > > > > > > error("EPYC cpu requires numa to be configured")
> > > > > > > > > > 
> > > > > > > > > > Please no. This will break 90% of current usage of the EPYC 
> > > > > > > > > > CPU in
> > > > > > > > > > real world QEMU deployments. That is way too user hostile 
> > > > > > > > > > to introduce
> > > > > > > > > > as a requirement.
> > > > > > > > > > 
> > > > > > > > > > Why do we need to force this ?  People have been 
> > > > > > > > > > successfuly using
> > > > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > > > 
> > > > > > > > > > It might not match behaviour of bare metal silicon, but 
> > > > > > > > > > that hasn't
> > > > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > > > (resulting performance might be suboptimal), but I haven't 
> > > > > > > > > seen
> > > > > > > > > anyone reporting crashes yet.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > What other options do we have?
> > > > > > > > > Perhaps we can turn on strict check for new machine types 
> > > > > > > > > only,
> > > > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > > > while new ones would require -numa and produce correct 
> > > > > > > > > topology.  
> > > > > > > > 
> > > > > > > > No, tieing this to machine types is not viable either. That is 
> > > > > > > > still
> > > > > > > > going to break essentially every single management application 
> > > > > > > > that
> > > > > > > > exists today using QEMU.
> > > > > > > for that we have deprecation process, so users could switch to 
> > > > > > > new CLI
> > > > > > > that would be required.
> > > > > > 
> > > > > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > > > > 
> > > > > > There are so many places where we diverge from what bare metal would
> > > > > > do, that I don't see a good reason to introduce this breakage, even
> > > > > > if we notify users via a deprecation message. 
> > > > > I find (3) and (4) good enough reasons to use deprecation.
> > > > > 
> > > > > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > > > > create a single NUMA node if none was specified for new machine
> > > > > > types, such that there is no visible change or breakage to any
> > > > > > mgmt apps.  
> > > > > 
> > > > > (1) for configs that started without -numa &&|| without -smp dies>1,
> > > > > 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Eduardo Habkost
On Fri, Aug 28, 2020 at 09:55:33AM +0100, Daniel P. Berrangé wrote:
> On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> > On Thu, 27 Aug 2020 15:07:52 -0400
> > Eduardo Habkost  wrote:
> > 
> > > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > > Daniel P. Berrangé  wrote:
> > > > 
> > > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > > Daniel P. Berrangé  wrote:
> > > > > > 
> > > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > > Daniel P. Berrangé  wrote:
> > > > > > > >   
> > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov 
> > > > > > > > > wrote:  
> > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > > Babu Moger  wrote:
> > > > > > > > > > 
> > > > > > > > > > > To support some of the complex topology, we introduced 
> > > > > > > > > > > EPYC mode apicid decode.
> > > > > > > > > > > But, EPYC mode decode is running into problems. Also it 
> > > > > > > > > > > can become quite a
> > > > > > > > > > > maintenance problem in the future. So, it was decided to 
> > > > > > > > > > > remove that code and
> > > > > > > > > > > use the generic decode which works for majority of the 
> > > > > > > > > > > topology. Most of the
> > > > > > > > > > > SPECed configuration would work just fine. With some 
> > > > > > > > > > > non-SPECed user inputs,
> > > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > > > > > > > > > 
> > > > > > > > > > > This series removes all the EPYC mode specific apicid 
> > > > > > > > > > > changes and use the generic
> > > > > > > > > > > apicid decode.
> > > > > > > > > > 
> > > > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > > so we need an extra patch on top of this series to enfoce 
> > > > > > > > > > that, i.e:
> > > > > > > > > > 
> > > > > > > > > >  if (epyc && !numa) 
> > > > > > > > > > error("EPYC cpu requires numa to be configured")
> > > > > > > > > 
> > > > > > > > > Please no. This will break 90% of current usage of the EPYC 
> > > > > > > > > CPU in
> > > > > > > > > real world QEMU deployments. That is way too user hostile to 
> > > > > > > > > introduce
> > > > > > > > > as a requirement.
> > > > > > > > > 
> > > > > > > > > Why do we need to force this ?  People have been successfuly 
> > > > > > > > > using
> > > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > > 
> > > > > > > > > It might not match behaviour of bare metal silicon, but that 
> > > > > > > > > hasn't
> > > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > > > anyone reporting crashes yet.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > What other options do we have?
> > > > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > > while new ones would require -numa and produce correct 
> > > > > > > > topology.  
> > > > > > > 
> > > > > > > No, tieing this to machine types is not viable either. That is 
> > > > > > > still
> > > > > > > going to break essentially every single management application 
> > > > > > > that
> > > > > > > exists today using QEMU.
> > > > > > for that we have deprecation process, so users could switch to new 
> > > > > > CLI
> > > > > > that would be required.
> > > > > 
> > > > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > > > 
> > > > > There are so many places where we diverge from what bare metal would
> > > > > do, that I don't see a good reason to introduce this breakage, even
> > > > > if we notify users via a deprecation message. 
> > > > I find (3) and (4) good enough reasons to use deprecation.
> > > > 
> > > > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > > > create a single NUMA node if none was specified for new machine
> > > > > types, such that there is no visible change or breakage to any
> > > > > mgmt apps.  
> > > > 
> > > > (1) for configs that started without -numa &&|| without -smp dies>1,
> > > >   QEMU can do just that (enable auto_enable_numa).
> > > 
> > > Why exactly do we need auto_enable_numa with dies=1?
> > > 
> > > If I understand correctly, Babu said earlier in this thread[1]
> > > that we don't need auto_enable_numa.
> > > 
> > > [1] 
> > > 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Igor Mammedov
On Fri, 28 Aug 2020 09:17:42 -0500
Babu Moger  wrote:

> > -Original Message-
> > From: Igor Mammedov 
> > Sent: Friday, August 28, 2020 6:25 AM
> > To: Daniel P. Berrangé 
> > Cc: Moger, Babu ; Dr. David Alan Gilbert
> > ; ehabk...@redhat.com; m...@redhat.com; Michal
> > Privoznik ; qemu-devel@nongnu.org;
> > pbonz...@redhat.com; r...@twiddle.net
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > generic decode
> > 
> > On Fri, 28 Aug 2020 09:58:03 +0100
> > Daniel P. Berrangé  wrote:
> >   
> > > On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:  
> > > > On Wed, 26 Aug 2020 13:45:51 -0500
> > > > Babu Moger  wrote:
> > > >  
> > > > >
> > > > >  
> > > > > > -Original Message-
> > > > > > From: Dr. David Alan Gilbert 
> > > > > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > > > > To: Moger, Babu 
> > > > > > Cc: Igor Mammedov ; Daniel P. Berrangé
> > > > > > ; ehabk...@redhat.com; m...@redhat.com;
> > > > > > Michal Privoznik ; qemu-  
> > de...@nongnu.org;  
> > > > > > pbonz...@redhat.com; r...@twiddle.net
> > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and
> > > > > > use generic decode
> > > > > >
> > > > > > * Babu Moger (babu.mo...@amd.com) wrote:  
> > > > > > >  
> > > > > > > > -Original Message-
> > > > > > > > From: Igor Mammedov 
> > > > > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > > > > To: Daniel P. Berrangé 
> > > > > > > > Cc: Moger, Babu ;  
> > pbonz...@redhat.com;  
> > > > > > > > r...@twiddle.net; ehabk...@redhat.com; qemu-  
> > de...@nongnu.org;  
> > > > > > > > m...@redhat.com; Michal Privoznik 
> > > > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode
> > > > > > > > and use generic decode
> > > > > > > >
> > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100 Daniel P. Berrangé
> > > > > > > >  wrote:
> > > > > > > >  
> > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov  
> > wrote:  
> > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > > > > >  wrote:
> > > > > > > > > >  
> > > > > > > > > > > To support some of the complex topology, we introduced
> > > > > > > > > > > EPYC mode  
> > > > > > > > apicid decode.  
> > > > > > > > > > > But, EPYC mode decode is running into problems. Also
> > > > > > > > > > > it can become quite a maintenance problem in the
> > > > > > > > > > > future. So, it was decided to remove that code and use
> > > > > > > > > > > the generic decode which works for majority of the
> > > > > > > > > > > topology. Most of the SPECed configuration would work
> > > > > > > > > > > just fine. With some non-SPECed user inputs, it will
> > > > > > > > > > > create some sub-  
> > > > > > > > optimal configuration.  
> > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=ht
> > > > > > > > > > > tps%3A%252
> > > > > > > > > > > F%2F
> > > > > > > > > > > lore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-  
> > e468  
> > > > > > > > > > > -  
> > > > > > > > d5b437c1b25  
> > > > > > > > > > >  
> > > > > > > >  
> > > > > >  
> > 4%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C8a
> > 5c  
> > > > > > > > 52f92  
> > > > > > > > > > >  
> > > > > > > >  
> > > > > >  
> > 3f04082a40808d849c43d49%7C3dd8961fe4884e608e11a82d994e183d%7C0%7  
> &

RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Babu Moger



> -Original Message-
> From: Igor Mammedov 
> Sent: Friday, August 28, 2020 3:43 AM
> To: Moger, Babu 
> Cc: Dr. David Alan Gilbert ; ehabk...@redhat.com;
> m...@redhat.com; qemu-devel@nongnu.org; pbonz...@redhat.com;
> r...@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> generic decode
> 
> On Thu, 27 Aug 2020 17:58:01 -0500
> Babu Moger  wrote:
> 
> > > -Original Message-
> > > From: Igor Mammedov 
> > > Sent: Thursday, August 27, 2020 4:19 PM
> > > To: Dr. David Alan Gilbert 
> > > Cc: ehabk...@redhat.com; m...@redhat.com; qemu-devel@nongnu.org;
> > > Moger, Babu ; pbonz...@redhat.com;
> > > r...@twiddle.net
> > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > generic decode
> > >
> > > On Wed, 26 Aug 2020 15:10:46 +0100
> > > "Dr. David Alan Gilbert"  wrote:
> > >
> > > > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > > > On Tue, 25 Aug 2020 16:25:21 +0100 "Dr. David Alan Gilbert"
> > > > >  wrote:
> > > > >
> > > > > > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > > > > > On Tue, 25 Aug 2020 09:15:04 +0100 "Dr. David Alan Gilbert"
> > > > > > >  wrote:
> > > > > > >
> > > > > > > > * Babu Moger (babu.mo...@amd.com) wrote:
> > > > > > > > > Hi Dave,
> > > > > > > > >
> > > > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > > > > > > > > > * Babu Moger (babu.mo...@amd.com) wrote:
> > > > > > > > > >> To support some of the complex topology, we
> > > > > > > > > >> introduced EPYC
> > > mode apicid decode.
> > > > > > > > > >> But, EPYC mode decode is running into problems. Also
> > > > > > > > > >> it can become quite a maintenance problem in the
> > > > > > > > > >> future. So, it was decided to remove that code and
> > > > > > > > > >> use the generic decode which works for majority of
> > > > > > > > > >> the topology. Most of the SPECed configuration would
> > > > > > > > > >> work just fine. With some
> > > non-SPECed user inputs, it will create some sub-optimal configuration.
> > > > > > > > > >> Here is the discussion thread.
> > > > > > > > > >> https://nam11.safelinks.protection.outlook.com/?url=h
> > > > > > > > > >> ttps
> > > > > > > > > >> %3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-
> > > 1d84-a6e
> > > > > > > > > >> 7-e468-
> > > d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.
> > > > > > > > > >>
> > > moger%40amd.com%7C9b15ee395daa4935640408d84acedf13%7C3dd8
> > > > > > > > > >>
> > > 961fe4884e608e11a82d994e183d%7C0%7C0%7C637341599663177545
> > > > > > > > > >>
> > > sdata=4okYGU%2F8QTYqEOZEd1EBC%2BEsIIrEV59HZrHzpbsR8s
> > > > > > > > > >> U%3Dreserved=0
> > > > > > > > > >>
> > > > > > > > > >> This series removes all the EPYC mode specific apicid
> > > > > > > > > >> changes
> > > and use the generic
> > > > > > > > > >> apicid decode.
> > > > > > > > > >
> > > > > > > > > > Hi Babu,
> > > > > > > > > >   This does simplify things a lot!
> > > > > > > > > > One worry, what happens about a live migration of a VM
> > > > > > > > > > from
> > > an old qemu
> > > > > > > > > > that was using the node-id to a qemu with this new scheme?
> > > > > > > > >
> > > > > > > > > The node_id which we introduced was only used
> > > > > > > > > internally. This
> > > wasn't
> > > > > > > > > exposed outside. I don't think live migration will be an 
> > > > > > > > > issue.
> > > > > > > >
> > > > > > > > Didn't it beco

RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Babu Moger



> -Original Message-
> From: Igor Mammedov 
> Sent: Friday, August 28, 2020 6:25 AM
> To: Daniel P. Berrangé 
> Cc: Moger, Babu ; Dr. David Alan Gilbert
> ; ehabk...@redhat.com; m...@redhat.com; Michal
> Privoznik ; qemu-devel@nongnu.org;
> pbonz...@redhat.com; r...@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> generic decode
> 
> On Fri, 28 Aug 2020 09:58:03 +0100
> Daniel P. Berrangé  wrote:
> 
> > On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 13:45:51 -0500
> > > Babu Moger  wrote:
> > >
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Dr. David Alan Gilbert 
> > > > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > > > To: Moger, Babu 
> > > > > Cc: Igor Mammedov ; Daniel P. Berrangé
> > > > > ; ehabk...@redhat.com; m...@redhat.com;
> > > > > Michal Privoznik ; qemu-
> de...@nongnu.org;
> > > > > pbonz...@redhat.com; r...@twiddle.net
> > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and
> > > > > use generic decode
> > > > >
> > > > > * Babu Moger (babu.mo...@amd.com) wrote:
> > > > > >
> > > > > > > -Original Message-
> > > > > > > From: Igor Mammedov 
> > > > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > > > To: Daniel P. Berrangé 
> > > > > > > Cc: Moger, Babu ;
> pbonz...@redhat.com;
> > > > > > > r...@twiddle.net; ehabk...@redhat.com; qemu-
> de...@nongnu.org;
> > > > > > > m...@redhat.com; Michal Privoznik 
> > > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode
> > > > > > > and use generic decode
> > > > > > >
> > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100 Daniel P. Berrangé
> > > > > > >  wrote:
> > > > > > >
> > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov
> wrote:
> > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > > To support some of the complex topology, we introduced
> > > > > > > > > > EPYC mode
> > > > > > > apicid decode.
> > > > > > > > > > But, EPYC mode decode is running into problems. Also
> > > > > > > > > > it can become quite a maintenance problem in the
> > > > > > > > > > future. So, it was decided to remove that code and use
> > > > > > > > > > the generic decode which works for majority of the
> > > > > > > > > > topology. Most of the SPECed configuration would work
> > > > > > > > > > just fine. With some non-SPECed user inputs, it will
> > > > > > > > > > create some sub-
> > > > > > > optimal configuration.
> > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=ht
> > > > > > > > > > tps%3A%252
> > > > > > > > > > F%2F
> > > > > > > > > > lore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-
> e468
> > > > > > > > > > -
> > > > > > > d5b437c1b25
> > > > > > > > > >
> > > > > > >
> > > > >
> 4%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C8a
> 5c
> > > > > > > 52f92
> > > > > > > > > >
> > > > > > >
> > > > >
> 3f04082a40808d849c43d49%7C3dd8961fe4884e608e11a82d994e183d%7C0%7
> > > > > > > C0
> > > > > > > > > >
> > > > > > >
> > > > >
> %7C637340454473508873sdata=VnW28H1v4XwK3GaNGFxu%2BhwiMe
> A
> > > > > > > YO%2B
> > > > > > > > > > 3WAzo3DeY5Ha8%3Dreserved=0
> > > > > > > > > >
> > > > > > > > > > This series removes all the EPYC mode specific apicid
> > > > > > > > > > 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Igor Mammedov
On Fri, 28 Aug 2020 09:48:30 +0100
"Dr. David Alan Gilbert"  wrote:

> * Igor Mammedov (imamm...@redhat.com) wrote:
> > On Wed, 26 Aug 2020 15:10:46 +0100
> > "Dr. David Alan Gilbert"  wrote:
> >   
> > > * Igor Mammedov (imamm...@redhat.com) wrote:  
> > > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > > "Dr. David Alan Gilbert"  wrote:
> > > >   
> > > > > * Igor Mammedov (imamm...@redhat.com) wrote:  
> > > > > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > > > > "Dr. David Alan Gilbert"  wrote:
> > > > > > 
> > > > > > > * Babu Moger (babu.mo...@amd.com) wrote:
> > > > > > > > Hi Dave,
> > > > > > > > 
> > > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:  
> > > > > > > > > * Babu Moger (babu.mo...@amd.com) wrote:  
> > > > > > > > >> To support some of the complex topology, we introduced EPYC 
> > > > > > > > >> mode apicid decode.
> > > > > > > > >> But, EPYC mode decode is running into problems. Also it can 
> > > > > > > > >> become quite a
> > > > > > > > >> maintenance problem in the future. So, it was decided to 
> > > > > > > > >> remove that code and
> > > > > > > > >> use the generic decode which works for majority of the 
> > > > > > > > >> topology. Most of the
> > > > > > > > >> SPECed configuration would work just fine. With some 
> > > > > > > > >> non-SPECed user inputs,
> > > > > > > > >> it will create some sub-optimal configuration.
> > > > > > > > >> Here is the discussion thread.
> > > > > > > > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C74d90724af9c4adcc75008d8485d4d16%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637338912853492167sdata=GTsMKcpeYXAA0CvpLTirPHKdNSdlJE3RuPjCtSyWtGQ%3Dreserved=0
> > > > > > > > >>
> > > > > > > > >> This series removes all the EPYC mode specific apicid 
> > > > > > > > >> changes and use the generic
> > > > > > > > >> apicid decode.  
> > > > > > > > > 
> > > > > > > > > Hi Babu,
> > > > > > > > >   This does simplify things a lot!
> > > > > > > > > One worry, what happens about a live migration of a VM from 
> > > > > > > > > an old qemu
> > > > > > > > > that was using the node-id to a qemu with this new scheme?
> > > > > > > > >   
> > > > > > > > 
> > > > > > > > The node_id which we introduced was only used internally. This 
> > > > > > > > wasn't
> > > > > > > > exposed outside. I don't think live migration will be an issue. 
> > > > > > > >  
> > > > > > > 
> > > > > > > Didn't it become part of the APIC ID visible to the guest?
> > > > > > 
> > > > > > Daniel asked similar question wrt hard error on start up,
> > > > > > when CLI is not sufficient to create EPYC cpu.
> > > > > > 
> > > > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > > > 
> > > > > > Migration might fall into the same category.
> > > > > > Also looking at the history, 5.0 commit 
> > > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC 
> > > > > > based cpus models
> > > > > > silently broke APIC ID (without versioning), for all EPYC models 
> > > > > > (that's were 1 new and 1 old one).
> > > > > > 
> > > > > > (I'm not aware of somebody complaining about it)
> > > > > > 
> > > > > > Another commit ed78467a21459, changed CPUID_8000_001E without 
> > > > > > versioning as well.
> > > > > > 
> > > > > > 
> > > > > > With current EPYC apicid code, if all starts align (no numa or 1 
> > > > > > numa node only on
> > > > > > CLI and no -smp dies=) it might produce a valid CPU 
> > > > > > (apicid+CPUID_8000_001E).
> > > > > > No numa is gray area, since EPYC spec implies that it has to be 
> > > > > > numa machine in case of real EPYC cpus.
> > > > > > Multi-node configs would be correct only if user assigns cpus to 
> > > > > > numa nodes
> > > > > > by duplicating internal node_id algorithm that this series removes.
> > > > > > 
> > > > > > There might be other broken cases that I don't recall anymore
> > > > > > (should be mentioned in previous versions of this series)
> > > > > > 
> > > > > > 
> > > > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > > > 
> > > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration
> > > > > 
> > > > > Oh 
> > > > >   
> > > > > >  2) with this series (lets call it qemu 5.2)
> > > > > >  pre-5.0 ==> qemu 5.2 - should work as series basically 
> > > > > > rollbacks current code to pre-5.0
> > > > > >  qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > > > 
> > > > > > It's all about picking which poison to choose,
> > > > > > I'd preffer 2nd case as it lets drop a lot of complicated code that
> > > > > > doesn't work as expected.
> > > > > 
> > > > > I think that would make our lives easier for other reasons; so I'm 
> > > > > happy
> > > > > to go with that.  
> > > > 
> > > > to make things less painful for users, me wonders if there is a way
> > > > to block 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Igor Mammedov
On Fri, 28 Aug 2020 09:58:03 +0100
Daniel P. Berrangé  wrote:

> On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 13:45:51 -0500
> > Babu Moger  wrote:
> >   
> > > 
> > >   
> > > > -Original Message-
> > > > From: Dr. David Alan Gilbert 
> > > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > > To: Moger, Babu 
> > > > Cc: Igor Mammedov ; Daniel P. Berrangé
> > > > ; ehabk...@redhat.com; m...@redhat.com; Michal
> > > > Privoznik ; qemu-devel@nongnu.org;
> > > > pbonz...@redhat.com; r...@twiddle.net
> > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use 
> > > > generic
> > > > decode
> > > > 
> > > > * Babu Moger (babu.mo...@amd.com) wrote:  
> > > > >  
> > > > > > -Original Message-
> > > > > > From: Igor Mammedov 
> > > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > > To: Daniel P. Berrangé 
> > > > > > Cc: Moger, Babu ; pbonz...@redhat.com;
> > > > > > r...@twiddle.net; ehabk...@redhat.com; qemu-devel@nongnu.org;
> > > > > > m...@redhat.com; Michal Privoznik 
> > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > > > > generic decode
> > > > > >
> > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > Daniel P. Berrangé  wrote:
> > > > > >  
> > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > > >  wrote:
> > > > > > > >  
> > > > > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > > > > mode  
> > > > > > apicid decode.  
> > > > > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > > > > decided to remove that code and use the generic decode which
> > > > > > > > > works for majority of the topology. Most of the SPECed
> > > > > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > > > > inputs, it will create some sub-  
> > > > > > optimal configuration.  
> > > > > > > > > Here is the discussion thread.
> > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2
> > > > > > > > > F%2F
> > > > > > > > > lore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-  
> > > > > > d5b437c1b25  
> > > > > > > > >  
> > > > > >  
> > > > 4%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C8a5c  
> > > > > > 52f92  
> > > > > > > > >  
> > > > > >  
> > > > 3f04082a40808d849c43d49%7C3dd8961fe4884e608e11a82d994e183d%7C0%7  
> > > > > > C0  
> > > > > > > > >  
> > > > > >  
> > > > %7C637340454473508873sdata=VnW28H1v4XwK3GaNGFxu%2BhwiMeA  
> > > > > > YO%2B  
> > > > > > > > > 3WAzo3DeY5Ha8%3Dreserved=0
> > > > > > > > >
> > > > > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > > > > and use the generic apicid decode.  
> > > > > > > >
> > > > > > > > the main difference between EPYC and all other CPUs is that it
> > > > > > > > requires numa configuration (it's not optional) so we need an
> > > > > > > > extra  
> > > > > No, That is not true. Because of that assumption we made all these
> > > > > apicid changes. And here we are now.
> > > > >
> > > > > AMD supports varies mixed configurations. In case of EPYC-Rome, we
> > > > > have NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1,
> > > > > basically we have all the cores in a socket under one numa node. This
> > > > > is non-numa configuration.
> > > > > Looking at the various co

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Daniel P . Berrangé
On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 13:45:51 -0500
> Babu Moger  wrote:
> 
> > 
> > 
> > > -Original Message-
> > > From: Dr. David Alan Gilbert 
> > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > To: Moger, Babu 
> > > Cc: Igor Mammedov ; Daniel P. Berrangé
> > > ; ehabk...@redhat.com; m...@redhat.com; Michal
> > > Privoznik ; qemu-devel@nongnu.org;
> > > pbonz...@redhat.com; r...@twiddle.net
> > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> > > decode
> > > 
> > > * Babu Moger (babu.mo...@amd.com) wrote:
> > > >
> > > > > -Original Message-
> > > > > From: Igor Mammedov 
> > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > To: Daniel P. Berrangé 
> > > > > Cc: Moger, Babu ; pbonz...@redhat.com;
> > > > > r...@twiddle.net; ehabk...@redhat.com; qemu-devel@nongnu.org;
> > > > > m...@redhat.com; Michal Privoznik 
> > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > > > generic decode
> > > > >
> > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > Daniel P. Berrangé  wrote:
> > > > >
> > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > >  wrote:
> > > > > > >
> > > > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > > > mode
> > > > > apicid decode.
> > > > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > > > decided to remove that code and use the generic decode which
> > > > > > > > works for majority of the topology. Most of the SPECed
> > > > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > > > inputs, it will create some sub-
> > > > > optimal configuration.
> > > > > > > > Here is the discussion thread.
> > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2
> > > > > > > > F%2F
> > > > > > > > lore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-
> > > > > d5b437c1b25
> > > > > > > >
> > > > >
> > > 4%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C8a5c
> > > > > 52f92
> > > > > > > >
> > > > >
> > > 3f04082a40808d849c43d49%7C3dd8961fe4884e608e11a82d994e183d%7C0%7
> > > > > C0
> > > > > > > >
> > > > >
> > > %7C637340454473508873sdata=VnW28H1v4XwK3GaNGFxu%2BhwiMeA
> > > > > YO%2B
> > > > > > > > 3WAzo3DeY5Ha8%3Dreserved=0
> > > > > > > >
> > > > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > > > and use the generic apicid decode.
> > > > > > >
> > > > > > > the main difference between EPYC and all other CPUs is that it
> > > > > > > requires numa configuration (it's not optional) so we need an
> > > > > > > extra
> > > > No, That is not true. Because of that assumption we made all these
> > > > apicid changes. And here we are now.
> > > >
> > > > AMD supports varies mixed configurations. In case of EPYC-Rome, we
> > > > have NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1,
> > > > basically we have all the cores in a socket under one numa node. This
> > > > is non-numa configuration.
> > > > Looking at the various configurations and also discussing internally,
> > > > it is not advisable to have (epyc && !numa) check.
> > > 
> > > Indeed on real hardware, I don't think we always see NUMA; my single 
> > > socket,
> > > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No NUMA
> > > configuration found...Faking a node.'
> looks like firmware bug or maybe it's feature and there is a knob in fw
> to turn it on/off in case used OS doesn't like it for some reas

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Daniel P . Berrangé
On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> On Thu, 27 Aug 2020 15:07:52 -0400
> Eduardo Habkost  wrote:
> 
> > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > Daniel P. Berrangé  wrote:
> > > 
> > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > Daniel P. Berrangé  wrote:
> > > > > 
> > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > Daniel P. Berrangé  wrote:
> > > > > > >   
> > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > Babu Moger  wrote:
> > > > > > > > > 
> > > > > > > > > > To support some of the complex topology, we introduced EPYC 
> > > > > > > > > > mode apicid decode.
> > > > > > > > > > But, EPYC mode decode is running into problems. Also it can 
> > > > > > > > > > become quite a
> > > > > > > > > > maintenance problem in the future. So, it was decided to 
> > > > > > > > > > remove that code and
> > > > > > > > > > use the generic decode which works for majority of the 
> > > > > > > > > > topology. Most of the
> > > > > > > > > > SPECed configuration would work just fine. With some 
> > > > > > > > > > non-SPECed user inputs,
> > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > > > > > > > > 
> > > > > > > > > > This series removes all the EPYC mode specific apicid 
> > > > > > > > > > changes and use the generic
> > > > > > > > > > apicid decode.
> > > > > > > > > 
> > > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > so we need an extra patch on top of this series to enfoce 
> > > > > > > > > that, i.e:
> > > > > > > > > 
> > > > > > > > >  if (epyc && !numa) 
> > > > > > > > > error("EPYC cpu requires numa to be configured")
> > > > > > > > 
> > > > > > > > Please no. This will break 90% of current usage of the EPYC CPU 
> > > > > > > > in
> > > > > > > > real world QEMU deployments. That is way too user hostile to 
> > > > > > > > introduce
> > > > > > > > as a requirement.
> > > > > > > > 
> > > > > > > > Why do we need to force this ?  People have been successfuly 
> > > > > > > > using
> > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > 
> > > > > > > > It might not match behaviour of bare metal silicon, but that 
> > > > > > > > hasn't
> > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > > anyone reporting crashes yet.
> > > > > > > 
> > > > > > > 
> > > > > > > What other options do we have?
> > > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > while new ones would require -numa and produce correct topology.  
> > > > > > 
> > > > > > No, tieing this to machine types is not viable either. That is still
> > > > > > going to break essentially every single management application that
> > > > > > exists today using QEMU.
> > > > > for that we have deprecation process, so users could switch to new CLI
> > > > > that would be required.
> > > > 
> > > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > > 
> > > > There are so many places where we diverge from what bare metal would
> > > > do, that I don't see a good reason to introduce this breakage, even
> > > > if we notify users via a deprecation message. 
> > > I find (3) and (4) good enough reasons to use deprecation.
> > > 
> > > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > > create a single NUMA node if none was specified for new machine
> > > > types, such that there is no visible change or breakage to any
> > > > mgmt apps.  
> > > 
> > > (1) for configs that started without -numa &&|| without -smp dies>1,
> > >   QEMU can do just that (enable auto_enable_numa).
> > 
> > Why exactly do we need auto_enable_numa with dies=1?
> > 
> > If I understand correctly, Babu said earlier in this thread[1]
> > that we don't need auto_enable_numa.
> > 
> > [1] 
> > https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d60...@amd.com/
> 
> in case of 1 die, -numa is not must have as it's one numa node only.
> Though having auto_enable_numa, will allow to reuse the CPU.node-id property
> to compose CPUID_Fn801E_ECX. i.e only code one path vs numa|non-numa 
> variant.
> 
>  
> > > (2) As for configs that are out of spec, I do not care much 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Dr. David Alan Gilbert
* Igor Mammedov (imamm...@redhat.com) wrote:
> On Wed, 26 Aug 2020 15:10:46 +0100
> "Dr. David Alan Gilbert"  wrote:
> 
> > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > "Dr. David Alan Gilbert"  wrote:
> > > 
> > > > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > > > "Dr. David Alan Gilbert"  wrote:
> > > > >   
> > > > > > * Babu Moger (babu.mo...@amd.com) wrote:  
> > > > > > > Hi Dave,
> > > > > > > 
> > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > > > > > > > * Babu Moger (babu.mo...@amd.com) wrote:
> > > > > > > >> To support some of the complex topology, we introduced EPYC 
> > > > > > > >> mode apicid decode.
> > > > > > > >> But, EPYC mode decode is running into problems. Also it can 
> > > > > > > >> become quite a
> > > > > > > >> maintenance problem in the future. So, it was decided to 
> > > > > > > >> remove that code and
> > > > > > > >> use the generic decode which works for majority of the 
> > > > > > > >> topology. Most of the
> > > > > > > >> SPECed configuration would work just fine. With some 
> > > > > > > >> non-SPECed user inputs,
> > > > > > > >> it will create some sub-optimal configuration.
> > > > > > > >> Here is the discussion thread.
> > > > > > > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C74d90724af9c4adcc75008d8485d4d16%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637338912853492167sdata=GTsMKcpeYXAA0CvpLTirPHKdNSdlJE3RuPjCtSyWtGQ%3Dreserved=0
> > > > > > > >>
> > > > > > > >> This series removes all the EPYC mode specific apicid changes 
> > > > > > > >> and use the generic
> > > > > > > >> apicid decode.
> > > > > > > > 
> > > > > > > > Hi Babu,
> > > > > > > >   This does simplify things a lot!
> > > > > > > > One worry, what happens about a live migration of a VM from an 
> > > > > > > > old qemu
> > > > > > > > that was using the node-id to a qemu with this new scheme?
> > > > > > > 
> > > > > > > The node_id which we introduced was only used internally. This 
> > > > > > > wasn't
> > > > > > > exposed outside. I don't think live migration will be an issue.   
> > > > > > >  
> > > > > > 
> > > > > > Didn't it become part of the APIC ID visible to the guest?  
> > > > > 
> > > > > Daniel asked similar question wrt hard error on start up,
> > > > > when CLI is not sufficient to create EPYC cpu.
> > > > > 
> > > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > > 
> > > > > Migration might fall into the same category.
> > > > > Also looking at the history, 5.0 commit 
> > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC 
> > > > > based cpus models
> > > > > silently broke APIC ID (without versioning), for all EPYC models 
> > > > > (that's were 1 new and 1 old one).
> > > > > 
> > > > > (I'm not aware of somebody complaining about it)
> > > > > 
> > > > > Another commit ed78467a21459, changed CPUID_8000_001E without 
> > > > > versioning as well.
> > > > > 
> > > > > 
> > > > > With current EPYC apicid code, if all starts align (no numa or 1 numa 
> > > > > node only on
> > > > > CLI and no -smp dies=) it might produce a valid CPU 
> > > > > (apicid+CPUID_8000_001E).
> > > > > No numa is gray area, since EPYC spec implies that it has to be numa 
> > > > > machine in case of real EPYC cpus.
> > > > > Multi-node configs would be correct only if user assigns cpus to numa 
> > > > > nodes
> > > > > by duplicating internal node_id algorithm that this series removes.
> > > > > 
> > > > > There might be other broken cases that I don't recall anymore
> > > > > (should be mentioned in previous versions of this series)
> > > > > 
> > > > > 
> > > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > > 
> > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> > > > 
> > > > Oh 
> > > > 
> > > > >  2) with this series (lets call it qemu 5.2)
> > > > >  pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks 
> > > > > current code to pre-5.0
> > > > >  qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > > 
> > > > > It's all about picking which poison to choose,
> > > > > I'd preffer 2nd case as it lets drop a lot of complicated code that
> > > > > doesn't work as expected.  
> > > > 
> > > > I think that would make our lives easier for other reasons; so I'm happy
> > > > to go with that.
> > > 
> > > to make things less painful for users, me wonders if there is a way
> > > to block migration if epyc and specific QEMU versions are used?
> > 
> > We have no way to block based on version - and that's a pretty painful
> > thing to do; we can block based on machine type.
> > 
> > But before we get there; can we understand in which combinations that
> > things break and why exactly - would it break on a 1 or 2 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-28 Thread Igor Mammedov
On Thu, 27 Aug 2020 17:58:01 -0500
Babu Moger  wrote:

> > -Original Message-
> > From: Igor Mammedov 
> > Sent: Thursday, August 27, 2020 4:19 PM
> > To: Dr. David Alan Gilbert 
> > Cc: ehabk...@redhat.com; m...@redhat.com; qemu-devel@nongnu.org;
> > Moger, Babu ; pbonz...@redhat.com;
> > r...@twiddle.net
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > generic decode
> > 
> > On Wed, 26 Aug 2020 15:10:46 +0100
> > "Dr. David Alan Gilbert"  wrote:
> >   
> > > * Igor Mammedov (imamm...@redhat.com) wrote:  
> > > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > > "Dr. David Alan Gilbert"  wrote:
> > > >  
> > > > > * Igor Mammedov (imamm...@redhat.com) wrote:  
> > > > > > On Tue, 25 Aug 2020 09:15:04 +0100 "Dr. David Alan Gilbert"
> > > > > >  wrote:
> > > > > >  
> > > > > > > * Babu Moger (babu.mo...@amd.com) wrote:  
> > > > > > > > Hi Dave,
> > > > > > > >
> > > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:  
> > > > > > > > > * Babu Moger (babu.mo...@amd.com) wrote:  
> > > > > > > > >> To support some of the complex topology, we introduced EPYC  
> > mode apicid decode.  
> > > > > > > > >> But, EPYC mode decode is running into problems. Also it
> > > > > > > > >> can become quite a maintenance problem in the future. So,
> > > > > > > > >> it was decided to remove that code and use the generic
> > > > > > > > >> decode which works for majority of the topology. Most of
> > > > > > > > >> the SPECed configuration would work just fine. With some  
> > non-SPECed user inputs, it will create some sub-optimal configuration.  
> > > > > > > > >> Here is the discussion thread.
> > > > > > > > >> https://nam11.safelinks.protection.outlook.com/?url=https
> > > > > > > > >> %3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-  
> > 1d84-a6e  
> > > > > > > > >> 7-e468-  
> > d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.  
> > > > > > > > >>  
> > moger%40amd.com%7C9b15ee395daa4935640408d84acedf13%7C3dd8  
> > > > > > > > >>  
> > 961fe4884e608e11a82d994e183d%7C0%7C0%7C637341599663177545  
> > > > > > > > >>  
> > sdata=4okYGU%2F8QTYqEOZEd1EBC%2BEsIIrEV59HZrHzpbsR8s  
> > > > > > > > >> U%3Dreserved=0
> > > > > > > > >>
> > > > > > > > >> This series removes all the EPYC mode specific apicid 
> > > > > > > > >> changes  
> > and use the generic  
> > > > > > > > >> apicid decode.  
> > > > > > > > >
> > > > > > > > > Hi Babu,
> > > > > > > > >   This does simplify things a lot!
> > > > > > > > > One worry, what happens about a live migration of a VM from  
> > an old qemu  
> > > > > > > > > that was using the node-id to a qemu with this new scheme?  
> > > > > > > >
> > > > > > > > The node_id which we introduced was only used internally. This  
> > wasn't  
> > > > > > > > exposed outside. I don't think live migration will be an issue. 
> > > > > > > >  
> > > > > > >
> > > > > > > Didn't it become part of the APIC ID visible to the guest?  
> > > > > >
> > > > > > Daniel asked similar question wrt hard error on start up, when
> > > > > > CLI is not sufficient to create EPYC cpu.
> > > > > >
> > > > > >  
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%  
> > > > > > 2Fwww.mail-archive.com%2Fqemu-  
> > devel%40nongnu.org%2Fmsg728536.htm  
> > > > > >  
> > ldata=02%7C01%7Cbabu.moger%40amd.com%7C9b15ee395daa49356
> > 404  
> > > > > >  
> > 08d84acedf13%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63734
> > 1  
> > > > > >  
> > 599663177545sdata=OnHz23W4F4TdYwlxPZwC%2B8YRY1K3qJ5U9Sfdo
> > Oc  
> > > > > > GXtw%3Dreserved=0
&g

RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-27 Thread Babu Moger



> -Original Message-
> From: Igor Mammedov 
> Sent: Thursday, August 27, 2020 4:19 PM
> To: Dr. David Alan Gilbert 
> Cc: ehabk...@redhat.com; m...@redhat.com; qemu-devel@nongnu.org;
> Moger, Babu ; pbonz...@redhat.com;
> r...@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> generic decode
> 
> On Wed, 26 Aug 2020 15:10:46 +0100
> "Dr. David Alan Gilbert"  wrote:
> 
> > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > "Dr. David Alan Gilbert"  wrote:
> > >
> > > > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > > > On Tue, 25 Aug 2020 09:15:04 +0100 "Dr. David Alan Gilbert"
> > > > >  wrote:
> > > > >
> > > > > > * Babu Moger (babu.mo...@amd.com) wrote:
> > > > > > > Hi Dave,
> > > > > > >
> > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > > > > > > > * Babu Moger (babu.mo...@amd.com) wrote:
> > > > > > > >> To support some of the complex topology, we introduced EPYC
> mode apicid decode.
> > > > > > > >> But, EPYC mode decode is running into problems. Also it
> > > > > > > >> can become quite a maintenance problem in the future. So,
> > > > > > > >> it was decided to remove that code and use the generic
> > > > > > > >> decode which works for majority of the topology. Most of
> > > > > > > >> the SPECed configuration would work just fine. With some
> non-SPECed user inputs, it will create some sub-optimal configuration.
> > > > > > > >> Here is the discussion thread.
> > > > > > > >> https://nam11.safelinks.protection.outlook.com/?url=https
> > > > > > > >> %3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-
> 1d84-a6e
> > > > > > > >> 7-e468-
> d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.
> > > > > > > >>
> moger%40amd.com%7C9b15ee395daa4935640408d84acedf13%7C3dd8
> > > > > > > >>
> 961fe4884e608e11a82d994e183d%7C0%7C0%7C637341599663177545
> > > > > > > >>
> sdata=4okYGU%2F8QTYqEOZEd1EBC%2BEsIIrEV59HZrHzpbsR8s
> > > > > > > >> U%3Dreserved=0
> > > > > > > >>
> > > > > > > >> This series removes all the EPYC mode specific apicid changes
> and use the generic
> > > > > > > >> apicid decode.
> > > > > > > >
> > > > > > > > Hi Babu,
> > > > > > > >   This does simplify things a lot!
> > > > > > > > One worry, what happens about a live migration of a VM from
> an old qemu
> > > > > > > > that was using the node-id to a qemu with this new scheme?
> > > > > > >
> > > > > > > The node_id which we introduced was only used internally. This
> wasn't
> > > > > > > exposed outside. I don't think live migration will be an issue.
> > > > > >
> > > > > > Didn't it become part of the APIC ID visible to the guest?
> > > > >
> > > > > Daniel asked similar question wrt hard error on start up, when
> > > > > CLI is not sufficient to create EPYC cpu.
> > > > >
> > > > >
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%
> > > > > 2Fwww.mail-archive.com%2Fqemu-
> devel%40nongnu.org%2Fmsg728536.htm
> > > > >
> ldata=02%7C01%7Cbabu.moger%40amd.com%7C9b15ee395daa49356
> 404
> > > > >
> 08d84acedf13%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63734
> 1
> > > > >
> 599663177545sdata=OnHz23W4F4TdYwlxPZwC%2B8YRY1K3qJ5U9Sfdo
> Oc
> > > > > GXtw%3Dreserved=0
> > > > >
> > > > > Migration might fall into the same category.
> > > > > Also looking at the history, 5.0 commit
> > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for
> > > > > EPYC based cpus models silently broke APIC ID (without versioning),
> for all EPYC models (that's were 1 new and 1 old one).
> > > > >
> > > > > (I'm not aware of somebody complaining about it)
> > > > >
> > > > > Another commit ed78467a21459, changed CPUID_8000_001E without
> versioning as well.
>

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-27 Thread Igor Mammedov
On Wed, 26 Aug 2020 15:10:46 +0100
"Dr. David Alan Gilbert"  wrote:

> * Igor Mammedov (imamm...@redhat.com) wrote:
> > On Tue, 25 Aug 2020 16:25:21 +0100
> > "Dr. David Alan Gilbert"  wrote:
> > 
> > > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > > "Dr. David Alan Gilbert"  wrote:
> > > >   
> > > > > * Babu Moger (babu.mo...@amd.com) wrote:  
> > > > > > Hi Dave,
> > > > > > 
> > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > > > > > > * Babu Moger (babu.mo...@amd.com) wrote:
> > > > > > >> To support some of the complex topology, we introduced EPYC mode 
> > > > > > >> apicid decode.
> > > > > > >> But, EPYC mode decode is running into problems. Also it can 
> > > > > > >> become quite a
> > > > > > >> maintenance problem in the future. So, it was decided to remove 
> > > > > > >> that code and
> > > > > > >> use the generic decode which works for majority of the topology. 
> > > > > > >> Most of the
> > > > > > >> SPECed configuration would work just fine. With some non-SPECed 
> > > > > > >> user inputs,
> > > > > > >> it will create some sub-optimal configuration.
> > > > > > >> Here is the discussion thread.
> > > > > > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C74d90724af9c4adcc75008d8485d4d16%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637338912853492167sdata=GTsMKcpeYXAA0CvpLTirPHKdNSdlJE3RuPjCtSyWtGQ%3Dreserved=0
> > > > > > >>
> > > > > > >> This series removes all the EPYC mode specific apicid changes 
> > > > > > >> and use the generic
> > > > > > >> apicid decode.
> > > > > > > 
> > > > > > > Hi Babu,
> > > > > > >   This does simplify things a lot!
> > > > > > > One worry, what happens about a live migration of a VM from an 
> > > > > > > old qemu
> > > > > > > that was using the node-id to a qemu with this new scheme?
> > > > > > 
> > > > > > The node_id which we introduced was only used internally. This 
> > > > > > wasn't
> > > > > > exposed outside. I don't think live migration will be an issue.
> > > > > 
> > > > > Didn't it become part of the APIC ID visible to the guest?  
> > > > 
> > > > Daniel asked similar question wrt hard error on start up,
> > > > when CLI is not sufficient to create EPYC cpu.
> > > > 
> > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > 
> > > > Migration might fall into the same category.
> > > > Also looking at the history, 5.0 commit 
> > > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC 
> > > > based cpus models
> > > > silently broke APIC ID (without versioning), for all EPYC models 
> > > > (that's were 1 new and 1 old one).
> > > > 
> > > > (I'm not aware of somebody complaining about it)
> > > > 
> > > > Another commit ed78467a21459, changed CPUID_8000_001E without 
> > > > versioning as well.
> > > > 
> > > > 
> > > > With current EPYC apicid code, if all starts align (no numa or 1 numa 
> > > > node only on
> > > > CLI and no -smp dies=) it might produce a valid CPU 
> > > > (apicid+CPUID_8000_001E).
> > > > No numa is gray area, since EPYC spec implies that it has to be numa 
> > > > machine in case of real EPYC cpus.
> > > > Multi-node configs would be correct only if user assigns cpus to numa 
> > > > nodes
> > > > by duplicating internal node_id algorithm that this series removes.
> > > > 
> > > > There might be other broken cases that I don't recall anymore
> > > > (should be mentioned in previous versions of this series)
> > > > 
> > > > 
> > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > 
> > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> > > 
> > > Oh 
> > > 
> > > >  2) with this series (lets call it qemu 5.2)
> > > >  pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks 
> > > > current code to pre-5.0
> > > >  qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > 
> > > > It's all about picking which poison to choose,
> > > > I'd preffer 2nd case as it lets drop a lot of complicated code that
> > > > doesn't work as expected.  
> > > 
> > > I think that would make our lives easier for other reasons; so I'm happy
> > > to go with that.
> > 
> > to make things less painful for users, me wonders if there is a way
> > to block migration if epyc and specific QEMU versions are used?
> 
> We have no way to block based on version - and that's a pretty painful
> thing to do; we can block based on machine type.
> 
> But before we get there; can we understand in which combinations that
> things break and why exactly - would it break on a 1 or 2 vCPU guest -
> or would it only break when we get to the point the upper bits start
> being used for example?  Why exaclty would it break - i.e. is it going
> to change the name of sections in the migration stream - or are the
> values we need actually going to migrate 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-27 Thread Igor Mammedov
On Thu, 27 Aug 2020 15:07:52 -0400
Eduardo Habkost  wrote:

> On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 16:03:40 +0100
> > Daniel P. Berrangé  wrote:
> > 
> > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > Daniel P. Berrangé  wrote:
> > > > 
> > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > Daniel P. Berrangé  wrote:
> > > > > >   
> > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > Babu Moger  wrote:
> > > > > > > > 
> > > > > > > > > To support some of the complex topology, we introduced EPYC 
> > > > > > > > > mode apicid decode.
> > > > > > > > > But, EPYC mode decode is running into problems. Also it can 
> > > > > > > > > become quite a
> > > > > > > > > maintenance problem in the future. So, it was decided to 
> > > > > > > > > remove that code and
> > > > > > > > > use the generic decode which works for majority of the 
> > > > > > > > > topology. Most of the
> > > > > > > > > SPECed configuration would work just fine. With some 
> > > > > > > > > non-SPECed user inputs,
> > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > Here is the discussion thread.
> > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > > > > > > > 
> > > > > > > > > This series removes all the EPYC mode specific apicid changes 
> > > > > > > > > and use the generic
> > > > > > > > > apicid decode.
> > > > > > > > 
> > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > so we need an extra patch on top of this series to enfoce that, 
> > > > > > > > i.e:
> > > > > > > > 
> > > > > > > >  if (epyc && !numa) 
> > > > > > > > error("EPYC cpu requires numa to be configured")
> > > > > > > 
> > > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > > real world QEMU deployments. That is way too user hostile to 
> > > > > > > introduce
> > > > > > > as a requirement.
> > > > > > > 
> > > > > > > Why do we need to force this ?  People have been successfuly using
> > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > 
> > > > > > > It might not match behaviour of bare metal silicon, but that 
> > > > > > > hasn't
> > > > > > > obviously caused the world to come crashing down.  
> > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > anyone reporting crashes yet.
> > > > > > 
> > > > > > 
> > > > > > What other options do we have?
> > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > so old configs can keep broken topology (CPUID),
> > > > > > while new ones would require -numa and produce correct topology.  
> > > > > 
> > > > > No, tieing this to machine types is not viable either. That is still
> > > > > going to break essentially every single management application that
> > > > > exists today using QEMU.
> > > > for that we have deprecation process, so users could switch to new CLI
> > > > that would be required.
> > > 
> > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > 
> > > There are so many places where we diverge from what bare metal would
> > > do, that I don't see a good reason to introduce this breakage, even
> > > if we notify users via a deprecation message. 
> > I find (3) and (4) good enough reasons to use deprecation.
> > 
> > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > create a single NUMA node if none was specified for new machine
> > > types, such that there is no visible change or breakage to any
> > > mgmt apps.  
> > 
> > (1) for configs that started without -numa &&|| without -smp dies>1,
> >   QEMU can do just that (enable auto_enable_numa).
> 
> Why exactly do we need auto_enable_numa with dies=1?
> 
> If I understand correctly, Babu said earlier in this thread[1]
> that we don't need auto_enable_numa.
> 
> [1] 
> https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d60...@amd.com/

in case of 1 die, -numa is not must have as it's one numa node only.
Though having auto_enable_numa, will allow to reuse the CPU.node-id property
to compose CPUID_Fn801E_ECX. i.e only code one path vs numa|non-numa 
variant.

 
> > (2) As for configs that are out of spec, I do not care much (junk in - junk 
> > out)
> > (though not having to spend time on bug reports and debug issues, just to 
> > say
> > it's not supported in the end, makes deprecation sound like a reasonable
> > choice)
> > 
> > (3) However if config matches bare metal i.e. CPU has more than 1 die and 
> > within
> > 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-27 Thread Igor Mammedov
On Wed, 26 Aug 2020 13:45:51 -0500
Babu Moger  wrote:

> 
> 
> > -Original Message-
> > From: Dr. David Alan Gilbert 
> > Sent: Wednesday, August 26, 2020 1:34 PM
> > To: Moger, Babu 
> > Cc: Igor Mammedov ; Daniel P. Berrangé
> > ; ehabk...@redhat.com; m...@redhat.com; Michal
> > Privoznik ; qemu-devel@nongnu.org;
> > pbonz...@redhat.com; r...@twiddle.net
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> > decode
> > 
> > * Babu Moger (babu.mo...@amd.com) wrote:
> > >
> > > > -Original Message-
> > > > From: Igor Mammedov 
> > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > To: Daniel P. Berrangé 
> > > > Cc: Moger, Babu ; pbonz...@redhat.com;
> > > > r...@twiddle.net; ehabk...@redhat.com; qemu-devel@nongnu.org;
> > > > m...@redhat.com; Michal Privoznik 
> > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > > generic decode
> > > >
> > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > Daniel P. Berrangé  wrote:
> > > >
> > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > >  wrote:
> > > > > >
> > > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > > mode
> > > > apicid decode.
> > > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > > decided to remove that code and use the generic decode which
> > > > > > > works for majority of the topology. Most of the SPECed
> > > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > > inputs, it will create some sub-
> > > > optimal configuration.
> > > > > > > Here is the discussion thread.
> > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2
> > > > > > > F%2F
> > > > > > > lore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-
> > > > d5b437c1b25
> > > > > > >
> > > >
> > 4%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C8a5c
> > > > 52f92
> > > > > > >
> > > >
> > 3f04082a40808d849c43d49%7C3dd8961fe4884e608e11a82d994e183d%7C0%7
> > > > C0
> > > > > > >
> > > >
> > %7C637340454473508873sdata=VnW28H1v4XwK3GaNGFxu%2BhwiMeA
> > > > YO%2B
> > > > > > > 3WAzo3DeY5Ha8%3Dreserved=0
> > > > > > >
> > > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > > and use the generic apicid decode.
> > > > > >
> > > > > > the main difference between EPYC and all other CPUs is that it
> > > > > > requires numa configuration (it's not optional) so we need an
> > > > > > extra
> > > No, That is not true. Because of that assumption we made all these
> > > apicid changes. And here we are now.
> > >
> > > AMD supports varies mixed configurations. In case of EPYC-Rome, we
> > > have NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1,
> > > basically we have all the cores in a socket under one numa node. This
> > > is non-numa configuration.
> > > Looking at the various configurations and also discussing internally,
> > > it is not advisable to have (epyc && !numa) check.
> > 
> > Indeed on real hardware, I don't think we always see NUMA; my single socket,
> > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No NUMA
> > configuration found...Faking a node.'
looks like firmware bug or maybe it's feature and there is a knob in fw
to turn it on/off in case used OS doesn't like it for some reason.


> > So if real hardware hasn't got a NUMA node, what's the real problem?
> 
> I don't see any problem once we revert all these changes(patch 1-7).
> We don't need if (epyc && !numa) error check or auto_enable_numa=true
> unconditionally.

We need revert to unbreak migration from QEMU < 5.0,
everything else (fixes for CPUID_Fn801E) could go on top.

So what's on top (because old code also wasn't correct when
CPUID_Fn801E is taken in account, tha's why we

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-27 Thread Eduardo Habkost
On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 16:03:40 +0100
> Daniel P. Berrangé  wrote:
> 
> > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > Daniel P. Berrangé  wrote:
> > > 
> > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > Daniel P. Berrangé  wrote:
> > > > >   
> > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > Babu Moger  wrote:
> > > > > > > 
> > > > > > > > To support some of the complex topology, we introduced EPYC 
> > > > > > > > mode apicid decode.
> > > > > > > > But, EPYC mode decode is running into problems. Also it can 
> > > > > > > > become quite a
> > > > > > > > maintenance problem in the future. So, it was decided to remove 
> > > > > > > > that code and
> > > > > > > > use the generic decode which works for majority of the 
> > > > > > > > topology. Most of the
> > > > > > > > SPECed configuration would work just fine. With some non-SPECed 
> > > > > > > > user inputs,
> > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > Here is the discussion thread.
> > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > > > > > > 
> > > > > > > > This series removes all the EPYC mode specific apicid changes 
> > > > > > > > and use the generic
> > > > > > > > apicid decode.
> > > > > > > 
> > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > it requires numa configuration (it's not optional)
> > > > > > > so we need an extra patch on top of this series to enfoce that, 
> > > > > > > i.e:
> > > > > > > 
> > > > > > >  if (epyc && !numa) 
> > > > > > > error("EPYC cpu requires numa to be configured")
> > > > > > 
> > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > real world QEMU deployments. That is way too user hostile to 
> > > > > > introduce
> > > > > > as a requirement.
> > > > > > 
> > > > > > Why do we need to force this ?  People have been successfuly using
> > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > 
> > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > obviously caused the world to come crashing down.  
> > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > anyone reporting crashes yet.
> > > > > 
> > > > > 
> > > > > What other options do we have?
> > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > so old configs can keep broken topology (CPUID),
> > > > > while new ones would require -numa and produce correct topology.  
> > > > 
> > > > No, tieing this to machine types is not viable either. That is still
> > > > going to break essentially every single management application that
> > > > exists today using QEMU.
> > > for that we have deprecation process, so users could switch to new CLI
> > > that would be required.
> > 
> > We could, but I don't find the cost/benefit tradeoff is compelling.
> > 
> > There are so many places where we diverge from what bare metal would
> > do, that I don't see a good reason to introduce this breakage, even
> > if we notify users via a deprecation message. 
> I find (3) and (4) good enough reasons to use deprecation.
> 
> > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > create a single NUMA node if none was specified for new machine
> > types, such that there is no visible change or breakage to any
> > mgmt apps.  
> 
> (1) for configs that started without -numa &&|| without -smp dies>1,
>   QEMU can do just that (enable auto_enable_numa).

Why exactly do we need auto_enable_numa with dies=1?

If I understand correctly, Babu said earlier in this thread[1]
that we don't need auto_enable_numa.

[1] 
https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d60...@amd.com/

> 
> (2) As for configs that are out of spec, I do not care much (junk in - junk 
> out)
> (though not having to spend time on bug reports and debug issues, just to say
> it's not supported in the end, makes deprecation sound like a reasonable
> choice)
> 
> (3) However if config matches bare metal i.e. CPU has more than 1 die and 
> within
> dies limits (spec wise), QEMU has to produce valid CPUs.
> In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> on user's behalf. That's where we have to error out and ask for explicit
> numa configuration.
> 
> For such configs, current code (since 5.0), will produce in the best case
> performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> in the worst case issues might be related to invalid APIC ID if running on 
> EPYC host
> and HW takes in 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-27 Thread Igor Mammedov
On Wed, 26 Aug 2020 16:03:40 +0100
Daniel P. Berrangé  wrote:

> On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 14:36:38 +0100
> > Daniel P. Berrangé  wrote:
> > 
> > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > Daniel P. Berrangé  wrote:
> > > >   
> > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > Babu Moger  wrote:
> > > > > > 
> > > > > > > To support some of the complex topology, we introduced EPYC mode 
> > > > > > > apicid decode.
> > > > > > > But, EPYC mode decode is running into problems. Also it can 
> > > > > > > become quite a
> > > > > > > maintenance problem in the future. So, it was decided to remove 
> > > > > > > that code and
> > > > > > > use the generic decode which works for majority of the topology. 
> > > > > > > Most of the
> > > > > > > SPECed configuration would work just fine. With some non-SPECed 
> > > > > > > user inputs,
> > > > > > > it will create some sub-optimal configuration.
> > > > > > > Here is the discussion thread.
> > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > > > > > 
> > > > > > > This series removes all the EPYC mode specific apicid changes and 
> > > > > > > use the generic
> > > > > > > apicid decode.
> > > > > > 
> > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > it requires numa configuration (it's not optional)
> > > > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > > > 
> > > > > >  if (epyc && !numa) 
> > > > > > error("EPYC cpu requires numa to be configured")
> > > > > 
> > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > as a requirement.
> > > > > 
> > > > > Why do we need to force this ?  People have been successfuly using
> > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > 
> > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > obviously caused the world to come crashing down.  
> > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > (resulting performance might be suboptimal), but I haven't seen
> > > > anyone reporting crashes yet.
> > > > 
> > > > 
> > > > What other options do we have?
> > > > Perhaps we can turn on strict check for new machine types only,
> > > > so old configs can keep broken topology (CPUID),
> > > > while new ones would require -numa and produce correct topology.  
> > > 
> > > No, tieing this to machine types is not viable either. That is still
> > > going to break essentially every single management application that
> > > exists today using QEMU.
> > for that we have deprecation process, so users could switch to new CLI
> > that would be required.
> 
> We could, but I don't find the cost/benefit tradeoff is compelling.
> 
> There are so many places where we diverge from what bare metal would
> do, that I don't see a good reason to introduce this breakage, even
> if we notify users via a deprecation message. 
I find (3) and (4) good enough reasons to use deprecation.

> If QEMU wants to require NUMA for EPYC, then QEMU could internally
> create a single NUMA node if none was specified for new machine
> types, such that there is no visible change or breakage to any
> mgmt apps.  

(1) for configs that started without -numa &&|| without -smp dies>1,
  QEMU can do just that (enable auto_enable_numa).

(2) As for configs that are out of spec, I do not care much (junk in - junk out)
(though not having to spend time on bug reports and debug issues, just to say
it's not supported in the end, makes deprecation sound like a reasonable
choice)

(3) However if config matches bare metal i.e. CPU has more than 1 die and within
dies limits (spec wise), QEMU has to produce valid CPUs.
In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
on user's behalf. That's where we have to error out and ask for explicit
numa configuration.

For such configs, current code (since 5.0), will produce in the best case
performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
in the worst case issues might be related to invalid APIC ID if running on EPYC 
host
and HW takes in account subfields of APIC ID (according to Babu real CPU uses
die_id(aka node_id) internally).
I'd rather error out on nonsense configs earlier than debug such issues
and than error out anyways later (upsetting more users).

(4)
If I were non hobby user, I'd hate if QEMU allowed me to start invalid config,
that I'd have to spend time on debugging issues (including performance ones),
instead of clearly telling me what's wrong and how config should be corrected.
I'd probably jump to another hypervisor that does the job right,
instead of 

RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Babu Moger



> -Original Message-
> From: Dr. David Alan Gilbert 
> Sent: Wednesday, August 26, 2020 1:34 PM
> To: Moger, Babu 
> Cc: Igor Mammedov ; Daniel P. Berrangé
> ; ehabk...@redhat.com; m...@redhat.com; Michal
> Privoznik ; qemu-devel@nongnu.org;
> pbonz...@redhat.com; r...@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> decode
> 
> * Babu Moger (babu.mo...@amd.com) wrote:
> >
> > > -Original Message-
> > > From: Igor Mammedov 
> > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > To: Daniel P. Berrangé 
> > > Cc: Moger, Babu ; pbonz...@redhat.com;
> > > r...@twiddle.net; ehabk...@redhat.com; qemu-devel@nongnu.org;
> > > m...@redhat.com; Michal Privoznik 
> > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > generic decode
> > >
> > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > Daniel P. Berrangé  wrote:
> > >
> > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > >  wrote:
> > > > >
> > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > mode
> > > apicid decode.
> > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > decided to remove that code and use the generic decode which
> > > > > > works for majority of the topology. Most of the SPECed
> > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > inputs, it will create some sub-
> > > optimal configuration.
> > > > > > Here is the discussion thread.
> > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2
> > > > > > F%2F
> > > > > > lore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-
> > > d5b437c1b25
> > > > > >
> > >
> 4%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C8a5c
> > > 52f92
> > > > > >
> > >
> 3f04082a40808d849c43d49%7C3dd8961fe4884e608e11a82d994e183d%7C0%7
> > > C0
> > > > > >
> > >
> %7C637340454473508873sdata=VnW28H1v4XwK3GaNGFxu%2BhwiMeA
> > > YO%2B
> > > > > > 3WAzo3DeY5Ha8%3Dreserved=0
> > > > > >
> > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > and use the generic apicid decode.
> > > > >
> > > > > the main difference between EPYC and all other CPUs is that it
> > > > > requires numa configuration (it's not optional) so we need an
> > > > > extra
> > No, That is not true. Because of that assumption we made all these
> > apicid changes. And here we are now.
> >
> > AMD supports varies mixed configurations. In case of EPYC-Rome, we
> > have NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1,
> > basically we have all the cores in a socket under one numa node. This
> > is non-numa configuration.
> > Looking at the various configurations and also discussing internally,
> > it is not advisable to have (epyc && !numa) check.
> 
> Indeed on real hardware, I don't think we always see NUMA; my single socket,
> 16 core/32 thread 7302P Dell box, shows the kernel printing 'No NUMA
> configuration found...Faking a node.'
> 
> So if real hardware hasn't got a NUMA node, what's the real problem?

I don't see any problem once we revert all these changes(patch 1-7).
We don't need if (epyc && !numa) error check or auto_enable_numa=true
unconditionally.

> 
> Dave
> 
> > > > > patch on top of this series to enfoce that, i.e:
> > > > >
> > > > >  if (epyc && !numa)
> > > > > error("EPYC cpu requires numa to be configured")
> > > >
> > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > real world QEMU deployments. That is way too user hostile to
> > > > introduce as a requirement.
> > > >
> > > > Why do we need to force this ?  People have been successfuly using
> > > > EPYC CPUs without NUMA in QEMU for years now.
> > > >
> > > > It might not match behaviour of bare metal silicon, but that
> > > > hasn't obviously caused the world to come crashing down.
> > > So far it produces warning in linux kernel (RHBZ1728166), (resulting
> > > performance might be suboptimal), but I haven't seen anyone reporting
> crashes yet.
> > >
> > >
> > > What other options do we have?
> > > Perhaps we can turn on strict check for new machine types only, so
> > > old configs can keep broken topology (CPUID), while new ones would
> > > require -numa and produce correct topology.
> > >
> > >
> > > >
> > > > Regards,
> > > > Daniel
> >
> >
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Dr. David Alan Gilbert
* Babu Moger (babu.mo...@amd.com) wrote:
> 
> > -Original Message-
> > From: Igor Mammedov 
> > Sent: Wednesday, August 26, 2020 8:31 AM
> > To: Daniel P. Berrangé 
> > Cc: Moger, Babu ; pbonz...@redhat.com;
> > r...@twiddle.net; ehabk...@redhat.com; qemu-devel@nongnu.org;
> > m...@redhat.com; Michal Privoznik 
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> > decode
> > 
> > On Wed, 26 Aug 2020 13:50:59 +0100
> > Daniel P. Berrangé  wrote:
> > 
> > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > Babu Moger  wrote:
> > > >
> > > > > To support some of the complex topology, we introduced EPYC mode
> > apicid decode.
> > > > > But, EPYC mode decode is running into problems. Also it can become
> > > > > quite a maintenance problem in the future. So, it was decided to
> > > > > remove that code and use the generic decode which works for
> > > > > majority of the topology. Most of the SPECed configuration would
> > > > > work just fine. With some non-SPECed user inputs, it will create some 
> > > > > sub-
> > optimal configuration.
> > > > > Here is the discussion thread.
> > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2F
> > > > > lore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-
> > d5b437c1b25
> > > > >
> > 4%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C8a5c
> > 52f92
> > > > >
> > 3f04082a40808d849c43d49%7C3dd8961fe4884e608e11a82d994e183d%7C0%7
> > C0
> > > > >
> > %7C637340454473508873sdata=VnW28H1v4XwK3GaNGFxu%2BhwiMeA
> > YO%2B
> > > > > 3WAzo3DeY5Ha8%3Dreserved=0
> > > > >
> > > > > This series removes all the EPYC mode specific apicid changes and
> > > > > use the generic apicid decode.
> > > >
> > > > the main difference between EPYC and all other CPUs is that it
> > > > requires numa configuration (it's not optional) so we need an extra
> No, That is not true. Because of that assumption we made all these apicid
> changes. And here we are now.
> 
> AMD supports varies mixed configurations. In case of EPYC-Rome, we have
> NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1, basically we
> have all the cores in a socket under one numa node. This is non-numa
> configuration.
> Looking at the various configurations and also discussing internally, it
> is not advisable to have (epyc && !numa) check.

Indeed on real hardware, I don't think we always see NUMA; my single
socket, 16 core/32 thread 7302P Dell box, shows the kernel printing
'No NUMA configuration found...Faking a node.'

So if real hardware hasn't got a NUMA node, what's the real problem?

Dave

> > > > patch on top of this series to enfoce that, i.e:
> > > >
> > > >  if (epyc && !numa)
> > > > error("EPYC cpu requires numa to be configured")
> > >
> > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > real world QEMU deployments. That is way too user hostile to introduce
> > > as a requirement.
> > >
> > > Why do we need to force this ?  People have been successfuly using
> > > EPYC CPUs without NUMA in QEMU for years now.
> > >
> > > It might not match behaviour of bare metal silicon, but that hasn't
> > > obviously caused the world to come crashing down.
> > So far it produces warning in linux kernel (RHBZ1728166), (resulting 
> > performance
> > might be suboptimal), but I haven't seen anyone reporting crashes yet.
> > 
> > 
> > What other options do we have?
> > Perhaps we can turn on strict check for new machine types only, so old 
> > configs
> > can keep broken topology (CPUID), while new ones would require -numa and
> > produce correct topology.
> > 
> > 
> > >
> > > Regards,
> > > Daniel
> 
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Babu Moger


> -Original Message-
> From: Igor Mammedov 
> Sent: Wednesday, August 26, 2020 8:31 AM
> To: Daniel P. Berrangé 
> Cc: Moger, Babu ; pbonz...@redhat.com;
> r...@twiddle.net; ehabk...@redhat.com; qemu-devel@nongnu.org;
> m...@redhat.com; Michal Privoznik 
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> decode
> 
> On Wed, 26 Aug 2020 13:50:59 +0100
> Daniel P. Berrangé  wrote:
> 
> > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > Babu Moger  wrote:
> > >
> > > > To support some of the complex topology, we introduced EPYC mode
> apicid decode.
> > > > But, EPYC mode decode is running into problems. Also it can become
> > > > quite a maintenance problem in the future. So, it was decided to
> > > > remove that code and use the generic decode which works for
> > > > majority of the topology. Most of the SPECed configuration would
> > > > work just fine. With some non-SPECed user inputs, it will create some 
> > > > sub-
> optimal configuration.
> > > > Here is the discussion thread.
> > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2F
> > > > lore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-
> d5b437c1b25
> > > >
> 4%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C8a5c
> 52f92
> > > >
> 3f04082a40808d849c43d49%7C3dd8961fe4884e608e11a82d994e183d%7C0%7
> C0
> > > >
> %7C637340454473508873sdata=VnW28H1v4XwK3GaNGFxu%2BhwiMeA
> YO%2B
> > > > 3WAzo3DeY5Ha8%3Dreserved=0
> > > >
> > > > This series removes all the EPYC mode specific apicid changes and
> > > > use the generic apicid decode.
> > >
> > > the main difference between EPYC and all other CPUs is that it
> > > requires numa configuration (it's not optional) so we need an extra
No, That is not true. Because of that assumption we made all these apicid
changes. And here we are now.

AMD supports varies mixed configurations. In case of EPYC-Rome, we have
NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1, basically we
have all the cores in a socket under one numa node. This is non-numa
configuration.
Looking at the various configurations and also discussing internally, it
is not advisable to have (epyc && !numa) check.

> > > patch on top of this series to enfoce that, i.e:
> > >
> > >  if (epyc && !numa)
> > > error("EPYC cpu requires numa to be configured")
> >
> > Please no. This will break 90% of current usage of the EPYC CPU in
> > real world QEMU deployments. That is way too user hostile to introduce
> > as a requirement.
> >
> > Why do we need to force this ?  People have been successfuly using
> > EPYC CPUs without NUMA in QEMU for years now.
> >
> > It might not match behaviour of bare metal silicon, but that hasn't
> > obviously caused the world to come crashing down.
> So far it produces warning in linux kernel (RHBZ1728166), (resulting 
> performance
> might be suboptimal), but I haven't seen anyone reporting crashes yet.
> 
> 
> What other options do we have?
> Perhaps we can turn on strict check for new machine types only, so old configs
> can keep broken topology (CPUID), while new ones would require -numa and
> produce correct topology.
> 
> 
> >
> > Regards,
> > Daniel




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Eduardo Habkost
On Wed, Aug 26, 2020 at 04:03:40PM +0100, Daniel P. Berrangé wrote:
> On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 14:36:38 +0100
> > Daniel P. Berrangé  wrote:
> > 
> > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > Daniel P. Berrangé  wrote:
> > > >   
> > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > Babu Moger  wrote:
> > > > > > 
> > > > > > > To support some of the complex topology, we introduced EPYC mode 
> > > > > > > apicid decode.
> > > > > > > But, EPYC mode decode is running into problems. Also it can 
> > > > > > > become quite a
> > > > > > > maintenance problem in the future. So, it was decided to remove 
> > > > > > > that code and
> > > > > > > use the generic decode which works for majority of the topology. 
> > > > > > > Most of the
> > > > > > > SPECed configuration would work just fine. With some non-SPECed 
> > > > > > > user inputs,
> > > > > > > it will create some sub-optimal configuration.
> > > > > > > Here is the discussion thread.
> > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > > > > > 
> > > > > > > This series removes all the EPYC mode specific apicid changes and 
> > > > > > > use the generic
> > > > > > > apicid decode.
> > > > > > 
> > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > it requires numa configuration (it's not optional)
> > > > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > > > 
> > > > > >  if (epyc && !numa) 
> > > > > > error("EPYC cpu requires numa to be configured")
> > > > > 
> > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > as a requirement.
> > > > > 
> > > > > Why do we need to force this ?  People have been successfuly using
> > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > 
> > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > obviously caused the world to come crashing down.  
> > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > (resulting performance might be suboptimal), but I haven't seen
> > > > anyone reporting crashes yet.
> > > > 
> > > > 
> > > > What other options do we have?
> > > > Perhaps we can turn on strict check for new machine types only,
> > > > so old configs can keep broken topology (CPUID),
> > > > while new ones would require -numa and produce correct topology.  
> > > 
> > > No, tieing this to machine types is not viable either. That is still
> > > going to break essentially every single management application that
> > > exists today using QEMU.
> > for that we have deprecation process, so users could switch to new CLI
> > that would be required.
> 
> We could, but I don't find the cost/benefit tradeoff is compelling.
> 
> There are so many places where we diverge from what bare metal would
> do, that I don't see a good reason to introduce this breakage, even
> if we notify users via a deprecation message. 
> 
> If QEMU wants to require NUMA for EPYC, then QEMU could internally
> create a single NUMA node if none was specified for new machine
> types, such that there is no visible change or breakage to any
> mgmt apps.  

Is anything expected to break if we just set
auto_enable_numa=true unconditionally on pc-*-5.2 and newer?

-- 
Eduardo




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Daniel P . Berrangé
On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 14:36:38 +0100
> Daniel P. Berrangé  wrote:
> 
> > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > Daniel P. Berrangé  wrote:
> > >   
> > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > Babu Moger  wrote:
> > > > > 
> > > > > > To support some of the complex topology, we introduced EPYC mode 
> > > > > > apicid decode.
> > > > > > But, EPYC mode decode is running into problems. Also it can become 
> > > > > > quite a
> > > > > > maintenance problem in the future. So, it was decided to remove 
> > > > > > that code and
> > > > > > use the generic decode which works for majority of the topology. 
> > > > > > Most of the
> > > > > > SPECed configuration would work just fine. With some non-SPECed 
> > > > > > user inputs,
> > > > > > it will create some sub-optimal configuration.
> > > > > > Here is the discussion thread.
> > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > > > > 
> > > > > > This series removes all the EPYC mode specific apicid changes and 
> > > > > > use the generic
> > > > > > apicid decode.
> > > > > 
> > > > > the main difference between EPYC and all other CPUs is that
> > > > > it requires numa configuration (it's not optional)
> > > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > > 
> > > > >  if (epyc && !numa) 
> > > > > error("EPYC cpu requires numa to be configured")
> > > > 
> > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > as a requirement.
> > > > 
> > > > Why do we need to force this ?  People have been successfuly using
> > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > 
> > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > obviously caused the world to come crashing down.  
> > > So far it produces warning in linux kernel (RHBZ1728166),
> > > (resulting performance might be suboptimal), but I haven't seen
> > > anyone reporting crashes yet.
> > > 
> > > 
> > > What other options do we have?
> > > Perhaps we can turn on strict check for new machine types only,
> > > so old configs can keep broken topology (CPUID),
> > > while new ones would require -numa and produce correct topology.  
> > 
> > No, tieing this to machine types is not viable either. That is still
> > going to break essentially every single management application that
> > exists today using QEMU.
> for that we have deprecation process, so users could switch to new CLI
> that would be required.

We could, but I don't find the cost/benefit tradeoff is compelling.

There are so many places where we diverge from what bare metal would
do, that I don't see a good reason to introduce this breakage, even
if we notify users via a deprecation message. 

If QEMU wants to require NUMA for EPYC, then QEMU could internally
create a single NUMA node if none was specified for new machine
types, such that there is no visible change or breakage to any
mgmt apps.  


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Dr. David Alan Gilbert
* Igor Mammedov (imamm...@redhat.com) wrote:
> On Tue, 25 Aug 2020 16:25:21 +0100
> "Dr. David Alan Gilbert"  wrote:
> 
> > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > "Dr. David Alan Gilbert"  wrote:
> > >   
> > > > * Babu Moger (babu.mo...@amd.com) wrote:  
> > > > > Hi Dave,
> > > > > 
> > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > > > > > * Babu Moger (babu.mo...@amd.com) wrote:
> > > > > >> To support some of the complex topology, we introduced EPYC mode 
> > > > > >> apicid decode.
> > > > > >> But, EPYC mode decode is running into problems. Also it can become 
> > > > > >> quite a
> > > > > >> maintenance problem in the future. So, it was decided to remove 
> > > > > >> that code and
> > > > > >> use the generic decode which works for majority of the topology. 
> > > > > >> Most of the
> > > > > >> SPECed configuration would work just fine. With some non-SPECed 
> > > > > >> user inputs,
> > > > > >> it will create some sub-optimal configuration.
> > > > > >> Here is the discussion thread.
> > > > > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C74d90724af9c4adcc75008d8485d4d16%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637338912853492167sdata=GTsMKcpeYXAA0CvpLTirPHKdNSdlJE3RuPjCtSyWtGQ%3Dreserved=0
> > > > > >>
> > > > > >> This series removes all the EPYC mode specific apicid changes and 
> > > > > >> use the generic
> > > > > >> apicid decode.
> > > > > > 
> > > > > > Hi Babu,
> > > > > >   This does simplify things a lot!
> > > > > > One worry, what happens about a live migration of a VM from an old 
> > > > > > qemu
> > > > > > that was using the node-id to a qemu with this new scheme?
> > > > > 
> > > > > The node_id which we introduced was only used internally. This wasn't
> > > > > exposed outside. I don't think live migration will be an issue.
> > > > 
> > > > Didn't it become part of the APIC ID visible to the guest?  
> > > 
> > > Daniel asked similar question wrt hard error on start up,
> > > when CLI is not sufficient to create EPYC cpu.
> > > 
> > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > 
> > > Migration might fall into the same category.
> > > Also looking at the history, 5.0 commit 
> > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based 
> > > cpus models
> > > silently broke APIC ID (without versioning), for all EPYC models (that's 
> > > were 1 new and 1 old one).
> > > 
> > > (I'm not aware of somebody complaining about it)
> > > 
> > > Another commit ed78467a21459, changed CPUID_8000_001E without versioning 
> > > as well.
> > > 
> > > 
> > > With current EPYC apicid code, if all starts align (no numa or 1 numa 
> > > node only on
> > > CLI and no -smp dies=) it might produce a valid CPU 
> > > (apicid+CPUID_8000_001E).
> > > No numa is gray area, since EPYC spec implies that it has to be numa 
> > > machine in case of real EPYC cpus.
> > > Multi-node configs would be correct only if user assigns cpus to numa 
> > > nodes
> > > by duplicating internal node_id algorithm that this series removes.
> > > 
> > > There might be other broken cases that I don't recall anymore
> > > (should be mentioned in previous versions of this series)
> > > 
> > > 
> > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > 
> > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> > 
> > Oh 
> > 
> > >  2) with this series (lets call it qemu 5.2)
> > >  pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks 
> > > current code to pre-5.0
> > >  qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > 
> > > It's all about picking which poison to choose,
> > > I'd preffer 2nd case as it lets drop a lot of complicated code that
> > > doesn't work as expected.  
> > 
> > I think that would make our lives easier for other reasons; so I'm happy
> > to go with that.
> 
> to make things less painful for users, me wonders if there is a way
> to block migration if epyc and specific QEMU versions are used?

We have no way to block based on version - and that's a pretty painful
thing to do; we can block based on machine type.

But before we get there; can we understand in which combinations that
things break and why exactly - would it break on a 1 or 2 vCPU guest -
or would it only break when we get to the point the upper bits start
being used for example?  Why exaclty would it break - i.e. is it going
to change the name of sections in the migration stream - or are the
values we need actually going to migrate OK?

Dave


> > > PS:
> > >  I didn't review it yet, but with this series we aren't
> > >  making up internal node_ids that should match user provided numa node 
> > > ids somehow.
> > >  It seems series lost the patch that would enforce numa in case -smp 
> > > dies>1,
> > >  but otherwise 

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Eduardo Habkost
On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> On Fri, 21 Aug 2020 17:12:19 -0500
> Babu Moger  wrote:
> 
> > To support some of the complex topology, we introduced EPYC mode apicid 
> > decode.
> > But, EPYC mode decode is running into problems. Also it can become quite a
> > maintenance problem in the future. So, it was decided to remove that code 
> > and
> > use the generic decode which works for majority of the topology. Most of the
> > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > it will create some sub-optimal configuration.
> > Here is the discussion thread.
> > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > 
> > This series removes all the EPYC mode specific apicid changes and use the 
> > generic
> > apicid decode.
> 
> the main difference between EPYC and all other CPUs is that
> it requires numa configuration (it's not optional)
> so we need an extra patch on top of this series to enfoce that, i.e:
> 
>  if (epyc && !numa) 
> error("EPYC cpu requires numa to be configured")
> 
> I think there was a patch in previous revisions that aimed for this.
> Simplest form would be above snippet.
> 
> More complex one, would be moving auto_enable_numa from MachineClass to
> MachineState so we can change it at runtime if EPYC is used. That should
> take care of use case where user hasn't provided -numa.

This sounds like a good solution.  It actually sounds simpler
than the alternatives (which just move the complexity to other
components).

We can keep MachineClass::auto_enable_numa as is, and just use it
to initialize the default value of MachineState::auto_enable_numa.

-- 
Eduardo




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Igor Mammedov
On Wed, 26 Aug 2020 14:36:38 +0100
Daniel P. Berrangé  wrote:

> On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 13:50:59 +0100
> > Daniel P. Berrangé  wrote:
> >   
> > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > Babu Moger  wrote:
> > > > 
> > > > > To support some of the complex topology, we introduced EPYC mode 
> > > > > apicid decode.
> > > > > But, EPYC mode decode is running into problems. Also it can become 
> > > > > quite a
> > > > > maintenance problem in the future. So, it was decided to remove that 
> > > > > code and
> > > > > use the generic decode which works for majority of the topology. Most 
> > > > > of the
> > > > > SPECed configuration would work just fine. With some non-SPECed user 
> > > > > inputs,
> > > > > it will create some sub-optimal configuration.
> > > > > Here is the discussion thread.
> > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > > > 
> > > > > This series removes all the EPYC mode specific apicid changes and use 
> > > > > the generic
> > > > > apicid decode.
> > > > 
> > > > the main difference between EPYC and all other CPUs is that
> > > > it requires numa configuration (it's not optional)
> > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > 
> > > >  if (epyc && !numa) 
> > > > error("EPYC cpu requires numa to be configured")
> > > 
> > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > real world QEMU deployments. That is way too user hostile to introduce
> > > as a requirement.
> > > 
> > > Why do we need to force this ?  People have been successfuly using
> > > EPYC CPUs without NUMA in QEMU for years now.
> > > 
> > > It might not match behaviour of bare metal silicon, but that hasn't
> > > obviously caused the world to come crashing down.  
> > So far it produces warning in linux kernel (RHBZ1728166),
> > (resulting performance might be suboptimal), but I haven't seen
> > anyone reporting crashes yet.
> > 
> > 
> > What other options do we have?
> > Perhaps we can turn on strict check for new machine types only,
> > so old configs can keep broken topology (CPUID),
> > while new ones would require -numa and produce correct topology.  
> 
> No, tieing this to machine types is not viable either. That is still
> going to break essentially every single management application that
> exists today using QEMU.
for that we have deprecation process, so users could switch to new CLI
that would be required.


> Breaking stuff existing apps is not acceptable for something that is
> merely reporting sub-optimal performance. That's simply a documentation
> task to highlight best practice to app developers.
> 
> Regards,
> Daniel




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Daniel P . Berrangé
On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 13:50:59 +0100
> Daniel P. Berrangé  wrote:
> 
> > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > Babu Moger  wrote:
> > >   
> > > > To support some of the complex topology, we introduced EPYC mode apicid 
> > > > decode.
> > > > But, EPYC mode decode is running into problems. Also it can become 
> > > > quite a
> > > > maintenance problem in the future. So, it was decided to remove that 
> > > > code and
> > > > use the generic decode which works for majority of the topology. Most 
> > > > of the
> > > > SPECed configuration would work just fine. With some non-SPECed user 
> > > > inputs,
> > > > it will create some sub-optimal configuration.
> > > > Here is the discussion thread.
> > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > > 
> > > > This series removes all the EPYC mode specific apicid changes and use 
> > > > the generic
> > > > apicid decode.  
> > > 
> > > the main difference between EPYC and all other CPUs is that
> > > it requires numa configuration (it's not optional)
> > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > 
> > >  if (epyc && !numa) 
> > > error("EPYC cpu requires numa to be configured")  
> > 
> > Please no. This will break 90% of current usage of the EPYC CPU in
> > real world QEMU deployments. That is way too user hostile to introduce
> > as a requirement.
> > 
> > Why do we need to force this ?  People have been successfuly using
> > EPYC CPUs without NUMA in QEMU for years now.
> > 
> > It might not match behaviour of bare metal silicon, but that hasn't
> > obviously caused the world to come crashing down.
> So far it produces warning in linux kernel (RHBZ1728166),
> (resulting performance might be suboptimal), but I haven't seen
> anyone reporting crashes yet.
> 
> 
> What other options do we have?
> Perhaps we can turn on strict check for new machine types only,
> so old configs can keep broken topology (CPUID),
> while new ones would require -numa and produce correct topology.

No, tieing this to machine types is not viable either. That is still
going to break essentially every single management application that
exists today using QEMU.

Breaking stuff existing apps is not acceptable for something that is
merely reporting sub-optimal performance. That's simply a documentation
task to highlight best practice to app developers.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Igor Mammedov
On Wed, 26 Aug 2020 13:50:59 +0100
Daniel P. Berrangé  wrote:

> On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > On Fri, 21 Aug 2020 17:12:19 -0500
> > Babu Moger  wrote:
> >   
> > > To support some of the complex topology, we introduced EPYC mode apicid 
> > > decode.
> > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > maintenance problem in the future. So, it was decided to remove that code 
> > > and
> > > use the generic decode which works for majority of the topology. Most of 
> > > the
> > > SPECed configuration would work just fine. With some non-SPECed user 
> > > inputs,
> > > it will create some sub-optimal configuration.
> > > Here is the discussion thread.
> > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > > 
> > > This series removes all the EPYC mode specific apicid changes and use the 
> > > generic
> > > apicid decode.  
> > 
> > the main difference between EPYC and all other CPUs is that
> > it requires numa configuration (it's not optional)
> > so we need an extra patch on top of this series to enfoce that, i.e:
> > 
> >  if (epyc && !numa) 
> > error("EPYC cpu requires numa to be configured")  
> 
> Please no. This will break 90% of current usage of the EPYC CPU in
> real world QEMU deployments. That is way too user hostile to introduce
> as a requirement.
> 
> Why do we need to force this ?  People have been successfuly using
> EPYC CPUs without NUMA in QEMU for years now.
> 
> It might not match behaviour of bare metal silicon, but that hasn't
> obviously caused the world to come crashing down.
So far it produces warning in linux kernel (RHBZ1728166),
(resulting performance might be suboptimal), but I haven't seen
anyone reporting crashes yet.


What other options do we have?
Perhaps we can turn on strict check for new machine types only,
so old configs can keep broken topology (CPUID),
while new ones would require -numa and produce correct topology.


> 
> Regards,
> Daniel




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Daniel P . Berrangé
On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> On Fri, 21 Aug 2020 17:12:19 -0500
> Babu Moger  wrote:
> 
> > To support some of the complex topology, we introduced EPYC mode apicid 
> > decode.
> > But, EPYC mode decode is running into problems. Also it can become quite a
> > maintenance problem in the future. So, it was decided to remove that code 
> > and
> > use the generic decode which works for majority of the topology. Most of the
> > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > it will create some sub-optimal configuration.
> > Here is the discussion thread.
> > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> > 
> > This series removes all the EPYC mode specific apicid changes and use the 
> > generic
> > apicid decode.
> 
> the main difference between EPYC and all other CPUs is that
> it requires numa configuration (it's not optional)
> so we need an extra patch on top of this series to enfoce that, i.e:
> 
>  if (epyc && !numa) 
> error("EPYC cpu requires numa to be configured")

Please no. This will break 90% of current usage of the EPYC CPU in
real world QEMU deployments. That is way too user hostile to introduce
as a requirement.

Why do we need to force this ?  People have been successfuly using
EPYC CPUs without NUMA in QEMU for years now.

It might not match behaviour of bare metal silicon, but that hasn't
obviously caused the world to come crashing down.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Igor Mammedov
On Tue, 25 Aug 2020 16:25:21 +0100
"Dr. David Alan Gilbert"  wrote:

> * Igor Mammedov (imamm...@redhat.com) wrote:
> > On Tue, 25 Aug 2020 09:15:04 +0100
> > "Dr. David Alan Gilbert"  wrote:
> >   
> > > * Babu Moger (babu.mo...@amd.com) wrote:  
> > > > Hi Dave,
> > > > 
> > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > > > > * Babu Moger (babu.mo...@amd.com) wrote:
> > > > >> To support some of the complex topology, we introduced EPYC mode 
> > > > >> apicid decode.
> > > > >> But, EPYC mode decode is running into problems. Also it can become 
> > > > >> quite a
> > > > >> maintenance problem in the future. So, it was decided to remove that 
> > > > >> code and
> > > > >> use the generic decode which works for majority of the topology. 
> > > > >> Most of the
> > > > >> SPECed configuration would work just fine. With some non-SPECed user 
> > > > >> inputs,
> > > > >> it will create some sub-optimal configuration.
> > > > >> Here is the discussion thread.
> > > > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C74d90724af9c4adcc75008d8485d4d16%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637338912853492167sdata=GTsMKcpeYXAA0CvpLTirPHKdNSdlJE3RuPjCtSyWtGQ%3Dreserved=0
> > > > >>
> > > > >> This series removes all the EPYC mode specific apicid changes and 
> > > > >> use the generic
> > > > >> apicid decode.
> > > > > 
> > > > > Hi Babu,
> > > > >   This does simplify things a lot!
> > > > > One worry, what happens about a live migration of a VM from an old 
> > > > > qemu
> > > > > that was using the node-id to a qemu with this new scheme?
> > > > 
> > > > The node_id which we introduced was only used internally. This wasn't
> > > > exposed outside. I don't think live migration will be an issue.
> > > 
> > > Didn't it become part of the APIC ID visible to the guest?  
> > 
> > Daniel asked similar question wrt hard error on start up,
> > when CLI is not sufficient to create EPYC cpu.
> > 
> > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > 
> > Migration might fall into the same category.
> > Also looking at the history, 5.0 commit 
> >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based 
> > cpus models
> > silently broke APIC ID (without versioning), for all EPYC models (that's 
> > were 1 new and 1 old one).
> > 
> > (I'm not aware of somebody complaining about it)
> > 
> > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as 
> > well.
> > 
> > 
> > With current EPYC apicid code, if all starts align (no numa or 1 numa node 
> > only on
> > CLI and no -smp dies=) it might produce a valid CPU 
> > (apicid+CPUID_8000_001E).
> > No numa is gray area, since EPYC spec implies that it has to be numa 
> > machine in case of real EPYC cpus.
> > Multi-node configs would be correct only if user assigns cpus to numa nodes
> > by duplicating internal node_id algorithm that this series removes.
> > 
> > There might be other broken cases that I don't recall anymore
> > (should be mentioned in previous versions of this series)
> > 
> > 
> > To summarize from migration pov (ignoring ed78467a21459 change):
> > 
> >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> 
> Oh 
> 
> >  2) with this series (lets call it qemu 5.2)
> >  pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks 
> > current code to pre-5.0
> >  qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > 
> > It's all about picking which poison to choose,
> > I'd preffer 2nd case as it lets drop a lot of complicated code that
> > doesn't work as expected.  
> 
> I think that would make our lives easier for other reasons; so I'm happy
> to go with that.

to make things less painful for users, me wonders if there is a way
to block migration if epyc and specific QEMU versions are used?

> > PS:
> >  I didn't review it yet, but with this series we aren't
> >  making up internal node_ids that should match user provided numa node ids 
> > somehow.
> >  It seems series lost the patch that would enforce numa in case -smp dies>1,
> >  but otherwise it heads in the right direction.  
> 
> Dave
> 
> > > 
> > > Dave
> > >   
> >   




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-26 Thread Igor Mammedov
On Fri, 21 Aug 2020 17:12:19 -0500
Babu Moger  wrote:

> To support some of the complex topology, we introduced EPYC mode apicid 
> decode.
> But, EPYC mode decode is running into problems. Also it can become quite a
> maintenance problem in the future. So, it was decided to remove that code and
> use the generic decode which works for majority of the topology. Most of the
> SPECed configuration would work just fine. With some non-SPECed user inputs,
> it will create some sub-optimal configuration.
> Here is the discussion thread.
> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> 
> This series removes all the EPYC mode specific apicid changes and use the 
> generic
> apicid decode.

the main difference between EPYC and all other CPUs is that
it requires numa configuration (it's not optional)
so we need an extra patch on top of this series to enfoce that, i.e:

 if (epyc && !numa) 
error("EPYC cpu requires numa to be configured")

I think there was a patch in previous revisions that aimed for this.
Simplest form would be above snippet.

More complex one, would be moving auto_enable_numa from MachineClass to
MachineState so we can change it at runtime if EPYC is used. That should
take care of use case where user hasn't provided -numa.


Eduardo,
 is there any way to tell managment that particular CPU type requires
 -numa ?

> ---
> v5:
>  Revert EPYC specific decode.
>  Simplify CPUID_8000_001E
> 
> v4:
>   
> https://lore.kernel.org/qemu-devel/159744083536.39197.13827776633866601278.st...@naples-babu.amd.com/
>   Not much of a change. Just added few text changes.
>   Error out configuration instead of warning if dies are not configured in 
> EPYC.
>   Few other text changes to clarify the removal of node_id, nr_nodes and 
> nodes_per_pkg.
> 
> v3:
>   
> https://lore.kernel.org/qemu-devel/159681772267.9679.1334429994189974662.st...@naples-babu.amd.com/#r
>   Added a new check to pass the dies for EPYC numa configuration.
>   Added Simplify CPUID_8000_001E patch with some changes suggested by Igor.
>   Dropped the patch to build the topology from CpuInstanceProperties.
>   TODO: Not sure if we still need the Autonuma changes Igor mentioned.
>   Needs more clarity on that.
> 
> v2:
>   
> https://lore.kernel.org/qemu-devel/159362436285.36204.986406297373871949.st...@naples-babu.amd.com/
>   Used the numa information from CpuInstanceProperties for building
>   the apic_id suggested by Igor.
>   Also did some minor code re-aarangement to take care of changes.
>   Dropped the patch "Simplify CPUID_8000_001E" from v1. Will send
>   it later.
> 
> v1:
>  
> https://lore.kernel.org/qemu-devel/159164739269.20543.3074052993891532749.st...@naples-babu.amd.com
> 
> Babu Moger (8):
>   hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology
>   Revert "i386: Fix pkg_id offset for EPYC cpu models"
>   Revert "target/i386: Enable new apic id encoding for EPYC based cpus 
> models"
>   Revert "hw/i386: Move arch_id decode inside x86_cpus_init"
>   Revert "i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition"
>   Revert "hw/i386: Introduce apicid functions inside X86MachineState"
>   Revert "hw/386: Add EPYC mode topology decoding functions"
>   i386: Simplify CPUID_8000_001E for AMD
> 
> 
>  hw/i386/pc.c   |8 +--
>  hw/i386/x86.c  |   43 +++-
>  include/hw/i386/topology.h |  101 ---
>  include/hw/i386/x86.h  |9 ---
>  target/i386/cpu.c  |  115 
> 
>  target/i386/cpu.h  |3 -
>  tests/test-x86-cpuid.c |   40 ---
>  7 files changed, 73 insertions(+), 246 deletions(-)
> 
> --
> Signature
> 




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-25 Thread Dr. David Alan Gilbert
* Igor Mammedov (imamm...@redhat.com) wrote:
> On Tue, 25 Aug 2020 09:15:04 +0100
> "Dr. David Alan Gilbert"  wrote:
> 
> > * Babu Moger (babu.mo...@amd.com) wrote:
> > > Hi Dave,
> > > 
> > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:  
> > > > * Babu Moger (babu.mo...@amd.com) wrote:  
> > > >> To support some of the complex topology, we introduced EPYC mode 
> > > >> apicid decode.
> > > >> But, EPYC mode decode is running into problems. Also it can become 
> > > >> quite a
> > > >> maintenance problem in the future. So, it was decided to remove that 
> > > >> code and
> > > >> use the generic decode which works for majority of the topology. Most 
> > > >> of the
> > > >> SPECed configuration would work just fine. With some non-SPECed user 
> > > >> inputs,
> > > >> it will create some sub-optimal configuration.
> > > >> Here is the discussion thread.
> > > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C74d90724af9c4adcc75008d8485d4d16%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637338912853492167sdata=GTsMKcpeYXAA0CvpLTirPHKdNSdlJE3RuPjCtSyWtGQ%3Dreserved=0
> > > >>
> > > >> This series removes all the EPYC mode specific apicid changes and use 
> > > >> the generic
> > > >> apicid decode.  
> > > > 
> > > > Hi Babu,
> > > >   This does simplify things a lot!
> > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > that was using the node-id to a qemu with this new scheme?  
> > > 
> > > The node_id which we introduced was only used internally. This wasn't
> > > exposed outside. I don't think live migration will be an issue.  
> > 
> > Didn't it become part of the APIC ID visible to the guest?
> 
> Daniel asked similar question wrt hard error on start up,
> when CLI is not sufficient to create EPYC cpu.
> 
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> 
> Migration might fall into the same category.
> Also looking at the history, 5.0 commit 
>   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus 
> models
> silently broke APIC ID (without versioning), for all EPYC models (that's were 
> 1 new and 1 old one).
> 
> (I'm not aware of somebody complaining about it)
> 
> Another commit ed78467a21459, changed CPUID_8000_001E without versioning as 
> well.
> 
> 
> With current EPYC apicid code, if all starts align (no numa or 1 numa node 
> only on
> CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> No numa is gray area, since EPYC spec implies that it has to be numa machine 
> in case of real EPYC cpus.
> Multi-node configs would be correct only if user assigns cpus to numa nodes
> by duplicating internal node_id algorithm that this series removes.
> 
> There might be other broken cases that I don't recall anymore
> (should be mentioned in previous versions of this series)
> 
> 
> To summarize from migration pov (ignoring ed78467a21459 change):
> 
>  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration

Oh 

>  2) with this series (lets call it qemu 5.2)
>  pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current 
> code to pre-5.0
>  qemu 5.0, 5.1 ==> qemu 5.2 - broken
> 
> It's all about picking which poison to choose,
> I'd preffer 2nd case as it lets drop a lot of complicated code that
> doesn't work as expected.

I think that would make our lives easier for other reasons; so I'm happy
to go with that.

> PS:
>  I didn't review it yet, but with this series we aren't
>  making up internal node_ids that should match user provided numa node ids 
> somehow.
>  It seems series lost the patch that would enforce numa in case -smp dies>1,
>  but otherwise it heads in the right direction.

Dave

> > 
> > Dave
> > 
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-25 Thread Igor Mammedov
On Tue, 25 Aug 2020 09:15:04 +0100
"Dr. David Alan Gilbert"  wrote:

> * Babu Moger (babu.mo...@amd.com) wrote:
> > Hi Dave,
> > 
> > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:  
> > > * Babu Moger (babu.mo...@amd.com) wrote:  
> > >> To support some of the complex topology, we introduced EPYC mode apicid 
> > >> decode.
> > >> But, EPYC mode decode is running into problems. Also it can become quite 
> > >> a
> > >> maintenance problem in the future. So, it was decided to remove that 
> > >> code and
> > >> use the generic decode which works for majority of the topology. Most of 
> > >> the
> > >> SPECed configuration would work just fine. With some non-SPECed user 
> > >> inputs,
> > >> it will create some sub-optimal configuration.
> > >> Here is the discussion thread.
> > >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C74d90724af9c4adcc75008d8485d4d16%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637338912853492167sdata=GTsMKcpeYXAA0CvpLTirPHKdNSdlJE3RuPjCtSyWtGQ%3Dreserved=0
> > >>
> > >> This series removes all the EPYC mode specific apicid changes and use 
> > >> the generic
> > >> apicid decode.  
> > > 
> > > Hi Babu,
> > >   This does simplify things a lot!
> > > One worry, what happens about a live migration of a VM from an old qemu
> > > that was using the node-id to a qemu with this new scheme?  
> > 
> > The node_id which we introduced was only used internally. This wasn't
> > exposed outside. I don't think live migration will be an issue.  
> 
> Didn't it become part of the APIC ID visible to the guest?

Daniel asked similar question wrt hard error on start up,
when CLI is not sufficient to create EPYC cpu.

https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html

Migration might fall into the same category.
Also looking at the history, 5.0 commit 
  247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus 
models
silently broke APIC ID (without versioning), for all EPYC models (that's were 1 
new and 1 old one).

(I'm not aware of somebody complaining about it)

Another commit ed78467a21459, changed CPUID_8000_001E without versioning as 
well.


With current EPYC apicid code, if all starts align (no numa or 1 numa node only 
on
CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
No numa is gray area, since EPYC spec implies that it has to be numa machine in 
case of real EPYC cpus.
Multi-node configs would be correct only if user assigns cpus to numa nodes
by duplicating internal node_id algorithm that this series removes.

There might be other broken cases that I don't recall anymore
(should be mentioned in previous versions of this series)


To summarize from migration pov (ignoring ed78467a21459 change):

 1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration
 2) with this series (lets call it qemu 5.2)
 pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current 
code to pre-5.0
 qemu 5.0, 5.1 ==> qemu 5.2 - broken

It's all about picking which poison to choose,
I'd preffer 2nd case as it lets drop a lot of complicated code that
doesn't work as expected.

PS:
 I didn't review it yet, but with this series we aren't
 making up internal node_ids that should match user provided numa node ids 
somehow.
 It seems series lost the patch that would enforce numa in case -smp dies>1,
 but otherwise it heads in the right direction.

> 
> Dave
> 




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-25 Thread Dr. David Alan Gilbert
* Babu Moger (babu.mo...@amd.com) wrote:
> Hi Dave,
> 
> On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > * Babu Moger (babu.mo...@amd.com) wrote:
> >> To support some of the complex topology, we introduced EPYC mode apicid 
> >> decode.
> >> But, EPYC mode decode is running into problems. Also it can become quite a
> >> maintenance problem in the future. So, it was decided to remove that code 
> >> and
> >> use the generic decode which works for majority of the topology. Most of 
> >> the
> >> SPECed configuration would work just fine. With some non-SPECed user 
> >> inputs,
> >> it will create some sub-optimal configuration.
> >> Here is the discussion thread.
> >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C74d90724af9c4adcc75008d8485d4d16%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637338912853492167sdata=GTsMKcpeYXAA0CvpLTirPHKdNSdlJE3RuPjCtSyWtGQ%3Dreserved=0
> >>
> >> This series removes all the EPYC mode specific apicid changes and use the 
> >> generic
> >> apicid decode.
> > 
> > Hi Babu,
> >   This does simplify things a lot!
> > One worry, what happens about a live migration of a VM from an old qemu
> > that was using the node-id to a qemu with this new scheme?
> 
> The node_id which we introduced was only used internally. This wasn't
> exposed outside. I don't think live migration will be an issue.

Didn't it become part of the APIC ID visible to the guest?

Dave

-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-24 Thread Babu Moger
Hi Dave,

On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> * Babu Moger (babu.mo...@amd.com) wrote:
>> To support some of the complex topology, we introduced EPYC mode apicid 
>> decode.
>> But, EPYC mode decode is running into problems. Also it can become quite a
>> maintenance problem in the future. So, it was decided to remove that code and
>> use the generic decode which works for majority of the topology. Most of the
>> SPECed configuration would work just fine. With some non-SPECed user inputs,
>> it will create some sub-optimal configuration.
>> Here is the discussion thread.
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fqemu-devel%2Fc0bcc1a6-1d84-a6e7-e468-d5b437c1b254%40amd.com%2Fdata=02%7C01%7Cbabu.moger%40amd.com%7C74d90724af9c4adcc75008d8485d4d16%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637338912853492167sdata=GTsMKcpeYXAA0CvpLTirPHKdNSdlJE3RuPjCtSyWtGQ%3Dreserved=0
>>
>> This series removes all the EPYC mode specific apicid changes and use the 
>> generic
>> apicid decode.
> 
> Hi Babu,
>   This does simplify things a lot!
> One worry, what happens about a live migration of a VM from an old qemu
> that was using the node-id to a qemu with this new scheme?

The node_id which we introduced was only used internally. This wasn't
exposed outside. I don't think live migration will be an issue.



Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

2020-08-24 Thread Dr. David Alan Gilbert
* Babu Moger (babu.mo...@amd.com) wrote:
> To support some of the complex topology, we introduced EPYC mode apicid 
> decode.
> But, EPYC mode decode is running into problems. Also it can become quite a
> maintenance problem in the future. So, it was decided to remove that code and
> use the generic decode which works for majority of the topology. Most of the
> SPECed configuration would work just fine. With some non-SPECed user inputs,
> it will create some sub-optimal configuration.
> Here is the discussion thread.
> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b...@amd.com/
> 
> This series removes all the EPYC mode specific apicid changes and use the 
> generic
> apicid decode.

Hi Babu,
  This does simplify things a lot!
One worry, what happens about a live migration of a VM from an old qemu
that was using the node-id to a qemu with this new scheme?

Dave

> ---
> v5:
>  Revert EPYC specific decode.
>  Simplify CPUID_8000_001E
> 
> v4:
>   
> https://lore.kernel.org/qemu-devel/159744083536.39197.13827776633866601278.st...@naples-babu.amd.com/
>   Not much of a change. Just added few text changes.
>   Error out configuration instead of warning if dies are not configured in 
> EPYC.
>   Few other text changes to clarify the removal of node_id, nr_nodes and 
> nodes_per_pkg.
> 
> v3:
>   
> https://lore.kernel.org/qemu-devel/159681772267.9679.1334429994189974662.st...@naples-babu.amd.com/#r
>   Added a new check to pass the dies for EPYC numa configuration.
>   Added Simplify CPUID_8000_001E patch with some changes suggested by Igor.
>   Dropped the patch to build the topology from CpuInstanceProperties.
>   TODO: Not sure if we still need the Autonuma changes Igor mentioned.
>   Needs more clarity on that.
> 
> v2:
>   
> https://lore.kernel.org/qemu-devel/159362436285.36204.986406297373871949.st...@naples-babu.amd.com/
>   Used the numa information from CpuInstanceProperties for building
>   the apic_id suggested by Igor.
>   Also did some minor code re-aarangement to take care of changes.
>   Dropped the patch "Simplify CPUID_8000_001E" from v1. Will send
>   it later.
> 
> v1:
>  
> https://lore.kernel.org/qemu-devel/159164739269.20543.3074052993891532749.st...@naples-babu.amd.com
> 
> Babu Moger (8):
>   hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology
>   Revert "i386: Fix pkg_id offset for EPYC cpu models"
>   Revert "target/i386: Enable new apic id encoding for EPYC based cpus 
> models"
>   Revert "hw/i386: Move arch_id decode inside x86_cpus_init"
>   Revert "i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition"
>   Revert "hw/i386: Introduce apicid functions inside X86MachineState"
>   Revert "hw/386: Add EPYC mode topology decoding functions"
>   i386: Simplify CPUID_8000_001E for AMD
> 
> 
>  hw/i386/pc.c   |8 +--
>  hw/i386/x86.c  |   43 +++-
>  include/hw/i386/topology.h |  101 ---
>  include/hw/i386/x86.h  |9 ---
>  target/i386/cpu.c  |  115 
> 
>  target/i386/cpu.h  |3 -
>  tests/test-x86-cpuid.c |   40 ---
>  7 files changed, 73 insertions(+), 246 deletions(-)
> 
> --
> Signature
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK