Re: [PATCH v3] target/i386: Fix CPUID encoding of Fn8000001E_ECX

2024-05-10 Thread Moger, Babu

Hi Daniel,

On 5/10/2024 3:10 AM, Daniel P. Berrangé wrote:

On Fri, May 10, 2024 at 11:05:44AM +0300, Michael Tokarev wrote:

09.05.2024 17:11, Daniel P. Berrangé wrote:

On Thu, May 09, 2024 at 04:54:16PM +0300, Michael Tokarev wrote:

03.05.2024 20:46, Babu Moger wrote:



diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 08c7de416f..46235466d7 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -81,6 +81,7 @@
GlobalProperty pc_compat_9_0[] = {
{ TYPE_X86_CPU, "guest-phys-bits", "0" },
{ "sev-guest", "legacy-vm-type", "true" },
+{ TYPE_X86_CPU, "legacy-multi-node", "on" },
};


Should this legacy-multi-node property be added to previous
machine types when applying to stable?  How about stable-8.2
and stable-7.2?


Machine types are considered to express a fixed guest ABI once they
are part of a QEMU release. Given that, we should not be changing
existing machine types in stable branches.


Yes, I understand this, and this is exactly why I asked.
The change in question has been Cc'ed to stable, and I'm
trying to understand what I should do with it :)


In theory we could create new "bug fix" machine types in stable
branches. To support live migration, we would then need to also
add those same stable branch "bug fix" machine type versions in
all future QEMU versions. This is generally not worth the hassle
of exploding the number of machine types.

If you backport the patch, minus the machine type, then users
can still get the fix but they'll need to manually set the
property to enable it.


I don't think this makes much sense. But for someone who actually
hits this issue, such a backport would let them fix it.
Hence, again, I'm asking whether it is really a good idea to pick this
up for stable (any version of it; currently there are three active
series: 7.2, 8.2 and 9.0).


Hmm, the description says

   "Observed the following failure while booting the SEV-SNP guest"

and yet the patches for SEV-SNP are *not* merged in QEMU yet. So this
does not look relevant for stable unless I'm missing something.


I had not thought through the stable tag. This is not critical for the
stable release.


If required I will send a separate patch for stable later. It is not 
required right now. Sorry about the noise.


--
- Babu Moger



Re: [PATCH v2] target/i386: Fix CPUID encoding of Fn8000001E_ECX

2024-03-22 Thread Moger, Babu
Any feedback or concerns with this patch? Otherwise can this be merged?
Thanks
Babu

On 1/2/24 17:17, Babu Moger wrote:
> Observed the following failure while booting an SEV-SNP guest with the
> smp parameters:
> "-smp 192,sockets=1,dies=12,cores=8,threads=2".
> 
> qemu-system-x86_64: sev_snp_launch_update: SNP_LAUNCH_UPDATE ret=-5 fw_error=22 'Invalid parameter'
> qemu-system-x86_64: SEV-SNP: CPUID validation failed for function 0x8000001e, index: 0x0.
> provided: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000b00, edx: 0x00000000
> expected: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000300, edx: 0x00000000
> qemu-system-x86_64: SEV-SNP: failed update CPUID page
> 
> The failure is due to an overflow of the bits used for "Node per
> processor" in CPUID Fn8000001E_ECX. This field is 3 bits wide and can
> hold a maximum value of 0x7. With dies=12 (0xB), it overflows and spills
> over into the reserved bits. In the case of SEV-SNP, this causes a CPUID
> enforcement failure and the guest fails to boot.
> 
> The PPR documentation for CPUID_Fn8000001E_ECX [Node Identifiers]
> =================================================================
> Bits    Description
> 31:11   Reserved.
> 
> 10:8    NodesPerProcessor: Node per processor. Read-only.
>         ValidValues:
>         Value   Description
>         0h      1 node per processor.
>         7h-1h   Reserved.
> 
> 7:0     NodeId: Node ID. Read-only. Reset: Fixed,XXh.
> =================================================================
> 
> As per the spec, the valid value for "node per processor" is 0 and the
> rest are reserved.
> 
> Looking back at the history of the decoding of CPUID_Fn8000001E_ECX, we
> noticed that there were cases where "node per processor" could be more
> than 1. That is valid only for pre-F17h (pre-EPYC) architectures. For
> EPYC or later CPUs, the Linux kernel does not use this information to
> build the L3 topology.
> 
> Also note that the CPUID function 0x8000001E is available only when the
> TOPOEXT feature is enabled. This feature is enabled only for EPYC (F17h)
> or later processors, so previous generations of processors do not
> enumerate the 0x8000001E leaf.
> 
> There could be some corner cases where older guests enable the TOPOEXT
> feature by running with -cpu host, in which case legacy guests might
> notice the topology change. To address those cases, a new CPU property
> "legacy-multi-node" is introduced. It will be true for older machine
> types to maintain compatibility. By default, it will be false, so the
> new decoding will be used going forward.
> 
> The documentation is taken from Preliminary Processor Programming
> Reference (PPR) for AMD Family 19h Model 11h, Revision B1 Processors 55901
> Rev 0.25 - Oct 6, 2022.
> 
> Cc: qemu-sta...@nongnu.org
> Fixes: 31ada106d891 ("Simplify CPUID_8000_001E for AMD")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger 
> Reviewed-by: Zhao Liu 
> ---
> v2: Rebased to the latest tree.
> Updated the pc_compat_8_2 for the new flag.
> Added the comment for new property legacy_multi_node.
> Added Reviewed-by from Zhao.
> ---
>  hw/i386/pc.c  |  4 +++-
>  target/i386/cpu.c | 18 ++
>  target/i386/cpu.h |  6 ++
>  3 files changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 496498df3a..a504e05e62 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -78,7 +78,9 @@
>  { "qemu64-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },\
>  { "athlon-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },
>  
> -GlobalProperty pc_compat_8_2[] = {};
> +GlobalProperty pc_compat_8_2[] = {
> +{ TYPE_X86_CPU, "legacy-multi-node", "on" },
> +};
>  const size_t pc_compat_8_2_len = G_N_ELEMENTS(pc_compat_8_2);
>  
>  GlobalProperty pc_compat_8_1[] = {};
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 95d5f16cd5..2cc84e8500 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -398,12 +398,9 @@ static void encode_topo_cpuid8000001e(X86CPU *cpu, X86CPUTopoInfo *topo_info,
>   * 31:11 Reserved.
>   * 10:8 NodesPerProcessor: Node per processor. Read-only. Reset: XXXb.
>   *  ValidValues:
> - *  Value Description
> - *  000b  1 node per processor.
> - *  001b  2 nodes per processor.
> - *  010b Reserved.
> - *  011b 4 nodes per processor.
> - *  111b-100b Reserved.
> + *  Value   Description
> + *  0h  1 node per processor.
> + *  7h-1h   Reserved.
>   *  7:0 NodeId: Node ID. Read-only. Reset: XXh.
>   *
>   * NOTE: Hardware reserves 3 bits for number of nodes per processor.
> @@ -412,8 +409,12 @@ static void encode_topo_cpuid8000001e(X86CPU *cpu, X86CPUTopoInfo *topo_info,
>   * NodeId is combination of node and socket_id which is already decoded
>   * in apic_id. Just use it by shifting.
>   */
> - 

Re: [PATCH v9 00/21] Introduce smp.modules for x86 in QEMU

2024-02-29 Thread Moger, Babu
Sanity tested on an AMD machine. Looks good.

Tested-by: Babu Moger 

On 2/27/24 04:32, Zhao Liu wrote:
> From: Zhao Liu 
> 
> Hi list,
> 
> This is our v9 patch series, rebased on the master branch at the
> commit 03d496a992d9 ("Merge tag 'pull-qapi-2024-02-26' of
> https://repo.or.cz/qemu/armbru into staging").
> 
> Compared with v8 [1], v9 mainly adds more module description to the
> commit messages and adds the missing smp.modules description and
> documentation.
> 
> Since the general introduction of this series (with the original
> cluster level) is in the v7 [2] cover letter, the following sections
> mainly describe the newly added smp.modules (since v8, x86 cluster
> support was changed to module) as a supplement.
> 
> Since v4 [3], we've dropped the original L2 cache command line option
> (to configure L2 cache topology) and now we have the new RFC [4] to
> support the general cache topology configuration (as the supplement to
> this series).
> 
> Welcome your comments!
> 
> 
> Why We Need a New CPU Topology Level
> 
> 
> For the discussion in v7 about whether we should reuse current
> smp.clusters for x86 module, the core point is what's the essential
> differences between x86 module and general cluster.
> 
> Cluster (for ARM/RISC-V) lacks a comprehensive and rigorous hardware
> definition, and judging from the description of smp.clusters [5] when
> it was introduced to QEMU, the x86 module is very similar to the
> general smp.clusters: both are a layer above the existing core level
> that organizes physical cores sharing the L2 cache.
> 
> But there are following reasons that drive us to introduce the new
> smp.modules:
> 
>   * As the CPU topology abstraction in device tree [6], cluster supports
> nesting (though currently QEMU doesn't support that). In contrast,
> (x86) module does not support nesting.
> 
>   * Due to nesting, there is great flexibility in sharing resources
> on cluster, rather than narrowing cluster down to sharing L2 (and
> L3 tags) as the lowest topology level that contains cores.
> 
>   * Flexible nesting of cluster allows it to correspond to any level
> between the x86 package and core.
> 
>   * In Linux kernel, x86's cluster only represents the L2 cache domain
> but QEMU's smp.clusters is the CPU topology level. Linux kernel will
> also expose module level topology information in sysfs for x86. To
> avoid cluster ambiguity and keep a consistent CPU topology naming
> style with the Linux kernel, we introduce module level for x86.
> 
> Based on the above considerations, and in order to eliminate the naming
> confusion caused by the mapping between general cluster and x86 module,
> we now formally introduce smp.modules as the new topology level.
> 
> 
> Where to Place Module in Existing Topology Levels
> =
> 
> The module is, in existing hardware practice, the lowest layer that
> contains the core, while the cluster is able to have a higher topological
> scope than the module due to its nesting.
> 
> Therefore, we place the module between the cluster and the core:
> 
> drawer/book/socket/die/cluster/module/core/thread
> 
> 
> Additional Consideration on CPU Topology
> 
> 
> Beyond this patchset, nowadays different arches have different topology
> requirements, and maintaining an arch-agnostic general topology in SMP
> is becoming increasingly difficult due to differences in shared
> resources and special flexibility (e.g., nesting):
> 
>   * It becomes difficult to put together all CPU topology hierarchies of
> different arches to define complete topology order.
> 
>   * It also becomes complex to ensure the correctness of the topology
> calculations.
>   - Now the max_cpus is calculated by multiplying all topology
> levels, and too many topology levels can easily cause omissions.
> 
> Maybe we should consider implementing arch-specific topology hierarchies.
> 
> 
> [1]: 
> https://lore.kernel.org/qemu-devel/20240131101350.109512-1-zhao1@linux.intel.com/
> [2]: 
> https://lore.kernel.org/qemu-devel/20240108082727.420817-1-zhao1@linux.intel.com/
> [3]: 
> https://lore.kernel.org/qemu-devel/20231003085516-mutt-send-email-...@kernel.org/
> [4]: 
> https://lore.kernel.org/qemu-devel/20240220092504.726064-1-zhao1@linux.intel.com/
> [5]: 
> https://lore.kernel.org/qemu-devel/c3d68005-54e0-b8fe-8dc1-5989fe3c7...@huawei.com/
> [6]: 
> https://www.kernel.org/doc/Documentation/devicetree/bindings/cpu/cpu-topology.txt
> 
> Thanks and Best Regards,
> Zhao
> ---
> Changelog:
> 
> Changes since v8:
>  * Add the reason of why a new module level is needed in commit message.
>(Markus).
>  * Add the description about how Linux kernel supports x86 module level
>in commit message. (Daniel)
>  * Add module description in qemu_smp_opts.
>  * Add missing "modules" parameter of -smp example in 

Re: [PATCH v9 07/21] i386/cpu: Use APIC ID info get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]

2024-02-29 Thread Moger, Babu



On 2/27/24 04:32, Zhao Liu wrote:
> From: Zhao Liu 
> 
> The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> for cpuid 0x8000001D") adds the cache topology for AMD CPUs by encoding
> the number of sharing threads directly.
> 
> From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> means [1]:
> 
> The number of logical processors sharing this cache is the value of
> this field incremented by 1. To determine which logical processors are
> sharing a cache, determine a Share Id for each processor as follows:
> 
> ShareId = LocalApicId >> log2(NumSharingCache+1)
> 
> Logical processors with the same ShareId then share a cache. If
> NumSharingCache+1 is not a power of two, round it up to the next power
> of two.
> 
> From the description above, the calculation of this field should be the
> same as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the APIC
> ID offsets to calculate this field.
> 
> [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
>  Information
> 
> Cc: Babu Moger 
> Tested-by: Yongwei Ma 
> Signed-off-by: Zhao Liu 


Reviewed-by: Babu Moger 

> ---
> Changes since v7:
>  * Moved this patch after CPUID[4]'s similar change ("i386/cpu: Use APIC
>ID offset to encode cache topo in CPUID[4]"). (Xiaoyao)
>  * Dropped Michael/Babu's Acked/Reviewed/Tested tags since the code
>change due to the rebase.
>  * Re-added Yongwei's Tested tag for his re-testing (compilation on
>Intel platforms).
> 
> Changes since v3:
>  * Rewrote the subject. (Babu)
>  * Deleted the original "comment/help" expression, as this behavior is
>confirmed for AMD CPUs. (Babu)
>  * Renamed "num_apic_ids" (v3) to "num_sharing_cache" to match spec
>definition. (Babu)
> 
> Changes since v1:
>  * Renamed "l3_threads" to "num_apic_ids" in
>    encode_cache_cpuid8000001d(). (Yanan)
>  * Added the description of the original commit and add Cc.
> ---
>  target/i386/cpu.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index c77bcbc44d59..df56c7a449c8 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -331,7 +331,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> uint32_t *eax, uint32_t *ebx,
> uint32_t *ecx, uint32_t *edx)
>  {
> -uint32_t l3_threads;
> +uint32_t num_sharing_cache;
>  assert(cache->size == cache->line_size * cache->associativity *
>cache->partitions * cache->sets);
>  
> @@ -340,11 +340,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>  
>  /* L3 is shared among multiple cores */
>  if (cache->level == 3) {
> -l3_threads = topo_info->cores_per_die * topo_info->threads_per_core;
> -*eax |= (l3_threads - 1) << 14;
> +num_sharing_cache = 1 << apicid_die_offset(topo_info);
>  } else {
> -*eax |= ((topo_info->threads_per_core - 1) << 14);
> +num_sharing_cache = 1 << apicid_core_offset(topo_info);
>  }
> +*eax |= (num_sharing_cache - 1) << 14;
>  
>  assert(cache->line_size > 0);
>  assert(cache->partitions > 0);

-- 
Thanks
Babu Moger



Re: [PATCH v9 21/21] i386/cpu: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]

2024-02-29 Thread Moger, Babu



On 2/27/24 04:32, Zhao Liu wrote:
> From: Zhao Liu 
> 
> CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
> processors sharing cache.
> 
> The number of logical processors sharing this cache is
> NumSharingCache + 1.
> 
> After cache models have topology information, we can use
> CPUCacheInfo.share_level to decide which topology level is encoded
> into CPUID[0x8000001D].EAX[bits 25:14].
> 
> Cc: Babu Moger 
> Tested-by: Yongwei Ma 
> Signed-off-by: Zhao Liu 


Reviewed-by: Babu Moger 

> ---
> Changes since v7:
>  * Renamed max_processor_ids_for_cache() to max_thread_ids_for_cache().
>  * Dropped Michael/Babu's ACKed/Tested tags since the code change.
>  * Re-added Yongwei's Tested tag for his re-testing.
> 
> Changes since v3:
>  * Explained what "CPUID[0x801D].EAX[bits 25:14]" means in the
>commit message. (Babu)
> 
> Changes since v1:
>  * Used cache->share_level as the parameter in
>max_processor_ids_for_cache().
> ---
>  target/i386/cpu.c | 10 +-
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 07cd729c3524..bc21c2d537b3 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -481,20 +481,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> uint32_t *eax, uint32_t *ebx,
> uint32_t *ecx, uint32_t *edx)
>  {
> -uint32_t num_sharing_cache;
>  assert(cache->size == cache->line_size * cache->associativity *
>cache->partitions * cache->sets);
>  
>  *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
> (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
> -
> -/* L3 is shared among multiple cores */
> -if (cache->level == 3) {
> -num_sharing_cache = 1 << apicid_die_offset(topo_info);
> -} else {
> -num_sharing_cache = 1 << apicid_core_offset(topo_info);
> -}
> -*eax |= (num_sharing_cache - 1) << 14;
> +*eax |= max_thread_ids_for_cache(topo_info, cache->share_level) << 14;
>  
>  assert(cache->line_size > 0);
>  assert(cache->partitions > 0);

-- 
Thanks
Babu Moger



Re: [PATCH v9 18/21] hw/i386/pc: Support smp.modules for x86 PC machine

2024-02-29 Thread Moger, Babu



On 2/29/24 01:32, Zhao Liu wrote:
> Hi Babu,
> 
>>>  DEF("smp", HAS_ARG, QEMU_OPTION_smp,
>>>  "-smp 
>>> [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets]\n"
> 
> Here the "drawers" and "books" are listed...
> 
>>> -"   
>>> [,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
>>> +"   
>>> [,dies=dies][,clusters=clusters][,modules=modules][,cores=cores]\n"
>>> +"   [,threads=threads]\n"
>>>  "set the number of initial CPUs to 'n' [default=1]\n"
>>>  "maxcpus= maximum number of total CPUs, including\n"
>>>  "offline CPUs for hotplug, etc\n"
>>> @@ -290,7 +291,8 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
>>>  "sockets= number of sockets in one book\n"
>>>  "dies= number of dies in one socket\n"
>>>  "clusters= number of clusters in one die\n"
>>> -"cores= number of cores in one cluster\n"
>>> +"modules= number of modules in one cluster\n"
>>> +"cores= number of cores in one module\n"
>>>  "threads= number of threads in one core\n"
>>>  "Note: Different machines may have different subsets of the CPU 
>>> topology\n"
>>>  "  parameters supported, so the actual meaning of the supported 
>>> parameters\n"
>>> @@ -306,7 +308,7 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
>>>  "  must be set as 1 in the purpose of correct parsing.\n",
>>>  QEMU_ARCH_ALL)
>>>  SRST
>>> -``-smp 
>>> [[cpus=]n][,maxcpus=maxcpus][,sockets=sockets][,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]``
>>> +``-smp 
>>> [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets][,dies=dies][,clusters=clusters][,modules=modules][,cores=cores][,threads=threads]``
>>
>> You have added drawers, books here. Were they missing before?
>>
> 
> ...so yes, I think those 2 parameters are missing at this place.

OK. If there is another revision, add a line about this change to the
commit message. Otherwise it is fine.

Reviewed-by: Babu Moger 

> 
> Thank you for reviewing this.
> 
> Regards,
> Zhao
> 

-- 
Thanks
Babu Moger



Re: [PATCH v9 18/21] hw/i386/pc: Support smp.modules for x86 PC machine

2024-02-28 Thread Moger, Babu
Hi Zhao,

On 2/27/24 04:32, Zhao Liu wrote:
> From: Zhao Liu 
> 
> As module-level topology support is added to X86CPU, now we can enable
> the support for the modules parameter on PC machines. With this support,
> we can define a 5-level x86 CPU topology with "-smp":
> 
> -smp cpus=*,maxcpus=*,sockets=*,dies=*,modules=*,cores=*,threads=*.
> 
> Additionally, add the 5-level topology example in description of "-smp".
> 
> Tested-by: Yongwei Ma 
> Co-developed-by: Zhuocheng Ding 
> Signed-off-by: Zhuocheng Ding 
> Signed-off-by: Zhao Liu 
> ---
> Changes since v8:
>  * Add missing "modules" parameter in -smp example.
> 
> Changes since v7:
>  * Supported modules instead of clusters for PC.
>  * Dropped Michael/Babu/Yanan's ACKed/Tested/Reviewed tags since the
>code change.
>  * Re-added Yongwei's Tested tag For his re-testing.
> ---
>  hw/i386/pc.c|  1 +
>  qemu-options.hx | 18 ++
>  2 files changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index f8eb684a4926..b270a66605fc 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1830,6 +1830,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
>  mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
>  mc->nvdimm_supported = true;
>  mc->smp_props.dies_supported = true;
> +mc->smp_props.modules_supported = true;
>  mc->default_ram_id = "pc.ram";
>  pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_64;
>  
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 9be1e5817c7d..b5784fda32cb 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -281,7 +281,8 @@ ERST
>  
>  DEF("smp", HAS_ARG, QEMU_OPTION_smp,
>  "-smp 
> [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets]\n"
> -"   
> [,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]\n"
> +"   
> [,dies=dies][,clusters=clusters][,modules=modules][,cores=cores]\n"
> +"   [,threads=threads]\n"
>  "set the number of initial CPUs to 'n' [default=1]\n"
>  "maxcpus= maximum number of total CPUs, including\n"
>  "offline CPUs for hotplug, etc\n"
> @@ -290,7 +291,8 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
>  "sockets= number of sockets in one book\n"
>  "dies= number of dies in one socket\n"
>  "clusters= number of clusters in one die\n"
> -"cores= number of cores in one cluster\n"
> +"modules= number of modules in one cluster\n"
> +"cores= number of cores in one module\n"
>  "threads= number of threads in one core\n"
>  "Note: Different machines may have different subsets of the CPU 
> topology\n"
>  "  parameters supported, so the actual meaning of the supported 
> parameters\n"
> @@ -306,7 +308,7 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
>  "  must be set as 1 in the purpose of correct parsing.\n",
>  QEMU_ARCH_ALL)
>  SRST
> -``-smp 
> [[cpus=]n][,maxcpus=maxcpus][,sockets=sockets][,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]``
> +``-smp 
> [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets][,dies=dies][,clusters=clusters][,modules=modules][,cores=cores][,threads=threads]``

You have added drawers, books here. Were they missing before?

>  Simulate a SMP system with '\ ``n``\ ' CPUs initially present on
>  the machine type board. On boards supporting CPU hotplug, the optional
>  '\ ``maxcpus``\ ' parameter can be set to enable further CPUs to be
> @@ -345,14 +347,14 @@ SRST
>  -smp 8,sockets=2,cores=2,threads=2,maxcpus=8
>  
>  The following sub-option defines a CPU topology hierarchy (2 sockets
> -totally on the machine, 2 dies per socket, 2 cores per die, 2 threads
> -per core) for PC machines which support sockets/dies/cores/threads.
> -Some members of the option can be omitted but their values will be
> -automatically computed:
> +totally on the machine, 2 dies per socket, 2 modules per die, 2 cores per
> +module, 2 threads per core) for PC machines which support sockets/dies
> +/modules/cores/threads. Some members of the option can be omitted but
> +their values will be automatically computed:
>  
>  ::
>  
> --smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16
> +-smp 32,sockets=2,dies=2,modules=2,cores=2,threads=2,maxcpus=32
>  
>  The following sub-option defines a CPU topology hierarchy (2 sockets
>  totally on the machine, 2 clusters per socket, 2 cores per cluster,

-- 
Thanks
Babu Moger



Re: [PATCH v7 00/16] Support smp.clusters for x86 in QEMU

2024-01-08 Thread Moger, Babu
Hi  Zhao,

Ran a few basic tests on AMD systems. The changes look good.

Thanks
Babu


Tested-by: Babu Moger 


On 1/8/24 02:27, Zhao Liu wrote:
> From: Zhao Liu 
> 
> Hi list,
> 
> This is the our v7 patch series, rebased on the master branch at the
> commit d328fef93ae7 ("Merge tag 'pull-20231230' of
> https://gitlab.com/rth7680/qemu into staging").
> 
> No more change since v6 [1] exclude the comment nit update.
> 
> Welcome your comments!
> 
> 
> PS: Since v5, we have dropped "x-l2-cache-topo" option and now are
> working on porting the original x-l2-cache-topo option to smp [2].
> Just like:
> 
> -smp cpus=4,sockets=2,cores=2,threads=1, \
>  l3-cache=socket,l2-cache=core,l1-i-cache=core,l1-d-cache=core
> 
> The cache topology enhancement in this patch set is the preparation for
> supporting future user-configurable cache topology (via generic cli
> interface).
> 
> 
> ---
> # Introduction
> 
> This series adds cluster support for the x86 PC machine, which allows
> x86 to use smp.clusters to configure its module-level CPU topology.
> 
> This series is also preparation for helping x86 define a more flexible
> cache topology, such as having multiple cores share the same L2 cache
> at the cluster level. (That is what x-l2-cache-topo did, and we will
> explore a generic way.)
> 
> For why we don't share the L2 cache at the cluster level by default and
> need a configuration mechanism, please see the section: ## Why not share
> L2 cache in cluster directly.
> 
> 
> # Background
> 
> The "clusters" parameter in "smp" is introduced by ARM [3], but x86
> hasn't supported it.
> 
> At present, x86 defaults to the L2 cache being shared within one core,
> but this is not enough. There are platforms where multiple cores share
> the same L2 cache, e.g., Alder Lake-P shares one L2 cache per module of
> Atom cores [4]; that is, every four Atom cores share one L2 cache.
> Therefore, we need the new CPU topology level (cluster/module).
> 
> Another reason is the hybrid architecture. Cluster support not only
> provides another level of topology definition for x86, but also provides
> the code changes required for our future hybrid topology support.
> 
> 
> # Overview
> 
> ## Introduction of module level for x86
> 
> "cluster" in smp is the CPU topology level which is between "core" and
> die.
> 
> For x86, the "cluster" in smp is corresponding to the module level [4],
> which is above the core level. So use the "module" other than "cluster"
> in x86 code.
> 
> And please note that x86 already has a CPU topology level also named
> "cluster" [5]; that level sits above the package. The cluster in the
> x86 CPU topology is completely different from "clusters" as the smp
> parameter. After the module level is introduced, the cluster smp
> parameter will actually refer to the module level of x86.
> 
> 
> ## Why not share L2 cache in cluster directly
> 
> Though "clusters" was introduced to help define L2 cache topology
> [3], using cluster to define x86's L2 cache topology will cause the
> compatibility problem:
> 
> Currently, x86 defaults to the L2 cache being shared within one core,
> which actually implies a default setting of "cores per L2 cache is 1"
> and therefore implicitly defaults to having as many L2 caches as cores.
> 
> For example (i386 PC machine):
> -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> 
> Considering the topology of the L2 cache, this (*) implicitly means "1
> core per L2 cache" and "2 L2 caches per die".
> 
> If we use cluster to configure L2 cache topology with the new default
> setting "clusters per L2 cache is 1", the above semantics will change
> to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> cores per L2 cache".
> 
> So the same command (*) will cause changes in the L2 cache topology,
> further affecting the performance of the virtual machine.
> 
> Therefore, x86 should only treat cluster as a cpu topology level and
> avoid using it to change L2 cache by default for compatibility.
> 
> 
> ## module level in CPUID
> 
> Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology: Fix
> erroneous smp_num_siblings on Intel Hybrid platforms") is able to
> handle platforms with Module level enumerated via CPUID.1F.
> 
> Expose the module level in CPUID[0x1F] (for Intel CPUs) if the machine
> has more than 1 modules since v3.
> 
> 
> ## New cache topology info in CPUCacheInfo
> 
> (This is in preparation for users being able to configure cache topology
> from the cli later on.)
> 
> Currently, by default, the cache topology is encoded as:
> 1. i/d cache is shared in one core.
> 2. L2 cache is shared in one core.
> 3. L3 cache is shared in one die.
> 
> This default general setting has caused a misunderstanding, that is, the
> cache topology is completely equated with a specific cpu topology, such
> as the connection between L2 cache and core level, and the connection
> between L3 cache and die level.
> 
> In fact, the settings of these topologies depend on the specific
> 

Re: [PATCH] target/i386: Fix CPUID encoding of Fn8000001E_ECX

2023-12-14 Thread Moger, Babu

Hi Zhao,

On 12/14/2023 8:08 AM, Zhao Liu wrote:

On Fri, Nov 10, 2023 at 11:08:06AM -0600, Babu Moger wrote:

Date: Fri, 10 Nov 2023 11:08:06 -0600
From: Babu Moger 
Subject: [PATCH] target/i386: Fix CPUID encoding of Fn8000001E_ECX
X-Mailer: git-send-email 2.34.1

Observed the following failure while booting an SEV-SNP guest with the
smp parameters:
"-smp 192,sockets=1,dies=12,cores=8,threads=2".

qemu-system-x86_64: sev_snp_launch_update: SNP_LAUNCH_UPDATE ret=-5 fw_error=22 'Invalid parameter'
qemu-system-x86_64: SEV-SNP: CPUID validation failed for function 0x8000001e, index: 0x0.
provided: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000b00, edx: 0x00000000
expected: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000300, edx: 0x00000000
qemu-system-x86_64: SEV-SNP: failed update CPUID page

The failure is due to an overflow of the bits used for "Node per
processor" in CPUID Fn8000001E_ECX. This field is 3 bits wide and can
hold a maximum value of 0x7. With dies=12 (0xB), it overflows and spills
over into the reserved bits. In the case of SEV-SNP, this causes a CPUID
enforcement failure and the guest fails to boot.

The PPR documentation for CPUID_Fn8000001E_ECX [Node Identifiers]
=================================================================
Bits    Description
31:11   Reserved.

10:8    NodesPerProcessor: Node per processor. Read-only.
        ValidValues:
        Value   Description
        0h      1 node per processor.
        7h-1h   Reserved.

7:0     NodeId: Node ID. Read-only. Reset: Fixed,XXh.
=================================================================

As per the spec, the valid value for "node per processor" is 0 and the
rest are reserved.

Looking back at the history of the decoding of CPUID_Fn8000001E_ECX, we
noticed that there were cases where "node per processor" could be more
than 1. That is valid only for pre-F17h (pre-EPYC) architectures. For
EPYC or later CPUs, the Linux kernel does not use this information to
build the L3 topology.

Also note that the CPUID function 0x8000001E is available only when
the TOPOEXT feature is enabled.

One additional query: such a dependency relationship is not reflected in
encode_topo_cpuid8000001e(); should TOPOEXT be checked in
encode_topo_cpuid8000001e()?

No. We don't need to check it in encode_topo_cpuid8000001e(). The
dependency check is done before this function is called.



This feature is enabled only for EPYC (F17h)
or later processors, so previous generations of processors do not
enumerate the 0x8000001E leaf.

There could be some corner cases where older guests enable the TOPOEXT
feature by running with -cpu host, in which case legacy guests might
notice the topology change. To address those cases, a new CPU property
"legacy-multi-node" is introduced. It will be true for older machine
types to maintain compatibility. By default, it will be false, so the
new decoding will be used going forward.

The documentation is taken from Preliminary Processor Programming
Reference (PPR) for AMD Family 19h Model 11h, Revision B1 Processors 55901
Rev 0.25 - Oct 6, 2022.

Cc: qemu-sta...@nongnu.org
Fixes: 31ada106d891 ("Simplify CPUID_8000_001E for AMD")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger 
---

[snip]


+++ b/target/i386/cpu.h
@@ -1988,6 +1988,7 @@ struct ArchCPU {
   * If true present the old cache topology information
   */
  bool legacy_cache;
+bool legacy_multi_node;

This property deserves a comment, as does legacy_cache above.

Sure. Will do.


  
  /* Compatibility bits for old machine types: */

  bool enable_cpuid_0xb;
--
2.34.1


Just the above nit, otherwise,
Reviewed-by: Zhao Liu 


Thank you.

Babu







Re: [PATCH] target/i386: Fix CPUID encoding of Fn8000001E_ECX

2023-12-13 Thread Moger, Babu

Gentle reminder. Please let me know if there are any concerns or please
pull these patches for next update.
Thanks Babu

On 11/10/23 11:08, Babu Moger wrote:

Observed the following failure while booting an SEV-SNP guest; the
guest fails to boot with these -smp parameters:
"-smp 192,sockets=1,dies=12,cores=8,threads=2".

qemu-system-x86_64: sev_snp_launch_update: SNP_LAUNCH_UPDATE ret=-5 fw_error=22
'Invalid parameter'
qemu-system-x86_64: SEV-SNP: CPUID validation failed for function 0x8000001e,
index: 0x0.
provided: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000b00, edx: 0x00000000
expected: eax:0x00000000, ebx: 0x00000100, ecx: 0x00000300, edx: 0x00000000
qemu-system-x86_64: SEV-SNP: failed update CPUID page

The reason for the failure is overflow of the bits used for "node per
processor" in CPUID Fn8000001E_ECX. The field is 3 bits wide and can
hold a maximum value of 0x7. With dies=12 (0xB), it overflows and spills
over into the reserved bits. In the case of SEV-SNP, this causes a CPUID
enforcement failure and the guest fails to boot.

The PPR documentation for CPUID_Fn8000001E_ECX [Node Identifiers]
=================================================================
Bits    Description
31:11   Reserved.

10:8    NodesPerProcessor: Node per processor. Read-only.
        ValidValues:
        Value   Description
        0h      1 node per processor.
        7h-1h   Reserved.

7:0     NodeId: Node ID. Read-only. Reset: Fixed,XXh.
=================================================================

As the spec shows, the only valid value for "node per processor" is 0;
the rest are reserved.

Looking back at the history of decoding CPUID_Fn8000001E_ECX, there
were cases where "node per processor" could be more than 1, but that is
valid only for pre-F17h (pre-EPYC) architectures. For EPYC or later
CPUs, the Linux kernel does not use this information to build the L3
topology.

Also note that the CPUID function 0x8000001E_ECX is available only when
the TOPOEXT feature is enabled. This feature is enabled only for EPYC
(F17h) or later processors, so previous generations of processors do not
enumerate the 0x8000001E_ECX leaf.

There could be some corner cases where older guests enable the TOPOEXT
feature by running with -cpu host, in which case legacy guests might
notice the topology change. To address those cases, introduce a new CPU
property, "legacy-multi-node". It will be true for older machine types
to maintain compatibility. By default it will be false, so the new
decoding will be used going forward.

The documentation is taken from Preliminary Processor Programming
Reference (PPR) for AMD Family 19h Model 11h, Revision B1 Processors 55901
Rev 0.25 - Oct 6, 2022.

Cc: qemu-sta...@nongnu.org
Fixes: 31ada106d891 ("Simplify CPUID_8000_001E for AMD")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger 
---
 hw/i386/pc.c  |  4 +++-
 target/i386/cpu.c | 18 ++
 target/i386/cpu.h |  1 +
 3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 188bc9d0f8..624d5da146 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -77,7 +77,9 @@
 { "qemu64-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },\
 { "athlon-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },
 
-GlobalProperty pc_compat_8_1[] = {};
+GlobalProperty pc_compat_8_1[] = {
+    { TYPE_X86_CPU, "legacy-multi-node", "on" },
+};
 const size_t pc_compat_8_1_len = G_N_ELEMENTS(pc_compat_8_1);
 
 GlobalProperty pc_compat_8_0[] = {

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 358d9c0a65..baee9394a1 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -398,12 +398,9 @@ static void encode_topo_cpuid8000001e(X86CPU *cpu,
X86CPUTopoInfo *topo_info,
  * 31:11 Reserved.
  * 10:8 NodesPerProcessor: Node per processor. Read-only. Reset: XXXb.
  *  ValidValues:
- *  Value Description
- *  000b  1 node per processor.
- *  001b  2 nodes per processor.
- *  010b Reserved.
- *  011b 4 nodes per processor.
- *  111b-100b Reserved.
+ *  Value   Description
+ *  0h  1 node per processor.
+ *  7h-1h   Reserved.
  *  7:0 NodeId: Node ID. Read-only. Reset: XXh.
  *
  * NOTE: Hardware reserves 3 bits for number of nodes per processor.
@@ -412,8 +409,12 @@ static void encode_topo_cpuid8000001e(X86CPU *cpu,
X86CPUTopoInfo *topo_info,
  * NodeId is combination of node and socket_id which is already decoded
  * in apic_id. Just use it by shifting.
  */
-*ecx = ((topo_info->dies_per_pkg - 1) << 8) |
-   ((cpu->apic_id >> apicid_die_offset(topo_info)) & 0xFF);
+if (cpu->legacy_multi_node) {
+*ecx = ((topo_info->dies_per_pkg - 1) << 8) |
+   ((cpu->apic_id >> apicid_die_offset(topo_info)) & 0xFF);
+} else {
+*ecx = (cpu->apic_id >> apicid_pkg_offset(topo_info)) & 0xFF;
+}
 
 *edx = 0;

 

Re: [PATCH v4 20/21] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]

2023-09-22 Thread Moger, Babu



On 9/14/2023 2:21 AM, Zhao Liu wrote:

From: Zhao Liu 

CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
processors sharing cache.

The number of logical processors sharing this cache is
NumSharingCache + 1.

After cache models have topology information, we can use
CPUCacheInfo.share_level to decide which topology level to be encoded
into CPUID[0x8000001D].EAX[bits 25:14].

Signed-off-by: Zhao Liu 


Reviewed-by: Babu Moger 



---
Changes since v3:
  * Explain what "CPUID[0x8000001D].EAX[bits 25:14]" means in the commit
message. (Babu)

Changes since v1:
  * Use cache->share_level as the parameter in
max_processor_ids_for_cache().
---
  target/i386/cpu.c | 10 +-
  1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index bc28c59df089..3bed823dc3b7 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -482,20 +482,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo
*cache,
 uint32_t *eax, uint32_t *ebx,
 uint32_t *ecx, uint32_t *edx)
  {
-uint32_t num_sharing_cache;
  assert(cache->size == cache->line_size * cache->associativity *
cache->partitions * cache->sets);
  
  *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |

 (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
-
-/* L3 is shared among multiple cores */
-if (cache->level == 3) {
-num_sharing_cache = 1 << apicid_die_offset(topo_info);
-} else {
-num_sharing_cache = 1 << apicid_core_offset(topo_info);
-}
-*eax |= (num_sharing_cache - 1) << 14;
+*eax |= max_processor_ids_for_cache(topo_info, cache->share_level) << 14;
  
  assert(cache->line_size > 0);

  assert(cache->partitions > 0);




Re: [PATCH v4 19/21] i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]

2023-09-22 Thread Moger, Babu



On 9/14/2023 2:21 AM, Zhao Liu wrote:

From: Zhao Liu 

The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
for cpuid 0x801D") adds the cache topology for AMD CPU by encoding
the number of sharing threads directly.

 From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
means [1]:

The number of logical processors sharing this cache is the value of
this field incremented by 1. To determine which logical processors are
sharing a cache, determine a Share Id for each processor as follows:

ShareId = LocalApicId >> log2(NumSharingCache+1)

Logical processors with the same ShareId then share a cache. If
NumSharingCache+1 is not a power of two, round it up to the next power
of two.

 From the description above, the calculation of this field should be same
as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the offsets of
APIC ID to calculate this field.

[1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
  Information

Signed-off-by: Zhao Liu 

Reviewed-by: Babu Moger 

---
Changes since v3:
  * Rewrite the subject. (Babu)
  * Delete the original "comment/help" expression, as this behavior is
confirmed for AMD CPUs. (Babu)
  * Rename "num_apic_ids" (v3) to "num_sharing_cache" to match spec
definition. (Babu)

Changes since v1:
  * Rename "l3_threads" to "num_apic_ids" in
encode_cache_cpuid8000001d(). (Yanan)
  * Add the description of the original commit and add Cc.
---
  target/i386/cpu.c | 10 --
  1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 5d066107d6ce..bc28c59df089 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -482,7 +482,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 uint32_t *eax, uint32_t *ebx,
 uint32_t *ecx, uint32_t *edx)
  {
-uint32_t l3_threads;
+uint32_t num_sharing_cache;
  assert(cache->size == cache->line_size * cache->associativity *
cache->partitions * cache->sets);
  
@@ -491,13 +491,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
  
  /* L3 is shared among multiple cores */

  if (cache->level == 3) {
-l3_threads = topo_info->modules_per_die *
- topo_info->cores_per_module *
- topo_info->threads_per_core;
-*eax |= (l3_threads - 1) << 14;
+num_sharing_cache = 1 << apicid_die_offset(topo_info);
  } else {
-*eax |= ((topo_info->threads_per_core - 1) << 14);
+num_sharing_cache = 1 << apicid_core_offset(topo_info);
  }
+*eax |= (num_sharing_cache - 1) << 14;
  
  assert(cache->line_size > 0);

  assert(cache->partitions > 0);




Re: [PATCH v4 00/21] Support smp.clusters for x86 in QEMU

2023-09-22 Thread Moger, Babu

Tested the series on AMD system. Created a VM and ran some basic commands.

Everything looks good.

Tested-by: Babu Moger 


On 9/14/2023 2:21 AM, Zhao Liu wrote:

From: Zhao Liu 

Hi list,

(CC k...@vger.kernel.org for better browsing.)

This is the our v4 patch series, rebased on the master branch at the
commit 9ef497755afc2 ("Merge tag 'pull-vfio-20230911' of
https://github.com/legoater/qemu into staging").

Comparing with v3 [1], v4 mainly refactors the CPUID[0x1F] encoding and
exposes module level in CPUID[0x1F] with these new patches:

* [PATCH v4 08/21] i386: Split topology types of CPUID[0x1F] from the
definitions of CPUID[0xB]
* [PATCH v4 09/21] i386: Decouple CPUID[0x1F] subleaf with specific
topology level
* [PATCH v4 12/21] i386: Expose module level in CPUID[0x1F]

v4 also fixes compile warnings and fixes cache topology uninitialization
bugs for some AMD CPUs.

Welcome your comments!


# Introduction

This series adds cluster support for the x86 PC machine, which allows
x86 to use smp.clusters to configure the module-level CPU topology.

And due to the compatibility issue (see section: ## Why not share L2
cache in cluster directly), this series also introduces a new command
to adjust the topology of the x86 L2 cache.



# Background

The "clusters" parameter in "smp" is introduced by ARM [2], but x86
hasn't supported it.

At present, x86 defaults to the L2 cache being shared in one core, but
this is not enough. There are some platforms where multiple cores share
the same L2 cache, e.g., Alder Lake-P shares an L2 cache across one
module of Atom cores [3], that is, every four Atom cores share one L2
cache. Therefore, we need the new CPU topology level (cluster/module).

Another reason is hybrid architecture. Cluster support not only
provides another level of topology definition in x86, but also provides
the required code changes for our future hybrid topology support.


# Overview

## Introduction of module level for x86

"cluster" in smp is the CPU topology level which is between "core" and
die.

For x86, the "cluster" in smp is corresponding to the module level [4],
which is above the core level. So use the "module" other than "cluster"
in x86 code.

And please note that x86 already has a cpu topology level also named
"cluster" [4], this level is at the upper level of the package. Here,
the cluster in x86 cpu topology is completely different from the
"clusters" as the smp parameter. After the module level is introduced,
the cluster as the smp parameter will actually refer to the module level
of x86.


## Why not share L2 cache in cluster directly

Though "clusters" was introduced to help define L2 cache topology
[2], using cluster to define x86's L2 cache topology will cause the
compatibility problem:

Currently, x86 defaults to the L2 cache being shared in one core, which
implies a default setting of "1 core per L2 cache" and therefore
implicitly as many L2 caches as cores.

For example (i386 PC machine):
-smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)

Considering the topology of the L2 cache, this (*) implicitly means "1
core per L2 cache" and "2 L2 caches per die".

If we use cluster to configure L2 cache topology with the new default
setting "clusters per L2 cache is 1", the above semantics will change
to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
cores per L2 cache".

So the same command (*) will cause changes in the L2 cache topology,
further affecting the performance of the virtual machine.

Therefore, x86 should only treat cluster as a cpu topology level and
avoid using it to change L2 cache by default for compatibility.


## module level in CPUID

The Linux kernel (from v6.4, with commit edc0a2b595765 ("x86/topology:
Fix erroneous smp_num_siblings on Intel Hybrid platforms")) is able to
handle platforms with the Module level enumerated via CPUID.1F.

Expose the module level in CPUID[0x1F] (for Intel CPUs) if the machine
has more than one module (since v3).

We can configure CPUID.04H.02H (L2 cache topology) with module level by
a new command:

 "-cpu,x-l2-cache-topo=cluster"

More information about this command, please see the section: "## New
property: x-l2-cache-topo".


## New cache topology info in CPUCacheInfo

Currently, by default, the cache topology is encoded as:
1. i/d cache is shared in one core.
2. L2 cache is shared in one core.
3. L3 cache is shared in one die.

This default general setting has caused a misunderstanding, that is, the
cache topology is completely equated with a specific cpu topology, such
as the connection between L2 cache and core level, and the connection
between L3 cache and die level.

In fact, the settings of these topologies depend on the specific
platform and are not static. For example, on Alder Lake-P, every
four Atom cores share the same L2 cache [2].

Thus, in this patch set, we explicitly define the corresponding cache
topology for different cpu models and this has two 

Re: [PATCH v4 01/21] i386: Fix comment style in topology.h

2023-09-22 Thread Moger, Babu



On 9/14/2023 2:21 AM, Zhao Liu wrote:

From: Zhao Liu 

For function comments in this file, keep the comment style consistent
with other files in the directory.

Signed-off-by: Zhao Liu 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Yanan Wang 
Reviewed-by: Xiaoyao Li 
Acked-by: Michael S. Tsirkin 


Reviewed-by: Babu Moger 

Thanks

Babu



---
Changes since v3:
  * Optimized the description in commit message: Change "with other
places" to "with other files in the directory". (Babu)
---
  include/hw/i386/topology.h | 33 +
  1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 81573f6cfde0..5a19679f618b 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -24,7 +24,8 @@
  #ifndef HW_I386_TOPOLOGY_H
  #define HW_I386_TOPOLOGY_H
  
-/* This file implements the APIC-ID-based CPU topology enumeration logic,

+/*
+ * This file implements the APIC-ID-based CPU topology enumeration logic,
   * documented at the following document:
   *   Intel® 64 Architecture Processor Topology Enumeration
   *   
http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
@@ -41,7 +42,8 @@
  
  #include "qemu/bitops.h"
  
-/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support

+/*
+ * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
   */
  typedef uint32_t apic_id_t;
  
@@ -58,8 +60,7 @@ typedef struct X86CPUTopoInfo {

  unsigned threads_per_core;
  } X86CPUTopoInfo;
  
-/* Return the bit width needed for 'count' IDs

- */
+/* Return the bit width needed for 'count' IDs */
  static unsigned apicid_bitwidth_for_count(unsigned count)
  {
  g_assert(count >= 1);
@@ -67,15 +68,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
  return count ? 32 - clz32(count) : 0;
  }
  
-/* Bit width of the SMT_ID (thread ID) field on the APIC ID

- */
+/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
  static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
  {
  return apicid_bitwidth_for_count(topo_info->threads_per_core);
  }
  
-/* Bit width of the Core_ID field

- */
+/* Bit width of the Core_ID field */
  static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
  {
  return apicid_bitwidth_for_count(topo_info->cores_per_die);
@@ -87,8 +86,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo 
*topo_info)
  return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
  }
  
-/* Bit offset of the Core_ID field

- */
+/* Bit offset of the Core_ID field */
  static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
  {
  return apicid_smt_width(topo_info);
@@ -100,14 +98,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo 
*topo_info)
  return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
  }
  
-/* Bit offset of the Pkg_ID (socket ID) field

- */
+/* Bit offset of the Pkg_ID (socket ID) field */
  static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
  {
  return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
  }
  
-/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID

+/*
+ * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
   *
   * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
   */
@@ -120,7 +118,8 @@ static inline apic_id_t 
x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
 topo_ids->smt_id;
  }
  
-/* Calculate thread/core/package IDs for a specific topology,

+/*
+ * Calculate thread/core/package IDs for a specific topology,
   * based on (contiguous) CPU index
   */
  static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
@@ -137,7 +136,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo 
*topo_info,
  topo_ids->smt_id = cpu_index % nr_threads;
  }
  
-/* Calculate thread/core/package IDs for a specific topology,

+/*
+ * Calculate thread/core/package IDs for a specific topology,
   * based on APIC ID
   */
  static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
@@ -155,7 +155,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t 
apicid,
  topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
  }
  
-/* Make APIC ID for the CPU 'cpu_index'

+/*
+ * Make APIC ID for the CPU 'cpu_index'
   *
   * 'cpu_index' is a sequential, contiguous ID for the CPU.
   */




Re: [PATCH v2 1/2] i386: Add support for SUCCOR feature

2023-09-06 Thread Moger, Babu

Hi John,

On 9/5/2023 10:01 AM, John Allen wrote:

On Fri, Sep 01, 2023 at 11:30:53AM +0100, Joao Martins wrote:

On 26/07/2023 21:41, John Allen wrote:

Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
be exposed to guests to allow them to handle machine check exceptions on AMD
hosts.

Reported-by: William Roche
Signed-off-by: John Allen

I think this is matching the last discussion:

Reviewed-by: Joao Martins

The patch ordering doesn't look correct though. Perhaps we should expose succor
only after MCE is fixed so this patch would be the second, not the first?

Yes, that makes sense. I will address this and send another version of
the series with the correct ordering.


Also, this should in generally be OK for -cpu host, but might be missing a third
patch that adds "succor" to the AMD models e.g.

Babu,

I think we previously discussed adding this to the models later in a
separate series. Is this your preferred course of action or can we add
it with this series?



Yes. We can add it later as a separate series. We just added EPYC-Genoa;
we don't want to add EPYC-Genoa-v2 at this point. We have a few more
features pending as well.


Thanks

Babu


Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]

2023-08-23 Thread Moger, Babu
Hi Zhao,

On 8/18/23 02:37, Zhao Liu wrote:
> Hi Babu,
> 
> On Mon, Aug 14, 2023 at 11:03:53AM -0500, Moger, Babu wrote:
>> Date: Mon, 14 Aug 2023 11:03:53 -0500
>> From: "Moger, Babu" 
>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>>  CPUID[4]
>>
>> Hi Zhao,
>>
>>
>> On 8/14/23 03:22, Zhao Liu wrote:
>>> Hi Babu,
>>>
>>> On Fri, Aug 04, 2023 at 10:48:29AM -0500, Moger, Babu wrote:
>>>> Date: Fri, 4 Aug 2023 10:48:29 -0500
>>>> From: "Moger, Babu" 
>>>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>>>>  CPUID[4]
>>>>
>>>> Hi Zhao,
>>>>
>>>> On 8/4/23 04:48, Zhao Liu wrote:
>>>>> Hi Babu,
>>>>>
>>>>> On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
>>>>>> Date: Thu, 3 Aug 2023 11:41:40 -0500
>>>>>> From: "Moger, Babu" 
>>>>>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to 
>>>>>> encode
>>>>>>  CPUID[4]
>>>>>>
>>>>>> Hi Zhao,
>>>>>>
>>>>>> On 8/2/23 18:49, Moger, Babu wrote:
>>>>>>> Hi Zhao,
>>>>>>>
>>>>>>> Hitting this error after this patch.
>>>>>>>
>>>>>>> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
>>>>>>> not be reached
>>>>>>> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: 
>>>>>>> code
>>>>>>> should not be reached
>>>>>>> Aborted (core dumped)
>>>>>>>
>>>>>>> Looks like share_level for all the caches for AMD is not initialized.
>>>>>
>>>>> I missed these change when I rebase. Sorry for that.
>>>>>
>>>>> BTW, could I ask a question? From a previous discussion[1], I understand
>>>>> that the cache info is used to show the correct cache information in
>>>>> new machine. And from [2], the wrong cache info may cause "compatibility
>>>>> issues".
>>>>>
>>>>> Is this "compatibility issues" AMD specific? I'm not sure if Intel should
>>>>> update the cache info like that. thanks!
>>>>
>>>> I was going to comment about that. Good that you asked me.
>>>>
>>>> Compatibility is qemu requirement.  Otherwise the migrations will fail.
>>>>
>>>> Any changes in the topology is going to cause migration problems.
>>>
>>> Could you please educate me more about the details of the "migration
>>> problem"?
>>>
>>> I didn't understand why it was causing the problem and wasn't sure if I
>>> was missing any cases.
>>>
>>
>> I am not an expert on migration but I test VM migration sometimes.
>> Here are some guidelines.
>> https://developers.redhat.com/blog/2015/03/24/live-migrating-qemu-kvm-virtual-machines
> 
> Thanks for the material!
> 
>>
>> When you migrate a VM to newer qemu using the same CPU type, migration
>> should work seamless. That means list of CPU features should be compatible
>> when you move to newer qemu version with CPU type.
> 
> I see. This patches set adds the "-smp cluster" command and the
> "x-l2-cache-topo" command. Migration requires that the target and

Shouldn't the x-l2-cache-topo command be disabled by default? (For
example, look at the property x-migrate-smi-count in hw/i386/pc.c.)

It will be enabled when the user passes "-cpu x-l2-cache-topo=[core|cluster]".
The current code enables it by default as far as I can see.

> source VM command lines are the same, so the new commands ensure that
> the migration is consistent.
> 
> But this patch set also includes some topology fixes (nr_cores fix and
> l1 cache topology fix) and encoding change (use APIC ID offset to encode
> addressable ids), these changes would affect migration and may cause
> CPUID change for VM view. Thus if this patch set is accepted, these
> changes also need to be pushed into stable versions. Do you agree?

Yes. That sounds right.

> 
> And about cache info for different CPU generations, migration usually
> happens on the same CPU type, and Intel uses the same default cache
> info for all CPU types. With the consistent cache info, migration is
> also Ok. So if we don't care about the specific cache info in the VM,
> it's okay to use the same default cache info for all CPU types. Right?

I am not sure about this. Please run migration tests to be sure.
-- 
Thanks
Babu Moger



Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]

2023-08-14 Thread Moger, Babu
Hi Zhao,


On 8/14/23 03:22, Zhao Liu wrote:
> Hi Babu,
> 
> On Fri, Aug 04, 2023 at 10:48:29AM -0500, Moger, Babu wrote:
>> Date: Fri, 4 Aug 2023 10:48:29 -0500
>> From: "Moger, Babu" 
>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>>  CPUID[4]
>>
>> Hi Zhao,
>>
>> On 8/4/23 04:48, Zhao Liu wrote:
>>> Hi Babu,
>>>
>>> On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
>>>> Date: Thu, 3 Aug 2023 11:41:40 -0500
>>>> From: "Moger, Babu" 
>>>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>>>>  CPUID[4]
>>>>
>>>> Hi Zhao,
>>>>
>>>> On 8/2/23 18:49, Moger, Babu wrote:
>>>>> Hi Zhao,
>>>>>
>>>>> Hitting this error after this patch.
>>>>>
>>>>> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
>>>>> not be reached
>>>>> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
>>>>> should not be reached
>>>>> Aborted (core dumped)
>>>>>
>>>>> Looks like share_level for all the caches for AMD is not initialized.
>>>
>>> I missed these change when I rebase. Sorry for that.
>>>
>>> BTW, could I ask a question? From a previous discussion[1], I understand
>>> that the cache info is used to show the correct cache information in
>>> new machine. And from [2], the wrong cache info may cause "compatibility
>>> issues".
>>>
>>> Is this "compatibility issues" AMD specific? I'm not sure if Intel should
>>> update the cache info like that. thanks!
>>
>> I was going to comment about that. Good that you asked me.
>>
>> Compatibility is qemu requirement.  Otherwise the migrations will fail.
>>
>> Any changes in the topology is going to cause migration problems.
> 
> Could you please educate me more about the details of the "migration
> problem"?
> 
> I didn't understand why it was causing the problem and wasn't sure if I
> was missing any cases.
> 

I am not an expert on migration but I test VM migration sometimes.
Here are some guidelines.
https://developers.redhat.com/blog/2015/03/24/live-migrating-qemu-kvm-virtual-machines

When you migrate a VM to newer qemu using the same CPU type, migration
should work seamless. That means list of CPU features should be compatible
when you move to newer qemu version with CPU type.

Thanks
Babu




Re: [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]

2023-08-04 Thread Moger, Babu
Hi Zhao,

On 8/4/23 04:56, Zhao Liu wrote:
> Hi Babu,
> 
> On Thu, Aug 03, 2023 at 03:44:13PM -0500, Moger, Babu wrote:
>> Date: Thu, 3 Aug 2023 15:44:13 -0500
>> From: "Moger, Babu" 
>> Subject: Re: [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode
>>  CPUID[0x8000001D].EAX[bits 25:14]
>>
>> Hi Zhao,
>>   Please copy the thread to k...@vger.kernel.org also.  It makes it easier
>> to browse.
>>
> 
> OK. I'm not sure how to cc, should I forward all mail to KVM for the
> current version (v3), or should I cc the kvm mail list for the next
> version (v4)?

Yes. From v4.
Thanks
Babu
> 
>>
>> On 8/1/23 05:35, Zhao Liu wrote:
>>> From: Zhao Liu 
>>>
>>> CPUID[0x8000001D].EAX[bits 25:14] is used to represent the cache
>>> topology for AMD CPUs.
>> Please change this to.
>>
>>
>> CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
>> processors sharing cache. The number of logical processors sharing
>> this cache is NumSharingCache + 1.
> 
> OK.
> 
> Thanks,
> Zhao
> 
>>
>>>
>>> After cache models have topology information, we can use
>>> CPUCacheInfo.share_level to decide which topology level to be encoded
>>> into CPUID[0x8000001D].EAX[bits 25:14].
>>>
>>> Signed-off-by: Zhao Liu 
>>> ---
>>> Changes since v1:
>>>  * Use cache->share_level as the parameter in
>>>max_processor_ids_for_cache().
>>> ---
>>>  target/i386/cpu.c | 10 +-
>>>  1 file changed, 1 insertion(+), 9 deletions(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index f67b6be10b8d..6eee0274ade4 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -361,20 +361,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo
>>> *cache,
>>> uint32_t *eax, uint32_t *ebx,
>>> uint32_t *ecx, uint32_t *edx)
>>>  {
>>> -uint32_t num_apic_ids;
>>>  assert(cache->size == cache->line_size * cache->associativity *
>>>cache->partitions * cache->sets);
>>>  
>>>  *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
>>> (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
>>> -
>>> -/* L3 is shared among multiple cores */
>>> -if (cache->level == 3) {
>>> -num_apic_ids = 1 << apicid_die_offset(topo_info);
>>> -} else {
>>> -num_apic_ids = 1 << apicid_core_offset(topo_info);
>>> -}
>>> -*eax |= (num_apic_ids - 1) << 14;
>>> +*eax |= max_processor_ids_for_cache(topo_info, cache->share_level) << 
>>> 14;
>>>  
>>>  assert(cache->line_size > 0);
>>>  assert(cache->partitions > 0);
>>
>> -- 
>> Thanks
>> Babu Moger

-- 
Thanks
Babu Moger



Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]

2023-08-04 Thread Moger, Babu
Hi Zhao,

On 8/4/23 04:48, Zhao Liu wrote:
> Hi Babu,
> 
> On Thu, Aug 03, 2023 at 11:41:40AM -0500, Moger, Babu wrote:
>> Date: Thu, 3 Aug 2023 11:41:40 -0500
>> From: "Moger, Babu" 
>> Subject: Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode
>>  CPUID[4]
>>
>> Hi Zhao,
>>
>> On 8/2/23 18:49, Moger, Babu wrote:
>>> Hi Zhao,
>>>
>>> Hitting this error after this patch.
>>>
>>> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
>>> not be reached
>>> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
>>> should not be reached
>>> Aborted (core dumped)
>>>
>>> Looks like share_level for all the caches for AMD is not initialized.
> 
> I missed these change when I rebase. Sorry for that.
> 
> BTW, could I ask a question? From a previous discussion[1], I understand
> that the cache info is used to show the correct cache information in
> new machine. And from [2], the wrong cache info may cause "compatibility
> issues".
> 
> Is this "compatibility issues" AMD specific? I'm not sure if Intel should
> update the cache info like that. thanks!

I was going to comment about that. Good that you asked me.

Compatibility is qemu requirement.  Otherwise the migrations will fail.

Any changes in the topology is going to cause migration problems.

I am not sure how you are going to handle this. You can probably look at
the feature "x-intel-pt-auto-level".

make sure to test the migration.

Thanks
Babu


> 
> [1]: 
> https://patchwork.kernel.org/project/kvm/patch/cy4pr12mb1768a3cbe42aafb03cb1081e95...@cy4pr12mb1768.namprd12.prod.outlook.com/
> [2]: 
> https://lore.kernel.org/qemu-devel/20180510204148.11687-1-babu.mo...@amd.com/
> 
>>
>> The following patch fixes the problem.
>>
>> ==
>>
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index f4c48e19fa..976a2755d8 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -528,6 +528,7 @@ static CPUCacheInfo legacy_l2_cache_cpuid2 = {
>>  .size = 2 * MiB,
>>  .line_size = 64,
>>  .associativity = 8,
>> +.share_level = CPU_TOPO_LEVEL_CORE,
> 
> This "legacy_l2_cache_cpuid2" is not used to encode cache topology.
> I should explicitly set this default topo level as CPU_TOPO_LEVEL_UNKNOW.
> 
>>  };
>>
>>
>> @@ -1904,6 +1905,7 @@ static CPUCaches epyc_v4_cache_info = {
>>  .lines_per_tag = 1,
>>  .self_init = 1,
>>  .no_invd_sharing = true,
>> +.share_level = CPU_TOPO_LEVEL_CORE,
>>  },
>>  .l1i_cache = &(CPUCacheInfo) {
>>  .type = INSTRUCTION_CACHE,
>> @@ -1916,6 +1918,7 @@ static CPUCaches epyc_v4_cache_info = {
>>  .lines_per_tag = 1,
>>  .self_init = 1,
>>  .no_invd_sharing = true,
>> +.share_level = CPU_TOPO_LEVEL_CORE,
>>  },
>>  .l2_cache = &(CPUCacheInfo) {
>>  .type = UNIFIED_CACHE,
>> @@ -1926,6 +1929,7 @@ static CPUCaches epyc_v4_cache_info = {
>>  .partitions = 1,
>>  .sets = 1024,
>>  .lines_per_tag = 1,
>> +.share_level = CPU_TOPO_LEVEL_CORE,
>>  },
>>  .l3_cache = &(CPUCacheInfo) {
>>  .type = UNIFIED_CACHE,
>> @@ -1939,6 +1943,7 @@ static CPUCaches epyc_v4_cache_info = {
>>  .self_init = true,
>>  .inclusive = true,
>>  .complex_indexing = false,
>> +.share_level = CPU_TOPO_LEVEL_DIE,
>>  },
>>  };
>>
>> @@ -2008,6 +2013,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>>  .lines_per_tag = 1,
>>  .self_init = 1,
>>  .no_invd_sharing = true,
>> +.share_level = CPU_TOPO_LEVEL_CORE,
>>  },
>>  .l1i_cache = &(CPUCacheInfo) {
>>  .type = INSTRUCTION_CACHE,
>> @@ -2020,6 +2026,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>>  .lines_per_tag = 1,
>>  .self_init = 1,
>>  .no_invd_sharing = true,
>> +.share_level = CPU_TOPO_LEVEL_CORE,
>>  },
>>  .l2_cache = &(CPUCacheInfo) {
>>  .type = UNIFIED_CACHE,
>> @@ -2030,6 +2037,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
>>  .partitions = 1,
>>  .sets = 1024,
>>  .lines_per_tag = 1,
>> +.share_level = CPU_TOPO_LEVEL_CORE,

Re: [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]

2023-08-03 Thread Moger, Babu
Hi Zhao,
  Please copy the thread to k...@vger.kernel.org also.  It makes it easier
to browse.


On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu 
> 
> CPUID[0x8000001D].EAX[bits 25:14] is used to represent the cache
> topology for AMD CPUs.
Please change this to:


CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: the number of
logical processors sharing this cache is NumSharingCache + 1.

> 
> After cache models have topology information, we can use
> CPUCacheInfo.share_level to decide which topology level to be encoded
> into CPUID[0x8000001D].EAX[bits 25:14].
> 
> Signed-off-by: Zhao Liu 
> ---
> Changes since v1:
>  * Use cache->share_level as the parameter in
>max_processor_ids_for_cache().
> ---
>  target/i386/cpu.c | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index f67b6be10b8d..6eee0274ade4 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -361,20 +361,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> uint32_t *eax, uint32_t *ebx,
> uint32_t *ecx, uint32_t *edx)
>  {
> -uint32_t num_apic_ids;
>  assert(cache->size == cache->line_size * cache->associativity *
>cache->partitions * cache->sets);
>  
>  *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
> (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
> -
> -/* L3 is shared among multiple cores */
> -if (cache->level == 3) {
> -num_apic_ids = 1 << apicid_die_offset(topo_info);
> -} else {
> -num_apic_ids = 1 << apicid_core_offset(topo_info);
> -}
> -*eax |= (num_apic_ids - 1) << 14;
> +*eax |= max_processor_ids_for_cache(topo_info, cache->share_level) << 14;
>  
>  assert(cache->line_size > 0);
>  assert(cache->partitions > 0);

-- 
Thanks
Babu Moger



Re: [PATCH v3 15/17] i386: Fix NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]

2023-08-03 Thread Moger, Babu
Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu 
> 
> The commit 8f4202fb1080 ("i386: Populate AMD Processor Cache Information
> for cpuid 0x8000001D") adds the cache topology for AMD CPUs by encoding
> the number of sharing threads directly.
> 
> From AMD's APM, NumSharingCache (CPUID[0x8000001D].EAX[bits 25:14])
> means [1]:
> 
> The number of logical processors sharing this cache is the value of
> this field incremented by 1. To determine which logical processors are
> sharing a cache, determine a Share Id for each processor as follows:
> 
> ShareId = LocalApicId >> log2(NumSharingCache+1)
> 
> Logical processors with the same ShareId then share a cache. If
> NumSharingCache+1 is not a power of two, round it up to the next power
> of two.
> 
> From the description above, the calculation of this field should be the
> same as CPUID[4].EAX[bits 25:14] for Intel CPUs. So also use the APIC ID
> offsets to calculate this field.
> 
> Note: I don't have the AMD hardware available, hope folks can help me
> to test this, thanks!

Yes. Decode looks good. You can remove this note in next revision.

The subject line "Fix" gives the wrong impression. I would change the subject
to (or something like this).

i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14]


> 
> [1]: APM, vol.3, appendix.E.4.15 Function 8000_001Dh--Cache Topology
>  Information
> 
> Cc: Babu Moger 
> Signed-off-by: Zhao Liu 
> ---
> Changes since v1:
>  * Rename "l3_threads" to "num_apic_ids" in
>encode_cache_cpuid8000001d(). (Yanan)
>  * Add the description of the original commit and add Cc.
> ---
>  target/i386/cpu.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index c9897c0fe91a..f67b6be10b8d 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -361,7 +361,7 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
> uint32_t *eax, uint32_t *ebx,
> uint32_t *ecx, uint32_t *edx)
>  {
> -uint32_t l3_threads;
> +uint32_t num_apic_ids;

I would change it to match spec definition.

  uint32_t num_sharing_cache;


>  assert(cache->size == cache->line_size * cache->associativity *
>cache->partitions * cache->sets);
>  
> @@ -370,13 +370,11 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>  
>  /* L3 is shared among multiple cores */
>  if (cache->level == 3) {
> -l3_threads = topo_info->modules_per_die *
> - topo_info->cores_per_module *
> - topo_info->threads_per_core;
> -*eax |= (l3_threads - 1) << 14;
> +num_apic_ids = 1 << apicid_die_offset(topo_info);
>  } else {
> -*eax |= ((topo_info->threads_per_core - 1) << 14);
> +num_apic_ids = 1 << apicid_core_offset(topo_info);
>  }
> +*eax |= (num_apic_ids - 1) << 14;
>  
>  assert(cache->line_size > 0);
>  assert(cache->partitions > 0);

-- 
Thanks
Babu Moger



Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]

2023-08-03 Thread Moger, Babu
Hi Zhao,

On 8/2/23 18:49, Moger, Babu wrote:
> Hi Zhao,
> 
> Hitting this error after this patch.
> 
> ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
> not be reached
> Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
> should not be reached
> Aborted (core dumped)
> 
> Looks like share_level for all the caches for AMD is not initialized.

The following patch fixes the problem.

==


diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f4c48e19fa..976a2755d8 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -528,6 +528,7 @@ static CPUCacheInfo legacy_l2_cache_cpuid2 = {
 .size = 2 * MiB,
 .line_size = 64,
 .associativity = 8,
+.share_level = CPU_TOPO_LEVEL_CORE,
 };


@@ -1904,6 +1905,7 @@ static CPUCaches epyc_v4_cache_info = {
 .lines_per_tag = 1,
 .self_init = 1,
 .no_invd_sharing = true,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l1i_cache = &(CPUCacheInfo) {
 .type = INSTRUCTION_CACHE,
@@ -1916,6 +1918,7 @@ static CPUCaches epyc_v4_cache_info = {
 .lines_per_tag = 1,
 .self_init = 1,
 .no_invd_sharing = true,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l2_cache = &(CPUCacheInfo) {
 .type = UNIFIED_CACHE,
@@ -1926,6 +1929,7 @@ static CPUCaches epyc_v4_cache_info = {
 .partitions = 1,
 .sets = 1024,
 .lines_per_tag = 1,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l3_cache = &(CPUCacheInfo) {
 .type = UNIFIED_CACHE,
@@ -1939,6 +1943,7 @@ static CPUCaches epyc_v4_cache_info = {
 .self_init = true,
 .inclusive = true,
 .complex_indexing = false,
+.share_level = CPU_TOPO_LEVEL_DIE,
 },
 };

@@ -2008,6 +2013,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
 .lines_per_tag = 1,
 .self_init = 1,
 .no_invd_sharing = true,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l1i_cache = &(CPUCacheInfo) {
 .type = INSTRUCTION_CACHE,
@@ -2020,6 +2026,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
 .lines_per_tag = 1,
 .self_init = 1,
 .no_invd_sharing = true,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l2_cache = &(CPUCacheInfo) {
 .type = UNIFIED_CACHE,
@@ -2030,6 +2037,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
 .partitions = 1,
 .sets = 1024,
 .lines_per_tag = 1,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l3_cache = &(CPUCacheInfo) {
 .type = UNIFIED_CACHE,
@@ -2043,6 +2051,7 @@ static const CPUCaches epyc_rome_v3_cache_info = {
 .self_init = true,
 .inclusive = true,
 .complex_indexing = false,
+.share_level = CPU_TOPO_LEVEL_DIE,
 },
 };

@@ -2112,6 +2121,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
 .lines_per_tag = 1,
 .self_init = 1,
 .no_invd_sharing = true,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l1i_cache = &(CPUCacheInfo) {
 .type = INSTRUCTION_CACHE,
@@ -2124,6 +2134,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
 .lines_per_tag = 1,
 .self_init = 1,
 .no_invd_sharing = true,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l2_cache = &(CPUCacheInfo) {
 .type = UNIFIED_CACHE,
@@ -2134,6 +2145,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
 .partitions = 1,
 .sets = 1024,
 .lines_per_tag = 1,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l3_cache = &(CPUCacheInfo) {
 .type = UNIFIED_CACHE,
@@ -2147,6 +2159,7 @@ static const CPUCaches epyc_milan_v2_cache_info = {
 .self_init = true,
 .inclusive = true,
 .complex_indexing = false,
+.share_level = CPU_TOPO_LEVEL_DIE,
 },
 };

@@ -2162,6 +2175,7 @@ static const CPUCaches epyc_genoa_cache_info = {
 .lines_per_tag = 1,
 .self_init = 1,
 .no_invd_sharing = true,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l1i_cache = &(CPUCacheInfo) {
 .type = INSTRUCTION_CACHE,
@@ -2174,6 +2188,7 @@ static const CPUCaches epyc_genoa_cache_info = {
 .lines_per_tag = 1,
 .self_init = 1,
 .no_invd_sharing = true,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l2_cache = &(CPUCacheInfo) {
 .type = UNIFIED_CACHE,
@@ -2184,6 +2199,7 @@ static const CPUCaches epyc_genoa_cache_info = {
 .partitions = 1,
 .sets = 2048,
 .lines_per_tag = 1,
+.share_level = CPU_TOPO_LEVEL_CORE,
 },
 .l3_cache = &(CPUCacheInfo) {
 .type = UNIFIED_CACHE,
@@ -2197,6 +2213,7 @@ static const CPUCaches epyc_genoa_cache_info = {
 .self_init = true,
 .inclusive = true,

Re: [PATCH v3 14/17] i386: Use CPUCacheInfo.share_level to encode CPUID[4]

2023-08-02 Thread Moger, Babu
Hi Zhao,

Hitting this error after this patch.

ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code should
not be reached
Bail out! ERROR:../target/i386/cpu.c:257:max_processor_ids_for_cache: code
should not be reached
Aborted (core dumped)

Looks like share_level for all the caches for AMD is not initialized.

Thanks
Babu

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu 
> 
> CPUID[4].EAX[bits 25:14] is used to represent the cache topology for
> intel CPUs.
> 
> After cache models have topology information, we can use
> CPUCacheInfo.share_level to decide which topology level to be encoded
> into CPUID[4].EAX[bits 25:14].
> 
> And since maximum_processor_id (original "num_apic_ids") is parsed
> based on cpu topology levels, which are verified when parsing smp, there's
> no need to check this value with "assert(num_apic_ids > 0)" again, so
> remove this assert.
> 
> Additionally, wrap the encoding of CPUID[4].EAX[bits 31:26] into a
> helper to make the code cleaner.
> 
> Signed-off-by: Zhao Liu 
> ---
> Changes since v1:
>  * Use "enum CPUTopoLevel share_level" as the parameter in
>max_processor_ids_for_cache().
>  * Make cache_into_passthrough case also use
>max_processor_ids_for_cache() and max_core_ids_in_package() to
>encode CPUID[4]. (Yanan)
>  * Rename the title of this patch (the original is "i386: Use
>CPUCacheInfo.share_level to encode CPUID[4].EAX[bits 25:14]").
> ---
>  target/i386/cpu.c | 70 +--
>  1 file changed, 43 insertions(+), 27 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 55aba4889628..c9897c0fe91a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -234,22 +234,53 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache)
> ((t) == UNIFIED_CACHE) ? CACHE_TYPE_UNIFIED : \
> 0 /* Invalid value */)
>  
> +static uint32_t max_processor_ids_for_cache(X86CPUTopoInfo *topo_info,
> +enum CPUTopoLevel share_level)
> +{
> +uint32_t num_ids = 0;
> +
> +switch (share_level) {
> +case CPU_TOPO_LEVEL_CORE:
> +num_ids = 1 << apicid_core_offset(topo_info);
> +break;
> +case CPU_TOPO_LEVEL_DIE:
> +num_ids = 1 << apicid_die_offset(topo_info);
> +break;
> +case CPU_TOPO_LEVEL_PACKAGE:
> +num_ids = 1 << apicid_pkg_offset(topo_info);
> +break;
> +default:
> +/*
> + * Currently there is no use case for SMT and MODULE, so use
> + * assert directly to facilitate debugging.
> + */
> +g_assert_not_reached();
> +}
> +
> +return num_ids - 1;
> +}
> +
> +static uint32_t max_core_ids_in_package(X86CPUTopoInfo *topo_info)
> +{
> +uint32_t num_cores = 1 << (apicid_pkg_offset(topo_info) -
> +   apicid_core_offset(topo_info));
> +return num_cores - 1;
> +}
>  
>  /* Encode cache info for CPUID[4] */
>  static void encode_cache_cpuid4(CPUCacheInfo *cache,
> -int num_apic_ids, int num_cores,
> +X86CPUTopoInfo *topo_info,
>  uint32_t *eax, uint32_t *ebx,
>  uint32_t *ecx, uint32_t *edx)
>  {
>  assert(cache->size == cache->line_size * cache->associativity *
>cache->partitions * cache->sets);
>  
> -assert(num_apic_ids > 0);
>  *eax = CACHE_TYPE(cache->type) |
> CACHE_LEVEL(cache->level) |
> (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
> -   ((num_cores - 1) << 26) |
> -   ((num_apic_ids - 1) << 14);
> +   (max_core_ids_in_package(topo_info) << 26) |
> +   (max_processor_ids_for_cache(topo_info, cache->share_level) << 
> 14);
>  
>  assert(cache->line_size > 0);
>  assert(cache->partitions > 0);
> @@ -6116,56 +6147,41 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
>  
>  if (cores_per_pkg > 1) {
> -int addressable_cores_offset =
> -apicid_pkg_offset(&topo_info) -
> -apicid_core_offset(&topo_info);
> -
>  *eax &= ~0xFC000000;
> -*eax |= (1 << addressable_cores_offset - 1) << 26;
> +*eax |= max_core_ids_in_package(&topo_info) << 26;
>  }
>  if (host_vcpus_per_cache > cpus_per_pkg) {
> -int pkg_offset = apicid_pkg_offset(&topo_info);
> -
>  *eax &= ~0x3FFC000;
> -*eax |= (1 << pkg_offset - 1) << 14;
> +*eax |=
> +max_processor_ids_for_cache(&topo_info,
> +CPU_TOPO_LEVEL_PACKAGE) 

Re: [PATCH v3 10/17] i386/cpu: Introduce cluster-id to X86CPU

2023-08-02 Thread Moger, Babu
Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhuocheng Ding 
> 
> We introduce cluster-id other than module-id to be consistent with

s/We introduce/Introduce/

Thanks
Babu

> CpuInstanceProperties.cluster-id, and this avoids the confusion
> of parameter names when hotplugging.
> 
> Following the legacy smp check rules, also add the cluster_id validity
> into x86_cpu_pre_plug().
> 
> Signed-off-by: Zhuocheng Ding 
> Co-developed-by: Zhao Liu 
> Signed-off-by: Zhao Liu 
> Acked-by: Michael S. Tsirkin 
> ---
>  hw/i386/x86.c | 33 +
>  target/i386/cpu.c |  2 ++
>  target/i386/cpu.h |  1 +
>  3 files changed, 28 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 0b460fd6074d..8154b86f95c7 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -328,6 +328,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>  cpu->die_id = 0;
>  }
>  
> +/*
> + * cluster-id was optional in QEMU 8.0 and older, so keep it optional
> + * if there's only one cluster per die.
> + */
> +if (cpu->cluster_id < 0 && ms->smp.clusters == 1) {
> +cpu->cluster_id = 0;
> +}
> +
>  if (cpu->socket_id < 0) {
>  error_setg(errp, "CPU socket-id is not set");
>  return;
> @@ -344,6 +352,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
> cpu->die_id, ms->smp.dies - 1);
>  return;
>  }
> +if (cpu->cluster_id < 0) {
> +error_setg(errp, "CPU cluster-id is not set");
> +return;
> +} else if (cpu->cluster_id > ms->smp.clusters - 1) {
> +error_setg(errp, "Invalid CPU cluster-id: %u must be in range 0:%u",
> +   cpu->cluster_id, ms->smp.clusters - 1);
> +return;
> +}
>  if (cpu->core_id < 0) {
>  error_setg(errp, "CPU core-id is not set");
>  return;
> @@ -363,16 +379,9 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>  
>  topo_ids.pkg_id = cpu->socket_id;
>  topo_ids.die_id = cpu->die_id;
> +topo_ids.module_id = cpu->cluster_id;
>  topo_ids.core_id = cpu->core_id;
>  topo_ids.smt_id = cpu->thread_id;
> -
> -/*
> - * TODO: This is the temporary initialization for topo_ids.module_id to
> - * avoid "maybe-uninitialized" compilation errors. Will remove when
> - * X86CPU supports cluster_id.
> - */
> -topo_ids.module_id = 0;
> -
> cpu->apic_id = x86_apicid_from_topo_ids(&topo_info, &topo_ids);
>  }
>  
> @@ -419,6 +428,14 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
>  }
>  cpu->die_id = topo_ids.die_id;
>  
> +if (cpu->cluster_id != -1 && cpu->cluster_id != topo_ids.module_id) {
> +error_setg(errp, "property cluster-id: %u doesn't match set apic-id:"
> +" 0x%x (cluster-id: %u)", cpu->cluster_id, cpu->apic_id,
> +topo_ids.module_id);
> +return;
> +}
> +cpu->cluster_id = topo_ids.module_id;
> +
>  if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
>  error_setg(errp, "property core-id: %u doesn't match set apic-id:"
>  " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id,
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index d6969813ee02..ffa282219078 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -7806,12 +7806,14 @@ static Property x86_cpu_properties[] = {
>  DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, 0),
>  DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, 0),
>  DEFINE_PROP_INT32("core-id", X86CPU, core_id, 0),
> +DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, 0),
>  DEFINE_PROP_INT32("die-id", X86CPU, die_id, 0),
>  DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, 0),
>  #else
>  DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),
>  DEFINE_PROP_INT32("thread-id", X86CPU, thread_id, -1),
>  DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
> +DEFINE_PROP_INT32("cluster-id", X86CPU, cluster_id, -1),
>  DEFINE_PROP_INT32("die-id", X86CPU, die_id, -1),
>  DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
>  #endif
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 5e97d0b76574..d9577938ae04 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -2034,6 +2034,7 @@ struct ArchCPU {
>  int32_t node_id; /* NUMA node this CPU belongs to */
>  int32_t socket_id;
>  int32_t die_id;
> +int32_t cluster_id;
>  int32_t core_id;
>  int32_t thread_id;
>  

-- 
Thanks
Babu Moger



Re: [PATCH v3 08/17] i386: Support modules_per_die in X86CPUTopoInfo

2023-08-02 Thread Moger, Babu
Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhuocheng Ding 
> 
> Support module level in i386 cpu topology structure "X86CPUTopoInfo".
> 
> Since x86 does not yet support the "clusters" parameter in "-smp",
> X86CPUTopoInfo.modules_per_die is currently always 1. Therefore, the
> module level width in APIC ID, which can be calculated by
> "apicid_bitwidth_for_count(topo_info->modules_per_die)", is always 0
> for now, so we can directly add APIC ID related helpers to support
> module level parsing.
> 
> At present, we don't expose the module level in CPUID.1FH because current
> Linux (v6.4-rc1) doesn't support the module level. And exposing the module
> and die levels at the same time in CPUID.1FH will cause Linux to calculate
> the wrong die_id. The module level should not be exposed until a real
> machine has the module level in CPUID.1FH.
> 
> In addition, update topology structure in test-x86-topo.c.
> 
> Signed-off-by: Zhuocheng Ding 
> Co-developed-by: Zhao Liu 
> Signed-off-by: Zhao Liu 
> Acked-by: Michael S. Tsirkin 
> ---
> Changes since v1:
>  * Include module level related helpers (apicid_module_width() and
>apicid_module_offset()) in this patch. (Yanan)
> ---
>  hw/i386/x86.c  |  3 ++-
>  include/hw/i386/topology.h | 22 +++
>  target/i386/cpu.c  | 12 ++
>  tests/unit/test-x86-topo.c | 45 --
>  4 files changed, 52 insertions(+), 30 deletions(-)
> 
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 4efc390905ff..a552ae8bb4a8 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -72,7 +72,8 @@ static void init_topo_info(X86CPUTopoInfo *topo_info,
>  MachineState *ms = MACHINE(x86ms);
>  
>  topo_info->dies_per_pkg = ms->smp.dies;
> -topo_info->cores_per_die = ms->smp.cores;
> +topo_info->modules_per_die = ms->smp.clusters;

It is confusing. You said in the previous patch that using "clusters" for
x86 is going to cause compatibility issues. Why is clusters then used
to initialize modules_per_die?

Why not define a new field "modules" (just like clusters) in smp and use it
for x86? Is that going to be a problem?
Maybe I am not clear here. I am yet to understand all the other changes.

Thanks
Babu

> +topo_info->cores_per_module = ms->smp.cores;
>  topo_info->threads_per_core = ms->smp.threads;
>  }
>  
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 5a19679f618b..c807d3811dd3 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -56,7 +56,8 @@ typedef struct X86CPUTopoIDs {
>  
>  typedef struct X86CPUTopoInfo {
>  unsigned dies_per_pkg;
> -unsigned cores_per_die;
> +unsigned modules_per_die;
> +unsigned cores_per_module;
>  unsigned threads_per_core;
>  } X86CPUTopoInfo;
>  
> @@ -77,7 +78,13 @@ static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>  /* Bit width of the Core_ID field */
>  static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>  {
> -return apicid_bitwidth_for_count(topo_info->cores_per_die);
> +return apicid_bitwidth_for_count(topo_info->cores_per_module);
> +}
> +
> +/* Bit width of the Module_ID (cluster ID) field */
> +static inline unsigned apicid_module_width(X86CPUTopoInfo *topo_info)
> +{
> +return apicid_bitwidth_for_count(topo_info->modules_per_die);
>  }
>  
>  /* Bit width of the Die_ID field */
> @@ -92,10 +99,16 @@ static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>  return apicid_smt_width(topo_info);
>  }
>  
> +/* Bit offset of the Module_ID (cluster ID) field */
> +static inline unsigned apicid_module_offset(X86CPUTopoInfo *topo_info)
> +{
> +return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> +}
> +
>  /* Bit offset of the Die_ID field */
>  static inline unsigned apicid_die_offset(X86CPUTopoInfo *topo_info)
>  {
> -return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
> +return apicid_module_offset(topo_info) + apicid_module_width(topo_info);
>  }
>  
>  /* Bit offset of the Pkg_ID (socket ID) field */
> @@ -127,7 +140,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
>   X86CPUTopoIDs *topo_ids)
>  {
>  unsigned nr_dies = topo_info->dies_per_pkg;
> -unsigned nr_cores = topo_info->cores_per_die;
> +unsigned nr_cores = topo_info->cores_per_module *
> +topo_info->modules_per_die;
>  unsigned nr_threads = topo_info->threads_per_core;
>  
>  topo_ids->pkg_id = cpu_index / (nr_dies * nr_cores * nr_threads);
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 8a9fd5682efc..d6969813ee02 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -339,7 +339,9 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>  
>  /* L3 is shared among multiple cores */
>  if (cache->level == 3) {
> -l3_threads = topo_info->cores_per_die * 

Re: [PATCH v3 06/17] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid()

2023-08-02 Thread Moger, Babu
Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu 
> 
> In cpu_x86_cpuid(), there are many variables representing the CPU
> topology, e.g., topo_info, cs->nr_cores/cs->nr_threads.
> 
> Since the names of cs->nr_cores/cs->nr_threads do not accurately
> represent their meaning, the use of cs->nr_cores/cs->nr_threads is prone
> to confusion and mistakes.
> 
> And the structure X86CPUTopoInfo names its memebers clearly, thus the

s/memebers/members/
Thanks
Babu

> variable "topo_info" should be preferred.
> 
> In addition, in cpu_x86_cpuid(), to uniformly use the topology variable,
> replace env->dies with topo_info.dies_per_pkg as well.
> 
> Suggested-by: Robert Hoo 
> Signed-off-by: Zhao Liu 
> ---
> Changes since v1:
>  * Extract cores_per_socket from the code block and use it as a local
>variable for cpu_x86_cpuid(). (Yanan)
>  * Remove vcpus_per_socket variable and use cpus_per_pkg directly.
>(Yanan)
>  * Replace env->dies with topo_info.dies_per_pkg in cpu_x86_cpuid().
> ---
>  target/i386/cpu.c | 31 ++-
>  1 file changed, 18 insertions(+), 13 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index c80613bfcded..fc50bf98c60e 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6008,11 +6008,16 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  uint32_t limit;
>  uint32_t signature[3];
>  X86CPUTopoInfo topo_info;
> +uint32_t cores_per_pkg;
> +uint32_t cpus_per_pkg;
>  
>  topo_info.dies_per_pkg = env->nr_dies;
>  topo_info.cores_per_die = cs->nr_cores / env->nr_dies;
>  topo_info.threads_per_core = cs->nr_threads;
>  
> +cores_per_pkg = topo_info.cores_per_die * topo_info.dies_per_pkg;
> +cpus_per_pkg = cores_per_pkg * topo_info.threads_per_core;
> +
>  /* Calculate & apply limits for different index ranges */
>  if (index >= 0xC000) {
>  limit = env->cpuid_xlevel2;
> @@ -6048,8 +6053,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  *ecx |= CPUID_EXT_OSXSAVE;
>  }
>  *edx = env->features[FEAT_1_EDX];
> -if (cs->nr_cores * cs->nr_threads > 1) {
> -*ebx |= (cs->nr_cores * cs->nr_threads) << 16;
> +if (cpus_per_pkg > 1) {
> +*ebx |= cpus_per_pkg << 16;
>  *edx |= CPUID_HT;
>  }
>  if (!cpu->enable_pmu) {
> @@ -6086,8 +6091,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>   */
>  if (*eax & 31) {
>  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
> -int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
> -if (cs->nr_cores > 1) {
> +
> +if (cores_per_pkg > 1) {
>  int addressable_cores_offset =
>  apicid_pkg_offset(&topo_info) -
>  apicid_core_offset(&topo_info);
> @@ -6095,7 +6100,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  *eax &= ~0xFC000000;
>  *eax |= (1 << addressable_cores_offset - 1) << 26;
>  }
> -if (host_vcpus_per_cache > vcpus_per_socket) {
> +if (host_vcpus_per_cache > cpus_per_pkg) {
> int pkg_offset = apicid_pkg_offset(&topo_info);
>  
>  *eax &= ~0x3FFC000;
> @@ -6240,12 +6245,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  switch (count) {
>  case 0:
> -*eax = apicid_core_offset(&topo_info);
> -*ebx = cs->nr_threads;
> +*ebx = topo_info.threads_per_core;
>  *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
>  break;
>  case 1:
> -*eax = apicid_pkg_offset(&topo_info);
> -*ebx = cs->nr_cores * cs->nr_threads;
> +*ebx = cpus_per_pkg;
>  *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
>  break;
>  default:
> @@ -6266,7 +6271,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  break;
>  case 0x1F:
>  /* V2 Extended Topology Enumeration Leaf */
> -if (env->nr_dies < 2) {
> +if (topo_info.dies_per_pkg < 2) {
>  *eax = *ebx = *ecx = *edx = 0;
>  break;
>  }
> @@ -6276,7 +6281,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  switch (count) {
>  case 0:
> -*eax = apicid_core_offset(&topo_info);
> -*ebx = cs->nr_threads;
> +*ebx = topo_info.threads_per_core;
>  *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
>  break;
>  case 1:
> @@ -6286,7 +6291,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  break;
>  case 2:
>  *eax = 

Re: [PATCH v3 05/17] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4]

2023-08-02 Thread Moger, Babu
Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu 
> 
> Refer to the fixes of cache_info_passthrough ([1], [2]) and SDM, the
> CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26] should use the
> nearest power-of-2 integer.
> 
> The nearest power-of-2 integer can be calculated by pow2ceil() or by
> using the APIC ID offset (like the L3 topology using 1 << die_offset [3]).
> 
> But in fact, CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits 31:26]
> are associated with APIC ID. For example, in linux kernel, the field
> "num_threads_sharing" (Bits 25 - 14) is parsed with APIC ID. And for
> another example, on Alder Lake P, CPUID.04H:EAX[bits 31:26] does not
> match the actual core count and is calculated by:
> "(1 << (pkg_offset - core_offset)) - 1".
> 
> Therefore the APIC ID offset should be preferred to calculate the nearest
> power-of-2 integer for CPUID.04H:EAX[bits 25:14] and CPUID.04H:EAX[bits
> 31:26]:
> 1. d/i cache is shared in a core, 1 << core_offset should be used
>instead of "cs->nr_threads" in encode_cache_cpuid4() for
>CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14].
> 2. L2 cache is supposed to be shared in a core as for now, thereby
>1 << core_offset should also be used instead of "cs->nr_threads" in
>encode_cache_cpuid4() for CPUID.04H.02H:EAX[bits 25:14].
> 3. Similarly, the value for CPUID.04H:EAX[bits 31:26] should also be
>replaced by the offsets upper SMT level in APIC ID.
> 
> In addition, use APIC ID offset to replace "pow2ceil()" for
> cache_info_passthrough case.
> 
> [1]: efb3934adf9e ("x86: cpu: make sure number of addressable IDs for 
> processor cores meets the spec")
> [2]: d7caf13b5fcf ("x86: cpu: fixup number of addressable IDs for logical 
> processors sharing cache")
> [3]: d65af288a84d ("i386: Update new x86_apicid parsing rules with die_offset 
> support")
> 
> Fixes: 7e3482f82480 ("i386: Helpers to encode cache information consistently")
> Suggested-by: Robert Hoo 
> Signed-off-by: Zhao Liu 
> ---
> Changes since v1:
>  * Use APIC ID offset to replace "pow2ceil()" for cache_info_passthrough
>case. (Yanan)
>  * Split the L1 cache fix into a separate patch.
>  * Rename the title of this patch (the original is "i386/cpu: Fix number
>of addressable IDs in CPUID.04H").
> ---
>  target/i386/cpu.c | 30 +++---
>  1 file changed, 23 insertions(+), 7 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b439a05244ee..c80613bfcded 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6005,7 +6005,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  {
>  X86CPU *cpu = env_archcpu(env);
>  CPUState *cs = env_cpu(env);
> -uint32_t die_offset;
>  uint32_t limit;
>  uint32_t signature[3];
>  X86CPUTopoInfo topo_info;
> @@ -6089,39 +6088,56 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>  int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
>  int vcpus_per_socket = cs->nr_cores * cs->nr_threads;
>  if (cs->nr_cores > 1) {
> +int addressable_cores_offset =
> +apicid_pkg_offset(&topo_info) -
> +apicid_core_offset(&topo_info);
> +
>  *eax &= ~0xFC000000;
> -*eax |= (pow2ceil(cs->nr_cores) - 1) << 26;
> +*eax |= (1 << addressable_cores_offset - 1) << 26;
>  }
>  if (host_vcpus_per_cache > vcpus_per_socket) {
> +int pkg_offset = apicid_pkg_offset(&topo_info);
> +
>  *eax &= ~0x3FFC000;
> -*eax |= (pow2ceil(vcpus_per_socket) - 1) << 14;
> +*eax |= (1 << pkg_offset - 1) << 14;
>  }
>  }

I hit this compile error with this patch.

[1/18] Generating qemu-version.h with a custom command (wrapped by meson
to capture output)
[2/4] Compiling C object libqemu-x86_64-softmmu.fa.p/target_i386_cpu.c.o
FAILED: libqemu-x86_64-softmmu.fa.p/target_i386_cpu.c.o
..
..
softmmu.fa.p/target_i386_cpu.c.o -c ../target/i386/cpu.c
../target/i386/cpu.c: In function ‘cpu_x86_cpuid’:
../target/i386/cpu.c:6096:60: error: suggest parentheses around ‘-’ inside
‘<<’ [-Werror=parentheses]
 6096 | *eax |= (1 << addressable_cores_offset - 1) << 26;
  |   ~^~~
../target/i386/cpu.c:6102:46: error: suggest parentheses around ‘-’ inside
‘<<’ [-Werror=parentheses]
 6102 | *eax |= (1 << pkg_offset - 1) << 14;
  |   ~~~^~~
cc1: all warnings being treated as errors

Please fix this.
Thanks
Babu


>  } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
>  *eax = *ebx = *ecx = *edx = 0;
>  } else {
>  *eax = 

Re: [PATCH v3 03/17] softmmu: Fix CPUSTATE.nr_cores' calculation

2023-08-02 Thread Moger, Babu
Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhuocheng Ding 
> 
> From CPUState.nr_cores' comment, it represents "number of cores within
> this CPU package".
> 
> After 003f230e37d7 ("machine: Tweak the order of topology members in
> struct CpuTopology"), the meaning of smp.cores changed to "the number of
> cores in one die", but this commit missed to change CPUState.nr_cores'
> caculation, so that CPUState.nr_cores became wrong and now it
> misses to consider numbers of clusters and dies.
> 
> At present, only i386 is using CPUState.nr_cores.
> 
> But as for i386, which supports die level, the uses of CPUState.nr_cores
> are very confusing:
> 
> Early uses are based on the meaning of "cores per package" (before die
> is introduced into i386), and later uses are based on "cores per die"
> (after die's introduction).
> 
> This difference is due to that commit a94e1428991f ("target/i386: Add
> CPUID.1F generation support for multi-dies PCMachine") misunderstood
> that CPUState.nr_cores means "cores per die" when caculated
> CPUID.1FH.01H:EBX. After that, the changes in i386 all followed this
> wrong understanding.
> 
> With the influence of 003f230e37d7 and a94e1428991f, for i386 currently
> the result of CPUState.nr_cores is "cores per die", thus the original
> uses of CPUState.cores based on the meaning of "cores per package" are
> wrong when mutiple dies exist:
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.01H:EBX[bits 23:16] is
>incorrect because it expects "cpus per package" but now the
>result is "cpus per die".
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, for all leaves of CPUID.04H:
>EAX[bits 31:26] is incorrect because they expect "cpus per package"
>but now the result is "cpus per die". The error not only impacts the
>EAX caculation in cache_info_passthrough case, but also impacts other
>cases of setting cache topology for Intel CPU according to cpu
>topology (specifically, the incoming parameter "num_cores" expects
>"cores per package" in encode_cache_cpuid4()).
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.0BH.01H:EBX[bits
>15:00] is incorrect because the EBX of 0BH.01H (core level) expects
>"cpus per package", which may be different with 1FH.01H (The reason
>is 1FH can support more levels. For QEMU, 1FH also supports die,
>1FH.01H:EBX[bits 15:00] expects "cpus per die").
> 4. In cpu_x86_cpuid() of target/i386/cpu.c, when CPUID.8001H is
>caculated, here "cpus per package" is expected to be checked, but in
>fact, now it checks "cpus per die". Though "cpus per die" also works
>for this code logic, this isn't consistent with AMD's APM.
> 5. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.8008H:ECX expects
>"cpus per package" but it obtains "cpus per die".
> 6. In simulate_rdmsr() of target/i386/hvf/x86_emu.c, in
>kvm_rdmsr_core_thread_count() of target/i386/kvm/kvm.c, and in
>helper_rdmsr() of target/i386/tcg/sysemu/misc_helper.c,
>MSR_CORE_THREAD_COUNT expects "cpus per package" and "cores per
>package", but in these functions, it obtains "cpus per die" and
>"cores per die".
> 
> On the other hand, these uses are correct now (they are added in/after
> a94e1428991f):
> 1. In cpu_x86_cpuid() of target/i386/cpu.c, topo_info.cores_per_die
>meets the actual meaning of CPUState.nr_cores ("cores per die").
> 2. In cpu_x86_cpuid() of target/i386/cpu.c, vcpus_per_socket (in CPUID.
>04H's caculation) considers number of dies, so it's correct.
> 3. In cpu_x86_cpuid() of target/i386/cpu.c, CPUID.1FH.01H:EBX[bits
>15:00] needs "cpus per die" and it gets the correct result, and
>CPUID.1FH.02H:EBX[bits 15:00] gets correct "cpus per package".
> 
> When CPUState.nr_cores is correctly changed to "cores per package" again
> , the above errors will be fixed without extra work, but the "currently"
> correct cases will go wrong and need special handling to pass correct
> "cpus/cores per die" they want.
> 
> Thus in this patch, we fix CPUState.nr_cores' caculation to fit the

s/Thus in this patch, we fix CPUState.nr_cores' caculation/Fix
CPUState.nr_cores' calculation/


Describe your changes in the imperative mood, and also spell-check.
Thanks
Babu


> original meaning "cores per package", as well as changing calculation of
> topo_info.cores_per_die, vcpus_per_socket and CPUID.1FH.
> 
> In addition, in the nr_threads' comment, specify it represents the
> number of threads in the "core" to avoid confusion, and also add comment
> for nr_dies in CPUX86State.
> 
> Fixes: a94e1428991f ("target/i386: Add CPUID.1F generation support for 
> multi-dies PCMachine")
> Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct 
> CpuTopology")
> Signed-off-by: Zhuocheng Ding 
> Co-developed-by: Zhao Liu 
> Signed-off-by: Zhao Liu 
> ---
> Changes since v2:
>  * Use wrapped helper to get cores per socket in qemu_init_vcpu().
> Changes since v1:
>  * Add comment for nr_dies in CPUX86State. (Yanan)
> 

Re: [PATCH v3 02/17] tests: Rename test-x86-cpuid.c to test-x86-topo.c

2023-08-01 Thread Moger, Babu
Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu 
> 
> In fact, this unit tests APIC ID other than CPUID.

This is not clear.

The tests in test-x86-cpuid.c actually test the APIC ID combinations.
Rename it to test-x86-topo.c to make its name more in line with its
actual content.

> Rename to test-x86-topo.c to make its name more in line with its
> actual content.
> 
> Signed-off-by: Zhao Liu 
> Tested-by: Yongwei Ma 
> Reviewed-by: Philippe Mathieu-Daudé 
> Acked-by: Michael S. Tsirkin 
> ---
> Changes since v1:
>  * Rename test-x86-apicid.c to test-x86-topo.c. (Yanan)
> ---
>  MAINTAINERS  | 2 +-
>  tests/unit/meson.build   | 4 ++--
>  tests/unit/{test-x86-cpuid.c => test-x86-topo.c} | 2 +-
>  3 files changed, 4 insertions(+), 4 deletions(-)
>  rename tests/unit/{test-x86-cpuid.c => test-x86-topo.c} (99%)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 12e59b6b27de..51ba3d593e90 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1719,7 +1719,7 @@ F: include/hw/southbridge/ich9.h
>  F: include/hw/southbridge/piix.h
>  F: hw/isa/apm.c
>  F: include/hw/isa/apm.h
> -F: tests/unit/test-x86-cpuid.c
> +F: tests/unit/test-x86-topo.c
>  F: tests/qtest/test-x86-cpuid-compat.c
>  
>  PC Chipset
> diff --git a/tests/unit/meson.build b/tests/unit/meson.build
> index 93977cc32d2b..39b5d0007c69 100644
> --- a/tests/unit/meson.build
> +++ b/tests/unit/meson.build
> @@ -21,8 +21,8 @@ tests = {
>'test-opts-visitor': [testqapi],
>'test-visitor-serialization': [testqapi],
>'test-bitmap': [],
> -  # all code tested by test-x86-cpuid is inside topology.h
> -  'test-x86-cpuid': [],
> +  # all code tested by test-x86-topo is inside topology.h
> +  'test-x86-topo': [],
>'test-cutils': [],
>'test-div128': [],
>'test-shift128': [],
> diff --git a/tests/unit/test-x86-cpuid.c b/tests/unit/test-x86-topo.c
> similarity index 99%
> rename from tests/unit/test-x86-cpuid.c
> rename to tests/unit/test-x86-topo.c
> index bfabc0403a1a..2b104f86d7c2 100644
> --- a/tests/unit/test-x86-cpuid.c
> +++ b/tests/unit/test-x86-topo.c
> @@ -1,5 +1,5 @@
>  /*
> - *  Test code for x86 CPUID and Topology functions
> + *  Test code for x86 APIC ID and Topology functions
>   *
>   *  Copyright (c) 2012 Red Hat Inc.
>   *

-- 
Thanks
Babu Moger



Re: [PATCH v3 00/17] Support smp.clusters for x86

2023-08-01 Thread Moger, Babu
Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu 
> 
> Hi list,
> 
> This is the our v3 patch series, rebased on the master branch at the
> commit 234320cd0573 ("Merge tag 'pull-target-arm-20230731' of https:
> //git.linaro.org/people/pmaydell/qemu-arm into staging").
> 
> Comparing with v2 [1], v3 mainly adds "Tested-by", "Reviewed-by" and
> "ACKed-by" (for PC related patchies) tags and minor code changes (Pls
> see changelog).
> 
> 
> # Introduction
> 
> This series add the cluster support for x86 PC machine, which allows
> x86 can use smp.clusters to configure x86 modlue level CPU topology.

s/modlue/module/
> 
> And since the compatibility issue (see section: ## Why not share L2
> cache in cluster directly), this series also introduce a new command
> to adjust the x86 L2 cache topology.
> 
> Welcome your comments!
> 
> 
> # Backgroud
> 
> The "clusters" parameter in "smp" is introduced by ARM [2], but x86
> hasn't supported it.
> 
> At present, x86 defaults L2 cache is shared in one core, but this is
> not enough. There're some platforms that multiple cores share the
> same L2 cache, e.g., Alder Lake-P shares L2 cache for one module of
> Atom cores [3], that is, every four Atom cores shares one L2 cache.
> Therefore, we need the new CPU topology level (cluster/module).
> 
> Another reason is for hybrid architecture. cluster support not only
> provides another level of topology definition in x86, but would aslo
> provide required code change for future our hybrid topology support.
> 
> 
> # Overview
> 
> ## Introduction of module level for x86
> 
> "cluster" in smp is the CPU topology level which is between "core" and
> die.
> 
> For x86, the "cluster" in smp is corresponding to the module level [4],
> which is above the core level. So use the "module" other than "cluster"
> in x86 code.
> 
> And please note that x86 already has a cpu topology level also named
> "cluster" [4], this level is at the upper level of the package. Here,
> the cluster in x86 cpu topology is completely different from the
> "clusters" as the smp parameter. After the module level is introduced,
> the cluster as the smp parameter will actually refer to the module level
> of x86.
> 
> 
> ## Why not share L2 cache in cluster directly
> 
> Though "clusters" was introduced to help define L2 cache topology
> [2], using cluster to define x86's L2 cache topology will cause the
> compatibility problem:
> 
> Currently, x86 defaults that the L2 cache is shared in one core, which
> actually implies a default setting "cores per L2 cache is 1" and
> therefore implicitly defaults to having as many L2 caches as cores.
> 
> For example (i386 PC machine):
> -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> 
> Considering the topology of the L2 cache, this (*) implicitly means "1
> core per L2 cache" and "2 L2 caches per die".
> 
> If we use cluster to configure L2 cache topology with the new default
> setting "clusters per L2 cache is 1", the above semantics will change
> to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> cores per L2 cache".
> 
> So the same command (*) will cause changes in the L2 cache topology,
> further affecting the performance of the virtual machine.
> 
> Therefore, x86 should only treat cluster as a cpu topology level and
> avoid using it to change L2 cache by default for compatibility.
> 
> 
> ## module level in CPUID
> 
> Currently, we don't expose module level in CPUID.1FH because currently
> linux (v6.2-rc6) doesn't support module level. And exposing module and
> die levels at the same time in CPUID.1FH will cause linux to calculate
> wrong die_id. The module level should be exposed until the real machine
> has the module level in CPUID.1FH.
> 
> We can configure CPUID.04H.02H (L2 cache topology) with module level by
> a new command:
> 
> "-cpu,x-l2-cache-topo=cluster"
> 
> More information about this command, please see the section: "## New
> property: x-l2-cache-topo".
> 
> 
> ## New cache topology info in CPUCacheInfo
> 
> Currently, by default, the cache topology is encoded as:
> 1. i/d cache is shared in one core.
> 2. L2 cache is shared in one core.
> 3. L3 cache is shared in one die.
> 
> This default general setting has caused a misunderstanding, that is, the
> cache topology is completely equated with a specific cpu topology, such
> as the connection between L2 cache and core level, and the connection
> between L3 cache and die level.
> 
> In fact, the settings of these topologies depend on the specific
> platform and are not static. For example, on Alder Lake-P, every
> four Atom cores share the same L2 cache [2].
> 
> Thus, in this patch set, we explicitly define the corresponding cache
> topology for different cpu models and this has two benefits:
> 1. Easy to expand to new CPU models in the future, which has different
>cache topology.
> 2. It can easily support custom cache topology by some command (e.g.,
>x-l2-cache-topo).
> 
> 
> ## New property: 

Re: [PATCH v3 01/17] i386: Fix comment style in topology.h

2023-08-01 Thread Moger, Babu
Hi Zhao,

On 8/1/23 05:35, Zhao Liu wrote:
> From: Zhao Liu 
> 
> For function comments in this file, keep the comment style consistent
> with other places.

s/with other places./with other files in the directory./

> 
> Signed-off-by: Zhao Liu 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Yanan Wang 
> Acked-by: Michael S. Tsirkin 
> ---
>  include/hw/i386/topology.h | 33 +
>  1 file changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> index 81573f6cfde0..5a19679f618b 100644
> --- a/include/hw/i386/topology.h
> +++ b/include/hw/i386/topology.h
> @@ -24,7 +24,8 @@
>  #ifndef HW_I386_TOPOLOGY_H
>  #define HW_I386_TOPOLOGY_H
>  
> -/* This file implements the APIC-ID-based CPU topology enumeration logic,
> +/*
> + * This file implements the APIC-ID-based CPU topology enumeration logic,
>   * documented at the following document:
>   *   Intel® 64 Architecture Processor Topology Enumeration
>   *   
> http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
> @@ -41,7 +42,8 @@
>  
>  #include "qemu/bitops.h"
>  
> -/* APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
> +/*
> + * APIC IDs can be 32-bit, but beware: APIC IDs > 255 require x2APIC support
>   */
>  typedef uint32_t apic_id_t;
>  
> @@ -58,8 +60,7 @@ typedef struct X86CPUTopoInfo {
>  unsigned threads_per_core;
>  } X86CPUTopoInfo;
>  
> -/* Return the bit width needed for 'count' IDs
> - */
> +/* Return the bit width needed for 'count' IDs */
>  static unsigned apicid_bitwidth_for_count(unsigned count)
>  {
>  g_assert(count >= 1);
> @@ -67,15 +68,13 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
>  return count ? 32 - clz32(count) : 0;
>  }
>  
> -/* Bit width of the SMT_ID (thread ID) field on the APIC ID
> - */
> +/* Bit width of the SMT_ID (thread ID) field on the APIC ID */
>  static inline unsigned apicid_smt_width(X86CPUTopoInfo *topo_info)
>  {
>  return apicid_bitwidth_for_count(topo_info->threads_per_core);
>  }
>  
> -/* Bit width of the Core_ID field
> - */
> +/* Bit width of the Core_ID field */
>  static inline unsigned apicid_core_width(X86CPUTopoInfo *topo_info)
>  {
>  return apicid_bitwidth_for_count(topo_info->cores_per_die);
> @@ -87,8 +86,7 @@ static inline unsigned apicid_die_width(X86CPUTopoInfo 
> *topo_info)
>  return apicid_bitwidth_for_count(topo_info->dies_per_pkg);
>  }
>  
> -/* Bit offset of the Core_ID field
> - */
> +/* Bit offset of the Core_ID field */
>  static inline unsigned apicid_core_offset(X86CPUTopoInfo *topo_info)
>  {
>  return apicid_smt_width(topo_info);
> @@ -100,14 +98,14 @@ static inline unsigned apicid_die_offset(X86CPUTopoInfo 
> *topo_info)
>  return apicid_core_offset(topo_info) + apicid_core_width(topo_info);
>  }
>  
> -/* Bit offset of the Pkg_ID (socket ID) field
> - */
> +/* Bit offset of the Pkg_ID (socket ID) field */
>  static inline unsigned apicid_pkg_offset(X86CPUTopoInfo *topo_info)
>  {
>  return apicid_die_offset(topo_info) + apicid_die_width(topo_info);
>  }
>  
> -/* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> +/*
> + * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
>   *
>   * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
>   */
> @@ -120,7 +118,8 @@ static inline apic_id_t 
> x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
> topo_ids->smt_id;
>  }
>  
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>   * based on (contiguous) CPU index
>   */
>  static inline void x86_topo_ids_from_idx(X86CPUTopoInfo *topo_info,
> @@ -137,7 +136,8 @@ static inline void x86_topo_ids_from_idx(X86CPUTopoInfo 
> *topo_info,
>  topo_ids->smt_id = cpu_index % nr_threads;
>  }
>  
> -/* Calculate thread/core/package IDs for a specific topology,
> +/*
> + * Calculate thread/core/package IDs for a specific topology,
>   * based on APIC ID
>   */
>  static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> @@ -155,7 +155,8 @@ static inline void x86_topo_ids_from_apicid(apic_id_t 
> apicid,
>  topo_ids->pkg_id = apicid >> apicid_pkg_offset(topo_info);
>  }
>  
> -/* Make APIC ID for the CPU 'cpu_index'
> +/*
> + * Make APIC ID for the CPU 'cpu_index'
>   *
>   * 'cpu_index' is a sequential, contiguous ID for the CPU.
>   */

-- 
Thanks
Babu Moger



Re: [PATCH 1/2] i386: Add support for SUCCOR feature

2023-07-06 Thread Moger, Babu
Hi John,
Thanks for the patches. A few comments below.

On 7/6/23 14:40, John Allen wrote:
> Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
> be exposed to guests to allow them to handle machine check exceptions on AMD
> hosts.
> 
> Reported-by: William Roche 
> Signed-off-by: John Allen 
> ---
>  target/i386/cpu.c | 2 +-
>  target/i386/cpu.h | 4 
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 06009b80e8..09fae9337a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5874,7 +5874,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
> uint32_t count,
>  break;
>  case 0x8007:
>  *eax = 0;
> -*ebx = 0;
> +*ebx = env->features[FEAT_8000_0007_EBX] | 
> CPUID_8000_0007_EBX_SUCCOR;

This is adding the feature unconditionally, which does not seem right.
A couple of things:
1. Add the feature word for SUCCOR. Users can then enable this feature
using the feature word "+succor".

2. Also define CPUID_8000_0007_EBX_SUCCOR; in this case, we can add the
feature as part of the EPYC model update.

Thanks
Babu




RE: [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC Models

2023-05-05 Thread Moger, Babu
[AMD Official Use Only - General]


> -Original Message-
> From: Paolo Bonzini 
> Sent: Friday, May 5, 2023 3:31 AM
> To: Moger, Babu 
> Cc: pbonz...@redhat.com; richard.hender...@linaro.org;
> weijiang.y...@intel.com; phi...@linaro.org; d...@amazon.co.uk;
> p...@xen.org; joao.m.mart...@oracle.com; qemu-devel@nongnu.org;
> mtosa...@redhat.com; k...@vger.kernel.org; m...@redhat.com;
> marcel.apfelb...@gmail.com; yang.zh...@intel.com; jing2@intel.com;
> vkuzn...@redhat.com; Roth, Michael ; Huang2, Wei
> ; berra...@redhat.com; b...@redhat.com
> Subject: Re: [PATCH v4 0/7] Add EPYC-Genoa model and update previous EPYC
> Models
> 
> Queued, thanks.

Thank You.
Babu



Re: [PATCH v3 1/7] target/i386: allow versioned CPUs to specify new cache_info

2023-05-04 Thread Moger, Babu
Hi Robert,

On 4/25/23 10:22, Moger, Babu wrote:
> Hi Robert,
> 
> On 4/25/23 00:42, Robert Hoo wrote:
>> Babu Moger wrote on Tue, Apr 25, 2023, at 00:42:
>>>
>>> From: Michael Roth 
>>>
>>> New EPYC CPUs versions require small changes to their cache_info's.
>>
>> Do you mean, for the real HW of EPYC CPU, each given model, e.g. Rome,
>> has HW version updates periodically?
> 
> Yes. Real hardware can change slightly, altering the cache properties while
> everything else stays exactly the same as the base HW. But this is not a
> common thing, and we don't see the need for adding a new EPYC model for
> these cases. That is the reason we added cache_info here.
>>
>>> Because current QEMU x86 CPU definition does not support cache
>>> versions,
>>
>> cache version --> versioned cache info
> 
> Sure.
>>
>>> we would have to declare a new CPU type for each such case.
>>
>> My understanding was, for new HW CPU model, we should define a new
>> vCPU model mapping it. But if answer to my above question is yes, i.e.
>> new HW version of same CPU model, looks like it makes sense to some
>> extent.
> 
> Please see my response above.
> 
>>
>>> To avoid this duplication, the patch allows new cache_info pointers
>>> to be specified for a new CPU version.
>>
>> "To avoid the dup work, the patch adds "cache_info" in 
>> X86CPUVersionDefinition"
> 
> Sure
> 
>>>
>>> Co-developed-by: Wei Huang 
>>> Signed-off-by: Wei Huang 
>>> Signed-off-by: Michael Roth 
>>> Signed-off-by: Babu Moger 
>>> Acked-by: Michael S. Tsirkin 
>>> ---
>>>  target/i386/cpu.c | 36 +---
>>>  1 file changed, 33 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index 6576287e5b..e3d9eaa307 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -1598,6 +1598,7 @@ typedef struct X86CPUVersionDefinition {
>>>  const char *alias;
>>>  const char *note;
>>>  PropValue *props;
>>> +const CPUCaches *const cache_info;
>>>  } X86CPUVersionDefinition;
>>>
>>>  /* Base definition for a CPU model */
>>> @@ -5192,6 +5193,32 @@ static void x86_cpu_apply_version_props(X86CPU *cpu, 
>>> X86CPUModel *model)
>>>  assert(vdef->version == version);
>>>  }
>>>
>>> +/* Apply properties for the CPU model version specified in model */
>>
>> I don't think this comment matches below function.
> 
> Ok. Will remove it.
> 
>>
>>> +static const CPUCaches *x86_cpu_get_version_cache_info(X86CPU *cpu,
>>> +   X86CPUModel *model)
>>
>> Will "version" --> "versioned" be better?
> 
> Sure.
> 
>>
>>> +{
>>> +const X86CPUVersionDefinition *vdef;
>>> +X86CPUVersion version = x86_cpu_model_resolve_version(model);
>>> +const CPUCaches *cache_info = model->cpudef->cache_info;
>>> +
>>> +if (version == CPU_VERSION_LEGACY) {
>>> +return cache_info;
>>> +}
>>> +
>>> +for (vdef = x86_cpu_def_get_versions(model->cpudef); vdef->version; 
>>> vdef++) {
>>> +if (vdef->cache_info) {
>>> +cache_info = vdef->cache_info;
>>> +}
>>
>> No need to assign "cache_info" when traverse the vdef list, but in
>> below version matching block, do the assignment. Or, do you mean to
>> have last valid cache info (during the traverse) returned? e.g. v2 has
>> valid cache info, but v3 doesn't.

Forgot to respond to this comment.
Yes, that is correct. The idea is to get the valid cache_info from the
previous version if the latest one does not have its own.
I also tested this case to verify it. Good question.
Thanks
Babu Moger



RE: [PATCH v3 2/7] target/i386: Add new EPYC CPU versions with updated cache_info

2023-04-28 Thread Moger, Babu

Hi Maksim,

> -Original Message-
> From: Maksim Davydov 
> Sent: Wednesday, April 26, 2023 3:35 AM
> To: Moger, Babu 
> Cc: weijiang.y...@intel.com; phi...@linaro.org; d...@amazon.co.uk;
> p...@xen.org; joao.m.mart...@oracle.com; qemu-devel@nongnu.org;
> mtosa...@redhat.com; k...@vger.kernel.org; m...@redhat.com;
> marcel.apfelb...@gmail.com; yang.zh...@intel.com; jing2@intel.com;
> vkuzn...@redhat.com; Roth, Michael ; Huang2, Wei
> ; berra...@redhat.com; pbonz...@redhat.com;
> richard.hender...@linaro.org
> Subject: Re: [PATCH v3 2/7] target/i386: Add new EPYC CPU versions with
> updated cache_info
> 
> 
> On 4/25/23 18:35, Moger, Babu wrote:
> > Hi Maksim,
> >
> > On 4/25/23 07:51, Maksim Davydov wrote:
> >> On 4/24/23 19:33, Babu Moger wrote:
> >>> From: Michael Roth 
> >>>
> >>> Introduce new EPYC cpu versions: EPYC-v4 and EPYC-Rome-v3.
> >>> The only difference vs. older models is an updated cache_info with
> >>> the 'complex_indexing' bit unset, since this bit is not currently
> >>> defined for AMD and may cause problems should it be used for
> >>> something else in the future. Setting this bit will also cause CPUID
> >>> validation failures when running SEV-SNP guests.
> >>>
> >>> Signed-off-by: Michael Roth 
> >>> Signed-off-by: Babu Moger 
> >>> Acked-by: Michael S. Tsirkin 
> >>> ---
> >>>    target/i386/cpu.c | 118
> >>> ++
> >>>    1 file changed, 118 insertions(+)
> >>>
> >>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c index
> >>> e3d9eaa307..c1bc47661d 100644
> >>> --- a/target/i386/cpu.c
> >>> +++ b/target/i386/cpu.c
> >>> @@ -1707,6 +1707,56 @@ static const CPUCaches epyc_cache_info = {
> >>>    },
> >>>    };
> >>>    +static CPUCaches epyc_v4_cache_info = {
> >>> +    .l1d_cache = &(CPUCacheInfo) {
> >>> +    .type = DATA_CACHE,
> >>> +    .level = 1,
> >>> +    .size = 32 * KiB,
> >>> +    .line_size = 64,
> >>> +    .associativity = 8,
> >>> +    .partitions = 1,
> >>> +    .sets = 64,
> >>> +    .lines_per_tag = 1,
> >>> +    .self_init = 1,
> >>> +    .no_invd_sharing = true,
> >>> +    },
> >>> +    .l1i_cache = &(CPUCacheInfo) {
> >>> +    .type = INSTRUCTION_CACHE,
> >>> +    .level = 1,
> >>> +    .size = 64 * KiB,
> >>> +    .line_size = 64,
> >>> +    .associativity = 4,
> >>> +    .partitions = 1,
> >>> +    .sets = 256,
> >>> +    .lines_per_tag = 1,
> >>> +    .self_init = 1,
> >>> +    .no_invd_sharing = true,
> >>> +    },
> >>> +    .l2_cache = &(CPUCacheInfo) {
> >>> +    .type = UNIFIED_CACHE,
> >>> +    .level = 2,
> >>> +    .size = 512 * KiB,
> >>> +    .line_size = 64,
> >>> +    .associativity = 8,
> >>> +    .partitions = 1,
> >>> +    .sets = 1024,
> >>> +    .lines_per_tag = 1,
> >>> +    },
> >>> +    .l3_cache = &(CPUCacheInfo) {
> >>> +    .type = UNIFIED_CACHE,
> >>> +    .level = 3,
> >>> +    .size = 8 * MiB,
> >>> +    .line_size = 64,
> >>> +    .associativity = 16,
> >>> +    .partitions = 1,
> >>> +    .sets = 8192,
> >>> +    .lines_per_tag = 1,
> >>> +    .self_init = true,
> >>> +    .inclusive = true,
> >>> +    .complex_indexing = false,
> >>> +    },
> >>> +};
> >>> +
> >>>    static const CPUCaches epyc_rome_cache_info = {
> >>>    .l1d_cache = &(CPUCacheInfo) {
> >>>    .type = DATA_CACHE,
> >>> @@ -1757,6 +1807,56 @@ static const CPUCaches epyc_rome_cache_info
> =
> >>> {
> >>>    },
> >>>    };
> >>>    +static const CPUCaches epyc_rome_v3_cache_info = {
> >>> +    .l1d_cache = &(CPUCacheInfo) {
> >>> +    .type = DATA_CACHE,
> >>> +    .level = 1,
> >>> +    .size = 32 * KiB,
> >>> +    .line_size = 64,
> >>

Re: [PATCH v3 2/7] target/i386: Add new EPYC CPU versions with updated cache_info

2023-04-25 Thread Moger, Babu
Hi Maksim,

On 4/25/23 07:51, Maksim Davydov wrote:
> 
> On 4/24/23 19:33, Babu Moger wrote:
>> From: Michael Roth 
>>
>> Introduce new EPYC cpu versions: EPYC-v4 and EPYC-Rome-v3.
>> The only difference vs. older models is an updated cache_info with
>> the 'complex_indexing' bit unset, since this bit is not currently
>> defined for AMD and may cause problems should it be used for
>> something else in the future. Setting this bit will also cause
>> CPUID validation failures when running SEV-SNP guests.
>>
>> Signed-off-by: Michael Roth 
>> Signed-off-by: Babu Moger 
>> Acked-by: Michael S. Tsirkin 
>> ---
>>   target/i386/cpu.c | 118 ++
>>   1 file changed, 118 insertions(+)
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index e3d9eaa307..c1bc47661d 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -1707,6 +1707,56 @@ static const CPUCaches epyc_cache_info = {
>>   },
>>   };
>>   +static CPUCaches epyc_v4_cache_info = {
>> +    .l1d_cache = &(CPUCacheInfo) {
>> +    .type = DATA_CACHE,
>> +    .level = 1,
>> +    .size = 32 * KiB,
>> +    .line_size = 64,
>> +    .associativity = 8,
>> +    .partitions = 1,
>> +    .sets = 64,
>> +    .lines_per_tag = 1,
>> +    .self_init = 1,
>> +    .no_invd_sharing = true,
>> +    },
>> +    .l1i_cache = &(CPUCacheInfo) {
>> +    .type = INSTRUCTION_CACHE,
>> +    .level = 1,
>> +    .size = 64 * KiB,
>> +    .line_size = 64,
>> +    .associativity = 4,
>> +    .partitions = 1,
>> +    .sets = 256,
>> +    .lines_per_tag = 1,
>> +    .self_init = 1,
>> +    .no_invd_sharing = true,
>> +    },
>> +    .l2_cache = &(CPUCacheInfo) {
>> +    .type = UNIFIED_CACHE,
>> +    .level = 2,
>> +    .size = 512 * KiB,
>> +    .line_size = 64,
>> +    .associativity = 8,
>> +    .partitions = 1,
>> +    .sets = 1024,
>> +    .lines_per_tag = 1,
>> +    },
>> +    .l3_cache = &(CPUCacheInfo) {
>> +    .type = UNIFIED_CACHE,
>> +    .level = 3,
>> +    .size = 8 * MiB,
>> +    .line_size = 64,
>> +    .associativity = 16,
>> +    .partitions = 1,
>> +    .sets = 8192,
>> +    .lines_per_tag = 1,
>> +    .self_init = true,
>> +    .inclusive = true,
>> +    .complex_indexing = false,
>> +    },
>> +};
>> +
>>   static const CPUCaches epyc_rome_cache_info = {
>>   .l1d_cache = &(CPUCacheInfo) {
>>   .type = DATA_CACHE,
>> @@ -1757,6 +1807,56 @@ static const CPUCaches epyc_rome_cache_info = {
>>   },
>>   };
>>   +static const CPUCaches epyc_rome_v3_cache_info = {
>> +    .l1d_cache = &(CPUCacheInfo) {
>> +    .type = DATA_CACHE,
>> +    .level = 1,
>> +    .size = 32 * KiB,
>> +    .line_size = 64,
>> +    .associativity = 8,
>> +    .partitions = 1,
>> +    .sets = 64,
>> +    .lines_per_tag = 1,
>> +    .self_init = 1,
>> +    .no_invd_sharing = true,
>> +    },
>> +    .l1i_cache = &(CPUCacheInfo) {
>> +    .type = INSTRUCTION_CACHE,
>> +    .level = 1,
>> +    .size = 32 * KiB,
>> +    .line_size = 64,
>> +    .associativity = 8,
>> +    .partitions = 1,
>> +    .sets = 64,
>> +    .lines_per_tag = 1,
>> +    .self_init = 1,
>> +    .no_invd_sharing = true,
>> +    },
>> +    .l2_cache = &(CPUCacheInfo) {
>> +    .type = UNIFIED_CACHE,
>> +    .level = 2,
>> +    .size = 512 * KiB,
>> +    .line_size = 64,
>> +    .associativity = 8,
>> +    .partitions = 1,
>> +    .sets = 1024,
>> +    .lines_per_tag = 1,
>> +    },
>> +    .l3_cache = &(CPUCacheInfo) {
>> +    .type = UNIFIED_CACHE,
>> +    .level = 3,
>> +    .size = 16 * MiB,
>> +    .line_size = 64,
>> +    .associativity = 16,
>> +    .partitions = 1,
>> +    .sets = 16384,
>> +    .lines_per_tag = 1,
>> +    .self_init = true,
>> +    .inclusive = true,
>> +    .complex_indexing = false,
>> +    },
>> +};
>> +
>>   static const CPUCaches epyc_milan_cache_info = {
>>   .l1d_cache = &(CPUCacheInfo) {
>>   .type = DATA_CACHE,
>> @@ -4091,6 +4191,15 @@ static const X86CPUDefinition builtin_x86_defs[] = {
>>   { /* end of list */ }
>>   }
>>   },
>> +    {
>> +    .version = 4,
>> +    .props = (PropValue[]) {
>> +    { "model-id",
>> +  "AMD EPYC-v4 Processor" },
>> +    { /* end of list */ }
>> +    },
>> +    .cache_info = _v4_cache_info
>> +    },
>>   { /* end of list */ }
>>   }
>>   },
>> @@ -4210,6 +4319,15 @@ static const X86CPUDefinition builtin_x86_defs[] = {
>>   { /* end of list */ }
>>   }
>>   },
>> +    {
>> +    .version = 3,
>> +    

Re: [PATCH v3 1/7] target/i386: allow versioned CPUs to specify new cache_info

2023-04-25 Thread Moger, Babu
Hi Robert,

On 4/25/23 00:42, Robert Hoo wrote:
> Babu Moger wrote on Tue, Apr 25, 2023, at 00:42:
>>
>> From: Michael Roth 
>>
>> New EPYC CPUs versions require small changes to their cache_info's.
> 
> Do you mean, for the real HW of EPYC CPU, each given model, e.g. Rome,
> has HW version updates periodically?

Yes. Real hardware can change slightly, altering the cache properties while
everything else stays exactly the same as the base HW. But this is not a
common thing, and we don't see the need for adding a new EPYC model for
these cases. That is the reason we added cache_info here.
> 
>> Because current QEMU x86 CPU definition does not support cache
>> versions,
> 
> cache version --> versioned cache info

Sure.
> 
>> we would have to declare a new CPU type for each such case.
> 
> My understanding was, for new HW CPU model, we should define a new
> vCPU model mapping it. But if answer to my above question is yes, i.e.
> new HW version of same CPU model, looks like it makes sense to some
> extent.

Please see my response above.

> 
>> To avoid this duplication, the patch allows new cache_info pointers
>> to be specified for a new CPU version.
> 
> "To avoid the dup work, the patch adds "cache_info" in 
> X86CPUVersionDefinition"

Sure

>>
>> Co-developed-by: Wei Huang 
>> Signed-off-by: Wei Huang 
>> Signed-off-by: Michael Roth 
>> Signed-off-by: Babu Moger 
>> Acked-by: Michael S. Tsirkin 
>> ---
>>  target/i386/cpu.c | 36 +---
>>  1 file changed, 33 insertions(+), 3 deletions(-)
>>
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 6576287e5b..e3d9eaa307 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -1598,6 +1598,7 @@ typedef struct X86CPUVersionDefinition {
>>  const char *alias;
>>  const char *note;
>>  PropValue *props;
>> +const CPUCaches *const cache_info;
>>  } X86CPUVersionDefinition;
>>
>>  /* Base definition for a CPU model */
>> @@ -5192,6 +5193,32 @@ static void x86_cpu_apply_version_props(X86CPU *cpu, X86CPUModel *model)
>>  assert(vdef->version == version);
>>  }
>>
>> +/* Apply properties for the CPU model version specified in model */
> 
> I don't think this comment matches below function.

Ok. Will remove it.

> 
>> +static const CPUCaches *x86_cpu_get_version_cache_info(X86CPU *cpu,
>> +   X86CPUModel *model)
> 
> Will "version" --> "versioned" be better?

Sure.

> 
>> +{
>> +const X86CPUVersionDefinition *vdef;
>> +X86CPUVersion version = x86_cpu_model_resolve_version(model);
>> +const CPUCaches *cache_info = model->cpudef->cache_info;
>> +
>> +if (version == CPU_VERSION_LEGACY) {
>> +return cache_info;
>> +}
>> +
>> +for (vdef = x86_cpu_def_get_versions(model->cpudef); vdef->version; vdef++) {
>> +if (vdef->cache_info) {
>> +cache_info = vdef->cache_info;
>> +}
> 
> There is no need to assign "cache_info" while traversing the vdef list;
> the assignment could be done in the version-matching block below. Or do
> you mean to return the last valid cache info seen during the traversal,
> e.g. when v2 has valid cache info but v3 doesn't?
>> +
>> +if (vdef->version == version) {
>> +break;
>> +}
>> +}
>> +
>> +assert(vdef->version == version);
>> +return cache_info;
>> +}
>> +

-- 
Thanks
Babu Moger



Re: [PATCH 0/2] i386: fixup number of logical CPUs when host-cache-info=on

2022-05-25 Thread Moger, Babu


On 5/25/22 02:05, Igor Mammedov wrote:
> On Tue, 24 May 2022 14:48:29 -0500
> "Moger, Babu"  wrote:
>
>> On 5/24/22 10:19, Igor Mammedov wrote:
>>> On Tue, 24 May 2022 11:10:18 -0400
>>> Igor Mammedov  wrote:
>>>
>>> CCing AMD folks as that might be of interest to them  
>> I am trying to recreate the bug on my AMD system here.. Seeing this message..
>>
>> qemu-system-x86_64: -numa node,nodeid=0,memdev=ram-node0: memdev=ram-node0
>> is ambiguous
>>
>> Here is my command line..
>>
>> #qemu-system-x86_64 -name rhel8 -m 4096 -hda vdisk.qcow2 -enable-kvm -net
>> nic  -nographic -machine q35,accel=kvm -cpu
>> host,host-cache-info=on,l3-cache=off -smp
>> 20,sockets=2,dies=1,cores=10,threads=1 -numa
>> node,nodeid=0,memdev=ram-node0 -numa node,nodeid=1,memdev=ram-node1 -numa
>> cpu,socket-id=0,node-id=0 -numa cpu,socket-id=1,node-id=1
>>
>> Am I missing something?
> Yep, sorry I've omitted -object memory-backend-foo definitions for
> ram-node0 and ram-node1
>
> one can use any memory backend, it doesn't really matter in this case,
> for example following should do:
>   -object memory-backend-ram,id=ram-node0,size=2G \
>   -object memory-backend-ram,id=ram-node1,size=2G 

Thanks Igor. However, these changes (patches 1 and 2) do not affect AMD
systems as far as I can see.

Thanks

Babu

>
>>
>>>  
>>>> Igor Mammedov (2):
>>>>   x86: cpu: make sure number of addressable IDs for processor cores
>>>> meets the spec
>>>>   x86: cpu: fixup number of addressable IDs for logical processors
>>>> sharing cache
>>>>
>>>>  target/i386/cpu.c | 20 
>>>>  1 file changed, 16 insertions(+), 4 deletions(-)
>>>>  

-- 
Thanks
Babu Moger




Re: [PATCH 0/2] i386: fixup number of logical CPUs when host-cache-info=on

2022-05-25 Thread Moger, Babu


On 5/24/22 18:23, Alejandro Jimenez wrote:
> On 5/24/2022 3:48 PM, Moger, Babu wrote:
>>
>> On 5/24/22 10:19, Igor Mammedov wrote:
>>> On Tue, 24 May 2022 11:10:18 -0400
>>> Igor Mammedov  wrote:
>>>
>>> CCing AMD folks as that might be of interest to them
>>
>> I am trying to recreate the bug on my AMD system here.. Seeing this
>> message..
>>
>> qemu-system-x86_64: -numa node,nodeid=0,memdev=ram-node0: memdev=ram-node0
>> is ambiguous
>>
>> Here is my command line..
>>
>> #qemu-system-x86_64 -name rhel8 -m 4096 -hda vdisk.qcow2 -enable-kvm -net
>> nic  -nographic -machine q35,accel=kvm -cpu
>> host,host-cache-info=on,l3-cache=off -smp
>> 20,sockets=2,dies=1,cores=10,threads=1 -numa
>> node,nodeid=0,memdev=ram-node0 -numa node,nodeid=1,memdev=ram-node1 -numa
>> cpu,socket-id=0,node-id=0 -numa cpu,socket-id=1,node-id=1
>>
>> Am I missing something?
> Hi Babu,
>
> Hopefully this will help you reproduce the issue if you are testing on
> Milan/Genoa. Joao (CC'd) pointed out this warning to me late last year,
> while I was working on patches for encoding the topology CPUID leaf in
> different Zen platforms.
>
> What I found from my experiments on Milan, is that the warning will
> appear whenever the NUMA topology requested in QEMU cmdline assigns a
> number of CPUs to each node that is smaller than the default # of CPUs
> sharing a LLC on the host platform. In short, on a Milan host where we
> have 16 CPUs sharing a CCX:

Yes. I recreated the issue with the following command line.

#qemu-system-x86_64 -name rhel8 -m 4096 -hda vdisk.qcow2 -enable-kvm -net
nic  -nographic -machine q35,accel=kvm -cpu host,+topoext -smp
16,sockets=1,dies=1,cores=16,threads=1 -object
memory-backend-ram,id=ram-node0,size=2G -object
memory-backend-ram,id=ram-node1,size=2G  -numa
node,nodeid=0,cpus=0-7,memdev=ram-node0 -numa
node,nodeid=1,cpus=8-15,memdev=ram-node1

But solving this will be a bit complicated. For AMD, this information comes
from CPUID 0x8000001D. However, at the time this CPUID leaf is populated we
don't have all the information about the NUMA nodes yet.

But you can work around it by modifying the command line to include the
dies information (dies=2 in this case). Something like this:

#qemu-system-x86_64 -name rhel8 -m 4096 -hda vdisk.qcow2 -enable-kvm -net
nic  -nographic -machine q35,accel=kvm -cpu
host,+topoext,host-cache-info=on -smp
16,sockets=1,dies=2,cores=8,threads=1 -object
memory-backend-ram,id=ram-node0,size=2G -object
memory-backend-ram,id=ram-node1,size=2G  -numa
node,nodeid=0,cpus=0-7,memdev=ram-node0 -numa
node,nodeid=1,cpus=8-15,memdev=ram-node1

But this may not be an acceptable solution in all cases.

>
> # cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list
> 0-7,128-135
>
> If a guest is launched with the following arguments:
>
> -cpu host,+topoext \
> -smp cpus=64,cores=32,threads=2,sockets=1 \
> -numa node,nodeid=0,cpus=0-7 -numa node,nodeid=1,cpus=8-15 \
> -numa node,nodeid=2,cpus=16-23 -numa node,nodeid=3,cpus=24-31 \
> -numa node,nodeid=4,cpus=32-39 -numa node,nodeid=5,cpus=40-47 \
> -numa node,nodeid=6,cpus=48-55 -numa node,nodeid=7,cpus=56-63 \
>
> it assigns 8 cpus to each NUMA node, causing the error above to be
> displayed.
>
> Note that ultimately the guest topology is built based on the NUMA
> information, so the LLC domains on the guest only end up spanning a
> single NUMA node. e.g.:
>
> # cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list
> 0-7
>
> Hope that helps,
> Alejandro
>>
>>
>>>
>>>> Igor Mammedov (2):
>>>>    x86: cpu: make sure number of addressable IDs for processor cores
>>>>  meets the spec
>>>>    x86: cpu: fixup number of addressable IDs for logical processors
>>>>  sharing cache
>>>>
>>>>   target/i386/cpu.c | 20 
>>>>   1 file changed, 16 insertions(+), 4 deletions(-)
>>>>
>
-- 
Thanks
Babu Moger




Re: [PATCH 0/2] i386: fixup number of logical CPUs when host-cache-info=on

2022-05-24 Thread Moger, Babu


On 5/24/22 10:19, Igor Mammedov wrote:
> On Tue, 24 May 2022 11:10:18 -0400
> Igor Mammedov  wrote:
>
> CCing AMD folks as that might be of interest to them

I am trying to recreate the bug on my AMD system here, but I am seeing this message:

qemu-system-x86_64: -numa node,nodeid=0,memdev=ram-node0: memdev=ram-node0
is ambiguous

Here is my command line..

#qemu-system-x86_64 -name rhel8 -m 4096 -hda vdisk.qcow2 -enable-kvm -net
nic  -nographic -machine q35,accel=kvm -cpu
host,host-cache-info=on,l3-cache=off -smp
20,sockets=2,dies=1,cores=10,threads=1 -numa
node,nodeid=0,memdev=ram-node0 -numa node,nodeid=1,memdev=ram-node1 -numa
cpu,socket-id=0,node-id=0 -numa cpu,socket-id=1,node-id=1

Am I missing something?


>
>> Igor Mammedov (2):
>>   x86: cpu: make sure number of addressable IDs for processor cores
>> meets the spec
>>   x86: cpu: fixup number of addressable IDs for logical processors
>> sharing cache
>>
>>  target/i386/cpu.c | 20 
>>  1 file changed, 16 insertions(+), 4 deletions(-)
>>
-- 
Thanks
Babu Moger




RE: [PATCH v7 00/13] APIC ID fixes for AMD EPYC CPU model

2020-03-23 Thread Moger, Babu
[AMD Official Use Only - Internal Distribution Only]

> -Original Message-
> From: Igor Mammedov 
> Sent: Wednesday, March 18, 2020 5:47 AM
> To: Moger, Babu 
> Cc: Eduardo Habkost ; marcel.apfelb...@gmail.com;
> pbonz...@redhat.com; r...@twiddle.net; m...@redhat.com; qemu-
> de...@nongnu.org
> Subject: Re: [PATCH v7 00/13] APIC ID fixes for AMD EPYC CPU model
> 
> On Wed, 18 Mar 2020 02:43:57 +
> "Moger, Babu"  wrote:
> 
> > [AMD Official Use Only - Internal Distribution Only]
> >
> >
> >
> > > -Original Message-
> > > From: Eduardo Habkost 
> > > Sent: Tuesday, March 17, 2020 6:46 PM
> > > To: Moger, Babu 
> > > Cc: marcel.apfelb...@gmail.com; pbonz...@redhat.com; r...@twiddle.net;
> > > m...@redhat.com; imamm...@redhat.com; qemu-devel@nongnu.org
> > > Subject: Re: [PATCH v7 00/13] APIC ID fixes for AMD EPYC CPU model
> > >
> > > On Tue, Mar 17, 2020 at 07:22:06PM -0400, Eduardo Habkost wrote:
> > > > On Thu, Mar 12, 2020 at 11:28:47AM -0500, Babu Moger wrote:
> > > > > Eduardo, Can you please queue the series if there are no concerns.
> > > > > Thanks
> > > >
> > > > I had queued it for today's pull request, but it looks like it
> > > > breaks "make check".  See
> > >
> > > > https://travis-ci.org/github/ehabkost/qemu/jobs/663529282
> > > >
> > > >   PASS 4 bios-tables-test /x86_64/acpi/piix4/ipmi
> > > >   Could not access KVM kernel module: No such file or directory
> > > >   qemu-system-x86_64: -accel kvm: failed to initialize kvm: No such 
> > > > file or
> > > directory
> > > >   qemu-system-x86_64: falling back to tcg
> > > >   qemu-system-x86_64: Invalid CPU [socket: 0, die: 0, core: 1, thread: 
> > > > 0]
> with
> > > APIC ID 1, valid index range 0:5
> > > >   Broken pipe
> > > >   /home/travis/build/ehabkost/qemu/tests/qtest/libqtest.c:166: 
> > > > kill_qemu()
> > > tried to terminate QEMU process but encountered exit status 1 (expected 0)
> > > >   Aborted (core dumped)
> > > >   ERROR - too few tests run (expected 17, got 4)
> > > >   /home/travis/build/ehabkost/qemu/tests/Makefile.include:633: recipe
> for
> > > target 'check-qtest-x86_64' failed
> > > >   make: *** [check-qtest-x86_64] Error 1
> > >
> > > Failure is at the /x86_64/acpi/piix4/cpuhp test case:
> > >
> > >   $ QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
> > > QTEST_QEMU_IMG=qemu-img tests/qtest/bios-tables-test -m=quick --
> verbose -
> > > -debug-log
> > >   [...]
> > >   {*LOG(start):{/x86_64/acpi/piix4/cpuhp}:LOG*}
> > >   # starting QEMU: exec x86_64-softmmu/qemu-system-x86_64 -qtest
> > > unix:/tmp/qtest-2052313.sock -qtest-log /dev/null -chardev
> > > socket,path=/tmp/qtest-2052313.qmp,id=char0 -mon
> > > chardev=char0,mode=control -display none -machine pc,kernel-irqchip=off -
> > > accel kvm -accel tcg -net none -display none -smp
> > > 2,cores=3,sockets=2,maxcpus=6 -object memory-backend-
> > > ram,id=ram0,size=64M -object memory-backend-ram,id=ram1,size=64M -
> numa
> > > node,memdev=ram0 -numa node,memdev=ram1 -numa
> dist,src=0,dst=1,val=21
> > > -drive id=hd0,if=none,file=tests/acpi-test-disk-PVjFru,format=raw -device
> ide-
> > > hd,drive=hd0  -accel qtest
> > >   {*LOG(message):{starting QEMU: exec x86_64-softmmu/qemu-system-
> x86_64
> > > -qtest unix:/tmp/qtest-2052313.sock -qtest-log /dev/null -chardev
> > > socket,path=/tmp/qtest-2052313.qmp,id=char0 -mon
> > > chardev=char0,mode=control -display none -machine pc,kernel-irqchip=off -
> > > accel kvm -accel tcg -net none -display none -smp
> > > 2,cores=3,sockets=2,maxcpus=6 -object memory-backend-
> > > ram,id=ram0,size=64M -object memory-backend-ram,id=ram1,size=64M -
> numa
> > > node,memdev=ram0 -numa node,memdev=ram1 -numa
> dist,src=0,dst=1,val=21
> > > -drive id=hd0,if=none,file=tests/acpi-test-disk-PVjFru,format=raw -device
> ide-
> > > hd,drive=hd0  -accel qtest}:LOG*}
> > >   qemu-system-x86_64: Invalid CPU [socket:

RE: [PATCH v7 00/13] APIC ID fixes for AMD EPYC CPU model

2020-03-17 Thread Moger, Babu
[AMD Official Use Only - Internal Distribution Only]



> -Original Message-
> From: Eduardo Habkost 
> Sent: Tuesday, March 17, 2020 6:46 PM
> To: Moger, Babu 
> Cc: marcel.apfelb...@gmail.com; pbonz...@redhat.com; r...@twiddle.net;
> m...@redhat.com; imamm...@redhat.com; qemu-devel@nongnu.org
> Subject: Re: [PATCH v7 00/13] APIC ID fixes for AMD EPYC CPU model
> 
> On Tue, Mar 17, 2020 at 07:22:06PM -0400, Eduardo Habkost wrote:
> > On Thu, Mar 12, 2020 at 11:28:47AM -0500, Babu Moger wrote:
> > > Eduardo, Can you please queue the series if there are no concerns.
> > > Thanks
> >
> > I had queued it for today's pull request, but it looks like it
> > breaks "make check".  See
> > https://travis-ci.org/github/ehabkost/qemu/jobs/663529282
> >
> >   PASS 4 bios-tables-test /x86_64/acpi/piix4/ipmi
> >   Could not access KVM kernel module: No such file or directory
> >   qemu-system-x86_64: -accel kvm: failed to initialize kvm: No such file or
> directory
> >   qemu-system-x86_64: falling back to tcg
> >   qemu-system-x86_64: Invalid CPU [socket: 0, die: 0, core: 1, thread: 0] 
> > with
> APIC ID 1, valid index range 0:5
> >   Broken pipe
> >   /home/travis/build/ehabkost/qemu/tests/qtest/libqtest.c:166: kill_qemu()
> tried to terminate QEMU process but encountered exit status 1 (expected 0)
> >   Aborted (core dumped)
> >   ERROR - too few tests run (expected 17, got 4)
> >   /home/travis/build/ehabkost/qemu/tests/Makefile.include:633: recipe for
> target 'check-qtest-x86_64' failed
> >   make: *** [check-qtest-x86_64] Error 1
> 
> Failure is at the /x86_64/acpi/piix4/cpuhp test case:
> 
>   $ QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
> QTEST_QEMU_IMG=qemu-img tests/qtest/bios-tables-test -m=quick --verbose -
> -debug-log
>   [...]
>   {*LOG(start):{/x86_64/acpi/piix4/cpuhp}:LOG*}
>   # starting QEMU: exec x86_64-softmmu/qemu-system-x86_64 -qtest
> unix:/tmp/qtest-2052313.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-2052313.qmp,id=char0 -mon
> chardev=char0,mode=control -display none -machine pc,kernel-irqchip=off -
> accel kvm -accel tcg -net none -display none -smp
> 2,cores=3,sockets=2,maxcpus=6 -object memory-backend-
> ram,id=ram0,size=64M -object memory-backend-ram,id=ram1,size=64M -numa
> node,memdev=ram0 -numa node,memdev=ram1 -numa dist,src=0,dst=1,val=21
> -drive id=hd0,if=none,file=tests/acpi-test-disk-PVjFru,format=raw -device ide-
> hd,drive=hd0  -accel qtest
>   {*LOG(message):{starting QEMU: exec x86_64-softmmu/qemu-system-x86_64
> -qtest unix:/tmp/qtest-2052313.sock -qtest-log /dev/null -chardev
> socket,path=/tmp/qtest-2052313.qmp,id=char0 -mon
> chardev=char0,mode=control -display none -machine pc,kernel-irqchip=off -
> accel kvm -accel tcg -net none -display none -smp
> 2,cores=3,sockets=2,maxcpus=6 -object memory-backend-
> ram,id=ram0,size=64M -object memory-backend-ram,id=ram1,size=64M -numa
> node,memdev=ram0 -numa node,memdev=ram1 -numa dist,src=0,dst=1,val=21
> -drive id=hd0,if=none,file=tests/acpi-test-disk-PVjFru,format=raw -device ide-
> hd,drive=hd0  -accel qtest}:LOG*}
>   qemu-system-x86_64: Invalid CPU [socket: 0, die: 0, core: 1, thread: 0] with
> APIC ID 1, valid index range 0:5
>   Broken pipe

The ms->smp.cpus is not initialized to max cpus in this case. It looks like
smp_parse did not run in this path, so the APIC ID is not initialized for all
the CPUs. The following patch fixes the problem.
I will test all the combinations and send the patch tomorrow. Let me know which
tree I should use to generate the patch. It appears some patches are already
pulled; I can send it on top of
 git://github.com/ehabkost/qemu.git (x86-next).

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 023dce1dbd..1eeb7b9732 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -156,7 +156,7 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
   ms->smp.max_cpus - 1) + 1;
 possible_cpus = mc->possible_cpu_arch_ids(ms);

-for (i = 0; i < ms->smp.cpus; i++) {
+for (i = 0; i < ms->possible_cpus->len; i++) {
 ms->possible_cpus->cpus[i].arch_id =
 x86_cpu_apic_id_from_index(x86ms, i);
 }

> 
> 
> >
> >
> > >
> > > On 3/11/20 5:52 PM, Babu Moger wrote:
> > > > This series fixes APIC ID encoding problem reported

RE: [PATCH v7 00/13] APIC ID fixes for AMD EPYC CPU model

2020-03-17 Thread Moger, Babu
[AMD Official Use Only - Internal Distribution Only]

Ok. I am looking at it.

> -Original Message-
> From: Eduardo Habkost 
> Sent: Tuesday, March 17, 2020 6:22 PM
> To: Moger, Babu 
> Cc: marcel.apfelb...@gmail.com; pbonz...@redhat.com; r...@twiddle.net;
> m...@redhat.com; imamm...@redhat.com; qemu-devel@nongnu.org
> Subject: Re: [PATCH v7 00/13] APIC ID fixes for AMD EPYC CPU model
> 
> On Thu, Mar 12, 2020 at 11:28:47AM -0500, Babu Moger wrote:
> > Eduardo, Can you please queue the series if there are no concerns.
> > Thanks
> 
> I had queued it for today's pull request, but it looks like it
> breaks "make check".  See
> https://travis-ci.org/github/ehabkost/qemu/jobs/663529282
> 
>   PASS 4 bios-tables-test /x86_64/acpi/piix4/ipmi
>   Could not access KVM kernel module: No such file or directory
>   qemu-system-x86_64: -accel kvm: failed to initialize kvm: No such file or
> directory
>   qemu-system-x86_64: falling back to tcg
>   qemu-system-x86_64: Invalid CPU [socket: 0, die: 0, core: 1, thread: 0] with
> APIC ID 1, valid index range 0:5
>   Broken pipe
>   /home/travis/build/ehabkost/qemu/tests/qtest/libqtest.c:166: kill_qemu() 
> tried
> to terminate QEMU process but encountered exit status 1 (expected 0)
>   Aborted (core dumped)
>   ERROR - too few tests run (expected 17, got 4)
>   /home/travis/build/ehabkost/qemu/tests/Makefile.include:633: recipe for
> target 'check-qtest-x86_64' failed
>   make: *** [check-qtest-x86_64] Error 1
> 
> 
> >
> > On 3/11/20 5:52 PM, Babu Moger wrote:
> > > This series fixes APIC ID encoding problem reported on AMD EPYC cpu
> models.
> > >
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1728166
> > >
> > > Currently, the APIC ID is decoded based on the sequence
> > > sockets->dies->cores->threads. This works for most standard AMD and other
> > > vendors' configurations, but this decoding sequence does not follow that 
> > > of
> > > AMD's APIC ID enumeration strictly. In some cases this can cause CPU
> topology
> > > inconsistency.  When booting a guest VM, the kernel tries to validate the
> > > topology, and finds it inconsistent with the enumeration of EPYC cpu 
> > > models.
> > >
> > > To fix the problem we need to build the topology as per the Processor
> > > Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
> > > Processors. The documentation is available from the bugzilla Link below.
> > >
> > > Link:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=206537
> > >
> > > Here is the text from the PPR.
> > > Operating systems are expected to use 
> > > Core::X86::Cpuid::SizeId[ApicIdSize],
> the
> > > number of least significant bits in the Initial APIC ID that indicate 
> > > core ID
> > > within a processor, in constructing per-core CPUID masks.
> > > Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of
> cores
> > > (MNC) that the processor could theoretically support, not the actual 
> > > number
> of
> > > cores that are actually implemented or enabled on the processor, as
> indicated
> > > by Core::X86::Cpuid::SizeId[NC].
> > > Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
> > > • ApicId[6] = Socket ID.
> > > • ApicId[5:4] = Node ID.
> > > • ApicId[3] = Logical CCX L3 complex ID
> > > • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} :
> {1'b0,LogicalCoreID[1:0]}
> > >
> > > v7:
> > >  Generated the patches on top of git://github.com/ehabkost/qemu.git (x86-
> next).
> > >  Changes from v6.
> > >  1. Added new function x86_set_epyc_topo_handlers to override the apic id
> > > encoding handlers.
> > &g

RE: [PATCH v5 16/16] tests: Update the Unit tests

2020-03-10 Thread Moger, Babu
[AMD Official Use Only - Internal Distribution Only]



> -Original Message-
> From: Eduardo Habkost 
> Sent: Tuesday, March 10, 2020 6:06 PM
> To: Moger, Babu 
> Cc: marcel.apfelb...@gmail.com; pbonz...@redhat.com; r...@twiddle.net;
> m...@redhat.com; imamm...@redhat.com; qemu-devel@nongnu.org
> Subject: Re: [PATCH v5 16/16] tests: Update the Unit tests
> 
> On Tue, Mar 03, 2020 at 01:58:38PM -0600, Babu Moger wrote:
> > Since the topology routines have changed, update
> > the unit tests to use the new APIs.
> >
> > Signed-off-by: Babu Moger 
> 
> This has to be part of the patches that changed the function
> interfaces, otherwise we break bisectability.

Yes. That is right.  Will squash it with the other patch.

> 
> --
> Eduardo



RE: [PATCH] i386: pass CLZERO to guests with EPYC CPU model on AMD ZEN platform

2020-02-05 Thread Moger, Babu
[AMD Official Use Only - Internal Distribution Only]

> -Original Message-
> From: Eduardo Habkost 
> Sent: Wednesday, February 5, 2020 4:38 PM
> To: Ani Sinha 
> Cc: Paolo Bonzini ; r...@twiddle.net; qemu-
> de...@nongnu.org; Singh, Brijesh ; Moger, Babu
> 
> Subject: Re: [PATCH] i386: pass CLZERO to guests with EPYC CPU model on
> AMD ZEN platform
> 
> Hi,
> 
> Sorry for the delayed reply.  I was away from work for the whole
> month of January.
> 
> On Mon, Jan 20, 2020 at 10:56:43AM +, Ani Sinha wrote:
> > Sorry Eduardo, it took a little while for me to get to this thread again.
> >
> > > On Dec 18, 2019, at 8:41 PM, Eduardo Habkost 
> wrote:
> > >
> > > On Wed, Dec 18, 2019 at 12:53:45PM +0100, Paolo Bonzini wrote:
> > >> On 18/12/19 10:05, Ani Sinha wrote:
> > >>> CLZERO CPUID should be passed on to the guests that use EPYC or
> EPYC-IBPB CPU
> > >>> model when the AMD ZEN based host supports it. This change makes it
> recognize
> > >>> this CPUID for guests which use EPYC or EPYC-IBPB CPU model.
> > >
> > > Can you clarify what's the intended use case here?  Why the
> > > "if host supports it" conditional?
> >
> > Looking at
> > https://www.amd.com/system/files/TechDocs/24594.pdf , it says:
> >
> > "The CLZERO instruction is supported if the feature flag CPUID
> Fn8000_0008_EBX[CLZERO] is set.”
> >
> > This I interpreted to mean that not all AMD Zen architectures
> > supports it. So when the host does support it, this CPUID
> > should be passed on to the guest as well.
> 
> This is not a supported use case of named CPU models.  Named CPU
> models should expose the same guest ABI on all hosts.  This means
> CPUID should be the same on all hosts if using the same CPU
> model (and same machine type).
> 
> If you need features to be automatically enabled/disabled
> depending on host capabilities, I advise you to use "-cpu host"
> or libvirt's mode=host-model.
> 
> >
> >
> > >
> > > If you need host-dependent CPU configuration, "-cpu host" (or the
> > > libvirt "host-model" mode) is the most appropriate solution.
> >
> > Yes that is an option but we are going to use EPYC-IBPB model for now.
> >
> >
> > >
> > >>>
> > >>> Signed-off-by: Ani Sinha 
> > >>> ---
> > >>> target/i386/cpu.c | 2 ++
> > >>> 1 file changed, 2 insertions(+)
> > >>>
> > >>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > >>> index 69f518a..55f0691 100644
> > >>> --- a/target/i386/cpu.c
> > >>> +++ b/target/i386/cpu.c
> > >>> @@ -3813,6 +3813,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
> > >>> CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A |
> CPUID_EXT3_ABM |
> > >>> CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM |
> CPUID_EXT3_LAHF_LM |
> > >>> CPUID_EXT3_TOPOEXT,
> > >>> +.features[FEAT_8000_0008_EBX] =
> > >>> +CPUID_8000_0008_EBX_CLZERO,
> > >>> .features[FEAT_7_0_EBX] =
> > >>> CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
> CPUID_7_0_EBX_AVX2 |
> > >>> CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
> CPUID_7_0_EBX_RDSEED |
> > >>>
> > >>
> > >> This needs to be done only for newer machine type (or is it CPU model
> > >> versions now? need Eduardo to respond).
> > >
> > > If we want to add it, it has to be done as a new CPU model version.
> >
> > I see what you mean.
> >
> > >
> > > But I don't know yet if we want to add it.  Do all EPYC CPUs have
> > > CLZERO available?  If not, it's probably not advisable to add it
> > > to EPYC (even if it's just on EPYC-v3).
> >
> > Ok so I think we need to get this clarified from AMD if all
> > their EPYC platforms supports this CPUID or not. Is there any
> > contact point within AMD where we can get this information?
> 
> I'm CCing Brijesh Singh and Babu Moger, who works on the EPYC CPU
> model recently.

Ani, I am already working on it.

Eduardo,  I am still waiting for your feedback on this series.
https://lore.kernel.org/qemu-devel/abd39b75-0a12-5198-5815-dd51a3d5c...@amd.com/

I have added all the missing feature bits for the EPYC models (as EPYC-v3) and
also added the EPYC-Rome model.

> 
> >
> > For our use case, I just verified that even without this patch,
> > if we pass CLZERO through libvirt CPU definition xml, like "
> > 

[PATCH v2 2/2] i386: Add 2nd Generation AMD EPYC processors

2019-11-07 Thread Moger, Babu
Adds the support for 2nd Gen AMD EPYC Processors. The model display
name will be EPYC-Rome.

Adds the following new feature bits on top of the feature bits from the
first generation EPYC models.
perfctr-core : core performance counter extensions support. Enables the VM to
   use extended performance counter support. It enables six
   programmable counters instead of four counters.
clzero   : instruction zeroes out the 64 byte cache line specified in RAX.
xsaveerptr   : FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
   pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
   pointers.
wbnoinvd : Write back and do not invalidate cache
ibpb : Indirect Branch Prediction Barrier
amd-stibp: Single Thread Indirect Branch Predictor
clwb : Cache Line Write Back and Retain
xsaves   : XSAVES, XRSTORS and IA32_XSS support
rdpid: Read Processor ID instruction support
umip : User-Mode Instruction Prevention support

The  Reference documents are available at
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf
https://www.amd.com/system/files/TechDocs/24594.pdf

Depends on following kernel commits:
40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
6d61e3c32248 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
52297436199d ("kvm: svm: Update svm_xsaves_supported")

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c |  102 -
 target/i386/cpu.h |2 +
 2 files changed, 103 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6b7b0f8a4b..70afc3fb30 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1133,7 +1133,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
 "clzero", NULL, "xsaveerptr", NULL,
 NULL, NULL, NULL, NULL,
 NULL, "wbnoinvd", NULL, NULL,
-"ibpb", NULL, NULL, NULL,
+"ibpb", NULL, NULL, "amd-stibp",
 NULL, NULL, NULL, NULL,
 NULL, NULL, NULL, NULL,
 "amd-ssbd", "virt-ssbd", "amd-no-ssb", NULL,
@@ -1796,6 +1796,56 @@ static CPUCaches epyc_cache_info = {
 },
 };
 
+static CPUCaches epyc_rome_cache_info = {
+.l1d_cache = &(CPUCacheInfo) {
+.type = DATA_CACHE,
+.level = 1,
+.size = 32 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 64,
+.lines_per_tag = 1,
+.self_init = 1,
+.no_invd_sharing = true,
+},
+.l1i_cache = &(CPUCacheInfo) {
+.type = INSTRUCTION_CACHE,
+.level = 1,
+.size = 32 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 64,
+.lines_per_tag = 1,
+.self_init = 1,
+.no_invd_sharing = true,
+},
+.l2_cache = &(CPUCacheInfo) {
+.type = UNIFIED_CACHE,
+.level = 2,
+.size = 512 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 1024,
+.lines_per_tag = 1,
+},
+.l3_cache = &(CPUCacheInfo) {
+.type = UNIFIED_CACHE,
+.level = 3,
+.size = 16 * MiB,
+.line_size = 64,
+.associativity = 16,
+.partitions = 1,
+.sets = 16384,
+.lines_per_tag = 1,
+.self_init = true,
+.inclusive = true,
+.complex_indexing = true,
+},
+};
+
 static X86CPUDefinition builtin_x86_defs[] = {
 {
 .name = "qemu64",
@@ -3204,6 +3254,56 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .model_id = "Hygon Dhyana Processor",
.cache_info = &epyc_cache_info,
 },
+{
+.name = "EPYC-Rome",
+.level = 0xd,
+.vendor = CPUID_VENDOR_AMD,
+.family = 23,
+.model = 49,
+.stepping = 0,
+.features[FEAT_1_EDX] =
+CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX | CPUID_CLFLUSH |
+CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA | CPUID_PGE |
+CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 | CPUID_MCE |
+CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
+CPUID_VME | CPUID_FP87,
+.features[FEAT_1_ECX] =
+CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
+CPUID_EXT_XSAVE | CPUID_EXT_AES |  CPUID_EXT_POPCNT |
+CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
+CPUID_EXT_CX16 | CPUID_EXT_FMA | CPUID_EXT_SSSE3 |
+CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3,
+.features[FEAT_8000_0001_EDX] =
+CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
+CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
+CPUID_EXT2_SYSCALL,
+.features[FEAT_8000_0001_ECX] =
+ 

[PATCH v2 1/2] i386: Add missing cpu feature bits in EPYC model

2019-11-07 Thread Moger, Babu
Adds the following missing CPUID bits:
perfctr-core : core performance counter extensions support. Enables the VM
   to use extended performance counter support. It enables six
   programmable counters instead of 4 counters.
clzero   : instruction zeroes out the 64 byte cache line specified in RAX.
xsaveerptr   : FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
   pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
   pointers.
ibpb : Indirect Branch Prediction Barrier.
xsaves   : XSAVES, XRSTORS and IA32_XSS supported.

Depends on following kernel commits:
40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
52297436199d ("kvm: svm: Update svm_xsaves_supported")

These new features will be added in EPYC-v3. The -cpu help output after the change:
x86 EPYC-v1   AMD EPYC Processor
x86 EPYC-v2   AMD EPYC Processor (with IBPB)
x86 EPYC-v3   AMD EPYC Processor

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c |   17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 07cf562d89..6b7b0f8a4b 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3116,10 +3116,6 @@ static X86CPUDefinition builtin_x86_defs[] = {
 CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_RDSEED |
 CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_CLFLUSHOPT |
 CPUID_7_0_EBX_SHA_NI,
-/* Missing: XSAVES (not supported by some Linux versions,
- * including v4.1 to v4.12).
- * KVM doesn't yet expose any XSAVES state save component.
- */
 .features[FEAT_XSAVE] =
 CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
 CPUID_XSAVE_XGETBV1,
@@ -3142,6 +3138,19 @@ static X86CPUDefinition builtin_x86_defs[] = {
 { /* end of list */ }
 }
 },
+{
+.version = 3,
+.props = (PropValue[]) {
+{ "ibpb", "on" },
+{ "perfctr-core", "on" },
+{ "clzero", "on" },
+{ "xsaveerptr", "on" },
+{ "xsaves", "on" },
+{ "model-id",
+  "AMD EPYC Processor" },
+{ /* end of list */ }
+}
+},
 { /* end of list */ }
 }
 },
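The PropValue lists above are the whole versioning mechanism: a named CPU model version is just the base model plus a table of property overrides. A minimal sketch of that lookup, with hypothetical names (`PropValue` here is a simplified stand-in, not QEMU's actual structure):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the versioned-CPU-model mechanism: each
 * version carries a NULL-terminated property/value table applied on
 * top of the base model, so EPYC-v3 = base EPYC + the listed bits. */
typedef struct PropValue {
    const char *prop;
    const char *value;
} PropValue;

static const PropValue epyc_v3_props[] = {
    { "ibpb", "on" },
    { "perfctr-core", "on" },
    { "clzero", "on" },
    { "xsaveerptr", "on" },
    { "xsaves", "on" },
    { NULL, NULL }, /* end of list */
};

/* Returns the overridden value for a property, or NULL when the
 * version leaves the base model's default untouched. */
static const char *version_lookup(const PropValue *props, const char *name)
{
    for (; props->prop; props++) {
        if (strcmp(props->prop, name) == 0) {
            return props->value;
        }
    }
    return NULL;
}
```

Guests started with `-cpu EPYC-v3` thus get the base EPYC definition with these five features switched on, without touching machine-type compat lists.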



[PATCH v2 0/2] Add support for 2nd generation AMD EPYC processors

2019-11-07 Thread Moger, Babu
The following series adds the support for 2nd generation AMD EPYC Processors
on qemu guests. The model display name for 2nd generation will be EPYC-Rome.

Also fixes a few missing CPU feature bits in the 1st generation EPYC models.

The Reference documents are available at
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf
https://www.amd.com/system/files/TechDocs/24594.pdf

---
v2: Used the versioned CPU models instead of machine-type-based CPU
compatibility (commented by Eduardo).

Babu Moger (2):
  i386: Add missing cpu feature bits in EPYC model
  i386: Add 2nd Generation AMD EPYC processors


 target/i386/cpu.c |  119 +++--
 target/i386/cpu.h |2 +
 2 files changed, 116 insertions(+), 5 deletions(-)

--


RE: [PATCH 1/2] i386: Add missing cpu feature bits in EPYC model

2019-11-05 Thread Moger, Babu



> -Original Message-
> From: Eduardo Habkost 
> Sent: Tuesday, November 5, 2019 3:43 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; qemu-devel@nongnu.org
> Subject: Re: [PATCH 1/2] i386: Add missing cpu feature bits in EPYC model
> 
> On Tue, Nov 05, 2019 at 09:17:30PM +, Moger, Babu wrote:
> > Adds the following missing CPUID bits:
> > perfctr-core : core performance counter extensions support. Enables the VM
> >to use extended performance counter support. It enables six
> >programmable counters instead of 4 counters.
> > clzero   : instruction zeroes out the 64 byte cache line specified in 
> > RAX.
> > xsaveerptr   : FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
> >pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
> >pointers.
> > ibpb : Indirect Branch Prediction Barrier.
> > xsaves   : XSAVES, XRSTORS and IA32_XSS supported.
> >
> > Depends on:
> > 40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
> > 504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
> > 52297436199d ("kvm: svm: Update svm_xsaves_supported")
> >
> > Signed-off-by: Babu Moger 
> > ---
> >  hw/i386/pc.c  |8 +++-
> >  target/i386/cpu.c |   11 +--
> >  2 files changed, 12 insertions(+), 7 deletions(-)
> >
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 51b72439b4..a72fe1db31 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -105,7 +105,13 @@ struct hpet_fw_config hpet_cfg = {.count =
> UINT8_MAX};
> >  /* Physical Address of PVH entry point read from kernel ELF NOTE */
> >  static size_t pvh_start_addr;
> >
> > -GlobalProperty pc_compat_4_1[] = {};
> > +GlobalProperty pc_compat_4_1[] = {
> > +{ "EPYC" "-" TYPE_X86_CPU, "perfctr-core", "off" },
> > +{ "EPYC" "-" TYPE_X86_CPU, "clzero", "off" },
> > +{ "EPYC" "-" TYPE_X86_CPU, "xsaveerptr", "off" },
> > +{ "EPYC" "-" TYPE_X86_CPU, "ibpb", "off" },
> > +{ "EPYC" "-" TYPE_X86_CPU, "xsaves", "off" },
> > +};
> 
> machine-type-based CPU compatibility was now replaced by
> versioned CPU models.  Please use the X86CPUDefinition.versions
> field to add a new version of EPYC instead.

Ok. Did  you mean like this commit  below?
fd63c6d1a5f77d68 ("i386: Add Cascadelake-Server-v2 CPU model")

> 
> >  const size_t pc_compat_4_1_len = G_N_ELEMENTS(pc_compat_4_1);
> >
> >  GlobalProperty pc_compat_4_0[] = {};
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 07cf562d89..71233e6310 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -3110,19 +3110,18 @@ static X86CPUDefinition builtin_x86_defs[] = {
> >  CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
> >  CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM
> |
> >  CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
> > -CPUID_EXT3_TOPOEXT,
> > +CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
> > +.features[FEAT_8000_0008_EBX] =
> > +CPUID_8000_0008_EBX_CLZERO |
> CPUID_8000_0008_EBX_XSAVEERPTR |
> > +CPUID_8000_0008_EBX_IBPB,
> >  .features[FEAT_7_0_EBX] =
> >  CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
> CPUID_7_0_EBX_AVX2 |
> >  CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
> CPUID_7_0_EBX_RDSEED |
> >  CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_SMAP |
> CPUID_7_0_EBX_CLFLUSHOPT |
> >  CPUID_7_0_EBX_SHA_NI,
> > -/* Missing: XSAVES (not supported by some Linux versions,
> > - * including v4.1 to v4.12).
> > - * KVM doesn't yet expose any XSAVES state save component.
> > - */
> >  .features[FEAT_XSAVE] =
> >  CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
> > -CPUID_XSAVE_XGETBV1,
> > +CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
> >  .features[FEAT_6_EAX] =
> >  CPUID_6_EAX_ARAT,
> >  .features[FEAT_SVM] =
> >
> 
> --
> Eduardo




[PATCH 2/2] i386: Add 2nd Generation AMD EPYC processors

2019-11-05 Thread Moger, Babu
Adds the support for 2nd Gen AMD EPYC Processors. The model display
name will be EPYC-Rome.

Adds the following new feature bits on top of the feature bits from the
first generation EPYC models.
perfctr-core : core performance counter extensions support. Enables the VM to
   use extended performance counter support. It enables six
   programmable counters instead of four counters.
clzero   : instruction zeroes out the 64 byte cache line specified in RAX.
xsaveerptr   : FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
   pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
   pointers.
wbnoinvd : Write back and do not invalidate cache
ibpb : Indirect Branch Prediction Barrier
amd-stibp: Single Thread Indirect Branch Predictor
clwb : Cache Line Write Back and Retain
xsaves   : XSAVES, XRSTORS and IA32_XSS support
rdpid: Read Processor ID instruction support
umip : User-Mode Instruction Prevention support

The  Reference documents are available at
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf
https://www.amd.com/system/files/TechDocs/24594.pdf

Depends on following kernel commits:
40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
6d61e3c32248 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
52297436199d ("kvm: svm: Update svm_xsaves_supported")

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c |  102 -
 target/i386/cpu.h |2 +
 2 files changed, 103 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 71233e6310..846662c879 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1133,7 +1133,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
 "clzero", NULL, "xsaveerptr", NULL,
 NULL, NULL, NULL, NULL,
 NULL, "wbnoinvd", NULL, NULL,
-"ibpb", NULL, NULL, NULL,
+"ibpb", NULL, NULL, "amd-stibp",
 NULL, NULL, NULL, NULL,
 NULL, NULL, NULL, NULL,
 "amd-ssbd", "virt-ssbd", "amd-no-ssb", NULL,
@@ -1796,6 +1796,56 @@ static CPUCaches epyc_cache_info = {
 },
 };
 
+static CPUCaches epyc_rome_cache_info = {
+.l1d_cache = &(CPUCacheInfo) {
+.type = DATA_CACHE,
+.level = 1,
+.size = 32 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 64,
+.lines_per_tag = 1,
+.self_init = 1,
+.no_invd_sharing = true,
+},
+.l1i_cache = &(CPUCacheInfo) {
+.type = INSTRUCTION_CACHE,
+.level = 1,
+.size = 32 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 64,
+.lines_per_tag = 1,
+.self_init = 1,
+.no_invd_sharing = true,
+},
+.l2_cache = &(CPUCacheInfo) {
+.type = UNIFIED_CACHE,
+.level = 2,
+.size = 512 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 1024,
+.lines_per_tag = 1,
+},
+.l3_cache = &(CPUCacheInfo) {
+.type = UNIFIED_CACHE,
+.level = 3,
+.size = 16 * MiB,
+.line_size = 64,
+.associativity = 16,
+.partitions = 1,
+.sets = 16384,
+.lines_per_tag = 1,
+.self_init = true,
+.inclusive = true,
+.complex_indexing = true,
+},
+};
+
 static X86CPUDefinition builtin_x86_defs[] = {
 {
 .name = "qemu64",
@@ -3194,6 +3244,56 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .model_id = "Hygon Dhyana Processor",
.cache_info = &epyc_cache_info,
 },
+{
+.name = "EPYC-Rome",
+.level = 0xd,
+.vendor = CPUID_VENDOR_AMD,
+.family = 23,
+.model = 49,
+.stepping = 0,
+.features[FEAT_1_EDX] =
+CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX | CPUID_CLFLUSH |
+CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA | CPUID_PGE |
+CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 | CPUID_MCE |
+CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE | CPUID_DE |
+CPUID_VME | CPUID_FP87,
+.features[FEAT_1_ECX] =
+CPUID_EXT_RDRAND | CPUID_EXT_F16C | CPUID_EXT_AVX |
+CPUID_EXT_XSAVE | CPUID_EXT_AES |  CPUID_EXT_POPCNT |
+CPUID_EXT_MOVBE | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
+CPUID_EXT_CX16 | CPUID_EXT_FMA | CPUID_EXT_SSSE3 |
+CPUID_EXT_MONITOR | CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3,
+.features[FEAT_8000_0001_EDX] =
+CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_PDPE1GB |
+CPUID_EXT2_FFXSR | CPUID_EXT2_MMXEXT | CPUID_EXT2_NX |
+CPUID_EXT2_SYSCALL,
+.features[FEAT_8000_0001_ECX] =
+ 
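As a sanity check on the CPUCacheInfo entries above: for a set-associative cache, total size equals line_size × associativity × partitions × sets. A tiny helper (ours, not QEMU's) verifying the EPYC-Rome numbers:

```c
#include <assert.h>

/* For a set-associative cache the fields in CPUCacheInfo must satisfy
 *   size = line_size * associativity * partitions * sets.
 * This helper is illustrative only, not part of QEMU. */
static unsigned long cache_bytes(unsigned line_size, unsigned assoc,
                                 unsigned partitions, unsigned sets)
{
    return (unsigned long)line_size * assoc * partitions * sets;
}
```

Plugging in the patch's values: L1d/L1i 64·8·1·64 = 32 KiB, L2 64·8·1·1024 = 512 KiB, L3 64·16·1·16384 = 16 MiB, matching the declared `.size` fields.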

[PATCH 1/2] i386: Add missing cpu feature bits in EPYC model

2019-11-05 Thread Moger, Babu
Adds the following missing CPUID bits:
perfctr-core : core performance counter extensions support. Enables the VM
   to use extended performance counter support. It enables six
   programmable counters instead of 4 counters.
clzero   : instruction zeroes out the 64 byte cache line specified in RAX.
xsaveerptr   : FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
   pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
   pointers.
ibpb : Indirect Branch Prediction Barrier.
xsaves   : XSAVES, XRSTORS and IA32_XSS supported.

Depends on:
40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
52297436199d ("kvm: svm: Update svm_xsaves_supported")

Signed-off-by: Babu Moger 
---
 hw/i386/pc.c  |8 +++-
 target/i386/cpu.c |   11 +--
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 51b72439b4..a72fe1db31 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -105,7 +105,13 @@ struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
 /* Physical Address of PVH entry point read from kernel ELF NOTE */
 static size_t pvh_start_addr;
 
-GlobalProperty pc_compat_4_1[] = {};
+GlobalProperty pc_compat_4_1[] = {
+{ "EPYC" "-" TYPE_X86_CPU, "perfctr-core", "off" },
+{ "EPYC" "-" TYPE_X86_CPU, "clzero", "off" },
+{ "EPYC" "-" TYPE_X86_CPU, "xsaveerptr", "off" },
+{ "EPYC" "-" TYPE_X86_CPU, "ibpb", "off" },
+{ "EPYC" "-" TYPE_X86_CPU, "xsaves", "off" },
+};
 const size_t pc_compat_4_1_len = G_N_ELEMENTS(pc_compat_4_1);
 
 GlobalProperty pc_compat_4_0[] = {};
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 07cf562d89..71233e6310 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3110,19 +3110,18 @@ static X86CPUDefinition builtin_x86_defs[] = {
 CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
 CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
 CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
-CPUID_EXT3_TOPOEXT,
+CPUID_EXT3_TOPOEXT | CPUID_EXT3_PERFCORE,
+.features[FEAT_8000_0008_EBX] =
+CPUID_8000_0008_EBX_CLZERO | CPUID_8000_0008_EBX_XSAVEERPTR |
+CPUID_8000_0008_EBX_IBPB,
 .features[FEAT_7_0_EBX] =
 CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 |
 CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_RDSEED |
 CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_SMAP | CPUID_7_0_EBX_CLFLUSHOPT |
 CPUID_7_0_EBX_SHA_NI,
-/* Missing: XSAVES (not supported by some Linux versions,
- * including v4.1 to v4.12).
- * KVM doesn't yet expose any XSAVES state save component.
- */
 .features[FEAT_XSAVE] =
 CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
-CPUID_XSAVE_XGETBV1,
+CPUID_XSAVE_XGETBV1 | CPUID_XSAVE_XSAVES,
 .features[FEAT_6_EAX] =
 CPUID_6_EAX_ARAT,
 .features[FEAT_SVM] =



[PATCH 0/2] Add support for 2nd generation AMD EPYC processors

2019-11-05 Thread Moger, Babu
The following series adds the support for 2nd generation AMD EPYC Processors
on QEMU guests. The model display name will be EPYC-Rome.

Also fixes a few missing CPU feature bits in the 1st generation EPYC models.

The Reference documents are available at
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf
https://www.amd.com/system/files/TechDocs/24594.pdf

---

Babu Moger (2):
  i386: Add missing cpu feature bits in EPYC model
  i386: Add 2nd Generation AMD EPYC processors


 hw/i386/pc.c  |8 +++-
 target/i386/cpu.c |  113 ++---
 target/i386/cpu.h |2 +
 3 files changed, 115 insertions(+), 8 deletions(-)

--


Re: [Qemu-devel] [RFC 2 PATCH 13/16] machine: Add new epyc property in PCMachineState

2019-10-11 Thread Moger, Babu

On 10/10/19 10:59 PM, Eduardo Habkost wrote:
> On Fri, Sep 06, 2019 at 07:13:09PM +0000, Moger, Babu wrote:
>> Adds new epyc property in PCMachineState and also in MachineState.
>> This property will be used to initialize the mode specific handlers
>> to generate apic ids.
>>
>> Signed-off-by: Babu Moger 
>> ---
> [...]
>> diff --git a/include/hw/boards.h b/include/hw/boards.h
>> index 12eb5032a5..0001d42e50 100644
>> --- a/include/hw/boards.h
>> +++ b/include/hw/boards.h
>> @@ -299,6 +299,8 @@ struct MachineState {
>>  AccelState *accelerator;
>>  CPUArchIdList *possible_cpus;
>>  CpuTopology smp;
>> +bool epyc;
>> +
> 
> This won't scale at all when we start adding new CPU models with
> different topology constraints.

Yes, I knew. This could cause scaling issues. Let me see if we could
do anything different.

> 
> I still have hope we can avoid having separate set of topology ID
> functions (see my reply to "hw/386: Add new epyc mode topology

Yes. That was my hope too. Let me think thru this bit more. I will come
back on this.


> decoding functions").  But if we really have to create separate
> functions, we can make them part of the CPU model table, not a
> boolean machine property.
> 




Re: [RFC 2 PATCH 06/16] hw/core: Add core complex id in X86CPU topology

2019-09-23 Thread Moger, Babu


On 9/22/19 7:48 AM, Michael S. Tsirkin wrote:
> On Fri, Sep 06, 2019 at 07:12:18PM +0000, Moger, Babu wrote:
>> Introduce the CPU core complex ID (ccx_id) in the X86CPU topology.
>> Each CCX can have up to 4 cores sharing the same L3 cache.
>> This information is required to build the topology in the
>> new epyc mode.
>>
>> Signed-off-by: Babu Moger 
>> ---
>>  hw/core/machine-hmp-cmds.c |3 +++
>>  hw/core/machine.c  |   13 +
>>  hw/i386/pc.c   |   10 ++
>>  include/hw/i386/topology.h |1 +
>>  qapi/machine.json  |4 +++-
>>  target/i386/cpu.c  |2 ++
>>  target/i386/cpu.h  |1 +
>>  7 files changed, 33 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/core/machine-hmp-cmds.c b/hw/core/machine-hmp-cmds.c
>> index 1f66bda346..6c534779af 100644
>> --- a/hw/core/machine-hmp-cmds.c
>> +++ b/hw/core/machine-hmp-cmds.c
>> @@ -89,6 +89,9 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict 
>> *qdict)
>>  if (c->has_die_id) {
>>  monitor_printf(mon, "die-id: \"%" PRIu64 "\"\n", c->die_id);
>>  }
>> +if (c->has_ccx_id) {
>> +monitor_printf(mon, "ccx-id: \"%" PRIu64 "\"\n", c->ccx_id);
>> +}
>>  if (c->has_core_id) {
>>  monitor_printf(mon, "core-id: \"%" PRIu64 "\"\n", 
>> c->core_id);
>>  }
>> diff --git a/hw/core/machine.c b/hw/core/machine.c
>> index 4034b7e903..9a8586cf30 100644
>> --- a/hw/core/machine.c
>> +++ b/hw/core/machine.c
>> @@ -694,6 +694,11 @@ void machine_set_cpu_numa_node(MachineState *machine,
>>  return;
>>  }
>>  
>> +if (props->has_ccx_id && !slot->props.has_ccx_id) {
>> +error_setg(errp, "ccx-id is not supported");
>> +return;
>> +}
>> +
>>  /* skip slots with explicit mismatch */
>>  if (props->has_thread_id && props->thread_id != 
>> slot->props.thread_id) {
>>  continue;
>> @@ -707,6 +712,10 @@ void machine_set_cpu_numa_node(MachineState *machine,
>>  continue;
>>  }
>>  
>> +if (props->has_ccx_id && props->ccx_id != slot->props.ccx_id) {
>> +continue;
>> +}
>> +
>>  if (props->has_socket_id && props->socket_id != 
>> slot->props.socket_id) {
>>  continue;
>>  }
>> @@ -1041,6 +1050,10 @@ static char *cpu_slot_to_string(const CPUArchId *cpu)
>>  if (cpu->props.has_die_id) {
>>  g_string_append_printf(s, "die-id: %"PRId64, cpu->props.die_id);
>>  }
>> +
>> +if (cpu->props.has_ccx_id) {
>> +g_string_append_printf(s, "ccx-id: %"PRId64, cpu->props.ccx_id);
>> +}
>>  if (cpu->props.has_core_id) {
>>  if (s->len) {
>>  g_string_append_printf(s, ", ");
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index 9e1c3f9f57..f71389ad9f 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -2444,6 +2444,7 @@ static void pc_cpu_pre_plug(HotplugHandler 
>> *hotplug_dev,
>>  
>>  topo_ids.pkg_id = cpu->socket_id;
>>  topo_ids.die_id = cpu->die_id;
>> +topo_ids.ccx_id = cpu->ccx_id;
>>  topo_ids.core_id = cpu->core_id;
>>  topo_ids.smt_id = cpu->thread_id;
>>  cpu->apic_id = apicid_from_topo_ids(_info, _ids);
>> @@ -2489,6 +2490,13 @@ static void pc_cpu_pre_plug(HotplugHandler 
>> *hotplug_dev,
>>  }
>>  cpu->die_id = topo_ids.die_id;
>>  
>> +if (cpu->ccx_id != -1 && cpu->ccx_id != topo_ids.ccx_id) {
>> +error_setg(errp, "property ccx-id: %u doesn't match set apic-id:"
>> +" 0x%x (ccx-id: %u)", cpu->ccx_id, cpu->apic_id, 
>> topo_ids.ccx_id);
>> +return;
>> +}
>> +cpu->ccx_id = topo_ids.ccx_id;
>> +
>>  if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
>>  error_setg(errp, "property core-id: %u doesn't match set apic-id:"
>>  " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, 
>> topo_ids.core_id);
>> @@ -2896,6 +2904,8 @@ stati

Re: [RFC 2 PATCH 00/16] APIC ID fixes for AMD EPYC CPU models

2019-09-20 Thread Moger, Babu
Eduardo and all,
 Waiting for the feedback on this to move forward. Appreciate your time.
Thanks Babu

On 9/6/19 2:11 PM, Moger, Babu wrote:
> These series fixes the problems encoding APIC ID for AMD EPYC cpu models.
> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
> 
> This is the second pass to give an idea of the changes required to address
> the issue. First pass is available at 
> https://patchwork.kernel.org/cover/11069785/
> 
> Currently, apic id is decoded based on sockets/dies/cores/threads. This 
> appears
> to work for most standard configurations for AMD and other vendors. But this
> decoding does not follow AMD's APIC ID enumeration. In some cases this
> causes CPU topology inconstancy. While booting guest Kernel is trying to
> validate topology. It finds the topology not aligning to EPYC models.
> 
> To fix the problem we need to build the topology as per the
> Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision 
> B1
> Processors. It is available at https://www.amd.com/en/support/tech-docs
> 
> Here is the text from the PPR.
> 2.1.10.2.1.3
> ApicId Enumeration Requirements
> Operating systems are expected to use
> Core::X86::Cpuid::SizeId[ApicIdCoreIdSize], the number of least
> significant bits in the Initial APIC ID that indicate core ID within a
> processor, in constructing per-core CPUID
> masks. Core::X86::Cpuid::SizeId[ApicIdCoreIdSize] determines the maximum 
> number
> of cores (MNC) that the
> processor could theoretically support, not the actual number of cores that are
> actually implemented or enabled on
> the processor, as indicated by Core::X86::Cpuid::SizeId[NC].
> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
> • ApicId[6] = Socket ID.
> • ApicId[5:4] = Node ID.
> • ApicId[3] = Logical CCX L3 complex ID
> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} :
> {1'b0,LogicalCoreID[1:0]}.
> """
> 
> v2:
>   1. Introduced the new property epyc to enable new epyc mode.
>   2. Separated the epyc mode and non epyc mode function.
>   3. Introduced function pointers in PCMachineState to handle the
>  differences.
>   4. Mildly tested different combinations to make things are working as 
> expected.
>   5. TODO : Setting the epyc feature bit needs to be worked out. This feature 
> is
>  supported only on AMD EPYC models. I may need some guidance on that.
> 
> v1:
>   https://patchwork.kernel.org/cover/11069785/
> 
> ---
> 
> Babu Moger (16):
>   numa: Split the numa functionality
>   hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
>   hw/i386: Introduce X86CPUTopoInfo to contain topology info
>   machine: Add SMP Sockets in CpuTopology
>   hw/i386: Simplify topology Offset/width Calculation
>   hw/core: Add core complex id in X86CPU topology
>   hw/386: Add new epyc mode topology decoding functions
>   i386: Cleanup and use the new epyc mode topology functions
>   hw/i386: Introduce initialize_topo_info function
>   hw/i386: Introduce apicid_from_cpu_idx in PCMachineState
>   Introduce-topo_ids_from_apicid-handler
>   hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState
>   machine: Add new epyc property in PCMachineState
>   hw/i386: Introduce epyc mode function handlers
>   i386: Fix pkg_id offset for epyc mode
>   hw/core: Fix up the machine_set_cpu_numa_node for epyc
> 
> 
>  hw/core/machine-hmp-cmds.c |3 
>  hw/core/machine.c  |   38 ++
>  hw/core/numa.c |  110 
>  hw/i386/pc.c   |  143 +++--
>  include/hw/boards.h|8 +
>  include/hw/i386/pc.h   |9 +
>  include/hw/i386/topology.h |  294 +++-
>  include/sysemu/numa.h  |2 
>  qapi/machine.json  |4 -
>  target/i386/cpu.c  |  209 +++
>  target/i386/cpu.h  |1 
>  vl.c   |3 
>  12 files changed, 560 insertions(+), 264 deletions(-)
> 
> --
> Signature
> 
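The quoted PPR bit layout (ApicId[6] = socket, [5:4] = node, [3] = CCX L3 complex, [2:0] = core/thread) can be sketched as a packing helper. This is an illustrative function of ours, not QEMU code, and it assumes exactly the Family 17h field widths quoted above:

```c
#include <assert.h>
#include <stdint.h>

/* Packs an initial APIC ID per the PPR for AMD Family 17h Model 01h:
 *   ApicId[6]   = socket ID
 *   ApicId[5:4] = node ID within the socket
 *   ApicId[3]   = logical CCX (L3 complex) ID
 *   ApicId[2:0] = {core[1:0], thread} with SMT, {0, core[1:0]} without
 * Illustrative helper only -- not QEMU's implementation. */
static uint32_t epyc_apicid(unsigned socket, unsigned node,
                            unsigned ccx, unsigned core,
                            unsigned thread, int smt)
{
    uint32_t low = smt ? (((core & 3u) << 1) | (thread & 1u))
                       : (core & 3u);
    return ((uint32_t)socket << 6) | ((node & 3u) << 4)
         | ((ccx & 1u) << 3) | low;
}
```

This is why the generic sockets/dies/cores/threads decoding breaks on EPYC: the node and CCX fields sit at fixed bit positions regardless of how many cores are actually enabled.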


Re: [Qemu-devel] [RFC 2 PATCH 06/16] hw/core: Add core complex id in X86CPU topology

2019-09-06 Thread Moger, Babu


On 9/6/19 2:20 PM, Eric Blake wrote:
> On 9/6/19 2:12 PM, Moger, Babu wrote:
>> Introduce the CPU core complex ID (ccx_id) in the X86CPU topology.
>> Each CCX can have up to 4 cores sharing the same L3 cache.
>> This information is required to build the topology in the
>> new epyc mode.
>>
>> Signed-off-by: Babu Moger 
>> ---
> 
>> +++ b/qapi/machine.json
>> @@ -597,9 +597,10 @@
>>  # @node-id: NUMA node ID the CPU belongs to
>>  # @socket-id: socket number within node/board the CPU belongs to
>>  # @die-id: die number within node/board the CPU belongs to (Since 4.1)
>> +# @ccx-id: core complex number within node/board the CPU belongs to (Since 
>> 4.1)
> 
> 4.2 now
ok. Will fix.
> 
>>  # @core-id: core number within die the CPU belongs to# @thread-id: thread 
>> number within core the CPU belongs to
> 
> Pre-existing, but let's fix that missing newline while you're here.

Sure. will take care. thanks


[Qemu-devel] [RFC 2 PATCH 15/16] i386: Fix pkg_id offset for epyc mode

2019-09-06 Thread Moger, Babu
Signed-off-by: Babu Moger 
---
 target/i386/cpu.c |   24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f25491a029..f8b1fc5c07 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4094,9 +4094,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
uint32_t *eax, uint32_t *ebx,
uint32_t *ecx, uint32_t *edx)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 X86CPU *cpu = env_archcpu(env);
 CPUState *cs = env_cpu(env);
-uint32_t die_offset;
+uint32_t die_offset, pkg_offset;
 uint32_t limit;
 uint32_t signature[3];
 
@@ -4119,6 +4120,21 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 index = env->cpuid_level;
 }
 
+if (ms->epyc) {
+X86CPUTopoInfo topo_info = {
+.numa_nodes = nb_numa_nodes,
+.nr_sockets = ms->smp.sockets,
+.nr_cores = ms->smp.cores,
+.nr_threads = ms->smp.threads,
+};
+unsigned nodes = nodes_in_pkg(&topo_info);
+pkg_offset = apicid_pkg_offset_epyc(nodes, MAX_CCX, MAX_CORES_IN_CCX,
+cs->nr_threads);
+} else {
+pkg_offset = apicid_pkg_offset(env->nr_dies, cs->nr_cores,
+   cs->nr_threads);
+}
+
 switch(index) {
 case 0:
 *eax = env->cpuid_level;
@@ -4275,8 +4291,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
 break;
 case 1:
-*eax = apicid_pkg_offset(env->nr_dies,
- cs->nr_cores, cs->nr_threads);
+*eax = pkg_offset;
 *ebx = cs->nr_cores * cs->nr_threads;
 *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
 break;
@@ -4310,8 +4325,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
 break;
 case 2:
-*eax = apicid_pkg_offset(env->nr_dies, cs->nr_cores,
-   cs->nr_threads);
+*eax = pkg_offset;
 *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
 *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
 break;
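The pkg_offset value being factored out above is just the sum of the bit widths of the lower topology fields, where each width is the number of bits needed to encode that many IDs. A sketch of the arithmetic, mirroring what QEMU's apicid_pkg_offset() computes (helper names are ours):

```c
#include <assert.h>

/* Number of bits needed to encode `count` distinct IDs (0 when there
 * is only one ID). Requires count >= 1. Mirrors the semantics of
 * QEMU's apicid_bitwidth_for_count(). */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;
    for (count -= 1; count; count >>= 1) {
        width++;
    }
    return width;
}

/* The package ID field starts above the die, core and thread fields,
 * so its offset -- what CPUID leaf 0xB EAX reports for the package
 * level -- is the sum of their widths. Illustrative only. */
static unsigned pkg_offset(unsigned dies, unsigned cores, unsigned threads)
{
    return bitwidth_for_count(threads) + bitwidth_for_count(cores)
         + bitwidth_for_count(dies);
}
```

The patch computes this once up front because the EPYC layout inserts extra node/CCX fields, so the offset differs from the generic dies/cores/threads case.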



[Qemu-devel] [RFC 2 PATCH 13/16] machine: Add new epyc property in PCMachineState

2019-09-06 Thread Moger, Babu
Adds new epyc property in PCMachineState and also in MachineState.
This property will be used to initialize the mode specific handlers
to generate apic ids.

Signed-off-by: Babu Moger 
---
 hw/i386/pc.c |   23 +++
 include/hw/boards.h  |2 ++
 include/hw/i386/pc.h |1 +
 3 files changed, 26 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 959bd3821b..14760523a9 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2810,6 +2810,22 @@ static void pc_machine_set_pit(Object *obj, bool value, 
Error **errp)
 pcms->pit_enabled = value;
 }
 
+static bool pc_machine_get_epyc(Object *obj, Error **errp)
+{
+PCMachineState *pcms = PC_MACHINE(obj);
+
+return pcms->epyc;
+}
+
+static void pc_machine_set_epyc(Object *obj, bool value, Error **errp)
+{
+PCMachineState *pcms = PC_MACHINE(obj);
+MachineState *ms = MACHINE(pcms);
+
+pcms->epyc = value;
+ms->epyc = value;
+}
+
 static void pc_machine_initfn(Object *obj)
 {
 PCMachineState *pcms = PC_MACHINE(obj);
@@ -3015,6 +3031,13 @@ static void pc_machine_class_init(ObjectClass *oc, void 
*data)
 
 object_class_property_add_bool(oc, PC_MACHINE_PIT,
pc_machine_get_pit, pc_machine_set_pit, &error_abort);
+
+object_class_property_add_bool(oc, "epyc",
+pc_machine_get_epyc, pc_machine_set_epyc, &error_abort);
+
+object_class_property_set_description(oc, "epyc",
+"Set on/off to use epyc mode", &error_abort);
+
 }
 
 static const TypeInfo pc_machine_info = {
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 12eb5032a5..0001d42e50 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -299,6 +299,8 @@ struct MachineState {
 AccelState *accelerator;
 CPUArchIdList *possible_cpus;
 CpuTopology smp;
+bool epyc;
+
 struct NVDIMMState *nvdimms_state;
 };
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index d6f1189997..cf9e7b0045 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -68,6 +68,7 @@ struct PCMachineState {
 uint64_t *node_mem;
 
 /* Apic id specific handlers */
+bool epyc;
 uint32_t (*apicid_from_cpu_idx)(X86CPUTopoInfo *topo_info, unsigned 
cpu_index);
 void (*topo_ids_from_apicid)(apic_id_t apicid, X86CPUTopoInfo *topo_info,
  X86CPUTopoIDs *topo_ids);



[Qemu-devel] [RFC 2 PATCH 14/16] hw/i386: Introduce epyc mode function handlers

2019-09-06 Thread Moger, Babu
Introduce following handlers for new epyc mode.
x86_apicid_from_cpu_idx_epyc: Generate apicid from cpu index.
x86_topo_ids_from_apicid_epyc: Generate topo ids from apic id.
x86_apicid_from_topo_ids_epyc: Generate apicid from topo ids.

Signed-off-by: Babu Moger 
---
 hw/i386/pc.c |5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 14760523a9..59c7c4d8b2 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2824,6 +2824,11 @@ static void pc_machine_set_epyc(Object *obj, bool value, 
Error **errp)
 
 pcms->epyc = value;
 ms->epyc = value;
+if (pcms->epyc) {
+pcms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx_epyc;
+pcms->topo_ids_from_apicid = x86_topo_ids_from_apicid_epyc;
+pcms->apicid_from_topo_ids = x86_apicid_from_topo_ids_epyc;
+}
 }
 
 static void pc_machine_initfn(Object *obj)



[Qemu-devel] [RFC 2 PATCH 16/16] hw/core: Fix up the machine_set_cpu_numa_node for epyc

2019-09-06 Thread Moger, Babu
The current topology ID match will not work for epyc mode when setting
the node ID. In epyc mode, IDs like smt_id, thread_id, core_id,
ccx_id and socket_id can be the same for more than one CPU across
two NUMA nodes.

For example, we can have two CPUs with following ids on two different node.
1. smt_id=0, thread_id=0, core_id=0, ccx_id=0, socket_id=0, node_id=0
2. smt_id=0, thread_id=0, core_id=0, ccx_id=0, socket_id=0, node_id=1

The function machine_set_cpu_numa_node will fail to find a match to assign
the node. Add a new function machine_set_cpu_numa_node_epyc to set the node_id
directly in epyc mode.

Signed-off-by: Babu Moger 
---
 hw/core/machine.c   |   24 
 hw/core/numa.c  |6 +-
 include/hw/boards.h |4 
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 9a8586cf30..6bceefc6f3 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -741,6 +741,30 @@ void machine_set_cpu_numa_node(MachineState *machine,
 }
 }
 
+void machine_set_cpu_numa_node_epyc(MachineState *machine,
+const CpuInstanceProperties *props,
+unsigned index,
+Error **errp)
+{
+MachineClass *mc = MACHINE_GET_CLASS(machine);
+CPUArchId *slot;
+
+if (!mc->possible_cpu_arch_ids) {
+error_setg(errp, "mapping of CPUs to NUMA node is not supported");
+return;
+}
+
+/* disabling node mapping is not supported, forbid it */
+assert(props->has_node_id);
+
+/* force board to initialize possible_cpus if it hasn't been done yet */
+mc->possible_cpu_arch_ids(machine);
+
+slot = &machine->possible_cpus->cpus[index];
+slot->props.node_id = props->node_id;
+slot->props.has_node_id = props->has_node_id;
+}
+
 static void smp_parse(MachineState *ms, QemuOpts *opts)
 {
 if (opts) {
diff --git a/hw/core/numa.c b/hw/core/numa.c
index 27fa6b5e1d..a9e835aea6 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -247,7 +247,11 @@ void set_numa_node_options(MachineState *ms, NumaOptions 
*object, Error **errp)
  props = mc->cpu_index_to_instance_props(ms, cpus->value);
  props.node_id = nodenr;
  props.has_node_id = true;
- machine_set_cpu_numa_node(ms, &props, &err);
+ if (ms->epyc) {
+ machine_set_cpu_numa_node_epyc(ms, &props, cpus->value, &err);
+ } else {
+ machine_set_cpu_numa_node(ms, &props, &err);
+ }
  if (err) {
 error_propagate(errp, err);
 return;
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 0001d42e50..ec1b1c5a85 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -74,6 +74,10 @@ HotpluggableCPUList 
*machine_query_hotpluggable_cpus(MachineState *machine);
 void machine_set_cpu_numa_node(MachineState *machine,
const CpuInstanceProperties *props,
Error **errp);
+void machine_set_cpu_numa_node_epyc(MachineState *machine,
+const CpuInstanceProperties *props,
+unsigned index,
+Error **errp);
 
 void machine_class_allow_dynamic_sysbus_dev(MachineClass *mc, const char 
*type);
 



[Qemu-devel] [RFC 2 PATCH 11/16] hw/i386: Introduce topo_ids_from_apicid handler in PCMachineState

2019-09-06 Thread Moger, Babu
hw/i386: Introduce topo_ids_from_apicid handler in PCMachineState

Add function pointer topo_ids_from_apicid in PCMachineState.
Initialize it with the correct handler based on the mode selected.
x86_topo_ids_from_apicid will be the default handler.

Signed-off-by: Babu Moger 
---
 hw/i386/pc.c |   13 +++--
 include/hw/i386/pc.h |2 ++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 69a6b82186..c88de09350 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2461,7 +2461,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 if (!cpu_slot) {
 MachineState *ms = MACHINE(pcms);
 
-x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
+pcms->topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
 error_setg(errp,
 "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
 " APIC ID %" PRIu32 ", valid index range 0:%d",
@@ -2482,7 +2482,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 /* TODO: move socket_id/core_id/thread_id checks into x86_cpu_realizefn()
  * once -smp refactoring is complete and there will be CPU private
  * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
-x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
+pcms->topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
 if (cpu->socket_id != -1 && cpu->socket_id != topo_ids.pkg_id) {
 error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
 " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, 
topo_ids.pkg_id);
@@ -2830,6 +2830,7 @@ static void pc_machine_initfn(Object *obj)
 
 /* Initialize the apic id related handlers */
 pcms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx;
+pcms->topo_ids_from_apicid = x86_topo_ids_from_apicid;
 
 pc_system_flash_create(pcms);
 }
@@ -2872,8 +2873,8 @@ static int64_t pc_get_default_cpu_node_id(const 
MachineState *ms, int idx)
initialize_topo_info(_info, pcms, ms);
 
assert(idx < ms->possible_cpus->len);
-   x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
-&topo_info, &topo_ids);
+   pcms->topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
+  &topo_info, &topo_ids);
return topo_ids.pkg_id % nb_numa_nodes;
 }
 
@@ -2906,8 +2907,8 @@ static const CPUArchIdList 
*pc_possible_cpu_arch_ids(MachineState *ms)
 ms->possible_cpus->cpus[i].type = ms->cpu_type;
 ms->possible_cpus->cpus[i].vcpus_count = 1;
 ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, 
i);
-x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
- &topo_info, &topo_ids);
+pcms->topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
+   &topo_info, &topo_ids);
 ms->possible_cpus->cpus[i].props.has_socket_id = true;
 ms->possible_cpus->cpus[i].props.socket_id = topo_ids.pkg_id;
 ms->possible_cpus->cpus[i].props.has_die_id = true;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 6cefefdd57..9a40f123d0 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -69,6 +69,8 @@ struct PCMachineState {
 
 /* Apic id specific handlers */
 uint32_t (*apicid_from_cpu_idx)(X86CPUTopoInfo *topo_info, unsigned 
cpu_index);
+void (*topo_ids_from_apicid)(apic_id_t apicid, X86CPUTopoInfo *topo_info,
+ X86CPUTopoIDs *topo_ids);
 
 /* Address space used by IOAPIC device. All IOAPIC interrupts
  * will be translated to MSI messages in the address space. */



[Qemu-devel] [RFC 2 PATCH 10/16] hw/i386: Introduce apicid_from_cpu_idx in PCMachineState

2019-09-06 Thread Moger, Babu
Add function pointers in PCMachineState to handle apic id specific
functionalities. These will be initialized with the correct
handlers based on the mode selected.

x86_apicid_from_cpu_idx will be the default handler.
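The handler indirection this series builds up can be sketched as follows. This is a minimal standalone model, not QEMU code: the struct, the `*2` "EPYC" encoding and the setter names are placeholders standing in for PCMachineState, x86_apicid_from_cpu_idx and the epyc variants.

```c
#include <assert.h>

/* Hypothetical stand-in for PCMachineState: call sites go through the
 * function pointer and stay mode-agnostic. */
typedef struct {
    unsigned (*apicid_from_cpu_idx)(unsigned cpu_index);
} FakePCMachineState;

/* Placeholder default encoding (stands in for x86_apicid_from_cpu_idx). */
static unsigned default_apicid(unsigned cpu_index)
{
    return cpu_index;
}

/* Placeholder EPYC encoding (stands in for x86_apicid_from_cpu_idx_epyc). */
static unsigned epyc_apicid(unsigned cpu_index)
{
    return cpu_index * 2;
}

/* Mirror of pc_machine_initfn(): install the default handler. */
static void machine_initfn(FakePCMachineState *pcms)
{
    pcms->apicid_from_cpu_idx = default_apicid;
}

/* Mirror of pc_machine_set_epyc(): swap in the EPYC handler on demand. */
static void machine_set_epyc(FakePCMachineState *pcms, int on)
{
    if (on) {
        pcms->apicid_from_cpu_idx = epyc_apicid;
    }
}
```

The design choice is that only the property setter knows about modes; everything that needs an APIC ID simply calls through the pointer.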

Signed-off-by: Babu Moger 
---
 hw/i386/pc.c |5 -
 include/hw/i386/pc.h |4 
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 504e1ab083..69a6b82186 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -947,7 +947,7 @@ static uint32_t x86_cpu_apic_id_from_index(PCMachineState 
*pcms,
 
 initialize_topo_info(_info, pcms, ms);
 
-correct_id = x86_apicid_from_cpu_idx(&topo_info, cpu_index);
+correct_id = pcms->apicid_from_cpu_idx(&topo_info, cpu_index);
 if (pcmc->compat_apic_id_mode) {
 if (cpu_index != correct_id && !warned && !qtest_enabled()) {
 error_report("APIC IDs set in compatibility mode, "
@@ -2828,6 +2828,9 @@ static void pc_machine_initfn(Object *obj)
 pcms->pit_enabled = true;
 pcms->smp_dies = 1;
 
+/* Initialize the apic id related handlers */
+pcms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx;
+
 pc_system_flash_create(pcms);
 }
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 859b64c51d..6cefefdd57 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -17,6 +17,7 @@
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/acpi/acpi_dev_interface.h"
+#include "hw/i386/topology.h"
 
 #define HPET_INTCAP "hpet-intcap"
 
@@ -66,6 +67,9 @@ struct PCMachineState {
 uint64_t numa_nodes;
 uint64_t *node_mem;
 
+/* Apic id specific handlers */
+uint32_t (*apicid_from_cpu_idx)(X86CPUTopoInfo *topo_info, unsigned 
cpu_index);
+
 /* Address space used by IOAPIC device. All IOAPIC interrupts
  * will be translated to MSI messages in the address space. */
 AddressSpace *ioapic_as;



[Qemu-devel] [RFC 2 PATCH 09/16] hw/i386: Introduce initialize_topo_info function

2019-09-06 Thread Moger, Babu
Introduce initialize_topo_info to initialize the X86CPUTopoInfo
data structure used to build the topology. No functional change.

Signed-off-by: Babu Moger 
---
 hw/i386/pc.c |   29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index f71389ad9f..504e1ab083 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -918,6 +918,17 @@ bool e820_get_entry(int idx, uint32_t type, uint64_t 
*address, uint64_t *length)
 return false;
 }
 
+static inline void initialize_topo_info(X86CPUTopoInfo *topo_info,
+PCMachineState *pcms,
+const MachineState *ms)
+{
+topo_info->nr_dies = pcms->smp_dies;
+topo_info->nr_cores = ms->smp.cores;
+topo_info->nr_threads = ms->smp.threads;
+topo_info->nr_sockets = ms->smp.sockets;
+topo_info->numa_nodes = nb_numa_nodes;
+}
+
 /* Calculates initial APIC ID for a specific CPU index
  *
  * Currently we need to be able to calculate the APIC ID from the CPU index
@@ -934,9 +945,7 @@ static uint32_t x86_cpu_apic_id_from_index(PCMachineState 
*pcms,
 uint32_t correct_id;
 static bool warned;
 
-topo_info.nr_dies = pcms->smp_dies;
-topo_info.nr_cores = ms->smp.cores;
-topo_info.nr_threads = ms->smp.threads;
+initialize_topo_info(&topo_info, pcms, ms);
 
correct_id = x86_apicid_from_cpu_idx(&topo_info, cpu_index);
 if (pcmc->compat_apic_id_mode) {
@@ -2399,9 +2408,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 return;
 }
 
-topo_info.nr_dies = pcms->smp_dies;
-topo_info.nr_cores = smp_cores;
-topo_info.nr_threads = smp_threads;
+initialize_topo_info(&topo_info, pcms, ms);
 
 env->nr_dies = pcms->smp_dies;
 
@@ -2859,9 +2866,7 @@ static int64_t pc_get_default_cpu_node_id(const 
MachineState *ms, int idx)
PCMachineState *pcms = PC_MACHINE(ms);
X86CPUTopoInfo topo_info;
 
-   topo_info.nr_dies = pcms->smp_dies;
-   topo_info.nr_cores = ms->smp.cores;
-   topo_info.nr_threads = ms->smp.threads;
+   initialize_topo_info(&topo_info, pcms, ms);
 
assert(idx < ms->possible_cpus->len);
x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
@@ -2876,9 +2881,6 @@ static const CPUArchIdList 
*pc_possible_cpu_arch_ids(MachineState *ms)
 X86CPUTopoInfo topo_info;
 int i;
 
-topo_info.nr_dies = pcms->smp_dies;
-topo_info.nr_cores = ms->smp.cores;
-topo_info.nr_threads = ms->smp.threads;
 
 if (ms->possible_cpus) {
 /*
@@ -2891,6 +2893,9 @@ static const CPUArchIdList 
*pc_possible_cpu_arch_ids(MachineState *ms)
 
 ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
   sizeof(CPUArchId) * max_cpus);
+
+initialize_topo_info(&topo_info, pcms, ms);
+
 ms->possible_cpus->len = max_cpus;
 for (i = 0; i < ms->possible_cpus->len; i++) {
 X86CPUTopoIDs topo_ids;



[Qemu-devel] [RFC 2 PATCH 08/16] i386: Cleanup and use the new epyc mode topology functions

2019-09-06 Thread Moger, Babu
Use the new epyc mode functions and delete the unused code.
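One of the pieces this cleanup touches is the "cores sharing this cache" field in the L3 branch of the cache-info encoder in the diff below. As a standalone sketch (the EAX[25:14] layout is as described in the AMD64 manual for CPUID Fn8000_001D; the helper name here is hypothetical), the computation is:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the L3 sharing computation: EAX bits 25:14 of CPUID
 * Fn8000_001D hold the number of logical processors sharing the cache,
 * minus one. For an L3 shared by one CCX, that is cores-per-CCX times
 * threads-per-core, minus one. */
static uint32_t l3_sharing_eax_bits(int l3_cores, int nr_threads)
{
    return (uint32_t)(l3_cores * nr_threads - 1) << 14;
}
```

For a 4-core, 2-thread CCX this yields 7 in the field, i.e. eight logical processors sharing the L3.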

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c |  171 +++--
 1 file changed, 48 insertions(+), 123 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ca02bc21ec..f25491a029 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -28,6 +28,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/hvf.h"
 #include "sysemu/cpus.h"
+#include "sysemu/numa.h"
 #include "kvm_i386.h"
 #include "sev_i386.h"
 
@@ -338,67 +339,19 @@ static void encode_cache_cpuid80000006(CPUCacheInfo *l2,
 }
 }
 
-/*
- * Definitions used for building CPUID Leaf 0x8000001D and 0x8000001E
- * Please refer to the AMD64 Architecture Programmer’s Manual Volume 3.
- * Define the constants to build the cpu topology. Right now, TOPOEXT
- * feature is enabled only on EPYC. So, these constants are based on
- * EPYC supported configurations. We may need to handle the cases if
- * these values change in future.
- */
-/* Maximum core complexes in a node */
-#define MAX_CCX 2
-/* Maximum cores in a core complex */
-#define MAX_CORES_IN_CCX 4
-/* Maximum cores in a node */
-#define MAX_CORES_IN_NODE 8
-/* Maximum nodes in a socket */
-#define MAX_NODES_PER_SOCKET 4
-
-/*
- * Figure out the number of nodes required to build this config.
- * Max cores in a node is 8
- */
-static int nodes_in_socket(int nr_cores)
-{
-int nodes;
-
-nodes = DIV_ROUND_UP(nr_cores, MAX_CORES_IN_NODE);
-
-   /* Hardware does not support config with 3 nodes, return 4 in that case */
-return (nodes == 3) ? 4 : nodes;
-}
-
-/*
- * Decide the number of cores in a core complex with the given nr_cores using
- * following set constants MAX_CCX, MAX_CORES_IN_CCX, MAX_CORES_IN_NODE and
- * MAX_NODES_PER_SOCKET. Maintain symmetry as much as possible
- * L3 cache is shared across all cores in a core complex. So, this will also
- * tell us how many cores are sharing the L3 cache.
- */
-static int cores_in_core_complex(int nr_cores)
-{
-int nodes;
-
-/* Check if we can fit all the cores in one core complex */
-if (nr_cores <= MAX_CORES_IN_CCX) {
-return nr_cores;
-}
-/* Get the number of nodes required to build this config */
-nodes = nodes_in_socket(nr_cores);
-
-/*
- * Divide the cores accros all the core complexes
- * Return rounded up value
- */
-return DIV_ROUND_UP(nr_cores, nodes * MAX_CCX);
-}
-
 /* Encode cache info for CPUID[8000001D] */
-static void encode_cache_cpuid8000001d(CPUCacheInfo *cache, CPUState *cs,
-uint32_t *eax, uint32_t *ebx,
-uint32_t *ecx, uint32_t *edx)
+static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
+   uint32_t *eax, uint32_t *ebx,
+   uint32_t *ecx, uint32_t *edx)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
+X86CPUTopoInfo topo_info = {
+.numa_nodes = nb_numa_nodes,
+.nr_sockets = ms->smp.sockets,
+.nr_cores = ms->smp.cores,
+.nr_threads = ms->smp.threads,
+};
+
 uint32_t l3_cores;
 assert(cache->size == cache->line_size * cache->associativity *
   cache->partitions * cache->sets);
@@ -408,10 +361,10 @@ static void encode_cache_cpuid801d(CPUCacheInfo 
*cache, CPUState *cs,
 
 /* L3 is shared among multiple cores */
 if (cache->level == 3) {
-l3_cores = cores_in_core_complex(cs->nr_cores);
-*eax |= ((l3_cores * cs->nr_threads) - 1) << 14;
+l3_cores = cores_in_ccx(&topo_info);
+*eax |= ((l3_cores * topo_info.nr_threads) - 1) << 14;
 } else {
-*eax |= ((cs->nr_threads - 1) << 14);
+*eax |= ((topo_info.nr_threads - 1) << 14);
 }
 
 assert(cache->line_size > 0);
@@ -431,56 +384,28 @@ static void encode_cache_cpuid801d(CPUCacheInfo 
*cache, CPUState *cs,
(cache->complex_indexing ? CACHE_COMPLEX_IDX : 0);
 }
 
-/* Data structure to hold the configuration info for a given core index */
-struct core_topology {
-/* core complex id of the current core index */
-int ccx_id;
-/*
- * Adjusted core index for this core in the topology
- * This can be 0,1,2,3 with max 4 cores in a core complex
- */
-int core_id;
-/* Node id for this core index */
-int node_id;
-/* Number of nodes in this config */
-int num_nodes;
-};
-
-/*
- * Build the configuration closely match the EPYC hardware. Using the EPYC
- * hardware configuration values (MAX_CCX, MAX_CORES_IN_CCX, MAX_CORES_IN_NODE)
- * right now. This could change in future.
- * nr_cores : Total number of cores in the config
- * core_id  : Core index of the current CPU
- * topo : Data structure to hold all the config info for this core index
- */
-static void build_core_topology(int nr_cores, int core_id,
-struct core_topology *topo)
-{
-int 

[Qemu-devel] [RFC 2 PATCH 06/16] hw/core: Add core complex id in X86CPU topology

2019-09-06 Thread Moger, Babu
Introduce cpu core complex id (ccx_id) in X86CPU topology.
Each CCX can have up to 4 cores sharing the same L3 cache.
This information is required to build the topology in the
new epyc mode.
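Given the fixed CCX size, the ccx_id of a core can be derived as sketched below. This is a standalone illustration, not QEMU code, and it assumes cores are numbered consecutively within a node:

```c
#include <assert.h>

/* Max cores sharing an L3 in first-generation EPYC. */
#define CORES_PER_CCX 4

/* Which core complex a core belongs to, given its index within a node:
 * cores 0-3 land in CCX 0, cores 4-7 in CCX 1, and so on. */
static int ccx_id_for_core(int core_idx_in_node)
{
    return core_idx_in_node / CORES_PER_CCX;
}
```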

Signed-off-by: Babu Moger 
---
 hw/core/machine-hmp-cmds.c |3 +++
 hw/core/machine.c  |   13 +
 hw/i386/pc.c   |   10 ++
 include/hw/i386/topology.h |1 +
 qapi/machine.json  |4 +++-
 target/i386/cpu.c  |2 ++
 target/i386/cpu.h  |1 +
 7 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/hw/core/machine-hmp-cmds.c b/hw/core/machine-hmp-cmds.c
index 1f66bda346..6c534779af 100644
--- a/hw/core/machine-hmp-cmds.c
+++ b/hw/core/machine-hmp-cmds.c
@@ -89,6 +89,9 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict)
 if (c->has_die_id) {
 monitor_printf(mon, "die-id: \"%" PRIu64 "\"\n", c->die_id);
 }
+if (c->has_ccx_id) {
+monitor_printf(mon, "ccx-id: \"%" PRIu64 "\"\n", c->ccx_id);
+}
 if (c->has_core_id) {
 monitor_printf(mon, "core-id: \"%" PRIu64 "\"\n", c->core_id);
 }
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 4034b7e903..9a8586cf30 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -694,6 +694,11 @@ void machine_set_cpu_numa_node(MachineState *machine,
 return;
 }
 
+if (props->has_ccx_id && !slot->props.has_ccx_id) {
+error_setg(errp, "ccx-id is not supported");
+return;
+}
+
 /* skip slots with explicit mismatch */
 if (props->has_thread_id && props->thread_id != slot->props.thread_id) 
{
 continue;
@@ -707,6 +712,10 @@ void machine_set_cpu_numa_node(MachineState *machine,
 continue;
 }
 
+if (props->has_ccx_id && props->ccx_id != slot->props.ccx_id) {
+continue;
+}
+
 if (props->has_socket_id && props->socket_id != slot->props.socket_id) 
{
 continue;
 }
@@ -1041,6 +1050,10 @@ static char *cpu_slot_to_string(const CPUArchId *cpu)
 if (cpu->props.has_die_id) {
 g_string_append_printf(s, "die-id: %"PRId64, cpu->props.die_id);
 }
+
+if (cpu->props.has_ccx_id) {
+g_string_append_printf(s, "ccx-id: %"PRId64, cpu->props.ccx_id);
+}
 if (cpu->props.has_core_id) {
 if (s->len) {
 g_string_append_printf(s, ", ");
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 9e1c3f9f57..f71389ad9f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2444,6 +2444,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
 topo_ids.pkg_id = cpu->socket_id;
 topo_ids.die_id = cpu->die_id;
+topo_ids.ccx_id = cpu->ccx_id;
 topo_ids.core_id = cpu->core_id;
 topo_ids.smt_id = cpu->thread_id;
cpu->apic_id = apicid_from_topo_ids(&topo_info, &topo_ids);
@@ -2489,6 +2490,13 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 }
 cpu->die_id = topo_ids.die_id;
 
+if (cpu->ccx_id != -1 && cpu->ccx_id != topo_ids.ccx_id) {
+error_setg(errp, "property ccx-id: %u doesn't match set apic-id:"
+" 0x%x (ccx-id: %u)", cpu->ccx_id, cpu->apic_id, topo_ids.ccx_id);
+return;
+}
+cpu->ccx_id = topo_ids.ccx_id;
+
 if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
 error_setg(errp, "property core-id: %u doesn't match set apic-id:"
 " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, 
topo_ids.core_id);
@@ -2896,6 +2904,8 @@ static const CPUArchIdList 
*pc_possible_cpu_arch_ids(MachineState *ms)
 ms->possible_cpus->cpus[i].props.socket_id = topo_ids.pkg_id;
 ms->possible_cpus->cpus[i].props.has_die_id = true;
 ms->possible_cpus->cpus[i].props.die_id = topo_ids.die_id;
+ms->possible_cpus->cpus[i].props.has_ccx_id = true;
+ms->possible_cpus->cpus[i].props.ccx_id = topo_ids.ccx_id;
 ms->possible_cpus->cpus[i].props.has_core_id = true;
 ms->possible_cpus->cpus[i].props.core_id = topo_ids.core_id;
 ms->possible_cpus->cpus[i].props.has_thread_id = true;
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index fb10863a66..5a61d53f05 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -170,6 +170,7 @@ static inline void x86_topo_ids_from_apicid(apic_id_t 
apicid,
 (apicid >> apicid_die_offset(nr_cores, nr_threads)) &
~(0xFFFFFFFFUL << apicid_die_width(nr_dies));
 topo_ids->pkg_id = apicid >> apicid_pkg_offset(nr_dies, nr_cores, 
nr_threads);
+topo_ids->ccx_id = 0;
 }
 
 /* Make APIC ID for the CPU 'cpu_index'
diff --git a/qapi/machine.json b/qapi/machine.json
index 6db8a7e2ec..bb7627e698 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -597,9 +597,10 @@
 # @node-id: NUMA node ID the CPU belongs to
 # @socket-id: socket number within node/board the CPU belongs 

[Qemu-devel] [RFC 2 PATCH 12/16] hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState

2019-09-06 Thread Moger, Babu
Add function pointer apic_id_from_topo_ids in PCMachineState.
Initialize with correct handler based on the mode selected.
Also rename the handler apicid_from_topo_ids to x86_apicid_from_topo_ids
for consistency. x86_apicid_from_topo_ids will be the default handler.

Signed-off-by: Babu Moger 
---
 hw/i386/pc.c   |3 ++-
 include/hw/i386/pc.h   |2 ++
 include/hw/i386/topology.h |4 ++--
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index c88de09350..959bd3821b 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2454,7 +2454,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 topo_ids.ccx_id = cpu->ccx_id;
 topo_ids.core_id = cpu->core_id;
 topo_ids.smt_id = cpu->thread_id;
-cpu->apic_id = apicid_from_topo_ids(&topo_info, &topo_ids);
+cpu->apic_id = pcms->apicid_from_topo_ids(&topo_info, &topo_ids);
 }
 
 cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, );
@@ -2831,6 +2831,7 @@ static void pc_machine_initfn(Object *obj)
 /* Initialize the apic id related handlers */
 pcms->apicid_from_cpu_idx = x86_apicid_from_cpu_idx;
 pcms->topo_ids_from_apicid = x86_topo_ids_from_apicid;
+pcms->apicid_from_topo_ids = x86_apicid_from_topo_ids;
 
 pc_system_flash_create(pcms);
 }
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 9a40f123d0..d6f1189997 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -71,6 +71,8 @@ struct PCMachineState {
 uint32_t (*apicid_from_cpu_idx)(X86CPUTopoInfo *topo_info, unsigned 
cpu_index);
 void (*topo_ids_from_apicid)(apic_id_t apicid, X86CPUTopoInfo *topo_info,
  X86CPUTopoIDs *topo_ids);
+apic_id_t (*apicid_from_topo_ids)(X86CPUTopoInfo *topo_info,
+  const X86CPUTopoIDs *topo_ids);
 
 /* Address space used by IOAPIC device. All IOAPIC interrupts
  * will be translated to MSI messages in the address space. */
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 6fd4184f07..740e66970d 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -294,7 +294,7 @@ static inline apic_id_t 
x86_apicid_from_cpu_idx_epyc(X86CPUTopoInfo *topo_info,
  *
  * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
  */
-static inline apic_id_t apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
+static inline apic_id_t x86_apicid_from_topo_ids(X86CPUTopoInfo *topo_info,
  const X86CPUTopoIDs *topo_ids)
 {
 unsigned nr_dies = topo_info->nr_dies;
@@ -356,7 +356,7 @@ static inline apic_id_t 
x86_apicid_from_cpu_idx(X86CPUTopoInfo *topo_info,
 {
 X86CPUTopoIDs topo_ids;
 x86_topo_ids_from_idx(topo_info, cpu_index, &topo_ids);
-return apicid_from_topo_ids(topo_info, &topo_ids);
+return x86_apicid_from_topo_ids(topo_info, &topo_ids);
 }
 
 #endif /* HW_I386_TOPOLOGY_H */



[Qemu-devel] [RFC 2 PATCH 05/16] hw/i386: Simplify topology Offset/width Calculation

2019-09-06 Thread Moger, Babu
Some parameters are unnecessarily passed for offset/width
calculation. Remove those parameters from function prototypes.
No functional change.
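The simplification in this patch leaves each field's bit width depending only on its own count, with offsets accumulating from the low bits up. The following is a standalone re-derivation, not QEMU code (QEMU's apicid_bitwidth_for_count() uses clz32(); a plain loop is used here, and counts are assumed to be at least 1):

```c
#include <assert.h>

/* Bit width needed for 'count' IDs (count >= 1 assumed): IDs run from
 * 0 to count - 1, so this is ceil(log2(count)). */
static unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;
    count -= 1;
    while (count) {
        width++;
        count >>= 1;
    }
    return width;
}

static unsigned smt_width(unsigned nr_threads)  { return bitwidth_for_count(nr_threads); }
static unsigned core_width(unsigned nr_cores)   { return bitwidth_for_count(nr_cores); }
static unsigned die_width(unsigned nr_dies)     { return bitwidth_for_count(nr_dies); }

/* Offsets accumulate: SMT_ID sits in the low bits, then Core_ID, then
 * Die_ID, then Pkg_ID. */
static unsigned core_offset(unsigned nr_threads)
{
    return smt_width(nr_threads);
}

static unsigned die_offset(unsigned nr_cores, unsigned nr_threads)
{
    return core_offset(nr_threads) + core_width(nr_cores);
}

static unsigned pkg_offset(unsigned nr_dies, unsigned nr_cores,
                           unsigned nr_threads)
{
    return die_offset(nr_cores, nr_threads) + die_width(nr_dies);
}
```

For example, with 1 die, 4 cores and 2 threads the core field starts at bit 1 and the package field at bit 3, so socket 1 / core 2 / thread 1 encodes to APIC ID (1 << 3) | (2 << 1) | 1 = 13.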

Signed-off-by: Babu Moger 
---
 include/hw/i386/topology.h |   45 ++--
 target/i386/cpu.c  |   12 
 2 files changed, 22 insertions(+), 35 deletions(-)

diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 906017e8e3..fb10863a66 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -73,46 +73,37 @@ static unsigned apicid_bitwidth_for_count(unsigned count)
 
 /* Bit width of the SMT_ID (thread ID) field on the APIC ID
  */
-static inline unsigned apicid_smt_width(unsigned nr_dies,
-unsigned nr_cores,
-unsigned nr_threads)
+static inline unsigned apicid_smt_width(unsigned nr_threads)
 {
 return apicid_bitwidth_for_count(nr_threads);
 }
 
 /* Bit width of the Core_ID field
  */
-static inline unsigned apicid_core_width(unsigned nr_dies,
- unsigned nr_cores,
- unsigned nr_threads)
+static inline unsigned apicid_core_width(unsigned nr_cores)
 {
 return apicid_bitwidth_for_count(nr_cores);
 }
 
 /* Bit width of the Die_ID field */
-static inline unsigned apicid_die_width(unsigned nr_dies,
-unsigned nr_cores,
-unsigned nr_threads)
+static inline unsigned apicid_die_width(unsigned nr_dies)
 {
 return apicid_bitwidth_for_count(nr_dies);
 }
 
 /* Bit offset of the Core_ID field
  */
-static inline unsigned apicid_core_offset(unsigned nr_dies,
-  unsigned nr_cores,
-  unsigned nr_threads)
+static inline unsigned apicid_core_offset(unsigned nr_threads)
 {
-return apicid_smt_width(nr_dies, nr_cores, nr_threads);
+return apicid_smt_width(nr_threads);
 }
 
 /* Bit offset of the Die_ID field */
-static inline unsigned apicid_die_offset(unsigned nr_dies,
-  unsigned nr_cores,
-   unsigned nr_threads)
+static inline unsigned apicid_die_offset(unsigned nr_cores,
+ unsigned nr_threads)
 {
-return apicid_core_offset(nr_dies, nr_cores, nr_threads) +
-   apicid_core_width(nr_dies, nr_cores, nr_threads);
+return apicid_core_offset(nr_threads) +
+   apicid_core_width(nr_cores);
 }
 
 /* Bit offset of the Pkg_ID (socket ID) field
@@ -121,8 +112,8 @@ static inline unsigned apicid_pkg_offset(unsigned nr_dies,
  unsigned nr_cores,
  unsigned nr_threads)
 {
-return apicid_die_offset(nr_dies, nr_cores, nr_threads) +
-   apicid_die_width(nr_dies, nr_cores, nr_threads);
+return apicid_die_offset(nr_cores, nr_threads) +
+   apicid_die_width(nr_dies);
 }
 
 /* Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
@@ -137,8 +128,8 @@ static inline apic_id_t apicid_from_topo_ids(X86CPUTopoInfo 
*topo_info,
 unsigned nr_threads = topo_info->nr_threads;
 
 return (topo_ids->pkg_id  << apicid_pkg_offset(nr_dies, nr_cores, 
nr_threads)) |
-   (topo_ids->die_id  << apicid_die_offset(nr_dies, nr_cores, 
nr_threads)) |
-   (topo_ids->core_id << apicid_core_offset(nr_dies, nr_cores, 
nr_threads)) |
+   (topo_ids->die_id  << apicid_die_offset(nr_cores, nr_threads)) |
+   (topo_ids->core_id << apicid_core_offset(nr_threads)) |
topo_ids->smt_id;
 }
 
@@ -171,13 +162,13 @@ static inline void x86_topo_ids_from_apicid(apic_id_t 
apicid,
 unsigned nr_threads = topo_info->nr_threads;
 
 topo_ids->smt_id = apicid &
-~(0xFFFFFFFFUL << apicid_smt_width(nr_dies, nr_cores, nr_threads));
+~(0xFFFFFFFFUL << apicid_smt_width(nr_threads));
 topo_ids->core_id =
-(apicid >> apicid_core_offset(nr_dies, nr_cores, nr_threads)) &
-~(0xFFFFFFFFUL << apicid_core_width(nr_dies, nr_cores, nr_threads));
+(apicid >> apicid_core_offset(nr_threads)) &
+~(0xFFFFFFFFUL << apicid_core_width(nr_cores));
 topo_ids->die_id =
-(apicid >> apicid_die_offset(nr_dies, nr_cores, nr_threads)) &
-~(0xFFFFFFFFUL << apicid_die_width(nr_dies, nr_cores, nr_threads));
+(apicid >> apicid_die_offset(nr_cores, nr_threads)) &
+~(0xFFFFFFFFUL << apicid_die_width(nr_dies));
 topo_ids->pkg_id = apicid >> apicid_pkg_offset(nr_dies, nr_cores, 
nr_threads);
 }
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 19751e37a7..6d7f9b6b8b 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4260,8 +4260,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,

[Qemu-devel] [RFC 2 PATCH 03/16] hw/i386: Introduce X86CPUTopoInfo to contain topology info

2019-09-06 Thread Moger, Babu
This is an effort to re-arrange a few data structures for better
readability. Add X86CPUTopoInfo, which will hold all the topology
information required to build the cpu topology. There are no
functional changes.

Signed-off-by: Babu Moger 
---
 hw/i386/pc.c   |   40 +++-
 include/hw/i386/topology.h |   40 ++--
 2 files changed, 53 insertions(+), 27 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index ada445f8f3..95aab8e5e7 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -930,11 +930,15 @@ static uint32_t x86_cpu_apic_id_from_index(PCMachineState 
*pcms,
 {
 MachineState *ms = MACHINE(pcms);
 PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+X86CPUTopoInfo topo_info;
 uint32_t correct_id;
 static bool warned;
 
-correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
- ms->smp.threads, cpu_index);
+topo_info.nr_dies = pcms->smp_dies;
+topo_info.nr_cores = ms->smp.cores;
+topo_info.nr_threads = ms->smp.threads;
+
+correct_id = x86_apicid_from_cpu_idx(&topo_info, cpu_index);
 if (pcmc->compat_apic_id_mode) {
 if (cpu_index != correct_id && !warned && !qtest_enabled()) {
 error_report("APIC IDs set in compatibility mode, "
@@ -2386,6 +2390,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 PCMachineState *pcms = PC_MACHINE(hotplug_dev);
 unsigned int smp_cores = ms->smp.cores;
 unsigned int smp_threads = ms->smp.threads;
+X86CPUTopoInfo topo_info;
 
 if(!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
 error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
@@ -2393,6 +2398,10 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 return;
 }
 
+topo_info.nr_dies = pcms->smp_dies;
+topo_info.nr_cores = smp_cores;
+topo_info.nr_threads = smp_threads;
+
 env->nr_dies = pcms->smp_dies;
 
 /*
@@ -2436,16 +2445,14 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 topo_ids.die_id = cpu->die_id;
 topo_ids.core_id = cpu->core_id;
 topo_ids.smt_id = cpu->thread_id;
-cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
-smp_threads, _ids);
+cpu->apic_id = apicid_from_topo_ids(&topo_info, &topo_ids);
 }
 
 cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, );
 if (!cpu_slot) {
 MachineState *ms = MACHINE(pcms);
 
-x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
- smp_cores, smp_threads, _ids);
+x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
 error_setg(errp,
 "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
 " APIC ID %" PRIu32 ", valid index range 0:%d",
@@ -2466,8 +2473,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 /* TODO: move socket_id/core_id/thread_id checks into x86_cpu_realizefn()
  * once -smp refactoring is complete and there will be CPU private
  * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
-x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
- smp_cores, smp_threads, _ids);
+x86_topo_ids_from_apicid(cpu->apic_id, &topo_info, &topo_ids);
 if (cpu->socket_id != -1 && cpu->socket_id != topo_ids.pkg_id) {
 error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
 " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, 
topo_ids.pkg_id);
@@ -2842,19 +2848,28 @@ static int64_t pc_get_default_cpu_node_id(const 
MachineState *ms, int idx)
 {
X86CPUTopoIDs topo_ids;
PCMachineState *pcms = PC_MACHINE(ms);
+   X86CPUTopoInfo topo_info;
+
+   topo_info.nr_dies = pcms->smp_dies;
+   topo_info.nr_cores = ms->smp.cores;
+   topo_info.nr_threads = ms->smp.threads;
 
assert(idx < ms->possible_cpus->len);
x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
-pcms->smp_dies, ms->smp.cores,
-ms->smp.threads, &topo_ids);
+&topo_info, &topo_ids);
return topo_ids.pkg_id % nb_numa_nodes;
 }
 
 static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
 {
 PCMachineState *pcms = PC_MACHINE(ms);
-int i;
 unsigned int max_cpus = ms->smp.max_cpus;
+X86CPUTopoInfo topo_info;
+int i;
+
+topo_info.nr_dies = pcms->smp_dies;
+topo_info.nr_cores = ms->smp.cores;
+topo_info.nr_threads = ms->smp.threads;
 
 if (ms->possible_cpus) {
 /*
@@ -2875,8 +2890,7 @@ static const CPUArchIdList 
*pc_possible_cpu_arch_ids(MachineState *ms)
 ms->possible_cpus->cpus[i].vcpus_count = 1;
 ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, 
i);
 x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
- 

[Qemu-devel] [RFC 2 PATCH 07/16] hw/386: Add new epyc mode topology decoding functions

2019-09-06 Thread Moger, Babu
These functions add support for building the new epyc mode topology
given smp details such as numa nodes, cores, threads and sockets.
Subsequent patches will use these functions to build the topology.

The topology details are available in Processor Programming Reference (PPR)
for AMD Family 17h Model 01h, Revision B1 Processors.
It is available at https://www.amd.com/en/support/tech-docs
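The two sizing decisions introduced below (how many nodes a package needs, and how many cores land in each CCX) can be re-derived in a standalone sketch. This is not QEMU code: the helpers take the topology values as plain arguments instead of an X86CPUTopoInfo, but use the same constants (at most 2 CCXs per node, 8 cores per node) and the same rounding:

```c
#include <assert.h>

#define MAX_CCX           2  /* core complexes per node */
#define MAX_CORES_IN_NODE 8  /* cores per node */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Nodes per package: honour an explicit NUMA config if given,
 * otherwise size by core count. */
static int nodes_in_pkg(int numa_nodes, int nr_sockets, int nr_cores)
{
    if (numa_nodes > 1) {
        return DIV_ROUND_UP(numa_nodes, nr_sockets);
    }
    return DIV_ROUND_UP(nr_cores, MAX_CORES_IN_NODE);
}

/* Cores per CCX: spread the cores evenly across all CCXs, rounding up.
 * This also tells us how many cores share an L3. */
static int cores_in_ccx(int numa_nodes, int nr_sockets, int nr_cores)
{
    int nodes = nodes_in_pkg(numa_nodes, nr_sockets, nr_cores);

    return DIV_ROUND_UP(nr_cores, nodes * MAX_CCX);
}
```

A 32-core single-socket config with no explicit NUMA layout comes out as 4 nodes with 4 cores per CCX, matching a first-generation 32-core EPYC package.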

Signed-off-by: Babu Moger 
---
 include/hw/i386/topology.h |  174 
 1 file changed, 174 insertions(+)

diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 5a61d53f05..6fd4184f07 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -62,6 +62,22 @@ typedef struct X86CPUTopoInfo {
 unsigned nr_threads;
 } X86CPUTopoInfo;
 
+/*
+ * Definitions used for building CPUID Leaf 0x8000001D and 0x8000001E
+ * Please refer to the AMD64 Architecture Programmer’s Manual Volume 3.
+ * Define the constants to build the cpu topology. Right now, TOPOEXT
+ * feature is enabled only on EPYC. So, these constants are based on
+ * EPYC supported configurations. We may need to handle the cases if
+ * these values change in future.
+ */
+
+/* Maximum core complexes in a node */
+#define MAX_CCX  2
+/* Maximum cores in a core complex */
+#define MAX_CORES_IN_CCX 4
+/* Maximum cores in a node */
+#define MAX_CORES_IN_NODE 8
+
 /* Return the bit width needed for 'count' IDs
  */
 static unsigned apicid_bitwidth_for_count(unsigned count)
@@ -116,6 +132,164 @@ static inline unsigned apicid_pkg_offset(unsigned nr_dies,
apicid_die_width(nr_dies);
 }
 
+/* Bit offset of the CCX_ID field */
+static inline unsigned apicid_ccx_offset(unsigned nr_cores,
+ unsigned nr_threads)
+{
+return apicid_core_offset(nr_threads) +
+   apicid_core_width(nr_cores);
+}
+
+/* Bit width of the CCX_ID field */
+static inline unsigned apicid_ccx_width(unsigned nr_ccxs)
+{
+return apicid_bitwidth_for_count(nr_ccxs);
+}
+
+/* Bit offset of the node_id field */
+static inline unsigned apicid_node_offset(unsigned nr_ccxs,
+  unsigned nr_cores,
+  unsigned nr_threads)
+{
+return apicid_ccx_offset(nr_cores, nr_threads) +
+   apicid_ccx_width(nr_ccxs);
+}
+
+/* Bit width of the node_id field */
+static inline unsigned apicid_node_width(unsigned nr_nodes)
+{
+return apicid_bitwidth_for_count(nr_nodes);
+}
+
+/* Bit offset of the Pkg_ID field */
+static inline unsigned apicid_pkg_offset_epyc(unsigned nr_nodes,
+  unsigned nr_ccxs,
+  unsigned nr_cores,
+  unsigned nr_threads)
+{
+return apicid_node_offset(nr_ccxs, nr_cores, nr_threads) +
+   apicid_node_width(nr_nodes);
+}
+
+/*
+ * Figure out the number of nodes required to build this config.
+ * Max cores in a node is 8
+ */
+static inline int nodes_in_pkg(X86CPUTopoInfo *topo_info)
+{
+/*
+ * Create a config with user given (nr_nodes > 1) numa node config,
+ * else go with a standard configuration
+ */
+if (topo_info->numa_nodes > 1) {
+return DIV_ROUND_UP(topo_info->numa_nodes, topo_info->nr_sockets);
+} else {
+return DIV_ROUND_UP(topo_info->nr_cores, MAX_CORES_IN_NODE);
+}
+}
+
+/*
+ * Decide the number of cores in a core complex with the given nr_cores using
+ * following set constants: MAX_CCX, MAX_CORES_IN_CCX and MAX_CORES_IN_NODE.
+ * Maintain symmetry as much as possible.
+ * L3 cache is shared across all cores in a core complex. So, this will also
+ * tell us how many cores are sharing the L3 cache.
+ */
+static inline int cores_in_ccx(X86CPUTopoInfo *topo_info)
+{
+int nodes;
+
+/* Get the number of nodes required to build this config */
+nodes = nodes_in_pkg(topo_info);
+
+/*
+ * Divide the cores across all the core complexes
+ * Return rounded up value
+ */
+return DIV_ROUND_UP(topo_info->nr_cores, nodes * MAX_CCX);
+}
+
+/*
+ * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
+ *
+ * The caller must make sure core_id < nr_cores and smt_id < nr_threads.
+ */
+static inline apic_id_t x86_apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
+                                                      const X86CPUTopoIDs *topo_ids)
+{
+unsigned nr_ccxs = MAX_CCX;
+unsigned nr_nodes = nodes_in_pkg(topo_info);
+unsigned nr_cores = MAX_CORES_IN_CCX;
+unsigned nr_threads = topo_info->nr_threads;
+
+    return (topo_ids->pkg_id << apicid_pkg_offset_epyc(nr_nodes, nr_ccxs,
+                                                       nr_cores, nr_threads)) |
+           (topo_ids->node_id << apicid_node_offset(nr_ccxs, nr_cores,
+                                                    nr_threads)) |
+   

[Qemu-devel] [RFC 2 PATCH 04/16] machine: Add SMP Sockets in CpuTopology

2019-09-06 Thread Moger, Babu
Store the smp sockets in CpuTopology. Socket information
is required to build the cpu topology in the new epyc mode.

Signed-off-by: Babu Moger 
---
 hw/core/machine.c   |1 +
 hw/i386/pc.c|1 +
 include/hw/boards.h |2 ++
 vl.c|1 +
 4 files changed, 5 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index c58a8e594e..4034b7e903 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -795,6 +795,7 @@ static void smp_parse(MachineState *ms, QemuOpts *opts)
 ms->smp.cpus = cpus;
 ms->smp.cores = cores;
 ms->smp.threads = threads;
+ms->smp.sockets = sockets;
 }
 
 if (ms->smp.cpus > 1) {
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 95aab8e5e7..9e1c3f9f57 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1609,6 +1609,7 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 ms->smp.cpus = cpus;
 ms->smp.cores = cores;
 ms->smp.threads = threads;
+ms->smp.sockets = sockets;
 pcms->smp_dies = dies;
 }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index a71d1a53a5..12eb5032a5 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -245,12 +245,14 @@ typedef struct DeviceMemoryState {
  * @cpus: the number of present logical processors on the machine
  * @cores: the number of cores in one package
  * @threads: the number of threads in one core
+ * @sockets: the number of sockets on the machine
  * @max_cpus: the maximum number of logical processors on the machine
  */
 typedef struct CpuTopology {
 unsigned int cpus;
 unsigned int cores;
 unsigned int threads;
+unsigned int sockets;
 unsigned int max_cpus;
 } CpuTopology;
 
diff --git a/vl.c b/vl.c
index 711d2ae5da..473a688779 100644
--- a/vl.c
+++ b/vl.c
@@ -3981,6 +3981,7 @@ int main(int argc, char **argv, char **envp)
 current_machine->smp.max_cpus = machine_class->default_cpus;
 current_machine->smp.cores = 1;
 current_machine->smp.threads = 1;
+current_machine->smp.sockets = 1;
 
 machine_class->smp_parse(current_machine,
 qemu_opts_find(qemu_find_opts("smp-opts"), NULL));



[Qemu-devel] [RFC 2 PATCH 02/16] hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs

2019-09-06 Thread Moger, Babu
Rename a few data structures related to X86 topology.
X86CPUTopoIDs will have the individual arch ids. The next
patch introduces X86CPUTopoInfo, which will have all the
topology information (like cores, threads etc.).

Adds node_id and ccx_id. This will be required to support the
new epyc mode. There is no functional change.

Signed-off-by: Babu Moger 
---
 hw/i386/pc.c   |   60 ++--
 include/hw/i386/topology.h |   42 ---
 2 files changed, 52 insertions(+), 50 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 549c437050..ada445f8f3 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2379,7 +2379,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 int idx;
 CPUState *cs;
 CPUArchId *cpu_slot;
-X86CPUTopoInfo topo;
+X86CPUTopoIDs topo_ids;
 X86CPU *cpu = X86_CPU(dev);
CPUX86State *env = &cpu->env;
 MachineState *ms = MACHINE(hotplug_dev);
@@ -2432,12 +2432,12 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 return;
 }
 
-topo.pkg_id = cpu->socket_id;
-topo.die_id = cpu->die_id;
-topo.core_id = cpu->core_id;
-topo.smt_id = cpu->thread_id;
+topo_ids.pkg_id = cpu->socket_id;
+topo_ids.die_id = cpu->die_id;
+topo_ids.core_id = cpu->core_id;
+topo_ids.smt_id = cpu->thread_id;
 cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
-                                        smp_threads, &topo);
+                                        smp_threads, &topo_ids);
 }
 
    cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, &idx);
@@ -2445,11 +2445,11 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 MachineState *ms = MACHINE(pcms);
 
 x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
-                                 smp_cores, smp_threads, &topo);
+                                 smp_cores, smp_threads, &topo_ids);
 error_setg(errp,
 "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
 " APIC ID %" PRIu32 ", valid index range 0:%d",
-topo.pkg_id, topo.die_id, topo.core_id, topo.smt_id,
+            topo_ids.pkg_id, topo_ids.die_id, topo_ids.core_id, topo_ids.smt_id,
 cpu->apic_id, ms->possible_cpus->len - 1);
 return;
 }
@@ -2467,34 +2467,34 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
  * once -smp refactoring is complete and there will be CPU private
  * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
 x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
-                             smp_cores, smp_threads, &topo);
-if (cpu->socket_id != -1 && cpu->socket_id != topo.pkg_id) {
+                             smp_cores, smp_threads, &topo_ids);
+if (cpu->socket_id != -1 && cpu->socket_id != topo_ids.pkg_id) {
 error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
-" 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, 
topo.pkg_id);
+" 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, 
topo_ids.pkg_id);
 return;
 }
-cpu->socket_id = topo.pkg_id;
+cpu->socket_id = topo_ids.pkg_id;
 
-if (cpu->die_id != -1 && cpu->die_id != topo.die_id) {
+if (cpu->die_id != -1 && cpu->die_id != topo_ids.die_id) {
 error_setg(errp, "property die-id: %u doesn't match set apic-id:"
-" 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo.die_id);
+" 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo_ids.die_id);
 return;
 }
-cpu->die_id = topo.die_id;
+cpu->die_id = topo_ids.die_id;
 
-if (cpu->core_id != -1 && cpu->core_id != topo.core_id) {
+if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
 error_setg(errp, "property core-id: %u doesn't match set apic-id:"
-" 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo.core_id);
+" 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, 
topo_ids.core_id);
 return;
 }
-cpu->core_id = topo.core_id;
+cpu->core_id = topo_ids.core_id;
 
-if (cpu->thread_id != -1 && cpu->thread_id != topo.smt_id) {
+if (cpu->thread_id != -1 && cpu->thread_id != topo_ids.smt_id) {
 error_setg(errp, "property thread-id: %u doesn't match set apic-id:"
-" 0x%x (thread-id: %u)", cpu->thread_id, cpu->apic_id, 
topo.smt_id);
+" 0x%x (thread-id: %u)", cpu->thread_id, cpu->apic_id, 
topo_ids.smt_id);
 return;
 }
-cpu->thread_id = topo.smt_id;
+cpu->thread_id = topo_ids.smt_id;
 
 if (hyperv_feat_enabled(cpu, HYPERV_FEAT_VPINDEX) &&
 !kvm_hv_vpindex_settable()) {
@@ -2840,14 +2840,14 @@ pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 
 static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
 {
-   X86CPUTopoInfo topo;
+   X86CPUTopoIDs topo_ids;

[Qemu-devel] [RFC 2 PATCH 01/16] numa: Split the numa functionality

2019-09-06 Thread Moger, Babu
To support the new epyc mode, we need to know the number of numa nodes
in advance to generate the apic ids correctly. So, split the numa
initialization into two parts: parse_numa initializes numa_info
and updates nb_numa_nodes, and parse_numa_node then does the numa node
initialization.

Signed-off-by: Babu Moger 
---
 hw/core/numa.c|  106 +++--
 include/sysemu/numa.h |2 +
 vl.c  |2 +
 3 files changed, 80 insertions(+), 30 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index a11431483c..27fa6b5e1d 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -55,14 +55,10 @@ bool have_numa_distance;
 NodeInfo numa_info[MAX_NODES];
 
 
-static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
+static void parse_numa_info(MachineState *ms, NumaNodeOptions *node,
 Error **errp)
 {
-Error *err = NULL;
 uint16_t nodenr;
-uint16List *cpus = NULL;
-MachineClass *mc = MACHINE_GET_CLASS(ms);
-unsigned int max_cpus = ms->smp.max_cpus;
 
 if (node->has_nodeid) {
 nodenr = node->nodeid;
@@ -81,29 +77,6 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
 return;
 }
 
-if (!mc->cpu_index_to_instance_props || !mc->get_default_cpu_node_id) {
-error_setg(errp, "NUMA is not supported by this machine-type");
-return;
-}
-for (cpus = node->cpus; cpus; cpus = cpus->next) {
-CpuInstanceProperties props;
-if (cpus->value >= max_cpus) {
-error_setg(errp,
-   "CPU index (%" PRIu16 ")"
-   " should be smaller than maxcpus (%d)",
-   cpus->value, max_cpus);
-return;
-}
-props = mc->cpu_index_to_instance_props(ms, cpus->value);
-props.node_id = nodenr;
-props.has_node_id = true;
-        machine_set_cpu_numa_node(ms, &props, &err);
-if (err) {
-error_propagate(errp, err);
-return;
-}
-}
-
 have_memdevs = have_memdevs ? : node->has_memdev;
 have_mem = have_mem ? : node->has_mem;
 if ((node->has_mem && have_memdevs) || (node->has_memdev && have_mem)) {
@@ -177,7 +150,7 @@ void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp)
 
 switch (object->type) {
 case NUMA_OPTIONS_TYPE_NODE:
-        parse_numa_node(ms, &object->u.node, &err);
+        parse_numa_info(ms, &object->u.node, &err);
 if (err) {
 goto end;
 }
@@ -242,6 +215,73 @@ end:
 return 0;
 }
 
+void set_numa_node_options(MachineState *ms, NumaOptions *object, Error **errp)
+{
+MachineClass *mc = MACHINE_GET_CLASS(ms);
+    NumaNodeOptions *node = &object->u.node;
+unsigned int max_cpus = ms->smp.max_cpus;
+uint16List *cpus = NULL;
+Error *err = NULL;
+uint16_t nodenr;
+
+    if (node->has_nodeid) {
+        nodenr = node->nodeid;
+    } else {
+        error_setg(errp, "NUMA node information is not available");
+        return;
+    }
+
+if (!mc->cpu_index_to_instance_props || !mc->get_default_cpu_node_id) {
+error_setg(errp, "NUMA is not supported by this machine-type");
+return;
+}
+
+for (cpus = node->cpus; cpus; cpus = cpus->next) {
+CpuInstanceProperties props;
+if (cpus->value >= max_cpus) {
+error_setg(errp,
+   "CPU index (%" PRIu16 ")"
+   " should be smaller than maxcpus (%d)",
+   cpus->value, max_cpus);
+return;
+ }
+ props = mc->cpu_index_to_instance_props(ms, cpus->value);
+ props.node_id = nodenr;
+ props.has_node_id = true;
+        machine_set_cpu_numa_node(ms, &props, &err);
+ if (err) {
+error_propagate(errp, err);
+return;
+ }
+}
+}
+
+static int parse_numa_node(void *opaque, QemuOpts *opts, Error **errp)
+{
+NumaOptions *object = NULL;
+MachineState *ms = MACHINE(opaque);
+Error *err = NULL;
+Visitor *v = opts_visitor_new(opts);
+
+    visit_type_NumaOptions(v, NULL, &object, &err);
+visit_free(v);
+if (err) {
+goto end;
+}
+
+if (object->type == NUMA_OPTIONS_TYPE_NODE) {
+        set_numa_node_options(ms, object, &err);
+}
+
+end:
+qapi_free_NumaOptions(object);
+if (err) {
+error_propagate(errp, err);
+return -1;
+}
+
+return 0;
+}
+
 /* If all node pair distances are symmetric, then only distances
  * in one direction are enough. If there is even one asymmetric
  * pair, though, then all distances must be provided. The
@@ -368,7 +408,7 @@ void numa_complete_configuration(MachineState *ms)
 if (ms->ram_slots > 0 && nb_numa_nodes == 0 &&
 mc->auto_enable_numa_with_memhp) {
 NumaNodeOptions node = { };
-        parse_numa_node(ms, &node, &error_abort);
+        parse_numa_info(ms, &node, &error_abort);
 }
 
 assert(max_numa_nodeid <= MAX_NODES);
@@ -448,6 +488,12 @@ void 

[Qemu-devel] [RFC 2 PATCH 00/16] APIC ID fixes for AMD EPYC CPU models

2019-09-06 Thread Moger, Babu
This series fixes the problems encoding the APIC ID for AMD EPYC cpu models.
https://bugzilla.redhat.com/show_bug.cgi?id=1728166

This is the second pass to give an idea of the changes required to address
the issue. The first pass is available at
https://patchwork.kernel.org/cover/11069785/

Currently, the apic id is decoded based on sockets/dies/cores/threads. This appears
to work for most standard configurations for AMD and other vendors. But this
decoding does not follow AMD's APIC ID enumeration, and in some cases it
causes CPU topology inconsistency. While booting, the guest kernel tries to
validate the topology and finds that it does not align with the EPYC models.

To fix the problem we need to build the topology as per the
Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
Processors. It is available at https://www.amd.com/en/support/tech-docs

Here is the text from the PPR.
2.1.10.2.1.3
ApicId Enumeration Requirements
Operating systems are expected to use
Core::X86::Cpuid::SizeId[ApicIdCoreIdSize], the number of least
significant bits in the Initial APIC ID that indicate core ID within a
processor, in constructing per-core CPUID
masks. Core::X86::Cpuid::SizeId[ApicIdCoreIdSize] determines the maximum number
of cores (MNC) that the
processor could theoretically support, not the actual number of cores that are
actually implemented or enabled on
the processor, as indicated by Core::X86::Cpuid::SizeId[NC].
Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
• ApicId[6] = Socket ID.
• ApicId[5:4] = Node ID.
• ApicId[3] = Logical CCX L3 complex ID
• ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} :
{1'b0,LogicalCoreID[1:0]}.
"""

v2:
  1. Introduced the new property epyc to enable new epyc mode.
  2. Separated the epyc mode and non epyc mode function.
  3. Introduced function pointers in PCMachineState to handle the
 differences.
  4. Mildly tested different combinations to make sure things work as expected.
  5. TODO : Setting the epyc feature bit needs to be worked out. This feature is
 supported only on AMD EPYC models. I may need some guidance on that.

v1:
  https://patchwork.kernel.org/cover/11069785/

---

Babu Moger (16):
  numa: Split the numa functionality
  hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs
  hw/i386: Introduce X86CPUTopoInfo to contain topology info
  machine: Add SMP Sockets in CpuTopology
  hw/i386: Simplify topology Offset/width Calculation
  hw/core: Add core complex id in X86CPU topology
  hw/386: Add new epyc mode topology decoding functions
  i386: Cleanup and use the new epyc mode topology functions
  hw/i386: Introduce initialize_topo_info function
  hw/i386: Introduce apicid_from_cpu_idx in PCMachineState
  Introduce-topo_ids_from_apicid-handler
  hw/i386: Introduce apic_id_from_topo_ids handler in PCMachineState
  machine: Add new epyc property in PCMachineState
  hw/i386: Introduce epyc mode function handlers
  i386: Fix pkg_id offset for epyc mode
  hw/core: Fix up the machine_set_cpu_numa_node for epyc


 hw/core/machine-hmp-cmds.c |3 
 hw/core/machine.c  |   38 ++
 hw/core/numa.c |  110 
 hw/i386/pc.c   |  143 +++--
 include/hw/boards.h|8 +
 include/hw/i386/pc.h   |9 +
 include/hw/i386/topology.h |  294 +++-
 include/sysemu/numa.h  |2 
 qapi/machine.json  |4 -
 target/i386/cpu.c  |  209 +++
 target/i386/cpu.h  |1 
 vl.c   |3 
 12 files changed, 560 insertions(+), 264 deletions(-)

--
Signature


Re: [Qemu-devel] [RFC PATCH 4/5] hw/i386: Generate apicid based on cpu_type

2019-08-01 Thread Moger, Babu
Hi Eduardo,
  Thanks for the quick comments. I will look into your comments closely
and will let you know if I have questions.

> -Original Message-
> From: Eduardo Habkost 
> Sent: Thursday, August 1, 2019 2:29 PM
> To: Moger, Babu 
> Cc: marcel.apfelb...@gmail.com; m...@redhat.com; pbonz...@redhat.com;
> r...@twiddle.net; imamm...@redhat.com; qemu-devel@nongnu.org
> Subject: Re: [RFC PATCH 4/5] hw/i386: Generate apicid based on cpu_type
> 
> Thanks for the patches.
> 
> I still haven't looked closely at all patches in the series, but
> patches 1-3 seem good on the first look.  A few comments on this
> one:
> 
> On Wed, Jul 31, 2019 at 11:20:50PM +, Moger, Babu wrote:
> > Check the cpu_type before calling the apicid functions
> > from topology.h.
> >
> > Signed-off-by: Babu Moger 
> > ---
> [...]
> > @@ -2437,16 +2478,26 @@ static void pc_cpu_pre_plug(HotplugHandler
> *hotplug_dev,
> >  topo.die_id = cpu->die_id;
> >  topo.core_id = cpu->core_id;
> >  topo.smt_id = cpu->thread_id;
> > -cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
> > -smp_threads, &topo);
> > +   if (!strncmp(ms->cpu_type, "EPYC", 4)) {
> 
> Please don't add semantics to the CPU type name.  If you want
> some behavior to be configurable per CPU type, please do it at
> the X86CPUDefinition struct.
> 
> In this specific case, maybe the new APIC ID calculation code
> could
> be conditional on:
>   (vendor == AMD) && (env->features[...] & TOPOEXT).
> 
> Also, we must keep compatibility with the old APIC ID calculation
> code on older machine types.  We need a compatibility flag to
> enable the existing APIC ID calculation.
> 
> 
> > +x86_topo_ids_from_idx_epyc(nb_numa_nodes, smp_sockets, smp_cores,
> > +   smp_threads, idx, &topo);
> > +cpu->apic_id = apicid_from_topo_ids_epyc(smp_cores, smp_threads,
> > + &topo);
> > +   } else
> 
> There's a tab character here.  Please use spaces instead of tabs.
> 
> > +cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
> > +smp_threads, &topo);
> 
> I see you are duplicating very similar logic in 3 different
> places, to call apicid_from_topo_ids() and
> x86_topo_ids_from_apicid().
> 
> Also, apicid_from_topo_ids() and x86_topo_ids_from_apicid() have very
> generic
> names, and people could call them expecting them to work for every CPU
> model
> (which they don't).  This makes the topology API very easy to misuse.
> 
> Why don't we make the existing generic
> apicid_from_topo_ids()/x86_topo_ids_from_apicid() functions work
> on all cases?  If they need additional input to handle EPYC and
> call EPYC-specific functions, we can make them get additional
> arguments.  This way we'll be sure that we'll never call the
> wrong implementation by accident.
> 
> This might make the list of arguments for
> x86_topo_ids_from_apicid() and apicid_from_topo_ids() become
> large.  We can address this by making them get a CpuTopology
> argument.
> 
> 
> In other words, the API could look like this:
> 
> 
> static inline apic_id_t apicid_from_topo_ids(const X86CPUTopology *topo,
>  const X86CPUTopologyIds *ids)
> {
> if (topo->epyc_mode) {
> return apicid_from_topo_ids_epyc(topo, ids);
> }
> 
> /* existing QEMU 4.1 logic: */
> return (ids->pkg_id  << apicid_pkg_offset(topo)) |
>(ids->die_id  << apicid_die_offset(topo)) |
>(ids->core_id << apicid_core_offset(topo)) |
>ids->smt_id;
> }
> 
> static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
> const X86CPUTopology *topo,
> X86CPUTopologyIds *ids)
> {
> if (topo->epyc_mode) {
> x86_topo_ids_from_apicid_epyc(apicid, topo, ids);
> return;
> }
> 
> /* existing QEMU 4.1 logic: */
> ids->smt_id =
> apicid &
> > ~(0xFFFFFFFFUL << apicid_smt_width(topo));
> ids->core_id =
> (apicid >> apicid_core_offset(topo)) &
> > ~(0xFFFFFFFFUL << apicid_core_width(topo));
> ids->die_id =
> (apicid >> apicid_die_offset(topo)) &
> > ~(0xFFFFFFFFUL << apicid_die_width(topo));
> ids

[Qemu-devel] [RFC PATCH 1/5] hw/boards: Add sockets in CpuTopology structure

2019-07-31 Thread Moger, Babu
Add sockets in CpuTopology. This is required when building
the CPU topology.

Signed-off-by: Babu Moger 
---
 hw/core/machine.c   | 1 +
 hw/i386/pc.c| 1 +
 include/hw/boards.h | 2 ++
 vl.c| 1 +
 4 files changed, 5 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index c58a8e594e..4034b7e903 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -795,6 +795,7 @@ static void smp_parse(MachineState *ms, QemuOpts *opts)
 ms->smp.cpus = cpus;
 ms->smp.cores = cores;
 ms->smp.threads = threads;
+ms->smp.sockets = sockets;
 }
 
 if (ms->smp.cpus > 1) {
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 549c437050..ef39463fd5 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1605,6 +1605,7 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 ms->smp.cpus = cpus;
 ms->smp.cores = cores;
 ms->smp.threads = threads;
+ms->smp.sockets = sockets;
 pcms->smp_dies = dies;
 }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index a71d1a53a5..12eb5032a5 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -245,12 +245,14 @@ typedef struct DeviceMemoryState {
  * @cpus: the number of present logical processors on the machine
  * @cores: the number of cores in one package
  * @threads: the number of threads in one core
+ * @sockets: the number of sockets on the machine
  * @max_cpus: the maximum number of logical processors on the machine
  */
 typedef struct CpuTopology {
 unsigned int cpus;
 unsigned int cores;
 unsigned int threads;
+unsigned int sockets;
 unsigned int max_cpus;
 } CpuTopology;
 
diff --git a/vl.c b/vl.c
index b426b32134..d8faf5ab43 100644
--- a/vl.c
+++ b/vl.c
@@ -3981,6 +3981,7 @@ int main(int argc, char **argv, char **envp)
 current_machine->smp.max_cpus = machine_class->default_cpus;
 current_machine->smp.cores = 1;
 current_machine->smp.threads = 1;
+current_machine->smp.sockets = 1;
 
 machine_class->smp_parse(current_machine,
 qemu_opts_find(qemu_find_opts("smp-opts"), NULL));
-- 
2.20.1




[Qemu-devel] [RFC PATCH 5/5] i386: Fix pkg_id offset EPYC

2019-07-31 Thread Moger, Babu
Per Processor Programming Reference (PPR) for AMD Family 17h Models,
the pkg_id offset in apicid is 6. Fix the offset based on EPYC models.

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index be4583068c..235496a9c1 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4079,7 +4079,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 MachineState *ms = MACHINE(qdev_get_machine());
 X86CPU *cpu = env_archcpu(env);
 CPUState *cs = env_cpu(env);
-uint32_t die_offset;
+uint32_t die_offset, pkg_offset;
 uint32_t limit;
 uint32_t signature[3];
 
@@ -4102,6 +4102,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 index = env->cpuid_level;
 }
 
+if (!strncmp(ms->cpu_type, "EPYC", 4))
+pkg_offset = PKG_OFFSET_EPYC;
+else
+pkg_offset = apicid_pkg_offset(env->nr_dies, cs->nr_cores,
+   cs->nr_threads);
+
 switch(index) {
 case 0:
 *eax = env->cpuid_level;
@@ -4260,8 +4266,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 *ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
 break;
 case 1:
-*eax = apicid_pkg_offset(env->nr_dies,
- cs->nr_cores, cs->nr_threads);
+*eax = pkg_offset;
 *ebx = cs->nr_cores * cs->nr_threads;
 *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
 break;
@@ -4297,8 +4302,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 *ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
 break;
 case 2:
-*eax = apicid_pkg_offset(env->nr_dies, cs->nr_cores,
-   cs->nr_threads);
+*eax = pkg_offset;
 *ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
 *ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
 break;
-- 
2.20.1




[Qemu-devel] [RFC PATCH 3/5] i386: Use topology functions from topology.h

2019-07-31 Thread Moger, Babu
Use the functions defined in topology.h and remove the old code.

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c | 146 +-
 1 file changed, 27 insertions(+), 119 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 19751e37a7..be4583068c 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -28,6 +28,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/hvf.h"
 #include "sysemu/cpus.h"
+#include "sysemu/numa.h"
 #include "kvm_i386.h"
 #include "sev_i386.h"
 
@@ -338,64 +339,8 @@ static void encode_cache_cpuid80000006(CPUCacheInfo *l2,
 }
 }
 
-/*
- * Definitions used for building CPUID Leaf 0x8000001D and 0x8000001E
- * Please refer to the AMD64 Architecture Programmer’s Manual Volume 3.
- * Define the constants to build the cpu topology. Right now, TOPOEXT
- * feature is enabled only on EPYC. So, these constants are based on
- * EPYC supported configurations. We may need to handle the cases if
- * these values change in future.
- */
-/* Maximum core complexes in a node */
-#define MAX_CCX 2
-/* Maximum cores in a core complex */
-#define MAX_CORES_IN_CCX 4
-/* Maximum cores in a node */
-#define MAX_CORES_IN_NODE 8
-/* Maximum nodes in a socket */
-#define MAX_NODES_PER_SOCKET 4
-
-/*
- * Figure out the number of nodes required to build this config.
- * Max cores in a node is 8
- */
-static int nodes_in_socket(int nr_cores)
-{
-int nodes;
-
-nodes = DIV_ROUND_UP(nr_cores, MAX_CORES_IN_NODE);
-
-   /* Hardware does not support config with 3 nodes, return 4 in that case */
-return (nodes == 3) ? 4 : nodes;
-}
-
-/*
- * Decide the number of cores in a core complex with the given nr_cores using
- * following set constants MAX_CCX, MAX_CORES_IN_CCX, MAX_CORES_IN_NODE and
- * MAX_NODES_PER_SOCKET. Maintain symmetry as much as possible
- * L3 cache is shared across all cores in a core complex. So, this will also
- * tell us how many cores are sharing the L3 cache.
- */
-static int cores_in_core_complex(int nr_cores)
-{
-int nodes;
-
-/* Check if we can fit all the cores in one core complex */
-if (nr_cores <= MAX_CORES_IN_CCX) {
-return nr_cores;
-}
-/* Get the number of nodes required to build this config */
-nodes = nodes_in_socket(nr_cores);
-
-/*
- * Divide the cores accros all the core complexes
- * Return rounded up value
- */
-return DIV_ROUND_UP(nr_cores, nodes * MAX_CCX);
-}
-
 /* Encode cache info for CPUID[8000001D] */
-static void encode_cache_cpuid8000001d(CPUCacheInfo *cache, CPUState *cs,
+static void encode_cache_cpuid8000001d(CPUCacheInfo *cache, MachineState *ms,
 uint32_t *eax, uint32_t *ebx,
 uint32_t *ecx, uint32_t *edx)
 {
@@ -408,10 +353,10 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache, CPUState *cs,
 
 /* L3 is shared among multiple cores */
 if (cache->level == 3) {
-l3_cores = cores_in_core_complex(cs->nr_cores);
-*eax |= ((l3_cores * cs->nr_threads) - 1) << 14;
+l3_cores = cores_in_ccx(nb_numa_nodes, ms->smp.sockets, ms->smp.cores);
+*eax |= ((l3_cores * ms->smp.threads) - 1) << 14;
 } else {
-*eax |= ((cs->nr_threads - 1) << 14);
+*eax |= ((ms->smp.threads - 1) << 14);
 }
 
 assert(cache->line_size > 0);
@@ -431,55 +376,19 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache, CPUState *cs,
(cache->complex_indexing ? CACHE_COMPLEX_IDX : 0);
 }
 
-/* Data structure to hold the configuration info for a given core index */
-struct core_topology {
-/* core complex id of the current core index */
-int ccx_id;
-/*
- * Adjusted core index for this core in the topology
- * This can be 0,1,2,3 with max 4 cores in a core complex
- */
-int core_id;
-/* Node id for this core index */
-int node_id;
-/* Number of nodes in this config */
-int num_nodes;
-};
-
-/*
- * Build the configuration closely match the EPYC hardware. Using the EPYC
- * hardware configuration values (MAX_CCX, MAX_CORES_IN_CCX, MAX_CORES_IN_NODE)
- * right now. This could change in future.
- * nr_cores : Total number of cores in the config
- * core_id  : Core index of the current CPU
- * topo : Data structure to hold all the config info for this core index
- */
-static void build_core_topology(int nr_cores, int core_id,
-struct core_topology *topo)
-{
-int nodes, cores_in_ccx;
-
-/* First get the number of nodes required */
-nodes = nodes_in_socket(nr_cores);
-
-cores_in_ccx = cores_in_core_complex(nr_cores);
-
-topo->node_id = core_id / (cores_in_ccx * MAX_CCX);
-topo->ccx_id = (core_id % (cores_in_ccx * MAX_CCX)) / cores_in_ccx;
-topo->core_id = core_id % cores_in_ccx;
-topo->num_nodes = nodes;
-}
-
 /* Encode cache info for CPUID[8000001E] */
-static void encode_topo_cpuid8000001e(CPUState *cs, X86CPU *cpu,
+static void 

[Qemu-devel] [RFC PATCH 2/5] hw/i386: Add AMD EPYC topology encoding

2019-07-31 Thread Moger, Babu
Currently, the apicid is a sequential number in x86 cpu models. This
works fine for most cases. But in certain cases it results in cpu
topology inconsistency. This problem was observed in
AMD EPYC cpu models.

To address that we need to build apicid as per the hardware specification.
We are attempting to build the topology close to the hardware as much as
possible.

Per Processor Programming Reference (PPR) for AMD Family 17h Models,
we need to follow the following decoding.

2.1.10.2.1.3
ApicId Enumeration Requirements
Operating systems are expected to use
Core::X86::Cpuid::SizeId[ApicIdCoreIdSize], the number of least
significant bits in the Initial APIC ID that indicate core ID within a
processor, in constructing per-core CPUID
masks. Core::X86::Cpuid::SizeId[ApicIdCoreIdSize] determines the maximum
number of cores (MNC) that the
processor could theoretically support, not the actual number of cores that
are actually implemented or enabled on
the processor, as indicated by Core::X86::Cpuid::SizeId[NC].
Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
• ApicId[6] = Socket ID.
• ApicId[5:4] = Node ID.
• ApicId[3] = Logical CCX L3 complex ID
• ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} :
{1'b0,LogicalCoreID[1:0]}.

Add the structures and functions to decode the ApicId for AMD EPYC models.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1728166

Signed-off-by: Babu Moger 
---
 include/hw/i386/topology.h | 140 +
 1 file changed, 140 insertions(+)

diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 4ff5b2da6c..0b2fd46b41 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -50,6 +50,7 @@ typedef struct X86CPUTopoInfo {
 unsigned die_id;
 unsigned core_id;
 unsigned smt_id;
+unsigned ccx_id;
 } X86CPUTopoInfo;
 
 /* Return the bit width needed for 'count' IDs
@@ -179,4 +180,143 @@ static inline apic_id_t x86_apicid_from_cpu_idx(unsigned nr_dies,
     return apicid_from_topo_ids(nr_dies, nr_cores, nr_threads, &topo);
 }
 
+/* Definitions used for building CPUID Leaf 0x8000001D and 0x8000001E
+ * Please refer to the AMD64 Architecture Programmer’s Manual Volume 3.
+ * Define the constants to build the cpu topology. Right now, TOPOEXT
+ * feature is enabled only on EPYC. So, these constants are based on
+ * EPYC supported configurations. We may need to handle the cases if
+ * these values change in future.
+ */
+
+/* Maximum core complexes in a node */
+#define MAX_CCX  2
+/* Maximum cores in a core complex */
+#define MAX_CORES_IN_CCX 4
+/* Maximum cores in a node */
+#define MAX_CORES_IN_DIE 8
+#define CCX_OFFSET_EPYC  3
+#define DIE_OFFSET_EPYC  4
+#define PKG_OFFSET_EPYC  6
+
+/* Figure out the number of dies required to build this config.
+ * Max cores in a die is 8
+ */
+static inline int dies_in_socket(int numa_nodes, int smp_sockets, int smp_cores)
+{
+/*
+ * Create a config with user given (nr_nodes > 1) numa node config,
+ * else go with a standard configuration
+ */
+if (numa_nodes > 1)
+return DIV_ROUND_UP(numa_nodes, smp_sockets);
+else
+return DIV_ROUND_UP(smp_cores, MAX_CORES_IN_DIE);
+}
+
+/* Decide the number of cores in a core complex for the given number of
+ * cores, using the constants MAX_CCX, MAX_CORES_IN_CCX, MAX_CORES_IN_DIE
+ * and MAX_NODES_PER_SOCKET. Maintain symmetry as much as possible.
+ * The L3 cache is shared across all cores in a core complex, so this also
+ * tells us how many cores are sharing the L3 cache.
+ */
+static inline int cores_in_ccx(int numa_nodes, int smp_sockets, int smp_cores)
+{
+int dies;
+
+/* Check if we can fit all the cores in one core complex */
+if (smp_cores <= MAX_CORES_IN_CCX) {
+return smp_cores;
+}
+
+/* Get the number of dies required to build this config */
+dies = dies_in_socket(numa_nodes, smp_sockets, smp_cores);
+
+/*
+ * Divide the cores across all the core complexes.
+ * Return the rounded-up value.
+ */
+return DIV_ROUND_UP(smp_cores, dies * MAX_CCX);
+}
+
+/* Calculate thread/core/package IDs for a specific topology,
+ * based on APIC ID
+ */
+static inline void x86_topo_ids_from_apicid_epyc(apic_id_t apicid,
+ unsigned smp_dies,
+ unsigned smp_cores,
+ unsigned smp_threads,
+ X86CPUTopoInfo *topo)
+{
+topo->smt_id = apicid &
+   ~(0xFFFFFFFFUL << apicid_smt_width(smp_dies, smp_cores, smp_threads));
+topo->core_id = (apicid >> apicid_core_offset(smp_dies, smp_cores, smp_threads)) &
+   ~(0xFFFFFFFFUL << apicid_core_width(smp_dies, smp_cores, smp_threads));

[Qemu-devel] [RFC PATCH 0/5] APIC ID fixes for AMD EPYC CPU models

2019-07-31 Thread Moger, Babu
These series fixes the problems encoding APIC ID for AMD EPYC cpu models.
https://bugzilla.redhat.com/show_bug.cgi?id=1728166

This is the first pass to give an idea of the changes required to address
the issue. Please feel free to comment.

Currently, the APIC ID is decoded based on sockets/dies/cores/threads. This
appears to work for most standard configurations for AMD and other vendors,
but this decoding does not follow AMD's APIC ID enumeration. In some cases it
causes CPU topology inconsistency: while booting, the guest kernel tries to
validate the topology and finds that it does not align with the EPYC models.

To fix the problem we need to build the topology as per the 
Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
Processors. It is available at https://www.amd.com/en/support/tech-docs

Here is the text from the PPR.
2.1.10.2.1.3
ApicId Enumeration Requirements
Operating systems are expected to use
Core::X86::Cpuid::SizeId[ApicIdCoreIdSize], the number of least
significant bits in the Initial APIC ID that indicate core ID within a
processor, in constructing per-core CPUID
masks. Core::X86::Cpuid::SizeId[ApicIdCoreIdSize] determines the maximum number
of cores (MNC) that the
processor could theoretically support, not the actual number of cores that are
actually implemented or enabled on
the processor, as indicated by Core::X86::Cpuid::SizeId[NC].
Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
• ApicId[6] = Socket ID.
• ApicId[5:4] = Node ID.
• ApicId[3] = Logical CCX L3 complex ID.
• ApicId[2:0] = (SMT) ? {LogicalCoreID[1:0], ThreadId} :
{1'b0, LogicalCoreID[1:0]}.
"""

Babu Moger (5):
  hw/boards: Add sockets in CpuTopology structure
  hw/i386: Add AMD EPYC topology encoding
  i386: Use topology functions from topology.h
  hw/i386: Generate apicid based on cpu_type
  i386: Fix pkg_id offset EPYC

 hw/core/machine.c  |   1 +
 hw/i386/pc.c   |  82 ---
 include/hw/boards.h|   2 +
 include/hw/i386/topology.h | 140 
 target/i386/cpu.c  | 160 +
 vl.c   |   1 +
 6 files changed, 251 insertions(+), 135 deletions(-)

-- 
2.20.1



[Qemu-devel] [RFC PATCH 4/5] hw/i386: Generate apicid based on cpu_type

2019-07-31 Thread Moger, Babu
Check the cpu_type before calling the apicid functions
from topology.h.

Signed-off-by: Babu Moger 
---
 hw/i386/pc.c | 81 +---
 1 file changed, 70 insertions(+), 11 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index ef39463fd5..dad55c940f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -947,6 +947,36 @@ static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
 }
 }
 
+/* Calculates initial APIC ID for a specific CPU index
+ *
+ * Currently we need to be able to calculate the APIC ID from the CPU index
+ * alone (without requiring a CPU object), as the QEMU<->Seabios interfaces have
+ * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
+ * all CPUs up to max_cpus.
+ */
+static uint32_t x86_cpu_apic_id_from_index_epyc(PCMachineState *pcms,
+unsigned int cpu_index)
+{
+MachineState *ms = MACHINE(pcms);
+PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+uint32_t correct_id;
+static bool warned;
+
+correct_id = x86_apicid_from_cpu_idx_epyc(nb_numa_nodes, ms->smp.sockets,
+ ms->smp.cores, ms->smp.threads,
+ cpu_index);
+if (pcmc->compat_apic_id_mode) {
+if (cpu_index != correct_id && !warned && !qtest_enabled()) {
+error_report("APIC IDs set in compatibility mode, "
+ "CPU topology won't match the configuration");
+warned = true;
+}
+return cpu_index;
+} else {
+return correct_id;
+}
+}
+
 static void pc_build_smbios(PCMachineState *pcms)
 {
 uint8_t *smbios_tables, *smbios_anchor;
@@ -1619,7 +1649,7 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
 {
 PCMachineState *pcms = PC_MACHINE(ms);
-int64_t apic_id = x86_cpu_apic_id_from_index(pcms, id);
+int64_t apic_id;
 Error *local_err = NULL;
 
 if (id < 0) {
@@ -1627,6 +1657,11 @@ void pc_hot_add_cpu(MachineState *ms, const int64_t id, 
Error **errp)
 return;
 }
 
+if (!strncmp(ms->cpu_type, "EPYC", 4))
+    apic_id = x86_cpu_apic_id_from_index_epyc(pcms, id);
+else
+    apic_id = x86_cpu_apic_id_from_index(pcms, id);
+
 if (apic_id >= ACPI_CPU_HOTPLUG_ID_LIMIT) {
 error_setg(errp, "Unable to add CPU: %" PRIi64
", resulting APIC ID (%" PRIi64 ") is too large",
@@ -1658,8 +1693,13 @@ void pc_cpus_init(PCMachineState *pcms)
  *
  * This is used for FW_CFG_MAX_CPUS. See comments on bochs_bios_init().
  */
-pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
- ms->smp.max_cpus - 1) + 1;
+if (!strncmp(ms->cpu_type, "EPYC", 4))
+    pcms->apic_id_limit = x86_cpu_apic_id_from_index_epyc(pcms,
+                                                          ms->smp.max_cpus - 1) + 1;
+else
+    pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
+                                                     ms->smp.max_cpus - 1) + 1;
+
 possible_cpus = mc->possible_cpu_arch_ids(ms);
 for (i = 0; i < ms->smp.cpus; i++) {
 pc_new_cpu(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
@@ -2387,6 +2427,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 PCMachineState *pcms = PC_MACHINE(hotplug_dev);
 unsigned int smp_cores = ms->smp.cores;
 unsigned int smp_threads = ms->smp.threads;
+unsigned int smp_sockets = ms->smp.sockets;
 
 if(!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
 error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
@@ -2437,16 +2478,26 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 topo.die_id = cpu->die_id;
 topo.core_id = cpu->core_id;
 topo.smt_id = cpu->thread_id;
-cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
-                                    smp_threads, &topo);
+if (!strncmp(ms->cpu_type, "EPYC", 4)) {
+    x86_topo_ids_from_idx_epyc(nb_numa_nodes, smp_sockets, smp_cores,
+                               smp_threads, idx, &topo);
+    cpu->apic_id = apicid_from_topo_ids_epyc(smp_cores, smp_threads,
+                                             &topo);
+} else
+    cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
+                                        smp_threads, &topo);
 }
 
 cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, );
 if (!cpu_slot) {
 MachineState *ms = MACHINE(pcms);
 
-x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
-                         smp_cores, smp_threads, &topo);
+if (!strncmp(ms->cpu_type, "EPYC", 4))
+    x86_topo_ids_from_apicid_epyc(cpu->apic_id, pcms->smp_dies,

Re: [Qemu-devel] [PATCH] i386: Disable TOPOEXT by default on "-cpu host"

2018-08-13 Thread Moger, Babu
Looks good. Did some basic testing.

Reviewed-by: Babu Moger 

> -Original Message-
> From: Richard W.M. Jones 
> Sent: Friday, August 10, 2018 2:41 AM
> To: Eduardo Habkost 
> Cc: qemu-devel@nongnu.org; Paolo Bonzini ;
> Richard Henderson ; Moger, Babu
> 
> Subject: Re: [PATCH] i386: Disable TOPOEXT by default on "-cpu host"
> 
> On Thu, Aug 09, 2018 at 07:18:52PM -0300, Eduardo Habkost wrote:
> > Enabling TOPOEXT is always allowed, but it can't be enabled
> > blindly by "-cpu host" because it may make guests crash if the
> > rest of the cache topology information isn't provided or isn't
> > consistent.
> >
> > This addresses the bug reported at:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1613277
> >
> > Signed-off-by: Eduardo Habkost 
> > ---
> >  target/i386/cpu.c | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 723e02221e..3ac627978f 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -849,6 +849,12 @@ static FeatureWordInfo
> feature_word_info[FEATURE_WORDS] = {
> >  },
> >  .cpuid_eax = 0x8001, .cpuid_reg = R_ECX,
> >  .tcg_features = TCG_EXT3_FEATURES,
> > +/*
> > + * TOPOEXT is always allowed but can't be enabled blindly by
> > + * "-cpu host", as it requires consistent cache topology info
> > + * to be provided so it doesn't confuse guests.
> > + */
> > +.no_autoenable_flags = CPUID_EXT3_TOPOEXT,
> >  },
> >  [FEAT_C000_0001_EDX] = {
> >  .feat_names = {
> 
> Can confirm that this fixes the problem observed on the original AMD
> Phenom machine, using qemu from git (@6ad908053) + your patch.
> Therefore:
> 
> Tested-by: Richard W.M. Jones 
> 
> Thanks,
> 
> Rich.
> 
> --
> Richard Jones, Virtualization Group, Red Hat
> http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-p2v converts physical machines to virtual machines.  Boot with a
> live CD or over the network (PXE) and turn machines into KVM guests.
> http://libguestfs.org/virt-v2v



Re: [Qemu-devel] [PATCH for-3.0] i386: Rename enum CacheType members

2018-07-18 Thread Moger, Babu


> -Original Message-
> From: Aleksandar Markovic [mailto:amarko...@wavecomp.com]
> Sent: Wednesday, July 18, 2018 8:35 AM
> To: Philippe Mathieu-Daudé ; Eduardo Habkost
> ; qemu-devel@nongnu.org
> Cc: Moger, Babu ; Paolo Bonzini
> ; Aurelien Jarno ; Richard
> Henderson 
> Subject: Re: [PATCH for-3.0] i386: Rename enum CacheType members
> 
> 
> On 07/17/2018 04:40 PM, Eduardo Habkost wrote:
> > Rename DCACHE to DATA_CACHE and ICACHE to INSTRUCTION_CACHE.
> >
> > This avoids conflict with Linux asm/cachectl.h macros and fixes
> > build failure on mips hosts.
> >
> > Reported-by: Philippe Mathieu-Daudé 
> > Signed-off-by: Eduardo Habkost 
> 
> Acked-by: Aleksandar Markovic 
> 
Reviewed-by: Babu Moger 




Re: [Qemu-devel] [PATCH] pc: Fix typo on PC_COMPAT_2_12

2018-07-09 Thread Moger, Babu
Looks good. thanks

> -Original Message-
> From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> Sent: Monday, July 2, 2018 8:10 PM
> To: qemu-devel@nongnu.org
> Cc: Eduardo Habkost ; Paolo Bonzini
> ; Moger, Babu ; Michael
> S. Tsirkin ; Igor Mammedov 
> Subject: [PATCH] pc: Fix typo on PC_COMPAT_2_12
> 
> I forgot a hyphen when amending the compat code on commit
> e0051647 ("i386: Enable TOPOEXT feature on AMD EPYC CPU").
> 
> Fixes: e00516475c270dcb6705753da96063f95699abf2
> Signed-off-by: Eduardo Habkost 

Reviewed-by: Babu Moger 

> ---
> Bug detected by compat_checker:
> https://github.com/ehabkost/gdb-qemu
> ---
>  include/hw/i386/pc.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 4d99d69681..654003f44c 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -309,7 +309,7 @@ bool e820_get_entry(int, uint32_t, uint64_t *,
> uint64_t *);
>  .property = "xlevel",\
>  .value= stringify(0x8000000a),\
>  },{\
> -.driver   = "EPYC-IBPB" TYPE_X86_CPU,\
> +.driver   = "EPYC-IBPB-" TYPE_X86_CPU,\
>  .property = "xlevel",\
>  .value= stringify(0x8000000a),\
>  },
> --
> 2.18.0.rc1.1.g3f1ff2140




Re: [Qemu-devel] [PATCH v15 1/3] i386: Fix up the Node id for CPUID_8000_001E

2018-06-15 Thread Moger, Babu



> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org]
> On Behalf Of Babu Moger
> Sent: Friday, June 15, 2018 12:16 PM
> To: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; ehabk...@redhat.com
> Cc: qemu-devel@nongnu.org; mtosa...@redhat.com; k...@vger.kernel.org;
> k...@tripleback.net; ge...@hostfission.com; Moger, Babu
> 
> Subject: [PATCH v15 1/3] i386: Fix up the Node id for CPUID_8000_001E
> 
> This is part of topoext support. To keep the compatibility, we need
> to support all the combination of nr_cores and nr_threads currently
> supported. With this combination, we might end up with more nodes than
> we can support with real hardware. We need to fix up the node id to
> accommodate more nodes here. We can achieve this by shifting the bits.
> 
> Signed-off-by: Babu Moger 
> ---
>  target/i386/cpu.c | 23 ++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 7a4484b..5246be4 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -19,6 +19,7 @@
> 
>  #include "qemu/osdep.h"
>  #include "qemu/cutils.h"
> +#include "qemu/bitops.h"
> 
>  #include "cpu.h"
>  #include "exec/exec-all.h"
> @@ -472,6 +473,8 @@ static void encode_topo_cpuid8000001e(CPUState *cs, X86CPU *cpu,
> uint32_t *ecx, uint32_t *edx)
>  {
>  struct core_topology topo = {0};
> +unsigned long nodes;
> +int shift;
> 
>  build_core_topology(cs->nr_cores, cpu->core_id, &topo);
>  *eax = cpu->apic_id;
> @@ -504,7 +507,25 @@ static void encode_topo_cpuid8000001e(CPUState *cs, X86CPU *cpu,
>   * 2  Socket id
>   *   1:0  Node id
>   */
> -*ecx = ((topo.num_nodes - 1) << 8) | (cpu->socket_id << 2) | topo.node_id;
> +if (topo.num_nodes <= 4) {
> +*ecx = ((topo.num_nodes - 1) << 8) | (cpu->socket_id << 2) |
> +topo.node_id;
> +} else {
> +/*
> + * Node id fix up. Actual hardware supports up to 4 nodes. But with
> + * more than 32 cores, we may end up with more than 4 nodes.
> + * Node id is a combination of socket id and node id. The only
> + * requirement here is that this number should be unique across the
> + * system. Shift the socket id to accommodate more nodes. We don't
> + * expect both socket id and node id to be big numbers at the same
> + * time. This is not an ideal config but we need to support it.
> + * Max bit we can have here is 8.
> + */
> +nodes = topo.num_nodes - 1;
> +shift = find_last_bit(&nodes, 8);
> +*ecx = ((topo.num_nodes - 1) << 8) | (cpu->socket_id << shift) |

Sorry, there is a bug here. This should be (cpu->socket_id << (shift + 1)). I
will fix it. Let me know about the rest of the code.


> +topo.node_id;
> +}
>  *edx = 0;
>  }
> 
> --
> 1.8.3.1




Re: [Qemu-devel] [PATCH v14 5/6] i386: Disable TOPOEXT feature if it cannot be supported

2018-06-15 Thread Moger, Babu



> -Original Message-
> From: Moger, Babu
> Sent: Thursday, June 14, 2018 6:09 PM
> To: Moger, Babu ; Eduardo Habkost
> 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> Subject: RE: [PATCH v14 5/6] i386: Disable TOPOEXT feature if it cannot be
> supported
> 
> 
> > -Original Message-
> > From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org]
> > On Behalf Of Moger, Babu
> > Sent: Thursday, June 14, 2018 5:19 PM
> > To: Eduardo Habkost 
> > Cc: m...@redhat.com; marcel.apfelb...@gmail.com;
> pbonz...@redhat.com;
> > r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> > k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> > Subject: RE: [PATCH v14 5/6] i386: Disable TOPOEXT feature if it cannot be
> > supported
> >
> >
> >
> > > -Original Message-----
> > > From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> > > Sent: Thursday, June 14, 2018 2:13 PM
> > > To: Moger, Babu 
> > > Cc: m...@redhat.com; marcel.apfelb...@gmail.com;
> > pbonz...@redhat.com;
> > > r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> > > k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> > > Subject: Re: [PATCH v14 5/6] i386: Disable TOPOEXT feature if it cannot be
> > > supported
> > >
> > > On Wed, Jun 13, 2018 at 09:18:26PM -0400, Babu Moger wrote:
> > > > Disable the TOPOEXT feature if it cannot be supported.
> > > > We cannot support this feature with more than 2 nr_threads
> > > > or more than 32 cores in a socket.
> > > >
> > > > Signed-off-by: Babu Moger 
> > > > ---
> > > >  target/i386/cpu.c | 17 -
> > > >  1 file changed, 16 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > > index 2eb26da..637d8eb 100644
> > > > --- a/target/i386/cpu.c
> > > > +++ b/target/i386/cpu.c
> > > > @@ -4765,7 +4765,7 @@ static void x86_cpu_realizefn(DeviceState
> *dev,
> > > Error **errp)
> > > >  X86CPUClass *xcc = X86_CPU_GET_CLASS(dev);
> > > >  CPUX86State *env = &cpu->env;
> > > >  Error *local_err = NULL;
> > > > -static bool ht_warned;
> > > > +static bool ht_warned, topo_warned;
> > > >
> > > >  if (xcc->host_cpuid_required && !accel_uses_host_cpuid()) {
> > > >  char *name = x86_cpu_class_get_model_name(xcc);
> > > > @@ -4779,6 +4779,21 @@ static void x86_cpu_realizefn(DeviceState
> > *dev,
> > > Error **errp)
> > > >  return;
> > > >  }
> > > >
> > > > +/* Disable TOPOEXT if topology cannot be supported */
> > > > +if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) {
> > > > +if (!topology_supports_topoext(MAX_CORES_IN_NODE *
> > > MAX_NODES_PER_SOCKET,
> > > > +  2)) {
> > >
> > > I understand you stopped using cpu->nr_cores/cpu->nr_threads
> > > because it was not filled yet.
> > >
> > > But why exactly do you need to do this before calling
> > > x86_cpu_expand_features()?
> >
> > We extend the xlevel in x86_cpu_expand_features based on the TOPOEXT
> > feature.
> > So, I thought it would be right to do that way.
> > >
> > > If you really need nr_cores and nr_threads to be available
> > > earlier, we could simply move their initialization to
> > > cpu_exec_initfn() instead of the solution you implemented in
> > > patch 4/6.
> > >
> > > > +env->features[FEAT_8000_0001_ECX] &=
> > !CPUID_EXT3_TOPOEXT;
> > >
> > > !CPUID_EXT3_TOPOEXT is 0, this will clear all bits in
> > > env->features[FEAT_8000_0001_ECX].  Did you mean
> > > ~CPUID_EXT3_TOPOEXT?
> >
> > Yes. That is correct.  Sorry.. I missed it.
> > >
> > >
> > > > +if (!topo_warned) {
> > > > +error_report("TOPOEXT feature cannot be supported with
> > more"
> > > > + " than %d cores or more than 2 threads 
> > > > per socket."
> > > > +

Re: [Qemu-devel] [PATCH v14 2/6] i386: Enable TOPOEXT feature on AMD EPYC CPU

2018-06-14 Thread Moger, Babu



> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org]
> On Behalf Of Moger, Babu
> Sent: Thursday, June 14, 2018 3:41 PM
> To: Eduardo Habkost 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> Subject: RE: [PATCH v14 2/6] i386: Enable TOPOEXT feature on AMD EPYC
> CPU
> 
> 
> 
> > -Original Message-
> > From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> > Sent: Thursday, June 14, 2018 1:40 PM
> > To: Moger, Babu 
> > Cc: m...@redhat.com; marcel.apfelb...@gmail.com;
> pbonz...@redhat.com;
> > r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> > k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> > Subject: Re: [PATCH v14 2/6] i386: Enable TOPOEXT feature on AMD EPYC
> > CPU
> >
> > On Wed, Jun 13, 2018 at 09:18:23PM -0400, Babu Moger wrote:
> > > Enable TOPOEXT feature on EPYC CPU. This is required to support
> > > hyperthreading on VM guests. Also extend xlevel to 0x8000001E.
> > >
> > > Signed-off-by: Babu Moger 
> > > ---
> > >  target/i386/cpu.c | 11 +--
> > >  1 file changed, 9 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > index 86fb1a4..2eb26da 100644
> > > --- a/target/i386/cpu.c
> > > +++ b/target/i386/cpu.c
> > > @@ -2554,7 +2554,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
> > >  .features[FEAT_8000_0001_ECX] =
> > >  CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
> > >  CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A |
> > CPUID_EXT3_ABM |
> > > -CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM |
> CPUID_EXT3_LAHF_LM,
> > > +CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM |
> CPUID_EXT3_LAHF_LM
> > |
> > > +CPUID_EXT3_TOPOEXT,
> > >  .features[FEAT_7_0_EBX] =
> > >  CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
> > CPUID_7_0_EBX_AVX2 |
> > >  CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 |
> > CPUID_7_0_EBX_RDSEED |
> > > @@ -2599,7 +2600,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
> > >  .features[FEAT_8000_0001_ECX] =
> > >  CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
> > >  CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A |
> > CPUID_EXT3_ABM |
> > > -CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM |
> CPUID_EXT3_LAHF_LM,
> > > +CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM |
> CPUID_EXT3_LAHF_LM
> > |
> > > +CPUID_EXT3_TOPOEXT,
> > >  .features[FEAT_8000_0008_EBX] =
> > >  CPUID_8000_0008_EBX_IBPB,
> > >  .features[FEAT_7_0_EBX] =
> >
> > This part is OK, but it requires patch 3/6 to be included in the
> > same patch.
> 
> Ok. Sure.
> >
> > > @@ -4667,6 +4669,11 @@ static void x86_cpu_expand_features(X86CPU
> > *cpu, Error **errp)
> > > >  x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000000A);
> > >  }
> > >
> > > +/* TOPOEXT feature requires 0x8000001E */
> > > +if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) {
> > > +x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000001E);
> > > +}
> >
> > This part needs to be done more carefully to avoid breaking
> > compatibility.  "-machine pc-q35-2.12 -cpu Opteron_G5,+topoext"
> > currently results in xlevel=0x8000001A, and this must not change.

Not sure if this could be a problem.  "+topoext" sets the feature bits, but
xlevel is still 0x8000001A, which does not look right.
I need to verify this case.
 
> Ok.
> >
> > I suggest just setting .xlevel=0x8000001E on EPYC at
> > builtin_x86_defs[1], and worry about automatically increasing
> > xlevel later.
> Ok.
> >
> > (If you change EPYC.xlevel in builtin_x86_defs, don't forget to
> > set EPYC.xlevel=0x8000000A on PC_COMPAT_2_12)
> Sure.
> >
> > --
> > Eduardo



Re: [Qemu-devel] [PATCH v14 5/6] i386: Disable TOPOEXT feature if it cannot be supported

2018-06-14 Thread Moger, Babu


> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org]
> On Behalf Of Moger, Babu
> Sent: Thursday, June 14, 2018 5:19 PM
> To: Eduardo Habkost 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> Subject: RE: [PATCH v14 5/6] i386: Disable TOPOEXT feature if it cannot be
> supported
> 
> 
> 
> > -Original Message-
> > From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> > Sent: Thursday, June 14, 2018 2:13 PM
> > To: Moger, Babu 
> > Cc: m...@redhat.com; marcel.apfelb...@gmail.com;
> pbonz...@redhat.com;
> > r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> > k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> > Subject: Re: [PATCH v14 5/6] i386: Disable TOPOEXT feature if it cannot be
> > supported
> >
> > On Wed, Jun 13, 2018 at 09:18:26PM -0400, Babu Moger wrote:
> > > Disable the TOPOEXT feature if it cannot be supported.
> > > We cannot support this feature with more than 2 nr_threads
> > > or more than 32 cores in a socket.
> > >
> > > Signed-off-by: Babu Moger 
> > > ---
> > >  target/i386/cpu.c | 17 -
> > >  1 file changed, 16 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > index 2eb26da..637d8eb 100644
> > > --- a/target/i386/cpu.c
> > > +++ b/target/i386/cpu.c
> > > @@ -4765,7 +4765,7 @@ static void x86_cpu_realizefn(DeviceState *dev,
> > Error **errp)
> > >  X86CPUClass *xcc = X86_CPU_GET_CLASS(dev);
> > >  CPUX86State *env = &cpu->env;
> > >  Error *local_err = NULL;
> > > -static bool ht_warned;
> > > +static bool ht_warned, topo_warned;
> > >
> > >  if (xcc->host_cpuid_required && !accel_uses_host_cpuid()) {
> > >  char *name = x86_cpu_class_get_model_name(xcc);
> > > @@ -4779,6 +4779,21 @@ static void x86_cpu_realizefn(DeviceState
> *dev,
> > Error **errp)
> > >  return;
> > >  }
> > >
> > > +/* Disable TOPOEXT if topology cannot be supported */
> > > +if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) {
> > > +if (!topology_supports_topoext(MAX_CORES_IN_NODE *
> > MAX_NODES_PER_SOCKET,
> > > +  2)) {
> >
> > I understand you stopped using cpu->nr_cores/cpu->nr_threads
> > because it was not filled yet.
> >
> > But why exactly do you need to do this before calling
> > x86_cpu_expand_features()?
> 
> We extend the xlevel in x86_cpu_expand_features based on the TOPOEXT
> feature.
> So, I thought it would be right to do that way.
> >
> > If you really need nr_cores and nr_threads to be available
> > earlier, we could simply move their initialization to
> > cpu_exec_initfn() instead of the solution you implemented in
> > patch 4/6.
> >
> > > +env->features[FEAT_8000_0001_ECX] &=
> !CPUID_EXT3_TOPOEXT;
> >
> > !CPUID_EXT3_TOPOEXT is 0, this will clear all bits in
> > env->features[FEAT_8000_0001_ECX].  Did you mean
> > ~CPUID_EXT3_TOPOEXT?
> 
> Yes. That is correct.  Sorry.. I missed it.
> >
> >
> > > +if (!topo_warned) {
> > > +error_report("TOPOEXT feature cannot be supported with
> more"
> > > + " than %d cores or more than 2 threads per 
> > > socket."
> > > + " Disabling the feature.",
> > > + (MAX_CORES_IN_NODE * MAX_NODES_PER_SOCKET));
> > > +topo_warned = true;
> >
> > This will print a warning for "-cpu EPYC -smp 64,cores=64".
> > We shouldn't.

If we support all the values, we may not need this. 
> >
> > I'm starting to believe we shouldn't add TOPOEXT to EPYC unless
> > we're ready to make the TOPOEXT CPUID leaves work for all valid
> > -smp configurations.  If the feature will work only on a few
> > specific cases, the feature should be enabled explicitly using
> > "-cpu ...,+topoext".
> >
> > Is it really impossible to make CPUID return reasonable topology
> > data for larger nr_cores and nr_threads values?  It would make
> > everything much simpler.
> 
> I am starting to think about this.  We tried to limit the configuration
> based on the actual hardware configuration.
> If we leave that decision to the user then we might allow whatever config
> the user wants.
> I need to make some changes in the topology (0x8000001E) information to
> make this work.
> 

One more thought: we can allow all the configurations. If the user creates a
supported configuration, it will work perfectly fine. If the user creates an
unsupported configuration (like more threads, more cores, etc.), we still
create the topology, but it will not be an ideal topology.
The reason is that I don't want to mess up the good configurations to support
invalid configs. That way we don't have to change anything in the topology
code now.
 
> >
> > --
> > Eduardo



Re: [Qemu-devel] [PATCH v14 5/6] i386: Disable TOPOEXT feature if it cannot be supported

2018-06-14 Thread Moger, Babu



> -Original Message-
> From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> Sent: Thursday, June 14, 2018 2:13 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> Subject: Re: [PATCH v14 5/6] i386: Disable TOPOEXT feature if it cannot be
> supported
> 
> On Wed, Jun 13, 2018 at 09:18:26PM -0400, Babu Moger wrote:
> > Disable the TOPOEXT feature if it cannot be supported.
> > We cannot support this feature with more than 2 nr_threads
> > or more than 32 cores in a socket.
> >
> > Signed-off-by: Babu Moger 
> > ---
> >  target/i386/cpu.c | 17 -
> >  1 file changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 2eb26da..637d8eb 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -4765,7 +4765,7 @@ static void x86_cpu_realizefn(DeviceState *dev,
> Error **errp)
> >  X86CPUClass *xcc = X86_CPU_GET_CLASS(dev);
> >  CPUX86State *env = &cpu->env;
> >  Error *local_err = NULL;
> > -static bool ht_warned;
> > +static bool ht_warned, topo_warned;
> >
> >  if (xcc->host_cpuid_required && !accel_uses_host_cpuid()) {
> >  char *name = x86_cpu_class_get_model_name(xcc);
> > @@ -4779,6 +4779,21 @@ static void x86_cpu_realizefn(DeviceState *dev,
> Error **errp)
> >  return;
> >  }
> >
> > +/* Disable TOPOEXT if topology cannot be supported */
> > +if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) {
> > +if (!topology_supports_topoext(MAX_CORES_IN_NODE *
> MAX_NODES_PER_SOCKET,
> > +  2)) {
> 
> I understand you stopped using cpu->nr_cores/cpu->nr_threads
> because it was not filled yet.
> 
> But why exactly do you need to do this before calling
> x86_cpu_expand_features()?

We extend the xlevel in x86_cpu_expand_features based on the TOPOEXT feature,
so I thought it would be right to do it that way.
> 
> If you really need nr_cores and nr_threads to be available
> earlier, we could simply move their initialization to
> cpu_exec_initfn() instead of the solution you implemented in
> patch 4/6.
> 
> > +env->features[FEAT_8000_0001_ECX] &= !CPUID_EXT3_TOPOEXT;
> 
> !CPUID_EXT3_TOPOEXT is 0, this will clear all bits in
> env->features[FEAT_8000_0001_ECX].  Did you mean
> ~CPUID_EXT3_TOPOEXT?

Yes, that is correct.  Sorry, I missed it.
> 
> 
> > +if (!topo_warned) {
> > +error_report("TOPOEXT feature cannot be supported with 
> > more"
> > + " than %d cores or more than 2 threads per 
> > socket."
> > + " Disabling the feature.",
> > + (MAX_CORES_IN_NODE * MAX_NODES_PER_SOCKET));
> > +topo_warned = true;
> 
> This will print a warning for "-cpu EPYC -smp 64,cores=64".
> We shouldn't.
> 
> I'm starting to believe we shouldn't add TOPOEXT to EPYC unless
> we're ready to make the TOPOEXT CPUID leaves work for all valid
> -smp configurations.  If the feature will work only on a few
> specific cases, the feature should be enabled explicitly using
> "-cpu ...,+topoext".
> 
> Is it really impossible to make CPUID return reasonable topology
> data for larger nr_cores and nr_threads values?  It would make
> everything much simpler.

I am starting to think about this.  We tried to limit the configuration based
on the actual hardware configuration.
If we leave that decision to the user then we might allow whatever config the
user wants.
I need to make some changes in the topology (0x8000001E) information to make
this work.

> 
> --
> Eduardo



Re: [Qemu-devel] [PATCH v14 3/6] i386: Disable TOPOEXT feature on pc-2.12

2018-06-14 Thread Moger, Babu


> -Original Message-
> From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> Sent: Thursday, June 14, 2018 1:41 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> Subject: Re: [PATCH v14 3/6] i386: Disable TOPOEXT feature on pc-2.12
> 
> On Wed, Jun 13, 2018 at 09:18:24PM -0400, Babu Moger wrote:
> > Disable TOPOEXT feature for older machines.
> >
> > Signed-off-by: Babu Moger 
> > ---
> > >  include/hw/i386/pc.h | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > index 04d1f8c..ecccf6b 100644
> > --- a/include/hw/i386/pc.h
> > +++ b/include/hw/i386/pc.h
> > @@ -303,6 +303,10 @@ bool e820_get_entry(int, uint32_t, uint64_t *,
> uint64_t *);
> > >  .driver   = TYPE_X86_CPU,\
> > >  .property = "legacy-cache",\
> > >  .value    = "on",\
> > > +},{\
> > > +.driver   = TYPE_X86_CPU,\
> > > +.property = "topoext",\
> > > +.value    = "off",\
> > >  },
> 
> This is OK, if combined with the first hunks of patch 2/6.
Ok. Sure.
> 
> --
> Eduardo



Re: [Qemu-devel] [PATCH v14 2/6] i386: Enable TOPOEXT feature on AMD EPYC CPU

2018-06-14 Thread Moger, Babu



> -Original Message-
> From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> Sent: Thursday, June 14, 2018 1:40 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> Subject: Re: [PATCH v14 2/6] i386: Enable TOPOEXT feature on AMD EPYC
> CPU
> 
> On Wed, Jun 13, 2018 at 09:18:23PM -0400, Babu Moger wrote:
> > Enable TOPOEXT feature on EPYC CPU. This is required to support
> > hyperthreading on VM guests. Also extend xlevel to 0x8000001E.
> >
> > Signed-off-by: Babu Moger 
> > ---
> >  target/i386/cpu.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 86fb1a4..2eb26da 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -2554,7 +2554,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
> >  .features[FEAT_8000_0001_ECX] =
> >  CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
> >  CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
> > -CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM,
> > +CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
> > +CPUID_EXT3_TOPOEXT,
> >  .features[FEAT_7_0_EBX] =
> >  CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 |
> >  CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_RDSEED |
> > @@ -2599,7 +2600,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
> >  .features[FEAT_8000_0001_ECX] =
> >  CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
> >  CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
> > -CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM,
> > +CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
> > +CPUID_EXT3_TOPOEXT,
> >  .features[FEAT_8000_0008_EBX] =
> >  CPUID_8000_0008_EBX_IBPB,
> >  .features[FEAT_7_0_EBX] =
> 
> This part is OK, but it requires patch 3/6 to be included in the
> same patch.

Ok. Sure.
> 
> > @@ -4667,6 +4669,11 @@ static void x86_cpu_expand_features(X86CPU
> *cpu, Error **errp)
> >  x86_cpu_adjust_level(cpu, >cpuid_min_xlevel, 0x800A);
> >  }
> >
> > +/* TOPOEXT feature requires 0x801E */
> > +if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) {
> > +x86_cpu_adjust_level(cpu, >cpuid_min_xlevel, 0x801E);
> > +}
> 
> This part needs to be done more carefully to avoid breaking
> compatibility.  "-machine pc-q35-2.12 -cpu Opteron_G5,+topoext"
> currently results in xlevel=0x8000001A, and this must not change.
Ok.
> 
> I suggest just setting .xlevel=0x8000001E on EPYC in
> builtin_x86_defs[], and worrying about automatically increasing
> xlevel later.
Ok.
> 
> (If you change EPYC.xlevel in builtin_x86_defs, don't forget to
> set EPYC.xlevel=0x8000000A on PC_COMPAT_2_12)
Sure.
> 
> --
> Eduardo



Re: [Qemu-devel] [PATCH v14 1/6] i386: Set TOPOEXT unconditionally for compatibility

2018-06-14 Thread Moger, Babu



> -Original Message-
> From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> Sent: Wednesday, June 13, 2018 9:22 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> Subject: Re: [PATCH v14 1/6] i386: Set TOPOEXT unconditionally for
> compatibility
> 
> On Wed, Jun 13, 2018 at 09:18:22PM -0400, Babu Moger wrote:
> > Enabling the TOPOEXT feature might cause compatibility issues if
> > older kernels do not set this feature. Let's set this feature
> > unconditionally.
> >
> > Signed-off-by: Babu Moger 
> > ---
> >  target/i386/kvm.c | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > index 445e0e0..6f2cca7 100644
> > --- a/target/i386/kvm.c
> > +++ b/target/i386/kvm.c
> > @@ -372,6 +372,12 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
> >  if (host_tsx_blacklisted()) {
> >  ret &= ~(CPUID_7_0_EBX_RTM | CPUID_7_0_EBX_HLE);
> >  }
> > +} else if (function == 0x80000001 && reg == R_ECX) {
> > +/* Enabling topoext feature might cause compatibility issues if
> > + * older kernels do not set this feature. Let's set this feature
> > + * unconditionally.
> > + */
> 
> Thanks.  I will apply and rewrite the comment as:
Sure. Thanks
> 
> /*
>  * It's safe to enable TOPOEXT even if it's not returned
>  * by GET_SUPPORTED_CPUID.  Unconditionally enabling
>  * TOPOEXT here let us keep CPU models runnable on
>  * older kernels even when TOPOEXT is enabled.
>  */
> 
> > +ret |= CPUID_EXT3_TOPOEXT;
> > +} else if (function == 0x80000001 && reg == R_EDX) {
> >  /* On Intel, kvm returns cpuid according to the Intel spec,
> >   * so add missing bits according to the AMD spec:
> > --
> > 1.8.3.1
> >
> 
> --
> Eduardo



Re: [Qemu-devel] [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC CPU

2018-06-13 Thread Moger, Babu



> -Original Message-
> From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> Sent: Wednesday, June 13, 2018 1:49 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com; Jiri
> Denemark 
> Subject: Re: [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC
> CPU
> 
> On Wed, Jun 13, 2018 at 06:21:58PM +, Moger, Babu wrote:
> >
> >
> > > -Original Message-
> > > From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> > > Sent: Wednesday, June 13, 2018 1:18 PM
> > > To: Moger, Babu 
> > > Cc: m...@redhat.com; marcel.apfelb...@gmail.com;
> pbonz...@redhat.com;
> > > r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> > > k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com; Jiri
> > > Denemark 
> > > Subject: Re: [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC
> > > CPU
> > >
> > > On Wed, Jun 13, 2018 at 06:10:30PM +, Moger, Babu wrote:
> > > > > -Original Message-
> > > > > From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> > > > > Sent: Wednesday, June 13, 2018 12:18 PM
> > > > > To: Moger, Babu 
> > > > > Cc: m...@redhat.com; marcel.apfelb...@gmail.com;
> > > pbonz...@redhat.com;
> > > > > r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> > > > > k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com;
> Jiri
> > > > > Denemark 
> > > > > Subject: Re: [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD
> EPYC
> > > > > CPU
> > > > >
> > > > > On Wed, Jun 13, 2018 at 04:52:18PM +, Moger, Babu wrote:
> > > > > [...]
> > > > > > > What do you think our options are here?
> > > > > >
> > > > > > Should we drop automatic topoext completely and move forward?
> > > > > > What are your thoughts?
> > > > >
> > > > > Let's drop automatic topoext by now, and see if we find solutions
> > > > > later.  I don't want to hold the rest of the patches because of
> > > > > this.
> > > >
> > > > Ok. I will drop topoext.
> > > >
> > > > >
> > > > > I'm thinking we could simply make kvm_arch_get_supported_cpuid()
> > > > > always return TOPOEXT on AMD CPUs, because the feature flag doesn't
> > > > > really depend on any KVM code to work (is that correct?).
> > > >
> > > > Yes, that is correct. I don't see any dependent code on TOPOEXT in
> KVM
> > > driver.
> > > >
> > > > Ok. Let me add TOPOEXT flag for all the AMD cpus and see how it goes.
> > >
> > > Hmm, this could actually solve all of our problems, then:
> > >
> > > We can forget about auto-topoext: just add TOPOEXT in
> > > kvm_arch_get_supported_cpuid(), add TOPOEXT unconditionally to
> > > the CPU models where you are interested into (EPYC only?), and
> > > add topoext=off to pc-2.12 compat_props.
> > >
> >
> > Ok Sure.
> 
> Sorry, I forgot we still need to decide what to do if TOPOEXT is
> enabled in the CPU model (or command-line) but the -smp options
> are not compatible with it.

Yes, I have kept that check, but I had to implement 
topology_supports_topoext a bit differently.
The reason is that we need to disable this feature before 
x86_cpu_expand_features, but the problem is that nr_cores and nr_threads 
are not populated at that point; they are populated in qemu_init_vcpus.
Please take a look at topology_supports_topoext again.
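The shape of a topology_supports_topoext-style check follows from the warning text quoted earlier in the thread (more than MAX_CORES_IN_NODE * MAX_NODES_PER_SOCKET cores, or more than 2 threads, cannot be encoded). The constants below are illustrative assumptions, not the values from the actual patches:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative limits only: the real MAX_CORES_IN_NODE and
 * MAX_NODES_PER_SOCKET values come from the EPYC topology patches
 * and are an assumption here. */
#define MAX_CORES_IN_NODE    8
#define MAX_NODES_PER_SOCKET 4
#define MAX_THREADS_PER_CORE 2

/* True when the requested -smp topology fits the encodable ranges. */
static bool topology_supports_topoext(int nr_cores, int nr_threads)
{
    return nr_cores <= MAX_CORES_IN_NODE * MAX_NODES_PER_SOCKET &&
           nr_threads <= MAX_THREADS_PER_CORE;
}
```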
> 
> In other words, what should guest see on CPUID if using:
> 
> "-machine pc-q35-3.0 -cpu EPYC -smp 64,cores=64"
> or:
> "-machine pc-q35-3.0 -cpu Opteron_G5,+topoext -smp 64,cores=64"
> 
Tested both of these cases. They work fine, with some warning messages.

> I wonder what would happen if we just return zeroes on
> CPUID[0x8000001E] if !topology_supports_topoext(), instead of
> trying to clear/set TOPOEXT depending on the -smp option?  It
> would make things much simpler for QEMU and libvirt.

I did not see a difference in behavior between clearing the bit and returning 0s.
Sending new patches now. Please review.
One note: I will be going on vacation from June 20th for a couple of weeks.
If possible I would like to close out this feature; if we cannot, that is fine.
Just an FYI.
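The zero-return alternative floated above can be sketched as follows (illustrative only, not the actual QEMU CPUID code; the real leaf packs core and node IDs into EBX/ECX, which is omitted here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* If the -smp topology cannot be encoded, hand the guest an all-zero
 * 0x8000001E leaf instead of toggling the TOPOEXT feature bit. */
static void cpuid_8000001e(bool topo_ok, uint32_t apic_id,
                           uint32_t *eax, uint32_t *ebx,
                           uint32_t *ecx, uint32_t *edx)
{
    if (!topo_ok) {
        *eax = *ebx = *ecx = *edx = 0;   /* leaf reads as zeroes */
        return;
    }
    *eax = apic_id;   /* extended APIC ID */
    *ebx = 0;         /* core ID / threads-per-core (omitted here) */
    *ecx = 0;         /* node ID / nodes-per-processor (omitted here) */
    *edx = 0;         /* reserved */
}
```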

> 
> --
> Eduardo



Re: [Qemu-devel] [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC CPU

2018-06-13 Thread Moger, Babu



> -Original Message-
> From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> Sent: Wednesday, June 13, 2018 1:18 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com; Jiri
> Denemark 
> Subject: Re: [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC
> CPU
> 
> On Wed, Jun 13, 2018 at 06:10:30PM +, Moger, Babu wrote:
> > > -Original Message-
> > > From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> > > Sent: Wednesday, June 13, 2018 12:18 PM
> > > To: Moger, Babu 
> > > Cc: m...@redhat.com; marcel.apfelb...@gmail.com;
> pbonz...@redhat.com;
> > > r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> > > k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com; Jiri
> > > Denemark 
> > > Subject: Re: [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC
> > > CPU
> > >
> > > On Wed, Jun 13, 2018 at 04:52:18PM +, Moger, Babu wrote:
> > > [...]
> > > > > What do you think our options are here?
> > > >
> > > > Should we drop automatic topoext completely and move forward?
> > > > What are your thoughts?
> > >
> > > Let's drop automatic topoext by now, and see if we find solutions
> > > later.  I don't want to hold the rest of the patches because of
> > > this.
> >
> > Ok. I will drop topoext.
> >
> > >
> > > I'm thinking we could simply make kvm_arch_get_supported_cpuid()
> > > always return TOPOEXT on AMD CPUs, because the feature flag doesn't
> > > really depend on any KVM code to work (is that correct?).
> >
> > Yes, that is correct. I don't see any dependent code on TOPOEXT in KVM
> driver.
> >
> > Ok. Let me add TOPOEXT flag for all the AMD cpus and see how it goes.
> 
> Hmm, this could actually solve all of our problems, then:
> 
> We can forget about auto-topoext: just add TOPOEXT in
> kvm_arch_get_supported_cpuid(), add TOPOEXT unconditionally to
> the CPU models where you are interested into (EPYC only?), and
> add topoext=off to pc-2.12 compat_props.
> 

Ok Sure.

> Sorry for not noticing that before.  I was incorrectly assuming

No problem.

> that TOPOEXT was safe to enable only if it was returned by
> GET_SUPPORTED_CPUID.
> 
> --
> Eduardo



Re: [Qemu-devel] [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC CPU

2018-06-13 Thread Moger, Babu



> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org]
> On Behalf Of Moger, Babu
> Sent: Wednesday, June 13, 2018 1:11 PM
> To: Eduardo Habkost 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com; Jiri
> Denemark 
> Subject: RE: [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC
> CPU
> 
> 
> > -Original Message-
> > From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> > Sent: Wednesday, June 13, 2018 12:18 PM
> > To: Moger, Babu 
> > Cc: m...@redhat.com; marcel.apfelb...@gmail.com;
> pbonz...@redhat.com;
> > r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> > k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com; Jiri
> > Denemark 
> > Subject: Re: [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC
> > CPU
> >
> > On Wed, Jun 13, 2018 at 04:52:18PM +, Moger, Babu wrote:
> > [...]
> > > > What do you think our options are here?
> > >
> > > Should we drop automatic topoext completely and move forward?
> > > What are your thoughts?
> >
> > Let's drop automatic topoext by now, and see if we find solutions
> > later.  I don't want to hold the rest of the patches because of
> > this.
> 
> Ok. I will drop topoext.

Sorry, I meant automatic topoext.

> 
> >
> > I'm thinking we could simply make kvm_arch_get_supported_cpuid()
> > always return TOPOEXT on AMD CPUs, because the feature flag doesn't
> > really depend on any KVM code to work (is that correct?).
> 
> Yes, that is correct. I don't see any dependent code on TOPOEXT in KVM
> driver.
> 
> Ok. Let me add TOPOEXT flag for all the AMD cpus and see how it goes.
> 
> >
> > --
> > Eduardo



Re: [Qemu-devel] [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC CPU

2018-06-13 Thread Moger, Babu


> -Original Message-
> From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> Sent: Wednesday, June 13, 2018 12:18 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com; Jiri
> Denemark 
> Subject: Re: [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC
> CPU
> 
> On Wed, Jun 13, 2018 at 04:52:18PM +, Moger, Babu wrote:
> [...]
> > > What do you think our options are here?
> >
> > Should we drop automatic topoext completely and move forward?
> > What are your thoughts?
> 
> Let's drop automatic topoext by now, and see if we find solutions
> later.  I don't want to hold the rest of the patches because of
> this.

Ok. I will drop topoext.

> 
> I'm thinking we could simply make kvm_arch_get_supported_cpuid()
> always return TOPOEXT on AMD CPUs, because the feature flag doesn't
> really depend on any KVM code to work (is that correct?).

Yes, that is correct. I don't see any dependent code on TOPOEXT in the KVM driver.

Ok. Let me add TOPOEXT flag for all the AMD cpus and see how it goes.
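The approach converged on here (unconditionally reporting TOPOEXT as supported for leaf 0x80000001 ECX) can be sketched as a standalone filter; the R_ECX value below is illustrative, not QEMU's actual register-index constant:

```c
#include <assert.h>
#include <stdint.h>

#define CPUID_EXT3_TOPOEXT (1u << 22)
#define R_ECX 1   /* register index; the value here is illustrative */

/* Whatever the kernel's GET_SUPPORTED_CPUID returned for leaf
 * 0x80000001 ECX, report TOPOEXT as supported: the flag needs no
 * KVM-side code to work. */
static uint32_t filter_supported(uint32_t function, int reg, uint32_t ret)
{
    if (function == 0x80000001u && reg == R_ECX) {
        ret |= CPUID_EXT3_TOPOEXT;
    }
    return ret;
}
```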

> 
> --
> Eduardo



Re: [Qemu-devel] [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC CPU

2018-06-13 Thread Moger, Babu


> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org]
> On Behalf Of Babu Moger
> Sent: Tuesday, June 12, 2018 2:47 PM
> To: Eduardo Habkost 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com; Jiri
> Denemark 
> Subject: Re: [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC
> CPU
> 
> 
> 
> On 06/12/2018 02:05 PM, Eduardo Habkost wrote:
> > On Tue, Jun 12, 2018 at 06:38:08PM +, Moger, Babu wrote:
> > [...]
> >>> I'm starting to think that enabling TOPOEXT automatically is
> >>> adding too much complexity and compatibility problems, and it's
> >>> better to leave this task to management software.
> >>>
> >>> The main problem here is:
> >>>
> >>> This works today with QEMU 2.12 + Linux <= 4.15:
> >>>$ $QEMU -machine pc -cpu EPYC,enforce -smp 8,sockets=2,cores=2,threads=2
> >>> and must keep working with QEMU 3.0 and Linux <= 4.15.
> >>>
> >>> In addition to that, the results for:
> >>>$ $QEMU -machine pc-q35-3.0 -cpu EPYC,enforce [...]
> >>> must be deterministic and expose exactly the same CPUID data even
> >>> if host hardware or software changes, as long as the QEMU
> >>> command-line is the same.
> >>>
> >>> Do you see a way to fulfill those two constraints while making
> >>> "-machine pc-q35-3.0 -cpu EPYC" enable TOPOEXT automatically?
> >>>
> >> Now(setting feature before x86_cpu_expand_features), enabling
> >> TOPOEXT appears to work fine.
> > What about the above constraints?  Are you really fulfilling
> > them?
> >
> > This one is tricky:
> >
> > ] This works today with QEMU 2.12 + Linux <= 4.15:
> > ]   $ $QEMU -machine pc -cpu EPYC,enforce -smp 8,sockets=2,cores=2,threads=2
> > ] and must keep working with QEMU 3.0 and Linux <= 4.15.
>   This works fine on kernel <= 4.15 with some warnings (-smp 8,sockets=2,cores=2,threads=2 -cpu EPYC).
> 
> qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:EDX.rdtscp [bit 27]
> qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.topoext [bit 22]
> qemu-system-x86_64: This family of AMD CPU doesn't support hyperthreading(2). Please configure -smp options properly or try enabling topoext feature.
> continues..
> >
> > If we enable TOPOEXT unconditionally, the command-line won't work
> > with Linux <= 4.15.
> >
> > If we enable TOPOEXT only if the kernel returns TOPOEXT on
> > GET_SUPPORTED_CPUID, we break the second constraint:
> >
> > ] The results for:
> > ]   $ $QEMU -machine pc-q35-3.0 -cpu EPYC,enforce [...]
> > ] must be deterministic and expose exactly the same CPUID data even
> > ] if host hardware or software changes, as long as the QEMU
> > ] command-line is the same.
> >
> This fails on kernel <= 4.15 with the following messages (-smp 8,sockets=2,cores=2,threads=2 -cpu EPYC,enforce):
> qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:EDX.rdtscp [bit 27]
> qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.topoext [bit 22]
> qemu-system-x86_64: Host doesn't support requested features
> exits..
> 
> What do you think our options are here?

Should we drop automatic topoext completely and move forward? What are your 
thoughts?



Re: [Qemu-devel] [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC CPU

2018-06-12 Thread Moger, Babu



> -Original Message-
> From: Eduardo Habkost [mailto:ehabk...@redhat.com]
> Sent: Tuesday, June 12, 2018 12:40 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com; Jiri
> Denemark 
> Subject: Re: [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC
> CPU
> 
> On Tue, Jun 12, 2018 at 04:29:25PM +, Moger, Babu wrote:
> > > [...]
> > > > > +/* TOPOEXT feature requires 0x8000001E */
> > > > > +if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) {
> > > > > +x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000001E);
> > > > > +}
> > > >
> > > > I suggest moving this hunk to a separate patch.  I'm not 100%
> > > > sure yet if this will require compat_props code to disable
> > > > auto-xlevel-increase on older machine-types.
> > >
> > > The problem here is that:
> > >   $QEMU -machine pc-i440fx-1.3 -cpu Opteron_G4,+topoext
> > > currently results in xlevel=0x8000001A, since QEMU 1.3.
> > >
> > > (The same applies to all machine-types between 1.3 and 2.12)
> > >
> > > I was hoping that we could declare topoext as non-migration-safe,
> > > but I believe libvirt will already include "topoext" when using
> > > "host-model" if the host CPU supports TOPOEXT.  Jiri, can you
> > > confirm that?
> > >
> > > We can address that with a "x-topoext-auto-xlevel" property, set
> > > to true on all CPU models by default, and disabled by
> > > PC_COMPAT_2_12.
> > >
> > > The code would become:
> > >
> > > if (cpu->topoext_auto_xlevel && env->features[FEAT_8000_0001_ECX] &
> > > CPUID_EXT3_TOPOEXT) {
> > > x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000001E);
> > > }
> > >
> > > Or, we could simply declare that "-cpu Opteron_G4,+topoext" will
> > > never increase xlevel automatically (on any machine-type), and
> > > change the code above to:
> > >
> > > if (cpu->auto_topoext && env->features[FEAT_8000_0001_ECX] &
> > > CPUID_EXT3_TOPOEXT) {
> > > x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000001E);
> > > }
> >
> > I was going to do this.  But there is one problem.  We don't
> > set the CPUID_EXT3_TOPOEXT in CPU model table. So this won't
> > work.
> 
> Won't this work if the auto_topoext handling is done before
> x86_cpu_expand_features() is called?

Yes. That works. We can go with this approach.

> 
> 
> > One more thing I noticed that feature setting should happen
> > much before x86_cpu_realizefn.
> 
> Why?

I was wrong here.  It should be done before x86_cpu_expand_features.

> 
> >
> > A couple of options.
> > First option:
> > 1. Set both feature and xlevel here (in x86_cpu_expand_features).
> >  if (cpu->x_auto_topoext) {
> > env->features[FEAT_8000_0001_ECX] |= CPUID_EXT3_TOPOEXT;
> > x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000001E);
> >  }
> > 2. And remove the feature setting in x86_cpu_realizefn.
> 
> This would make TOPOEXT be included in 'query-cpu-model-expansion
> model=EPYC', which would be incorrect because TOPOEXT won't
> always be enabled when using EPYC.
> 
> 
> >
> > Or
> >
> > Second option:
> > 1. Set the feature bit in the CPU model table.
> > 2. Set xlevel in x86_cpu_expand_features using cpu->x_auto_topoext.
> > 3. And remove the feature setting in x86_cpu_realizefn.
> >
> > I prefer the second option.
> 
> Same here: TOPOEXT would be included in
> 'query-cpu-model-expansion model=EPYC', and this would be
> incorrect.
> 
> 
> I'm starting to think that enabling TOPOEXT automatically is
> adding too much complexity and compatibility problems, and it's
> better to leave this task to management software.
> 
> The main problem here is:
> 
> This works today with QEMU 2.12 + Linux <= 4.15:
>   $ $QEMU -machine pc -cpu EPYC,enforce -smp 8,sockets=2,cores=2,threads=2
> and must keep working with QEMU 3.0 and Linux <= 4.15.
> 
> In addition to that, the results for:
>   $ $QEMU -machine pc-q35-3.0 -cpu EPYC,enforce [...]
> must be deterministic and expose exactly the same CPUID data even
> if host hardware or software changes, as long as the QEMU
> command-line is the same.
> 
> Do you see a way to fulfill those two constraints while making
> "-machine pc-q35-3.0 -cpu EPYC" enable TOPOEXT automatically?
> 

Now (setting the feature before x86_cpu_expand_features), enabling TOPOEXT 
appears to work fine.

> --
> Eduardo



Re: [Qemu-devel] [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC CPU

2018-06-12 Thread Moger, Babu



> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org]
> On Behalf Of Eduardo Habkost
> Sent: Monday, June 11, 2018 4:10 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com; Jiri
> Denemark 
> Subject: Re: [PATCH v13 3/5] i386: Enable TOPOEXT feature on AMD EPYC
> CPU
> 
> On Mon, Jun 11, 2018 at 05:50:30PM -0300, Eduardo Habkost wrote:
> [...]
> > > +/* TOPOEXT feature requires 0x8000001E */
> > > +if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) {
> > > +x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000001E);
> > > +}
> >
> > I suggest moving this hunk to a separate patch.  I'm not 100%
> > sure yet if this will require compat_props code to disable
> > auto-xlevel-increase on older machine-types.
> 
> The problem here is that:
>   $QEMU -machine pc-i440fx-1.3 -cpu Opteron_G4,+topoext
> currently results in xlevel=0x8000001A, since QEMU 1.3.
> 
> (The same applies to all machine-types between 1.3 and 2.12)
> 
> I was hoping that we could declare topoext as non-migration-safe,
> but I believe libvirt will already include "topoext" when using
> "host-model" if the host CPU supports TOPOEXT.  Jiri, can you
> confirm that?
> 
> We can address that with a "x-topoext-auto-xlevel" property, set
> to true on all CPU models by default, and disabled by
> PC_COMPAT_2_12.
> 
> The code would become:
> 
> if (cpu->topoext_auto_xlevel && env->features[FEAT_8000_0001_ECX] &
> CPUID_EXT3_TOPOEXT) {
> x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000001E);
> }
> 
> Or, we could simply declare that "-cpu Opteron_G4,+topoext" will
> never increase xlevel automatically (on any machine-type), and
> change the code above to:
> 
> if (cpu->auto_topoext && env->features[FEAT_8000_0001_ECX] &
> CPUID_EXT3_TOPOEXT) {
> x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000001E);
> }

I was going to do this, but there is one problem: we don't set 
CPUID_EXT3_TOPOEXT in the CPU model table, so this won't work.
One more thing I noticed: the feature setting should happen much before 
x86_cpu_realizefn.

A couple of options.
First option:
1. Set both feature and xlevel here (in x86_cpu_expand_features).
 if (cpu->x_auto_topoext) {
env->features[FEAT_8000_0001_ECX] |= CPUID_EXT3_TOPOEXT;
x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000001E);
 }
2. And remove the feature setting in x86_cpu_realizefn.

Or 

Second option:
1. Set the feature bit in the CPU model table.
2. Set xlevel in x86_cpu_expand_features using cpu->x_auto_topoext.
3. And remove the feature setting in x86_cpu_realizefn.

I prefer the second option.

> 
> --
> Eduardo



Re: [Qemu-devel] [PATCH v13 2/5] i386: Introduce auto_topoext bit to manage topoext

2018-06-11 Thread Moger, Babu


> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org]
> On Behalf Of Eduardo Habkost
> Sent: Monday, June 11, 2018 4:05 PM
> To: Moger, Babu 
> Cc: m...@redhat.com; marcel.apfelb...@gmail.com; pbonz...@redhat.com;
> r...@twiddle.net; mtosa...@redhat.com; qemu-devel@nongnu.org;
> k...@vger.kernel.org; k...@tripleback.net; ge...@hostfission.com
> Subject: Re: [PATCH v13 2/5] i386: Introduce auto_topoext bit to manage
> topoext
> 
> On Mon, Jun 11, 2018 at 05:46:23PM -0300, Eduardo Habkost wrote:
> [...]
> > On PC_COMPAT_2_12, both would work:
> >   { TYPE_X86_CPU, "auto-topoext", "off" }
> > or
> >   { "EPYC" "-" TYPE_X86_CPU, "auto-topoext", "off" }.
> >
> > I prefer the latter, but both would work.
> 
> Oh, while we're at it: please name the property "x-auto-topoext",
> to indicate it's only for QEMU internal use or debugging, and not
> a supported command-line option.

Ok. Sure.

> 
> --
> Eduardo


