Re: [PATCH v4 0/2] Fix overflow of the max number of IDs for logic processor and core
Kindly ping for any comments. Thanks, Qian On 8/29/2023 12:24 PM, Qian Wen wrote: > CPUID.1.EBX[23:16]: Maximum number of addressable IDs for logical > processors in this physical package. > CPUID.4:EAX[31:26]: Maximum number of addressable IDs for processor cores > in the physical package. > > The current qemu code doesn't limit the value written to these two fields. > If the guest has a huge number of cores, APs (application processor) will > fail to bring up and the wrong info will be reported. > According to HW behavior, setting max value written to CPUID.1.EBX[23:16] > to 255, and CPUID.4:EAX[31:26] to 63. > > --- > Changes v3 -> v4: > - Add "Reviewed-by" from Isaku and Xiaoyao. > - Rebase to the v8.1.0. > Changes v2 -> v3: > - Add patch 2. > - Revise the commit message and comment to be clearer. > - Using MIN() for limitation. > Changes v1 -> v2: > - Revise the commit message and comment to more clearer. > - Rebased to v8.1.0-rc2. > > Qian Wen (2): > target/i386: Avoid cpu number overflow in legacy topology > target/i386: Avoid overflow of the cache parameter enumerated by leaf > 4 > > target/i386/cpu.c | 8 +--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > base-commit:f5fe7c17ac4e309e47e78f0f9761aebc8d2f2c81
Re: [PATCH v3 0/2] Fix overflow of the max number of IDs for logic processor and core
On 8/18/2023 3:33 AM, Isaku Yamahata wrote: > On Wed, Aug 16, 2023 at 04:06:56PM +0800, > Qian Wen wrote: > >> CPUID.1.EBX[23:16]: Maximum number of addressable IDs for logical >> processors in this physical package. >> CPUID.4:EAX[31:26]: Maximum number of addressable IDs for processor cores >> in the physical package. >> >> The current qemu code doesn't limit the value written to these two fields. >> If the guest has a huge number of cores, APs (application processor) will >> fail to bring up and the wrong info will be reported. >> According to HW behavior, setting max value written to CPUID.1.EBX[23:16] >> to 255, and CPUID.4:EAX[31:26] to 63. >> >> --- >> Changes v2 -> v3: >> - Add patch 2. >> - Revise the commit message and comment to be clearer. >> - Using MIN() for limitation. >> Changes v1 -> v2: >> - Revise the commit message and comment to more clearer. >> - Rebased to v8.1.0-rc2. >> >> Qian Wen (2): >> target/i386: Avoid cpu number overflow in legacy topology >> target/i386: Avoid overflow of the cache parameter enumerated by leaf 4 >> >> target/i386/cpu.c | 8 +--- >> 1 file changed, 5 insertions(+), 3 deletions(-) >> >> base-commit: 0d52116fd82cdd1f4a88837336af5b6290c364a4 >> -- >> 2.25.1 >> > The patch itself looks good. Can we add test cases? > We have some in qemu/tests/unit/test-x86-cpuid.c. Hi Isaku, thanks for your comments! I take a look, the test-x86-cpuid.c has some tests for topology functions, e.g., apicid_smt/core/die_width, apicid_core/die/pkg_offset, x86_apicid_from_cpu_idx... Do you mean adding another test for function cpu_x86_cpuid() like below? If so, it seems that some effort is required to instantiate CPUX86State for input of cpu_x86_cpuid, but the result of this test case is obvious. So, is it necessary to add this test? + uint32_t unused, eax, ebx; + /* CPUID.1.EBX[23:16]: Maximum number of addressable IDs for logical + processors in this physical package. + */ + cpu_x86_cpuid(env, 1, 0, &unused, &ebx, &unused, &unused); + g_assert_cmpuint(ebx & 0xFF, ==, 0xFF); + + /* CPUID.4:EAX[31:26]: Maximum number of addressable IDs for processor + cores in the physical package. + */ + cpu_x86_cpuid(env, 4, 0, &eax, &unused, &unused, &unused); + g_assert_cmpuint(ebx & 0xFC00, ==, 0xFC00); Thanks, Qian
Re: [PATCH] target/i386: Fix reporting of CPU dies when nr_cores=nr_threads=1
On 7/24/2023 2:59 AM, 小太 wrote: > When QEMU is started with `-smp D,sockets=1,dies=D,cores=1,threads=1` (that > is, 1 socket with D dies but each die contains just a single thread), both > Linux and Windows guests incorrectly interprets the system as having D > sockets with 1 die each > > Ultimately this is caused by various CPUID leaves not being die-aware in > their "threads per socket" calculations, so this patch fixes that > > These changes are referenced to the AMD PPR for Family 19h Model 01h (Milan) > and Family 17h Model 01h (Naples) manuals: > - CPUID_Fn0001_EBX[23:16]: Number of threads in the processor > (Core::X86::Cpuid::SizeId[NC] + 1) > - CPUID_Fn000B_EBX_x01[15:0]: Number of logical cores in processor >socket (not present until Rome) > - CPUID_Fn8001_ECX[1]: Multi core product > (Core::X86::Cpuid::SizeId[NC] != 0) > - CPUID_Fn8008_ECX[7:0]: The number of threads in the package - 1 > (Core::X86::Cpuid::SizeId[NC]) > > Note there are two remaining occurences that I didn't touch: > - CPUID_Fn801E_ECX[10:8]: Always 0 (1 node per processor) for Milan. >But for Naples, it can also be 2 or 4 nodes >where each node is defined as one or two >CCXes (CCD?). But Milan also has multiple >CCXes, so clearly the definition of a node is >different from model to model, so I've left >it untouched. (QEMU seems to use the Naples >definition) > - MSR_CORE_THREAD_COUNT: This MSR doesn't exist on Milan or Naples > > Signed-off-by: 小太 > --- > target/i386/cpu.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > index 97ad229d8b..6ff23fa590 100644 > --- a/target/i386/cpu.c > +++ b/target/i386/cpu.c > @@ -6049,8 +6049,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, > uint32_t count, > *ecx |= CPUID_EXT_OSXSAVE; > } > *edx = env->features[FEAT_1_EDX]; > -if (cs->nr_cores * cs->nr_threads > 1) { > -*ebx |= (cs->nr_cores * cs->nr_threads) << 16; > +if (env->nr_dies * cs->nr_cores * cs->nr_threads > 1) { > +*ebx |= (env->nr_dies * cs->nr_cores * cs->nr_threads) << 16; The nr_cores in CPUState means "Number of cores within this CPU package", so it doesn't need "nr_dies" here. Zhao Liu is fixing the calculation of nr_cores [1]. [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg07187.html Thanks, Qian > *edx |= CPUID_HT; > } > if (!cpu->enable_pmu) { > @@ -6230,7 +6230,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, > uint32_t count, > break; > case 1: > *eax = apicid_pkg_offset(&topo_info); > -*ebx = cs->nr_cores * cs->nr_threads; > +*ebx = env->nr_dies * cs->nr_cores * cs->nr_threads; > *ecx |= CPUID_TOPOLOGY_LEVEL_CORE; > break; > default: > @@ -6496,7 +6496,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, > uint32_t count, > * discards multiple thread information if it is set. > * So don't set it here for Intel to make Linux guests happy. > */ > -if (cs->nr_cores * cs->nr_threads > 1) { > +if (env->nr_dies * cs->nr_cores * cs->nr_threads > 1) { > if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 || > env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 || > env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) { > @@ -6562,7 +6562,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, > uint32_t count, > *eax |= (cpu_x86_virtual_addr_width(env) << 8); > } > *ebx = env->features[FEAT_8000_0008_EBX]; > -if (cs->nr_cores * cs->nr_threads > 1) { > +if (env->nr_dies * cs->nr_cores * cs->nr_threads > 1) { > /* > * Bits 15:12 is "The number of bits in the initial > * Core::X86::Apic::ApicId[ApicId] value that indicate > @@ -6570,7 +6570,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, > uint32_t count, > * Bits 7:0 is "The number of threads in the package is NC+1" > */ > *ecx = (apicid_pkg_offset(&topo_info) << 12) | > - ((cs->nr_cores * cs->nr_threads) - 1); > + ((env->nr_dies * cs->nr_cores * cs->nr_threads) - 1); > } else { > *ecx = 0; > }
Re: [PATCH v2] target/i386: Avoid cpu number overflow in legacy topology
On 8/14/2023 7:59 AM, Zhao Liu wrote: > Hi Qian, > > On Sun, Aug 13, 2023 at 06:49:40PM +0800, Wen, Qian wrote: > > [snip] > >>> also perhaps double check if we could do induce similar overflow >>> tweaking other -smp properties (todo for another patch[es] if there are >>> such places). >> I have a check, the CPUID.0x4:EAX[31:26] indicates the Maximum number of >> addressable IDs for processor cores in the physical package. >> If we launch over 64 cores VM, the 6-bits field will also overflow. I will >> add the following fix to patch2 in v3. > Good catch! > > Just discussion, I find if we use APIC ID offset to encode 0x4, then it > seems no need for an explicit check [1], right? > > [1]: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg00027.html Yes. The offset is always power of 2, so the value written to the 6-bit field likes 0b, 0b1, 0b11, 0b111... So, EAX[31:26] will be expected, i.e., 0x3f, when the value is overflow and truncated. > > Thanks, > Zhao > >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c >> index 52a2a1a1c7..9c1ae3d83d 100644 >> --- a/target/i386/cpu.c >> +++ b/target/i386/cpu.c >> @@ -243,6 +243,7 @@ static void encode_cache_cpuid4(CPUCacheInfo *cache, >> cache->partitions * cache->sets); >> >> assert(num_apic_ids > 0); >> + num_cores = num_cores > 64 ? 64 : num_cores; >> *eax = CACHE_TYPE(cache->type) | >> CACHE_LEVEL(cache->level) | >> (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) | >> >> >> Thanks, >> Qian >>>>>> *edx |= CPUID_HT; >>>>>> } >>>>>> if (!cpu->enable_pmu) {
Re: [PATCH v2] target/i386: Avoid cpu number overflow in legacy topology
On 8/9/2023 9:47 PM, Igor Mammedov wrote: > On Wed, 9 Aug 2023 21:20:48 +0800 > "Wen, Qian" wrote: > >> On 8/9/2023 7:14 PM, Igor Mammedov wrote: >>> On Wed, 9 Aug 2023 18:27:32 +0800 >>> Qian Wen wrote: >>> >>>> The legacy topology enumerated by CPUID.1.EBX[23:16] is defined in SDM >>>> Vol2: >>>> >>>> Bits 23-16: Maximum number of addressable IDs for logical processors in >>>> this physical package. >>>> >>>> When launching the VM with -smp 256, the value written to EBX[23:16] is >>>> 0 because of data overflow. If the guest only supports legacy topology, >>>> without V2 Extended Topology enumerated by CPUID.0x1f or Extended >>>> Topology enumerated by CPUID.0x0b to support over 255 CPUs, the return >>>> of the kernel invoking cpu_smt_allowed() is false and AP's bring-up will >>>> fail. Then only CPU 0 is online, and others are offline. >>>> >>>> To avoid this issue caused by overflow, limit the max value written to >>>> EBX[23:16] to 255. >>> what happens on real hw or in lack of thereof what SDM says about this >>> value when there is more than 255 threads?. >>> >> Current SDM doesn't specify what the value should be when APIC IDs per >> package exceeds 255. So we asked the internal HW architect, the response is >> that EBX[23:16] will report 255 instead of being truncated to a smaller >> value. > then mention it in commit log so one wouldn't wonder where the value came > from. Ok, thanks for your suggestion! >> Thanks, >> Qian >> >>>> Signed-off-by: Qian Wen >>>> --- >>>> Changes v1 -> v2: >>>> - Revise the commit message and comment to more clearer. >>>> - Rebased to v8.1.0-rc2. >>>> --- >>>> target/i386/cpu.c | 16 ++-- >>>> 1 file changed, 14 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c >>>> index 97ad229d8b..6e1d88fbd7 100644 >>>> --- a/target/i386/cpu.c >>>> +++ b/target/i386/cpu.c >>>> @@ -6008,6 +6008,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, >>>> uint32_t count, >>>> uint32_t die_offset; >>>> uint32_t limit; >>>> uint32_t signature[3]; >>>> +uint32_t threads_per_socket; >>>> X86CPUTopoInfo topo_info; >>>> >>>> topo_info.dies_per_pkg = env->nr_dies; >>>> @@ -6049,8 +6050,19 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t >>>> index, uint32_t count, >>>> *ecx |= CPUID_EXT_OSXSAVE; >>>> } >>>> *edx = env->features[FEAT_1_EDX]; >>>> -if (cs->nr_cores * cs->nr_threads > 1) { >>>> -*ebx |= (cs->nr_cores * cs->nr_threads) << 16; >>>> +/* >>>> + * Only bits [23:16] represent the maximum number of addressable >>>> + * IDs for logical processors in this physical package. >>>> + * When thread_per_socket > 255, it will 1) overwrite bits[31:24] >>>> + * which is apic_id, 2) bits [23:16] get truncated. >>>> + */ >>>> +threads_per_socket = cs->nr_cores * cs->nr_threads; >>>> +if (threads_per_socket > 255) { >>>> +threads_per_socket = 255; >>>> +} >>>> + >>>> +if (threads_per_socket > 1) { >>>> +*ebx |= threads_per_socket << 16; > ^ > more robust would be mask out non-relevant fields at rhs I think a mask for this case is a bit redundant, since the limitation of 255 already filtered non-relevant fields. I prefer not to add the mask here and keep code style consistency with others place. > also perhaps double check if we could do induce similar overflow > tweaking other -smp properties (todo for another patch[es] if there are such > places). I have a check, the CPUID.0x4:EAX[31:26] indicates the Maximum number of addressable IDs for processor cores in the physical package. If we launch over 64 cores VM, the 6-bits field will also overflow. I will add the following fix to patch2 in v2. diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 52a2a1a1c7..9c1ae3d83d 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -243,6 +243,7 @@ static void encode_cache_cpuid4(CPUCacheInfo *cache, cache->partitions * cache->sets); assert(num_apic_ids > 0); + num_cores = num_cores > 64 ? 64 : num_cores; *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) | (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) | Thanks, Qian >>>> *edx |= CPUID_HT; >>>> } >>>> if (!cpu->enable_pmu) {
Re: [PATCH v2] target/i386: Avoid cpu number overflow in legacy topology
On 8/9/2023 7:14 PM, Igor Mammedov wrote: > On Wed, 9 Aug 2023 18:27:32 +0800 > Qian Wen wrote: > >> The legacy topology enumerated by CPUID.1.EBX[23:16] is defined in SDM >> Vol2: >> >> Bits 23-16: Maximum number of addressable IDs for logical processors in >> this physical package. >> >> When launching the VM with -smp 256, the value written to EBX[23:16] is >> 0 because of data overflow. If the guest only supports legacy topology, >> without V2 Extended Topology enumerated by CPUID.0x1f or Extended >> Topology enumerated by CPUID.0x0b to support over 255 CPUs, the return >> of the kernel invoking cpu_smt_allowed() is false and AP's bring-up will >> fail. Then only CPU 0 is online, and others are offline. >> >> To avoid this issue caused by overflow, limit the max value written to >> EBX[23:16] to 255. > what happens on real hw or in lack of thereof what SDM says about this > value when there is more than 255 threads?. > Current SDM doesn't specify what the value should be when APIC IDs per package exceeds 255. So we asked the internal HW architect, the response is that EBX[23:16] will report 255 instead of being truncated to a smaller value. Thanks, Qian >> Signed-off-by: Qian Wen >> --- >> Changes v1 -> v2: >> - Revise the commit message and comment to more clearer. >> - Rebased to v8.1.0-rc2. >> --- >> target/i386/cpu.c | 16 ++-- >> 1 file changed, 14 insertions(+), 2 deletions(-) >> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c >> index 97ad229d8b..6e1d88fbd7 100644 >> --- a/target/i386/cpu.c >> +++ b/target/i386/cpu.c >> @@ -6008,6 +6008,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, >> uint32_t count, >> uint32_t die_offset; >> uint32_t limit; >> uint32_t signature[3]; >> +uint32_t threads_per_socket; >> X86CPUTopoInfo topo_info; >> >> topo_info.dies_per_pkg = env->nr_dies; >> @@ -6049,8 +6050,19 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, >> uint32_t count, >> *ecx |= CPUID_EXT_OSXSAVE; >> } >> *edx = env->features[FEAT_1_EDX]; >> -if (cs->nr_cores * cs->nr_threads > 1) { >> -*ebx |= (cs->nr_cores * cs->nr_threads) << 16; >> +/* >> + * Only bits [23:16] represent the maximum number of addressable >> + * IDs for logical processors in this physical package. >> + * When thread_per_socket > 255, it will 1) overwrite bits[31:24] >> + * which is apic_id, 2) bits [23:16] get truncated. >> + */ >> +threads_per_socket = cs->nr_cores * cs->nr_threads; >> +if (threads_per_socket > 255) { >> +threads_per_socket = 255; >> +} >> + >> +if (threads_per_socket > 1) { >> +*ebx |= threads_per_socket << 16; >> *edx |= CPUID_HT; >> } >> if (!cpu->enable_pmu) {
Re: [PATCH] target/i386: Avoid cpu number overflow in legacy topology
On 8/7/2023 4:08 PM, Zhao Liu wrote: > On Fri, Jul 28, 2023 at 04:01:50PM +0800, Qian Wen wrote: >> Date: Fri, 28 Jul 2023 16:01:50 +0800 >> From: Qian Wen >> Subject: [PATCH] target/i386: Avoid cpu number overflow in legacy topology >> X-Mailer: git-send-email 2.25.1 >> >> The legacy topology enumerated by CPUID.1.EBX[23:16] is defined in SDM >> Vol2: >> >> Bits 23-16: Maximum number of addressable IDs for logical processors in >> this physical package. >> >> To avoid data overflow, limit the max value written to EBX[23:16] to >> 255. >> >> Signed-off-by: Qian Wen >> --- >> target/i386/cpu.c | 15 +-- >> 1 file changed, 13 insertions(+), 2 deletions(-) >> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c >> index 1294be374ab2..70589a58b727 100644 >> --- a/target/i386/cpu.c >> +++ b/target/i386/cpu.c >> @@ -5356,6 +5356,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, >> uint32_t count, >> uint32_t die_offset; >> uint32_t limit; >> uint32_t signature[3]; >> +uint32_t threads_per_socket; >> X86CPUTopoInfo topo_info; >> >> topo_info.dies_per_pkg = env->nr_dies; >> @@ -5397,8 +5398,18 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, >> uint32_t count, >> *ecx |= CPUID_EXT_OSXSAVE; >> } >> *edx = env->features[FEAT_1_EDX]; >> -if (cs->nr_cores * cs->nr_threads > 1) { >> -*ebx |= (cs->nr_cores * cs->nr_threads) << 16; >> +/* >> + * The vCPU number more than 255 needs support of V2 Extended >> + * Topology enumerated by CPUID.0x1f or Extended Topology >> + * enumerated by CPUID.0x0b. >> + */ >> +threads_per_socket = cs->nr_cores * cs->nr_threads; >> +if (threads_per_socket > 255) { >> +threads_per_socket = 255; > Straight encoding to 255 is good for me! > > -Zhao Got it, thanks! Thanks, Qian >> +} >> + >> +if (threads_per_socket > 1) { >> +*ebx |= threads_per_socket << 16; >> *edx |= CPUID_HT; >> } >> /* >> -- >> 2.25.1 >>
Re: [PATCH] target/i386: Avoid cpu number overflow in legacy topology
On 8/7/2023 3:36 PM, Xiaoyao Li wrote: > On 7/28/2023 4:01 PM, Qian Wen wrote: >> The legacy topology enumerated by CPUID.1.EBX[23:16] is defined in SDM >> Vol2: >> >> Bits 23-16: Maximum number of addressable IDs for logical processors in >> this physical package. >> >> To avoid data overflow, limit the max value written to EBX[23:16] to >> 255. > > It's better explain what's issue when overflow happens. > When launch vm with -smp 256, the value writes to EBX[23:16] is 0. If the guest only support legacy topology, the result of kernel invokes cpu_smt_allowed() is false and AP's bring-up will fail. Then only CPU 0 is online, others offline. >> Signed-off-by: Qian Wen >> --- >> target/i386/cpu.c | 15 +-- >> 1 file changed, 13 insertions(+), 2 deletions(-) >> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c >> index 1294be374ab2..70589a58b727 100644 >> --- a/target/i386/cpu.c >> +++ b/target/i386/cpu.c >> @@ -5356,6 +5356,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, >> uint32_t count, >> uint32_t die_offset; >> uint32_t limit; >> uint32_t signature[3]; >> + uint32_t threads_per_socket; >> X86CPUTopoInfo topo_info; >> topo_info.dies_per_pkg = env->nr_dies; >> @@ -5397,8 +5398,18 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, >> uint32_t count, >> *ecx |= CPUID_EXT_OSXSAVE; >> } >> *edx = env->features[FEAT_1_EDX]; >> - if (cs->nr_cores * cs->nr_threads > 1) { >> - *ebx |= (cs->nr_cores * cs->nr_threads) << 16; >> + /* >> + * The vCPU number more than 255 needs support of V2 Extended >> + * Topology enumerated by CPUID.0x1f or Extended Topology >> + * enumerated by CPUID.0x0b. >> + */ > > the above comment doesn't explain why it needs below. > > you can explain only bits [23:16] represents the maximum number of > addressable IDs for logical processors in this physical package. > > When thread_per_socket > 255, it will 1) overwrite bits[31:24] which is > apic_id, 2) bits [23:16] gets truncated. Thanks for your suggestion, I will add your description in v2. Thanks, Qian > >> + threads_per_socket = cs->nr_cores * cs->nr_threads; >> + if (threads_per_socket > 255) { >> + threads_per_socket = 255; >> + } >> + >> + if (threads_per_socket > 1) { >> + *ebx |= threads_per_socket << 16; >> *edx |= CPUID_HT; >> } >> /* >