On 12/04/2017 08:13 AM, Prarit Bhargava wrote:
> 
> 
> x86: Booting SMP configuration:
> .... node  #0, CPUs:        #1  #2  #3  #4
> .... node  #1, CPUs:    #5  #6  #7  #8  #9
> .... node  #0, CPUs:   #10 #11 #12 #13 #14
> .... node  #1, CPUs:   #15 #16 #17 #18 #19
> smp: Brought up 2 nodes, 20 CPUs
> smpboot: Max logical packages: 1
> 
> which means that the calculation of logical packages is wrong because
> 
>       ncpus = cpu_data(0).booted_cores * smp_num_siblings;
>       ncpus = 10 * 2;
>       ncpus = 20;
> 
> smp_num_siblings is defined as "The number of threads in a core" which
> should be 1 if HT/SMT is disabled.
> 
> It looks like my patch has exposed a bug in the
> smp_num_siblings calculation.   I'm still debugging ...

The bug is that smp_num_siblings is calculated as the *maximum* number of
threads a core supports, and not the number of threads actually enabled in a
core, on systems whose CPUID level is 0xb or higher.  (see
arch/x86/kernel/cpu/topology.c:59)

A proper fix will take some time to investigate.  In the meantime, the patch
below fixes the problem in the short term.  I've tested it with SMT enabled,
with SMT disabled, with maxcpus=1, and with nr_cpus=1.
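
As a rough illustration of the arithmetic, here is a user-space sketch (not
kernel code).  max_logical_packages() is a hypothetical helper that mirrors the
two lines in native_smp_cpus_done(); the inputs (nr_cpu_ids = 20, booted_cores
= 10) are taken from the boot log quoted above, and a thread count of 1 with
SMT disabled is what topology_max_smt_threads() is assumed to return.

```c
#include <assert.h>

/* Same rounding-up division that the kernel's DIV_ROUND_UP() performs. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper, not a kernel function: extrapolate the boot CPU's
 * core count and a threads-per-core value to an estimate of the number
 * of packages, as native_smp_cpus_done() does.
 */
static unsigned int max_logical_packages(unsigned int nr_cpu_ids,
					 unsigned int booted_cores,
					 unsigned int threads_per_core)
{
	unsigned int ncpus = booted_cores * threads_per_core;

	assert(ncpus != 0);	/* guard the division below */
	return DIV_ROUND_UP(nr_cpu_ids, ncpus);
}
```

With the values from the log, max_logical_packages(20, 10, 2) gives 1, which
matches the wrong "Max logical packages: 1" line, while
max_logical_packages(20, 10, 1) gives 2, one package per node.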

tglx, please revert b4c0a7326f5d ("x86/smpboot: Fix __max_logical_packages
estimate") if you think that is a better option.  The problem with
smp_num_siblings has been around for almost a decade.

P.

---8<---

Subject: [PATCH] arch/x86: Do not use smp_num_siblings in
 __max_logical_packages calculation

Documentation/x86/topology.txt defines smp_num_siblings as "The number of
threads in a core".  Since commit bbb65d2d365e ("x86: use cpuid vector 0xb
when available for detecting cpu topology") smp_num_siblings has instead
been the maximum number of threads in a core.  If Simultaneous
MultiThreading (SMT) is disabled on a system, smp_num_siblings is still 2
and not 1 as expected.

Use topology_max_smt_threads() in the __max_logical_packages calculation.

Signed-off-by: Prarit Bhargava <pra...@redhat.com>
Cc: Jakub Kicinski <kubak...@wp.pl>
Cc: "net...@vger.kernel.org" <net...@vger.kernel.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Clark Williams <willi...@redhat.com>
---
 arch/x86/kernel/smpboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3d01df7d7cf6..eaee15fb7d8b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1304,7 +1304,7 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
         * Today neither Intel nor AMD support heterogenous systems so
         * extrapolate the boot cpu's data to all packages.
         */
-       ncpus = cpu_data(0).booted_cores * smp_num_siblings;
+       ncpus = cpu_data(0).booted_cores * topology_max_smt_threads();
        __max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
        pr_info("Max logical packages: %u\n", __max_logical_packages);
 
-- 
1.8.3.1
