Re: [bisected] x86 boot still broken on -rc2

2017-12-04 Thread Jakub Kicinski
On Mon,  4 Dec 2017 11:45:21 -0500, Prarit Bhargava wrote:
> On 12/04/2017 08:13 AM, Prarit Bhargava wrote:
> > x86: Booting SMP configuration:
> >  node  #0, CPUs:    #1  #2  #3  #4
> >  node  #1, CPUs:    #5  #6  #7  #8  #9
> >  node  #0, CPUs:   #10 #11 #12 #13 #14
> >  node  #1, CPUs:   #15 #16 #17 #18 #19
> > smp: Brought up 2 nodes, 20 CPUs
> > smpboot: Max logical packages: 1
> > 
> > which means that the calculation of logical packages is wrong because
> > 
> >   ncpus = cpu_data(0).booted_cores * smp_num_siblings;
> >   ncpus = 10 * 2;
> >   ncpus = 20;
> > 
> > smp_num_siblings is defined as "The number of threads in a core" which
> > should be 1 if HT/SMT is disabled.
> > 
> > It looks like my patch has exposed a bug in the
> > smp_num_siblings calculation.   I'm still debugging ...  
> 
> The bug is that smp_num_siblings has been incorrectly calculated as the
> *maximum* number of threads in a core, and not the actual number of threads in
> a core on systems which have a CPUID level greater than 0xb.  (see
> arch/x86/kernel/cpu/topology.c:59)
> 
> That will take some time to investigate and come up with a proper solution and
> fix.  In the meantime, the patch below will fix the problem in the short-term.
> I've tested the patch using SMT enabled, SMT disabled, maxcpus=1 and 
> nr_cpus=1.

Thanks Prarit, the workaround does the job!  Indeed, I have SMT
disabled.


Re: [bisected] x86 boot still broken on -rc2

2017-12-04 Thread Prarit Bhargava
On 12/04/2017 08:13 AM, Prarit Bhargava wrote:
> 
> 
> x86: Booting SMP configuration:
>  node  #0, CPUs:    #1  #2  #3  #4
>  node  #1, CPUs:    #5  #6  #7  #8  #9
>  node  #0, CPUs:   #10 #11 #12 #13 #14
>  node  #1, CPUs:   #15 #16 #17 #18 #19
> smp: Brought up 2 nodes, 20 CPUs
> smpboot: Max logical packages: 1
> 
> which means that the calculation of logical packages is wrong because
> 
>   ncpus = cpu_data(0).booted_cores * smp_num_siblings;
>   ncpus = 10 * 2;
>   ncpus = 20;
> 
> smp_num_siblings is defined as "The number of threads in a core" which
> should be 1 if HT/SMT is disabled.
> 
> It looks like my patch has exposed a bug in the
> smp_num_siblings calculation.   I'm still debugging ...

The bug is that smp_num_siblings has been incorrectly calculated as the
*maximum* number of threads in a core, and not the actual number of threads in
a core on systems which have a CPUID level greater than 0xb.  (see
arch/x86/kernel/cpu/topology.c:59)

That will take some time to investigate and come up with a proper solution and
fix.  In the meantime, the patch below will fix the problem in the short-term.
I've tested the patch using SMT enabled, SMT disabled, maxcpus=1 and nr_cpus=1.

tglx, please revert b4c0a7326f5d ("x86/smpboot: Fix __max_logical_packages
estimate") if you think that is a better option.  The problem with
smp_num_siblings has been around for almost a decade.

P.

---8<---

Subject: [PATCH] arch/x86: Do not use smp_num_siblings in
 __max_logical_packages calculation

Documentation/x86/topology.txt defines smp_num_siblings as "The number of
threads in a core".  Since commit bbb65d2d365e ("x86: use cpuid vector 0xb
when available for detecting cpu topology") smp_num_siblings is the
maximum number of threads in a core.  If Simultaneous MultiThreading
(SMT) is disabled on a system, smp_num_siblings is 2 and not 1 as
expected.

Use topology_max_smt_threads() in the __max_logical_packages calculation.

Signed-off-by: Prarit Bhargava 
Cc: "net...@vger.kernel.org" 
Cc: Thomas Gleixner 
Cc: Clark Williams 
---
 arch/x86/kernel/smpboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3d01df7d7cf6..eaee15fb7d8b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1304,7 +1304,7 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
 	 * Today neither Intel nor AMD support heterogenous systems so
 	 * extrapolate the boot cpu's data to all packages.
 	 */
-	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
+	ncpus = cpu_data(0).booted_cores * topology_max_smt_threads();
 	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
 	pr_info("Max logical packages: %u\n", __max_logical_packages);
 
-- 
1.8.3.1



Re: [bisected] x86 boot still broken on -rc2

2017-12-04 Thread Prarit Bhargava


On 12/04/2017 07:28 AM, Prarit Bhargava wrote:
> 
> 
> On 12/03/2017 08:28 PM, Jakub Kicinski wrote:
>> Same thing on rc2, bisected down to:
>>
>> commit b4c0a7326f5dc0ef7a64128b0ae7d081f4b2cbd1 (refs/bisect/bad)
>> Author: Prarit Bhargava 
>> Date:   Tue Nov 14 07:42:57 2017 -0500
>>
>> x86/smpboot: Fix __max_logical_packages estimate
>> 
>> A system booted with a small number of cores enabled per package
>> panics because the estimate of __max_logical_packages is too low.
>> 
>> This occurs when the total number of active cores across all packages is
>> less than the maximum core count for a single package. e.g.:
>> 
>>   On a 4 package system with 20 cores/package where only 4 cores are
>>   enabled on each package, the value of __max_logical_packages is
>>   calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.
>> 
>> Calculate __max_logical_packages after the cpu enumeration has completed.
>> Use the boot cpu's data to extrapolate the number of packages.
>> 
>> Signed-off-by: Prarit Bhargava 
>> Signed-off-by: Thomas Gleixner 
>> Cc: Tom Lendacky 
>> Cc: Andi Kleen 
>> Cc: Christian Borntraeger 
>> Cc: Peter Zijlstra 
>> Cc: Kan Liang 
>> Cc: He Chen 
>> Cc: Stephane Eranian 
>> Cc: Dave Hansen 
>> Cc: Piotr Luc 
>> Cc: Andy Lutomirski 
>> Cc: Arvind Yadav 
>> Cc: Vitaly Kuznetsov 
>> Cc: Borislav Petkov 
>> Cc: Tim Chen 
>> Cc: Mathias Krause 
>> Cc: "Kirill A. Shutemov" 
>> Link: https://lkml.kernel.org/r/20171114124257.22013-4-pra...@redhat.com
>>
>>
>> On Fri, 1 Dec 2017 16:39:54 -0800, Jakub Kicinski wrote:
>>> Hi!
>>>
>>> I'm hitting these after DaveM pulled rc1 into net-next on my Xeon
>>> E5-2630 v4 box.  It also happens on linux-next.  Did anyone else
>>> experience it?  (.config attached)
>>>
>>> [5.003771] WARNING: CPU: 14 PID: 1 at 
>>> ../arch/x86/events/intel/uncore.c:936 uncore_pci_probe+0x285/0x2b0
>>> [5.007544] Modules linked in:
>>> [5.007544] CPU: 14 PID: 1 Comm: swapper/0 Not tainted 
>>> 4.15.0-rc1-perf-00225-gb2a4e0a76b1d #782
>>> [5.007544] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 
>>> 11/08/2016
> 
> I have a Dell R730 available for use.  OOC are you booting with the default
> BIOS options?
>

Jakub, I was able to reproduce this on a similar system by DISABLING
hyperthreading in the BIOS.  Doing this on other systems seems to have no
impact.  What is odd about this system when booting is that the
kernel claims that hyperthreading is ENABLED:

x86: Booting SMP configuration:
 node  #0, CPUs:    #1  #2  #3  #4
 node  #1, CPUs:    #5  #6  #7  #8  #9
 node  #0, CPUs:   #10 #11 #12 #13 #14
 node  #1, CPUs:   #15 #16 #17 #18 #19
smp: Brought up 2 nodes, 20 CPUs
smpboot: Max logical packages: 1

which means that the calculation of logical packages is wrong because

ncpus = cpu_data(0).booted_cores * smp_num_siblings;
ncpus = 10 * 2;
ncpus = 20;

smp_num_siblings is defined as "The number of threads in a core" which
should be 1 if HT/SMT is disabled.

It looks like my patch has exposed a bug in the
smp_num_siblings calculation.   I'm still debugging ...

FWIW, I did test this code on several systems with HT/SMT disabled in the
BIOS.  I have tested those systems again and don't see a problem.  It is
something peculiar to a few systems.

P.


Re: [bisected] x86 boot still broken on -rc2

2017-12-04 Thread Prarit Bhargava


On 12/03/2017 08:28 PM, Jakub Kicinski wrote:
> Same thing on rc2, bisected down to:
> 
> commit b4c0a7326f5dc0ef7a64128b0ae7d081f4b2cbd1 (refs/bisect/bad)
> Author: Prarit Bhargava 
> Date:   Tue Nov 14 07:42:57 2017 -0500
> 
> x86/smpboot: Fix __max_logical_packages estimate
> 
> A system booted with a small number of cores enabled per package
> panics because the estimate of __max_logical_packages is too low.
> 
> This occurs when the total number of active cores across all packages is
> less than the maximum core count for a single package. e.g.:
> 
>   On a 4 package system with 20 cores/package where only 4 cores are
>   enabled on each package, the value of __max_logical_packages is
>   calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.
> 
> Calculate __max_logical_packages after the cpu enumeration has completed.
> Use the boot cpu's data to extrapolate the number of packages.
> 
> Signed-off-by: Prarit Bhargava 
> Signed-off-by: Thomas Gleixner 
> Cc: Tom Lendacky 
> Cc: Andi Kleen 
> Cc: Christian Borntraeger 
> Cc: Peter Zijlstra 
> Cc: Kan Liang 
> Cc: He Chen 
> Cc: Stephane Eranian 
> Cc: Dave Hansen 
> Cc: Piotr Luc 
> Cc: Andy Lutomirski 
> Cc: Arvind Yadav 
> Cc: Vitaly Kuznetsov 
> Cc: Borislav Petkov 
> Cc: Tim Chen 
> Cc: Mathias Krause 
> Cc: "Kirill A. Shutemov" 
> Link: https://lkml.kernel.org/r/20171114124257.22013-4-pra...@redhat.com
> 
> 
> On Fri, 1 Dec 2017 16:39:54 -0800, Jakub Kicinski wrote:
>> Hi!
>>
>> I'm hitting these after DaveM pulled rc1 into net-next on my Xeon
>> E5-2630 v4 box.  It also happens on linux-next.  Did anyone else
>> experience it?  (.config attached)
>>
>> [5.003771] WARNING: CPU: 14 PID: 1 at 
>> ../arch/x86/events/intel/uncore.c:936 uncore_pci_probe+0x285/0x2b0
>> [5.007544] Modules linked in:
>> [5.007544] CPU: 14 PID: 1 Comm: swapper/0 Not tainted 
>> 4.15.0-rc1-perf-00225-gb2a4e0a76b1d #782
>> [5.007544] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 
>> 11/08/2016

I have a Dell R730 available for use.  OOC are you booting with the default
BIOS options?

P.


>> [5.007544] task: 9e842725 task.stack: 8a63fd2d
>> [5.007544] RIP: 0010:uncore_pci_probe+0x285/0x2b0
>> [5.007544] RSP: :ad8580163d10 EFLAGS: 00010286
>> [5.007544] RAX: 98576cc3df30 RBX: b08037e0 RCX: 
>> b0c1a120
>> [5.007544] RDX:  RSI:  RDI: 
>> b0c1a960
>> [5.007544] RBP: 985b6c00ac00 R08: fffe R09: 
>> 000f
>> [5.007544] R10: 98576f1b6018 R11: 0022 R12: 
>> 985b6c641000
>> [5.007544] R13: 0001 R14: 0001 R15: 
>> 0001
>> [5.007544] FS:  () GS:98576fb8() 
>> knlGS:
>> [5.007544] CS:  0010 DS:  ES:  CR0: 80050033
>> [5.007544] CR2:  CR3: 000185c09001 CR4: 
>> 003606e0
>> [5.007544] DR0:  DR1:  DR2: 
>> 
>> [5.007544] DR3:  DR6: fffe0ff0 DR7: 
>> 0400
>> [5.007544] Call Trace:
>> [5.007544]  local_pci_probe+0x3d/0x90
>> [5.007544]  ? pci_match_device+0xd9/0x100
>> [5.007544]  pci_device_probe+0x122/0x180
>> [5.007544]  driver_probe_device+0x246/0x330
>> [5.007544]  ? set_debug_rodata+0x11/0x11
>> [5.007544]  __driver_attach+0x8a/0x90
>> [5.007544]  ? driver_probe_device+0x330/0x330
>> [5.007544]  bus_for_each_dev+0x5c/0x90
>> [5.007544]  bus_add_driver+0x196/0x220
>> [5.007544]  driver_register+0x57/0xc0
>> [5.007544]  intel_uncore_init+0x1e3/0x249
>> [5.007544]  ? uncore_type_init+0x193/0x193
>> [5.007544]  ? set_debug_rodata+0x11/0x11
>> [5.007544]  do_one_initcall+0x4b/0x190
>> [5.007544]  kernel_init_freeable+0x16e/0x1f5
>> [5.007544]  ? rest_init+0xd0/0xd0
>> [5.007544]  kernel_init+0xa/0x100
>> [5.007544]  ret_from_fork+0x1f/0x30
>> [5.007544] Code: 48 8b 52 08 48 85 d2 74 0d 89 44 24 04 48 89 df ff d2 
>> 8b 44 24 04 48 89 df 89 44 24 04 e8 54 0a 1c 00 8b 44 24 0 
>> [5.007544] ---[ end trace 4dc4c3d5f5afcd2f ]---
>> [5.244504] bdx_uncore: probe of :ff:08.2 failed with error -22
>> [5.251604] bdx_uncore: probe of :ff:0b.1 failed with error -22
>> [5.258711] bdx_uncore: probe of :ff:10.1 failed with error -22
>> [5.265819] bdx_uncore: probe of :ff:14.0 failed with error -22
>> [5.272919] bdx_uncore: probe of :ff:14.1 failed with error -22
>> [5.280019] bdx_uncore: probe of :ff:15.0 failed with error -22
>> [5.287112] bdx_uncore: probe of :ff:15.1 failed with error -22
>> [5.294376] WARNING: CPU: 1 PID: 15 at 
>> ../arch/x86/events/intel/uncore.c:1065 
>>