Re: [PATCH v3 1/2] x86/fpu: track AVX-512 usage of tasks
On Tue, Nov 20, 2018 at 1:32 PM Li, Aubrey wrote:
> Thanks for your program, Samuel, it's very helpful. But I saw a different
> output on my side, May I have your glibc version?

This system is running glibc 2.27, and kernel 4.18.7. The Xeon Gold 5120
also happens to be one of the Skylake-SP models with a single 512-bit FMA
unit, instead of 2.

Samuel.
Re: [PATCH v3 1/2] x86/fpu: track AVX-512 usage of tasks
On 11/17/18 12:36 AM, Li, Aubrey wrote:
> On 2018/11/17 7:10, Dave Hansen wrote:
>> Just to be clear: there are 3 AVX-512 XSAVE states:
>>
>>         XFEATURE_OPMASK,
>>         XFEATURE_ZMM_Hi256,
>>         XFEATURE_Hi16_ZMM,
>>
>> I honestly don't know what XFEATURE_OPMASK does. It does not appear to
>> be affected by VZEROUPPER (although VZEROUPPER's SDM documentation isn't
>> looking too great).

XFEATURE_OPMASK refers to the additional 8 mask registers used in AVX-512.
These are more similar to general-purpose registers than vector registers,
and should not be too relevant here.

>> But, XFEATURE_ZMM_Hi256 is used for the upper 256 bits of the
>> registers ZMM0-ZMM15. Those are AVX-512-only registers. The only way
>> to get data into XFEATURE_ZMM_Hi256 state is by using AVX512 instructions.
>>
>> XFEATURE_Hi16_ZMM is the same. The only way to get state in there is
>> with AVX512 instructions.
>>
>> So, first of all, I think you *MUST* check XFEATURE_ZMM_Hi256 and
>> XFEATURE_Hi16_ZMM. That's without question.
>
> No, XFEATURE_ZMM_Hi256 does not request turbo license 2, so it's less
> interested to us.

I think Dave is right, and it's easy enough to check this. See the
attached program. For the "high current" instruction vpmuludq operating
on zmm0--zmm3 registers, we have (on a Skylake-SP Xeon Gold 5120)

        175,097      core_power.lvl0_turbo_license:u    ( +-  2.18% )
         41,185      core_power.lvl1_turbo_license:u    ( +-  1.55% )
     83,928,648      core_power.lvl2_turbo_license:u    ( +-  0.00% )

while for the same code operating on zmm28--zmm31 registers, we have

        163,507      core_power.lvl0_turbo_license:u    ( +-  6.85% )
         47,390      core_power.lvl1_turbo_license:u    ( +- 12.25% )
     83,927,735      core_power.lvl2_turbo_license:u    ( +-  0.00% )

In other words, the register index does not seem to matter at all for
turbo license purposes (this makes sense, considering these chips have
168 vector registers internally; zmm16--zmm31 are simply newly exposed
architectural registers).
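Counts like the ones above can be gathered with perf stat on the
core_power events; a sketch of the invocation (the test-program name and
its case-selecting argument are placeholders, not from the original
attachment):

```shell
# Count cycles at each turbo license level; -r 5 repeats the run to
# obtain the ( +- x% ) spreads shown above. Requires a Skylake-SP-class
# core PMU exposing the core_power events.
perf stat -r 5 \
    -e core_power.lvl0_turbo_license \
    -e core_power.lvl1_turbo_license \
    -e core_power.lvl2_turbo_license \
    ./license-test 0    # placeholder binary and case number
```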
We can also see that XFEATURE_Hi16_ZMM does not imply license 1 or 2; we
may be using xmm16--xmm31 purely for the convenient extra register space.
For example, cases 4 and 5 of the sample program:

     84,064,239      core_power.lvl0_turbo_license:u    ( +-  0.00% )
              0      core_power.lvl1_turbo_license:u
              0      core_power.lvl2_turbo_license:u

     84,060,625      core_power.lvl0_turbo_license:u    ( +-  0.00% )
              0      core_power.lvl1_turbo_license:u
              0      core_power.lvl2_turbo_license:u

So what's most important is the width of the vectors being used, not the
instruction set or the register index. Second to that is the instruction
type, namely whether those are "heavy" instructions. Neither of these
things can be accurately captured by the XSAVE state.

>> It's probably *possible* to run AVX512 instructions by loading state
>> into the YMM register and then executing AVX512 instructions that only
>> write to memory and never to register state. That *might* allow
>> XFEATURE_Hi16_ZMM and XFEATURE_ZMM_Hi256 to stay in the init state, but
>> for the frequency to be affected since AVX512 instructions _are_
>> executing. But, there's no way to detect this situation from XSAVE
>> states themselves.
>
> Andi should have more details on this. FWICT, not all AVX512 instructions
> has high current, those only touching memory do not cause notable frequency
> drop.

According to section 15.26 of the Intel optimization reference manual,
"heavy" instructions consist of floating-point and integer multiplication.
Moves, adds, logical operations, etc., will request at most turbo license 1
when operating on zmm registers.
> Thanks,
> -Aubrey

#include

#define INSN_LOOP_LO(insn, reg) do { \
        asm volatile( \
                "mov $1<<24,%%rcx;" \
                ".align 32;" \
                "1:" \
                #insn " " "%%" #reg "0" "," "%%" #reg "0" "," "%%" #reg "0" ";" \
                #insn " " "%%" #reg "1" "," "%%" #reg "1" "," "%%" #reg "1" ";" \
                #insn " " "%%" #reg "2" "," "%%" #reg "2" "," "%%" #reg "2" ";" \
                #insn " " "%%" #reg "3" "," "%%" #reg "3" "," "%%" #reg "3" ";" \
                "dec %%rcx;" \
                "jnz 1b;" \
                ::: "rcx" \
        ); \
} while(0);

#define INSN_LOOP_HI(insn, reg) do {
Re: [PATCH] x86: entry: flush the cache if syscall error
On Fri, Oct 12, 2018 at 2:26 PM Jann Horn wrote:
> On Fri, Oct 12, 2018 at 11:41 AM Samuel Neves wrote:
> > On Thu, Oct 11, 2018 at 8:25 PM Andy Lutomirski wrote:
> > > What exactly is this trying to protect against? And how many cycles
> > > should we expect L1D_FLUSH to take?
> >
> > As far as I could measure, I got 1660 cycles per wrmsr 0x10b, 0x1 on a
> > Skylake chip, and 1220 cycles on a Skylake-SP.
>
> Is that with L1D mostly empty, with L1D mostly full with clean lines,
> or with L1D full of dirty lines that need to be written back?

Mostly empty, as this is flushing repeatedly without bothering to refill
L1d with anything.

On Skylake the (averaged) uops breakdown is something like

    port 0: 255
    port 1: 143
    port 2: 176
    port 3: 177
    port 4: 524
    port 5: 273
    port 6: 616
    port 7: 182

The number of port 4 dispatches is very close to the number of cache
lines, suggesting one write per line (with respective 176+177+182
port {2, 3, 7} address generations).

Furthermore, I suspect it also clears the L1i cache. For 2^20 wrmsr
executions, we have around 2^20 frontend_retired_l1i_miss events, but a
negligible amount of frontend_retired_l2_miss ones.
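For reference, the operation being measured is a write to the
IA32_FLUSH_CMD MSR (0x10b) with bit 0 set. From userspace this can be
issued via msr-tools — a sketch, requiring root, the msr kernel module,
and a CPU advertising L1D_FLUSH support (CPUID.(EAX=7,ECX=0):EDX[28]):

```shell
# Load the msr driver, then request a single L1D flush on CPU 0.
# MSR 0x10b = IA32_FLUSH_CMD; bit 0 = L1D_FLUSH. The MSR is write-only,
# so rdmsr cannot be used to inspect it.
sudo modprobe msr
sudo wrmsr -p 0 0x10b 0x1
```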
Re: [PATCH] x86: entry: flush the cache if syscall error
On Thu, Oct 11, 2018 at 8:25 PM Andy Lutomirski wrote:
> What exactly is this trying to protect against? And how many cycles
> should we expect L1D_FLUSH to take?

As far as I could measure, I got 1660 cycles per wrmsr 0x10b, 0x1 on a
Skylake chip, and 1220 cycles on a Skylake-SP.
[tip:x86/urgent] x86/vdso: Fix lsl operand order
Commit-ID:  e78e5a91456fcecaa2efbb3706572fe043766f4d
Gitweb:     https://git.kernel.org/tip/e78e5a91456fcecaa2efbb3706572fe043766f4d
Author:     Samuel Neves
AuthorDate: Sat, 1 Sep 2018 21:14:52 +0100
Committer:  Thomas Gleixner
CommitDate: Sat, 1 Sep 2018 23:01:56 +0200

x86/vdso: Fix lsl operand order

In the __getcpu function, lsl is using the wrong target and destination
registers. Luckily, the compiler tends to choose %eax for both variables,
so it has been working so far.

Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available")
Signed-off-by: Samuel Neves
Signed-off-by: Thomas Gleixner
Acked-by: Andy Lutomirski
Cc: sta...@vger.kernel.org
Link: https://lkml.kernel.org/r/20180901201452.27828-1-sne...@dei.uc.pt

---
 arch/x86/include/asm/vgtod.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index fb856c9f0449..53748541c487 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -93,7 +93,7 @@ static inline unsigned int __getcpu(void)
 	 *
 	 * If RDPID is available, use it.
 	 */
-	alternative_io ("lsl %[p],%[seg]",
+	alternative_io ("lsl %[seg],%[p]",
 			".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
 			X86_FEATURE_RDPID,
 			[p] "=a" (p), [seg] "r" (__PER_CPU_SEG));
[PATCH] x86/vdso: fix lsl operand order
In the __getcpu function, lsl was using the wrong target and destination
registers. Luckily, the compiler tends to choose %eax for both variables,
so it has been working so far.

Cc: x...@kernel.org
Cc: sta...@vger.kernel.org
Signed-off-by: Samuel Neves
---
 arch/x86/include/asm/vgtod.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index fb856c9f0449..53748541c487 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -93,7 +93,7 @@ static inline unsigned int __getcpu(void)
 	 *
 	 * If RDPID is available, use it.
 	 */
-	alternative_io ("lsl %[p],%[seg]",
+	alternative_io ("lsl %[seg],%[p]",
 			".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/rax */
 			X86_FEATURE_RDPID,
 			[p] "=a" (p), [seg] "r" (__PER_CPU_SEG));
-- 
2.17.1
[tip:x86/urgent] x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations
Commit-ID:  4596749339e06dc7a424fc08a15eded850ed78b7
Gitweb:     https://git.kernel.org/tip/4596749339e06dc7a424fc08a15eded850ed78b7
Author:     Samuel Neves <sne...@dei.uc.pt>
AuthorDate: Wed, 21 Feb 2018 20:50:36 +
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Fri, 23 Feb 2018 08:47:47 +0100

x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly
across CPU hotplug operations

Without this fix, /proc/cpuinfo will display an incorrect amount of CPU
cores, after bringing them offline and online again, as exemplified below:

  $ cat /proc/cpuinfo | grep cores
  cpu cores : 4
  cpu cores : 8
  cpu cores : 8
  cpu cores : 20
  cpu cores : 4
  cpu cores : 3
  cpu cores : 2
  cpu cores : 2

This patch fixes this by always zeroing the booted_cores variable upon
turning off a logical CPU.

Tested-by: Dou Liyang <douly.f...@cn.fujitsu.com>
Signed-off-by: Samuel Neves <sne...@dei.uc.pt>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: jgr...@suse.com
Cc: l...@kernel.org
Cc: pra...@redhat.com
Cc: vkuzn...@redhat.com
Link: http://lkml.kernel.org/r/20180221205036.5244-1-sne...@dei.uc.pt
Signed-off-by: Ingo Molnar <mi...@kernel.org>

---
 arch/x86/kernel/smpboot.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 9eee25d07586..ff99e2b6fc54 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1437,6 +1437,7 @@ static void remove_siblinginfo(int cpu)
 	cpumask_clear(topology_sibling_cpumask(cpu));
 	cpumask_clear(topology_core_cpumask(cpu));
 	c->cpu_core_id = 0;
+	c->booted_cores = 0;
 	cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
 	recompute_smt_state();
 }
[PATCH] smpboot: correctly update number of booted cores
Without this fix, /proc/cpuinfo will display an incorrect amount of CPU
cores, after bringing them offline and online again, as exemplified below:

  $ cat /proc/cpuinfo | grep cores
  cpu cores : 4
  cpu cores : 8
  cpu cores : 8
  cpu cores : 20
  cpu cores : 4
  cpu cores : 3
  cpu cores : 2
  cpu cores : 2

This patch fixes this by always zeroing the booted_cores variable upon
turning off a logical CPU.

Signed-off-by: Samuel Neves <sne...@dei.uc.pt>
---
 arch/x86/kernel/smpboot.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 9eee25d07586..ff99e2b6fc54 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1437,6 +1437,7 @@ static void remove_siblinginfo(int cpu)
 	cpumask_clear(topology_sibling_cpumask(cpu));
 	cpumask_clear(topology_core_cpumask(cpu));
 	c->cpu_core_id = 0;
+	c->booted_cores = 0;
 	cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
 	recompute_smt_state();
 }
-- 
2.14.3