When hot-adding a CPU and onlining it, the following WARN_ON() message
is shown:

[  772.891448] ------------[ cut here ]------------
[  772.896624] WARNING: CPU: 58 PID: 15169 at arch/x86/kernel/cpu/perf_event_intel_cqm.c:1268 intel_cqm_cpu_prepare+0x88/0x90()
[  772.909167] Modules linked in:
[  772.995134] CPU: 58 PID: 15169
[  773.016633]  0000000000000000 0000000092fb60ed ffff88104febbba8 ffffffff8167b5fa
[  773.024789]  0000000000000000 0000000000000000 ffff88104febbbe8 ffffffff810819ea
[  773.033119]  ffff88103be60000 ffff8c0fbc7ca020 ffffffff819fadf0 000000000000008f
[  773.041461] Call Trace:
[  773.044402]  [<ffffffff8167b5fa>] dump_stack+0x45/0x57
[  773.050160]  [<ffffffff810819ea>] warn_slowpath_common+0x8a/0xc0
[  773.056888]  [<ffffffff81081b1a>] warn_slowpath_null+0x1a/0x20
[  773.063426]  [<ffffffff810365f8>] intel_cqm_cpu_prepare+0x88/0x90
[  773.070253]  [<ffffffff81036732>] intel_cqm_cpu_notifier+0x42/0x160
[  773.077271]  [<ffffffff810a0d3d>] notifier_call_chain+0x4d/0x80
[  773.083901]  [<ffffffff810a0e4e>] __raw_notifier_call_chain+0xe/0x10
[  773.091007]  [<ffffffff81081ef8>] _cpu_up+0xe8/0x190
[  773.096555]  [<ffffffff8108201a>] cpu_up+0x7a/0xa0
[  773.101910]  [<ffffffff816701b0>] cpu_subsys_online+0x40/0x90
[  773.108332]  [<ffffffff8143d777>] device_online+0x67/0x90
[  773.114368]  [<ffffffff8143d82a>] online_store+0x8a/0xa0
[  773.120305]  [<ffffffff8143aab8>] dev_attr_store+0x18/0x30
[  773.126437]  [<ffffffff8127224a>] sysfs_kf_write+0x3a/0x50
[  773.132560]  [<ffffffff812718d0>] kernfs_fop_write+0x120/0x170
[  773.139078]  [<ffffffff811f7657>] __vfs_write+0x37/0x100
[  773.145019]  [<ffffffff811fa398>] ? __sb_start_write+0x58/0x110
[  773.151635]  [<ffffffff8129d7ed>] ? security_file_permission+0x3d/0xc0
[  773.158932]  [<ffffffff811f7d59>] vfs_write+0xa9/0x190
[  773.164674]  [<ffffffff810234e6>] ? do_audit_syscall_entry+0x66/0x70
[  773.171776]  [<ffffffff811f8b55>] SyS_write+0x55/0xc0
[  773.177423]  [<ffffffff810672f0>] ? do_page_fault+0x30/0x80
[  773.183654]  [<ffffffff8168232e>] entry_SYSCALL_64_fastpath+0x12/0x71
[  773.190843] ---[ end trace e6219d24386873bd ]---
[  773.196573] smpboot: Booting Node 7 Processor 143 APIC 0x1f7
[  773.221241] microcode: CPU143 sig=0x306f3, pf=0x80, revision=0x9
[  773.228005] Will online and init hotplugged CPU: 143

Here is the root cause of the issue:
When intel_cqm_cpu_prepare() is called at the CPU_UP_PREPARE notification,
the function checks that x86_cache_max_rmid is the same as cqm_max_rmid,
as follows:

static void intel_cqm_cpu_prepare(unsigned int cpu)
{
...
        WARN_ON(c->x86_cache_max_rmid != cqm_max_rmid);

But x86_cache_max_rmid of the hot-added CPU is not set yet, because it is
set in get_cpu_cap(), which is called after the CPU_UP_PREPARE notification.

So when onlining a hot-added CPU, the WARN_ON() is always triggered.

To fix the issue, move the WARN_ON()s from intel_cqm_cpu_prepare() to
cqm_pick_event_reader(), which is called at the CPU_STARTING notification.
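
The ordering problem can be illustrated with a small user-space model. This
is only a sketch, not kernel code: struct cpuinfo_model, the model_*() and
online_check_*() helpers, and the value 55 are hypothetical stand-ins. It
assumes that the field of a freshly hot-added CPU is still zero at
CPU_UP_PREPARE, and that CPU_STARTING runs after get_cpu_cap(), which is
what the fix relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one struct cpuinfo_x86 field of interest. */
struct cpuinfo_model {
	unsigned int x86_cache_max_rmid;	/* assumed zero until filled in */
};

/* Arbitrary example value; the real one is set up by the CQM driver. */
static const unsigned int cqm_max_rmid = 55;

/* Models get_cpu_cap(): fills in the per-CPU capability field. */
static void model_get_cpu_cap(struct cpuinfo_model *c)
{
	c->x86_cache_max_rmid = cqm_max_rmid;
}

/* Returns true where the kernel's WARN_ON() would fire. */
static bool model_check_warns(const struct cpuinfo_model *c)
{
	return c->x86_cache_max_rmid != cqm_max_rmid;
}

/* Before the patch: the check runs at CPU_UP_PREPARE. */
static bool online_check_at_prepare(void)
{
	struct cpuinfo_model c = { 0 };
	bool warned = model_check_warns(&c);	/* field not set yet */

	model_get_cpu_cap(&c);			/* happens too late */
	return warned;
}

/* After the patch: the check runs at CPU_STARTING. */
static bool online_check_at_starting(void)
{
	struct cpuinfo_model c = { 0 };

	model_get_cpu_cap(&c);			/* field is populated first */
	return model_check_warns(&c);
}

/* Asserts the behaviour of both orderings in this model. */
static void model_selftest(void)
{
	assert(online_check_at_prepare());	/* spurious warning */
	assert(!online_check_at_starting());	/* no warning */
}
```

In this model only the position of the check relative to model_get_cpu_cap()
differs between the two paths, which mirrors the change made by the patch.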

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Vikas Shivappa <vikas.shiva...@intel.com>
Cc: Kanaka Juvva <kanaka.d.ju...@intel.com>
Cc: Matt Fleming <matt.flem...@intel.com>

---
 arch/x86/kernel/cpu/perf_event_intel_cqm.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index 63eb68b..6196d3e 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -1244,9 +1244,13 @@ static struct pmu intel_cqm_pmu = {
 
 static inline void cqm_pick_event_reader(int cpu)
 {
+       struct cpuinfo_x86 *c = &cpu_data(cpu);
        int phys_id = topology_physical_package_id(cpu);
        int i;
 
+       WARN_ON(c->x86_cache_max_rmid != cqm_max_rmid);
+       WARN_ON(c->x86_cache_occ_scale != cqm_l3_scale);
+
        for_each_cpu(i, &cqm_cpumask) {
                if (phys_id == topology_physical_package_id(i))
                        return; /* already got reader for this socket */
@@ -1258,14 +1262,10 @@ static inline void cqm_pick_event_reader(int cpu)
 static void intel_cqm_cpu_prepare(unsigned int cpu)
 {
        struct intel_pqr_state *state = &per_cpu(pqr_state, cpu);
-       struct cpuinfo_x86 *c = &cpu_data(cpu);
 
        state->rmid = 0;
        state->closid = 0;
        state->rmid_usecnt = 0;
-
-       WARN_ON(c->x86_cache_max_rmid != cqm_max_rmid);
-       WARN_ON(c->x86_cache_occ_scale != cqm_l3_scale);
 }
 
 static void intel_cqm_cpu_exit(unsigned int cpu)
-- 
1.8.3.1
