Hi Venkat,

On 5/25/2026 10:07 PM, Venkat Rao Bagalkote wrote:
Greetings!!!

I am seeing an early boot kernel panic due to NULL pointer dereference on a POWER9 (pSeries) system when testing linux-next (next-20260522).


Thanks for the test.


Traces:

[    0.038567] Big cores detected but using small core scheduling
[    0.038796] BUG: Kernel NULL pointer dereference at 0x00000000
[    0.038804] Faulting instruction address: 0xc000000000e58504
[    0.038812] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.038819] LE PAGE_SIZE=64K MMU=Hash  SMP NR_CPUS=8192 NUMA pSeries
[    0.038830] Modules linked in:
[    0.038840] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.0.0-rc6+ #14 PREEMPTLAZY
[    0.038851] Hardware name: IBM,8375-42A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW950.80 (VL950_131) hv:phyp pSeries
[    0.039029] NIP [c000000000e58504] _find_first_bit+0x44/0x130
[    0.039043] LR [c000000000e58500] _find_first_bit+0x40/0x130
[    0.039076] [c0000000090e7bc0] [c00000000038b3b8] build_sched_domains+0xad8/0xe50

It seems that cpumask_first(llc_mask(i)) dereferences the NULL
pointer returned by cpu_coregroup_mask(): has_coregroup_support()
is false on this machine, so cpu_coregroup_map is never allocated
in smp_prepare_cpus(). This machine is a "shared system" VM. We
should probably let the LLC id generation fall back to the L2 id
when cpu_coregroup_mask is unavailable, which restores the
behavior before this patch. I wonder whether the following change
would help (I need help from the IBM folks on this):

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 3467f86fd78f..cf6c2e4190ab 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1042,11 +1042,6 @@ static const struct cpumask *tl_smallcore_smt_mask(struct sched_domain_topology_
 }
 #endif

-struct cpumask *cpu_coregroup_mask(int cpu)
-{
-       return per_cpu(cpu_coregroup_map, cpu);
-}
-
 static bool has_coregroup_support(void)
 {
        /* Coregroup identification not available on shared systems */
@@ -1056,6 +1051,14 @@ static bool has_coregroup_support(void)
        return coregroup_enabled;
 }

+struct cpumask *cpu_coregroup_mask(int cpu)
+{
+       if (!has_coregroup_support())
+               return cpu_l2_cache_mask(cpu);
+
+       return per_cpu(cpu_coregroup_map, cpu);
+}
+
 static int __init init_big_cores(void)
 {
        int cpu;
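
To make the intended fallback easier to reason about outside the kernel, here is a minimal user-space sketch of the same pattern. It is only an illustration, not kernel code: mask_t, l2_map, coregroup_map, coregroup_supported, coregroup_mask() and first_cpu() are hypothetical stand-ins for the per-CPU cpumasks, has_coregroup_support(), cpu_coregroup_mask() and cpumask_first(), and the 4-CPU layout is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

/* Hypothetical stand-in for struct cpumask: one bit per CPU. */
typedef unsigned long mask_t;

/* Made-up topology: CPUs 0-1 share one L2, CPUs 2-3 share another. */
static mask_t l2_masks[DEMO_NR_CPUS] = { 0x3, 0x3, 0xc, 0xc };
static mask_t *l2_map[DEMO_NR_CPUS] = {
	&l2_masks[0], &l2_masks[1], &l2_masks[2], &l2_masks[3],
};

/* Never allocated when coregroups are unsupported, so it stays NULL. */
static mask_t *coregroup_map[DEMO_NR_CPUS];

/* false models the "shared system" case from the report. */
static bool coregroup_supported;

/* Return the coregroup mask, or the L2 mask when coregroups are absent. */
static const mask_t *coregroup_mask(int cpu)
{
	if (!coregroup_supported)
		return l2_map[cpu];

	return coregroup_map[cpu];
}

/* Lowest set bit in the mask; the mask pointer must not be NULL. */
static int first_cpu(const mask_t *mask)
{
	assert(mask != NULL);

	for (int bit = 0; bit < DEMO_NR_CPUS; bit++)
		if (*mask & (1UL << bit))
			return bit;

	return DEMO_NR_CPUS;
}
```

With coregroup_supported left false, first_cpu(coregroup_mask(cpu)) yields the first CPU of the shared L2 instead of following a NULL pointer, which is the behavior the diff above is meant to restore.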








