Greetings!!!

I hit a kernel BUG (kernel/sched/core.c:7512, in preempt_schedule_irq()) on a linux-next kernel running on ppc64le (Power11 LPAR). It was observed once in CI (Avocado tests, while ppc64_cpu_test.py test_smt_loop was running), and I have not been able to reproduce it yet.

Architecture: ppc64le (Power11, pSeries)
Kernel: 7.1.0-rc5-next-20260529
Config: PREEMPT(lazy)
CPUs: large system (NR_CPUS=8192)


So far, I have not reproduced the crash, but I am trying to stress similar conditions using:

- parallel read workloads (fio / dd)
- memory pressure
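
The parallel-read part of that stress can be sketched in Python as below. This is only an illustration, not the fio/dd commands used in CI: the helper names read_file and parallel_read, the worker count, and the block size are all hypothetical, and the script is not known to trigger the BUG. The faulting path in the trace is a read() copying page-cache data to user space, so the sketch simply issues many concurrent read() calls against one file.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path, block_size=1 << 16):
    """Read `path` sequentially and return the number of bytes read."""
    total = 0
    # buffering=0 opens an unbuffered file object, so each read() call
    # goes to the kernel instead of being served from a Python buffer.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total


def parallel_read(path, workers=8, block_size=1 << 16):
    """Run `workers` concurrent sequential readers over the same file."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(read_file, path, block_size) for _ in range(workers)]
        # Sum of bytes read by all workers.
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    # Hypothetical scratch file; a real run would point at a large file on ext4.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(4 << 20))
        path = tmp.name
    try:
        print(parallel_read(path, workers=8))
    finally:
        os.remove(path)
```

Threads are used because CPython releases the GIL during file reads, so the readers overlap in the kernel. Memory pressure and the SMT/CPU hotplug loop from the CI test would have to be run separately alongside it.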


Traces:

 (5/8) /home/upstreamci/avocado-fvt-wrapper/tests/avocado-misc-tests/cpu/ppc64_cpu_test.py:PPC64Test.test_smt_loop;run-run_type-upstream-9cfe: STARTED
[ 1885.176400] crash hp: kexec_trylock() failed, kdump image may be inaccurate
[ 1885.296164] crash hp: kexec_trylock() failed, kdump image may be inaccurate
[ 1885.386120] crash hp: kexec_trylock() failed, kdump image may be inaccurate
[ 1885.556134] crash hp: kexec_trylock() failed, kdump image may be inaccurate
[ 1886.576119] crash hp: kexec_trylock() failed, kdump image may be inaccurate
[ 1886.806060] crash hp: kexec_trylock() failed, kdump image may be inaccurate
[ 1887.026051] crash hp: kexec_trylock() failed, kdump image may be inaccurate
[ 1887.456075] ------------[ cut here ]------------
[ 1887.456101] kernel BUG at kernel/sched/core.c:7512!
[ 1887.456107] Oops: Exception in kernel mode, sig: 5 [#1]
[ 1887.456111] LE PAGE_SIZE=4K MMU=Radix  SMP NR_CPUS=8192 NUMA pSeries
[ 1887.456116] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding tls ip_set rfkill nf_tables fsdev_dax kmem device_dax pseries_rng vmx_crypto dax_pmem fuse ext4 crc16 mbcache jbd2 sd_mod nd_pmem papr_scm sg libnvdimm ibmvscsi ibmveth scsi_transport_srp pseries_wdt
[ 1887.456173] CPU: 28 UID: 0 PID: 85305 Comm: kexec Not tainted 7.1.0-rc5-next-20260529 #1 PREEMPT(lazy)
[ 1887.456180] Hardware name: IBM,9080-HEX Power11 (architected) 0x820200 0xf000007 of:IBM,FW1110.01 (NH1110_069) hv:phyp pSeries
[ 1887.456185] NIP:  c0000000013a8e8c LR: c0000000003483bc CTR: 0000000000000000
[ 1887.456190] REGS: c000000069f03070 TRAP: 0700   Not tainted (7.1.0-rc5-next-20260529)
[ 1887.456195] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24428222  XER: 0000005a
[ 1887.456208] CFAR: c0000000003483b8 IRQMASK: 0
[ 1887.456208] GPR00: c0000000003483bc c000000069f03330 c000000001a82100 c000000069f033e0
[ 1887.456208] GPR04: 0000000000000000 0000000000000001 0000000000000001 c000000006dd3b00
[ 1887.456208] GPR08: ffffffffffffff00 0000000000000001 0000000000000000 0000000024428220
[ 1887.456208] GPR12: 0000000000000300 c000000effdbef00 0000000000000000 0000000000000000
[ 1887.456208] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1887.456208] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1887.456208] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1887.456208] GPR28: 0000000000000000 0000000000000000 0000000000000000 c000000069f033e0
[ 1887.456265] NIP [c0000000013a8e8c] preempt_schedule_irq+0x44/0x118
[ 1887.456274] LR [c0000000003483bc] dynamic_irqentry_exit_cond_resched+0x40/0x1a4
[ 1887.456282] Call Trace:
[ 1887.456284] [c000000069f03360] [c0000000003483bc] dynamic_irqentry_exit_cond_resched+0x40/0x1a4
[ 1887.456291] [c000000069f03380] [c00000000014f3bc] do_page_fault+0xc0/0x104
[ 1887.456298] [c000000069f033b0] [c000000000008be0] data_access_common_virt+0x210/0x220
[ 1887.456306] ---- interrupt: 300 at __copy_tofrom_user_base+0xac/0x5a4
[ 1887.456313] NIP:  c00000000017fc38 LR: c000000000aaa684 CTR: 0000000000000000
[ 1887.456317] REGS: c000000069f033e0 TRAP: 0300   Not tainted (7.1.0-rc5-next-20260529)
[ 1887.456322] MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 24428220  XER: 2004005a
[ 1887.456334] CFAR: c00000000017fc34 DAR: 00003fff879a8000 DSISR: 42000000 IRQMASK: 0
[ 1887.456334] GPR00: 0000000000000000 c000000069f036a0 c000000001a82100 00003fff879a8000
[ 1887.456334] GPR04: c0000000bb314ff0 0000000000001000 69f0000606480600 0200c4080368f028
[ 1887.456334] GPR08: 09036af00005d9c4 0600000200e80803 0000000000000000 0000000000000030
[ 1887.456334] GPR12: 0000000000000040 c000000effdbef00 0000000000000000 000000000000000e
[ 1887.456334] GPR16: 0000000004a00000 000000000000001f c000000069f038a0 c00000006e73e500
[ 1887.456334] GPR20: c00000006f0ff6a8 0000000000000000 c00000006f0ff540 0000000000000001
[ 1887.456334] GPR24: 000000001816ce60 c0000000bb314000 c000000002e48730 c000000069f03a30
[ 1887.456334] GPR28: c0000000bb314000 00003fff879a7010 0000000000000010 0000000000001000
[ 1887.456393] NIP [c00000000017fc38] __copy_tofrom_user_base+0xac/0x5a4
[ 1887.456399] LR [c000000000aaa684] raw_copy_to_user+0x12c/0x314
[ 1887.456405] ---- interrupt: 300
[ 1887.456408] [c000000069f036a0] [c000000000aaa5f4] raw_copy_to_user+0x9c/0x314 (unreliable)
[ 1887.456416] [c000000069f036e0] [c000000000aacd08] _copy_to_iter+0xe4/0x79c
[ 1887.456423] [c000000069f037a0] [c000000000ab01ec] copy_page_to_iter+0xd4/0x1a4
[ 1887.456429] [c000000069f037f0] [c0000000005ddc34] filemap_read+0x420/0x4f0
[ 1887.456436] [c000000069f039c0] [c0080000043443e0] ext4_file_read_iter+0x78/0x31c [ext4]
[ 1887.456517] [c000000069f03a10] [c000000000796498] vfs_read+0x2a8/0x3c8
[ 1887.456524] [c000000069f03ac0] [c00000000079726c] ksys_read+0x88/0x140
[ 1887.456530] [c000000069f03b10] [c000000000032f98] system_call_exception+0x198/0x4e0
[ 1887.456537] [c000000069f03e30] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
[ 1887.456544] ---- interrupt: 3000 at 0x3fff9b133cf4
[ 1887.456549] NIP:  00003fff9b133cf4 LR: 00003fff9b133cf4 CTR: 0000000000000000
[ 1887.456554] REGS: c000000069f03e60 TRAP: 3000   Not tainted (7.1.0-rc5-next-20260529)
[ 1887.456558] MSR:  800000000000f033 <SF,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 44424402  XER: 00000000
[ 1887.456572] IRQMASK: 0
[ 1887.456572] GPR00: 0000000000000003 00003fffe5fb4190 0000000105087f00 0000000000000003
[ 1887.456572] GPR04: 00003fff82e93010 000000001816ce60 0000000000000022 0000000000000000
[ 1887.456572] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1887.456572] GPR12: 0000000000000000 00003fff9b4cd860 000000010507f588 0000000000000000
[ 1887.456572] GPR16: ffffffffffffffff 0000000000000000 0000000000000006 0000000000000000
[ 1887.456572] GPR20: 0000000000000001 00003fff9b23039c 00003fff9b2303a0 00003fffe5fb5ee7
[ 1887.456572] GPR24: 0000000000000000 0000000000000000 00003fffe5fb5ee7 00003fffe5fb42d0
[ 1887.456572] GPR28: 0000000000000003 00003fff82e93010 000000001816ce60 0000000000000000
[ 1887.456626] NIP [00003fff9b133cf4] 0x3fff9b133cf4
[ 1887.456630] LR [00003fff9b133cf4] 0x3fff9b133cf4
[ 1887.456634] ---- interrupt: 3000
[ 1887.456637] Code: fbe1fff8 e92d0128 f8010010 f821ffd1 81490000 39200001 2c0a0000 40820014 892d0152 552907fe 7d290034 5529d97e <0b090000> 60000000 3bc00000 ebed0128
[ 1887.456657] ---[ end trace 0000000000000000 ]---


If you fix this, please add the tag below.

Reported-by: Venkat Rao Bagalkote <[email protected]>


Regards,

Venkat.


