On 09/18/2018 02:27 AM, Cédric Le Goater wrote:
Hello Corey,
I just noticed this panic message on an OpenPOWER system running 4.19.0-rc4
(plus some custom KVM PPC patches). AFAICT the system was idle and I did not
see anything similar on 4.18.
Could it be commit 2512e40e48d2 ("ipmi: Rework SMI registration failure")?
I don't think so. That fix should only affect error paths, and I looked at
the powernv driver and it didn't seem to need any adjustment for that
change.
That is, unless you are getting some other IPMI failure, but even then, I
wouldn't think so.
But I also don't see how any other change that went in recently could
cause this.
Does this happen every time? What IPMI traffic is running on this at
the time?
It looks like the user went away and something didn't get handled properly.
That code uses RCU, so it's quite possible I screwed something up.
-corey
Thanks,
C.
[42630.028933] Unable to handle kernel paging request for data at address 0xffe950008
[42630.028961] Faulting instruction address: 0xc0000000001ace28
[42630.028976] Oops: Kernel access of bad area, sig: 11 [#1]
[42630.028986] LE SMP NR_CPUS=2048 NUMA PowerNV
[42630.029018] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE
iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6
nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter
ebtables ip6table_filter ip6_tables iptable_filter loop i2c_dev sg kvm_hv kvm
ofpart powernv_flash mtd at24 ipmi_powernv regmap_i2c ibmpowernv ipmi_devintf
uio_pdrv_genirq opal_prd ipmi_msghandler uio dm_mod ip_tables ext4 mbcache jbd2
raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx
raid6_pq libcrc32c raid1 raid0 sd_mod mlx5_core ast i2c_algo_bit drm_kms_helper
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ahci libahci libata tg3
mlxfw devlink drm_panel_orientation_quirks i2c_opal i2c_core ptp pps_core
[42630.029315] CPU: 64 PID: 744 Comm: kopald Not tainted 4.19.0-rc4-xive+ #142
[42630.029355] NIP: c0000000001ace28 LR: c00800000c7a09e4 CTR: c0000000001acde0
[42630.029397] REGS: c000200e588576d0 TRAP: 0300 Not tainted (4.19.0-rc4-xive+)
[42630.029438] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 88002282 XER: 20040000
[42630.029490] CFAR: c00800000c7a931c DAR: 0000000ffe950008 DSISR: 40000000 IRQMASK: 1
[42630.029490] GPR00: c00800000c7a09e4 c000200e58857950 c00000000132c400 0000000000000000
[42630.029490] GPR04: c000200e588579e0 0000000000000001 0000000000000030 000000000000000c
[42630.029490] GPR08: 0000000ffe950000 0000000000000008 0000000000000008 c00800000c7a9308
[42630.029490] GPR12: c0000000001acde0 c000000ffffcb000 c000000000146e58 c000200e5764e740
[42630.029490] GPR16: c000000001352230 0000000000200040 0000000000000000 0000000000000001
[42630.029490] GPR20: c000000001353b00 000000010040967c 000000000000000a c000000000ec7080
[42630.029490] GPR24: c000000f91c04324 0000000000000000 5deadbeef0000100 5deadbeef0000200
[42630.029490] GPR28: 0000000000000000 c000200e588579e0 c000200e54b9c018 0000000000000001
[42630.029710] NIP [c0000000001ace28] __srcu_read_lock+0x48/0x80
[42630.029744] LR [c00800000c7a09e4] acquire_ipmi_user+0x3c/0xb0 [ipmi_msghandler]
[42630.029775] Call Trace:
[42630.029792] [c000200e58857950] [c000200e58857b30] 0xc000200e58857b30 (unreliable)
[42630.029836] [c000200e58857980] [c00800000c7a09e4] acquire_ipmi_user+0x3c/0xb0 [ipmi_msghandler]
[42630.029882] [c000200e588579c0] [c00800000c7a2434] deliver_response+0x6c/0x140 [ipmi_msghandler]
[42630.029936] [c000200e58857a00] [c00800000c7a2530] deliver_local_response+0x28/0x88 [ipmi_msghandler]
[42630.029982] [c000200e58857a30] [c00800000c7a6d34] handle_one_recv_msg+0x14c/0xf30 [ipmi_msghandler]
[42630.030027] [c000200e58857b40] [c00800000c7a7cbc] handle_new_recv_msgs+0x1a4/0x2e0 [ipmi_msghandler]
[42630.030072] [c000200e58857ba0] [c000000000120d08] tasklet_action_common.isra.4+0xb8/0x190
[42630.030116] [c000200e58857c00] [c000000000a9a8d8] __do_softirq+0x158/0x3d4
[42630.030148] [c000200e58857cf0] [c000000000120798] irq_exit+0x158/0x190
[42630.030179] [c000200e58857d20] [c0000000000af238] opal_handle_events+0x128/0x130
[42630.030213] [c000200e58857d80] [c0000000000a877c] kopald+0x9c/0x120
[42630.030245] [c000200e58857dc0] [c000000000146ff8] kthread+0x1a8/0x1b0
[42630.030278] [c000200e58857e30] [c00000000000b65c] ret_from_kernel_thread+0x5c/0x80
[42630.030319] Instruction dump:
[42630.030348] f8010010 f821ffd1 39400001 83e33188 7c691b78 7bff07e0 886d098a 994d098a
[42630.030394] e90d0030 e92931b0 7bea1f24 7d295214 <7d49402a> 394a0001 7d49412a 4be6c8f5
[42630.030442] ---[ end trace 710c52ca727b0258 ]---
[42631.033746]
[42632.033802] Kernel panic - not syncing: Fatal exception in interrupt
[42633.194571] Rebooting in 10 seconds..
[42768.041939961,5] OPAL: Reboot request...
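
One way to sanity-check the trace above is to decode the faulting
instruction and compare its effective address with the DAR. The sketch
below is only an illustration: it assumes the word in angle brackets in the
instruction dump is the faulting instruction and that it is a Power ISA
X-form instruction. The helper name decode_x_form and the variable names
are made up for this example; the numeric values are copied from the oops.

```python
# Values copied from the oops above.
FAULT_INSN = 0x7D49402A  # word marked <...> in the instruction dump
GPR = {8: 0x0000000FFE950000, 9: 0x0000000000000008}  # GPR08 and GPR09
DAR = 0x0000000FFE950008  # faulting data address reported by the kernel


def decode_x_form(word):
    """Split a 32-bit Power ISA X-form instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode
        "rt": (word >> 21) & 0x1F,      # target register
        "ra": (word >> 16) & 0x1F,      # base register
        "rb": (word >> 11) & 0x1F,      # index register
        "xo": (word >> 1) & 0x3FF,      # extended opcode
    }


fields = decode_x_form(FAULT_INSN)

# Primary opcode 31 with extended opcode 21 is ldx (load doubleword indexed).
is_ldx = fields["opcode"] == 31 and fields["xo"] == 21

# For ldx the effective address is (RA|0) + (RB).
ra_value = 0 if fields["ra"] == 0 else GPR[fields["ra"]]
effective_address = ra_value + GPR[fields["rb"]]

print("fields:", fields)
print("is ldx:", is_ldx)
print("effective address:", hex(effective_address))
print("matches DAR:", effective_address == DAR)
```

This only confirms the address arithmetic behind the fault in
__srcu_read_lock. It does not by itself show which pointer was stale or
why.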