On 09/18/2018 02:37 PM, Corey Minyard wrote:
> On 09/18/2018 02:27 AM, Cédric Le Goater wrote:
>> Hello Corey,
>>
>> I just noticed this panic message on an OpenPOWER system running 4.19.0-rc4
>> (plus some custom KVM PPC patches). AFAICT the system was idle and I did not
>> see anything similar on 4.18.
>>
>> Could it be commit 2512e40e48d2 ("ipmi: Rework SMI registration failure") ?
> 
> I don't think so. That fix should only affect error paths, really, and I
> looked at the powernv driver and it didn't seem to need any adjustment for
> that change.
> 
> That is, unless you are getting some other IPMI failure, but even then, I
> wouldn't think so.
> 
> But I also don't see how any other change that went in recently could
> cause this.
> 
> Does this happen every time? 

It happened once, some time after the first boot of 4.19-rc4.

> What IPMI traffic is running on this at the time?

Nothing other than the ping messages sent by the BMC. The kernel handles
the interrupts on behalf of the firmware. But as the system crashed and
rebooted, I couldn't check the firmware memory logs.

> It looks like the user went away and something didn't get handled properly.
> That code uses RCU, so it's quite possible I screwed something up.

I will keep you updated if it fails again.

Thanks,

C.

>> [42630.028933] Unable to handle kernel paging request for data at address 0xffe950008
>> [42630.028961] Faulting instruction address: 0xc0000000001ace28
>> [42630.028976] Oops: Kernel access of bad area, sig: 11 [#1]
>> [42630.028986] LE SMP NR_CPUS=2048 NUMA PowerNV
>> [42630.029018] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE 
>> iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 
>> nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter 
>> ebtables ip6table_filter ip6_tables iptable_filter loop i2c_dev sg kvm_hv 
>> kvm ofpart powernv_flash mtd at24 ipmi_powernv regmap_i2c ibmpowernv 
>> ipmi_devintf uio_pdrv_genirq opal_prd ipmi_msghandler uio dm_mod ip_tables 
>> ext4 mbcache jbd2 raid10 raid456 async_raid6_recov async_memcpy async_pq 
>> async_xor xor async_tx raid6_pq libcrc32c raid1 raid0 sd_mod mlx5_core ast 
>> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops 
>> ttm drm ahci libahci libata tg3 mlxfw devlink drm_panel_orientation_quirks 
>> i2c_opal i2c_core ptp pps_core
>> [42630.029315] CPU: 64 PID: 744 Comm: kopald Not tainted 4.19.0-rc4-xive+ #142
>> [42630.029355] NIP:  c0000000001ace28 LR: c00800000c7a09e4 CTR: c0000000001acde0
>> [42630.029397] REGS: c000200e588576d0 TRAP: 0300   Not tainted  (4.19.0-rc4-xive+)
>> [42630.029438] MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 88002282  XER: 20040000
>> [42630.029490] CFAR: c00800000c7a931c DAR: 0000000ffe950008 DSISR: 40000000 IRQMASK: 1
>> [42630.029490] GPR00: c00800000c7a09e4 c000200e58857950 c00000000132c400 0000000000000000
>> [42630.029490] GPR04: c000200e588579e0 0000000000000001 0000000000000030 000000000000000c
>> [42630.029490] GPR08: 0000000ffe950000 0000000000000008 0000000000000008 c00800000c7a9308
>> [42630.029490] GPR12: c0000000001acde0 c000000ffffcb000 c000000000146e58 c000200e5764e740
>> [42630.029490] GPR16: c000000001352230 0000000000200040 0000000000000000 0000000000000001
>> [42630.029490] GPR20: c000000001353b00 000000010040967c 000000000000000a c000000000ec7080
>> [42630.029490] GPR24: c000000f91c04324 0000000000000000 5deadbeef0000100 5deadbeef0000200
>> [42630.029490] GPR28: 0000000000000000 c000200e588579e0 c000200e54b9c018 0000000000000001
>> [42630.029710] NIP [c0000000001ace28] __srcu_read_lock+0x48/0x80
>> [42630.029744] LR [c00800000c7a09e4] acquire_ipmi_user+0x3c/0xb0 [ipmi_msghandler]
>> [42630.029775] Call Trace:
>> [42630.029792] [c000200e58857950] [c000200e58857b30] 0xc000200e58857b30 (unreliable)
>> [42630.029836] [c000200e58857980] [c00800000c7a09e4] acquire_ipmi_user+0x3c/0xb0 [ipmi_msghandler]
>> [42630.029882] [c000200e588579c0] [c00800000c7a2434] deliver_response+0x6c/0x140 [ipmi_msghandler]
>> [42630.029936] [c000200e58857a00] [c00800000c7a2530] deliver_local_response+0x28/0x88 [ipmi_msghandler]
>> [42630.029982] [c000200e58857a30] [c00800000c7a6d34] handle_one_recv_msg+0x14c/0xf30 [ipmi_msghandler]
>> [42630.030027] [c000200e58857b40] [c00800000c7a7cbc] handle_new_recv_msgs+0x1a4/0x2e0 [ipmi_msghandler]
>> [42630.030072] [c000200e58857ba0] [c000000000120d08] tasklet_action_common.isra.4+0xb8/0x190
>> [42630.030116] [c000200e58857c00] [c000000000a9a8d8] __do_softirq+0x158/0x3d4
>> [42630.030148] [c000200e58857cf0] [c000000000120798] irq_exit+0x158/0x190
>> [42630.030179] [c000200e58857d20] [c0000000000af238] opal_handle_events+0x128/0x130
>> [42630.030213] [c000200e58857d80] [c0000000000a877c] kopald+0x9c/0x120
>> [42630.030245] [c000200e58857dc0] [c000000000146ff8] kthread+0x1a8/0x1b0
>> [42630.030278] [c000200e58857e30] [c00000000000b65c] ret_from_kernel_thread+0x5c/0x80
>> [42630.030319] Instruction dump:
>> [42630.030348] f8010010 f821ffd1 39400001 83e33188 7c691b78 7bff07e0 886d098a 994d098a
>> [42630.030394] e90d0030 e92931b0 7bea1f24 7d295214 <7d49402a> 394a0001 7d49412a 4be6c8f5
>> [42630.030442] ---[ end trace 710c52ca727b0258 ]---
>> [42631.033746]
>> [42632.033802] Kernel panic - not syncing: Fatal exception in interrupt
>> [42633.194571] Rebooting in 10 seconds..
>> [42768.041939961,5] OPAL: Reboot request...
> 
> 
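
For anyone reading the register dump above, here is a small sketch (not
part of the original report) that cross-checks the fault address against
the registers. It assumes the word marked <...> in the instruction dump is
the faulting instruction and that it is a Power ISA X-form indexed load,
whose effective address is (RA) + (RB). The helper name decode_x_form is
made up for illustration; the constants are copied from the trace.

```python
# Values copied from the oops above.
FAULT_INSN = 0x7d49402a            # word marked <...> in the instruction dump
DAR = 0x0000000ffe950008           # faulting data address reported by the kernel
GPR = {8: 0x0000000ffe950000, 9: 0x0000000000000008}


def decode_x_form(insn):
    """Split a 32-bit Power ISA X-form instruction word into its fields."""
    return {
        "opcode": (insn >> 26) & 0x3f,  # primary opcode
        "rt": (insn >> 21) & 0x1f,      # target register
        "ra": (insn >> 16) & 0x1f,      # base register
        "rb": (insn >> 11) & 0x1f,      # index register
        "xo": (insn >> 1) & 0x3ff,      # extended opcode
    }


fields = decode_x_form(FAULT_INSN)

# Assumption: indexed load with RA != 0, so EA = (RA) + (RB), truncated to 64 bits.
ea = (GPR[fields["ra"]] + GPR[fields["rb"]]) & 0xffffffffffffffff

print(fields)
print(hex(ea), hex(DAR))
```

Under these assumptions the word decodes to primary opcode 31 with extended
opcode 21 (which I read as ldx), using GPR09 as base and GPR08 as index, and
their sum matches the DAR. This only confirms which registers formed the bad
address; it does not by itself say why they held those values.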


