Re: [PATCH v8 1/2] powerpc/64s: reimplement book3s idle code in C

2019-04-12 Thread Nicholas Piggin
Satheesh Rajendran's on April 8, 2019 5:32 pm:
> Hi,
> 
> Hit with below kernel crash during Power8 Host boot with this patch series on top
> of powerpc merge branch commit
> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=merge=6a821ffee18a6e6c0027c523fa8c958df98ca361
> 
> built with ppc64le_defconfig
> 
> Host Console log:
> [0.454666] EEH: PCI Enhanced I/O Error Handling Enabled
> [0.456524] create_dump_obj: New platform dump. ID = 0x4 Size 7457968
> [0.457627] opal-power: OPAL EPOW, DPO support detected.
> [0.457722] BUG: Unable to handle kernel data access at 0xff76184a
> [0.457733] Faulting instruction address: 0xc001a94c
> [0.457740] Oops: Kernel access of bad area, sig: 11 [#1]
> [0.457745] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
> [0.457750] Modules linked in:
> [0.457756] CPU: 58 PID: 0 Comm: swapper/58 Not tainted 5.1.0-rc2-gd0ae6c548 #1
> [0.457762] NIP:  c001a94c LR: c00a6e9c CTR: c0008000
> [0.457768] REGS: c00f272b7b50 TRAP: 0380   Not tainted  (5.1.0-rc2-gd0ae6c548)
> [0.457773] MSR:  90001033   CR: 24004222  XER:
> [0.457781] CFAR: c00a6e98 IRQMASK: 1 
> [0.457781] GPR00: c00a6e9c c00f272b7de0 0004 0006
> [0.457781] GPR04: c00a5dd4 24004222 c00f272b7d48 0001
> [0.457781] GPR08: 0002 ff761844 c00f27250c00 c3feb1676be1
> [0.457781] GPR12: 4400 c009d380 c00ffe60ff90
> [0.457781] GPR16:   c004b4d0 c004b4a0
> [0.457781] GPR20: c1526214 0800 0001 c1521b78
> [0.457781] GPR24: 003a  0008
> [0.457781] GPR28: c1526140 0001 0400 c1525ce0
> [0.457829] NIP [c001a94c] irq_set_pending_from_srr1+0x1c/0x50
> [0.457835] LR [c00a6e9c] power7_idle+0x3c/0x50
> [0.457839] Call Trace:
> [0.457843] [c00f272b7de0] [c00a6e98] power7_idle+0x38/0x50 (unreliable)
> [0.457849] [c00f272b7e00] [c00210f4] arch_cpu_idle+0x54/0x160
> [0.457856] [c00f272b7e30] [c0c47bc4] default_idle_call+0x74/0x88
> [0.457862] [c00f272b7e50] [c0158f54] do_idle+0x2f4/0x3d0
> [0.457868] [c00f272b7ec0] [c0159288] cpu_startup_entry+0x38/0x40
> [0.457874] [c00f272b7ef0] [c004dae4] start_secondary+0x654/0x680
> [0.457881] [c00f272b7f90] [c000b25c] start_secondary_prolog+0x10/0x14
> [0.457886] Instruction dump:
> [0.457890] 992d098b 7c630034 5463d97e 4e800020 6000 3c4c014d 38424dd0 7c0802a6
> [0.457898] 6000 3d22ff76 78637722 39291840 
> [0.457900] BUG: Unable to handle kernel data access at 0xff76184a
> [0.457901] <7d4918ae> 2b8a00ff 419e001c 892d098b 
> [0.457907] Faulting instruction address: 0xc001a94c
> [0.457910] BUG: Unable to handle kernel data access at 0xff76184a
> [0.457915] ---[ end trace fa7343cfd21c8798 ]---
> [0.457919] Faulting instruction address: 0xc001a94c
> [0.458961] BUG: Unable to handle kernel data access at 0xff76184a
> [0.458963] BUG: Unable to handle kernel data access at 0xff76184a
> [0.458964] BUG: Unable to handle kernel data access at 0xff76184a
> [0.458966] BUG: Unable to handle kernel data access at 0xff76184a
> [0.458968] BUG: Unable to handle kernel data access at 0xff76184a
> [0.458970] BUG: Unable to handle kernel data access at 0xff76184a
> [0.458972] Faulting instruction address: 0xc001a94c
> [0.458973] Faulting instruction address: 0xc001a94c
> [0.458974] Faulting instruction address: 0xc001a94c
> [0.458975] Faulting instruction address: 0xc001a94c
> [0.458976] Faulting instruction address: 0xc001a94c
> [0.458978] initcall __machine_initcall_powernv_pnv_init_idle_states+0x0/0xb30 returned 0 after 0 usecs
> [0.458981] calling  __machine_initcall_powernv_opal_time_init+0x0/0x150 @ 1
> [0.458982] Faulting instruction address: 0xc001a94c
> [0.459022] BUG: Unable to handle kernel data access at 0xff76184a
> [0.459040] Faulting instruction address: 0xc001a94c
> [0.459043] initcall __machine_initcall_powernv_opal_time_init+0x0/0x150 returned 0 after 0 usecs
> [0.459044] BUG: Unable to handle kernel data access at 0xff76184c
> [0.459045] Faulting instruction address: 0xc001a94c
> [0.459060] calling  __machine_initcall_powernv_rng_init+0x0/0x334 @ 1
> [0.459084] powernv-rng: Registering arch random hook.
> [

Re: [PATCH v8 1/2] powerpc/64s: reimplement book3s idle code in C

2019-04-08 Thread Satheesh Rajendran
Hi,

Hit with below kernel crash during Power8 Host boot with this patch series on top
of powerpc merge branch commit
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=merge=6a821ffee18a6e6c0027c523fa8c958df98ca361

built with ppc64le_defconfig

Host Console log:
[0.454666] EEH: PCI Enhanced I/O Error Handling Enabled
[0.456524] create_dump_obj: New platform dump. ID = 0x4 Size 7457968
[0.457627] opal-power: OPAL EPOW, DPO support detected.
[0.457722] BUG: Unable to handle kernel data access at 0xff76184a
[0.457733] Faulting instruction address: 0xc001a94c
[0.457740] Oops: Kernel access of bad area, sig: 11 [#1]
[0.457745] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[0.457750] Modules linked in:
[0.457756] CPU: 58 PID: 0 Comm: swapper/58 Not tainted 5.1.0-rc2-gd0ae6c548 #1
[0.457762] NIP:  c001a94c LR: c00a6e9c CTR: c0008000
[0.457768] REGS: c00f272b7b50 TRAP: 0380   Not tainted  (5.1.0-rc2-gd0ae6c548)
[0.457773] MSR:  90001033   CR: 24004222  XER:
[0.457781] CFAR: c00a6e98 IRQMASK: 1 
[0.457781] GPR00: c00a6e9c c00f272b7de0 0004 0006
[0.457781] GPR04: c00a5dd4 24004222 c00f272b7d48 0001
[0.457781] GPR08: 0002 ff761844 c00f27250c00 c3feb1676be1
[0.457781] GPR12: 4400 c009d380 c00ffe60ff90
[0.457781] GPR16:   c004b4d0 c004b4a0
[0.457781] GPR20: c1526214 0800 0001 c1521b78
[0.457781] GPR24: 003a  0008
[0.457781] GPR28: c1526140 0001 0400 c1525ce0
[0.457829] NIP [c001a94c] irq_set_pending_from_srr1+0x1c/0x50
[0.457835] LR [c00a6e9c] power7_idle+0x3c/0x50
[0.457839] Call Trace:
[0.457843] [c00f272b7de0] [c00a6e98] power7_idle+0x38/0x50 (unreliable)
[0.457849] [c00f272b7e00] [c00210f4] arch_cpu_idle+0x54/0x160
[0.457856] [c00f272b7e30] [c0c47bc4] default_idle_call+0x74/0x88
[0.457862] [c00f272b7e50] [c0158f54] do_idle+0x2f4/0x3d0
[0.457868] [c00f272b7ec0] [c0159288] cpu_startup_entry+0x38/0x40
[0.457874] [c00f272b7ef0] [c004dae4] start_secondary+0x654/0x680
[0.457881] [c00f272b7f90] [c000b25c] start_secondary_prolog+0x10/0x14
[0.457886] Instruction dump:
[0.457890] 992d098b 7c630034 5463d97e 4e800020 6000 3c4c014d 38424dd0 7c0802a6
[0.457898] 6000 3d22ff76 78637722 39291840 
[0.457900] BUG: Unable to handle kernel data access at 0xff76184a
[0.457901] <7d4918ae> 2b8a00ff 419e001c 892d098b 
[0.457907] Faulting instruction address: 0xc001a94c
[0.457910] BUG: Unable to handle kernel data access at 0xff76184a
[0.457915] ---[ end trace fa7343cfd21c8798 ]---
[0.457919] Faulting instruction address: 0xc001a94c
[0.458961] BUG: Unable to handle kernel data access at 0xff76184a
[0.458963] BUG: Unable to handle kernel data access at 0xff76184a
[0.458964] BUG: Unable to handle kernel data access at 0xff76184a
[0.458966] BUG: Unable to handle kernel data access at 0xff76184a
[0.458968] BUG: Unable to handle kernel data access at 0xff76184a
[0.458970] BUG: Unable to handle kernel data access at 0xff76184a
[0.458972] Faulting instruction address: 0xc001a94c
[0.458973] Faulting instruction address: 0xc001a94c
[0.458974] Faulting instruction address: 0xc001a94c
[0.458975] Faulting instruction address: 0xc001a94c
[0.458976] Faulting instruction address: 0xc001a94c
[0.458978] initcall __machine_initcall_powernv_pnv_init_idle_states+0x0/0xb30 returned 0 after 0 usecs
[0.458981] calling  __machine_initcall_powernv_opal_time_init+0x0/0x150 @ 1
[0.458982] Faulting instruction address: 0xc001a94c
[0.459022] BUG: Unable to handle kernel data access at 0xff76184a
[0.459040] Faulting instruction address: 0xc001a94c
[0.459043] initcall __machine_initcall_powernv_opal_time_init+0x0/0x150 returned 0 after 0 usecs
[0.459044] BUG: Unable to handle kernel data access at 0xff76184c
[0.459045] Faulting instruction address: 0xc001a94c
[0.459060] calling  __machine_initcall_powernv_rng_init+0x0/0x334 @ 1
[0.459084] powernv-rng: Registering arch random hook.
[0.459141] BUG: Unable to handle kernel data access at 0xff76184a
[0.459147] Faulting instruction address: 0xc001a94c
[0.459191] BUG: Unable to handle kernel data access at 0xff76184a
[0.459199] Faulting