Re: [PATCH linux-next] powerpc: init jump label early in ppc 64
On Mon, Jul 25, 2022 at 3:55 PM Michael Ellerman wrote: > > zhouzho...@gmail.com writes: > > From: Zhouyi Zhou > > > > In ppc 64, invoke jump_label_init in setup_feature_keys is too late > > because static key will be used in subroutine of early_init_devtree. > > > > So we can invoke jump_label_init earlier in early_setup. > > We can not move setup_feature_keys backward because its subroutine > > cpu_feature_keys_init depend on data structures initialized in > > early_init_devtree. > > > > Signed-off-by: Zhouyi Zhou > > --- > > Dear PPC developers > > > > I found this bug when trying to do rcutorture tests in ppc VM of > > Open Source Lab of Oregon State University. > > > > qemu-system-ppc64 -nographic -smp cores=8,threads=1 -net none -M pseries > > -nodefaults -device spapr-vscsi -serial > > file:/home/ubuntu/linux-next/tools/testing/selftests/rcutorture/res/2022.07.19-01.18.42-torture/results-rcutorture/TREE03/console.log > > -m 512 -kernel > > /home/ubuntu/linux-next/tools/testing/selftests/rcutorture/res/2022.07.19-01.18.42-torture/results-rcutorture/TREE03/vmlinux > > -append "debug_boot_weak_hash panic=-1 console=ttyS0 > > rcupdate.rcu_cpu_stall_suppress_at_boot=1 torture.disable_onoff_at_boot > > rcupdate.rcu_task_stall_timeout=3 rcutorture.onoff_interval=200 > > rcutorture.onoff_holdoff=30 rcutree.gp_preinit_delay=12 > > rcutree.gp_init_delay=3 rcutree.gp_cleanup_delay=3 rcutree.kthread_prio=2 > > threadirqs tree.use_softirq=0 rcutorture.n_barrier_cbs=4 > > rcutorture.stat_interval=15 rcutorture.shutdown_secs=420 > > rcutorture.test_no_idle_hz=1 rcutorture.verbose=1" > > > > console.log report following WARN: > > [0.00][T0] static_key_enable_cpuslocked(): static key > > '0xc2953260' used before call to jump_label_init()^M > > [0.00][T0] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 > > static_key_enable_cpuslocked+0xfc/0x120^M > > [0.00][T0] Modules linked in:^M > > [0.00][T0] CPU: 0 PID: 0 Comm: swapper Not tainted > > 5.19.0-rc5-next-20220708-dirty #131^M > > [0.00][T0] NIP: c038068c LR: c0380688 CTR: > > c0186ac0^M > > [0.00][T0] REGS: c2867930 TRAP: 0700 Not tainted > > (5.19.0-rc5-next-20220708-dirty)^M > > [0.00][T0] MSR: 80022003 CR: 24282224 > > XER: 2004^M > > [0.00][T0] CFAR: 0730 IRQMASK: 1 ^M > > [0.00][T0] GPR00: c0380688 c2867bd0 > > c2868d00 0065 ^M > > [0.00][T0] GPR04: 0001 > > 0080 000d ^M > > [0.00][T0] GPR08: > > c27fd000 000f ^M > > [0.00][T0] GPR12: c0186ac0 c2082280 > > 0003 000d ^M > > [0.00][T0] GPR16: 02cc00d0 > > c2082280 0001 ^M > > [0.00][T0] GPR20: c2080942 > > ^M > > [0.00][T0] GPR24: c10d6168 > > c20034c8 ^M > > [0.00][T0] GPR28: 0028 > > c2080942 c2953260 ^M > > [0.00][T0] NIP [c038068c] > > static_key_enable_cpuslocked+0xfc/0x120^M > > [0.00][T0] LR [c0380688] > > static_key_enable_cpuslocked+0xf8/0x120^M > > [0.00][T0] Call Trace:^M > > [0.00][T0] [c2867bd0] [c0380688] > > static_key_enable_cpuslocked+0xf8/0x120 (unreliable)^M > > [0.00][T0] [c2867c40] [c0380810] > > static_key_enable+0x30/0x50^M > > [0.00][T0] [c2867c70] [c2030314] > > setup_forced_irqthreads+0x28/0x40^M > > [0.00][T0] [c2867c90] [c2003568] > > do_early_param+0xa0/0x108^M > > [0.00][T0] [c2867d10] [c0175340] > > parse_args+0x290/0x4e0^M > > [0.00][T0] [c2867e10] [c2003c74] > > parse_early_options+0x48/0x5c^M > > [0.00][T0] [c2867e30] [c2003ce0] > > parse_early_param+0x58/0x84^M > > [0.00][T0] [c2867e60] [c2009878] > > early_init_devtree+0xd4/0x518^M > > [0.00][T0] [c2867f10] [c200aee0] > > early_setup+0xb4/0x214^M > > > > After this fix, the WARN does not show again. > > Hi Zhouyi, Thank Michael for your guidance. > > We have hit something like this previously, see the stack trace in > commit e7eb919057c3 ("powerpc/64s: Handle program checks in wrong endian > during early boot"). I am learning the fantastic work by you (git log -p e7eb919057c3), e7eb919057c3 provides a trampoline to detect and correct the wrong endian when handling the exception caused by the WARN (static key used before call to jump_label_init) > > T
Re: [PATCH linux-next] powerpc: init jump label early in ppc 64
zhouzho...@gmail.com writes: > From: Zhouyi Zhou > > In ppc 64, invoke jump_label_init in setup_feature_keys is too late > because static key will be used in subroutine of early_init_devtree. > > So we can invoke jump_label_init earlier in early_setup. > We can not move setup_feature_keys backward because its subroutine > cpu_feature_keys_init depend on data structures initialized in > early_init_devtree. > > Signed-off-by: Zhouyi Zhou > --- > Dear PPC developers > > I found this bug when trying to do rcutorture tests in ppc VM of > Open Source Lab of Oregon State University. > > qemu-system-ppc64 -nographic -smp cores=8,threads=1 -net none -M pseries > -nodefaults -device spapr-vscsi -serial > file:/home/ubuntu/linux-next/tools/testing/selftests/rcutorture/res/2022.07.19-01.18.42-torture/results-rcutorture/TREE03/console.log > -m 512 -kernel > /home/ubuntu/linux-next/tools/testing/selftests/rcutorture/res/2022.07.19-01.18.42-torture/results-rcutorture/TREE03/vmlinux > -append "debug_boot_weak_hash panic=-1 console=ttyS0 > rcupdate.rcu_cpu_stall_suppress_at_boot=1 torture.disable_onoff_at_boot > rcupdate.rcu_task_stall_timeout=3 rcutorture.onoff_interval=200 > rcutorture.onoff_holdoff=30 rcutree.gp_preinit_delay=12 > rcutree.gp_init_delay=3 rcutree.gp_cleanup_delay=3 rcutree.kthread_prio=2 > threadirqs tree.use_softirq=0 rcutorture.n_barrier_cbs=4 > rcutorture.stat_interval=15 rcutorture.shutdown_secs=420 > rcutorture.test_no_idle_hz=1 rcutorture.verbose=1" > > console.log report following WARN: > [0.00][T0] static_key_enable_cpuslocked(): static key > '0xc2953260' used before call to jump_label_init()^M > [0.00][T0] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 > static_key_enable_cpuslocked+0xfc/0x120^M > [0.00][T0] Modules linked in:^M > [0.00][T0] CPU: 0 PID: 0 Comm: swapper Not tainted > 5.19.0-rc5-next-20220708-dirty #131^M > [0.00][T0] NIP: c038068c LR: c0380688 CTR: > c0186ac0^M > [0.00][T0] REGS: c2867930 TRAP: 0700 Not tainted > (5.19.0-rc5-next-20220708-dirty)^M > [0.00][T0] MSR: 80022003 CR: 24282224 > XER: 2004^M > [0.00][T0] CFAR: 0730 IRQMASK: 1 ^M > [0.00][T0] GPR00: c0380688 c2867bd0 > c2868d00 0065 ^M > [0.00][T0] GPR04: 0001 > 0080 000d ^M > [0.00][T0] GPR08: > c27fd000 000f ^M > [0.00][T0] GPR12: c0186ac0 c2082280 > 0003 000d ^M > [0.00][T0] GPR16: 02cc00d0 > c2082280 0001 ^M > [0.00][T0] GPR20: c2080942 > ^M > [0.00][T0] GPR24: c10d6168 > c20034c8 ^M > [0.00][T0] GPR28: 0028 > c2080942 c2953260 ^M > [0.00][T0] NIP [c038068c] > static_key_enable_cpuslocked+0xfc/0x120^M > [0.00][T0] LR [c0380688] > static_key_enable_cpuslocked+0xf8/0x120^M > [0.00][T0] Call Trace:^M > [0.00][T0] [c2867bd0] [c0380688] > static_key_enable_cpuslocked+0xf8/0x120 (unreliable)^M > [0.00][T0] [c2867c40] [c0380810] > static_key_enable+0x30/0x50^M > [0.00][T0] [c2867c70] [c2030314] > setup_forced_irqthreads+0x28/0x40^M > [0.00][T0] [c2867c90] [c2003568] > do_early_param+0xa0/0x108^M > [0.00][T0] [c2867d10] [c0175340] > parse_args+0x290/0x4e0^M > [0.00][T0] [c2867e10] [c2003c74] > parse_early_options+0x48/0x5c^M > [0.00][T0] [c2867e30] [c2003ce0] > parse_early_param+0x58/0x84^M > [0.00][T0] [c2867e60] [c2009878] > early_init_devtree+0xd4/0x518^M > [0.00][T0] [c2867f10] [c200aee0] > early_setup+0xb4/0x214^M > > After this fix, the WARN does not show again. Hi Zhouyi, We have hit something like this previously, see the stack trace in commit e7eb919057c3 ("powerpc/64s: Handle program checks in wrong endian during early boot"). That was fixed incidentally/accidentally by the page_poison code changing to not use static keys so early. There was a similar case recently in the random code too, see 60e5b2886b92 ("random: do not use jump labels before they are initialized"). But I guess this will keep happening, as generic code authors expect to be able to use static keys in early_param() handlers. I think the ideal solution would be to move most early param parsing later. There's only a few parameters that need to be parsed that early in e
[PATCH linux-next] powerpc: init jump label early in ppc 64
From: Zhouyi Zhou In ppc 64, invoke jump_label_init in setup_feature_keys is too late because static key will be used in subroutine of early_init_devtree. So we can invoke jump_label_init earlier in early_setup. We can not move setup_feature_keys backward because its subroutine cpu_feature_keys_init depend on data structures initialized in early_init_devtree. Signed-off-by: Zhouyi Zhou --- Dear PPC developers I found this bug when trying to do rcutorture tests in ppc VM of Open Source Lab of Oregon State University. qemu-system-ppc64 -nographic -smp cores=8,threads=1 -net none -M pseries -nodefaults -device spapr-vscsi -serial file:/home/ubuntu/linux-next/tools/testing/selftests/rcutorture/res/2022.07.19-01.18.42-torture/results-rcutorture/TREE03/console.log -m 512 -kernel /home/ubuntu/linux-next/tools/testing/selftests/rcutorture/res/2022.07.19-01.18.42-torture/results-rcutorture/TREE03/vmlinux -append "debug_boot_weak_hash panic=-1 console=ttyS0 rcupdate.rcu_cpu_stall_suppress_at_boot=1 torture.disable_onoff_at_boot rcupdate.rcu_task_stall_timeout=3 rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30 rcutree.gp_preinit_delay=12 rcutree.gp_init_delay=3 rcutree.gp_cleanup_delay=3 rcutree.kthread_prio=2 threadirqs tree.use_softirq=0 rcutorture.n_barrier_cbs=4 rcutorture.stat_interval=15 rcutorture.shutdown_secs=420 rcutorture.test_no_idle_hz=1 rcutorture.verbose=1" console.log report following WARN: [0.00][T0] static_key_enable_cpuslocked(): static key '0xc2953260' used before call to jump_label_init()^M [0.00][T0] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xfc/0x120^M [0.00][T0] Modules linked in:^M [0.00][T0] CPU: 0 PID: 0 Comm: swapper Not tainted 5.19.0-rc5-next-20220708-dirty #131^M [0.00][T0] NIP: c038068c LR: c0380688 CTR: c0186ac0^M [0.00][T0] REGS: c2867930 TRAP: 0700 Not tainted (5.19.0-rc5-next-20220708-dirty)^M [0.00][T0] MSR: 80022003 CR: 24282224 XER: 2004^M [0.00][T0] CFAR: 0730 IRQMASK: 1 ^M [0.00][T0] GPR00: c0380688 c2867bd0 c2868d00 0065 ^M [0.00][T0] GPR04: 0001 0080 000d ^M [0.00][T0] GPR08: c27fd000 000f ^M [0.00][T0] GPR12: c0186ac0 c2082280 0003 000d ^M [0.00][T0] GPR16: 02cc00d0 c2082280 0001 ^M [0.00][T0] GPR20: c2080942 ^M [0.00][T0] GPR24: c10d6168 c20034c8 ^M [0.00][T0] GPR28: 0028 c2080942 c2953260 ^M [0.00][T0] NIP [c038068c] static_key_enable_cpuslocked+0xfc/0x120^M [0.00][T0] LR [c0380688] static_key_enable_cpuslocked+0xf8/0x120^M [0.00][T0] Call Trace:^M [0.00][T0] [c2867bd0] [c0380688] static_key_enable_cpuslocked+0xf8/0x120 (unreliable)^M [0.00][T0] [c2867c40] [c0380810] static_key_enable+0x30/0x50^M [0.00][T0] [c2867c70] [c2030314] setup_forced_irqthreads+0x28/0x40^M [0.00][T0] [c2867c90] [c2003568] do_early_param+0xa0/0x108^M [0.00][T0] [c2867d10] [c0175340] parse_args+0x290/0x4e0^M [0.00][T0] [c2867e10] [c2003c74] parse_early_options+0x48/0x5c^M [0.00][T0] [c2867e30] [c2003ce0] parse_early_param+0x58/0x84^M [0.00][T0] [c2867e60] [c2009878] early_init_devtree+0xd4/0x518^M [0.00][T0] [c2867f10] [c200aee0] early_setup+0xb4/0x214^M After this fix, the WARN does not show again. Kind Regards Zhouyi -- arch/powerpc/kernel/setup_64.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c index 2b2d0b0fbb30..bf2fb76221da 100644 --- a/arch/powerpc/kernel/setup_64.c +++ b/arch/powerpc/kernel/setup_64.c @@ -365,6 +365,9 @@ void __init early_setup(unsigned long dt_ptr) udbg_printf(" -> %s(), dt_ptr: 0x%lx\n", __func__, dt_ptr); + /* Initialise jump label because subsequent calls need it */ + jump_label_init(); + /* * Do early initialization using the flattened device * tree, such as retrieving the physical memory map or @@ -394,8 +397,15 @@ void __init early_setup(unsigned long dt_ptr) /* Apply all the dynamic patching */ apply_feature_fixups(); - setup_feature_keys(); + + /* +* All