Re: [PATCH v3 00/51] cpuidle,rcu: Clean up the mess
On Tue, Jan 17, 2023 at 02:21:40PM +, Sudeep Holla wrote: > On Tue, Jan 17, 2023 at 01:16:21PM +, Mark Rutland wrote: > > On Tue, Jan 17, 2023 at 11:26:29AM +0100, Peter Zijlstra wrote: > > > On Mon, Jan 16, 2023 at 04:59:04PM +, Mark Rutland wrote: > > > > > > > I'm sorry to have to bear some bad news on that front. :( > > > > > > Moo, something had to give.. > > > > > > > > > > IIUC what's happenign here is the PSCI cpuidle driver has entered idle > > > > and RCU > > > > is no longer watching when arm64's cpu_suspend() manipulates DAIF. Our > > > > local_daif_*() helpers poke lockdep and tracing, hence the call to > > > > trace_hardirqs_off() and the RCU usage. > > > > > > Right, strictly speaking not needed at this point, IRQs should have been > > > traced off a long time ago. > > > > True, but there are some other calls around here that *might* end up > > invoking > > RCU stuff (e.g. the MTE code). > > > > That all needs a noinstr cleanup too, which I'll sort out as a follow-up. > > > > > > I think we need RCU to be watching all the way down to cpu_suspend(), > > > > and it's > > > > cpu_suspend() that should actually enter/exit idle context. That and we > > > > need to > > > > make cpu_suspend() and the low-level PSCI invocation noinstr. > > > > > > > > I'm not sure whether 32-bit will have a similar issue or not. > > > > > > I'm not seeing 32bit or Risc-V have similar issues here, but who knows, > > > maybe I missed somsething. > > > > I reckon if they do, the core changes here give us the infrastructure to fix > > them if/when we get reports. > > > > > In any case, the below ought to cure the ARM64 case and remove that last > > > known RCU_NONIDLE() user as a bonus. > > > > The below works for me testing on a Juno R1 board with PSCI, using > > defconfig + > > CONFIG_PROVE_LOCKING=y + CONFIG_DEBUG_LOCKDEP=y + > > CONFIG_DEBUG_ATOMIC_SLEEP=y. > > I'm not sure how to test the LPI / FFH part, but it looks good to me. > > > > FWIW: > > > > Reviewed-by: Mark Rutland > > Tested-by: Mark Rutland > > > > Sudeep, would you be able to give the LPI/FFH side a spin with the kconfig > > options above? > > > > Not sure if I have messed up something in my mail setup, but I did reply > earlier. Sorry, that was my bad; I had been drafting my reply for a while and forgot to re-check prior to sending. > I did test both DT/cpuidle-psci driver and ACPI/LPI+FFH driver > with the fix Peter sent. I was seeing same splat as you in both DT and > ACPI boot which the patch fixed it. I used the same config as described by > you above. Perfect; thanks! Mark. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 00/51] cpuidle,rcu: Clean up the mess
On Tue, Jan 17, 2023 at 11:26:29AM +0100, Peter Zijlstra wrote: > On Mon, Jan 16, 2023 at 04:59:04PM +, Mark Rutland wrote: > > > I'm sorry to have to bear some bad news on that front. :( > > Moo, something had to give.. > > > > IIUC what's happenign here is the PSCI cpuidle driver has entered idle and > > RCU > > is no longer watching when arm64's cpu_suspend() manipulates DAIF. Our > > local_daif_*() helpers poke lockdep and tracing, hence the call to > > trace_hardirqs_off() and the RCU usage. > > Right, strictly speaking not needed at this point, IRQs should have been > traced off a long time ago. True, but there are some other calls around here that *might* end up invoking RCU stuff (e.g. the MTE code). That all needs a noinstr cleanup too, which I'll sort out as a follow-up. > > I think we need RCU to be watching all the way down to cpu_suspend(), and > > it's > > cpu_suspend() that should actually enter/exit idle context. That and we > > need to > > make cpu_suspend() and the low-level PSCI invocation noinstr. > > > > I'm not sure whether 32-bit will have a similar issue or not. > > I'm not seeing 32bit or Risc-V have similar issues here, but who knows, > maybe I missed somsething. I reckon if they do, the core changes here give us the infrastructure to fix them if/when we get reports. > In any case, the below ought to cure the ARM64 case and remove that last > known RCU_NONIDLE() user as a bonus. The below works for me testing on a Juno R1 board with PSCI, using defconfig + CONFIG_PROVE_LOCKING=y + CONFIG_DEBUG_LOCKDEP=y + CONFIG_DEBUG_ATOMIC_SLEEP=y. I'm not sure how to test the LPI / FFH part, but it looks good to me. FWIW: Reviewed-by: Mark Rutland Tested-by: Mark Rutland Sudeep, would you be able to give the LPI/FFH side a spin with the kconfig options above? Thanks, Mark. > > --- > diff --git a/arch/arm64/kernel/cpuidle.c b/arch/arm64/kernel/cpuidle.c > index 41974a1a229a..42e19fff40ee 100644 > --- a/arch/arm64/kernel/cpuidle.c > +++ b/arch/arm64/kernel/cpuidle.c > @@ -67,10 +67,10 @@ __cpuidle int acpi_processor_ffh_lpi_enter(struct > acpi_lpi_state *lpi) > u32 state = lpi->address; > > if (ARM64_LPI_IS_RETENTION_STATE(lpi->arch_flags)) > - return > CPU_PM_CPU_IDLE_ENTER_RETENTION_PARAM(psci_cpu_suspend_enter, > + return > CPU_PM_CPU_IDLE_ENTER_RETENTION_PARAM_RCU(psci_cpu_suspend_enter, > lpi->index, state); > else > - return CPU_PM_CPU_IDLE_ENTER_PARAM(psci_cpu_suspend_enter, > + return CPU_PM_CPU_IDLE_ENTER_PARAM_RCU(psci_cpu_suspend_enter, >lpi->index, state); > } > #endif > diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c > index e7163f31f716..0fbdf5fe64d8 100644 > --- a/arch/arm64/kernel/suspend.c > +++ b/arch/arm64/kernel/suspend.c > @@ -4,6 +4,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -104,6 +105,10 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned > long)) >* From this point debug exceptions are disabled to prevent >* updates to mdscr register (saved and restored along with >* general purpose registers) from kernel debuggers. > + * > + * Strictly speaking the trace_hardirqs_off() here is superfluous, > + * hardirqs should be firmly off by now. This really ought to use > + * something like raw_local_daif_save(). >*/ > flags = local_daif_save(); > > @@ -120,6 +125,8 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned > long)) >*/ > arm_cpuidle_save_irq_context(&context); > > + ct_cpuidle_enter(); > + > if (__cpu_suspend_enter(&state)) { > /* Call the suspend finisher */ > ret = fn(arg); > @@ -133,8 +140,11 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned > long)) >*/ > if (!ret) > ret = -EOPNOTSUPP; > + > + ct_cpuidle_exit(); > } else { > - RCU_NONIDLE(__cpu_suspend_exit()); > + ct_cpuidle_exit(); > + __cpu_suspend_exit(); > } > > arm_cpuidle_restore_irq_context(&context); > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c > index 4fc4e0381944..312a34ef28dc 100644 > --- a/drivers/cpuidle/cpuidle-psci.c > +++ b/drivers/cpuidle/cpuidle-psci.c > @@ -69,16 +69,12 @@ static __cpuidle int > __psci_enter_domain_idle_state(struct cpuidle_device *dev, > else > pm_runtime_put_sync_suspend(pd_dev); > > - ct_cpuidle_enter(); > - > state = psci_get_domain_state(); > if (!state) > state = states[idx]; > > ret = psci_cpu_suspend_enter(state) ? -1 : idx; > > - ct_cpuidle_exit(); > - > if (s2idle) > dev_pm_genpd_resume(pd_dev); > else > @@ -192,7 +188,7 @@ static _
Re: [PATCH v3 00/51] cpuidle,rcu: Clean up the mess
On Mon, Jan 16, 2023 at 04:59:04PM +, Mark Rutland wrote: > I'm sorry to have to bear some bad news on that front. :( Moo, something had to give.. > IIUC what's happenign here is the PSCI cpuidle driver has entered idle and RCU > is no longer watching when arm64's cpu_suspend() manipulates DAIF. Our > local_daif_*() helpers poke lockdep and tracing, hence the call to > trace_hardirqs_off() and the RCU usage. Right, strictly speaking not needed at this point, IRQs should have been traced off a long time ago. > I think we need RCU to be watching all the way down to cpu_suspend(), and it's > cpu_suspend() that should actually enter/exit idle context. That and we need > to > make cpu_suspend() and the low-level PSCI invocation noinstr. > > I'm not sure whether 32-bit will have a similar issue or not. I'm not seeing 32bit or Risc-V have similar issues here, but who knows, maybe I missed somsething. In any case, the below ought to cure the ARM64 case and remove that last known RCU_NONIDLE() user as a bonus. --- diff --git a/arch/arm64/kernel/cpuidle.c b/arch/arm64/kernel/cpuidle.c index 41974a1a229a..42e19fff40ee 100644 --- a/arch/arm64/kernel/cpuidle.c +++ b/arch/arm64/kernel/cpuidle.c @@ -67,10 +67,10 @@ __cpuidle int acpi_processor_ffh_lpi_enter(struct acpi_lpi_state *lpi) u32 state = lpi->address; if (ARM64_LPI_IS_RETENTION_STATE(lpi->arch_flags)) - return CPU_PM_CPU_IDLE_ENTER_RETENTION_PARAM(psci_cpu_suspend_enter, + return CPU_PM_CPU_IDLE_ENTER_RETENTION_PARAM_RCU(psci_cpu_suspend_enter, lpi->index, state); else - return CPU_PM_CPU_IDLE_ENTER_PARAM(psci_cpu_suspend_enter, + return CPU_PM_CPU_IDLE_ENTER_PARAM_RCU(psci_cpu_suspend_enter, lpi->index, state); } #endif diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c index e7163f31f716..0fbdf5fe64d8 100644 --- a/arch/arm64/kernel/suspend.c +++ b/arch/arm64/kernel/suspend.c @@ -4,6 +4,7 @@ #include #include #include +#include #include #include #include @@ -104,6 +105,10 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long)) * From this point debug exceptions are disabled to prevent * updates to mdscr register (saved and restored along with * general purpose registers) from kernel debuggers. +* +* Strictly speaking the trace_hardirqs_off() here is superfluous, +* hardirqs should be firmly off by now. This really ought to use +* something like raw_local_daif_save(). */ flags = local_daif_save(); @@ -120,6 +125,8 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long)) */ arm_cpuidle_save_irq_context(&context); + ct_cpuidle_enter(); + if (__cpu_suspend_enter(&state)) { /* Call the suspend finisher */ ret = fn(arg); @@ -133,8 +140,11 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long)) */ if (!ret) ret = -EOPNOTSUPP; + + ct_cpuidle_exit(); } else { - RCU_NONIDLE(__cpu_suspend_exit()); + ct_cpuidle_exit(); + __cpu_suspend_exit(); } arm_cpuidle_restore_irq_context(&context); diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c index 4fc4e0381944..312a34ef28dc 100644 --- a/drivers/cpuidle/cpuidle-psci.c +++ b/drivers/cpuidle/cpuidle-psci.c @@ -69,16 +69,12 @@ static __cpuidle int __psci_enter_domain_idle_state(struct cpuidle_device *dev, else pm_runtime_put_sync_suspend(pd_dev); - ct_cpuidle_enter(); - state = psci_get_domain_state(); if (!state) state = states[idx]; ret = psci_cpu_suspend_enter(state) ? -1 : idx; - ct_cpuidle_exit(); - if (s2idle) dev_pm_genpd_resume(pd_dev); else @@ -192,7 +188,7 @@ static __cpuidle int psci_enter_idle_state(struct cpuidle_device *dev, { u32 *state = __this_cpu_read(psci_cpuidle_data.psci_states); - return CPU_PM_CPU_IDLE_ENTER_PARAM(psci_cpu_suspend_enter, idx, state[idx]); + return CPU_PM_CPU_IDLE_ENTER_PARAM_RCU(psci_cpu_suspend_enter, idx, state[idx]); } static const struct of_device_id psci_idle_state_match[] = { diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c index e7bcfca4159f..f3a044fa4652 100644 --- a/drivers/firmware/psci/psci.c +++ b/drivers/firmware/psci/psci.c @@ -462,11 +462,22 @@ int psci_cpu_suspend_enter(u32 state) if (!psci_power_state_loses_context(state)) { struct arm_cpuidle_irq_context context; + ct_cpuidle_enter(); arm_cpuidle_save_irq_context(&context); ret = psci_ops.cpu_suspend(state, 0); arm_cpuidle
Re: [PATCH v3 00/51] cpuidle,rcu: Clean up the mess
On Thu, Jan 12, 2023 at 08:43:14PM +0100, Peter Zijlstra wrote: > Hi All! Hi Peter, > The (hopefully) final respin of cpuidle vs rcu cleanup patches. Barring any > objections I'll be queueing these patches in tip/sched/core in the next few > days. I'm sorry to have to bear some bad news on that front. :( I just had a go at testing this on a Juno dev board, using your queue.git sched/idle branch and defconfig + CONFIG_PROVE_LOCKING=y + CONFIG_DEBUG_LOCKDEP=y + CONFIG_DEBUG_ATOMIC_SLEEP=y. With that I consistently see RCU at boot time (log below). | = | WARNING: suspicious RCU usage | 6.2.0-rc3-00051-gced9b6eecb31 #1 Not tainted | - | include/trace/events/ipi.h:19 suspicious rcu_dereference_check() usage! | | other info that might help us debug this: | | | rcu_scheduler_active = 2, debug_locks = 1 | RCU used illegally from extended quiescent state! | no locks held by swapper/0/0. | | stack backtrace: | CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.0-rc3-00051-gced9b6eecb31 #1 | Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II May 16 2021 | Call trace: | dump_backtrace.part.0+0xe4/0xf0 | show_stack+0x18/0x30 | dump_stack_lvl+0x98/0xd4 | dump_stack+0x18/0x34 | lockdep_rcu_suspicious+0xf8/0x10c | trace_ipi_raise+0x1a8/0x1b0 | arch_irq_work_raise+0x4c/0x70 | __irq_work_queue_local+0x48/0x80 | irq_work_queue+0x50/0x80 | __wake_up_klogd.part.0+0x98/0xe0 | defer_console_output+0x20/0x30 | vprintk+0x98/0xf0 | _printk+0x5c/0x84 | lockdep_rcu_suspicious+0x34/0x10c | trace_lock_acquire+0x174/0x180 | lock_acquire+0x3c/0x8c | _raw_spin_lock_irqsave+0x70/0x150 | down_trylock+0x18/0x50 | __down_trylock_console_sem+0x3c/0xd0 | console_trylock+0x28/0x90 | vprintk_emit+0x11c/0x354 | vprintk_default+0x38/0x4c | vprintk+0xd4/0xf0 | _printk+0x5c/0x84 | lockdep_rcu_suspicious+0x34/0x10c | printk_sprint+0x238/0x240 | vprintk_store+0x32c/0x4b0 | vprintk_emit+0x104/0x354 | vprintk_default+0x38/0x4c | vprintk+0xd4/0xf0 | _printk+0x5c/0x84 | lockdep_rcu_suspicious+0x34/0x10c | trace_irq_disable+0x1ac/0x1b0 | trace_hardirqs_off+0xe8/0x110 | cpu_suspend+0x4c/0xfc | psci_cpu_suspend_enter+0x58/0x6c | psci_enter_idle_state+0x70/0x170 | cpuidle_enter_state+0xc4/0x464 | cpuidle_enter+0x38/0x50 | do_idle+0x230/0x2c0 | cpu_startup_entry+0x24/0x30 | rest_init+0x110/0x190 | arch_post_acpi_subsys_init+0x0/0x18 | start_kernel+0x6f8/0x738 | __primary_switched+0xbc/0xc4 IIUC what's happenign here is the PSCI cpuidle driver has entered idle and RCU is no longer watching when arm64's cpu_suspend() manipulates DAIF. Our local_daif_*() helpers poke lockdep and tracing, hence the call to trace_hardirqs_off() and the RCU usage. I think we need RCU to be watching all the way down to cpu_suspend(), and it's cpu_suspend() that should actually enter/exit idle context. That and we need to make cpu_suspend() and the low-level PSCI invocation noinstr. I'm not sure whether 32-bit will have a similar issue or not. I'm surprised no-one else who has tested has seen this; I suspect people haven't enabled lockdep and friends. :/ Thanks, Mark. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization