On 5/11/26 19:54, Uladzislau Rezki (Sony) wrote:
> From: "Paul E. McKenney" <[email protected]>
>
> While an srcu_struct structure is in the midst of switching from CPU-0
> to all-CPUs state, it can attempt to invoke callbacks for CPUs that
> have never been online. Worse yet, it can attempt in invoke callbacks
> for CPUs that never will be online, even including imaginary CPUs not in
> cpu_possible_mask. This can cause hangs on s390,

Uladzislau, Paul, according to the fixes tag below this change fixes a
change that went into 7.0-rc6 -- and apparently causes a "hang" on some
architectures. So shouldn't this be heading to mainline instead of
-next? Ideally with a stable tag to ensure backporting to 7.0.y, but
that is a separate decision?

I had an eye on this issue after noticing Samir's report:
https://lore.kernel.org/lkml/[email protected]/

And the jury is still out, but Jiri is dealing with some issues that
might or might not be related to the problem this fixes, too:
https://lore.kernel.org/all/[email protected]/

Ciao, Thorsten

> which is not set up to
> deal with workqueue handlers being scheduled on such CPUs. This commit
> therefore causes Tree SRCU to refrain from queueing workqueue handlers
> on CPUs that have not yet (and might never) come online.
>
> Because callbacks are not invoked on CPUs that have not been
> online, it is an error to invoke call_srcu(), synchronize_srcu(), or
> synchronize_srcu_expedited() on a CPU that is not yet fully online.
> However, it turns out to be less code to redirect the callbacks
> from too-early invocations of call_srcu() than to warn about such
> invocations. This commit therefore also redirects callbacks queued on
> not-yet-fully-online CPUs to the boot CPU.
>
> Reported-by: Vasily Gorbik <[email protected]>
> Fixes: 61bbcfb50514 ("srcu: Push srcu_node allocation to GP when
> non-preemptible")
> Signed-off-by: Paul E. McKenney <[email protected]>
> Tested-by: Vasily Gorbik <[email protected]>
> Tested-by: Samir <[email protected]>
> Reviewed-by: Shrikanth Hegde <[email protected]>
> Cc: Tejun Heo <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> ---
>  kernel/rcu/srcutree.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 0d01cd8c4b4a..7c2f7cc131f7 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -897,11 +897,9 @@ static void srcu_schedule_cbs_snp(struct srcu_struct
> *ssp, struct srcu_node *snp
>  {
>  	int cpu;
>
> -	for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) {
> -		if (!(mask & (1UL << (cpu - snp->grplo))))
> -			continue;
> -		srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay);
> -	}
> +	for (cpu = snp->grplo; cpu <= snp->grphi; cpu++)
> +		if ((mask & (1UL << (cpu - snp->grplo))) &&
> rcu_cpu_beenfullyonline(cpu))
> +			srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu),
> delay);
>  }
>
>  /*
> @@ -1322,7 +1320,9 @@ static unsigned long srcu_gp_start_if_needed(struct
> srcu_struct *ssp,
>  	 */
>  	idx = __srcu_read_lock_nmisafe(ssp);
>  	ss_state = smp_load_acquire(&ssp->srcu_sup->srcu_size_state);
> -	if (ss_state < SRCU_SIZE_WAIT_CALL)
> +	// If !rcu_cpu_beenfullyonline(), interrupts are still disabled,
> +	// so no migration is possible in either direction from this CPU.
> +	if (ss_state < SRCU_SIZE_WAIT_CALL ||
> !rcu_cpu_beenfullyonline(raw_smp_processor_id()))
>  		sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
>  	else
>  		sdp = raw_cpu_ptr(ssp->sda);
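
For readers following along without a kernel tree at hand, the two
hunks quoted above can be modeled in plain user-space C. This is only
a rough sketch under stated assumptions, not the kernel code: every
model_* name and MODEL_NR_CPUS are hypothetical, and the been-online
array merely stands in for rcu_cpu_beenfullyonline(). It shows the two
rules the patch describes: skip CPUs that have never been fully online
when scheduling handlers, and fall back to the boot CPU when queueing
from such a CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 8	/* hypothetical size of the modeled system */

/* Hypothetical stand-in for the state behind rcu_cpu_beenfullyonline(). */
static bool model_been_online[MODEL_NR_CPUS];

/* Out-of-range ("imaginary") CPUs are treated as never having been online. */
static bool model_cpu_beenfullyonline(int cpu)
{
	return cpu >= 0 && cpu < MODEL_NR_CPUS && model_been_online[cpu];
}

/*
 * Models the patched loop in srcu_schedule_cbs_snp(): a CPU in
 * [grplo, grphi] gets a handler only if its mask bit is set AND it has
 * been fully online. Records the decision in scheduled[] and returns
 * the number of CPUs selected.
 */
static int model_schedule_cbs(int grplo, int grphi, unsigned long mask,
			      bool scheduled[MODEL_NR_CPUS])
{
	int cpu, n = 0;

	assert(grplo >= 0 && grplo <= grphi && grphi < MODEL_NR_CPUS);
	for (cpu = grplo; cpu <= grphi; cpu++) {
		scheduled[cpu] = (mask & (1UL << (cpu - grplo))) &&
				 model_cpu_beenfullyonline(cpu);
		if (scheduled[cpu])
			n++;
	}
	return n;
}

/*
 * Models the patched queue selection in srcu_gp_start_if_needed():
 * use the boot CPU's queue while the structure is still in its small
 * state, or when the calling CPU has not yet been fully online.
 */
static int model_pick_queue_cpu(bool small_state, int this_cpu, int boot_cpu)
{
	if (small_state || !model_cpu_beenfullyonline(this_cpu))
		return boot_cpu;
	return this_cpu;
}
```

With only CPUs 0 and 2 marked as having been online, a full mask over
CPUs 0-3 selects just those two, and a callback queued from CPU 1 lands
on the boot CPU's queue.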

