If an expedited RCU CPU stall ends just at the stall-warning timeout,
the current code will print an expedited stall-warning message, but one
that doesn't identify any CPUs or tasks causing the stall.  This is most
likely to happen for short-timeout stalls, for example, the 20-millisecond
timeouts that are sometimes used for small embedded devices.  Needless to
say, these semi-empty stall-warning messages can be rather confusing.

One option would be to suppress the stall-warning message entirely in
this case, but the near-miss information can be quite valuable.  This
commit therefore detects this race condition and emits an "INFO: Expedited
stall ended before state dump start" message to clarify matters.

Reported-by: Borislav Petkov <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
 kernel/rcu/tree_exp.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 96c49c56fc14a..82cada459e5d0 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -589,7 +589,12 @@ static void synchronize_rcu_expedited_stall(unsigned long jiffies_start, unsigne
 	pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
 		j - jiffies_start, rcu_state.expedited_sequence, data_race(rnp_root->expmask),
 		".T"[!!data_race(rnp_root->exp_tasks)]);
-	if (ndetected) {
+	if (!ndetected) {
+		// This is invoked from the grace-period worker, so
+		// a new grace period cannot have started. And if this
+		// worker were stalled, we would not get here. ;-)
+		pr_err("INFO: Expedited stall ended before state dump start\n");
+	} else {
 		pr_err("blocking rcu_node structures (internal RCU debug):");
 		rcu_for_each_node_breadth_first(rnp) {
 			if (rnp == rnp_root)
-- 
2.40.1
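
For readers who want to see the control flow in isolation, the following
is a minimal userspace sketch of the pattern, not the kernel code itself.
It assumes a hypothetical NCPUS constant, a hypothetical report_stall()
helper, and a plain bitmask standing in for the rcu_node state; only the
"INFO:" string is taken from the patch.  The point is that the scan counts
what it finds, and a count of zero means the stall ended between the
timeout firing and the state dump starting.

```c
#include <stdio.h>
#include <string.h>

#define NCPUS 8		/* hypothetical CPU count for this sketch */
#define MIN_BUF 128	/* comfortably larger than the longest report */

/*
 * Userspace model of the stall report (an illustration only).
 * Bit i of blocked_mask is set if hypothetical CPU i is still blocking
 * the grace period when the dump starts.  Writes the report into buf
 * and returns the number of CPUs listed, or -1 if buf is too small.
 */
static int report_stall(unsigned long blocked_mask, char *buf, size_t len)
{
	int ndetected = 0;
	size_t off = 0;
	int cpu;

	if (len < MIN_BUF)
		return -1;

	off += snprintf(buf + off, len - off, "stall on CPUs: {");
	for (cpu = 0; cpu < NCPUS; cpu++) {
		if (!(blocked_mask & (1UL << cpu)))
			continue;
		ndetected++;
		off += snprintf(buf + off, len - off, " %d", cpu);
	}
	off += snprintf(buf + off, len - off, " }\n");

	if (!ndetected) {
		/* Nothing was found: the stall ended before the scan ran. */
		snprintf(buf + off, len - off,
			 "INFO: Expedited stall ended before state dump start\n");
	} else {
		/* The real code dumps per-node blocking state here. */
		snprintf(buf + off, len - off,
			 "(per-node blocking state would follow)\n");
	}
	return ndetected;
}
```

Calling report_stall() with a mask of zero produces the empty "{ }" list
followed by the INFO line, which models the near-miss case described in
the commit log.  Any nonzero mask within the low NCPUS bits lists the
blocking CPUs and takes the else branch instead.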

