If an expedited RCU CPU stall ends just at the stall-warning timeout,
the current code will print an expedited stall-warning message, but one
that doesn't identify any CPUs or tasks causing the stall.  This is most
likely to happen for short-timeout stalls, for example, the 20-millisecond
timeouts that are sometimes used for small embedded devices.  Needless to
say, these semi-empty stall-warning messages can be rather confusing.

One option would be to suppress the stall-warning message entirely in
this case, but the near-miss information can be quite valuable.

This commit therefore detects this race condition and emits an "INFO:
Expedited stall ended before state dump start" message to clarify matters.

Reported-by: Borislav Petkov <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
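For illustration only, the reporting logic can be modeled in userspace
roughly as below.  This is a minimal sketch, not the kernel code: NCPUS,
append(), report_stall(), and the blocking[] array are hypothetical
stand-ins for the rcu_node scan, and only the "INFO: Expedited stall
ended before state dump start" string is taken from the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define NCPUS 4	/* hypothetical CPU count for this model */

/* Append formatted text to buf, never writing past len bytes. */
static void append(char *buf, size_t len, const char *fmt, ...)
{
	size_t used = strlen(buf);
	va_list ap;

	if (used >= len - 1)
		return;
	va_start(ap, fmt);
	vsnprintf(buf + used, len - used, fmt, ap);
	va_end(ap);
}

/*
 * Model of the stall report: list the CPUs still blocking, then either
 * note that the stall ended before the dump began (nothing detected)
 * or summarize the blockers.  Returns the number of blockers found.
 */
static int report_stall(const int blocking[NCPUS], char *buf, size_t len)
{
	int ndetected = 0;

	assert(buf && len > 0);
	buf[0] = '\0';
	append(buf, len, "stall on CPUs: {");
	for (int cpu = 0; cpu < NCPUS; cpu++) {
		if (!blocking[cpu])
			continue;	/* this CPU is no longer blocking */
		ndetected++;
		append(buf, len, " %d", cpu);
	}
	append(buf, len, " }\n");
	if (!ndetected)
		append(buf, len, "INFO: Expedited stall ended before state dump start\n");
	else
		append(buf, len, "blocking entries: %d\n", ndetected);
	return ndetected;
}
```

Calling report_stall() with an all-zero blocking[] array models the race
described above: the header line lists no CPUs, and the explanatory INFO
line follows it instead of a detailed dump.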
 kernel/rcu/tree_exp.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 96c49c56fc14a..82cada459e5d0 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -589,7 +589,12 @@ static void synchronize_rcu_expedited_stall(unsigned long jiffies_start, unsigne
        pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
                j - jiffies_start, rcu_state.expedited_sequence, data_race(rnp_root->expmask),
                ".T"[!!data_race(rnp_root->exp_tasks)]);
-       if (ndetected) {
+       if (!ndetected) {
+               // This is invoked from the grace-period worker, so
+               // a new grace period cannot have started.  And if this
+               // worker were stalled, we would not get here.  ;-)
+               pr_err("INFO: Expedited stall ended before state dump start\n");
+       } else {
                pr_err("blocking rcu_node structures (internal RCU debug):");
                rcu_for_each_node_breadth_first(rnp) {
                        if (rnp == rnp_root)
-- 
2.40.1

