This series fixes a bug where rdp->defer_qs_pending can remain stuck in
PENDING when a preempted reader's quiescent state is reported up-tree via
a path other than the deferred-QS irq-work handler (FQS scan, hotplug
transition, expedited GP IPI, context switch). Once stuck, the pending
gate in rcu_read_unlock_special() silently suppresses all future arming
attempts on that CPU. Patches 1-7 add the missing PENDING -> IDLE
transitions at those sites. They also cover two further cases: the
deferred-QS irq-work handler running between segments of a compound section
(per Paul McKenney's counter-example), and the softirq deferred-QS arming
path.
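
For readers who want the failure mode in miniature, here is a userspace toy
model of the pending gate. It is only a sketch: the toy_* names, the struct
layout and the "fixed" flag are made up for illustration and are not the
kernel code. It shows only why a missing PENDING -> IDLE transition
suppresses the next arming attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: names are illustrative, not the kernel's definitions. */
enum toy_defer_qs_state { TOY_DEFER_QS_IDLE, TOY_DEFER_QS_PENDING };

struct toy_rdp {
	enum toy_defer_qs_state defer_qs_pending;
	int arm_count;	/* number of successful arming attempts */
};

/* Models the pending gate: arming is refused while the state is PENDING. */
static bool toy_try_arm(struct toy_rdp *rdp)
{
	assert(rdp != NULL);
	if (rdp->defer_qs_pending == TOY_DEFER_QS_PENDING)
		return false;
	rdp->defer_qs_pending = TOY_DEFER_QS_PENDING;
	rdp->arm_count++;
	return true;
}

/*
 * Models a QS being reported by a path other than the irq-work handler.
 * With fixed == false the state is left untouched (the bug being modeled);
 * with fixed == true the state returns to IDLE.
 */
static void toy_report_qs_other_path(struct toy_rdp *rdp, bool fixed)
{
	if (fixed)
		rdp->defer_qs_pending = TOY_DEFER_QS_IDLE;
}

/* Arm, report the QS via another path, then try to arm again. */
static int toy_run(bool fixed)
{
	struct toy_rdp rdp = { TOY_DEFER_QS_IDLE, 0 };

	toy_try_arm(&rdp);
	toy_report_qs_other_path(&rdp, fixed);
	toy_try_arm(&rdp);	/* refused when the state is stuck */
	return rdp.arm_count;
}
```

In this model toy_run(false) returns 1, because the second arming attempt is
silently dropped, while toy_run(true) returns 2.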

Patch 8 adds a per-CPU rescue hrtimer that bounds the worst-case
deferred-QS reporting latency. When the irq-work handler lands in a clean
(non-reader, non-compound) context, it reports the quiescent state directly
via the new rcu_preempt_deferred_qs_try_report() helper. The rescue timer
reuses the same helper, so that under preempt=none the QS is reported
promptly without depending on the scheduler.

Patches 9-13 add rcutorture coverage for the reader-end deboost behavior
(three from Paul, two from me). These were previously posted on their own
as an RFC; they are folded in here so the fix and its test coverage can be
reviewed together.

The last patch is a debug-only detector (CONFIG_RCU_GP_CLEANUP_STALE_CHECK,
marked [TEST COMMIT], not for merge). Applied alone on unmodified mainline,
without the fixes, it reliably fires a WARN within 5 minutes under TREE03
rcutorture, confirming that the bug exists and that the detector catches
it. With the full fix applied, I could not reproduce the issue.

The git tree with all patches can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git
(tag: rcu-dqs-stuck-v3-20260618)

Change log:

Changes from v2 to v3:
- Folded in the rcutorture "reader-end deboost testing" patches (three from
  Paul, two from me), previously posted separately as an RFC, so the fix
  and its test coverage can be reviewed together:
  https://lore.kernel.org/all/[email protected]/
- New patch "rcu: add per-CPU rescue hrtimer for deferred-QS reporting" to
  bound the worst-case deferred-QS reporting latency.
- New patch "rcu: clear defer_qs_pending in deferred-QS bail when nesting > 0".
- Reworked "rcu: clear defer_qs_pending in handler for compounded sections":
  the irq-work handler now reports the deferred QS directly via the new
  rcu_preempt_deferred_qs_try_report() helper when it lands in a clean
  context, instead of only nudging the scheduler.

Changes from v1 to v2:
- Dropped RFC tag now that softirq paths have been investigated.
- Added new patch "rcu: set need_resched on softirq deferred-QS arming
  path" to handle the softirq arming case that was deferred in v1.

Link to v2: https://lore.kernel.org/all/[email protected]/
Link to v1: https://lore.kernel.org/all/[email protected]/

Joel Fernandes (11):
  rcu: introduce rcu_defer_qs_clear() helper
  rcu: clear defer_qs_pending when notifying GP changes
  rcu: clear defer_qs_pending in handler for compounded sections
  rcu: drop redundant defer_qs_pending clear in irqrestore handler
  rcu: clear defer_qs_pending at expedited IPI entry
  rcu: set need_resched on softirq deferred-QS arming path
  rcu: clear defer_qs_pending in deferred-QS bail when nesting > 0
  rcu: add per-CPU rescue hrtimer for deferred-QS reporting
  rcutorture: tighten boost-WARN to exclude any implicit-reader context
  rcutorture: give async deboost mechanisms up to 500us before WARN
  [TEST COMMIT] rcu: detect stuck defer_qs_pending at GP cleanup

Paul E. McKenney (3):
  rcutorture: Abstract reader-segment dump into
    rcu_torture_dump_read_segs()
  rcutorture: Check for immediate deboosting at reader end
  rcutorture: Test RCU readers from hardware interrupt handlers

 kernel/rcu/Kconfig.debug |  11 ++
 kernel/rcu/rcu.h         |   7 ++
 kernel/rcu/rcutorture.c  | 257 +++++++++++++++++++++++++++------------
 kernel/rcu/tree.c        |  50 ++++++++
 kernel/rcu/tree.h        |  14 +++
 kernel/rcu/tree_exp.h    |   6 +
 kernel/rcu/tree_plugin.h | 169 ++++++++++++++++++++++---
 7 files changed, 419 insertions(+), 95 deletions(-)


base-commit: 95c7d025cc8c3c6c41206e2a18332eb04878b7ef
-- 
2.34.1