This series fixes a bug where rdp->defer_qs_pending can remain stuck in
PENDING when a preempted reader's quiescent state is reported up-tree
via a path other than the deferred-QS irq-work handler (FQS scan,
hotplug transition, expedited GP IPI, context switch). Once stuck, the
pending gate in rcu_read_unlock_special() silently suppresses all future
arming attempts on that CPU. The series adds PENDING -> IDLE transitions
at the missing sites (patches 1-7), including the case where the
deferred-QS irq-work handler may run between segments of a compound
section (per Paul McKenney's counter-example) and the softirq
deferred-QS arming path.

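For reviewers who want the failure mode in miniature, here is a user-space
sketch of the gate described above. It is a toy model under stated
assumptions, not the kernel code: struct model_rdp, the DEFER_QS_* enum
values, arm_count, and all model_*() functions are hypothetical names made
up for illustration, and the model deliberately leaves out the irq-work
handler and everything else about how the real kernel reaches this state.

```c
#include <stdbool.h>

/* Hypothetical two-state model of the per-CPU field named in this series. */
enum model_defer_qs_state { DEFER_QS_IDLE, DEFER_QS_PENDING };

struct model_rdp {
	enum model_defer_qs_state defer_qs_pending;
	int arm_count;	/* how many times arming actually happened */
};

/*
 * Models the pending gate: arming is only allowed from IDLE.
 * Returns true if this call armed the deferred-QS mechanism.
 */
static bool model_arm_deferred_qs(struct model_rdp *rdp)
{
	if (rdp->defer_qs_pending == DEFER_QS_PENDING)
		return false;	/* gate closed: attempt silently dropped */
	rdp->defer_qs_pending = DEFER_QS_PENDING;
	rdp->arm_count++;
	return true;
}

/*
 * Models a QS being reported by some path other than the irq-work
 * handler. clear_pending == false is the assumed pre-fix behavior
 * (state left untouched); true is the PENDING -> IDLE transition
 * that the fix adds at such sites.
 */
static void model_report_qs_other_path(struct model_rdp *rdp,
				       bool clear_pending)
{
	if (clear_pending)
		rdp->defer_qs_pending = DEFER_QS_IDLE;
}
```

In this model, arming once and then reporting with clear_pending == false
leaves every later model_arm_deferred_qs() call returning false, which is
the stuck gate. Reporting with clear_pending == true reopens it.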
Patch 8 adds a per-CPU rescue hrtimer that bounds the worst-case
deferred-QS reporting latency: when the irq-work handler lands in a
clean (non-reader, non-compound) context it reports the quiescent state
directly via the new rcu_preempt_deferred_qs_try_report() helper, and
the rescue timer reuses the same helper so that, under preempt=none, the
QS report is quick without depending on the scheduler.

Patches 9-13 add rcutorture coverage for the reader-end deboost behavior
(three from Paul, two from me). These were previously posted on their
own as an RFC; they are folded in here so the fix and its test coverage
can be reviewed together.

The last patch is a debug-only detector
(CONFIG_RCU_GP_CLEANUP_STALE_CHECK, marked [TEST COMMIT], not for
merge). Applied alone on unmodified mainline, without the fixes, it
reliably fires a WARN within 5 minutes under TREE03 rcutorture,
confirming both that the bug exists and that the detector catches it.
With the full fix applied, I could not reproduce the issue.

The git tree with all patches can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git (tag: rcu-dqs-stuck-v3-20260618)

Change log:

Changes from v2 to v3:
- Folded in the rcutorture "reader-end deboost testing" patches (three
  from Paul, two from me), previously posted separately as an RFC, so
  the fix and its test coverage can be reviewed together:
  https://lore.kernel.org/all/[email protected]/
- New patch "rcu: add per-CPU rescue hrtimer for deferred-QS reporting"
  to bound the worst-case deferred-QS reporting latency.
- New patch "rcu: clear defer_qs_pending in deferred-QS bail when
  nesting > 0".
- Reworked "rcu: clear defer_qs_pending in handler for compounded
  sections": the irq-work handler now reports the deferred QS directly
  via the new rcu_preempt_deferred_qs_try_report() helper when it lands
  in a clean context, instead of only nudging the scheduler.

Changes from v1 to v2:
- Dropped RFC tag now that softirq paths have been investigated.
- Added new patch "rcu: set need_resched on softirq deferred-QS arming
  path" to handle the softirq arming case that was deferred in v1.

Link to v2: https://lore.kernel.org/all/[email protected]/
Link to v1: https://lore.kernel.org/all/[email protected]/

Joel Fernandes (11):
  rcu: introduce rcu_defer_qs_clear() helper
  rcu: clear defer_qs_pending when notifying GP changes
  rcu: clear defer_qs_pending in handler for compounded sections
  rcu: drop redundant defer_qs_pending clear in irqrestore handler
  rcu: clear defer_qs_pending at expedited IPI entry
  rcu: set need_resched on softirq deferred-QS arming path
  rcu: clear defer_qs_pending in deferred-QS bail when nesting > 0
  rcu: add per-CPU rescue hrtimer for deferred-QS reporting
  rcutorture: tighten boost-WARN to exclude any implicit-reader context
  rcutorture: give async deboost mechanisms up to 500us before WARN
  [TEST COMMIT] rcu: detect stuck defer_qs_pending at GP cleanup

Paul E. McKenney (3):
  rcutorture: Abstract reader-segment dump into rcu_torture_dump_read_segs()
  rcutorture: Check for immediate deboosting at reader end
  rcutorture: Test RCU readers from hardware interrupt handlers

 kernel/rcu/Kconfig.debug |  11 ++
 kernel/rcu/rcu.h         |   7 ++
 kernel/rcu/rcutorture.c  | 257 +++++++++++++++++++++++++++------------
 kernel/rcu/tree.c        |  50 ++++++++
 kernel/rcu/tree.h        |  14 +++
 kernel/rcu/tree_exp.h    |   6 +
 kernel/rcu/tree_plugin.h | 169 ++++++++++++++++++++++---
 7 files changed, 419 insertions(+), 95 deletions(-)

base-commit: 95c7d025cc8c3c6c41206e2a18332eb04878b7ef
-- 
2.34.1

