The per-CPU clear in __note_gp_changes() (from an earlier commit) runs only when the local CPU notices a normal-GP advance. When an expedited GP arrives at a CPU whose defer_qs_pending is already PENDING, rcu_read_unlock_special() may skip queuing irq_work because of the pending gate, leaving the expedited GP to wait for the next scheduler tick.
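
As a rough user-space illustration of the gate described above (a simplified sketch, not the kernel code): struct cpu_model, try_arm() and clear_gate() are hypothetical names, and the two-state enum only stands in for the per-CPU defer_qs_pending field.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU deferred-QS state. */
enum defer_qs_state { DEFER_QS_IDLE, DEFER_QS_PENDING };

/* Hypothetical model of the per-CPU data relevant to the gate. */
struct cpu_model {
	enum defer_qs_state defer_qs_pending;
	int irq_work_queued;	/* number of times irq_work was queued */
};

/*
 * Models the arming path: irq_work is queued only when the gate is
 * not already PENDING. Returns true if irq_work was queued.
 */
static bool try_arm(struct cpu_model *cpu)
{
	if (cpu->defer_qs_pending == DEFER_QS_PENDING)
		return false;	/* gate closed: queuing is skipped */
	cpu->defer_qs_pending = DEFER_QS_PENDING;
	cpu->irq_work_queued++;
	return true;
}

/* Models clearing the gate so a later arming attempt can queue again. */
static void clear_gate(struct cpu_model *cpu)
{
	cpu->defer_qs_pending = DEFER_QS_IDLE;
}
```

In this model, calling try_arm() on a CPU whose state is already DEFER_QS_PENDING queues nothing; only after clear_gate() does the next try_arm() queue irq_work again.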

Clear defer_qs_pending on the IPI target directly in rcu_exp_handler(). Any arming attempt that follows the IPI within the current GP can then queue irq_work again, so expedited GPs complete quickly instead of waiting for a scheduler tick.

Signed-off-by: Joel Fernandes <[email protected]>
---
 kernel/rcu/tree_exp.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 82cada459e5d..f8564a041879 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -763,6 +763,12 @@ static void rcu_exp_handler(void *unused)
 			 READ_ONCE(rdp->cpu_no_qs.b.exp)))
 		return;
 
+	/*
+	 * Clear defer_qs_pending so arming attempts following this IPI
+	 * within the current GP can queue irq_work again.
+	 */
+	rcu_defer_qs_clear(rdp);
+
 	/*
 	 * Second, the common case of not being in an RCU read-side
 	 * critical section.  If also enabled or idle, immediately
-- 
2.34.1

