On Tue, Sep 23, 2025 at 07:32:16PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 23, 2025 at 07:20:17AM -0700, Paul E. McKenney wrote:
> > When expressing RCU Tasks Trace in terms of SRCU-fast, it was
> > necessary to keep a nesting count and per-CPU srcu_ctr structure
> > pointer in the task_struct structure, which is slow to access.
> > But an alternative is to instead make rcu_read_lock_tasks_trace() and
> > rcu_read_unlock_tasks_trace(), which match the underlying SRCU-fast
> > semantics, avoiding the task_struct accesses.
> >
> > When all callers have switched to the new API, the previous
> > rcu_read_lock_trace() and rcu_read_unlock_trace() APIs will be removed.
> >
> > Signed-off-by: Paul E. McKenney <paul...@kernel.org>
> > Cc: Andrii Nakryiko <and...@kernel.org>
> > Cc: Alexei Starovoitov <a...@kernel.org>
> > Cc: Peter Zijlstra <pet...@infradead.org>
> > Cc: <b...@vger.kernel.org>
> > ---
> >  include/linux/rcupdate_trace.h | 37 ++++++++++++++++++++++++++++++++++
> >  1 file changed, 37 insertions(+)
> >
> > diff --git a/include/linux/rcupdate_trace.h b/include/linux/rcupdate_trace.h
> > index 0bd47f12ecd17b..b87151e6b23881 100644
> > --- a/include/linux/rcupdate_trace.h
> > +++ b/include/linux/rcupdate_trace.h
> > @@ -34,6 +34,43 @@ static inline int rcu_read_lock_trace_held(void)
> >
> >  #ifdef CONFIG_TASKS_TRACE_RCU
> >
> > +/**
> > + * rcu_read_lock_tasks_trace - mark beginning of RCU-trace read-side critical section
> > + *
> > + * When synchronize_rcu_tasks_trace() is invoked by one task, then that
> > + * task is guaranteed to block until all other tasks exit their read-side
> > + * critical sections.  Similarly, if call_rcu_trace() is invoked on one
> > + * task while other tasks are within RCU read-side critical sections,
> > + * invocation of the corresponding RCU callback is deferred until after
> > + * all the other tasks exit their critical sections.
> > + *
> > + * For more details, please see the documentation for srcu_read_lock_fast().
> > + */
> > +static inline struct srcu_ctr __percpu *rcu_read_lock_tasks_trace(void)
> > +{
> > +	struct srcu_ctr __percpu *ret = srcu_read_lock_fast(&rcu_tasks_trace_srcu_struct);
> > +
> > +	if (IS_ENABLED(CONFIG_ARCH_WANTS_NO_INSTR))
> > +		smp_mb();
>
> I am somewhat confused by the relation between noinstr and smp_mb()
> here.  Subject mentions it, but Changelog is awfully silent again.

Thank you for looking this over!  To Alexei's point, this commit should
be merged with 18/34.

> Furthermore I note that this is a positive while unlock is a negative
> relation between the two.  Which adds even more confusion.

You are right, at most one of these two conditions can be correct.  ;-)
I believe that the one above needs a "!".

The point of this is that architectures that set ARCH_WANTS_NO_INSTR
have promised that any point in the entry/exit code at which RCU is not
watching has been marked noinstr.  For those architectures, SRCU-fast
can rely on the fact that the key updates in __srcu_read_lock_fast()
and __srcu_read_unlock_fast() execute either within interrupt-disabled
regions or as atomic operations, depending on the architecture.  This
means that the synchronize_rcu{,_expedited}() calls in the SRCU-fast
grace-period code will be properly ordered with those accesses.

But for !ARCH_WANTS_NO_INSTR architectures, it is possible to attach
various forms of tracing to entry/exit code that RCU is not watching,
which means that those synchronize_rcu{,_expedited}() calls won't have
the needed ordering properties.  So we use smp_mb() on the read side
to force the needed ordering.
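For concreteness, here is how I believe the lock-side function should
read once that "!" is added, matching the unlock side below (untested
sketch, and of course subject to the merge with 18/34):

static inline struct srcu_ctr __percpu *rcu_read_lock_tasks_trace(void)
{
	struct srcu_ctr __percpu *ret = srcu_read_lock_fast(&rcu_tasks_trace_srcu_struct);

	/*
	 * Instrumentable architectures cannot count on noinstr-marked
	 * entry/exit code, so supply by hand the ordering against the
	 * grace-period code's synchronize_rcu{,_expedited}() calls.
	 */
	if (!IS_ENABLED(CONFIG_ARCH_WANTS_NO_INSTR))
		smp_mb();
	return ret;
}

That way, both the lock and unlock sides supply the full barrier only
on !ARCH_WANTS_NO_INSTR architectures.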
Does that help, or am I missing the point of your question?

							Thanx, Paul

> > +	return ret;
> > +}
> > +
> > +/**
> > + * rcu_read_unlock_tasks_trace - mark end of RCU-trace read-side critical section
> > + * @scp: return value from corresponding rcu_read_lock_tasks_trace().
> > + *
> > + * Pairs with the preceding call to rcu_read_lock_tasks_trace() that
> > + * returned the value passed in via scp.
> > + *
> > + * For more details, please see the documentation for rcu_read_unlock().
> > + */
> > +static inline void rcu_read_unlock_tasks_trace(struct srcu_ctr __percpu *scp)
> > +{
> > +	if (!IS_ENABLED(CONFIG_ARCH_WANTS_NO_INSTR))
> > +		smp_mb();
> > +	srcu_read_unlock_fast(&rcu_tasks_trace_srcu_struct, scp);
> > +}
> > +
> >  /**
> >   * rcu_read_lock_trace - mark beginning of RCU-trace read-side critical section
> >   *
> > --
> > 2.40.1
> >