On Thu, Mar 12, 2026 at 04:38:14AM -0700, Usama Arif wrote:
> On Tue, 10 Mar 2026 17:49:39 +0000 Dmitry Ilvokhin <[email protected]> wrote:
>
> > Add the contended_release trace event. This tracepoint fires on the
> > holder side when a contended lock is released, complementing the
> > existing contention_begin/contention_end tracepoints which fire on the
> > waiter side.
> >
> > This enables correlating lock hold time under contention with waiter
> > events by lock address.
> >
> > Add trace_contended_release() calls to the slowpath unlock paths of
> > sleepable locks: mutex, rtmutex, semaphore, rwsem, percpu-rwsem, and
> > RT-specific rwbase locks. Each call site fires only when there are
> > blocked waiters being woken, except percpu_up_write() which always wakes
> > via __wake_up().
> >
> > Signed-off-by: Dmitry Ilvokhin <[email protected]>
> > ---
> > include/trace/events/lock.h | 17 +++++++++++++++++
> > kernel/locking/mutex.c | 1 +
> > kernel/locking/percpu-rwsem.c | 3 +++
> > kernel/locking/rtmutex.c | 1 +
> > kernel/locking/rwbase_rt.c | 8 +++++++-
> > kernel/locking/rwsem.c | 9 +++++++--
> > kernel/locking/semaphore.c | 4 +++-
> > 7 files changed, 39 insertions(+), 4 deletions(-)
> >
[...]
> > diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
> > index f3ee7a0d6047..1eee51766aaf 100644
> > --- a/kernel/locking/percpu-rwsem.c
> > +++ b/kernel/locking/percpu-rwsem.c
> > @@ -263,6 +263,8 @@ void percpu_up_write(struct percpu_rw_semaphore *sem)
> > {
> > rwsem_release(&sem->dep_map, _RET_IP_);
> >
> > + trace_contended_release(sem);
> > +
>
> Hello!
>
> I saw that you mentioned in the commit message that you do this only for
> blocked waiters except for percpu_up_write(). We can use
> waitqueue_active(&sem->waiters) to check for this over here so that
> it's consistent with every other call?
Thanks for the feedback, Usama.

I thought about this and even mentioned it in a comment at some point,
but I've forgotten what the original reason was. Now I think you are
correct. I added a wq_has_sleeper() check locally instead of
waitqueue_active(), since we are not holding the lock here and
waitqueue_active() requires a barrier according to its comment. It may
not matter much here, but I'd rather make it correct even for a
tracepoint.
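
Concretely, what I have locally now looks roughly like this (untested
sketch; the rwsem_release() line is only there to show the placement):

	rwsem_release(&sem->dep_map, _RET_IP_);

	/*
	 * We don't hold the waitqueue lock here, so use wq_has_sleeper(),
	 * which includes the barrier that a bare waitqueue_active() check
	 * would otherwise need.
	 */
	if (wq_has_sleeper(&sem->waiters))
		trace_contended_release(sem);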
Note that __percpu_up_read() doesn't need this guard, though. Maybe I
was thinking of the __percpu_up_read() part before and just made both
call sites symmetric.

Anyway, thanks for the suggestion.
>
>
> > /*
> > * Signal the writer is done, no fast path yet.
> > *
> > @@ -297,6 +299,7 @@ void __percpu_up_read(struct percpu_rw_semaphore *sem)
> > * writer.
> > */
> > smp_mb(); /* B matches C */
> > + trace_contended_release(sem);
>
> Should we do this after this_cpu_dec(*sem->read_count)?
Good point. I moved it after this_cpu_dec() so the tracepoint fires
after the lock is released but before rcuwait_wake_up(). I also went
through all other call sites and made the placement consistent where
possible: after release, before wake. This will be fixed in v3.
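
So the ordering in __percpu_up_read() ends up roughly like this (sketch;
the surrounding lines are paraphrased from the current function):

	smp_mb(); /* B matches C */
	__this_cpu_dec(*sem->read_count);

	/* The read side is released at this point; trace before the wakeup. */
	trace_contended_release(sem);

	/* Prod writer to re-evaluate readers_active_check() */
	rcuwait_wake_up(&sem->writer);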