Re: Next-level bug in SRCU implementation of RCU Tasks Trace + PREEMPT_RT

Boqun Feng Wed, 18 Mar 2026 14:58:27 -0700

On Wed, Mar 18, 2026 at 02:52:48PM -0700, Boqun Feng wrote:
[...]
> > Ah so it is an ABBA deadlock, not an ABA self-deadlock. I guess this is a
> > different issue from the NMI issue? It is more an issue of calling the
> > call_srcu() API with scheduler locks held.
> > 
> > Something like below I think:
> > 
> >   CPU A (BPF tracepoint)                CPU B (concurrent call_srcu)
> >   ----------------------------         ------------------------------------
> >   [1] holds  &rq->__lock
> >                                         [2]
> >                                         -> call_srcu
> >                                         -> srcu_gp_start_if_needed
> >                                         -> srcu_funnel_gp_start
> >                                         -> spin_lock_irqsave_ssp_content...
> >                                         -> holds srcu locks
> > 
> >   [4] calls  call_rcu_tasks_trace()      [5] srcu_funnel_gp_start (cont..)
> >                                                  -> queue_delayed_work
> >           -> call_srcu()                         -> __queue_work()
> >           -> srcu_gp_start_if_needed()           -> wake_up_worker()
> >           -> srcu_funnel_gp_start()              -> try_to_wake_up()
> >           -> spin_lock_irqsave_ssp_contention()  [6] WANTS  rq->__lock
> >           -> WANTS srcu locks
> 
> I see, we can also have a self-deadlock even without CPU B, when CPU A
> is going to try_to_wake_up() a worker on the same CPU.
> 
> An interesting observation is that the deadlock can be avoided if
> queue_delayed_work() uses a non-zero delay: that means a timer will be
> armed instead of acquiring the rq lock.
> 
> (But I guess BPF also wants to run with timer base lock held, right? ;-)
> ;-) ;-)).
> 
> /me going to check Paul's second fix at rcu/dev.
>
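For illustration, here is a minimal user-space sketch of the observation quoted above. It assumes only what the thread states: a zero delay queues the work right away, which can wake a worker and therefore needs the rq lock, while a non-zero delay arms a timer instead. The names model_queue_delayed_work() and model_would_deadlock() are hypothetical; this models the locking consequence and is not kernel code.

```c
#include <stdbool.h>

/* Hypothetical model of the two paths a delayed-work enqueue can take. */
enum wake_path {
    WAKE_WORKER_NOW,   /* zero delay: work is queued now, may need rq->__lock */
    ARM_TIMER          /* non-zero delay: a timer is armed, needs timer base lock */
};

/* Model only: pick the path from the delay, as described in the thread. */
static enum wake_path model_queue_delayed_work(unsigned long delay)
{
    return delay == 0 ? WAKE_WORKER_NOW : ARM_TIMER;
}

/*
 * Model only: would the caller try to take a lock it already holds?
 * The timer path avoids the rq lock but moves the problem to the
 * timer base lock if the caller happens to hold that one.
 */
static bool model_would_deadlock(bool holds_rq_lock,
                                 bool holds_timer_base_lock,
                                 unsigned long delay)
{
    if (model_queue_delayed_work(delay) == WAKE_WORKER_NOW)
        return holds_rq_lock;
    return holds_timer_base_lock;
}
```

With the rq lock held, model_would_deadlock(true, false, 0) reports a deadlock and model_would_deadlock(true, false, 1) does not. A caller holding the timer base lock still loses with a non-zero delay, which is the caveat raised above.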


Oh I misread, there is no second fix, just an rcutorture change. Let me
see if I can find a quick fix ;-)

Regards,
Boqun

> Regards,
> Boqun
> 
> > 
> > If I understand this correctly, this looks like an issue that can happen
> > independently of the conversion of the spin locks.
> > 
> > thanks,
> > 
> > -- 
> > Joel Fernandes
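
The ABBA ordering in Joel's diagram, and the single-CPU variant, can be sketched as a tiny lock-order graph in user-space C. This is only a rough model of the idea, not lockdep and not kernel code: RQ_LOCK and SRCU_LOCK are hypothetical stand-ins for rq->__lock and the SRCU locks, and struct order_graph and record_acquire() are names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes standing in for rq->__lock and the SRCU locks. */
enum lock_class { RQ_LOCK, SRCU_LOCK, NR_LOCK_CLASSES };

struct order_graph {
    /* edge[a][b] is true if some path acquired b while holding a. */
    bool edge[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

static void graph_init(struct order_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static bool reachable(const struct order_graph *g, int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_LOCK_CLASSES; next++) {
        if (g->edge[from][next] && !seen[next] &&
            reachable(g, next, to, seen))
            return true;
    }
    return false;
}

/*
 * Record "wanted is acquired while held is held". Returns true if this
 * closes a cycle, i.e. an earlier path already ordered wanted before held
 * (ABBA), or held and wanted are the same lock (self-deadlock).
 */
static bool record_acquire(struct order_graph *g,
                           enum lock_class held, enum lock_class wanted)
{
    bool seen[NR_LOCK_CLASSES] = { false };
    bool inversion;

    assert(held < NR_LOCK_CLASSES && wanted < NR_LOCK_CLASSES);
    inversion = reachable(g, wanted, held, seen);
    g->edge[held][wanted] = true;
    return inversion;
}
```

Recording CPU B's path first, record_acquire(&g, SRCU_LOCK, RQ_LOCK), returns false. Recording CPU A's path afterwards, record_acquire(&g, RQ_LOCK, SRCU_LOCK), returns true because the two orders form a cycle. The same-CPU case corresponds to record_acquire(&g, RQ_LOCK, RQ_LOCK), which returns true without any second CPU.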
