On Thu, 19 Mar 2026 at 17:48, Boqun Feng <[email protected]> wrote:
>
> On Thu, Mar 19, 2026 at 05:33:50PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2026-03-19 09:27:59 [-0700], Boqun Feng wrote:
> > > On Thu, Mar 19, 2026 at 10:03:15AM +0100, Sebastian Andrzej Siewior wrote:
> > > > Please just use the queue_delayed_work() with a delay >0.
> > > >
> > >
> > > That doesn't work since queue_delayed_work() with a positive delay will
> > > still acquire timer base lock, and we can have BPF instrument with timer
> > > base lock held i.e. calling call_srcu() with timer base lock.
> > >
> > > irq_work on the other hand doesn't use any locking.
> >
> > Could we please restrict BPF somehow so it does not roam free? It is
> > absolutely awful to have irq_work() in call_srcu() just because it
> > might acquire locks.
> >
>
> I agree it's not RCU's fault ;-)
>
> I guess it'll be difficult to restrict BPF, however maybe BPF can call
> call_srcu() in irq_work instead? Or a more systematic defer mechanism
> that allows BPF to defer any lock holding functions to a different
> context. (We have a similar issue that BPF cannot call kfree_rcu() in
> some cases IIRC).
>
> But we need to fix this in v7.0, so this short-term fix is still needed.
>

I don't think deferring on the BPF side is an option, even longer term.
We already do that when it's incorrect to invoke call_rcu() or any other
API in a specific context (e.g., NMI, where we punt it using irq_work).
However, the case reported in this thread is different: an existing
user that worked fine before is now broken. We were previously using
call_rcu_tasks_trace() just fine in scx callbacks where rq->lock is
held, so the underlying conversion to call_srcu() should remain
transparent in this respect.
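
For readers unfamiliar with the "punt it" pattern mentioned for the NMI
case, here is a rough userspace model of the idea. It is only a sketch:
every name in it (model_lock_held, locking_api(), defer_work(),
run_deferred()) is hypothetical, and it does not reflect how irq_work or
call_srcu() are actually implemented. The caller records the work without
taking any lock, and the lock-taking API runs later from a context where
the lock is free.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8

typedef void (*deferred_fn)(void *arg);

struct deferred_item {
	deferred_fn fn;
	void *arg;
};

/* Stands in for a lock the API needs internally (hypothetical). */
static bool model_lock_held;

static struct deferred_item pending[MAX_DEFERRED];
static size_t npending;

/* Models an API that takes the lock; reports failure instead of deadlocking. */
static bool locking_api(int *counter)
{
	if (model_lock_held)
		return false;	/* a real lock would self-deadlock here */
	model_lock_held = true;
	(*counter)++;
	model_lock_held = false;
	return true;
}

/* Safe with the lock held: it only records the work, no locking. */
static bool defer_work(deferred_fn fn, void *arg)
{
	if (npending == MAX_DEFERRED)
		return false;
	pending[npending].fn = fn;
	pending[npending].arg = arg;
	npending++;
	return true;
}

/* Runs later, from a context where the lock is no longer held. */
static size_t run_deferred(void)
{
	size_t i, ran = npending;

	assert(!model_lock_held);
	for (i = 0; i < npending; i++)
		pending[i].fn(pending[i].arg);
	npending = 0;
	return ran;
}

/* Wrapper so the lock-taking API can be queued as deferred work. */
static void deferred_locking_call(void *arg)
{
	locking_api((int *)arg);
}
```

In this model, calling locking_api() directly with the lock held fails,
while defer_work(deferred_locking_call, ...) followed by a later
run_deferred() succeeds. The cost is that every such caller has to know
it is in an unsafe context and defer explicitly.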

> Regards,
> Boqun
>
> > > Regards,
> > > Boqun
> > >
> > Sebastian
