On 2025-06-20 04:23:49 [-0700], Paul E. McKenney wrote:
> > I hope not because it is not any different from
> > 
> >     CPU 2                   CPU 3
> >     =====                   =====
> >     NMI
> >     rcu_read_lock();
> >                             synchronize_rcu();
> >                             // need all CPUs report a QS.
> >     rcu_read_unlock();
> >     // no rcu_read_unlock_special() due to in_nmi().
> > 
> > If the NMI happens while the CPU is in userland (say a perf event) then
> > the NMI returns directly to userland.
> > After the tracing event completes (in this case) the CPU should run into
> > another RCU section on its way out via context switch or the tick
> > interrupt.
> > I assume the tick interrupt is what makes the NMI case work.
> 
> Are you promising that interrupts will be always be disabled across
> the whole rcu_read_lock_notrace() read-side critical section?  If so,
> could we please have a lockdep_assert_irqs_disabled() call to check that?

No, that should stay preemptible because bpf can attach itself to
tracepoints and this is the root cause of the exercise. Now if you say
it has to be run with disabled interrupts to match the NMI case then it
makes sense (since NMIs have interrupts off) but I do not understand why
it matters here (since the CPU returns to userland without passing
through the kernel).

I'm not sure how much can be done here due to the notrace part. Assuming
rcu_read_unlock_special() is not doable, would forcing a context switch
(via setting need-resched and irq_work, as in the IRQ-off case) do the
trick?
Looking through rcu_preempt_deferred_qs_irqrestore() it does not look to
be "usable from the scheduler (with rq lock held)" due to RCU-boosting
or the wake-up of expedited_wq (which is one of the requirements).

>                                                       Thanx, Paul

Sebastian
