On Wed, Feb 11, 2026 at 3:27 AM Harry Yoo <[email protected]> wrote:
>
> On Wed, Feb 11, 2026 at 11:53:46AM +0100, Uladzislau Rezki wrote:
> > On Wed, Feb 11, 2026 at 07:44:37PM +0900, Harry Yoo wrote:
> > > On Wed, Feb 11, 2026 at 11:16:51AM +0100, Uladzislau Rezki wrote:
> > > > If this is supposed to be invoked from NMI, should we better just
> > > > detect such context in the kvfree_call_rcu()? There are a lot of
> > > > "allow_spin" checks which make it easy to get lost.
> > >
> > > Detecting if it's NMI might be okay, but IIUC the re-entrancy
> > > requirement comes not only from NMI but also from attaching bpf
> > > programs to kernel functions, something like:
> > >
> > > "Run a BPF program whenever queue_delayed_work() is called,
> > > ... and the BPF program somehow frees memory via kfree_rcu_nolock()".
> > >
> > > Then, by the time the kernel calls queue_delayed_work() while holding
> > > krcp->lock, it runs the BPF program and calls kfree_rcu_nolock(),
> > > which is not allowed to spin on krcp->lock.
> > >
> > > > As I see you maintain an llist and the idea is simply to re-enter
> > > > kvfree_rcu() again with allow-spin=true, since then it will be
> > > > "normal" context.
> > >
> > > It tries to acquire the lock and add it to krcp->head, but if somebody
> > > is already holding the lock, it re-runs kvfree_rcu() with irq work.
> > >
> > Check no_spin on entry, if true, llist_add, queue-irq-work. Re-enter.
>
> That is much simpler! Actually, I tried this way during the initial
> implementation. I like its simplicity.
>
> But I wasn't sure about the performance implications of the approach
> and switched to the current implementation.
>
> It'd be nice to hear Alexei's thoughts on this; I think he'd have some
> insights on the performance aspect of this, as we have something similar
> in slab (defer_free).
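
(For reference, a rough sketch of the proposal quoted above, i.e. "llist_add,
queue irq_work, re-enter". The per-CPU state, struct layout and the re-entry
helper below are hypothetical, not the actual kvfree_rcu() internals:)

        /*
         * Sketch only: stash the object on a per-CPU llist and kick an
         * irq_work; the irq_work handler later re-enters the normal
         * (spinning-allowed) path.
         */
        #include <linux/llist.h>
        #include <linux/irq_work.h>
        #include <linux/percpu.h>

        struct kvfree_defer_pcpu {
                struct llist_head objects;
                struct irq_work work;
        };

        static void kvfree_defer_irq_work(struct irq_work *w)
        {
                struct kvfree_defer_pcpu *kd =
                        container_of(w, struct kvfree_defer_pcpu, work);
                struct llist_node *pos, *n;

                /* irq_work runs in a context where spinning on krcp->lock is fine. */
                llist_for_each_safe(pos, n, llist_del_all(&kd->objects))
                        kvfree_rcu_reenter(pos); /* hypothetical: normal path, allow_spin == true */
        }

        static DEFINE_PER_CPU(struct kvfree_defer_pcpu, kvfree_defer) = {
                .work = IRQ_WORK_INIT(kvfree_defer_irq_work),
        };

        static void kvfree_rcu_nolock_sketch(struct llist_node *obj)
        {
                struct kvfree_defer_pcpu *kd = this_cpu_ptr(&kvfree_defer);

                llist_add(obj, &kd->objects);   /* lock-free, NMI-safe */
                irq_work_queue(&kd->work);      /* defer; re-enter from the handler */
        }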
It's not a good idea. !allow_spin doesn't mean that we're in NMI or
reentering. It means that the calling context is unknown, but 99% of the
time it's fine to go the normal route, take the lock, and everything will
proceed as usual. An unconditional "if (!allow_spin) queue irq_work" will
100% hurt performance.

In kfree_nolock() (before sheaves) we had a fallback to irq_work that we
thought would be rare in practice. It turned out that even relatively rare
spikes of irq_work hurt overall throughput by 5% for that workload.

So, no, irq_work must be absolutely the last resort.
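
(Roughly the "last resort" shape, sketched with approximate names;
enqueue_on_krcp() and defer_via_irq_work() are hypothetical helpers:)

        /*
         * Sketch only: even with !allow_spin, first try krcp->lock without
         * spinning; fall back to the deferred llist + irq_work path only
         * when the trylock fails (e.g. we reentered while the lock is held).
         */
        static void kvfree_rcu_nolock_trylock_sketch(struct kfree_rcu_cpu *krcp,
                                                     struct llist_node *obj)
        {
                unsigned long flags;

                if (raw_spin_trylock_irqsave(&krcp->lock, flags)) {
                        /* Expected path ~99% of the time: queue on krcp as usual. */
                        enqueue_on_krcp(krcp, obj);     /* hypothetical helper */
                        raw_spin_unlock_irqrestore(&krcp->lock, flags);
                        return;
                }

                /* Lock is contended, likely reentrancy: defer via irq_work. */
                defer_via_irq_work(obj);        /* hypothetical: llist_add + irq_work_queue */
        }

That way the deferred irq_work branch stays the rare exception rather than
the common case.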
