On Wed, Sep 17, 2025 at 11:55:10AM +0200, Vlastimil Babka wrote:
> On 9/17/25 10:30, Harry Yoo wrote:
> > On Wed, Sep 10, 2025 at 10:01:06AM +0200, Vlastimil Babka wrote:
> >> +				sfw->skip = true;
> >> +				continue;
> >> +			}
> >>
> >> +			INIT_WORK(&sfw->work, flush_rcu_sheaf);
> >> +			sfw->skip = false;
> >> +			sfw->s = s;
> >> +			queue_work_on(cpu, flushwq, &sfw->work);
> >> +			flushed = true;
> >> +		}
> >> +
> >> +		for_each_online_cpu(cpu) {
> >> +			sfw = &per_cpu(slub_flush, cpu);
> >> +			if (sfw->skip)
> >> +				continue;
> >> +			flush_work(&sfw->work);
> >> +		}
> >> +
> >> +		mutex_unlock(&flush_lock);
> >> +	}
> >> +
> >> +	mutex_unlock(&slab_mutex);
> >> +	cpus_read_unlock();
> >> +
> >> +	if (flushed)
> >> +		rcu_barrier();
> >
> > I think we need to call rcu_barrier() even if flushed == false?
> >
> > Maybe a kvfree_rcu()'d object was already waiting for the rcu callback to
> > be processed before flush_all_rcu_sheaves() is called, and
> > in flush_all_rcu_sheaves() we skipped all (cache, cpu) pairs,
> > so flushed == false but the rcu callback isn't processed yet
> > by the end of the function?
> >
> > That sounds very unlikely to happen in a realistic scenario,
> > but still possible...
>
> Yes, also a good point, will flush unconditionally.
>
> Maybe in __kfree_rcu_sheaf() I should also move the call_rcu(...) before
> local_unlock(). So we don't end up seeing a NULL pcs->rcu_free in
> flush_all_rcu_sheaves() because __kfree_rcu_sheaf() already set it to NULL,
> but didn't yet do the call_rcu() as it got preempted after local_unlock().

Makes sense to me.
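If I'm reading the proposal right, the idea is roughly the below (a
hand-written sketch only -- the struct/member/callback names around the
pcs->rcu_free handover are my guesses from the quoted context, not the
actual patch):

	struct slub_percpu_sheaves *pcs;
	struct slab_sheaf *rcu_sheaf;

	local_lock(&s->cpu_sheaves->lock);
	pcs = this_cpu_ptr(s->cpu_sheaves);

	/* ... put the freed object into pcs->rcu_free ... */

	rcu_sheaf = pcs->rcu_free;
	if (rcu_sheaf->size == s->sheaf_capacity) {
		pcs->rcu_free = NULL;
		/*
		 * call_rcu() before local_unlock(): a concurrent
		 * flush_all_rcu_sheaves() either still sees
		 * pcs->rcu_free != NULL and queues the flush work, or the
		 * callback is already queued and the rcu_barrier() at the
		 * end waits for it.
		 */
		call_rcu(&rcu_sheaf->rcu_head, rcu_free_sheaf);
	}

	local_unlock(&s->cpu_sheaves->lock);

i.e. clearing pcs->rcu_free and queueing the callback happen atomically
with respect to the local_lock section, so there is no window where the
flusher sees NULL while the callback is not yet pending.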
> But then rcu_barrier() itself probably won't guarantee that such cpus
> have finished the local_locked section, if we didn't queue work on them.
> So maybe we need synchronize_rcu()?

Ah, it works because a preemption-disabled section acts as an RCU read-side
critical section? But then are we allowed to release the local_lock to
allocate empty sheaves in __kfree_rcu_sheaf()?

-- 
Cheers,
Harry / Hyeonggon