Thanks for the review on v5, Stephen. > -----Original Message----- > From: Stephen Hemminger <[email protected]> > Sent: Tuesday, June 2, 2026 12:58 AM > To: Wei Hu <[email protected]> > Cc: [email protected]; Long Li <[email protected]>; Wei Hu > <[email protected]> > Subject: [EXTERNAL] Re: [PATCH v5 1/1] net/mana: add device reset support > > The RCU use is not really RCU. thread_online/offline are called on every rx/tx > burst, and the "thread" token is the queue index, not a thread. So it is a > per- > queue in-use flag paid for on the hottest path, plus a new library dependency, > to express "wait until no queue is mid-burst" -- which the driver already half > does by swapping the burst function to mana_*_burst_removed and checking > dev_state. Please drop the rcu use. If you must drain readers, a per-queue > atomic flag is lighter and local; the fast path already has the dev_state > acquire- > load it needs. >
I have tested just swapping to mana_*_burst_removed doesn't stop the application from entering the burst functions. I will add per-queue atomic flags and spin poll all queue flags in the slower reset path. I think the actual fast path cost is nearly identical. However, it does remove the rcu library dependency. > The data path only needs the atomic, and that part is fine. The lock is > legitimate for serializing the teardown/rebuild against control ops, but the > way > it is used is the problem: it is acquired in mana_intr_handler, released in > mana_reset_enter, re-acquired in mana_reset_thread, and released in > mana_reset_exit_delay. That cross-function, cross-thread handoff is exactly > why every function needs __rte_no_thread_safety_analysis. > Acquire and release the lock in the same function and the annotations all go > away. Turning off thread-safety analysis needs strong justification and this > does not have it. I will remove the __rte_no_thread_safety_analysis from the code. It was added when I used spin lock and it caused clang build warnings. Now I have changed it to pthread mutex and clang build is clean without this. > - thread_online is taken at the top of the burst functions but > thread_offline is only on some return paths. Any early return that > misses it leaves a token non-quiescent and rte_rcu_qsbr_check() in > mana_reset_enter spins forever. This is the kind of breakage the per-burst > bracketing invites -- another reason to drop it. > I have checked mana_rx_burst and mana_tx_burst. There's no actual bug now -- every return path has thread-offline. Changing rcu to per-queue flags would still need to reset the flag in every return path. I agree it is kind of fragile. However, the reset path needs it to make sure there is no thread still in the fast path before it can start to tear down the resources. I will send out a new revision. Thanks, Wei

