On Fri, Mar 13, 2026 at 12:35:35PM -0700, Jakub Kicinski wrote:
> On Fri, 13 Mar 2026 18:31:12 +0100 Adrian Moreno wrote:
> > Currently the entire ovs module is write-protected using the global
> > ovs_mutex. While this simple approach works fine for control-plane
> > operations (such as vport configurations), requiring the global mutex
> > for flow modifications can be problematic.
>
> YNL selftest for ovs seems to trigger this:
>
> [ 88.995118][ T50] =============================
> [ 88.995287][ T50] WARNING: suspicious RCU usage
> [ 88.995448][ T50] 7.0.0-rc3-virtme #1 Not tainted
> [ 88.995630][ T50] -----------------------------
> [ 88.995788][ T50] net/openvswitch/datapath.c:2666 RCU-list traversed in
> non-reader section!!
> [ 88.996122][ T50]
> [ 88.996122][ T50] other info that might help us debug this:
> [ 88.996122][ T50]
> [ 88.996388][ T50]
> [ 88.996388][ T50] rcu_scheduler_active = 2, debug_locks = 1
> [ 88.996640][ T50] 3 locks held by kworker/2:1/50:
> [ 88.996800][ T50] #0: ff11000001139b48
> ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0xcb4/0x1390
> [ 88.997092][ T50] #1: ffa000000036fd10
> ((work_completion)(&(&ovs_net->masks_rebalance)->work)){+.+.}-{0:0}, at:
> process_one_work+0xd16/0x1390
> [ 88.997420][ T50] #2: ffffffffc08038e8 (ovs_mutex){+.+.}-{4:4}, at:
> ovs_dp_masks_rebalance+0x29/0x270 [openvswitch]
> [ 88.997707][ T50]
> [ 88.997707][ T50] stack backtrace:
> [ 88.997898][ T50] CPU: 2 UID: 0 PID: 50 Comm: kworker/2:1 Not tainted
> 7.0.0-rc3-virtme #1 PREEMPT(full)
> [ 88.997903][ T50] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [ 88.997904][ T50] Workqueue: events ovs_dp_masks_rebalance [openvswitch]
> [ 88.997911][ T50] Call Trace:
> [ 88.997914][ T50] <TASK>
> [ 88.997916][ T50] dump_stack_lvl+0x6f/0xa0
> [ 88.997921][ T50] lockdep_rcu_suspicious.cold+0x4f/0xad
> [ 88.997928][ T50] ovs_dp_masks_rebalance+0x226/0x270 [openvswitch]
> [ 88.997933][ T50] process_one_work+0xd57/0x1390
> [ 88.997940][ T50] ? pwq_dec_nr_in_flight+0x700/0x700
> [ 88.997942][ T50] ? lock_acquire.part.0+0xbc/0x260
> [ 88.997950][ T50] worker_thread+0x4d6/0xd40
> [ 88.997954][ T50] ? rescuer_thread+0x1330/0x1330
> [ 88.997956][ T50] ? __kthread_parkme+0xb3/0x200
> [ 88.997960][ T50] ? rescuer_thread+0x1330/0x1330
> [ 88.997962][ T50] kthread+0x30f/0x3f0
> [ 88.997964][ T50] ? trace_irq_enable.constprop.0+0x13c/0x190
> [ 88.997967][ T50] ? kthread_affine_node+0x150/0x150
> [ 88.997970][ T50] ret_from_fork+0x472/0x6b0
> [ 88.997974][ T50] ? arch_exit_to_user_mode_prepare.isra.0+0x140/0x140
> [ 88.997977][ T50] ? __switch_to+0x538/0xcf0
> [ 88.997980][ T50] ? kthread_affine_node+0x150/0x150
> [ 88.997983][ T50] ret_from_fork_asm+0x11/0x20
> [ 88.997991][ T50] </TASK>
>
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/tools/net/ynl/tests/ovs.c
>
Thanks!
I wonder why this did not come up in my tests. Jakub, would you mind
sharing the config used for this test?
For the mask rebalancing I initially considered doing the same thing as for
the other flow commands (RCU + refcount + flow_table mutex). But since it is
not that critical and does not run in the context of the handler threads, I
fell back to plain locking under ovs_mutex in this case. That
"list_for_each_entry_rcu" is a leftover from my initial attempt and should
be replaced with a plain "list_for_each_entry".
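For reference, a sketch of what the fixed worker would look like (paraphrased
from memory, not the exact upstream source; names such as
ovs_flow_masks_rebalance and the rescheduling interval are assumed from the
existing code):

	/* net/openvswitch/datapath.c -- sketch only, not a compiled patch */
	static void ovs_dp_masks_rebalance(struct work_struct *work)
	{
		struct ovs_net *ovs_net = container_of(work, struct ovs_net,
						       masks_rebalance.work);
		struct datapath *dp;

		ovs_lock();
		/* ovs_mutex is held, so plain list traversal is correct here.
		 * list_for_each_entry_rcu() would require an RCU read-side
		 * section (or a lockdep condition argument), which is what
		 * triggers the splat above.
		 */
		list_for_each_entry(dp, &ovs_net->dps, list_node)
			ovs_flow_masks_rebalance(&dp->table);
		ovs_unlock();

		schedule_delayed_work(&ovs_net->masks_rebalance,
				      msecs_to_jiffies(DP_MASKS_REBALANCE_INTERVAL));
	}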
Thanks.
Adrián