On Fri, Mar 13, 2026 at 12:35:35PM -0700, Jakub Kicinski wrote:
> On Fri, 13 Mar 2026 18:31:12 +0100 Adrian Moreno wrote:
> > Currently the entire ovs module is write-protected using the global
> > ovs_mutex. While this simple approach works fine for control-plane
> > operations (such as vport configurations), requiring the global mutex
> > for flow modifications can be problematic.
>
> YNL selftest for ovs seems to trigger this:
>
> [   88.995118][   T50] =============================
> [   88.995287][   T50] WARNING: suspicious RCU usage
> [   88.995448][   T50] 7.0.0-rc3-virtme #1 Not tainted
> [   88.995630][   T50] -----------------------------
> [   88.995118][   T50] net/openvswitch/datapath.c:2666 RCU-list traversed in non-reader section!!
> [   88.996122][   T50]
> [   88.996122][   T50] other info that might help us debug this:
> [   88.996122][   T50]
> [   88.996388][   T50]
> [   88.996388][   T50] rcu_scheduler_active = 2, debug_locks = 1
> [   88.996640][   T50] 3 locks held by kworker/2:1/50:
> [   88.996800][   T50]  #0: ff11000001139b48 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0xcb4/0x1390
> [   88.997092][   T50]  #1: ffa000000036fd10 ((work_completion)(&(&ovs_net->masks_rebalance)->work)){+.+.}-{0:0}, at: process_one_work+0xd16/0x1390
> [   88.997420][   T50]  #2: ffffffffc08038e8 (ovs_mutex){+.+.}-{4:4}, at: ovs_dp_masks_rebalance+0x29/0x270 [openvswitch]
> [   88.997707][   T50]
> [   88.997707][   T50] stack backtrace:
> [   88.997898][   T50] CPU: 2 UID: 0 PID: 50 Comm: kworker/2:1 Not tainted 7.0.0-rc3-virtme #1 PREEMPT(full)
> [   88.997903][   T50] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [   88.997904][   T50] Workqueue: events ovs_dp_masks_rebalance [openvswitch]
> [   88.997911][   T50] Call Trace:
> [   88.997914][   T50]  <TASK>
> [   88.997916][   T50]  dump_stack_lvl+0x6f/0xa0
> [   88.997921][   T50]  lockdep_rcu_suspicious.cold+0x4f/0xad
> [   88.997928][   T50]  ovs_dp_masks_rebalance+0x226/0x270 [openvswitch]
> [   88.997933][   T50]  process_one_work+0xd57/0x1390
> [   88.997940][   T50]  ? pwq_dec_nr_in_flight+0x700/0x700
> [   88.997942][   T50]  ? lock_acquire.part.0+0xbc/0x260
> [   88.997950][   T50]  worker_thread+0x4d6/0xd40
> [   88.997954][   T50]  ? rescuer_thread+0x1330/0x1330
> [   88.997956][   T50]  ? __kthread_parkme+0xb3/0x200
> [   88.997960][   T50]  ? rescuer_thread+0x1330/0x1330
> [   88.997962][   T50]  kthread+0x30f/0x3f0
> [   88.997964][   T50]  ? trace_irq_enable.constprop.0+0x13c/0x190
> [   88.997967][   T50]  ? kthread_affine_node+0x150/0x150
> [   88.997970][   T50]  ret_from_fork+0x472/0x6b0
> [   88.997974][   T50]  ? arch_exit_to_user_mode_prepare.isra.0+0x140/0x140
> [   88.997977][   T50]  ? __switch_to+0x538/0xcf0
> [   88.997980][   T50]  ? kthread_affine_node+0x150/0x150
> [   88.997983][   T50]  ret_from_fork_asm+0x11/0x20
> [   88.997991][   T50]  </TASK>
>
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/tools/net/ynl/tests/ovs.c
>

Thanks!

I wonder why this did not come up in my tests. Jakub, would you mind
sharing the config used for this test?

For mask rebalancing I initially thought of doing the same thing as for
the other flow commands (rcu + refcount + flow_table mutex).

But given that it's not that critical and it doesn't run in the context of
handler threads, I fell back to plain locking in this case. That
"list_for_each_entry_rcu" is a leftover from my initial attempt and should
be replaced with a normal "list_for_each_entry".
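
In other words, something along these lines (a sketch only; the exact
context lines may differ from the patched tree, this assumes the
iteration in ovs_dp_masks_rebalance() matches what's upstream today):

	static void ovs_dp_masks_rebalance(struct work_struct *work)
	{
		struct ovs_net *ovs_net = container_of(work, struct ovs_net,
						       masks_rebalance.work);
		struct datapath *dp;

		ovs_lock();
		/* ovs_mutex is held, so plain list iteration is correct
		 * here; the _rcu variant trips lockdep because we are not
		 * in an RCU read-side critical section.
		 */
		list_for_each_entry(dp, &ovs_net->dps, list_node)
			ovs_flow_masks_rebalance(&dp->table);
		ovs_unlock();

		schedule_delayed_work(&ovs_net->masks_rebalance,
				      msecs_to_jiffies(DP_MASKS_REBALANCE_INTERVAL));
	}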

Thanks.
Adrián

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev