On 10 Dec 2025, at 16:12, Adrián Moreno wrote:

> On Wed, Dec 10, 2025 at 02:28:36PM +0100, Eelco Chaudron wrote:
>>
>>
>> On 10 Dec 2025, at 13:59, Toke Høiland-Jørgensen wrote:
>>
>>> The openvswitch teardown code will immediately call
>>> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
>>> It will then start the dp_notify_work workqueue, which will later end up
>>> calling the vport destroy() callback. This callback takes the RTNL to do
>>> another ovs_netdev_detach_dev(), which in this case is unnecessary.
>>> This causes extra pressure on the RTNL, in some cases leading to
>>> "unregister_netdevice: waiting for XX to become free" warnings on
>>> teardown.
>>>
>>> We can straightforwardly avoid the extra RTNL lock acquisition by
>>> checking the device flags before taking the lock, and skip the locking
>>> altogether if the IFF_OVS_DATAPATH flag has already been unset.
>>>
>>> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
>>> Tested-by: Adrian Moreno <[email protected]>
>>> Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
>>> ---
>>>  net/openvswitch/vport-netdev.c | 11 +++++++----
>>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
>>> index 91a11067e458..519f038526f9 100644
>>> --- a/net/openvswitch/vport-netdev.c
>>> +++ b/net/openvswitch/vport-netdev.c
>>> @@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport)
>>>
>>>  static void netdev_destroy(struct vport *vport)
>>>  {
>>> -   rtnl_lock();
>>> -   if (netif_is_ovs_port(vport->dev))
>>> -           ovs_netdev_detach_dev(vport);
>>> -   rtnl_unlock();
>>> +   if (netif_is_ovs_port(vport->dev)) {
>>
>> Hi Toke,
>>
>> Thanks for digging into this!
>>
>> The patch looks technically correct to me, but maybe we should add a comment 
>> here explaining why we can do it this way, i.e., why we can call 
>> netif_is_ovs_port() without the lock.
>> For example:
>>
>> /* We can avoid taking the rtnl lock as the IFF_OVS_DATAPATH flag is 
>> set/cleared in either netdev_create()/netdev_destroy(), which are both 
>> called under the global ovs_lock(). */
>>
>> Additionally, I think the second netif_is_ovs_port() under the rtnl lock is 
>> not required due to the above.
>
> In the case of netdevs being unregistered outside of OVS, the
> ovs_dp_device_notifier gets called which then runs
> "ovs_netdev_detach_dev" only under RTNL. Locking ovs_lock() in that
> callback would be problematic since the rest of the OVS code assumes
> ovs_lock is nested outside of RTNL.
>
> So this could race with an ovs_vport_cmd_del AFAICS.

I'm not fully sure which code path you are referring to, but if it’s the
ovs_dp_notify_wq()->dp_detach_port_notify()->ovs_dp_detach_port() path, that
one takes the ovs_lock().

By the way: in your testing, did you see the expected improvement, i.e., no 
more “unregister” delays?
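
To make the expected improvement concrete, here is a rough userspace model
of the control flow in the patched netdev_destroy(). It is only a sketch:
every fake_* name is hypothetical, the "lock" is a plain counter rather
than the real RTNL, and it is single-threaded, so it says nothing about
the race discussed above. It only shows that the lock is skipped once the
flag has already been cleared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are kernel APIs. */
static int fake_rtnl_acquisitions;  /* how often the "RTNL" was taken */
static bool fake_rtnl_held;

struct fake_dev {
	bool is_ovs_port;   /* models the IFF_OVS_DATAPATH flag */
	int detach_calls;   /* how often the detach actually ran */
};

static void fake_rtnl_lock(void)
{
	fake_rtnl_held = true;
	fake_rtnl_acquisitions++;
}

static void fake_rtnl_unlock(void)
{
	fake_rtnl_held = false;
}

/* Models ovs_netdev_detach_dev(): must run with the lock held. */
static void fake_detach_dev(struct fake_dev *dev)
{
	assert(fake_rtnl_held);
	dev->is_ovs_port = false;
	dev->detach_calls++;
}

/* Models the patched netdev_destroy(). */
static void fake_destroy(struct fake_dev *dev)
{
	if (dev->is_ovs_port) {         /* unlocked check, may skip the lock */
		fake_rtnl_lock();
		if (dev->is_ovs_port)   /* check again while holding the lock */
			fake_detach_dev(dev);
		fake_rtnl_unlock();
	}
}
```

If the notifier path has already detached the device, fake_destroy()
returns without touching the lock counter; otherwise it takes the lock
exactly once.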

//Eelco

>>
>>> +           rtnl_lock();
>>> +           /* check again while holding the lock */
>>> +           if (netif_is_ovs_port(vport->dev))
>>> +                   ovs_netdev_detach_dev(vport);
>>> +           rtnl_unlock();
>>> +   }
>>>
>>>     call_rcu(&vport->rcu, vport_netdev_free);
>>>  }
>>> --
>>> 2.52.0
>>

