On 10 Dec 2025, at 16:12, Adrián Moreno wrote:
> On Wed, Dec 10, 2025 at 02:28:36PM +0100, Eelco Chaudron wrote:
>>
>>
>> On 10 Dec 2025, at 13:59, Toke Høiland-Jørgensen wrote:
>>
>>> The openvswitch teardown code will immediately call
>>> ovs_netdev_detach_dev() in response to a NETDEV_UNREGISTER notification.
>>> It will then start the dp_notify_work workqueue, which will later end up
>>> calling the vport destroy() callback. This callback takes the RTNL to do
>>> another ovs_netdev_detach_dev(), which in this case is unnecessary.
>>> This causes extra pressure on the RTNL, in some cases leading to
>>> "unregister_netdevice: waiting for XX to become free" warnings on
>>> teardown.
>>>
>>> We can straightforwardly avoid the extra RTNL lock acquisition by
>>> checking the device flags before taking the lock, and skip the locking
>>> altogether if the IFF_OVS_DATAPATH flag has already been unset.
>>>
>>> Fixes: b07c26511e94 ("openvswitch: fix vport-netdev unregister")
>>> Tested-by: Adrian Moreno <[email protected]>
>>> Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
>>> ---
>>> net/openvswitch/vport-netdev.c | 11 +++++++----
>>> 1 file changed, 7 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
>>> index 91a11067e458..519f038526f9 100644
>>> --- a/net/openvswitch/vport-netdev.c
>>> +++ b/net/openvswitch/vport-netdev.c
>>> @@ -160,10 +160,13 @@ void ovs_netdev_detach_dev(struct vport *vport)
>>>
>>>  static void netdev_destroy(struct vport *vport)
>>>  {
>>> -	rtnl_lock();
>>> -	if (netif_is_ovs_port(vport->dev))
>>> -		ovs_netdev_detach_dev(vport);
>>> -	rtnl_unlock();
>>> +	if (netif_is_ovs_port(vport->dev)) {
>>
>> Hi Toke,
>>
>> Thanks for digging into this!
>>
>> The patch looks technically correct to me, but maybe we should add a comment
>> here explaining why we can do it this way, i.e., why we can call
>> netif_is_ovs_port() without the lock.
>> For example:
>>
>> /* We can avoid taking the rtnl lock as the IFF_OVS_DATAPATH flag is
>> set/cleared in either netdev_create()/netdev_destroy(), which are both
>> called under the global ovs_lock(). */
>>
>> Additionally, I think the second netif_is_ovs_port() under the rtnl lock is
>> not required due to the above.
>
> In the case of netdevs being unregistered outside of OVS, the
> ovs_dp_device_notifier gets called which then runs
> "ovs_netdev_detach_dev" only under RTNL. Locking ovs_lock() in that
> callback would be problematic since the rest of the OVS code assumes
> ovs_lock is nested outside of RTNL.
>
> So this could race with an ovs_vport_cmd_del AFAICS.

Not fully sure I understand the code path you are referring to, but if it’s
through ovs_dp_notify_wq()->dp_detach_port_notify()->ovs_dp_detach_port(), it
takes the ovs_lock().

By the way: in your testing, did you see the expected improvement, i.e., no
more “unregister” delays?

//Eelco

>>
>>> +		rtnl_lock();
>>> +		/* check again while holding the lock */
>>> +		if (netif_is_ovs_port(vport->dev))
>>> +			ovs_netdev_detach_dev(vport);
>>> +		rtnl_unlock();
>>> +	}
>>>
>>>  	call_rcu(&vport->rcu, vport_netdev_free);
>>>  }
>>> --
>>> 2.52.0
>>
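
For reference, the check / lock / re-check shape of the patched
netdev_destroy() quoted above can be modelled in userspace. This is only a
rough sketch under assumptions: every model_* name is hypothetical, a pthread
mutex stands in for the RTNL, and an atomic bool stands in for the
IFF_OVS_DATAPATH flag. It is not kernel code and it says nothing about
whether the unlocked read is safe against the notifier path discussed above;
it only shows that the lock is skipped once the flag has been cleared.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;
static bool model_rtnl_held;        /* only written while holding model_rtnl */
static int model_rtnl_acquisitions; /* counts how often the lock was taken */

struct model_vport {
	atomic_bool is_ovs_port; /* models the IFF_OVS_DATAPATH flag */
};

static void model_rtnl_lock(void)
{
	pthread_mutex_lock(&model_rtnl);
	model_rtnl_held = true;
	model_rtnl_acquisitions++;
}

static void model_rtnl_unlock(void)
{
	model_rtnl_held = false;
	pthread_mutex_unlock(&model_rtnl);
}

/* Models ovs_netdev_detach_dev(): the caller must hold the lock. */
static void model_detach_dev(struct model_vport *vport)
{
	assert(model_rtnl_held); /* stand-in for ASSERT_RTNL() */
	atomic_store(&vport->is_ovs_port, false);
}

/* Models the patched netdev_destroy(): skip the lock if already detached. */
static void model_netdev_destroy(struct model_vport *vport)
{
	if (atomic_load(&vport->is_ovs_port)) {
		model_rtnl_lock();
		/* check again while holding the lock */
		if (atomic_load(&vport->is_ovs_port))
			model_detach_dev(vport);
		model_rtnl_unlock();
	}
}
```

If the notifier path has already run model_detach_dev() under the lock, a
later model_netdev_destroy() returns without touching the mutex, which is the
saving the commit message describes for the NETDEV_UNREGISTER case.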