On Tue, 2025-05-06 at 10:12 -0700, Stanislav Fomichev wrote: > On 05/06, Cosmin Ratiu wrote: > > __netdev_update_features() expects the netdevice to be ops-locked, > > but > > it gets called recursively on the lower level netdevices to sync > > their > > features, and nothing locks those. > > > > This commit fixes that, with the assumption that it shouldn't be > > possible > > for both higher-level and lover-level netdevices to require the > > instance > > lock, because that would lead to lock dependency warnings. > > > > Without this, playing with higher level (e.g. vxlan) netdevices on > > top > > of netdevices with instance locking enabled can run into issues: > > Mentioning vxlan is a bit confusing here; it shouldn't let you flip > lro (I > think). Which upper are you testing against?
It is vxlan, but LRO is just a red herring in this case, mlx5e_set_features calls every feature handler in turn, and this is just the example I picked from the sea of stack traces. > > Trying to understand if we can cover this case in the selftests. > netdevsim also doesn't expose F_LRO feature... (yet?) I see you found a way with teaming, but I think in essence a sequence of commands that makes __netdev_update_features call itself recursively once on the lower dev will trigger the netdev_ops_assert_locked on the lower dev. Thanks for the review! Cosmin.
