On Tue, Feb 11, 2025 at 06:43:21PM +0100, Marcin Szycik wrote:
> If ice_ena_vfs() fails after calling ice_create_vf_entries(), it frees
> all VFs without removing them from the snapshot PF-VF mailbox list,
> leading to list corruption.
>
> Reproducer:
> devlink dev eswitch set $PF1_PCI mode switchdev
> ip l s $PF1 up
> ip l s $PF1 promisc on
> sleep 1
> echo 1 > /sys/class/net/$PF1/device/sriov_numvfs
Should the line above be "echo 0", so that the existing VFs are removed
before VFs are created again below? (I'm looking at sriov_numvfs_store().)
> sleep 1
> echo 1 > /sys/class/net/$PF1/device/sriov_numvfs
>
> Trace (minimized):
> list_add corruption. next->prev should be prev (ffff8882e241c6f0), but was 0000000000000000. (next=ffff888455da1330).
> kernel BUG at lib/list_debug.c:29!
> RIP: 0010:__list_add_valid_or_report+0xa6/0x100
> ice_mbx_init_vf_info+0xa7/0x180 [ice]
> ice_initialize_vf_entry+0x1fa/0x250 [ice]
> ice_sriov_configure+0x8d7/0x1520 [ice]
> ? __percpu_ref_switch_mode+0x1b1/0x5d0
> ? __pfx_ice_sriov_configure+0x10/0x10 [ice]
>
> Sometimes a KASAN report can be seen instead with a similar stack trace:
> BUG: KASAN: use-after-free in __list_add_valid_or_report+0xf1/0x100
>
> VFs are added to this list in ice_mbx_init_vf_info(), but only removed
> in ice_free_vfs(). Move the removal to ice_free_vf_entries(), which is
> also called in the other places where VFs are removed (including
> ice_free_vfs() itself).
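For illustration, the failure mode described above can be reproduced in a
small userspace model. This is only a sketch under stated assumptions:
struct list_node, struct toy_vf and all function names below are
hypothetical stand-ins, not the ice driver's code, and "freeing" is
modelled by setting a flag so that the example never touches released
memory. It shows that a list keeps a stale entry when the per-entry free
path does not unlink the object, and that unlinking in that path (the
role ice_free_vf_entries() plays in the patch) leaves the list clean.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for the kernel's circular, doubly linked struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail_node(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_del_node(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = node;
	node->prev = node;
}

/* Hypothetical VF object; only the mailbox list linkage is modelled. */
struct toy_vf {
	struct list_node mbx_entry; /* first member, so a node pointer is also a VF pointer */
	bool freed;                 /* stands in for the memory having been released */
};

/* Models the step where a VF is linked into the PF-wide mailbox list. */
static void toy_vf_link(struct toy_vf *vf, struct list_node *mbx_list)
{
	list_add_tail_node(&vf->mbx_entry, mbx_list);
}

/*
 * Models the per-entry free path.
 * unlink_in_free == false: the entry is released while still on the list.
 * unlink_in_free == true:  the entry is unlinked before being released.
 */
static void toy_vf_release(struct toy_vf *vf, bool unlink_in_free)
{
	assert(!vf->freed);
	if (unlink_in_free)
		list_del_node(&vf->mbx_entry);
	vf->freed = true;
}

/* Counts entries that are still reachable from the list but already released. */
static int count_stale_entries(const struct list_node *mbx_list)
{
	const struct list_node *pos;
	int stale = 0;

	for (pos = mbx_list->next; pos != mbx_list; pos = pos->next)
		if (((const struct toy_vf *)pos)->freed)
			stale++;
	return stale;
}

/* Models an enable attempt that creates one VF and then fails and cleans up. */
int simulate_failed_enable(bool unlink_in_free)
{
	struct list_node mbx_list;
	struct toy_vf vf = { .freed = false };

	list_init(&mbx_list);
	toy_vf_link(&vf, &mbx_list);
	toy_vf_release(&vf, unlink_in_free);
	return count_stale_entries(&mbx_list);
}
```

In this model, simulate_failed_enable(false) leaves one stale entry on the
list, which is the state a later list_add would trip over, while
simulate_failed_enable(true) leaves none.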
>
> Fixes: 8cd8a6b17d27 ("ice: move VF overflow message count into struct ice_mbx_vf_info")
> Reported-by: Sujai Buvaneswaran <[email protected]>
> Closes: https://lore.kernel.org/intel-wired-lan/ph0pr11mb50138b635f2e5ceb7075325d96...@ph0pr11mb5013.namprd11.prod.outlook.com
> Reviewed-by: Martyna Szapar-Mudlaw <[email protected]>
> Signed-off-by: Marcin Szycik <[email protected]>
The comment above notwithstanding, I agree that this addresses the
bug you have described.
Reviewed-by: Simon Horman <[email protected]>