On 13.02.2025 11:55, Simon Horman wrote:
> On Tue, Feb 11, 2025 at 06:43:21PM +0100, Marcin Szycik wrote:
>> If ice_ena_vfs() fails after calling ice_create_vf_entries(), it frees
>> all VFs without removing them from the snapshot PF-VF mailbox list,
>> leading to list corruption.
>>
>> Reproducer:
>> devlink dev eswitch set $PF1_PCI mode switchdev
>> ip l s $PF1 up
>> ip l s $PF1 promisc on
>> sleep 1
>> echo 1 > /sys/class/net/$PF1/device/sriov_numvfs
>
> Should the line above be "echo 0" to remove the VFs before creating VFs
> below (I'm looking at sriov_numvfs_store())?

Both "echo 1" commands fail (I'm fixing that in patch 2/2), which is why
there's no "echo 0" in between. Also, this minimal example assumes that no
VFs were initially present.

Thanks for reviewing!
Marcin

>> sleep 1
>> echo 1 > /sys/class/net/$PF1/device/sriov_numvfs
>>
>> Trace (minimized):
>> list_add corruption. next->prev should be prev (ffff8882e241c6f0), but was
>> 0000000000000000. (next=ffff888455da1330).
>> kernel BUG at lib/list_debug.c:29!
>> RIP: 0010:__list_add_valid_or_report+0xa6/0x100
>> ice_mbx_init_vf_info+0xa7/0x180 [ice]
>> ice_initialize_vf_entry+0x1fa/0x250 [ice]
>> ice_sriov_configure+0x8d7/0x1520 [ice]
>> ? __percpu_ref_switch_mode+0x1b1/0x5d0
>> ? __pfx_ice_sriov_configure+0x10/0x10 [ice]
>>
>> Sometimes a KASAN report can be seen instead with a similar stack trace:
>> BUG: KASAN: use-after-free in __list_add_valid_or_report+0xf1/0x100
>>
>> VFs are added to this list in ice_mbx_init_vf_info(), but only removed
>> in ice_free_vfs(). Move the removal to ice_free_vf_entries(), which is
>> also called in other places where VFs are removed (including
>> ice_free_vfs() itself).
>>
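
To illustrate the failure mode described above, here is a minimal userspace
sketch. It is not driver code: struct list_node stands in for the kernel's
struct list_head, and vf_entry, vf_pool, create_vf(), free_vf() and
retry_after_failed_enable() are hypothetical names invented for this example.
Freeing is simulated by zeroing an entry in a static pool, so the stale
pointer can be inspected without real use-after-free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical VF entry; only the list linkage matters here. */
struct vf_entry {
	struct list_node mbx_node;
	int vf_id;
};

#define MAX_VFS 4
static struct vf_entry vf_pool[MAX_VFS];	/* static pool instead of a heap */
static struct list_node mbx_list;		/* stand-in for the mailbox list head */

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert after head, but first check the neighbours still point at each
 * other, similar in spirit to the check behind the trace above. */
static bool list_add_checked(struct list_node *node, struct list_node *head)
{
	struct list_node *next = head->next;

	if (next->prev != head || head->next != next)
		return false;	/* corruption detected, refuse to insert */

	next->prev = node;
	node->next = next;
	node->prev = head;
	head->next = node;
	return true;
}

static void list_del_node(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
}

static bool create_vf(int id)
{
	assert(id >= 0 && id < MAX_VFS);
	vf_pool[id].vf_id = id;
	return list_add_checked(&vf_pool[id].mbx_node, &mbx_list);
}

/* "Free" an entry by zeroing it. With unlink_first == false the list head
 * keeps pointing at the zeroed entry, which models the bug. */
static void free_vf(int id, bool unlink_first)
{
	assert(id >= 0 && id < MAX_VFS);
	if (unlink_first)
		list_del_node(&vf_pool[id].mbx_node);
	memset(&vf_pool[id], 0, sizeof(vf_pool[id]));
}

/* First attempt creates an entry and then fails, so the entry is freed.
 * Returns whether the list insertion of the second attempt is valid. */
bool retry_after_failed_enable(bool unlink_on_free)
{
	list_init(&mbx_list);
	memset(vf_pool, 0, sizeof(vf_pool));

	if (!create_vf(0))
		return false;
	free_vf(0, unlink_on_free);	/* error path of the first attempt */
	return create_vf(1);		/* second attempt */
}
```

In this sketch, retry_after_failed_enable(false) fails the validity check
because next->prev of the stale entry is NULL rather than the list head,
while retry_after_failed_enable(true) succeeds. Unlinking inside the
per-entry free helper is the analogue of doing the removal in
ice_free_vf_entries(), so every path that frees entries also unlinks them.
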
>> Fixes: 8cd8a6b17d27 ("ice: move VF overflow message count into struct ice_mbx_vf_info")
>> Reported-by: Sujai Buvaneswaran <[email protected]>
>> Closes: https://lore.kernel.org/intel-wired-lan/ph0pr11mb50138b635f2e5ceb7075325d96...@ph0pr11mb5013.namprd11.prod.outlook.com
>> Reviewed-by: Martyna Szapar-Mudlaw <[email protected]>
>> Signed-off-by: Marcin Szycik <[email protected]>
>
> The comment above notwithstanding, I agree that this addresses the
> bug you have described.
>
> Reviewed-by: Simon Horman <[email protected]>
>