On 1/6/2026 5:04 PM, Li Li via Intel-wired-lan wrote:
When an idpf HW reset is triggered, it clears the vport but does
not clear the netdev held by vport:
// In idpf_vport_dealloc() called by idpf_init_hard_reset(),
// idpf_init_hard_reset() sets IDPF_HR_RESET_IN_PROG, so
// idpf_decfg_netdev() doesn't get called.
if (!test_bit(IDPF_HR_RESET_IN_PROG, adapter->flags))
idpf_decfg_netdev(vport);
// idpf_decfg_netdev() would clear netdev but it isn't called:
unregister_netdev(vport->netdev);
free_netdev(vport->netdev);
vport->netdev = NULL;
// Later in idpf_init_hard_reset(), the vport is cleared:
kfree(adapter->vports);
adapter->vports = NULL;
During an idpf HW reset, when "ethtool -g/-G" is called on the netdev,
the vport associated with the netdev is NULL, and so a kernel panic
would happen:
[ 513.185327] BUG: kernel NULL pointer dereference, address: 0000000000000038
...
[ 513.232756] RIP: 0010:idpf_get_ringparam+0x45/0x80
This can be reproduced reliably by injecting a TX timeout to cause
an idpf HW reset, and injecting a virtchnl error to cause the HW
reset to fail and retry, while calling "ethtool -g/-G" on the netdev
at the same time.
I have posted series that resolves these issues in the reset path by
reshuffling the flow a bit and adding netif_device_detach/attach to
make sure the netdevs are better protected in the middle of a reset:
https://lore.kernel.org/intel-wired-lan/[email protected]/
If you are still seeing issues with the above applied, let me know and I
can take a look.
With this patch applied, we see the following error but no kernel
panics anymore:
[ 476.323630] idpf 0000:05:00.0 eth1: failed to get ring params due to no
vport in netdev
Signed-off-by: Li Li <[email protected]>
---
drivers/net/ethernet/intel/idpf/idpf_ethtool.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
index d5711be0b8e69..6a4b630b786c2 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
@@ -639,6 +638,10 @@ static void idpf_get_ringparam(struct net_device *netdev,
idpf_vport_ctrl_lock(netdev);
vport = idpf_netdev_to_vport(netdev);
+ if (!vport) {
We used to have these all over the place, but the code was changed to
rely on idpf_vport_ctrl_lock() for the protection of the vport state.
Still some issues remain with the error paths (hence the series above),
but in general we don't want to resort to vport NULL checks and rather
fix the reset flows to rely on cleaner logic and locks.
Thanks,
Emil
+ netdev_err(netdev, "failed to get ring params due to no vport in
netdev\n");
+ goto unlock;
+ }
ring->rx_max_pending = IDPF_MAX_RXQ_DESC;
ring->tx_max_pending = IDPF_MAX_TXQ_DESC;
@@ -647,6 +651,7 @@ static void idpf_get_ringparam(struct net_device *netdev,
kring->tcp_data_split = idpf_vport_get_hsplit(vport);
+unlock:
idpf_vport_ctrl_unlock(netdev);
}
@@ -673,6 +674,11 @@ static int idpf_set_ringparam(struct net_device *netdev,
idpf_vport_ctrl_lock(netdev);
vport = idpf_netdev_to_vport(netdev);
+ if (!vport) {
+ netdev_err(netdev, "ring params not changed due to no vport in
netdev\n");
+ err = -EFAULT;
+ goto unlock_mutex;
+ }
idx = vport->idx;