Hi Dawid,

One question inline.

On Mon, Aug 12, 2024 at 3:52 PM Dawid Osuchowski
<[email protected]> wrote:
>
> Ethtool callbacks can be executed while reset is in progress and try to
> access deleted resources, e.g. getting coalesce settings can result in a
> NULL pointer dereference seen below.
>
> Once the driver is fully initialized, trigger reset:
>         # echo 1 > /sys/class/net/<interface>/device/reset
> when reset is in progress try to get coalesce settings using ethtool:
>         # ethtool -c <interface>
>
> Calling netif_device_detach() before reset makes the net core not call
> the driver when ethtool command is issued, the attempt to execute an
> ethtool command during reset will result in the following message:
>
>     netlink error: No such device
>
> instead of NULL pointer dereference. Once reset is done and
> ice_rebuild() is executing, the netif_device_attach() is called to allow
> for ethtool operations to occur again in a safe manner.
>
> [  +0.000105] BUG: kernel NULL pointer dereference, address: 0000000000000020
> [  +0.000027] #PF: supervisor read access in kernel mode
> [  +0.000011] #PF: error_code(0x0000) - not-present page
> [  +0.000011] PGD 0 P4D 0
> [  +0.000008] Oops: Oops: 0000 [#1] PREEMPT SMP PTI
> [  +0.000012] CPU: 11 PID: 19713 Comm: ethtool Tainted: G S                 6.10.0-rc7+ #7
> [  +0.000015] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
> [  +0.000013] RIP: 0010:ice_get_q_coalesce+0x2e/0xa0 [ice]
> [  +0.000090] Code: 00 55 53 48 89 fb 48 89 f7 48 83 ec 08 0f b7 8b 86 04 00 00 0f b7 83 82 04 00 00 39 d1 7e 30 48 8b 4b 18 48 63 ea 48 8b 0c e9 <48> 8b 71 20 48 81 c6 a0 01 00 00 39 c2 7c 32 e8 ee fe ff ff 85 c0
> [  +0.000029] RSP: 0018:ffffbab1e9bcf6a8 EFLAGS: 00010206
> [  +0.000012] RAX: 000000000000000c RBX: ffff94512305b028 RCX: 0000000000000000
> [  +0.000012] RDX: 0000000000000000 RSI: ffff9451c3f2e588 RDI: ffff9451c3f2e588
> [  +0.000012] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> [  +0.000013] R10: ffff9451c3f2e580 R11: 000000000000001f R12: ffff945121fa9000
> [  +0.000012] R13: ffffbab1e9bcf760 R14: 0000000000000013 R15: ffffffff9e65dd40
> [  +0.000012] FS:  00007faee5fbe740(0000) GS:ffff94546fd80000(0000) knlGS:0000000000000000
> [  +0.000014] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  +0.000011] CR2: 0000000000000020 CR3: 0000000106c2e005 CR4: 00000000001706f0
> [  +0.000012] Call Trace:
> [  +0.000009]  <TASK>
> [  +0.000007]  ? __die+0x23/0x70
> [  +0.000012]  ? page_fault_oops+0x173/0x510
> [  +0.000011]  ? ice_get_q_coalesce+0x2e/0xa0 [ice]
> [  +0.000071]  ? search_module_extables+0x19/0x60
> [  +0.000013]  ? search_bpf_extables+0x5f/0x80
> [  +0.000012]  ? exc_page_fault+0x7e/0x180
> [  +0.000013]  ? asm_exc_page_fault+0x26/0x30
> [  +0.000014]  ? ice_get_q_coalesce+0x2e/0xa0 [ice]
> [  +0.000070]  ice_get_coalesce+0x17/0x30 [ice]
> [  +0.000070]  coalesce_prepare_data+0x61/0x80
> [  +0.000012]  ethnl_default_doit+0xde/0x340
> [  +0.000012]  genl_family_rcv_msg_doit+0xf2/0x150
> [  +0.000013]  genl_rcv_msg+0x1b3/0x2c0
> [  +0.000009]  ? __pfx_ethnl_default_doit+0x10/0x10
> [  +0.000011]  ? __pfx_genl_rcv_msg+0x10/0x10
> [  +0.000010]  netlink_rcv_skb+0x5b/0x110
> [  +0.000013]  genl_rcv+0x28/0x40
> [  +0.000007]  netlink_unicast+0x19c/0x290
> [  +0.000012]  netlink_sendmsg+0x222/0x490
> [  +0.000011]  __sys_sendto+0x1df/0x1f0
> [  +0.000013]  __x64_sys_sendto+0x24/0x30
> [  +0.000340]  do_syscall_64+0x82/0x160
> [  +0.000309]  ? __mod_memcg_lruvec_state+0xa6/0x150
> [  +0.000309]  ? __lruvec_stat_mod_folio+0x68/0xa0
> [  +0.000311]  ? folio_add_file_rmap_ptes+0x86/0xb0
> [  +0.000309]  ? next_uptodate_folio+0x89/0x290
> [  +0.000309]  ? filemap_map_pages+0x521/0x5f0
> [  +0.000302]  ? do_fault+0x26e/0x470
> [  +0.000293]  ? __handle_mm_fault+0x7dc/0x1060
> [  +0.000295]  ? __count_memcg_events+0x58/0xf0
> [  +0.000289]  ? count_memcg_events.constprop.0+0x1a/0x30
> [  +0.000292]  ? handle_mm_fault+0xae/0x320
> [  +0.000284]  ? do_user_addr_fault+0x33a/0x6a0
> [  +0.000280]  ? exc_page_fault+0x7e/0x180
> [  +0.000289]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  +0.000271] RIP: 0033:0x7faee60d8e27
>
> Fixes: 67fe64d78c43 ("ice: Implement getting and setting ethtool coalesce")
> Suggested-by: Jakub Kicinski <[email protected]>
> Signed-off-by: Dawid Osuchowski <[email protected]>
> Reviewed-by: Igor Bagnucki <[email protected]>
> ---
>  drivers/net/ethernet/intel/ice/ice_main.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index eaa73cc200f4..16b4920741ff 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -608,6 +608,8 @@ ice_prepare_for_reset(struct ice_pf *pf, enum ice_reset_req reset_type)
>                         memset(&vsi->mqprio_qopt, 0, sizeof(vsi->mqprio_qopt));
>                 }
>         }
> +       if (vsi->netdev)
> +               netif_device_detach(vsi->netdev);
>  skip:
>
>         /* clear SW filtering DB */
> @@ -7568,11 +7570,13 @@ static void ice_update_pf_netdev_link(struct ice_pf *pf)
>
>                 ice_get_link_status(pf->vsi[i]->port_info, &link_up);
>                 if (link_up) {
> +                       netif_device_attach(pf->vsi[i]->netdev);
>                         netif_carrier_on(pf->vsi[i]->netdev);
>                         netif_tx_wake_all_queues(pf->vsi[i]->netdev);
>                 } else {
>                         netif_carrier_off(pf->vsi[i]->netdev);
>                         netif_tx_stop_all_queues(pf->vsi[i]->netdev);
> +                       netif_device_detach(pf->vsi[i]->netdev);
[Kalesh] Is there any reason to re-attach the netdev only when the link
is up? IMO, the device should be re-attached irrespective of physical
link status, since ice_prepare_for_reset() detaches it unconditionally.
As posted, a reset that completes while the link is down appears to
leave the netdev detached.

I may be missing something here.
>                 }
>         }
>  }
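
To make the concern concrete, here is a toy userspace model of the two
behaviours. It is only a sketch and not driver code: every toy_* name
is made up for illustration, and the "present" flag merely stands in
for the state that netif_device_detach()/netif_device_attach() toggle.

```c
/* Toy userspace model, NOT driver code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_netdev {
	bool present;	/* stands in for the attached/detached state */
	bool carrier;	/* stands in for carrier on/off */
};

static void toy_detach(struct toy_netdev *dev)
{
	assert(dev != NULL);
	dev->present = false;
}

static void toy_attach(struct toy_netdev *dev)
{
	assert(dev != NULL);
	dev->present = true;
}

/* Mirrors the hunk as posted: attach on link up, detach on link down. */
static void toy_update_link_as_posted(struct toy_netdev *dev, bool link_up)
{
	if (link_up) {
		toy_attach(dev);
		dev->carrier = true;
	} else {
		dev->carrier = false;
		toy_detach(dev);
	}
}

/* Alternative being asked about: attach regardless of link state. */
static void toy_update_link_always_attach(struct toy_netdev *dev, bool link_up)
{
	toy_attach(dev);
	dev->carrier = link_up;
}
```

In this model, a device that was detached for reset and then rebuilt
with the link down stays detached under the first variant, while the
second variant re-attaches it and only reports carrier off. Whether
that matches the intended behaviour of the real driver is exactly the
question above.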

> --
> 2.44.0
>
>


--
Regards,
Kalesh A P
