From: 'Simon Horman' <[email protected]>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
Subject: ice: use element-by-element comparison for DCB config changes

This commit replaces the use of memcmp() for DCB configurations with an
element-by-element comparison using ice_dcb_need_recfg(). It exports this
function and applies it to the SW DCB netlink setters to prevent issues
with uninitialized padding bytes.

> diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c 
> b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
> index a10c1c8d86977..13a52c1a4da3e 100644
> --- a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
> +++ b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
> @@ -108,11 +108,17 @@ static int ice_dcbnl_setets(struct net_device *netdev, 
> struct ieee_ets *ets)
>       if (!bwrec)
>               new_cfg->etsrec.tcbwtable[0] = 100;
>  
> +     if (!ice_dcb_need_recfg(pf, &pf->hw.port_info->qos_cfg.local_dcbx_cfg,
> +                             new_cfg)) {
> +             err = ICE_DCB_NO_HW_CHG;
> +             goto ets_out;
> +     }

Does ice_dcb_need_recfg() correctly detect bandwidth changes?

Looking at the implementation of ice_dcb_need_recfg(), it only logs a debug
message but doesn't set need_reconfig to true when only the ETS Bandwidth
Table or TSA Table changes:

drivers/net/ethernet/intel/ice/ice_dcb_lib.c:ice_dcb_need_recfg() {
    ...
    if (memcmp(&new_cfg->etscfg.tcbwtable,
               &old_cfg->etscfg.tcbwtable,
               sizeof(new_cfg->etscfg.tcbwtable)))
        dev_dbg(dev, "ETS TC BW Table changed.\n");
    ...
}

Since the function returns false, will this cause ice_dcbnl_setets() to
abort early and silently drop user-requested bandwidth modifications?

Also, since new_cfg points to the shared desired_dcbx_cfg state, does this
early return leave uncommitted modifications in the software state?

If a subsequent unrelated operation (like setting PFC) correctly returns
true from ice_dcb_need_recfg(), it appears it could unintentionally push
these abandoned ETS bandwidth changes to the hardware.

>       err = ice_pf_dcb_cfg(pf, new_cfg, true);
>       /* return of zero indicates new cfg applied */
> -     if (err == ICE_DCB_HW_CHG_RST)
> +     if (!err)
>               ice_dcbnl_devreset(netdev);

This isn't a bug introduced by this patch, but can ice_dcbnl_devreset()
cause a deadlock against hardware reset tasks?

The ice_dcbnl_devreset() function spins in a usleep_range() loop while
ice_is_reset_in_progress() is true.

Since DCBNL callbacks like ice_dcbnl_setets() run with the global
rtnl_lock held, and a concurrent hardware or PCIe reset worker likely
requires rtnl_lock to complete the reset, will the reset task block
forever while this thread waits for it?

> -     if (err == ICE_DCB_NO_HW_CHG)
> +     else if (err == ICE_DCB_NO_HW_CHG)
>               err = ICE_DCB_HW_CHG_RST;
>  
>  ets_out:

Does this change cause the function to return a positive error code to
userspace when applying an identical configuration?

By bypassing the translation from ICE_DCB_NO_HW_CHG (which is 1) to
ICE_DCB_HW_CHG_RST (which evaluates to 0) in the early goto added
above, err remains 1 at the ets_out label.

The DCBNL core evaluates any non-zero return value as an error:

net/dcb/dcbnl.c:dcbnl_ieee_set() {
    ...
    err = netdev->dcbnl_ops->ieee_setets(netdev, ets);
    if (err)
        goto err;
    ...
}

Will this cause tools like lldptool to fail when configuring an
existing setup?

[ ... ]

> @@ -845,6 +858,12 @@ static int ice_dcbnl_setapp(struct net_device *netdev, 
> struct dcb_app *app)
>       new_cfg->dscp_map[app->protocol] = app->priority;
>       new_cfg->app[new_cfg->numapps++] = new_app;

This isn't a bug introduced by this patch, but does this array append
lack bounds checking against ICE_DCBX_MAX_APPS?

The app array has a capacity of 64. The function restricts duplicate
DSCP protocols using the dscp_mapped bitmap, but since DSCP_MAX is also
64, a user could map all 64 unique values.

If numapps starts greater than 0 due to existing EtherType applications,
adding 64 valid DSCP mappings would cause numapps to exceed 64, leading
to an out-of-bounds write that overwrites the adjacent dscp_mapped
bitmap.

Reply via email to