On Mon, Nov 18, 2024 at 08:51:49AM +0000, Mingjin Ye wrote: > This patch adds the ‘link_period’ devargs to adjust the link update > period (microseconds), when the value is less than or equal to 0 > it will be disabled, by default it is not enabled. >
Finally, getting to review this old patch. Initial comment is that the commit log message does not explain the reason for the patch. It explains what the patch does - adding a new devarg - but not the "why" of what problem this new devarg is designed to fix and how exactly adjusting the link update period fixes that. In next version, please include the problem description and a bit more detail on how the patch fixes it. > This patch is based on DPDK 24.07.0 > [b3485f4293997d35b6daecc3437bb0c183a51fb3], for customer cherry-pick. Ok to include this info, but please do so below the cut line, i.e. put it with the v2, v3 delta information, since it won't be part of the commit log. > > Signed-off-by: Mingjin Ye <[email protected]> > --- > v2: Fix using the wrong variable. > --- > v3: Callback only when link state changes. > --- > doc/guides/nics/ice.rst | 7 +++ > drivers/net/ice/ice_ethdev.c | 113 ++++++++++++++++++++++++++++++++++- > drivers/net/ice/ice_ethdev.h | 4 ++ > 3 files changed, 122 insertions(+), 2 deletions(-) > > diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst > index ae975d19ad..1f47c81baf 100644 > --- a/doc/guides/nics/ice.rst > +++ b/doc/guides/nics/ice.rst > @@ -273,6 +273,13 @@ Runtime Configuration > * ``segment``: Check number of mbuf segments does not exceed HW limits. > * ``offload``: Check for use of an unsupported offload flag. > > +- ``Cycle link update`` (default ``not enabled``) > + > + Cyclic link update is enabled when the PMD status changes to STARTED. > Setting > + the ``devargs`` parameter ``link_period`` adjusts the link update period in > + microseconds, and is disabled when the value set is less than or equal to > 0. > + For example ``-a 81:00.0,link_period=5000`` or ``-a > 81:00.0,link_period=0``. > + The section heading should match the name of the argument, otherwise it becomes hard for the user to find documentation on the argument by looking at the doc index. I'm also unclear on what is being referred to here, the argument is called "link_period", but the first line of the description talks about the PMD transitioning to started, when I expected some description of the link coming up. I think a short description of what is meant by the "link update period" would be helpful too. > Driver compilation and testing > ------------------------------ > > diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c > index 304f959b7e..7e332e74bc 100644 > --- a/drivers/net/ice/ice_ethdev.c > +++ b/drivers/net/ice/ice_ethdev.c > @@ -13,6 +13,7 @@ > > #include <rte_tailq.h> > #include <rte_os_shim.h> > +#include <rte_alarm.h> > > #include "eal_firmware.h" > > @@ -36,6 +37,7 @@ > #define ICE_ONE_PPS_OUT_ARG "pps_out" > #define ICE_RX_LOW_LATENCY_ARG "rx_low_latency" > #define ICE_MBUF_CHECK_ARG "mbuf_check" > +#define ICE_LINK_UPDATE_ARG "link_period" > > #define ICE_CYCLECOUNTER_MASK 0xffffffffffffffffULL > > @@ -52,6 +54,7 @@ static const char * const ice_valid_args[] = { > ICE_RX_LOW_LATENCY_ARG, > ICE_DEFAULT_MAC_DISABLE, > ICE_MBUF_CHECK_ARG, > + ICE_LINK_UPDATE_ARG, > NULL > }; > > @@ -1385,6 +1388,70 @@ ice_handle_aq_msg(struct rte_eth_dev *dev) > } > #endif > > +static void > +ice_link_cycle_update(void *cb_arg) > +{ > + int ret; > + struct timespec sys_time; > + struct rte_eth_dev *dev = cb_arg; > + struct ice_adapter *adapter = > + ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); > + > + ret = ice_link_update(dev, 0); > + if (!ret && adapter->link_status != dev->data->dev_link.link_status) { > + clock_gettime(CLOCK_REALTIME, &sys_time); > + PMD_DRV_LOG(INFO, "Current SYS Time: %.24s %.9ld ns", > + ctime(&sys_time.tv_sec), sys_time.tv_nsec); > + > + rte_eth_dev_callback_process > + (dev, RTE_ETH_EVENT_INTR_LSC, NULL); > + > + adapter->link_status = dev->data->dev_link.link_status; > + } > + > + /* re-alarm link update */ > + if (rte_eal_alarm_set(adapter->devargs.link_update_period, > + &ice_link_cycle_update, cb_arg)) > + PMD_DRV_LOG(ERR, "Failed to enable cycle link update"); If I understand this correctly, when this feature is enabled, rather than handling explicit link status update messages, we periodically use the interrupt thread to poll the link status and update it that way. > +} > + > +static void > +ice_link_cycle_update_init(struct rte_eth_dev *dev) > +{ > + struct ice_adapter *adapter = > + ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); > + > + if (adapter->link_cycle_update) > + return; > + > + adapter->link_cycle_update = true; What is the need for this link_cycle_update variable? Its name seems to imply that it indicates that we should use link_cycle_update but then it's not disabled when the feature is disabled. Is it just indicating whether or not the cycle update is already configured? Is that necessary to track, can we just use the update_period to determine if we should cancel the callback on uninit? Also, should this function not return an int rather than void, so we can return error if the configuration fails? If the user requests the feature but we can't enable it, we should fail the overall init, IMHO. > + > + if (adapter->devargs.link_update_period <= 0) { > + PMD_DRV_LOG(INFO, "Device cycle link update is disabled"); > + } else { > + PMD_DRV_LOG(INFO, "Enabling cycle link update, period is %dμs", > + adapter->devargs.link_update_period); > + if (rte_eal_alarm_set(adapter->devargs.link_update_period, > + &ice_link_cycle_update, (void *)dev)) { > + PMD_DRV_LOG(ERR, "Failed to enable cycle link update"); > + adapter->link_cycle_update = false; > + } > + } > +} > + > +static void > +ice_link_cycle_update_uninit(struct rte_eth_dev *dev) > +{ > + struct ice_adapter *adapter = > + ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); > + > + adapter->link_cycle_update = false; > + > + if (rte_eal_alarm_cancel(&ice_link_cycle_update, (void *)dev)) > + PMD_DRV_LOG(ERR, "Failed to cancel cycle link update"); > + > + PMD_DRV_LOG(INFO, "Cancel cycle link update"); > +} > /** > * Interrupt handler triggered by NIC for handling > * specific interrupt. > @@ -1401,6 +1468,8 @@ static void > ice_interrupt_handler(void *param) > { > struct rte_eth_dev *dev = (struct rte_eth_dev *)param; > + struct ice_adapter *adapter = > + ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); > struct ice_hw *hw = ICE_DEV_PRIVATE_TO_HW(dev->data->dev_private); > uint32_t oicr; > uint32_t reg; > @@ -1411,6 +1480,7 @@ ice_interrupt_handler(void *param) > #ifdef ICE_LSE_SPT > uint32_t int_fw_ctl; > #endif > + struct timespec sys_time; > > /* Disable interrupt */ > ice_pf_disable_irq0(hw); > @@ -1433,12 +1503,20 @@ ice_interrupt_handler(void *param) > ice_handle_aq_msg(dev); > } > #else > - if (oicr & PFINT_OICR_LINK_STAT_CHANGE_M) { > + if (oicr & PFINT_OICR_LINK_STAT_CHANGE_M && > + adapter->devargs.link_update_period <= 0) { I have a slight concern here. For the link coming UP, I don't think its an issue to not handle the interrupt and use the delayed processing. However, for a link coming DOWN, I think it can be problematic to use delayed processing and taht we should always handle the update immediately. > PMD_DRV_LOG(INFO, "OICR: link state change event"); > ret = ice_link_update(dev, 0); > - if (!ret) > + if (!ret && adapter->link_status != > dev->data->dev_link.link_status) { > + clock_gettime(CLOCK_REALTIME, &sys_time); > + PMD_DRV_LOG(INFO, "Current SYS Time: %.24s %.9ld ns", > + ctime(&sys_time.tv_sec), > sys_time.tv_nsec); > + > rte_eth_dev_callback_process > (dev, RTE_ETH_EVENT_INTR_LSC, NULL); > + adapter->link_status = dev->data->dev_link.link_status; > + } > + > } > #endif > > @@ -2153,6 +2231,27 @@ ice_parse_mbuf_check(__rte_unused const char *key, > const char *value, void *args > return ret; > } > > +static int > +ice_parse_clean_subtask_period(__rte_unused const char *key, > + const char *value, void *args) > +{ > + int *num = (int *)args; > + int tmp; > + > + errno = 0; > + tmp = strtoul(value, NULL, 10); > + if (errno == EINVAL || errno == ERANGE || > + tmp == 0) { Nit: the indent is wrong here, since it lines up with the body of the block. However, rather than fixing the indent, you can just put the whole condition on one line. I wonder though about disallowing "0" as a value. I think it's good to allow this as an explicit disable if that is what the user wants. To detect bad input, I'd recommend allowing zero, but replacing the NULL with an actual pointer value that you can use to verify that the end of parsing results in a '\0', i.e. whole string was parsed. > + PMD_DRV_LOG(WARNING, "%s: \"%s\" is not greater than or equal > to zero", > + key, value); > + return -1; > + } > + > + *num = tmp; > + > + return 0; > +} > + > static int ice_parse_devargs(struct rte_eth_dev *dev) > { > struct ice_adapter *ad = > @@ -2214,6 +2313,12 @@ static int ice_parse_devargs(struct rte_eth_dev *dev) > if (ret) > goto bail; > > + ret = rte_kvargs_process(kvlist, ICE_LINK_UPDATE_ARG, > + &ice_parse_clean_subtask_period, > + &ad->devargs.link_update_period); > + if (ret) > + goto bail; > + > ret = rte_kvargs_process(kvlist, ICE_RX_LOW_LATENCY_ARG, > &parse_bool, &ad->devargs.rx_low_latency); > > @@ -2525,6 +2630,8 @@ ice_dev_init(struct rte_eth_dev *dev) > /* reset all stats of the device, including pf and main vsi */ > ice_stats_reset(dev); > > + ice_link_cycle_update_init(dev); > + > return 0; > > err_flow_init: > @@ -2676,6 +2783,7 @@ ice_dev_close(struct rte_eth_dev *dev) > ice_interrupt_handler, dev); > > ret = ice_dev_stop(dev); > + ice_link_cycle_update_uninit(dev); > > if (!ad->is_safe_mode) > ice_flow_uninit(ad); > @@ -3864,6 +3972,7 @@ ice_dev_start(struct rte_eth_dev *dev) > > /* Call get_link_info aq command to enable/disable LSE */ > ice_link_update(dev, 1); > + ad->link_status = dev->data->dev_link.link_status; > > pf->adapter_stopped = false; > > diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h > index 3ea9f37dc8..d0230b839d 100644 > --- a/drivers/net/ice/ice_ethdev.h > +++ b/drivers/net/ice/ice_ethdev.h > @@ -568,6 +568,7 @@ struct ice_devargs { > /* Name of the field. */ > char xtr_field_name[RTE_MBUF_DYN_NAMESIZE]; > uint64_t mbuf_check; > + uint32_t link_update_period; > }; > > /** > @@ -618,6 +619,9 @@ struct ice_adapter { > uint64_t time_hw; > struct ice_fdir_prof_info fdir_prof_info[ICE_MAX_PTGS]; > struct ice_rss_prof_info rss_prof_info[ICE_MAX_PTGS]; > + /* cache link status */ > + bool link_status; > + bool link_cycle_update; > /* True if DCF state of the associated PF is on */ > RTE_ATOMIC(bool) dcf_state_on; > /* Set bit if the engine is disabled */ > -- > 2.25.1 >

