Hi Shahaf,

On Tue, Jan 23, 2018 at 07:01:06PM +0200, Shahaf Shuler wrote:
> Following commit c7bf62255edf ("net/mlx5: fix handling link status event")
> the link state must be up in order for the burst function to be set on
> the device ops.
> 
> As the link may take time to move between down and up state it is
> possible the rte_eth_dev_start call will return with wrong burst
> function (either null or the empty burst function).
> 
> Fixing it by forcing the link to be up before returning from device
> start. In case the link is still not up after 5 seconds fail the function.
> 
> Fixes: c7bf62255edf ("net/mlx5: fix handling link status event")
> Cc: ys...@mellanox.com
> 
> Signed-off-by: Shahaf Shuler <shah...@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5.h         |  1 +
>  drivers/net/mlx5/mlx5_defs.h    |  3 +++
>  drivers/net/mlx5/mlx5_ethdev.c  | 27 +++++++++++++++++++++++++++
>  drivers/net/mlx5/mlx5_trigger.c |  8 +++++++-
>  4 files changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index a7ec607c3..30b737f76 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -246,6 +246,7 @@ int mlx5_dev_configure(struct rte_eth_dev *);
>  void mlx5_dev_infos_get(struct rte_eth_dev *, struct rte_eth_dev_info *);
>  const uint32_t *mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev);
>  int priv_link_update(struct priv *, int);
> +int priv_force_link_status_change(struct priv *, int);
>  int mlx5_link_update(struct rte_eth_dev *, int);
>  int mlx5_dev_set_mtu(struct rte_eth_dev *, uint16_t);
>  int mlx5_dev_get_flow_ctrl(struct rte_eth_dev *, struct rte_eth_fc_conf *);
> diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
> index a71db281d..57f295c58 100644
> --- a/drivers/net/mlx5/mlx5_defs.h
> +++ b/drivers/net/mlx5/mlx5_defs.h
> @@ -110,4 +110,7 @@
>  /* Supported RSS */
>  #define MLX5_RSS_HF_MASK (~(ETH_RSS_IP | ETH_RSS_UDP | ETH_RSS_TCP))
>  
> +/* Maximum number of attempts to query link status before giving up. */
> +#define MLX5_MAX_LINK_QUERY_ATTEMPTS 5
> +
>  #endif /* RTE_PMD_MLX5_DEFS_H_ */
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
> index 6624888c9..523865d15 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -966,6 +966,33 @@ priv_link_update(struct priv *priv, int wait_to_complete)
>  }
>  
>  /**
> + * Querying the link status till it changes to the desired state.
> + * Number of query attempts is bounded by MLX5_MAX_LINK_QUERY_ATTEMPTS.
> + *
> + * @param priv
> + *   Pointer to private structure.
> + * @param status
> + *   Link desired status.
> + *
> + * @return
> + *   0 on success, -1 on error.
> + */
> +int
> +priv_force_link_status_change(struct priv *priv, int status)
> +{
> +     int try = 0;
> +
> +     while (try < MLX5_MAX_LINK_QUERY_ATTEMPTS) {
> +             priv_link_update(priv, 0);
> +             if (priv->dev->data->dev_link.link_status == status)
> +                     return 0;
> +             try++;
> +             sleep(1);
> +     }
> +     return -1;
> +}
> +
> +/**
>   * DPDK callback to retrieve physical link information.
>   *
>   * @param dev
> diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
> index 827db2e7e..c5429e182 100644
> --- a/drivers/net/mlx5/mlx5_trigger.c
> +++ b/drivers/net/mlx5/mlx5_trigger.c
> @@ -166,7 +166,13 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>       priv_xstats_init(priv);
>       /* Update link status and Tx/Rx callbacks for the first time. */
>       memset(&dev->data->dev_link, 0, sizeof(struct rte_eth_link));
> -     priv_link_update(priv, 1);
> +     INFO("Forcing port %u link to be up", dev->data->port_id);
> +     err = priv_force_link_status_change(priv, ETH_LINK_UP);
> +     if (err) {
> +             DEBUG("Failed to set port %u link to be up",
> +                   dev->data->port_id);
> +             goto error;
> +     }
>       priv_dev_interrupt_handler_install(priv, dev);
>       priv_unlock(priv);
>       return 0;
> -- 
> 2.12.0

According to the mlx5_dev_start() function documentation:
 * @return
 *   0 on success, negative errno value on failure.

This code returns -1 on error, which as an errno value corresponds to:
 EPERM           1      /* Operation not permitted */

which is the wrong value for a link that simply did not come up in time.

Why not return a negative errno from your priv function, such as -EBUSY
or -EAGAIN, which would be more accurate?
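
A minimal sketch of what that could look like, written outside the driver
so it stands alone. The helper takes a hypothetical query callback (standing
in for priv_link_update() plus the dev_link.link_status read) and returns
-EAGAIN once the attempts are exhausted. The names force_link_status_change,
link_query_fn and stub_link_query are illustrative only, and the one-second
sleep between attempts is left out to keep the example short.

```c
#include <errno.h>
#include <stddef.h>

/* Same bound as MLX5_MAX_LINK_QUERY_ATTEMPTS in the patch. */
#define MAX_LINK_QUERY_ATTEMPTS 5

/*
 * Hypothetical callback: refreshes the link state and returns the
 * current link status (assumption, not the driver's real API).
 */
typedef int (*link_query_fn)(void *ctx);

/*
 * Poll the link until it reaches the desired status.
 * Returns 0 on success, -EAGAIN if the link did not reach the
 * desired status within the allowed number of attempts.
 */
static int
force_link_status_change(link_query_fn query, void *ctx, int status)
{
	int try;

	for (try = 0; try < MAX_LINK_QUERY_ATTEMPTS; try++) {
		if (query(ctx) == status)
			return 0;
		/* The patch sleeps one second here between attempts. */
	}
	return -EAGAIN;
}

/*
 * Stub used for illustration: reports link down (0) for the first
 * *ctx queries, then link up (1).
 */
static int
stub_link_query(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return 0;
	}
	return 1;
}
```

The caller could then propagate that return value, so that the documented
"negative errno value on failure" contract of mlx5_dev_start() holds.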

Regards,

-- 
Nélio Laranjeiro
6WIND
