Public bug reported:

We see a ~20% throughput degradation on ConnectX-5/4 in the following case:
TCP, 1 QP, 1 stream, unidirectional, single port.
Message sizes of 1M and up show this degradation.

After changing the default TX moderation mode to off we see up to 40%
packet rate and up to 23% bandwidth degradations.


There is an upstream commit that fixes this issue. I will backport it and
send it to kernel-t...@lists.ubuntu.com


commit 48bfc39791b8b4a25f165e711f18b9c1617cefbc
Author: Tal Gilboa <ta...@mellanox.com>
Date:   Fri Mar 30 15:50:08 2018 -0700

    net/mlx5e: Set EQE based as default TX interrupt moderation mode
                                                                    
    The default TX moderation mode was mistakenly set to CQE based. The
    intention was to add a control ability in order to improve some specific
    use-cases. In general, we prefer to use EQE based moderation as it gives
    much better numbers for the common cases.

    CQE based causes a degradation in the common case since it resets the
    moderation timer on CQE generation. This causes an issue when TSO is
    well utilized (large TSO sessions). The timer is set to 16us so traffic
    of ~64KB TSO sessions per second would mean timer reset (CQE per TSO
    session -> long time between CQEs). In this case we quickly reach the
    tcp_limit_output_bytes (256KB by default) and cause a halt in TX traffic.

    By setting EQE based moderation we make sure timer would expire after
    16us regardless of the packet rate.
    This fixes up to 40% packet rate and up to 23% bandwidth degradations.

    Fixes: 0088cbbc4b66 ("net/mlx5e: Enable CQE based moderation on TX CQ")
    Signed-off-by: Tal Gilboa <ta...@mellanox.com>
    Signed-off-by: Saeed Mahameed <sae...@mellanox.com>
    Signed-off-by: David S. Miller <da...@davemloft.net>

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index c71f4f10283b..0aab3afc6885 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4137,7 +4137,7 @@ void mlx5e_build_nic_params(struct mlx5_core_dev *mdev,
                            struct mlx5e_params *params,
                            u16 max_channels, u16 mtu)
 {
-       u8 cq_period_mode = 0;
+       u8 rx_cq_period_mode;

        params->sw_mtu = mtu;
        params->hard_mtu = MLX5E_ETH_HARD_MTU;
@@ -4173,12 +4173,12 @@ void mlx5e_build_nic_params(struct mlx5_core_dev *mdev,
        params->lro_timeout = mlx5e_choose_lro_timeout(mdev, MLX5E_DEFAULT_LRO_TIMEOUT);

        /* CQ moderation params */
-       cq_period_mode = MLX5_CAP_GEN(mdev, cq_period_start_from_cqe) ?
+       rx_cq_period_mode = MLX5_CAP_GEN(mdev, cq_period_start_from_cqe) ?
                        MLX5_CQ_PERIOD_MODE_START_FROM_CQE :
                        MLX5_CQ_PERIOD_MODE_START_FROM_EQE;
        params->rx_dim_enabled = MLX5_CAP_GEN(mdev, cq_moderation);
-       mlx5e_set_rx_cq_mode_params(params, cq_period_mode);
-       mlx5e_set_tx_cq_mode_params(params, cq_period_mode);
+       mlx5e_set_rx_cq_mode_params(params, rx_cq_period_mode);
+       mlx5e_set_tx_cq_mode_params(params, MLX5_CQ_PERIOD_MODE_START_FROM_EQE);

        /* TX inline */
        params->tx_min_inline_mode = mlx5e_params_calculate_tx_min_inline(mdev);
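
To make the commit message's reasoning concrete, here is a minimal user-space sketch of the two timer behaviours. It is a simplified model written for illustration only, not the ConnectX hardware logic or mlx5 driver code. The names `period_mode`, `first_irq_us` and `tso_sends_until_limit` are hypothetical, the `max_count` completion threshold is an assumption of the model, and the previous completion event is assumed to have fired at t = 0.

```c
#include <stddef.h>

/* Hypothetical names for this sketch; not the mlx5 driver's identifiers. */
enum period_mode {
    MODE_START_FROM_EQE, /* timer runs from the previous event, never restarted */
    MODE_START_FROM_CQE  /* timer restarts on every completion (CQE) */
};

/*
 * Return the time (in us) at which the next completion event would fire,
 * given ascending CQE arrival times. Assumes the previous event fired at
 * t = 0. Returns -1 when there is nothing to report.
 */
long first_irq_us(enum period_mode mode, const long *cqe_us, size_t n,
                  long period_us, size_t max_count)
{
    if (n == 0 || max_count == 0)
        return -1;

    /* An event cannot fire before the first CQE exists. */
    long deadline = period_us > cqe_us[0] ? period_us : cqe_us[0];

    for (size_t i = 0; i < n; i++) {
        if (cqe_us[i] > deadline)       /* timer expired before this CQE */
            return deadline;
        if (mode == MODE_START_FROM_CQE)
            deadline = cqe_us[i] + period_us;   /* each CQE restarts the timer */
        if (i + 1 == max_count)         /* assumed count threshold reached */
            return cqe_us[i];
    }
    return deadline;                    /* timer expires after the last CQE */
}

/* Whole TSO sends that fit under a per-socket byte limit (0 if tso_bytes is 0). */
unsigned long tso_sends_until_limit(unsigned long limit_bytes,
                                    unsigned long tso_bytes)
{
    return tso_bytes ? limit_bytes / tso_bytes : 0;
}
```

In this model, with CQEs arriving at 10, 20, ..., 80 us, a 16 us period and a count threshold of 32, `first_irq_us` returns 16 for the EQE-style timer and 96 for the CQE-style timer, because every completion pushes the deadline out again. `tso_sends_until_limit(256 * 1024, 64 * 1024)` returns 4: only four 64KB TSO sends fit under the default tcp_limit_output_bytes, so delayed completion events can stall the sender quickly.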

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1763325

Title:
  [bionic] ConnectX5 Large message size throughput degradation in TCP

Status in linux package in Ubuntu:
  New

