Public bug reported:
Hello,
During our network performance runs on RoCE Express 2(.1), we noticed a
performance regression with streaming workloads that can be mitigated with an
ethtool setting.
The commit that switched the default from "Striding RQ" to "Legacy RQ" for
ConnectX-4 devices (RoCE Express 2(.1)) is shown below:
commit 5ffd81943d7a57423f204cd5844bf430b5634472 (refs/bisect/bad)
Author: Tariq Toukan <[email protected]>
Date: Tue Feb 20 15:17:54 2018 +0200
net/mlx5e: RX, Always prefer Linear SKB configuration
Prefer the linear SKB configuration of Legacy RQ over the
non-linear one of Striding RQ.
This implies that ConnectX-4 LX now uses legacy RQ by default,
as it does not support the linear configuration of Striding RQ.
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 2c634e50d051..333d4ed52b94 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4405,9 +4405,16 @@ void mlx5e_build_nic_params(struct mlx5_core_dev *mdev,
MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_CQE_COMPRESS,
params->rx_cqe_compress_def);
/* RQ */
- if (mlx5e_striding_rq_possible(mdev, params))
- MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_STRIDING_RQ,
- !slow_pci_heuristic(mdev));
+ /* Prefer Striding RQ, unless any of the following holds:
+ * - Striding RQ configuration is not possible/supported.
+ * - Slow PCI heuristic.
+ * - Legacy RQ would use linear SKB while Striding RQ would use non-linear.
+ */
+ if (!slow_pci_heuristic(mdev) &&
+ mlx5e_striding_rq_possible(mdev, params) &&
+ (mlx5e_rx_mpwqe_is_linear_skb(mdev, params) ||
+ !mlx5e_rx_is_linear_skb(mdev, params)))
+ MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_STRIDING_RQ, true);
mlx5e_set_rq_type(mdev, params);
mlx5e_init_rq_type_params(mdev, params);
We modified the upstream kernel so that we could run measurements and compare
Legacy RQ against Striding RQ. An example follows.
Kernel used: 5.4.0-rc7
The measurements were run on a dedicated machine (z14) using uperf with
streaming profiles (MTU size 1500).
Example throughput drop:
(traffic via a shared card, i.e. client and server using VFs from the same
ConnectX-4)
------------------------------------------------------------
|                              | Legacy RQ   | Striding RQ |
------------------------------------------------------------
| str-writex30k (1 connection) | 24.62 Gb/s  | 33.47 Gb/s  |
------------------------------------------------------------
Additionally, two tests with a transactional workload, using the proposed
ethtool switch:
------------------------------------------------
|                  | Legacy RQ   | Striding RQ |
------------------------------------------------
| rr1c-200x30k---1 | 4.12 Gb/s   | 5.66 Gb/s   |
------------------------------------------------
| rr1c-200x30k--10 | 15.10 Gb/s  | 20.77 Gb/s  |
------------------------------------------------
As agreed in our discussion with Mellanox, a simple ethtool command can switch
between the two queuing methods, which avoids kernel code changes:
ethtool --set-priv-flags DEVNAME rx_striding_rq on
(To list the available settings you may use: ethtool --show-priv-flags DEVNAME)
** Affects: linux (Ubuntu)
Importance: Undecided
Assignee: Skipper Bug Screeners (skipper-screen-team)
Status: New
** Tags: architecture-s39064 bugnameltc-184497 severity-medium
targetmilestone-inin2004
** Tags added: architecture-s39064 bugnameltc-184497 severity-medium
targetmilestone-inin2004
** Changed in: ubuntu
Assignee: (unassigned) => Skipper Bug Screeners (skipper-screen-team)
** Package changed: ubuntu => linux (Ubuntu)
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1868113
Title:
[Ubuntu 20.04] Striding RQ as default for ConnectX-4 in distros