I'm not sure I understand this ticket correctly. Which 5.4-rc7 was used?
Was it taken directly from upstream?
And why is an RC kernel still in use, given that 5.4 was released quite some time ago?
Well, if only the commit "net/mlx5e: RX, Always prefer Linear SKB
configuration" is needed, then we are done, since it has been in the
Ubuntu focal kernel for quite some time:
focal-clean$ git log --oneline --grep "net/mlx5e: RX, Always prefer Linear SKB configuration"
5ffd81943d7a net/mlx5e: RX, Always prefer Linear SKB configuration
focal-clean$ git tag --contains 5ffd81943d7a
Ubuntu-5.4-5.4.0-10.13
Ubuntu-5.4-5.4.0-11.14
Ubuntu-5.4-5.4.0-12.15
Ubuntu-5.4-5.4.0-13.16
Ubuntu-5.4-5.4.0-14.17
Ubuntu-5.4.0-15.18
Ubuntu-5.4.0-16.19
Ubuntu-5.4.0-17.20
Ubuntu-5.4.0-17.21
Ubuntu-5.4.0-18.22
Ubuntu-5.4.0-8.11
Ubuntu-5.4.0-9.12
It looks to me like you can simply move to the current Ubuntu focal kernel
(ideally the one from proposed) and proceed with testing from there.
** Changed in: linux (Ubuntu)
Status: New => Incomplete
** Also affects: ubuntu-z-systems
Importance: Undecided
Status: New
** Changed in: ubuntu-z-systems
Status: New => Incomplete
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1868113
Title:
[Ubuntu 20.04] Striding RQ as Default for ConnectX-4
Status in Ubuntu on IBM z Systems:
Incomplete
Status in linux package in Ubuntu:
Incomplete
Bug description:
Hello,
Within our Network Performance runs in the RoCE Express 2(.1) area, we
noticed a performance regression with streaming workloads which could be
mitigated by using an ethtool setting.
The commit that switched the default value from "Striding RQ" to "Legacy RQ"
for ConnectX-4 devices (RoCE Express 2(.1)) is shown here:
commit 5ffd81943d7a57423f204cd5844bf430b5634472 (refs/bisect/bad)
Author: Tariq Toukan <[email protected]>
Date: Tue Feb 20 15:17:54 2018 +0200
net/mlx5e: RX, Always prefer Linear SKB configuration
Prefer the linear SKB configuration of Legacy RQ over the
non-linear one of Striding RQ.
This implies that ConnectX-4 LX now uses legacy RQ by default,
as it does not support the linear configuration of Striding RQ.
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 2c634e50d051..333d4ed52b94 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4405,9 +4405,16 @@ void mlx5e_build_nic_params(struct mlx5_core_dev *mdev,
MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_CQE_COMPRESS,
params->rx_cqe_compress_def);
/* RQ */
- if (mlx5e_striding_rq_possible(mdev, params))
- MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_STRIDING_RQ,
- !slow_pci_heuristic(mdev));
+ /* Prefer Striding RQ, unless any of the following holds:
+ * - Striding RQ configuration is not possible/supported.
+ * - Slow PCI heuristic.
+ * - Legacy RQ would use linear SKB while Striding RQ would use non-linear.
+ */
+ if (!slow_pci_heuristic(mdev) &&
+ mlx5e_striding_rq_possible(mdev, params) &&
+ (mlx5e_rx_mpwqe_is_linear_skb(mdev, params) ||
+ !mlx5e_rx_is_linear_skb(mdev, params)))
+ MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_STRIDING_RQ, true);
mlx5e_set_rq_type(mdev, params);
mlx5e_init_rq_type_params(mdev, params);
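The condition introduced by the diff above can be read as a small truth table. The following is a minimal sketch in shell that models the four driver predicates as 0/1 inputs. The function name default_rq_type and its argument order are hypothetical and exist only for illustration; they are not part of the mlx5 driver.

```shell
#!/bin/sh
# Hypothetical model of the default RQ selection after commit 5ffd81943d7a.
# Each argument is a 0/1 stand-in for the result of a driver predicate:
#   $1  slow_pci_heuristic()
#   $2  mlx5e_striding_rq_possible()
#   $3  mlx5e_rx_mpwqe_is_linear_skb()   (Striding RQ can use linear SKB)
#   $4  mlx5e_rx_is_linear_skb()         (Legacy RQ can use linear SKB)
default_rq_type() {
    slow_pci=$1; striding_possible=$2; mpwqe_linear=$3; legacy_linear=$4
    # Striding RQ is chosen only if PCI is not slow, striding is possible,
    # and Legacy RQ would not be the only one offering a linear SKB.
    if [ "$slow_pci" -eq 0 ] && [ "$striding_possible" -eq 1 ] &&
       { [ "$mpwqe_linear" -eq 1 ] || [ "$legacy_linear" -eq 0 ]; }; then
        echo striding
    else
        echo legacy
    fi
}
```

Under this model, `default_rq_type 0 1 0 1` (Striding RQ possible, but linear SKB only available with Legacy RQ) selects legacy, which corresponds to the case the commit message describes for ConnectX-4 LX.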
We modified the upstream kernel so that we could run measurements and
compare Legacy RQ against Striding RQ. An example follows:
Kernel used: 5.4.0-rc7
The measurements were run on a dedicated machine (z14) using uperf with streaming
profiles (MTU size 1500).
Example throughput drop:
(traffic via a shared card, i.e. client and server using VFs from the same
ConnectX-4)
--------------------------------------------------------------------------
| | Legacy RQ | Striding RQ |
--------------------------------------------------------------------------
|str-writex30k (1 connection) | 24.62Gb/s | 33.47Gb/s |
--------------------------------------------------------------------------
Additionally, two tests with a transactional workload, using the proposed ethtool
switch:
--------------------------------------------------------------------------
| | Legacy RQ | Striding RQ |
--------------------------------------------------------------------------
| rr1c-200x30k---1 | 4.12Gb/s | 5.66Gb/s |
--------------------------------------------------------------------------
| rr1c-200x30k--10 | 15.10Gb/s | 20.77Gb/s |
--------------------------------------------------------------------------
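For reference, the relative gain implied by each row of the tables above can be computed with a short helper. This is only a sketch; the function name gain_pct is hypothetical, and the inputs are the Gb/s values copied from the tables.

```shell
#!/bin/sh
# Relative throughput gain of Striding RQ over Legacy RQ, in percent.
#   $1  Legacy RQ throughput (Gb/s)
#   $2  Striding RQ throughput (Gb/s)
gain_pct() {
    awk -v legacy="$1" -v striding="$2" \
        'BEGIN { printf "%.1f\n", (striding - legacy) / legacy * 100 }'
}
```

With the measured values, `gain_pct 24.62 33.47` gives 35.9, `gain_pct 4.12 5.66` gives 37.4, and `gain_pct 15.10 20.77` gives 37.5, so Striding RQ is roughly 36-38% faster in all three runs.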
As concluded in the communication with Mellanox, a simple ethtool command can
switch between the queuing methods, which lets us avoid kernel code changes:
ethtool --set-priv-flags DEVNAME rx_striding_rq on
(To list the available settings, use: ethtool --show-priv-flags DEVNAME)
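The two ethtool commands above can be combined so that the flag is only changed when it is currently off. The sketch below assumes that ethtool --show-priv-flags prints one "name : on" or "name : off" line per flag. The helper names priv_flag_state and ensure_striding_rq are hypothetical.

```shell
#!/bin/sh
# Print the state ("on" or "off") of one private flag.
# Reads the output of `ethtool --show-priv-flags DEVNAME` on stdin.
#   $1  flag name, e.g. rx_striding_rq
priv_flag_state() {
    awk -F: -v flag="$1" '
        { name = $1; gsub(/[ \t]+/, "", name) }      # strip padding around the name
        name == flag {
            state = $2; gsub(/[ \t]+/, "", state)    # strip padding around the value
            print state
            exit
        }
    '
}

# Enable Striding RQ on a device unless it is already on (needs root).
#   $1  network device name
ensure_striding_rq() {
    dev=$1
    if [ "$(ethtool --show-priv-flags "$dev" | priv_flag_state rx_striding_rq)" != "on" ]; then
        ethtool --set-priv-flags "$dev" rx_striding_rq on
    fi
}
```

Calling `ensure_striding_rq DEVNAME` would then apply the setting from this bug report only where it is needed.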