On 6/24/2020 9:56 AM, Saeed Mahameed wrote:
On Tue, 2020-06-23 at 14:31 -0700, Jakub Kicinski wrote:
On Tue, 23 Jun 2020 12:52:29 -0700 Saeed Mahameed wrote:
From: Aya Levin <a...@mellanox.com>

The concept of Relaxed Ordering in the PCI Express environment allows
switches in the path between the Requester and Completer to reorder some
transactions just received before others that were previously enqueued.

In ETH driver, there is no question of write integrity since each memory
segment is written only once per cycle. In addition, the driver doesn't
access the memory shared with the hardware until the corresponding CQE
arrives indicating all PCI transactions are done.


Hi Jakub, sorry I missed your comments on this patch.

Assuming the device sets the RO bits appropriately, right? Otherwise
CQE write could theoretically surpass the data write, no?


Yes, the HW guarantees correctness of correlated queues and transactions.

With relaxed ordering set, traffic on the remote-numa is at the same
level as when on the local numa.

Same level of? Achievable bandwidth?


Yes, bandwidth. Given the explanation below, I see that the message
needs improvement.

Running TCP single stream over ConnectX-4 LX, ARM CPU on remote-numa
has 300% improvement in the bandwidth.
With relaxed ordering turned off: BW:10 [GB/s]
With relaxed ordering turned on:  BW:40 [GB/s]

The driver turns relaxed ordering off by default. It exposes 2 boolean
private-flags in ethtool: pci_ro_read and pci_ro_write for user control.

$ ethtool --show-priv-flags eth2
Private flags for eth2:
...
pci_ro_read        : off
pci_ro_write       : off

$ ethtool --set-priv-flags eth2 pci_ro_write on
$ ethtool --set-priv-flags eth2 pci_ro_read on
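
To confirm from a script that a flag actually changed, the --show-priv-flags output can be filtered for a single flag. This is a minimal sketch, assuming the "name : state" layout shown above; the helper name priv_flag_state is hypothetical and is not part of ethtool or the patch.

```sh
# Hypothetical helper: read `ethtool --show-priv-flags` output on stdin
# and print the state ("on" or "off") of the flag named in $1.
priv_flag_state() {
    awk -F: -v flag="$1" '
        {
            name = $1; value = $2
            # Strip the padding ethtool puts around the colon.
            gsub(/[ \t]+/, "", name)
            gsub(/[ \t]+/, "", value)
            if (name == flag) print value
        }'
}

# Usage (interface name eth2 taken from the example above):
#   ethtool --show-priv-flags eth2 | priv_flag_state pci_ro_write
```

The helper prints nothing if the flag is absent, for example on a driver that does not expose it.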

I think Michal will rightly complain that this does not belong in
private flags any more. As (/if?) ARM deployments take a foothold
in DC this will become a common setting for most NICs.

Initially we used pcie_relaxed_ordering_enabled() to programmatically
turn this on or off at boot, but this seems to introduce some degradation
on some Intel CPUs, since the Intel faulty CPUs list is not up to date.
Aya is discussing this with Bjorn.
Adding Bjorn Helgaas

So until we figure this out, we will keep this off by default.
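
For reference, pcie_relaxed_ordering_enabled() amounts to testing the Enable Relaxed Ordering bit (bit 4, mask 0x0010) in the device's PCIe Device Control register. The same bit can be inspected from userspace. This is a rough sketch, assuming pciutils' setpci is installed; the helper name devctl_ro_enabled and the placeholder device address are hypothetical.

```sh
# Hypothetical helper: succeed if the Enable Relaxed Ordering bit
# (bit 4, mask 0x0010) is set in a PCIe Device Control register value.
# $1 is the register value in hex without a 0x prefix, as setpci prints it.
devctl_ro_enabled() {
    [ $(( 0x$1 & 0x0010 )) -ne 0 ]
}

# Usage (run as root; replace <bdf> with the device address from lspci).
# Device Control sits at offset 8 in the PCI Express capability:
#   devctl=$(setpci -s <bdf> CAP_EXP+8.w)
#   devctl_ro_enabled "$devctl" && echo "relaxed ordering enabled"
```

This only reports the device's own setting; it says nothing about whether the root complex behaves well with relaxed ordering, which is the open question with the Intel list.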

For the private flags, we want to keep them for performance analysis, as
we do with all other mlx5 special performance features and flags.
