http://bugs.dpdk.org/show_bug.cgi?id=1867

            Bug ID: 1867
           Summary: mlx5: enabling internal VF-to-VF communication causes
                    performance to drop significantly
           Product: DPDK
           Version: 25.11
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: other
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]
  Target Milestone: ---

SUMMARY
=======

Enabling internal VF-to-VF communication causes performance to drop
significantly.

Here is a visual representation of the topology:

+---------------------------------------------------------------------+
|                                 dut                                 |
|                                                                     |
| +---------------+         +-------------+         +---------------+ |
| |    testpmd    |         |   testpmd   |         |    testpmd    | |
| |   "outer 0"   |         |   "inner"   |         |   "outer 1"   | |
| +--+---------+--+         +--+-------+--+         +--+---------+--+ |
|    |         |               |       |               |         |    |
|    |         |       ,-------'       `-------.       |         |    |
|    |         |       |                       |       |         |    |
| +--+--+   +--+--+ +--+--+                 +--+--+ +--+--+   +--+--+ |
| | vf0 |   | vf1 | | vf2 |                 | vf2 | | vf1 |   | vf0 | |
| +--+--+   +--+--+ +--+--+                 +--+--+ +--+--+   +--+--+ |
|    |         |       |                       |       |         |    |
|    |         |       |                       |       |         |    |
| +--+---------|-------|--+                 +--|-------|---------+--+ |
| |            `-------'  |                 |  `-------'            | |
| |          pf0          |                 |          pf1          | |
| +-----------+-----------+                 +-----------+-----------+ |
|             |                                         |             |
+-------------|-----------------------------------------|-------------+
              |                                         |
+-------------|-----------------------------------------|-------------+
|             |                 switch                  |             |
+-------------|-----------------------------------------|-------------+
              |                                         |
+-------------|-----------------------------------------|-------------+
|             |                                         |             |
| +-----------+------------+                +-----------+-----------+ |
| |          pf0           |                |          pf1          | |
| +-----------+------------+                +-----------+-----------+ |
|             |                                         |             |
| +-----------+-----------------------------------------+-----------+ |
| |                                                                 | |
| |                               trex                              | |
| |                                                                 | |
| +-----------------------------------------------------------------+ |
|                                                                     |
|                                 tgen                                |
+---------------------------------------------------------------------+

Important notes:

* Promisc mode is *disabled* on all ports.
* Traffic is sent from trex with the VF0 MAC addresses as Ethernet destination.
* The switch does *not* flood anything.

Observations:

* When the tgen emits at 1M pkt/s per side, every packet is properly forwarded
to both "outer" testpmds.
* When emitting at 10M pkt/s per side, only ~6.8M pkt/s are received by the
"outer" testpmds on VF0.
* When emitting at 37.5M pkt/s (25G line rate), only ~1.5M pkt/s are received
by the "outer" testpmds on VF0.
* ethtool stats on PF interfaces reflect the actual transmission rate of the
traffic generator (rx_packets_phy) but only a portion of these are relayed to
VF0.
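
The relayed fraction in the observations above can be estimated by sampling
counters twice and dividing. The following is only a minimal sketch: the helper
names (counter_from_stats, rate_pps, relayed_percent) are hypothetical, and it
assumes that `ethtool -S` prints one "name: value" pair per line. The
rx_packets_phy counter is the one mentioned above; the figures passed to
relayed_percent are the rates from this report.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ethtool or DPDK.

# Read `ethtool -S <ifname>` output on stdin and print the value of one counter.
counter_from_stats() {
    awk -v name="$1" '$1 == (name ":") { print $2; exit }'
}

# Packets per second between two counter samples: rate_pps <first> <second> <seconds>
rate_pps() {
    echo $(( ($2 - $1) / $3 ))
}

# Integer percentage (rounded down) of offered packets that were received:
# relayed_percent <offered_pps> <received_pps>
relayed_percent() {
    echo $(( $2 * 100 / $1 ))
}

# On the DUT, the PF ingress rate could be sampled like this (needs the NIC):
#   a=$(ethtool -S enp24s0f0np0 | counter_from_stats rx_packets_phy); sleep 1
#   b=$(ethtool -S enp24s0f0np0 | counter_from_stats rx_packets_phy)
#   rate_pps "$a" "$b" 1

# Rates reported above, per side:
relayed_percent 10000000 6800000    # prints 68
relayed_percent 37500000 1500000    # prints 4
```

With the numbers from this report, about 68% of the packets reach VF0 at
10M pkt/s and about 4% at 37.5M pkt/s.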

With this simpler setup:

+--------------------------+
|               dut        |
|                          |
| +--------------+         |
| |   testpmd    |         |
| +--+-------+---+         |
|    |       |             |
|    |       |             |
|    |       |             |
| +--+--+ +--+--+          |
| | vf0 | | vf1 |          |
| +--+--+ +--+--+          |
|    |       |             |
|    |       |             |
| +--|-------|-----+       |
| |  \      /      |       |
| |   \    /   pf0 |       |
| +----\  /--------+       |
|       ||                 |
+-------||-----------------+
        ||
+-------||---------------------------------+
|       ||                                 |
|       |\--------------------------\      |
|       |          switch           |      |
|       |                           |      |
+-------|---------------------------|------+
        |                           |
+-------|---------------------------|------+
|       |                           |      |
| +-----+-----+                +----+----+ |
| |    pf0    |                |   pf1   | |
| +-----+-----+                +----+----+ |
|       |                           |      |
| +-----+---------------------------+----+ |
| |                                      | |
| |                  trex                | |
| |                                      | |
| +--------------------------------------+ |
|                                          |
|                    tgen                  |
+------------------------------------------+

The maximum line rate of the port can be achieved (37.5M pkt/s total, 18.75M
pkt/s per side).

SOFTWARE
========

~/dpdk# git describe 
v25.11-4-gcd60dcd503b9

~# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-6.16.7-100.fc41.x86_64 ... \
        intel_iommu=on iommu=pt default_hugepagesz=1GB hugepagesz=1G hugepages=32 \
        skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 nohz=on \
        isolcpus=10-19,30-39 nohz_full=10-19,30-39 rcu_nocbs=10-19,30-39 \
        tuned.non_isolcpus=3ff003ff intel_pstate=passive nosoftlockup

libibverbs.so.1.14.51.0
libmlx5.so.1.24.51.0

HARDWARE
========

CPU Model name:                          Intel(R) Xeon(R) Silver 4316 CPU @ 2.30GHz

SLOT          DRIVER     IFNAME        MAC                LINK/STATE  SPEED   DEVICE
0000:18:00.0  mlx5_core  enp24s0f0np0  b8:3f:d2:fa:53:86  1/up        25Gb/s  MT2894 Family [ConnectX-6 Lx]
0000:18:00.1  mlx5_core  enp24s0f1np1  b8:3f:d2:fa:53:87  1/up        25Gb/s  MT2894 Family [ConnectX-6 Lx]
0000:18:00.2  mlx5_core  enp24s0f0v0   02:aa:aa:aa:aa:00  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function
0000:18:00.3  mlx5_core  enp24s0f0v1   02:aa:aa:aa:aa:01  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function
0000:18:00.4  mlx5_core  enp24s0f0v2   02:aa:aa:aa:aa:02  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function
0000:18:08.2  mlx5_core  enp24s0f1v0   02:cc:cc:cc:cc:00  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function
0000:18:08.3  mlx5_core  enp24s0f1v1   02:cc:cc:cc:cc:01  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function
0000:18:08.4  mlx5_core  enp24s0f1v2   02:cc:cc:cc:cc:02  1/up        25Gb/s  ConnectX Family mlx5Gen Virtual Function

~# ethtool -i enp24s0f0np0
driver: mlx5_core
version: 6.16.7-100.fc41.x86_64
firmware-version: 26.41.1000 (MT_0000000532)
expansion-rom-version:
bus-info: 0000:18:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

~# ethtool -i enp24s0f1np1
driver: mlx5_core
version: 6.16.7-100.fc41.x86_64
firmware-version: 26.41.1000 (MT_0000000532)
expansion-rom-version:
bus-info: 0000:18:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

VF CONFIGURATION
================

+ pf0=enp24s0f0np0
+ pf1=enp24s0f1np1

+++ readlink -ve /sys/class/net/enp24s0f0np0/device
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.0
+ pci=0000:18:00.0
+ devlink dev eswitch set pci/0000:18:00.0 mode legacy
+ echo 1
+ tee /sys/class/net/enp24s0f0np0/device/sriov_drivers_autoprobe
1
+ echo 3
+ tee /sys/class/net/enp24s0f0np0/device/sriov_numvfs
3

+ ip link set enp24s0f0np0 vf 0 mac 02:aa:aa:aa:aa:00
+ ip link set enp24s0f0np0 vf 1 mac 02:aa:aa:aa:aa:01
+ ip link set enp24s0f0np0 vf 2 mac 02:aa:aa:aa:aa:02
+ ip link set enp24s0f0v0 address 02:aa:aa:aa:aa:00
+ ip link set enp24s0f0v0 up
+ ip link set enp24s0f0v1 address 02:aa:aa:aa:aa:01
+ ip link set enp24s0f0v1 up
+ ip link set enp24s0f0v2 address 02:aa:aa:aa:aa:02
+ ip link set enp24s0f0v2 up

+++ readlink -ve /sys/class/net/enp24s0f0np0/device/virtfn0
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.2
+ vf00_pci=0000:18:00.2
+++ readlink -ve /sys/class/net/enp24s0f0np0/device/virtfn1
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.3
+ vf01_pci=0000:18:00.3
+++ readlink -ve /sys/class/net/enp24s0f0np0/device/virtfn2
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.4
+ vf02_pci=0000:18:00.4

+++ readlink -ve /sys/class/net/enp24s0f1np1/device
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:00.1
+ pci=0000:18:00.1
+ devlink dev eswitch set pci/0000:18:00.1 mode legacy
+ echo 1
+ tee /sys/class/net/enp24s0f1np1/device/sriov_drivers_autoprobe
1
+ echo 3
+ tee /sys/class/net/enp24s0f1np1/device/sriov_numvfs
3

+++ readlink -ve /sys/class/net/enp24s0f1np1/device/virtfn0
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:08.2
+ vf10_pci=0000:18:08.2
+++ readlink -ve /sys/class/net/enp24s0f1np1/device/virtfn1
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:08.3
+ vf11_pci=0000:18:08.3
+++ readlink -ve /sys/class/net/enp24s0f1np1/device/virtfn2
++ basename /sys/devices/pci0000:17/0000:17:02.0/0000:18:08.4
+ vf12_pci=0000:18:08.4

+ ip link set enp24s0f1np1 vf 0 mac 02:cc:cc:cc:cc:00
+ ip link set enp24s0f1np1 vf 1 mac 02:cc:cc:cc:cc:01
+ ip link set enp24s0f1np1 vf 2 mac 02:cc:cc:cc:cc:02
+ ip link set enp24s0f1v0 address 02:cc:cc:cc:cc:00
+ ip link set enp24s0f1v0 up
+ ip link set enp24s0f1v1 address 02:cc:cc:cc:cc:01
+ ip link set enp24s0f1v1 up
+ ip link set enp24s0f1v2 address 02:cc:cc:cc:cc:02
+ ip link set enp24s0f1v2 up
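
The trace above can be condensed into one function per PF. This is a sketch
reconstructed from the trace, not the reporter's actual script: the names
run, write_sysfs and setup_vfs and the DRY_RUN variable are hypothetical, the
PCI address is passed in instead of being derived with readlink/basename, and
the VF netdev name is assumed to be the PF name with its "npN" suffix replaced
by "vN", as in the interface list above. With DRY_RUN=1 the commands are only
printed, so the sketch can be checked without the hardware.

```shell
#!/bin/sh
# Hypothetical helpers reconstructed from the `set -x` trace above.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Write a value to a sysfs file, or only print the write when DRY_RUN=1.
write_sysfs() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "echo $1 > $2"
    else
        echo "$1" > "$2"
    fi
}

# setup_vfs <pf-ifname> <pf-pci-address> <mac-prefix>
# Creates 3 VFs in legacy eswitch mode and assigns <mac-prefix>:00..02.
setup_vfs() {
    pf=$1 pci=$2 prefix=$3
    run devlink dev eswitch set "pci/$pci" mode legacy
    write_sysfs 1 "/sys/class/net/$pf/device/sriov_drivers_autoprobe"
    write_sysfs 3 "/sys/class/net/$pf/device/sriov_numvfs"
    for i in 0 1 2; do
        run ip link set "$pf" vf "$i" mac "$prefix:0$i"
    done
    for i in 0 1 2; do
        vf="${pf%np*}v$i"   # assumption: enp24s0f0np0 -> enp24s0f0v0
        run ip link set "$vf" address "$prefix:0$i"
        run ip link set "$vf" up
    done
}

# Print the commands for both PFs of this report without executing them.
DRY_RUN=1
setup_vfs enp24s0f0np0 0000:18:00.0 02:aa:aa:aa:aa
setup_vfs enp24s0f1np1 0000:18:00.1 02:cc:cc:cc:cc
```

Setting DRY_RUN=0 and running as root would execute the commands instead of
printing them.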

TESTPMD COMMANDS
================

~# cat ./testpmd
set promisc all off

RUNTIME_DIRECTORY=/tmp/outer0 dpdk-testpmd -n 4 -l 0,12,32,13,33 -a 0000:18:00.2 -a 0000:18:00.3 -- \
        --nb-cores 4 --rxq=4 --txq=4 --rxd=2048 --txd=2048 --forward-mode=mac -i \
        --eth-peer=0,30:3e:a7:0b:f2:54 --eth-peer=1,02:aa:aa:aa:aa:02 \
        --rss-udp --auto-start --cmdline-file=./testpmd --record-burst-stats

RUNTIME_DIRECTORY=/tmp/inner dpdk-testpmd -n 4 -l 0,14,34,15,35 -a 0000:18:00.4 -a 0000:18:08.4 -- \
        --nb-cores 4 --rxq=4 --txq=4 --rxd=2048 --txd=2048 --forward-mode=mac -i \
        --eth-peer=0,02:aa:aa:aa:aa:01 --eth-peer=1,02:cc:cc:cc:cc:01 \
        --rss-udp --auto-start --cmdline-file=./testpmd --record-burst-stats

RUNTIME_DIRECTORY=/tmp/outer1 dpdk-testpmd -n 4 -l 0,10,30,11,31 -a 0000:18:08.2 -a 0000:18:08.3 -- \
        --nb-cores 4 --rxq=4 --txq=4 --rxd=2048 --txd=2048 --forward-mode=mac -i \
        --eth-peer=0,30:3e:a7:0b:f2:55 --eth-peer=1,02:cc:cc:cc:cc:02 \
        --rss-udp --auto-start --cmdline-file=./testpmd --record-burst-stats
