BTW, the offload features are enabled on my test client1 and server1 (the iperf server); the ethtool output is below, followed by a short sketch of how they can be toggled.
vagrant@client1:~$ ethtool -k enp0s8
Features for enp0s8:
rx-checksumming: on [fixed]
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: on
udp-fragmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: on [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: on [fixed]
hw-tc-offload: off [fixed]
vagrant@client1:~$
vagrant@server1:~$ ifconfig enp0s8
enp0s8 Link encap:Ethernet HWaddr 08:00:27:c0:a6:0b
inet addr:192.168.230.101 Bcast:192.168.230.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fec0:a60b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:4228443 errors:0 dropped:0 overruns:0 frame:0
TX packets:2484988 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:34527894301 (34.5 GB) TX bytes:528944799 (528.9 MB)
vagrant@server1:~$ ethtool -k enp0s8
Features for enp0s8:
rx-checksumming: on [fixed]
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: on
udp-fragmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: on [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: on [fixed]
hw-tc-offload: off [fixed]
vagrant@server1:~$
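In case it is useful, a minimal sketch of how these offloads can be toggled with ethtool (the interface name enp0s8 is taken from the output above, adjust to your setup):

# toggle the main offload features (requires root); names are ethtool -K shorthand
sudo ethtool -K enp0s8 tso on gso on gro on tx on rx on sg on
# verify the resulting state
ethtool -k enp0s8 | grep -E 'segmentation|checksumming|receive-offload'

Note that features marked [fixed] above are dictated by the driver and cannot be changed this way.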
-----Original Message-----
From: Yi Yang (杨燚)-云服务集团
Sent: July 11, 2019 8:22
To: [email protected]; [email protected]
Cc: Yi Yang (杨燚)-云服务集团 <[email protected]>
Subject: Re: [ovs-dev] Why is ovs DPDK much worse than ovs in my test case?
Importance: High
Ilya, thank you so much. Using a 9K MTU on all the virtio interfaces along the transport path (including the DPDK ports) does help; the data is below, followed by a sketch of the commands involved.
vagrant@client1:~$ iperf -t 60 -i 10 -c 192.168.230.101
------------------------------------------------------------
Client connecting to 192.168.230.101, TCP port 5001
TCP window size: 325 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.200.101 port 53956 connected with 192.168.230.101 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 315 MBytes 264 Mbits/sec
[ 3] 10.0-20.0 sec 333 MBytes 280 Mbits/sec
[ 3] 20.0-30.0 sec 300 MBytes 252 Mbits/sec
[ 3] 30.0-40.0 sec 307 MBytes 258 Mbits/sec
[ 3] 40.0-50.0 sec 322 MBytes 270 Mbits/sec
[ 3] 50.0-60.0 sec 316 MBytes 265 Mbits/sec
[ 3] 0.0-60.0 sec 1.85 GBytes 265 Mbits/sec
vagrant@client1:~$
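For reference, the MTU change amounts to something like the following (interface and port names are the ones from my setup, shown only as an example, not a verbatim transcript):

# virtio/kernel interfaces along the transport path (run in each VM / on the host)
sudo ip link set dev enp0s8 mtu 9000
# DPDK ports attached to OVS use mtu_request instead
sudo ovs-vsctl set Interface dpdk0 mtu_request=9000
sudo ovs-vsctl set Interface dpdk1 mtu_request=9000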
But it is still much worse than OVS kernel. In my test case I used VirtualBox networking, and the transport path traverses several different VMs; every VM has the offload features enabled except the OVS DPDK VM. My understanding is that TSO is done on the send side, so by the time a packet leaves the sender (or reaches the receiver) it has already been segmented to fit the path MTU. So in the OVS kernel VM / OVS DPDK VM the packet size already matches the MTU of the OVS port / DPDK port, and no TSO work needs to be done there, right?
-----Original Message-----
From: Ilya Maximets [mailto:[email protected]]
Sent: July 10, 2019 18:11
To: [email protected]; Yi Yang (杨燚)-云服务集团 <[email protected]>
Subject: Re: [ovs-dev] Why is ovs DPDK much worse than ovs in my test case?
> Hi, all
>
> I just use OVS as a static router in my test case. OVS runs in a vagrant
> VM, the Ethernet interfaces use the virtio driver, and I create two OVS
> bridges, each with one Ethernet interface added. The two bridges are
> connected by a patch port, and only the default OpenFlow rule is present.
>
> table=0, priority=0 actions=NORMAL
>     Bridge br-int
>         Port patch-br-ex
>             Interface patch-br-ex
>                 type: patch
>                 options: {peer=patch-br-int}
>         Port br-int
>             Interface br-int
>                 type: internal
>         Port "dpdk0"
>             Interface "dpdk0"
>                 type: dpdk
>                 options: {dpdk-devargs="0000:00:08.0"}
>     Bridge br-ex
>         Port "dpdk1"
>             Interface "dpdk1"
>                 type: dpdk
>                 options: {dpdk-devargs="0000:00:09.0"}
>         Port patch-br-int
>             Interface patch-br-int
>                 type: patch
>                 options: {peer=patch-br-ex}
>         Port br-ex
>             Interface br-ex
>                 type: internal
>
> But when I ran iperf to do a performance benchmark, the result shocked me.
>
> For non-DPDK OVS, the result is
>
> vagrant@client1:~$ iperf -t 60 -i 10 -c 192.168.230.101
>
> ------------------------------------------------------------
> Client connecting to 192.168.230.101, TCP port 5001
> TCP window size: 85.0 KByte (default)
> ------------------------------------------------------------
> [ 3] local 192.168.200.101 port 53900 connected with 192.168.230.101 port 5001
> [ ID] Interval Transfer Bandwidth
> [ 3] 0.0-10.0 sec 1.05 GBytes 905 Mbits/sec
> [ 3] 10.0-20.0 sec 1.02 GBytes 877 Mbits/sec
> [ 3] 20.0-30.0 sec 1.07 GBytes 922 Mbits/sec
> [ 3] 30.0-40.0 sec 1.08 GBytes 927 Mbits/sec
> [ 3] 40.0-50.0 sec 1.06 GBytes 914 Mbits/sec
> [ 3] 50.0-60.0 sec 1.07 GBytes 922 Mbits/sec
> [ 3] 0.0-60.0 sec 6.37 GBytes 911 Mbits/sec
>
> vagrant@client1:~$
>
> For OVS DPDK, the bandwidth is only about 45 Mbits/sec. Why? I really
> don't understand what happened.
>
> vagrant@client1:~$ iperf -t 60 -i 10 -c 192.168.230.101
>
> ------------------------------------------------------------
> Client connecting to 192.168.230.101, TCP port 5001
> TCP window size: 85.0 KByte (default)
> ------------------------------------------------------------
> [ 3] local 192.168.200.101 port 53908 connected with 192.168.230.101 port 5001
> [ ID] Interval Transfer Bandwidth
> [ 3] 0.0-10.0 sec 54.6 MBytes 45.8 Mbits/sec
> [ 3] 10.0-20.0 sec 55.5 MBytes 46.6 Mbits/sec
> [ 3] 20.0-30.0 sec 52.5 MBytes 44.0 Mbits/sec
> [ 3] 30.0-40.0 sec 53.6 MBytes 45.0 Mbits/sec
> [ 3] 40.0-50.0 sec 54.0 MBytes 45.3 Mbits/sec
> [ 3] 50.0-60.0 sec 53.9 MBytes 45.2 Mbits/sec
> [ 3] 0.0-60.0 sec 324 MBytes 45.3 Mbits/sec
>
> vagrant@client1:~$
>
> By the way, I tried to pin physical cores to the qemu processes that
> correspond to the OVS PMD threads, but it hardly affected performance.
>
>   PID USER   PR NI    VIRT    RES    SHR S %CPU %MEM    TIME+ COMMAND P
> 16303 yangyi 20  0 9207120 209700 107500 R 99.9  0.1 63:02.37 EMT-1   1
> 16304 yangyi 20  0 9207120 209700 107500 R 99.9  0.1 69:16.16 EMT-2   2
Hi.
There might be a lot of reasons for bad performance, but most likely this is simply because of disabled offloading capabilities on the VM interface (mostly TSO).
Try using a UDP flow for testing; you should get almost the same results for kernel and DPDK in the UDP case. Try:
# iperf -t 60 -i 10 -c 192.168.230.101 -u -b 10G -l 1460
The bottleneck in your setup is the tap interface that connects the VM with the host network, which is why I expect similar results in both cases.
In the kernel OVS case, the tap interface on the host will have TSO and checksum offloading enabled, so iperf will use huge 64K packets that are never segmented, since everything happens on the same host and all kernels (guest and host) support TSO and checksum offloading by default.
When using OVS with DPDK in the guest, the tap interface on the host will have no TSO/checksum offloading, so the host kernel has to split each 64K TCP packet into MTU-sized frames and recalculate the checksums for all the chunks. This slows everything down significantly.
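One way to check this on the host is to look at the offload state of the tap/backing interface involved, e.g. (the interface name here is just a placeholder):

# if TSO/checksum offloading is missing here, the host kernel has to segment
# and checksum the traffic in software
ethtool -k <tap-interface> | grep -E 'tcp-segmentation-offload|tx-checksumming'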
To partially mitigate the issue with TCP you could increase the MTU to 8K on all your interfaces (all the host and guest interfaces). Use 'mtu_request' to set the MTU for DPDK ports. This should give a good performance boost in the DPDK case.
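For example, something along these lines (port and interface names are placeholders, adapt to your topology):

# DPDK ports in OVS take the MTU via mtu_request
ovs-vsctl set Interface dpdk0 mtu_request=8000
ovs-vsctl set Interface dpdk1 mtu_request=8000
# kernel-managed host and guest interfaces take it via ip link
ip link set dev <interface> mtu 8000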
Best regards, Ilya Maximets.