Re: [ovs-discuss] Slow Performance with OvS-DPDK when Running Iperf

2023-07-27 Thread Ilya Maximets via discuss
On 7/27/23 18:48, Matheus Stolet via discuss wrote:
> Hello,

Hi.

> 
> I am running some performance benchmarks to figure out why I am getting 
> fairly low performance when running iperf inside a VM with OvS-DPDK. I 
> started VMs on two different physical machines using QEMU and KVM. In 
> one machine I ran the iperf server and on the other machine I ran the 
> iperf client. Both machines are also running OvS with DPDK on the 
> datapath and using dpdkvhostuser in the appropriate port. To do this I 
> followed the instructions on these two pages 
> (https://docs.openvswitch.org/en/latest/intro/install/dpdk/) and 
> (https://docs.openvswitch.org/en/latest/topics/dpdk/vhost-user/). I

You might want to read the following doc as well:
  https://docs.openvswitch.org/en/latest/howto/dpdk/

 
> would expect some performance decrease when running things inside a VM 
> and with a virtual switch, but the performance I am getting is 
> alarmingly low. When running an iperf client in one VM that contacts
> the iperf server running on the other VM I got a throughput of 3.43
> Gb/s. When I ran the same iperf benchmark on the physical machines I got 
> a throughput of 23.7 Gb/s. 3.43 Gb/s seems like it is way too low and I 
> feel like I am missing some essential configuration.
> 
> Things I have tried:
> - Setting dpdk-lcore-mask to specify the CPU cores to use with DPDK
> 
>    ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0x2a

Nit: Don't set the lcore mask; it's generally not needed and usually
harmful unless you know exactly what you're doing.
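
If it is already set, you can simply drop it and let OVS pick the
default; something along these lines should do it (a restart of
ovs-vswitchd may be needed for the change to take effect):

  $ ovs-vsctl remove Open_vSwitch . other_config dpdk-lcore-mask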

> 
> - Setting pmd-cpu-mask so that ovs uses multiple pmd threads
> 
>    ovs-vsctl --no-wait set Open_vSwitch . other_config:pmd-cpu-mask=0xa80
> 
> - Turning off some NIC offload functions in both guest and host OS that 
> seem to reduce virtio performance with KVM
> 
>    ethtool --offload enp59s0f0np0 gso off tso off sg off gro off
> 
> Turning off the NIC offload operations actually helped. At first I was 
> getting a measly 161 Mb/s throughput with iperf and turning off those 
> offloads helped get that up to 3.43 Gb/s, which is still far from ideal. 
> Does anyone have any ideas as to why I am getting such poor performance 
> when compared to running the same benchmark on the physical machine?
> 
> 
> OvS commands used to create bridge and ports
> 
> $ ovs-vsctl add-br br0
> $ ovs-vsctl add-port br0 enp59s0f0np0

The first thing you need to change is to open the physical
port above as a dpdk port (interface type=dpdk).  In the current
config it is just opened via an AF_PACKET socket, which doesn't
have impressive performance.
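
For example, something like this (the port name is arbitrary, and the
PCI address below is just a guess based on the interface name; check
the real one with 'ethtool -i enp59s0f0np0' or lspci before using it):

  $ ovs-vsctl del-port br0 enp59s0f0np0
  $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 \
        type=dpdk options:dpdk-devargs=0000:3b:00.0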

Note that TCP performance is tricky.  In your case you may not
get the same rate as you have on the host even if you do everything
right.  The main reason is that OVS with DPDK doesn't support TSO
by default.  You may enable support for it though, see:
  https://docs.openvswitch.org/en/latest/topics/userspace-tso/
But it is experimental and not all scenarios will work, e.g.
it will not be possible to use tunnels.
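
The gist of it is a single knob, roughly like below; see the doc above
for the prerequisites, and note that changing it requires restarting
the daemon:

  $ ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=true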

Packet-per-second performance should be much higher though.
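
A quick way to see where the cycles go and whether the PMD threads are
dropping packets is to look at the PMD stats, e.g.:

  $ ovs-appctl dpif-netdev/pmd-stats-clear
  $ ovs-appctl dpif-netdev/pmd-stats-show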

> $ ovs-vsctl set bridge br0 datapath_type=netdev
> $ ovs-vsctl add-port br0 vhost0 -- set Interface vhost0 options:n_rxq=10

Nit: 'n_rxq' doesn't work for vhost interfaces; the number of queues
is derived from the actual virtio device in qemu.  It's harmless
to have it set, but it will not affect anything, OVS will detect
10 queues anyway.

Nit: And there is not much sense in having 10 queues and only 3
cores in pmd-cpu-mask.  Also, iperf is single-threaded by default,
so only one queue will likely be utilized anyway.
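
You can check how the queues actually ended up distributed across the
PMD threads with:

  $ ovs-appctl dpif-netdev/pmd-rxq-show

And if you want more than one queue exercised, run iperf with several
parallel streams, e.g. 'iperf -P 4'.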

Best regards, Ilya Maximets.

> type=dpdkvhostuser
> 
> QEMU command used to start VM
> 
> $ taskset -c 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42 \
>  sudo qemu-system-x86_64 \
>  -nographic -monitor none -serial stdio \
>  -machine accel=kvm,type=q35 \
>  -cpu host \
>  -smp 11 \
>  -m 10G \
>  -netdev user,id=net0,hostfwd=tcp::2220-:22 \
>  -device virtio-net-pci,netdev=net0 \
>  -chardev socket,id=char0,path=/usr/local/var/run/openvswitch/vhost0 \
>  -netdev type=vhost-user,chardev=char0,vhostforce=on,queues=10,id=net1 \
>  -device virtio-net-pci,netdev=net1,mac=$mac,mq=on,vectors=22 \
>  -object memory-backend-file,id=mem,size=10G,mem-path=/dev/hugepages,share=on \
>  -numa node,memdev=mem -mem-prealloc \
>  -drive if=virtio,format=raw,file="base.img" \
>  -drive if=virtio,format=raw,file="seed.img" \
> 
> OvS was compiled to use DPDK with the following configurations:
> ./configure --with-dpdk=static CFLAGS="-Ofast -msse4.2 -mpopcnt -march=native"
> 
> Specs
> VM
> DPDK: 21.11.4
> Kernel: 5.4.0-148-generic
> Distribution: Ubuntu 20.04
> 
> Host
> DPDK: 21.11.4
> QEMU: 8.0.90
> OvS: 3.0.5
> Kernel: 5.15.111.1.amd64-smp
> Distribution: Debian 11
> CPU: Two Intel Xeon Gold 6152 CPUs
> NIC: Mellanox ConnectX-5 Ex 100Gbit/s
> 
> Thanks,
> Matheus

