Hi Florin,

On Fri, Sep 11, 2020 at 11:23 PM Florin Coras <fcoras.li...@gmail.com> wrote:

> Hi Vijay,
>
> Quick replies inline.
>
> On Sep 11, 2020, at 7:27 PM, Vijay Sampath <vsamp...@gmail.com> wrote:
>
> Hi Florin,
>
> Thanks once again for looking at this issue. Please see inline:
>
> On Fri, Sep 11, 2020 at 2:06 PM Florin Coras <fcoras.li...@gmail.com>
> wrote:
>
>> Hi Vijay,
>>
>> Inline.
>>
>> On Sep 11, 2020, at 1:08 PM, Vijay Sampath <vsamp...@gmail.com> wrote:
>>
>> Hi Florin,
>>
>> Thanks for the response. Please see inline:
>>
>> On Fri, Sep 11, 2020 at 10:42 AM Florin Coras <fcoras.li...@gmail.com>
>> wrote:
>>
>>> Hi Vijay,
>>>
>>> Cool experiment. More inline.
>>>
>>> > On Sep 11, 2020, at 9:42 AM, Vijay Sampath <vsamp...@gmail.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > I am using iperf3 as a client on an Ubuntu 18.04 Linux machine
>>> connected to another server running VPP using 100G NICs. Both servers are
>>> Intel Xeon with 24 cores.
>>>
>>> May I ask the frequency for those cores? Also what type of nic are you
>>> using?
>>>
>>
>> 2700 MHz.
>>
>>
>> Probably this somewhat limits throughput per single connection compared
>> to my testbed where the Intel cpu boosts to 4GHz.
>>
>
> Please see below, I noticed an anomaly.
>
>
>> The nic is a Pensando DSC100.
>>
>>
>> Okay, not sure what to expect there. Since this mostly stresses the rx
>> side, what’s the number of rx descriptors? Typically I test with 256, with
>> more connections higher throughput you might need more.
>>
>
> This is the default - comments seem to suggest that is 1024. I don't see
> any rx queue empty errors on the nic, which probably means there are
> sufficient buffers.
>
>
> Reasonable. Might want to try to reduce it down to 256 but performance
> will depend a lot on other things as well.
>

This seems to help, but I do get rx queue empty nic drops. More below.

>
> > I am trying to push 100G traffic from the iperf Linux TCP client by
>>> starting 10 parallel iperf connections on different port numbers and
>>> pinning them to different cores on the sender side. On the VPP receiver
>>> side I have 10 worker threads and 10 rx-queues in dpdk, and running iperf3
>>> using VCL library as follows
>>> >
>>> > taskset 0x00400 sh -c
>>> "LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libvcl_ldpreload.so
>>> VCL_CONFIG=/etc/vpp/vcl.conf iperf3 -s -4 -p 9000" &
>>> > taskset 0x00800 sh -c
>>> "LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libvcl_ldpreload.so
>>> VCL_CONFIG=/etc/vpp/vcl.conf iperf3 -s -4 -p 9001" &
>>> > taskset 0x01000 sh -c "LD_PRELOAD=/usr/lib/x86_64
>>> > ...
>>> >
>>> > MTU is set to 9216 everywhere, and TCP MSS set to 8200 on client:
>>> >
>>> > taskset 0x0001 iperf3 -c 10.1.1.102 -M 8200 -Z -t 6000 -p 9000
>>> > taskset 0x0002 iperf3 -c 10.1.1.102 -M 8200 -Z -t 6000 -p 9001
>>> > ...
>>>
>>> Could you try first with only 1 iperf server/client pair, just to see
>>> where performance is with that?
>>>
>>
>> These are the numbers I get
>> rx-fifo-size 65536: ~8G
>> rx-fifo-size 524288: 22G
>> rx-fifo-size 4000000: 25G
>>
>>
>> Okay, so 4MB is probably the sweet spot. Btw, could you check the vector
>> rate (and the errors) in this case also?
>>
>
> I noticed that adding "enable-tcp-udp-checksum" back seems to improve
> performance. Not sure if this is an issue with the dpdk driver for the nic.
> Anyway in the "show hardware" flags I see now that tcp and udp checksum
> offloads are enabled:
>
> root@server:~# vppctl show hardware
>               Name                Idx    Link  Hardware
> eth0                               1     up    dsc1
>   Link speed: 100 Gbps
>   Ethernet address 00:ae:cd:03:79:51
>   ### UNKNOWN ###
>     carrier up full duplex mtu 9000
>     flags: admin-up pmd maybe-multiseg rx-ip4-cksum
>     Devargs:
>     rx: queues 4 (max 16), desc 1024 (min 16 max 32768 align 1)
>     tx: queues 5 (max 16), desc 1024 (min 16 max 32768 align 1)
>     pci: device 1dd8:1002 subsystem 1dd8:400a address 0000:15:00.00 numa 0
>     max rx packet len: 9208
>     promiscuous: unicast off all-multicast on
>     vlan offload: strip off filter off qinq off
>     rx offload avail:  vlan-strip ipv4-cksum udp-cksum tcp-cksum
> vlan-filter
>                        jumbo-frame scatter
>     rx offload active: ipv4-cksum udp-cksum tcp-cksum jumbo-frame scatter
>     tx offload avail:  vlan-insert ipv4-cksum udp-cksum tcp-cksum tcp-tso
>                        outer-ipv4-cksum multi-segs mbuf-fast-free
> outer-udp-cksum
>     tx offload active: multi-segs
>     rss avail:         ipv4-tcp ipv4-udp ipv4 ipv6-tcp ipv6-udp ipv6
>     rss active:        ipv4-tcp ipv4-udp ipv4 ipv6-tcp ipv6-udp ipv6
>     tx burst function: ionic_xmit_pkts
>     rx burst function: ionic_recv_pkts
>
> With this I get better performance per iperf3 connection - about 30.5G.
> Show run output attached (1connection.txt)
>
>
> Interesting. Yes, dpdk does request offload rx ip/tcp checksum computation
> when possible but it currently (unless some of the pending patches were
> merged) does not mark the packet appropriately and ip4-local will
> recompute/validate the checksum. From your logs, it seems ip4-local needs
> ~1.8e3 cycles in the 1 connection setup and ~3.1e3 for 7 connections.
> That’s a lot, so it seems to confirm that the checksum is recomputed.
>
> So, it’s somewhat counter intuitive the fact that performance improves.
> How do the show run numbers change? Could be that performance worsens
> because of tcp’s congestion recovery/flow control, i.e., the packets are
> processed faster but some component starts dropping/queues get full.
>

That's interesting. I got confused by the "show hardware" output since it
doesn't show any checksum offloads against "tx offload active". You are
right though, it definitely uses fewer cycles without this option present,
so I took it out for further tests. I am attaching the show run output for
both the 1-connection and 7-connection cases without this option present.

With 1 connection, it appears VPP is not loaded at all since there is no
batching happening? With 7 connections I do see it getting around 90-92G.
When I drop the rx queue size to 256, I do see some nic drops, but
performance improves and I am getting 99G now. Can you please explain why
this makes a difference? Does it have to do with caches?

Are the other cores kind of unusable now due to being on a different numa?
With Linux TCP, I believe I was able to use most of the cores and scale the
number of connections.

Anyway, it is good that I can get close to line rate. I will try more
experiments and see. Thanks for your help.

Thanks,

Vijay
root@server:~# vppctl show run; vppctl show error; vppctl show tcp stats
Thread 0 vpp_main (lcore 0)
Time 4.2, 10 sec internal node vector rate 0.00 loops/sec 1452525.71
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
cnat-scanner-process            any wait           0         0         5    2.06e3          0.00
dpdk-process                    any wait           0         0         1    1.85e4          0.00
fib-walk                        any wait           0         0         2    2.41e3          0.00
ikev2-manager-process           any wait           0         0         5    2.16e3          0.00
ip4-full-reassembly-expire-wal  any wait           0         0         1    3.31e3          0.00
ip4-sv-reassembly-expire-walk   any wait           0         0         1    2.83e3          0.00
ip6-full-reassembly-expire-wal  any wait           0         0         1    2.52e3          0.00
ip6-mld-process                 any wait           0         0         5    9.51e2          0.00
ip6-ra-process                  any wait           0         0         5    1.00e3          0.00
ip6-sv-reassembly-expire-walk   any wait           0         0         1    3.05e3          0.00
session-queue-main              polling       748050         0         0    1.07e2          0.00
session-queue-process           any wait           0         0         4    9.73e2          0.00
statseg-collector-process       time wait          0         0         1    2.54e4          0.00
unix-cli-local:32               active             3         0         6   1.69e14          0.00
unix-cli-new-session            any wait           0         0         7    1.06e3          0.00
unix-epoll-input                polling       748050         0         0    1.20e4          0.00
wg-timer-manager                any wait           0         0       422    3.03e2          0.00
---------------
Thread 1 vpp_wk_0 (lcore 1)
Time 4.2, 10 sec internal node vector rate 1.00 loops/sec 6435691.03
  vector rates in 4.7367e-1, out 0.0000e0, drop 4.7367e-1, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling     28048168         2         0    1.46e9          0.00
drop                            active             2         2         0    5.71e2          1.00
error-drop                      active             2         2         0    9.54e2          1.00
ethernet-input                  active             2         2         0    2.49e3          1.00
llc-input                       active             2         2         0    2.87e2          1.00
session-queue                   polling     28048168         0         0    1.45e2          0.00
unix-epoll-input                polling        27364         0         0    3.83e2          0.00
---------------
Thread 2 vpp_wk_1 (lcore 2)
Time 4.2, 10 sec internal node vector rate 2.38 loops/sec 332294.93
  vector rates in 6.2611e5, out 1.4979e5, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling      1655268   2011225         0    3.97e2          1.22
dsc1-output                     active        632451    632451         0    2.61e2          1.00
dsc1-tx                         active        632451    632451         0    3.13e2          1.00
ethernet-input                  active        632452   2011225         0    1.03e2          3.18
ip4-input-no-checksum           active        632452   2011225         0    1.01e2          3.18
ip4-local                       active        632452   2011225         0    8.93e1          3.18
ip4-lookup                      active        730721   2643676         0    9.48e1          3.62
ip4-rewrite                     active        632451    632451         0    1.92e2          1.00
session-queue                   polling      1655268    632451         0    9.97e2           .38
tcp4-established                active        632452   2011225         0    2.92e3          3.18
tcp4-input                      active        632452   2011225         0    1.45e2          3.18
tcp4-output                     active        632451    632451         0    3.31e2          1.00
unix-epoll-input                polling         1615         0         0    8.51e2          0.00
---------------
Thread 3 vpp_wk_2 (lcore 3)
Time 4.2, 10 sec internal node vector rate 0.00 loops/sec 6256218.08
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling     27519330         0         0    1.07e2          0.00
session-queue                   polling     27519330         0         0    1.46e2          0.00
unix-epoll-input                polling        26849         0         0    3.95e2          0.00
---------------
Thread 4 vpp_wk_3 (lcore 4)
Time 4.2, 10 sec internal node vector rate 0.00 loops/sec 6420721.44
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling     27884503         0         0    1.04e2          0.00
session-queue                   polling     27884503         0         0    1.53e2          0.00
unix-epoll-input                polling        27205         0         0    4.11e2          0.00

     Count  Node              Reason
         2  llc-input         unknown llc ssap/dsap
    632453  session-queue     Packets transmitted
   2011751  tcp4-established  Packets pushed into rx fifo
    632453  tcp4-output       Packets sent

Thread 0:
Thread 1:
Thread 2:
Thread 3:
Thread 4:
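
As a rough cross-check of the capture above, the Clocks column for the receive-path nodes on vpp_wk_1 can be summed into a per-packet cycle cost and compared against the 2700 MHz core clock mentioned in the thread. This is only a sketch: it assumes Clocks is the average cycle count per packet for each node, it ignores transmit-side work and empty polling, and the names RX_PATH_CLOCKS, CORE_HZ, rx_clocks_per_packet and max_rx_pps are made up for illustration.

```python
# Clocks values copied from the vpp_wk_1 rows of the capture above.
# Assumption: each value is the average number of cycles per packet.
RX_PATH_CLOCKS = {
    "dpdk-input": 3.97e2,
    "ethernet-input": 1.03e2,
    "ip4-input-no-checksum": 1.01e2,
    "ip4-local": 8.93e1,
    "ip4-lookup": 9.48e1,
    "tcp4-input": 1.45e2,
    "tcp4-established": 2.92e3,
}

# 2700 MHz, the core frequency stated in the thread.
CORE_HZ = 2.7e9


def rx_clocks_per_packet(clocks):
    """Sum the per-node cycle costs along the receive path."""
    return sum(clocks.values())


def max_rx_pps(clocks, core_hz):
    """Packets per second one core could receive if it did nothing else."""
    return core_hz / rx_clocks_per_packet(clocks)


if __name__ == "__main__":
    total = rx_clocks_per_packet(RX_PATH_CLOCKS)
    share = RX_PATH_CLOCKS["tcp4-established"] / total
    print(f"receive path: {total:.1f} clocks/packet")
    print(f"tcp4-established share: {share:.0%}")
    print(f"single-core bound: {max_rx_pps(RX_PATH_CLOCKS, CORE_HZ):.0f} pps")
```

Under these assumptions tcp4-established accounts for roughly three quarters of the receive-path cycles, so the bound says more about that node than about the lower layers.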
root@server:~# vppctl show run; vppctl show error; vppctl show tcp stats
Thread 0 vpp_main (lcore 0)
Time 3.2, 10 sec internal node vector rate 0.00 loops/sec 1212360.50
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
cnat-scanner-process            any wait           0         0         3    7.17e3          0.00
dpdk-process                    any wait           0         0         1    2.77e4          0.00
fib-walk                        any wait           0         0         1    8.36e3          0.00
ikev2-manager-process           any wait           0         0         3    5.65e3          0.00
ip6-mld-process                 any wait           0         0         3    1.84e3          0.00
ip6-ra-process                  any wait           0         0         3    3.93e3          0.00
session-queue-main              polling       561429         0         0    1.07e2          0.00
session-queue-process           any wait           0         0         3    4.14e3          0.00
unix-cli-local:38               active             3         0         6   1.69e14          0.00
unix-cli-new-session            any wait           0         0         7    2.45e3          0.00
unix-epoll-input                polling       561429         0         0    1.23e4          0.00
wg-timer-manager                any wait           0         0       323    3.48e2          0.00
---------------
Thread 1 vpp_wk_0 (lcore 1)
Time 3.2, 10 sec internal node vector rate 140.97 loops/sec 1931.07
  vector rates in 4.9828e5, out 5.6866e3, drop 9.2852e-1, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling         6217   1591552         0    3.03e2        256.00
drop                            active             3         3         0    2.62e3          1.00
dsc1-output                     active          6217     18373         0    2.69e2          2.96
dsc1-tx                         active          6217     18373         0    5.85e2          2.96
error-drop                      active             3         3         0    1.63e3          1.00
ethernet-input                  active          6217   1591552         0    1.79e1        256.00
ip4-input-no-checksum           active          6217   1591549         0    1.94e1        255.99
ip4-local                       active          6217   1591549         0    2.47e1        255.99
ip4-lookup                      active         12434   1609922         0    2.52e1        129.48
ip4-rewrite                     active          6217     18373         0    1.84e2          2.96
llc-input                       active             3         3         0    2.76e3          1.00
session-queue                   polling         6217     18373         0    1.18e3          2.96
snap-input                      active             1         1         0    4.85e3          1.00
tcp4-established                active          6217   1591549         0    3.97e3        255.99
tcp4-input                      active          6217   1591549         0    6.45e1        255.99
tcp4-output                     active          6217     18373         0    5.31e2          2.96
unix-epoll-input                polling            7         0         0    3.26e3          0.00
---------------
Thread 2 vpp_wk_1 (lcore 2)
Time 3.2, 10 sec internal node vector rate 0.00 loops/sec 6375947.26
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling     21415346         0         0    1.04e2          0.00
session-queue                   polling     21415346         0         0    1.52e2          0.00
unix-epoll-input                polling        20893         0         0    4.94e2          0.00
---------------
Thread 3 vpp_wk_2 (lcore 3)
Time 3.2, 10 sec internal node vector rate 140.55 loops/sec 1845.32
  vector rates in 4.7233e5, out 3.6587e3, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling         5915   1514240         0    3.18e2        256.00
dsc1-output                     active          5915     11821         0    4.11e2          1.99
dsc1-tx                         active          5915     11821         0    7.99e2          1.99
ethernet-input                  active          5915   1514240         0    1.84e1        256.00
ip4-input-no-checksum           active          5915   1514240         0    1.92e1        256.00
ip4-local                       active          5915   1514240         0    2.47e1        256.00
ip4-lookup                      active         11830   1526061         0    2.60e1        128.99
ip4-rewrite                     active          5915     11821         0    2.71e2          1.99
session-queue                   polling         5915     11821         0    1.56e3          1.99
tcp4-established                active          5915   1514240         0    4.18e3        256.00
tcp4-input                      active          5915   1514240         0    6.42e1        256.00
tcp4-output                     active          5915     11821         0    8.72e2          1.99
unix-epoll-input                polling            6         0         0    2.32e3          0.00
---------------
Thread 4 vpp_wk_3 (lcore 4)
Time 3.2, 10 sec internal node vector rate 140.55 loops/sec 1718.46
  vector rates in 4.5117e5, out 3.4965e3, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling         5650   1446400         0    3.26e2        256.00
dsc1-output                     active          5650     11297         0    4.51e2          1.99
dsc1-tx                         active          5650     11297         0    7.07e2          1.99
ethernet-input                  active          5650   1446400         0    1.92e1        256.00
ip4-input-no-checksum           active          5650   1446400         0    2.00e1        256.00
ip4-local                       active          5650   1446400         0    2.54e1        256.00
ip4-lookup                      active         11300   1457697         0    2.74e1        128.99
ip4-rewrite                     active          5650     11297         0    2.66e2          1.99
session-queue                   polling         5650     11297         0    1.69e3          1.99
tcp4-established                active          5650   1446400         0    4.38e3        256.00
tcp4-input                      active          5650   1446400         0    6.65e1        256.00
tcp4-output                     active          5650     11297         0    8.37e2          1.99
unix-epoll-input                polling            5         0         0    2.34e3          0.00

     Count  Node              Reason
     18373  session-queue     Packets transmitted
   1586542  tcp4-established  Packets pushed into rx fifo
      5007  tcp4-established  OOO packets pushed into rx fifo
     18373  tcp4-output       Packets sent
         1  snap-input        unknown oui/snap protocol
         2  llc-input         unknown llc ssap/dsap
     11823  session-queue     Packets transmitted
   1514496  tcp4-established  Packets pushed into rx fifo
     11823  tcp4-output       Packets sent
     11299  session-queue     Packets transmitted
   1446656  tcp4-established  Packets pushed into rx fifo
     11299  tcp4-output       Packets sent

Thread 0:
Thread 1:
Thread 2:
Thread 3:
Thread 4:
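
The dpdk-input Vectors column and the Time value in the capture above allow a back-of-the-envelope throughput estimate. The sketch below assumes that every received packet carries a full 8200-byte MSS payload (the -M 8200 value from the client command line) and that Time is exact, although it is printed rounded to one decimal, so the result is an approximate upper bound rather than a measurement. The names DPDK_INPUT_VECTORS, CAPTURE_SECONDS, MSS_BYTES and estimated_gbps are illustrative.

```python
# dpdk-input Vectors per busy worker, copied from the capture above.
DPDK_INPUT_VECTORS = {
    "vpp_wk_0": 1591552,
    "vpp_wk_2": 1514240,
    "vpp_wk_3": 1446400,
}

# "Time 3.2" from the capture; the printed value is rounded.
CAPTURE_SECONDS = 3.2

# Assumption: every received packet carries a full MSS of payload.
MSS_BYTES = 8200


def estimated_gbps(vectors_by_worker, seconds, mss_bytes):
    """Approximate upper bound on received TCP payload, in Gbit/s."""
    packets_per_second = sum(vectors_by_worker.values()) / seconds
    return packets_per_second * mss_bytes * 8 / 1e9


if __name__ == "__main__":
    gbps = estimated_gbps(DPDK_INPUT_VECTORS, CAPTURE_SECONDS, MSS_BYTES)
    print(f"estimated payload rate: {gbps:.1f} Gbit/s")
```

With these inputs the estimate comes out at about 93 Gbit/s, which is in the same range as the 90-92G reported in the message for 7 connections.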
root@server:~# vppctl show run; vppctl show error; vppctl show tcp stats
Thread 0 vpp_main (lcore 0)
Time 2.1, 10 sec internal node vector rate 0.00 loops/sec 1263586.07
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
cnat-scanner-process            any wait           0         0         2    5.63e3          0.00
fib-walk                        any wait           0         0         1    6.67e3          0.00
ikev2-manager-process           any wait           0         0         2    4.55e3          0.00
ip6-mld-process                 any wait           0         0         2    2.44e3          0.00
ip6-ra-process                  any wait           0         0         2    2.46e3          0.00
session-queue-main              polling       355005         0         0    1.14e2          0.00
session-queue-process           any wait           0         0         2    2.94e3          0.00
unix-cli-local:10               active             3         0         6   1.70e14          0.00
unix-cli-new-session            any wait           0         0         7    2.18e3          0.00
unix-epoll-input                polling       355005         0         0    1.24e4          0.00
wg-timer-manager                any wait           0         0       206    4.75e2          0.00
---------------
Thread 1 vpp_wk_0 (lcore 1)
Time 2.1, 10 sec internal node vector rate 12.75 loops/sec 25016.09
  vector rates in 5.6834e5, out 5.2130e4, drop 4.8461e-1, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling        53781   1065214         0    2.39e2         19.81
drop                            active             1         1         0    1.34e3          1.00
dsc1-output                     active         53781    107572         0    1.77e2          2.00
dsc1-tx                         active         53781    107572         0    2.42e2          2.00
error-drop                      active             1         1         0    1.73e3          1.00
ethernet-input                  active         53781   1065214         0    2.75e1         19.81
ip4-input-no-checksum           active         53781   1065213         0    3.04e1         19.81
ip4-local                       active         53781   1065213         0    4.08e1         19.81
ip4-lookup                      active         53782   1172785         0    3.64e1         21.81
ip4-rewrite                     active         53781    107572         0    1.68e2          2.00
llc-input                       active             1         1         0    1.46e3          1.00
session-queue                   polling        53781    107572         0    9.77e2          2.00
tcp4-established                active         53781   1065213         0    3.59e3         19.81
tcp4-input                      active         53781   1065213         0    8.29e1         19.81
tcp4-output                     active         53781    107572         0    2.37e2          2.00
unix-epoll-input                polling           52         0         0    1.90e3          0.00
---------------
Thread 2 vpp_wk_1 (lcore 2)
Time 2.1, 10 sec internal node vector rate 1.23 loops/sec 198018.28
  vector rates in 4.7414e5, out 2.1513e5, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling       444430    534460         0    6.88e2          1.20
dsc1-output                     active        443931    443931         0    2.54e2          1.00
dsc1-tx                         active        443931    443931         0    3.23e2          1.00
ethernet-input                  active        443931    534460         0    2.44e2          1.20
ip4-input-no-checksum           active        443931    534460         0    1.96e2          1.20
ip4-local                       active        443931    534460         0    1.92e2          1.20
ip4-lookup                      active        444129    978391         0    1.45e2          2.20
ip4-rewrite                     active        443931    443931         0    1.89e2          1.00
session-queue                   polling       444430    443931         0    9.38e2           .99
tcp4-established                active        443931    534460         0    4.84e3          1.20
tcp4-input                      active        443931    534460         0    2.82e2          1.20
tcp4-output                     active        443931    443931         0    3.23e2          1.00
unix-epoll-input                polling          434         0         0    1.77e3          0.00
---------------
Thread 3 vpp_wk_2 (lcore 3)
Time 2.1, 10 sec internal node vector rate 1.29 loops/sec 172803.28
  vector rates in 4.5671e5, out 1.9766e5, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling       408053    534551         0    6.57e2          1.31
dsc1-output                     active        407872    407872         0    2.54e2          1.00
dsc1-tx                         active        407872    407872         0    3.09e2          1.00
ethernet-input                  active        407872    534551         0    2.13e2          1.31
ip4-input-no-checksum           active        407872    534551         0    1.78e2          1.31
ip4-local                       active        407872    534551         0    1.84e2          1.31
ip4-lookup                      active        407893    942423         0    1.36e2          2.31
ip4-rewrite                     active        407872    407872         0    1.91e2          1.00
session-queue                   polling       408053    407872         0    1.21e3           .99
tcp4-established                active        407872    534551         0    4.88e3          1.31
tcp4-input                      active        407872    534551         0    3.05e2          1.31
tcp4-output                     active        407872    407872         0    3.16e2          1.00
unix-epoll-input                polling          399         0         0    1.74e3          0.00
---------------
Thread 4 vpp_wk_3 (lcore 4)
Time 2.1, 10 sec internal node vector rate 116.97 loops/sec 2633.27
  vector rates in 4.8335e5, out 7.8512e3, drop 0.0000e0, punt 0.0000e0
Name                            State          Calls   Vectors  Suspends    Clocks  Vectors/Call
dpdk-input                      polling         5157    981197         0    2.23e2        190.27
dsc1-output                     active          5157     16201         0    2.36e2          3.14
dsc1-tx                         active          5157     16201         0    3.95e2          3.14
ethernet-input                  active          5157    981197         0    1.67e1        190.27
ip4-input-no-checksum           active          5157    981197         0    1.87e1        190.27
ip4-local                       active          5157    981197         0    2.71e1        190.27
ip4-lookup                      active          5265    997398         0    2.57e1        189.44
ip4-rewrite                     active          5157     16201         0    1.51e2          3.14
session-queue                   polling         5157     16201         0    1.29e3          3.14
tcp4-established                active          5157    981197         0    4.19e3        190.27
tcp4-input                      active          5157    981197         0    6.74e1        190.27
tcp4-output                     active          5157     16201         0    3.51e2          3.14
unix-epoll-input                polling            5         0         0    2.71e3          0.00

     Count  Node              Reason
    107570  session-queue     Packets transmitted
   1065518  tcp4-established  Packets pushed into rx fifo
    107570  tcp4-output       Packets sent
         1  llc-input         unknown llc ssap/dsap
    444037  session-queue     Packets transmitted
    534277  tcp4-established  Packets pushed into rx fifo
    444037  tcp4-output       Packets sent
    407953  session-queue     Packets transmitted
    534415  tcp4-established  Packets pushed into rx fifo
    407953  tcp4-output       Packets sent
     16195  session-queue     Packets transmitted
    950564  tcp4-established  Packets pushed into rx fifo
     30417  tcp4-established  OOO packets pushed into rx fifo
     16195  tcp4-output       Packets sent

Thread 0:
Thread 1:
Thread 2:
Thread 3:
Thread 4:
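
One pattern in the capture above is that the workers with few vectors per call also transmit far more packets relative to what they receive. The sketch below computes that ratio from the dpdk-input and tcp4-output Vectors columns. It assumes the transmitted packets are mostly ACKs, which seems plausible for an iperf3 receiver but is not confirmed in the thread; the names WORKERS and tx_per_rx are illustrative.

```python
# Per worker: (dpdk-input Vectors, tcp4-output Vectors, dpdk-input
# Vectors/Call), copied from the capture above.
WORKERS = {
    "vpp_wk_0": (1065214, 107572, 19.81),
    "vpp_wk_1": (534460, 443931, 1.20),
    "vpp_wk_2": (534551, 407872, 1.31),
    "vpp_wk_3": (981197, 16201, 190.27),
}


def tx_per_rx(rx_packets, tx_packets):
    """Transmitted packets per received packet on one worker."""
    return tx_packets / rx_packets


if __name__ == "__main__":
    for name, (rx, tx, vectors_per_call) in WORKERS.items():
        ratio = tx_per_rx(rx, tx)
        print(f"{name}: {vectors_per_call:7.2f} vectors/call, "
              f"{ratio:.3f} tx per rx")
```

If the ACK assumption holds, a worker that handles packets one at a time ends up sending close to one ACK per segment, while a worker that handles large batches sends far fewer.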
View/Reply Online (#17385): https://lists.fd.io/g/vpp-dev/message/17385