Dear Olivier,

Hardware specifications
The router (Device Under Test, or DUT): DL360 G6, 16GB RAM, 1x Intel Xeon E5520@2.27GHz, 4 cores, hyperthreading disabled.
The generator/receiver: DL360 G5, 8GB RAM, 1x Intel Xeon E5410@2.33GHz, 4 cores, hyperthreading disabled.
Network interfaces on both systems are Chelsio T520-CR, 10Gbps SFP+ transceivers, LC-LC SM patch cords.
The interfaces are connected via an Extreme Networks x670 switch (for counter reference). The switch is configured with:

configure fdb agingtime 0
disable lldp ports 1,3,5,7
disable flow-control tx ports 1,3,5,7
disable flow-control rx ports 1,3,5,7
create vlan vlan2 tag 2
create vlan vlan3 tag 3
configure vlan "Default" del port 1,3,5,7
configure vlan2 add ports 1,3
configure vlan3 add ports 5,7

##### DUT (DL360 G6) configuration:

#cxl0 MAC 00:07:43:2f:29:b0
#cxl1 MAC 00:07:43:2f:29:b8
ifconfig cxl0 198.18.0.203/24
ifconfig cxl1 198.19.0.203/24

#to have the static ARP records created at boot, use
sysrc static_arp_pairs="generator receiver"
sysrc static_arp_generator="198.18.0.201 00:07:43:32:b1:61"
sysrc static_arp_receiver="198.19.0.201 00:07:43:32:b1:69"

#to create static ARP records on the fly (lost after restart)
arp -S 198.18.0.201 00:07:43:32:b1:61
arp -S 198.19.0.201 00:07:43:32:b1:69

#add a static route for 198.19.0.0/16
route add -net 198.19.0.0/16 198.19.0.201

##### generator/receiver (DL360 G5) configuration:

#cxl0 MAC 00:07:43:32:b1:60
#cxl1 MAC 00:07:43:32:b1:68
#vcxl0 MAC 00:07:43:32:b1:61
#vcxl1 MAC 00:07:43:32:b1:69

#In order to achieve packet generation above 1.5Mpps:
#remount / read-write
mount -uw /
#to enable line-speed pkt-gen, add in /boot/loader.conf.local:
hw.cxgbe.num_vis=2
#with hw.cxgbe.num_vis=1 only ~1.5Mpps can be generated with a Chelsio T5
#remount / read-only when finished editing the file
#reboot the system so that hw.cxgbe.num_vis=2 takes effect
#after the reboot, vcxl0 and vcxl1 appear as interfaces
ifconfig vcxl0 198.18.0.201/24
ifconfig vcxl1 198.19.0.201/24

##### start the test

#start the receiver part
pkt-gen -N -f rx -i vcxl1 -w 4

#start the generator part; use '-p 2' in order to achieve higher throughput from the processor
pkt-gen -N -f tx -i vcxl0 -n 1000000000 -4 -d 198.19.10.1:2000-198.19.10.100 -D 00:07:43:2f:29:b0 -s 198.18.10.1:2000-198.18.10.20 -S 00:07:43:32:b1:61 -w 4 -l 60 -U -p 2

#check for network card errors on the DUT and the generator/receiver
sysctl -n dev.t5nex.0.misc.tp_err_stats
netstat -hdw1 -I cxl<n>

##### Results:

The generator can generate about 5.5 Mpps. The router drops about 1.5 Mpps. The receiver gets about 4 Mpps.
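For reference against these numbers: pkt-gen is sending 60-byte packets here (-l 60), i.e. minimum-size 64-byte Ethernet frames once the 4-byte FCS is added, and each such frame occupies 84 bytes on the wire including the 8-byte preamble and 12-byte inter-frame gap. A quick back-of-the-envelope check (plain arithmetic, nothing hardware-specific):

#10GbE line rate at minimum frame size: 84 bytes * 8 = 672 bits per packet
echo "scale=2; 10^10 / (84*8) / 10^6" | bc
#-> 14.88, i.e. ~14.88 Mpps at 10Gbps, so ~5.5 Mpps is well below line rate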
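To make the runs repeatable, the two pkt-gen invocations above could also be wrapped in one small script on the generator/receiver box. A minimal sketch, with the flags copied verbatim from the commands above (the extra sleep is just a belt-and-braces settle delay on top of pkt-gen's own -w 4 start-up wait):

#!/bin/sh
#run both halves of the test from one shell on the DL360 G5
pkt-gen -N -f rx -i vcxl1 -w 4 &
rx_pid=$!
sleep 2
pkt-gen -N -f tx -i vcxl0 -n 1000000000 -4 \
    -d 198.19.10.1:2000-198.19.10.100 -D 00:07:43:2f:29:b0 \
    -s 198.18.10.1:2000-198.18.10.20 -S 00:07:43:32:b1:61 \
    -w 4 -l 60 -U -p 2
#the receiver runs until killed; stop it once the generator is done
kill $rx_pid 2>/dev/null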
One of the CPU cores on the router is saturated at 100%, with about 75% load evenly distributed among the other cores:

last pid: 34533;  load averages: 2.54, 2.26, 1.20   up 6+02:16:26  23:13:13
141 threads:   9 running, 109 sleeping, 23 waiting
CPU 0:  0.0% user,  0.0% nice,  0.0% system, 84.6% interrupt, 15.4% idle
CPU 1:  0.0% user,  0.0% nice,  0.0% system,  100% interrupt,  0.0% idle
CPU 2:  0.0% user,  0.0% nice,  0.0% system, 76.0% interrupt, 24.0% idle
CPU 3:  0.0% user,  0.0% nice,  0.0% system, 79.9% interrupt, 20.1% idle
Mem: 4396K Active, 21M Inact, 480M Wired, 208M Buf, 15G Free
Swap:

  PID USERNAME   PRI NICE   SIZE    RES STATE    C   TIME     CPU COMMAND
   11 root       -92    -      0   432K CPU1     1  27:24  99.83% intr{irq259: t5nex0:0a0}
   11 root       -92    -      0   432K CPU0     0  37:34  99.01% intr{irq262: t5nex0:0a3}
   11 root       -92    -      0   432K RUN      3  21:26  73.16% intr{irq260: t5nex0:0a1}
   11 root       -92    -      0   432K CPU2     2  19:07  65.92% intr{irq261: t5nex0:0a2}
   10 root       155 ki31      0    64K RUN      3 145.9H  22.63% idle{idle: cpu3}
   10 root       155 ki31      0    64K RUN      2 145.7H  19.56% idle{idle: cpu2}
   10 root       155 ki31      0    64K RUN      0 145.3H  19.14% idle{idle: cpu0}
   11 root      -100    -      0   432K WAIT     0  33:35   0.40% intr{irq20: hpet0 uhci0+}
28208 root        20    0    13M  2940K CPU2     2   0:00   0.04% top
   11 root       -60    -      0   432K WAIT     2   2:22   0.04% intr{swi4: clock (0)}
    0 root       -92    -      0   448K -        3   0:00   0.03% kernel{t5nex0 tq1}
   11 root       -92    -      0   432K WAIT     2   0:00   0.01% intr{irq263: t5nex0:1a0}
   11 root       -92    -      0   432K WAIT     2   0:00   0.01% intr{irq266: t5nex0:1a3}
   11 root       -92    -      0   432K WAIT     2   0:00   0.01% intr{irq264: t5nex0:1a1}
   11 root       -92    -      0   432K WAIT     2   0:00   0.01% intr{irq265: t5nex0:1a2}
18436 root        20    0    20M  6460K select   3   0:00   0.01% sshd
   21 root       -16    -      0    48K psleep   3   0:08   0.00% pagedaemon{dom0}
   18 root       -16    -      0    16K -        3   0:10   0.00% rand_harvestq
   24 root       -16    -      0   128K qsleep   2   0:07   0.00% bufdaemon{bufdaemon}
   19 root       -16    -      0    16K tzpoll   2   0:01   0.00% acpi_thermal
   11 root       -92    -      0   432K WAIT     3   0:02   0.00% intr{irq267: bce0}
   13 root       -72    -      0   480K -        3   0:03   0.00% usb{usbus4}

The generator can generate more than 10 Mpps, but the number of flows should stay at one or two for this hardware configuration:

pkt-gen -N -f tx -i vcxl0 -n 1000000000 -4 -d 198.19.10.1:2000-198.19.10.2 -D 00:07:43:2f:29:b0 -s 198.18.10.1:2000-198.18.10.10 -S 00:07:43:32:b1:61 -w 4 -l 60 -U -p 2

Regards,

Lyubomir

On Thu, 23 Jan 2020 at 09:39, Lyubomir Yotov <l.yo...@gmail.com> wrote:

> Dear Olivier and Junior,
>
> Thank you for your prompt responses.
>
> Olivier, I am testing a routing scenario and will try to implement your
> test case with a DUT and a switch with static MAC addresses (currently I
> have only two servers available). I will also try the TSO/LRO settings on
> the "sending" router as you suggested.
>
> Junior, thank you for the presentation. Unfortunately my Brazilian
> Portuguese is zero to none and the YouTube translation to English is not
> perfect, but I got the idea.
>
> I will reply when I have results.
>
> One more question - is it better to have 2x10Gbps in one lagg (for
> redundancy and a bit of capacity) towards the network, or to use separate
> 10Gbps links for the upstream and for the AS network? I have read
> somewhere that lagg could cause performance issues. So far I have used
> lagg on 1Gbps but never on 10Gbps.
>
> Regards,
>
> Lyubo
>
> On Wed, 22 Jan 2020 at 16:17, Junior Corazza <cora...@telic.com.br> wrote:
>
>> I got almost 20Gb/s combined.
>>
>> Below is a video of a presentation of mine showing the case.
>>
>> Dell R410, Intel X520-DA2:
>>
>> https://youtu.be/8DdtN_fj_uQ
>>
>> *From:* Lyubomir Yotov <l.yo...@gmail.com>
>> *Sent:* Wednesday, 22 January 2020 09:05
>> *To:* bsdrp-users@lists.sourceforge.net
>> *Subject:* [Bsdrp-users] 10Gbps interface performance with BSDRP
>>
>> Dear Olivier,
>>
>> All the best for the new 2020!
>>
>> In the last few weeks I did some tests with a colleague (a Linux guy)
>> on 10Gbps performance and BSDRP. The test was between two servers
>> connected through a switch. I used Chelsio T520-CR cards in two HP
>> DL360s with dual Xeon E5520@2.27GHz. I was a bit surprised that we
>> couldn't get more than ~3.8-4.0Gbps with iperf (iperf2 and iperf3,
>> single flow, multiple flows...), on both BSDRP v1.96 and stock FreeBSD
>> 12.0. We did the same tests (BSDRP v1.96 only) on two directly
>> connected Dell R610s with X5670@2.90GHz, with the same result.
>>
>> At the same time, with Proxmox I can get ~7Gbps, which seems a
>> reasonable result. When using the netmap userland tools I can see that
>> ~14Mpps are going to the switch, so it is not CPU bound IMHO.
>>
>> I searched for tunings on the web, but what I found was already set in
>> BSDRP. Any help would be appreciated, as I intend to use the cards in
>> production but the current performance is not satisfactory.
>>
>> Regards,
>>
>> Lyubo
_______________________________________________
Bsdrp-users mailing list
Bsdrp-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bsdrp-users