Re: 10G performance regression / difference cxl and ix RELENG11 vs HEAD
On 10/12/2018 12:52 PM, Navdeep Parhar wrote:
>
> The number of retries (the "Retr" column) should have been 0 in a
> controlled test like this.  Is this the default stack with all default
> parameters or have you tuned TCP and/or sockets in any way?

No tuning at all. After a reboot and one test, I am seeing a bunch of
overflows. I am going to netboot back to RELENG_11 to confirm.

sysctl -a dev.cxl.1.stats
dev.cxl.1.stats.rx_tls_octets: 0
dev.cxl.1.stats.rx_tls_records: 0
dev.cxl.1.stats.tx_tls_octets: 0
dev.cxl.1.stats.tx_tls_records: 0
dev.cxl.1.stats.rx_trunc3: 0
dev.cxl.1.stats.rx_trunc2: 12
dev.cxl.1.stats.rx_trunc1: 0
dev.cxl.1.stats.rx_trunc0: 0
dev.cxl.1.stats.rx_ovflow3: 0
dev.cxl.1.stats.rx_ovflow2: 58
dev.cxl.1.stats.rx_ovflow1: 0
dev.cxl.1.stats.rx_ovflow0: 0
dev.cxl.1.stats.rx_ppp7: 0
dev.cxl.1.stats.rx_ppp6: 0
dev.cxl.1.stats.rx_ppp5: 0
dev.cxl.1.stats.rx_ppp4: 0
dev.cxl.1.stats.rx_ppp3: 0
dev.cxl.1.stats.rx_ppp2: 0
dev.cxl.1.stats.rx_ppp1: 0
dev.cxl.1.stats.rx_ppp0: 0
dev.cxl.1.stats.rx_pause: 0
dev.cxl.1.stats.rx_frames_1519_max: 0
dev.cxl.1.stats.rx_frames_1024_1518: 6169625
dev.cxl.1.stats.rx_frames_512_1023: 473
dev.cxl.1.stats.rx_frames_256_511: 133
dev.cxl.1.stats.rx_frames_128_255: 150
dev.cxl.1.stats.rx_frames_65_127: 1015832
dev.cxl.1.stats.rx_frames_64: 4
dev.cxl.1.stats.rx_runt: 0
dev.cxl.1.stats.rx_symbol_err: 0
dev.cxl.1.stats.rx_len_err: 0
dev.cxl.1.stats.rx_fcs_err: 0
dev.cxl.1.stats.rx_jabber: 0
dev.cxl.1.stats.rx_too_long: 0
dev.cxl.1.stats.rx_ucast_frames: 7186215
dev.cxl.1.stats.rx_mcast_frames: 0
dev.cxl.1.stats.rx_bcast_frames: 2
dev.cxl.1.stats.rx_frames: 7186217
dev.cxl.1.stats.rx_octets: 9437149499
dev.cxl.1.stats.tx_ppp7: 0
dev.cxl.1.stats.tx_ppp6: 0
dev.cxl.1.stats.tx_ppp5: 0
dev.cxl.1.stats.tx_ppp4: 0
dev.cxl.1.stats.tx_ppp3: 0
dev.cxl.1.stats.tx_ppp2: 0
dev.cxl.1.stats.tx_ppp1: 0
dev.cxl.1.stats.tx_ppp0: 0
dev.cxl.1.stats.tx_pause: 222
dev.cxl.1.stats.tx_drop: 0
dev.cxl.1.stats.tx_frames_1519_max: 0
dev.cxl.1.stats.tx_frames_1024_1518: 5409152
dev.cxl.1.stats.tx_frames_512_1023: 11968
dev.cxl.1.stats.tx_frames_256_511: 162
dev.cxl.1.stats.tx_frames_128_255: 26
dev.cxl.1.stats.tx_frames_65_127: 3095205
dev.cxl.1.stats.tx_frames_64: 210
dev.cxl.1.stats.tx_error_frames: 0
dev.cxl.1.stats.tx_ucast_frames: 8516714
dev.cxl.1.stats.tx_mcast_frames: 0
dev.cxl.1.stats.tx_bcast_frames: 9
dev.cxl.1.stats.tx_frames: 8516945
dev.cxl.1.stats.tx_octets: 8434330861
dev.cxl.1.stats.tx_parse_error: 0

> Regards,
> Navdeep
>
>

--
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
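With this many counters, the handful that are nonzero (rx_ovflow2, rx_trunc2, tx_pause) are easy to miss. The following is a minimal Python sketch for filtering a saved copy of the sysctl output down to nonzero counters that look like drops or errors. The SUSPECT name fragments and the function name are assumptions chosen for illustration; they are not defined by the driver.

```python
import re

# Name fragments treated as "suspicious" in this sketch. This list is an
# assumption for illustration, not something the cxl driver defines.
SUSPECT = ("ovflow", "trunc", "pause", "drop", "err")


def nonzero_suspects(sysctl_text):
    """Return {counter: value} for nonzero counters whose leaf name matches SUSPECT."""
    found = {}
    for line in sysctl_text.splitlines():
        # Expect lines of the form "dev.cxl.1.stats.<name>: <integer>".
        m = re.match(r"\s*(dev\.\S+):\s+(\d+)\s*$", line)
        if not m:
            continue
        name, value = m.group(1), int(m.group(2))
        leaf = name.rsplit(".", 1)[-1]
        if value and any(fragment in leaf for fragment in SUSPECT):
            found[name] = value
    return found


# A few lines copied from the output above.
SAMPLE = """\
dev.cxl.1.stats.rx_trunc2: 12
dev.cxl.1.stats.rx_ovflow2: 58
dev.cxl.1.stats.rx_ovflow1: 0
dev.cxl.1.stats.rx_fcs_err: 0
dev.cxl.1.stats.tx_pause: 222
dev.cxl.1.stats.rx_frames: 7186217
"""

for counter, value in nonzero_suspects(SAMPLE).items():
    print(f"{counter}: {value}")
```

Running the same filter on output captured before and after a test, and on RELENG_11 versus HEAD, would show whether the overflow and pause counters only climb on HEAD.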
Re: 10G performance regression / difference cxl and ix RELENG11 vs HEAD
On 10/12/18 9:52 AM, Navdeep Parhar wrote:
> The number of retries (the "Retr" column) should have been 0 in a

retransmits, not retries.
Re: 10G performance regression / difference cxl and ix RELENG11 vs HEAD
On 10/12/18 8:37 AM, Mike Tancsa wrote:
> I was doing a quick iperf test with r339328 GENERIC-NODEBUG amd64, and
> noticed I can no longer saturate a 10G nic with iperf3.  I tried first
> with the ix adapter, but was not sure if it was the driver or not. I
> tried as well with a Chelsio and got similar numbers.
>
> # iperf3 -c 192.168.242.3
> Connecting to host 192.168.242.3, port 5201
> [  5] local 192.168.242.2 port 50474 connected to 192.168.242.3 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec   997 MBytes  8.36 Gbits/sec  717    175 KBytes
> [  5]   1.00-2.00   sec   975 MBytes  8.18 Gbits/sec  668   41.1 KBytes
> [  5]   2.00-3.00   sec   880 MBytes  7.38 Gbits/sec  846   25.6 KBytes
> [  5]   3.00-4.00   sec   523 MBytes  4.39 Gbits/sec  802   59.8 KBytes
> [  5]   4.00-5.00   sec   520 MBytes  4.36 Gbits/sec  882   48.4 KBytes
> [  5]   5.00-6.00   sec   543 MBytes  4.55 Gbits/sec  838   56.9 KBytes
> [  5]   6.00-7.00   sec   556 MBytes  4.66 Gbits/sec  850   11.4 KBytes
> [  5]   7.00-8.00   sec   538 MBytes  4.51 Gbits/sec  795   39.9 KBytes
> [  5]   8.00-9.00   sec   540 MBytes  4.53 Gbits/sec  853   57.0 KBytes
> [  5]   9.00-10.00  sec   503 MBytes  4.22 Gbits/sec  814   59.8 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  6.42 GBytes  5.52 Gbits/sec  8065             sender
> [  5]   0.00-10.00  sec  6.42 GBytes  5.52 Gbits/sec                  receiver

The number of retries (the "Retr" column) should have been 0 in a
controlled test like this.  Is this the default stack with all default
parameters or have you tuned TCP and/or sockets in any way?

Regards,
Navdeep
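For comparing runs between RELENG_11 and HEAD, it can help to reduce each iperf3 report to two numbers: total retransmits and mean per-interval bitrate. The sketch below is one possible way to do that from the plain-text report, not a definitive parser. The function name and the regular expression are assumptions made for this example, and it only handles intervals reported in Gbits/sec.

```python
import re

# Matches a per-interval iperf3 line that carries both Retr and Cwnd columns.
# The summary lines have no Cwnd value, so they are skipped on purpose.
# Assumption: bitrate is always printed in Gbits/sec, as in the report above.
INTERVAL = re.compile(
    r"\[\s*\d+\]\s+[\d.]+-[\d.]+\s+sec\s+[\d.]+\s+\w?Bytes\s+"
    r"([\d.]+)\s+Gbits/sec\s+(\d+)\s+[\d.]+\s+\w?Bytes"
)


def summarize(report):
    """Return (total_retransmits, mean_interval_gbits) for an iperf3 text report."""
    rows = [m for m in (INTERVAL.search(line) for line in report.splitlines()) if m]
    if not rows:
        return 0, 0.0
    retr = sum(int(m.group(2)) for m in rows)
    mean = sum(float(m.group(1)) for m in rows) / len(rows)
    return retr, mean


# The report quoted above, with the quote markers removed.
REPORT = """\
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   997 MBytes  8.36 Gbits/sec  717    175 KBytes
[  5]   1.00-2.00   sec   975 MBytes  8.18 Gbits/sec  668   41.1 KBytes
[  5]   2.00-3.00   sec   880 MBytes  7.38 Gbits/sec  846   25.6 KBytes
[  5]   3.00-4.00   sec   523 MBytes  4.39 Gbits/sec  802   59.8 KBytes
[  5]   4.00-5.00   sec   520 MBytes  4.36 Gbits/sec  882   48.4 KBytes
[  5]   5.00-6.00   sec   543 MBytes  4.55 Gbits/sec  838   56.9 KBytes
[  5]   6.00-7.00   sec   556 MBytes  4.66 Gbits/sec  850   11.4 KBytes
[  5]   7.00-8.00   sec   538 MBytes  4.51 Gbits/sec  795   39.9 KBytes
[  5]   8.00-9.00   sec   540 MBytes  4.53 Gbits/sec  853   57.0 KBytes
[  5]   9.00-10.00  sec   503 MBytes  4.22 Gbits/sec  814   59.8 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  6.42 GBytes  5.52 Gbits/sec  8065             sender
[  5]   0.00-10.00  sec  6.42 GBytes  5.52 Gbits/sec                  receiver
"""

retr, mean = summarize(REPORT)
print(f"retransmits={retr} mean={mean:.2f} Gbits/sec")
# prints: retransmits=8065 mean=5.51 Gbits/sec
```

The per-interval retransmits sum to the 8065 shown on the sender summary line, and the mean of the interval bitrates is close to the reported 5.52 Gbits/sec average.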
10G performance regression / difference cxl and ix RELENG11 vs HEAD
I was doing a quick iperf test with r339328 GENERIC-NODEBUG amd64, and
noticed I can no longer saturate a 10G nic with iperf3.  I tried first
with the ix adapter, but was not sure if it was the driver or not. I
tried as well with a Chelsio and got similar numbers.

# iperf3 -c 192.168.242.3
Connecting to host 192.168.242.3, port 5201
[  5] local 192.168.242.2 port 50474 connected to 192.168.242.3 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   997 MBytes  8.36 Gbits/sec  717    175 KBytes
[  5]   1.00-2.00   sec   975 MBytes  8.18 Gbits/sec  668   41.1 KBytes
[  5]   2.00-3.00   sec   880 MBytes  7.38 Gbits/sec  846   25.6 KBytes
[  5]   3.00-4.00   sec   523 MBytes  4.39 Gbits/sec  802   59.8 KBytes
[  5]   4.00-5.00   sec   520 MBytes  4.36 Gbits/sec  882   48.4 KBytes
[  5]   5.00-6.00   sec   543 MBytes  4.55 Gbits/sec  838   56.9 KBytes
[  5]   6.00-7.00   sec   556 MBytes  4.66 Gbits/sec  850   11.4 KBytes
[  5]   7.00-8.00   sec   538 MBytes  4.51 Gbits/sec  795   39.9 KBytes
[  5]   8.00-9.00   sec   540 MBytes  4.53 Gbits/sec  853   57.0 KBytes
[  5]   9.00-10.00  sec   503 MBytes  4.22 Gbits/sec  814   59.8 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  6.42 GBytes  5.52 Gbits/sec  8065             sender
[  5]   0.00-10.00  sec  6.42 GBytes  5.52 Gbits/sec                  receiver

If I do a parallel transfer I get closer to the max. However, on
RELENG11 I could do pretty close to a full 10G no problem with just a
single stream.
This is with a GENERIC-NODEBUG kernel using a couple of T520s back to back

t5iov1@pci0:9:0:1: class=0x02 card=0x1425 chip=0x50011425 rev=0x00 hdr=0x00
    vendor     = 'Chelsio Communications Inc'
    device     = 'T520-CR Unified Wire Ethernet Controller'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 64, base 0xf668, size 524288, enabled
    bar   [18] = type Memory, range 64, base 0xf660, size 524288, enabled
    bar   [20] = type Memory, range 64, base 0xf688a000, size 8192, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 8 messages, 64 bit, vector masks
    cap 10[70] = PCI-Express 2 endpoint max data 512(2048) FLR RO NS link x8(x8) speed 8.0(8.0)

--
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada