Re: 10G performance regression / difference cxl and ix RELENG11 vs HEAD

2018-10-12 Thread Mike Tancsa
On 10/12/2018 12:52 PM, Navdeep Parhar wrote:
>
> The number of retries (the "Retr" column) should have been 0 in a
> controlled test like this.  Is this the default stack with all default
> parameters or have you tuned TCP and/or sockets in any way?

No tuning at all.  After a reboot and one test, I am seeing a bunch of
overflows. I am going to netboot back to RELENG_11 to confirm.

# sysctl -a dev.cxl.1.stats
dev.cxl.1.stats.rx_tls_octets: 0
dev.cxl.1.stats.rx_tls_records: 0
dev.cxl.1.stats.tx_tls_octets: 0
dev.cxl.1.stats.tx_tls_records: 0
dev.cxl.1.stats.rx_trunc3: 0
dev.cxl.1.stats.rx_trunc2: 12
dev.cxl.1.stats.rx_trunc1: 0
dev.cxl.1.stats.rx_trunc0: 0
dev.cxl.1.stats.rx_ovflow3: 0
dev.cxl.1.stats.rx_ovflow2: 58
dev.cxl.1.stats.rx_ovflow1: 0
dev.cxl.1.stats.rx_ovflow0: 0
dev.cxl.1.stats.rx_ppp7: 0
dev.cxl.1.stats.rx_ppp6: 0
dev.cxl.1.stats.rx_ppp5: 0
dev.cxl.1.stats.rx_ppp4: 0
dev.cxl.1.stats.rx_ppp3: 0
dev.cxl.1.stats.rx_ppp2: 0
dev.cxl.1.stats.rx_ppp1: 0
dev.cxl.1.stats.rx_ppp0: 0
dev.cxl.1.stats.rx_pause: 0
dev.cxl.1.stats.rx_frames_1519_max: 0
dev.cxl.1.stats.rx_frames_1024_1518: 6169625
dev.cxl.1.stats.rx_frames_512_1023: 473
dev.cxl.1.stats.rx_frames_256_511: 133
dev.cxl.1.stats.rx_frames_128_255: 150
dev.cxl.1.stats.rx_frames_65_127: 1015832
dev.cxl.1.stats.rx_frames_64: 4
dev.cxl.1.stats.rx_runt: 0
dev.cxl.1.stats.rx_symbol_err: 0
dev.cxl.1.stats.rx_len_err: 0
dev.cxl.1.stats.rx_fcs_err: 0
dev.cxl.1.stats.rx_jabber: 0
dev.cxl.1.stats.rx_too_long: 0
dev.cxl.1.stats.rx_ucast_frames: 7186215
dev.cxl.1.stats.rx_mcast_frames: 0
dev.cxl.1.stats.rx_bcast_frames: 2
dev.cxl.1.stats.rx_frames: 7186217
dev.cxl.1.stats.rx_octets: 9437149499
dev.cxl.1.stats.tx_ppp7: 0
dev.cxl.1.stats.tx_ppp6: 0
dev.cxl.1.stats.tx_ppp5: 0
dev.cxl.1.stats.tx_ppp4: 0
dev.cxl.1.stats.tx_ppp3: 0
dev.cxl.1.stats.tx_ppp2: 0
dev.cxl.1.stats.tx_ppp1: 0
dev.cxl.1.stats.tx_ppp0: 0
dev.cxl.1.stats.tx_pause: 222
dev.cxl.1.stats.tx_drop: 0
dev.cxl.1.stats.tx_frames_1519_max: 0
dev.cxl.1.stats.tx_frames_1024_1518: 5409152
dev.cxl.1.stats.tx_frames_512_1023: 11968
dev.cxl.1.stats.tx_frames_256_511: 162
dev.cxl.1.stats.tx_frames_128_255: 26
dev.cxl.1.stats.tx_frames_65_127: 3095205
dev.cxl.1.stats.tx_frames_64: 210
dev.cxl.1.stats.tx_error_frames: 0
dev.cxl.1.stats.tx_ucast_frames: 8516714
dev.cxl.1.stats.tx_mcast_frames: 0
dev.cxl.1.stats.tx_bcast_frames: 9
dev.cxl.1.stats.tx_frames: 8516945
dev.cxl.1.stats.tx_octets: 8434330861
dev.cxl.1.stats.tx_parse_error: 0
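
With this many counters it is easy to miss the few that are nonzero. A minimal sketch, assuming Python 3 is available, of filtering a dump like the one above down to the counters that usually indicate trouble. The list of name fragments in SUSPECT is an assumption chosen for illustration and does not come from the cxgbe driver; the sample text is a subset of the output above.

```python
import re

# Name fragments treated as signs of trouble. This selection is an
# assumption for illustration, not something defined by the driver.
SUSPECT = ("ovflow", "trunc", "pause", "drop", "err")

# Matches lines such as "dev.cxl.1.stats.rx_ovflow2: 58".
LINE = re.compile(r"\s*dev\.\w+\.\d+\.stats\.(\w+):\s*(\d+)\s*$")


def nonzero_suspects(sysctl_text):
    """Return {counter: value} for suspect counters with a nonzero value."""
    found = {}
    for line in sysctl_text.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        name, value = m.group(1), int(m.group(2))
        if value and any(tag in name for tag in SUSPECT):
            found[name] = value
    return found


# Subset of the sysctl output quoted in this message.
SAMPLE = """\
dev.cxl.1.stats.rx_trunc2: 12
dev.cxl.1.stats.rx_ovflow2: 58
dev.cxl.1.stats.rx_ovflow1: 0
dev.cxl.1.stats.rx_pause: 0
dev.cxl.1.stats.tx_pause: 222
dev.cxl.1.stats.tx_drop: 0
dev.cxl.1.stats.rx_frames: 7186217
"""

if __name__ == "__main__":
    for name, value in nonzero_suspects(SAMPLE).items():
        print(f"{name}: {value}")
```

On the full dump above this would leave only rx_trunc2, rx_ovflow2 and tx_pause.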

> Regards,
> Navdeep
>
>
>

-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: 10G performance regression / difference cxl and ix RELENG11 vs HEAD

2018-10-12 Thread Navdeep Parhar
On 10/12/18 9:52 AM, Navdeep Parhar wrote:
> The number of retries (the "Retr" column) should have been 0 in a

retransmits, not retries.


Re: 10G performance regression / difference cxl and ix RELENG11 vs HEAD

2018-10-12 Thread Navdeep Parhar
On 10/12/18 8:37 AM, Mike Tancsa wrote:
> I was doing a quick iperf test with r339328 GENERIC-NODEBUG amd64, and
> noticed I can no longer saturate a 10G NIC with iperf3.  I tried first
> with the ix adapter, but was not sure if it was the driver or not. I
> tried as well with a Chelsio and got similar numbers.
> 
> # iperf3 -c 192.168.242.3
> Connecting to host 192.168.242.3, port 5201
> [  5] local 192.168.242.2 port 50474 connected to 192.168.242.3 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec   997 MBytes  8.36 Gbits/sec  717    175 KBytes
> [  5]   1.00-2.00   sec   975 MBytes  8.18 Gbits/sec  668   41.1 KBytes
> [  5]   2.00-3.00   sec   880 MBytes  7.38 Gbits/sec  846   25.6 KBytes
> [  5]   3.00-4.00   sec   523 MBytes  4.39 Gbits/sec  802   59.8 KBytes
> [  5]   4.00-5.00   sec   520 MBytes  4.36 Gbits/sec  882   48.4 KBytes
> [  5]   5.00-6.00   sec   543 MBytes  4.55 Gbits/sec  838   56.9 KBytes
> [  5]   6.00-7.00   sec   556 MBytes  4.66 Gbits/sec  850   11.4 KBytes
> [  5]   7.00-8.00   sec   538 MBytes  4.51 Gbits/sec  795   39.9 KBytes
> [  5]   8.00-9.00   sec   540 MBytes  4.53 Gbits/sec  853   57.0 KBytes
> [  5]   9.00-10.00  sec   503 MBytes  4.22 Gbits/sec  814   59.8 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  6.42 GBytes  5.52 Gbits/sec  8065             sender
> [  5]   0.00-10.00  sec  6.42 GBytes  5.52 Gbits/sec                  receiver

The number of retries (the "Retr" column) should have been 0 in a
controlled test like this.  Is this the default stack with all default
parameters or have you tuned TCP and/or sockets in any way?

Regards,
Navdeep



10G performance regression / difference cxl and ix RELENG11 vs HEAD

2018-10-12 Thread Mike Tancsa
I was doing a quick iperf test with r339328 GENERIC-NODEBUG amd64, and
noticed I can no longer saturate a 10G NIC with iperf3.  I tried first
with the ix adapter, but was not sure if it was the driver or not. I
tried as well with a Chelsio and got similar numbers.

# iperf3 -c 192.168.242.3
Connecting to host 192.168.242.3, port 5201
[  5] local 192.168.242.2 port 50474 connected to 192.168.242.3 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   997 MBytes  8.36 Gbits/sec  717    175 KBytes
[  5]   1.00-2.00   sec   975 MBytes  8.18 Gbits/sec  668   41.1 KBytes
[  5]   2.00-3.00   sec   880 MBytes  7.38 Gbits/sec  846   25.6 KBytes
[  5]   3.00-4.00   sec   523 MBytes  4.39 Gbits/sec  802   59.8 KBytes
[  5]   4.00-5.00   sec   520 MBytes  4.36 Gbits/sec  882   48.4 KBytes
[  5]   5.00-6.00   sec   543 MBytes  4.55 Gbits/sec  838   56.9 KBytes
[  5]   6.00-7.00   sec   556 MBytes  4.66 Gbits/sec  850   11.4 KBytes
[  5]   7.00-8.00   sec   538 MBytes  4.51 Gbits/sec  795   39.9 KBytes
[  5]   8.00-9.00   sec   540 MBytes  4.53 Gbits/sec  853   57.0 KBytes
[  5]   9.00-10.00  sec   503 MBytes  4.22 Gbits/sec  814   59.8 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  6.42 GBytes  5.52 Gbits/sec  8065             sender
[  5]   0.00-10.00  sec  6.42 GBytes  5.52 Gbits/sec                  receiver
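
For anyone who wants to compare runs between RELENG_11 and HEAD without eyeballing the table, here is a rough sketch, assuming Python 3, that totals the Retr column and averages the per-interval bitrate. The interval rows are copied from the run above; the regular expression assumes the MBytes / Gbits/sec / KBytes units seen in this particular output and would need adjusting for other units.

```python
import re

# Per-interval rows from the iperf3 run above, one row per line.
INTERVALS = """\
[  5]   0.00-1.00   sec   997 MBytes  8.36 Gbits/sec  717    175 KBytes
[  5]   1.00-2.00   sec   975 MBytes  8.18 Gbits/sec  668   41.1 KBytes
[  5]   2.00-3.00   sec   880 MBytes  7.38 Gbits/sec  846   25.6 KBytes
[  5]   3.00-4.00   sec   523 MBytes  4.39 Gbits/sec  802   59.8 KBytes
[  5]   4.00-5.00   sec   520 MBytes  4.36 Gbits/sec  882   48.4 KBytes
[  5]   5.00-6.00   sec   543 MBytes  4.55 Gbits/sec  838   56.9 KBytes
[  5]   6.00-7.00   sec   556 MBytes  4.66 Gbits/sec  850   11.4 KBytes
[  5]   7.00-8.00   sec   538 MBytes  4.51 Gbits/sec  795   39.9 KBytes
[  5]   8.00-9.00   sec   540 MBytes  4.53 Gbits/sec  853   57.0 KBytes
[  5]   9.00-10.00  sec   503 MBytes  4.22 Gbits/sec  814   59.8 KBytes
"""

# Captures: transfer (MBytes), bitrate (Gbits/sec), retransmits, cwnd (KBytes).
ROW = re.compile(
    r"\[\s*\d+\]\s+[\d.]+-[\d.]+\s+sec\s+"
    r"([\d.]+) MBytes\s+([\d.]+) Gbits/sec\s+(\d+)\s+([\d.]+) KBytes"
)


def summarize(text):
    """Return (total retransmits, mean bitrate in Gbits/sec) over all rows."""
    rows = [m for m in (ROW.search(line) for line in text.splitlines()) if m]
    if not rows:
        raise ValueError("no iperf3 interval rows found")
    total_retr = sum(int(m.group(3)) for m in rows)
    mean_gbits = sum(float(m.group(2)) for m in rows) / len(rows)
    return total_retr, mean_gbits


if __name__ == "__main__":
    retr, gbits = summarize(INTERVALS)
    print(f"retransmits: {retr}, mean bitrate: {gbits:.2f} Gbits/sec")
```

The per-interval retransmits add up to 8065, the same figure as the sender summary line, and the mean of the interval bitrates is about 5.51 Gbits/sec, close to the 5.52 reported.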


If I run parallel streams I get closer to line rate.  However, on
RELENG_11 I could get pretty close to a full 10G with just a single
stream, no problem. This is with a GENERIC-NODEBUG kernel using a couple
of T520s back to back.


t5iov1@pci0:9:0:1:  class=0x02 card=0x1425 chip=0x50011425 rev=0x00 hdr=0x00
    vendor = 'Chelsio Communications Inc'
    device = 'T520-CR Unified Wire Ethernet Controller'
    class  = network
    subclass   = ethernet
    bar   [10] = type Memory, range 64, base 0xf668, size 524288, enabled
    bar   [18] = type Memory, range 64, base 0xf660, size 524288, enabled
    bar   [20] = type Memory, range 64, base 0xf688a000, size 8192, enabled
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 8 messages, 64 bit, vector masks
    cap 10[70] = PCI-Express 2 endpoint max data 512(2048) FLR RO NS
 link x8(x8) speed 8.0(8.0)
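
The "link x8(x8) speed 8.0(8.0)" line says the card negotiated its full width and speed. A rough sketch of the raw bandwidth that implies, assuming the standard PCIe line encodings (8b/10b at 2.5 and 5.0 GT/s, 128b/130b at 8.0 GT/s and above) and ignoring TLP and flow-control overhead; pcie_gbits is a hypothetical helper name, not part of any tool used in this thread.

```python
def pcie_gbits(lanes, gt_per_s):
    """Raw per-direction PCIe bandwidth in Gbit/s after line encoding.

    Assumes 8b/10b encoding below 8.0 GT/s and 128b/130b at 8.0 GT/s
    and above. Packet (TLP) and flow-control overhead is ignored, so
    usable throughput is somewhat lower than the value returned.
    """
    efficiency = 128 / 130 if gt_per_s >= 8.0 else 8 / 10
    return lanes * gt_per_s * efficiency


if __name__ == "__main__":
    # Values taken from the pciconf line: link x8(x8) speed 8.0(8.0)
    print(f"{pcie_gbits(8, 8.0):.1f} Gbit/s per direction")
```

That works out to roughly 63 Gbit/s per direction, several times what a single 10G port needs, so under these assumptions the PCIe link itself is unlikely to be the limit here.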



-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   
