I've done some tests between a couple of machines in my lab, using iperf to measure TCP performance, to try to replicate the sort of tests that Thomas was running.
(Some may say that this is more appropriate for network-discuss, but I think it is useful to post it here in storage-discuss, as I believe that before trying to troubleshoot iSCSI performance, it's useful to check that the underlying TCP performance is OK.)

For both machines, I used Intel PCIe network cards, directly connected via a cross-over cable. One machine ran OpenSolaris 2009.06, updated to snv_118. The other ran Fedora 11 Linux. The OpenSolaris machine has an Intel Core 2 Duo processor, and the Fedora machine has an Intel Pentium D processor. Both machines are 'workstation' class rather than 'server' class. I left the TCP tunables at their default values.

:--| First, the view from the OpenSolaris side |--:

$ uname -a
SunOS opensolaris 5.11 snv_118 i86pc i386 i86pc Solaris

$ ndd /dev/tcp tcp_max_buf
1048576
$ ndd /dev/tcp tcp_cwnd_max
1048576
$ ndd /dev/tcp tcp_xmit_hiwat
49152
$ ndd /dev/tcp tcp_recv_hiwat
49152
$ ndd /dev/tcp tcp_wscale_always
1
$ ndd /dev/tcp tcp_tstamp_if_wscale
1
$ ndd /dev/tcp tcp_sack_permitted
2
$ ndd /dev/tcp tcp_tstamp_always
0
$ ndd /dev/tcp tcp_naglim_def
4095

$ dladm show-phys e1000g0
LINK     MEDIA     STATE  SPEED  DUPLEX  DEVICE
e1000g0  Ethernet  up     1000   full    e1000g0

$ ifconfig e1000g0
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 10.0.0.1 netmask ff000000 broadcast 10.255.255.255

$ iperf -c 10.0.0.2 -t 2
------------------------------------------------------------
Client connecting to 10.0.0.2, TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.0.1 port 60713 connected with 10.0.0.2 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 2.0 sec 227 MBytes 950 Mbits/sec

$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
[ 4] local 10.0.0.1 port 5001 connected with 10.0.0.2 port 52405
[ ID] Interval Transfer Bandwidth
[ 4] 0.0- 2.0 sec 192 MBytes 804 Mbits/sec
^C

$ iperf -c 10.0.0.2 -t 2 -l 512
------------------------------------------------------------
Client connecting to 10.0.0.2, TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.0.1 port 33351 connected with 10.0.0.2 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 2.0 sec 211 MBytes 884 Mbits/sec

$ iperf -s -l 512
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
[ 4] local 10.0.0.1 port 5001 connected with 10.0.0.2 port 52406
[ ID] Interval Transfer Bandwidth
[ 4] 0.0- 2.0 sec 86.0 MBytes 361 Mbits/sec
^C

:--| Second, the view from the Fedora side |--:

# uname -a
Linux fed11 2.6.29.4-167.fc11.i686.PAE #1 SMP Wed May 27 17:28:22 EDT 2009 i686 i686 i386 GNU/Linux
# cat /etc/redhat-release
Fedora release 11 (Leonidas)

# cat /proc/sys/net/ipv4/tcp_rmem
4096 87380 3301376
# cat /proc/sys/net/ipv4/tcp_wmem
4096 16384 3301376
# cat /proc/sys/net/core/rmem_max
131071
# cat /proc/sys/net/core/wmem_max
131071
# cat /proc/sys/net/ipv4/tcp_moderate_rcvbuf
1
# cat /proc/sys/net/ipv4/tcp_timestamps
1
# cat /proc/sys/net/ipv4/tcp_window_scaling
1
# cat /proc/sys/net/ipv4/tcp_sack
1

# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.0.0.2 port 5001 connected with 10.0.0.1 port 60713
[ ID] Interval Transfer Bandwidth
[ 4] 0.0- 2.0 sec 227 MBytes 946 Mbits/sec
^C

# iperf -c 10.0.0.1 -t 2
------------------------------------------------------------
Client connecting to 10.0.0.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.0.2 port 52405 connected with 10.0.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 2.0 sec 192 MBytes 805 Mbits/sec

# iperf -s -l 512
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.0.0.2 port 5001 connected with 10.0.0.1 port 33351
[ ID] Interval Transfer Bandwidth
[ 4] 0.0- 2.0 sec 211 MBytes 882 Mbits/sec
^C

# iperf -c 10.0.0.1 -t 2 -l 512
------------------------------------------------------------
Client connecting to 10.0.0.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.0.2 port 52406 connected with 10.0.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 2.0 sec 86.0 MBytes 361 Mbits/sec

:---------------------------:

I'm not entirely sure what Thomas is trying to prove by using the '-l 512' option. Maybe it is meant to emulate the iSCSI block size, but I'm not sure this is appropriate. I think that how iperf uses the option could depend on the operating system being used. I may investigate further if I can spare the time.

Anyway, hopefully the above results give a baseline. With default settings I see about 950 Mbits/sec from OpenSolaris to Fedora and about 805 Mbits/sec from Fedora to OpenSolaris. With '-l 512' those figures drop to about 884 Mbits/sec and 361 Mbits/sec respectively.

BTW, you can use tcpdump to show the first two packets of the TCP handshake, the SYN and the SYN-ACK, which will show which TCP options are being negotiated:

# tcpdump -i eth1 -nn 'tcp[13] & 2 =2'

Here is OpenSolaris connecting to Fedora:

22:52:14.110400 IP 10.0.0.1.52123 > 10.0.0.2.5001: S 1556807772:1556807772(0) win 49640 [mss 1460,nop,wscale 0,nop,nop,sackOK]
22:52:14.110434 IP 10.0.0.2.5001 > 10.0.0.1.52123: S 2639313353:2639313353(0) ack 1556807773 win 5840 [mss 1460,nop,nop,sackOK,nop,wscale 6]

So OpenSolaris sets its window to 49640, with a scale factor of 2^0 = 1, and SACK (Selective Acknowledgment) is enabled. Fedora responds with a window size of 5840, but with a scale factor of 2^6 = 64.

And, with Fedora connecting to OpenSolaris:

23:01:02.450807 IP 10.0.0.2.44164 > 10.0.0.1.5001: S 2346183985:2346183985(0) win 5840 [mss 1460,sackOK,timestamp 97064562 0,nop,wscale 6]
23:01:02.454443 IP 10.0.0.1.5001 > 10.0.0.2.44164: S 1686911719:1686911719(0) ack 2346183986 win 49232 [nop,nop,timestamp 9478697 97064562,mss 1460,nop,wscale 0,nop,nop,sackOK]

So in addition, Fedora is requesting the timestamp option, which OpenSolaris agrees to use.

Thanks
Nigel Smith
http://www.nwsmith.net/
http://nwsmith.blogspot.com/
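
As a footnote to the handshake traces above, here is a minimal Python sketch of what the tcpdump filter and the 'wscale' option mean. It is only an illustration, not anything iperf or tcpdump runs: the function names matches_syn_filter and scaled_window are made up for this example. It assumes the standard TCP header layout, where byte 13 holds the flag bits and SYN is bit value 2, and the usual window-scaling rule that the window field in the SYN and SYN-ACK themselves is not scaled, so the shift only applies to later segments.

```python
# TCP flag bits as they appear in byte 13 of the TCP header.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def matches_syn_filter(flags_byte: int) -> bool:
    """Mirror the tcpdump expression 'tcp[13] & 2 = 2' (hypothetical helper)."""
    return (flags_byte & SYN) == SYN


def scaled_window(raw_window: int, wscale: int) -> int:
    """Bytes advertised by a non-SYN segment, given the sender's wscale shift.

    The window field in a SYN or SYN-ACK is taken as-is; the shift applies
    only to segments sent after the handshake (hypothetical helper).
    """
    return raw_window << wscale


if __name__ == "__main__":
    # SYN and SYN-ACK both match the filter; a plain ACK does not.
    for name, flags in (("SYN", SYN), ("SYN-ACK", SYN | ACK), ("ACK", ACK)):
        print(name, matches_syn_filter(flags))

    # Largest window each side could advertise after the handshake:
    # the 16-bit field tops out at 65535 before scaling.
    print("wscale 0 max:", scaled_window(65535, 0))  # 65535
    print("wscale 6 max:", scaled_window(65535, 6))  # 4194240
```

So with wscale 0 the OpenSolaris side can never advertise more than 65535 bytes on these connections, while the Fedora side could advertise up to roughly 4 MB if its buffers grow that far.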