I've run some tests between a couple of machines in my lab,
using iperf to measure TCP performance, to try to replicate
the sort of tests that Thomas was running.

(Some may say that this is more appropriate for network-discuss,
but I think it is useful to post it here in storage-discuss,
as I believe that before trying to troubleshoot iSCSI performance,
it's useful to check that the underlying TCP performance is OK.)

Both machines used Intel PCIe network cards,
directly connected via a cross-over cable.
One machine ran OpenSolaris 2009.06, updated to snv_118.
The other ran Fedora 11 Linux.
The OpenSolaris machine has an Intel Core 2 Duo processor,
and the Fedora machine has an Intel Pentium D processor.
Both machines are 'workstation' class rather than 'server' class.
I left the TCP tunables at their default values.

:--| First, the view from the OpenSolaris side |--:

$ uname -a
SunOS opensolaris 5.11 snv_118 i86pc i386 i86pc Solaris
$ ndd /dev/tcp tcp_max_buf
1048576
$ ndd /dev/tcp tcp_cwnd_max
1048576
$ ndd /dev/tcp tcp_xmit_hiwat
49152
$ ndd /dev/tcp tcp_recv_hiwat
49152
$ ndd /dev/tcp tcp_wscale_always
1
$ ndd /dev/tcp tcp_tstamp_if_wscale
1
$ ndd /dev/tcp tcp_sack_permitted
2
$ ndd /dev/tcp tcp_tstamp_always
0
$ ndd /dev/tcp tcp_naglim_def
4095
$ dladm show-phys e1000g0
LINK         MEDIA                STATE      SPEED  DUPLEX    DEVICE
e1000g0      Ethernet             up         1000   full      e1000g0

$ ifconfig e1000g0
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
       inet 10.0.0.1 netmask ff000000 broadcast 10.255.255.255

$ iperf -c 10.0.0.2 -t 2
------------------------------------------------------------
Client connecting to 10.0.0.2, TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
[  3] local 10.0.0.1 port 60713 connected with 10.0.0.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 2.0 sec    227 MBytes    950 Mbits/sec
$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
[  4] local 10.0.0.1 port 5001 connected with 10.0.0.2 port 52405
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 2.0 sec    192 MBytes    804 Mbits/sec
^C
$ iperf -c 10.0.0.2 -t 2 -l 512
------------------------------------------------------------
Client connecting to 10.0.0.2, TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
[  3] local 10.0.0.1 port 33351 connected with 10.0.0.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 2.0 sec    211 MBytes    884 Mbits/sec
$ iperf -s -l 512
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 48.0 KByte (default)
------------------------------------------------------------
[  4] local 10.0.0.1 port 5001 connected with 10.0.0.2 port 52406
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 2.0 sec  86.0 MBytes    361 Mbits/sec
^C

:--| Second, the view from the Fedora side |--:

# uname -a
Linux fed11 2.6.29.4-167.fc11.i686.PAE #1 SMP Wed May 27 17:28:22 EDT 2009 i686 i686 i386 GNU/Linux
# cat /etc/redhat-release
Fedora release 11 (Leonidas)
# cat /proc/sys/net/ipv4/tcp_rmem
4096    87380   3301376
# cat /proc/sys/net/ipv4/tcp_wmem
4096    16384   3301376
# cat /proc/sys/net/core/rmem_max
131071
# cat /proc/sys/net/core/wmem_max
131071
# cat /proc/sys/net/ipv4/tcp_moderate_rcvbuf
1
# cat /proc/sys/net/ipv4/tcp_timestamps
1
# cat /proc/sys/net/ipv4/tcp_window_scaling
1
# cat /proc/sys/net/ipv4/tcp_sack
1
# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.0.0.2 port 5001 connected with 10.0.0.1 port 60713
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 2.0 sec    227 MBytes    946 Mbits/sec
^C
# iperf -c 10.0.0.1 -t 2
------------------------------------------------------------
Client connecting to 10.0.0.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 10.0.0.2 port 52405 connected with 10.0.0.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 2.0 sec    192 MBytes    805 Mbits/sec
# iperf -s -l 512
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.0.0.2 port 5001 connected with 10.0.0.1 port 33351
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 2.0 sec    211 MBytes    882 Mbits/sec
^C
# iperf -c 10.0.0.1 -t 2 -l 512
------------------------------------------------------------
Client connecting to 10.0.0.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 10.0.0.2 port 52406 connected with 10.0.0.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 2.0 sec  86.0 MBytes    361 Mbits/sec

:---------------------------:

I'm not entirely sure what Thomas is trying to prove by using the '-l 512'
option.
Maybe it is meant to emulate the iSCSI block size, but I'm not sure this is appropriate.
I think that how iperf uses the option could depend on the
operating system being used.
I may investigate further if I can spare the time.
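
One way to reason about '-l 512' is to count how many application-level
reads or writes per second it implies. The sketch below does that arithmetic
for bandwidth figures taken from the runs above. It assumes that '-l' sets
the size of each read()/write() buffer and that the default is 8 KB, as
documented for iperf 2; it says nothing about how many packets end up on
the wire, since TCP is free to coalesce small writes.

```python
def calls_per_second(bandwidth_mbits, buffer_bytes):
    """Application-level read/write calls per second needed to move
    bandwidth_mbits (10**6 bits per second) in buffer_bytes chunks."""
    bytes_per_second = bandwidth_mbits * 10**6 / 8
    return bytes_per_second / buffer_bytes

# Bandwidth figures are from the iperf runs above; the 8192-byte default
# buffer is an assumption based on the iperf 2 documentation.
cases = [
    ("OpenSolaris -> Fedora, default -l", 950, 8192),
    ("OpenSolaris -> Fedora, -l 512", 884, 512),
    ("Fedora -> OpenSolaris, -l 512", 361, 512),
]

for label, mbits, buf in cases:
    print(f"{label}: about {calls_per_second(mbits, buf):,.0f} calls/sec")
```

If that assumption holds, the 512-byte runs need well over ten times as
many system calls per second as the default, which would make them as much
a test of per-call overhead as of the network path.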

Anyway, hopefully the above results give a baseline.
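
As a sanity check on the baseline, the Bandwidth column can be recomputed
from the Transfer column. This is a minimal sketch, assuming iperf counts
MBytes in units of 2^20 bytes and Mbits in units of 10^6 bits, and that
each run lasted the nominal 2.0 seconds. The figures are the
OpenSolaris-side results above.

```python
MIB = 2 ** 20        # assumed size of one iperf "MByte"
MEGABIT = 10 ** 6    # assumed size of one iperf "Mbit"

def mbits_per_sec(transfer_mbytes, seconds):
    """Convert an iperf Transfer figure (MBytes) into Mbits/sec."""
    return transfer_mbytes * MIB * 8 / seconds / MEGABIT

# (Transfer in MBytes, Bandwidth in Mbits/sec as reported by iperf)
runs = [(227, 950), (192, 804), (211, 884), (86.0, 361)]

for transfer, reported in runs:
    computed = mbits_per_sec(transfer, 2.0)
    print(f"{transfer} MBytes in 2.0 s -> {computed:.1f} Mbits/sec "
          f"(iperf reported {reported})")
```

The recomputed values land within about 1% of what iperf printed; the small
differences are plausibly down to the interval being rounded to 2.0 sec.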

BTW, you can use tcpdump to show the first two packets of the
TCP handshake, the SYN and the SYN-ACK,
which will show which TCP options are being negotiated:

 # tcpdump -i eth1 -nn 'tcp[13] & 2 =2'
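
The filter works because byte 13 of the TCP header holds the flag bits, and
the SYN flag is the bit with value 2. A small sketch of the same test in
Python, using a made-up 20-byte header in which only the flags byte is
filled in (the make_header helper is hypothetical, for illustration only):

```python
TCP_FLAGS_OFFSET = 13   # byte offset of the flags field in the TCP header
SYN = 0x02
ACK = 0x10

def make_header(flags):
    """Build a dummy 20-byte TCP header with only the flags byte set."""
    return bytes(TCP_FLAGS_OFFSET) + bytes([flags]) + bytes(6)

def matches_filter(tcp_header):
    """Same test as the tcpdump expression 'tcp[13] & 2 = 2'."""
    return (tcp_header[TCP_FLAGS_OFFSET] & SYN) == SYN

print(matches_filter(make_header(SYN)))        # SYN
print(matches_filter(make_header(SYN | ACK)))  # SYN-ACK
print(matches_filter(make_header(ACK)))        # plain ACK
```

Both the SYN and the SYN-ACK have the SYN bit set, so both match, while
ordinary data and ACK segments do not.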

Here is OpenSolaris connecting to Fedora:

22:52:14.110400 IP 10.0.0.1.52123 > 10.0.0.2.5001:
 S 1556807772:1556807772(0) win 49640 
 [mss 1460,nop,wscale 0,nop,nop,sackOK]

22:52:14.110434 IP 10.0.0.2.5001 > 10.0.0.1.52123:
 S 2639313353:2639313353(0) ack 1556807773 win 5840
 [mss 1460,nop,nop,sackOK,nop,wscale 6]

So OpenSolaris advertises a window of 49640, with a scale factor of (2^0) = 1,
and SACK ('Selective Acknowledgment') is enabled.
Fedora responds with a window size of 5840, but with a scale factor of
(2^6) = 64.
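
To make the scale factors concrete, here is a rough sketch of how a
receiver's advertised window follows from the 16-bit window field and the
wscale shift. Per the TCP window scale specification (RFC 1323), the window
field in the SYN and SYN-ACK themselves is never scaled; the shift applies
only to later segments. The function names are illustrative.

```python
MAX_WINDOW_FIELD = 0xFFFF  # the TCP window field is 16 bits wide

def advertised_window(window_field, wscale, in_syn_segment):
    """Bytes of receive window implied by a segment's window field."""
    if in_syn_segment:
        # Window values carried in SYN / SYN-ACK segments are not scaled.
        return window_field
    return window_field << wscale

def largest_window(wscale):
    """Largest window a host can ever advertise with this shift."""
    return MAX_WINDOW_FIELD << wscale

# Values from the handshake above: OpenSolaris wscale 0, Fedora wscale 6.
print(advertised_window(5840, 6, in_syn_segment=True))
print(advertised_window(5840, 6, in_syn_segment=False))
print(largest_window(0), largest_window(6))
```

With wscale 0, OpenSolaris can never advertise more than 65535 bytes on
this connection, whereas Fedora's shift of 6 allows it to grow its window
up to roughly 4 MB.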

And, with Fedora connecting to OpenSolaris:

23:01:02.450807 IP 10.0.0.2.44164 > 10.0.0.1.5001:
 S 2346183985:2346183985(0) win 5840 
 [mss 1460,sackOK,timestamp 97064562 0,nop,wscale 6]

23:01:02.454443 IP 10.0.0.1.5001 > 10.0.0.2.44164:
 S 1686911719:1686911719(0) ack 2346183986 win 49232
 [nop,nop,timestamp 9478697 97064562,mss 1460,nop,wscale 0,nop,nop,sackOK]

So in addition, Fedora is requesting the timestamp option,
which OpenSolaris is agreeing to use.
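
The comparison of the two handshakes can be sketched in code. Window
scaling, SACK and timestamps are each used only when both the SYN and the
SYN-ACK carry the option, so the snippet parses the bracketed option lists
copied from the tcpdump output above and reports which options appear on
both sides. The parsing is a simple illustration for this output format,
not a general tcpdump parser.

```python
def parse_options(text):
    """Turn '[mss 1460,nop,wscale 0]' into {'mss': '1460', 'wscale': '0'}."""
    options = {}
    for item in text.strip("[] \n").split(","):
        parts = item.split(None, 1)
        if not parts or parts[0] == "nop":
            continue  # 'nop' entries are only padding
        options[parts[0]] = parts[1] if len(parts) > 1 else ""
    return options

# Options that take effect only if both ends send them.
TWO_SIDED = ("wscale", "sackOK", "timestamp")

def in_effect(syn, syn_ack):
    a, b = parse_options(syn), parse_options(syn_ack)
    return [name for name in TWO_SIDED if name in a and name in b]

solaris_to_fedora = in_effect(
    "[mss 1460,nop,wscale 0,nop,nop,sackOK]",
    "[mss 1460,nop,nop,sackOK,nop,wscale 6]")
fedora_to_solaris = in_effect(
    "[mss 1460,sackOK,timestamp 97064562 0,nop,wscale 6]",
    "[nop,nop,timestamp 9478697 97064562,mss 1460,nop,wscale 0,nop,nop,sackOK]")

print("OpenSolaris -> Fedora:", solaris_to_fedora)
print("Fedora -> OpenSolaris:", fedora_to_solaris)
```

Timestamps show up only in the second list, matching the observation that
OpenSolaris does not offer the option itself but accepts it when Fedora
asks for it.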

Thanks
Nigel Smith
http://www.nwsmith.net/
http://nwsmith.blogspot.com/