Nicholas A. Bellinger, on 01/29/2010 07:25 PM wrote:
On Thu, 2010-01-28 at 20:45 +0200, Pasi Kärkkäinen wrote:
On Thu, Jan 28, 2010 at 07:38:28PM +0100, Bart Van Assche wrote:
On Thu, Jan 28, 2010 at 4:01 PM, Joe Landman
<land...@scalableinformatics.com> wrote:
Pasi Kärkkäinen wrote:
Please check these news items:

http://blog.fosketts.net/2010/01/14/microsoft-intel-push-million-iscsi-iops/

http://communities.intel.com/community/openportit/server/blog/2010/01/19/1000000-iops-with-iscsi--thats-not-a-typo

http://www.infostor.com/index/blogs_new/dave_simpson_storage/blogs/infostor/dave_simpon_storage/post987_37501094375591341.html

"1,030,000 IOPS over a single 10 Gb Ethernet link"
That is less than 1 µs per I/O.  Interesting.  Their hardware may not
actually support this.  10GbE latency is typically 7-10 µs, though ConnectX
and some other adapters get down to roughly 2 µs.
What I/O depth was used in the test? Latency matters most at an I/O
depth of one and is almost irrelevant at high I/O depths.

IIRC, the number of outstanding I/Os was 20 in that benchmark.
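(For context, a queue depth of 20 reconciles the numbers: by Little's Law the implied average per-I/O latency is about 19 µs, not sub-microsecond, which ordinary 10GbE hardware can certainly deliver. A back-of-the-envelope check in Python, using the figures quoted above:

    # Little's Law: outstanding I/Os = IOPS * latency, so latency = outstanding / IOPS
    iops = 1030000          # reported throughput
    outstanding = 20        # reported queue depth
    latency_us = outstanding / iops * 1e6
    print("implied average per-I/O latency: %.1f us" % latency_us)   # ~19.4 us
)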

Also of interest, according to the following link

http://gestaltit.com/featured/top/stephen/wirespeed-10-gb-iscsi/

is that the I/Os are multiplexed across multiple TCP connections using the
Multiple Connections per Session (MC/S) logic defined in RFC 3720, between
the MSFT initiator (a Nehalem machine) and the NetApp target array:

"The configuration tested (on the initiator side) was an IBM x3550 with
dual 2 GHz CPUs, 4 GB of RAM, and an Intel 82598 adapter. This is not a
special server – in fact, it’s pretty low-end! The connection was tuned
with RSS, NetDMA, LRO, LSO, and jumbo frames and maxed out over 4 MCS
connections per second. I’m not sure what kind of access they were doing
(I’ll ask Suzanne), but it’s pretty impressive that the NetApp Filer
could push 1,174 megabytes per second!"

It just goes to show that software iSCSI MC/S can really scale to some
very impressive results with enough x86_64 horsepower behind it.
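For scale, 1,174 MB/s is roughly 94% of the raw 1,250 MB/s line rate of a 10 Gb Ethernet link, i.e. effectively wirespeed once Ethernet/IP/TCP/iSCSI header overhead is accounted for. A quick Python sanity check (assuming the raw line rate, with no overhead subtracted):

    line_rate_mbps = 10e9 / 8 / 1e6      # 10 Gb/s -> 1250 MB/s raw
    reported = 1174.0                    # MB/s from the quote above
    print("%.0f%% of raw line rate" % (100 * reported / line_rate_mbps))  # ~94%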

Well, if MC/S scales better than MPIO on random I/O tests on Windows, as seen in the WinHEC presentation, that means MPIO on Windows has serious scalability problems. It would also explain why Microsoft is the only OS vendor pushing an MC/S-capable initiator. I'm sure that if they fix those scalability problems in the next version, it will be presented as a great achievement ;). It doesn't follow that Linux must have the same problems.

(MC/S requires serializing all commands across all connections according to their command sequence numbers (CmdSNs), even when preserving the command delivery order isn't needed. This is a known scalability limitation. MPIO requires no such command serialization, so it doesn't have this limitation; the difference is especially visible in high-IOPS tests. With MPIO you can pin each I/O thread to a particular CPU and configure the network hardware to deliver data for the corresponding connection to that CPU, which makes the best use of the CPU caches and avoids "cache ping-pong" between CPUs. With MC/S such a setup isn't possible, because all commands from all connections must pass through a single serialization point.)
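To make that serialization point concrete, here is a toy Python sketch (not real initiator code; the names and structure are illustrative only). An MC/S session must release commands from all connections in strict CmdSN order through one shared structure, while each MPIO path carries an independent session with no cross-path state:

    import heapq

    class McsSession:
        """All connections funnel into one CmdSN-ordered release point."""
        def __init__(self):
            self.exp_cmd_sn = 0    # next CmdSN allowed to execute
            self.pending = []      # shared heap -- the single point of contention

        def receive(self, cmd_sn, command):
            # Called from every connection's receive path; they all contend here.
            heapq.heappush(self.pending, (cmd_sn, command))
            ready = []
            while self.pending and self.pending[0][0] == self.exp_cmd_sn:
                ready.append(heapq.heappop(self.pending)[1])
                self.exp_cmd_sn += 1
            return ready           # commands released strictly in CmdSN order

    class MpioPath:
        """Each path is its own session: can be pinned per-CPU, no coordination."""
        def receive(self, command):
            return [command]       # execute immediately; no global ordering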

Vlad
