Hello, Alex,

> You might try posing the question to the netdev list
Thanks for the hint. I'll give it a try.

> Also you may want to clarify things as the data is a bit confusing since
> it seems like you have two tests that are "iperf client -> server over 1
> teql Aggregate", with one yielding 5700Mb/s and the other ~3000Mb/s and
> I am not sure what the difference is supposed to be between those two

Sorry for the ambiguity. It's the number of iperf processes running in
parallel - using the same link - that makes the difference. When I start
more than one iperf client instance in parallel over the same teql link
(i.e. the same IP pair), I get close to 100 % bandwidth utilisation, so
there is no bottleneck in the underlying layers any more. I tested it with
2 and with 10 iperf instances in parallel - the total sum of throughput is
always > 95 %.

However, when I start only one iperf process, I get only half the
throughput. So it looks like there is a bottleneck per process or per TCP
connection on the sender side (which has Intel NICs + the e1000e driver).
I don't see this bottleneck when I start the iperf client on a venerable
HP BL460c blade node (with much lower overall system performance, tg3 NIC
driver) - neither from a blade node to the Sabertooth gateway (which is
the same physical link), nor between two blades. (Sketches of the test
commands and of an lspci link-width check are appended at the end of this
mail.)

Just to clarify: the manual page iperf(1) says "To perform an iperf test
the user must establish both a server (to discard traffic) and a client
(to generate traffic)." So, in iperf terms, test traffic by default flows
from client to server; optionally, I can also do bidirectional testing. It
does not matter on which box the client/server iperf programs live, only
which way the traffic flows.

In contrast, the figures below use a "box-centered" designation:
client = HP BL460c G1 blade with Broadcom NICs + tg3
server = Asus Sabertooth with Intel NICs + e1000e

> > All right, the current performance tests look like this:
> >
> > iperf client -> server over 6 physical links:
> > 6 x 990 MBit - OK
> >
> > iperf client -> server over 1 teql aggregate:
> > 5700 MBit, > 90 % - OK
> >
> > iperf server -> client over 6 physical links:
> > 6 x 990...1000 MBit - OK
> >
> > iperf client -> server over 1 teql aggregate:
> > ~ 3000 MBit, ~ 50 % - ##### NOT OK #####
> >
> > iperf client -> server over 10 parallel teql aggregates:
> > ~ 5800 MBit, > 95 % - OK
> >
> > iperf client -> server over 2 parallel teql aggregates:
> > ~ 5900 MBit, > 95 % - OK

> > > Specifically the 2.5GT/s with a width of
> > > x1 can barely push 1Gb/s. This slot needs to be at least a x4 if you
> > > want to push anything more than 1Gb/s.

Just for curiosity, the answer from Asus support:

>> I think the pictures below are telling all the possible usage of the PCIE slots:
>> 16/8/8/4 can be obtained via 3-way SLI.
>> [two inline images: PCIe slot configuration diagrams from the board manual]

The images refer to the manual pages on how to place different video cards
in their board. It really looks like they can't imagine that people put
anything other than graphics cards into a PCIe slot. :-\
Odd, isn't it?

Wolfgang Rosner
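PS: For reference, here is a minimal sketch of the teql aggregation and of
the single- vs. multi-process iperf runs described above. The interface
names (eth1, eth2), the 10.0.0.x addresses and the 30 s duration are
placeholders rather than my actual lab configuration, and only two of the
six slave links are shown:

# on both boxes: stack one teql device on top of the slave NICs
modprobe sch_teql
tc qdisc add dev eth1 root teql0
tc qdisc add dev eth2 root teql0
ip link set dev teql0 up
ip addr add 10.0.0.1/24 dev teql0        # 10.0.0.2/24 on the peer box

# on the traffic sink ("server" in iperf terms):
iperf -s

# on the traffic source ("client" in iperf terms):
iperf -c 10.0.0.2 -t 30                  # one TCP stream
iperf -c 10.0.0.2 -t 30 -P 2             # two parallel streams in one process
# or, as in the tests above, several independent client processes:
iperf -c 10.0.0.2 -t 30 & iperf -c 10.0.0.2 -t 30 & wait

The totals quoted above are the sum over all parallel client instances.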
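And for the PCIe point Alex raised: the link speed and width a slot has
actually negotiated can be read from the Express capability block with
lspci. The bus address 01:00.0 and the LnkCap/LnkSta values below are only
illustrative:

lspci | grep -i ethernet                        # find the NIC's bus address
lspci -s 01:00.0 -vv | grep -E 'LnkCap|LnkSta'  # run as root for full output
#   LnkCap: Port #0, Speed 5GT/s, Width x4, ...   <- what the card could do
#   LnkSta: Speed 2.5GT/s, Width x1, ...          <- what was negotiated

A Gen1 (2.5 GT/s) x1 link carries roughly 2 Gb/s of raw bandwidth per
direction after 8b/10b encoding, so it is indeed marginal for anything
beyond a single gigabit port, as Alex said.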