Re: e1000 full-duplex TCP performance well below wire speed

2008-02-01 Thread Carsten Aulbert
Hi all Rick Jones wrote: 2) use the aforementioned burst TCP_RR test. This is then a single netperf with data flowing both ways on a single connection so no issue of skew, but perhaps an issue of being one connection and so one process on each end. Since our major goal is to establish a

RE: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi Jesse, It's good to be talking directly to one of the e1000 developers and maintainers. Although at this point I am starting to think that the issue may be TCP stack related and nothing to do with the NIC. Am I correct that these are quite distinct parts of the kernel? Yes, quite. OK.

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi Sangtae, Thanks for joining this discussion -- it's good to have a CUBIC author and expert here! In our application (cluster computing) we use a very tightly coupled high-speed low-latency network. There is no 'wide area traffic'. So it's hard for me to understand why any networking

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Andi Kleen
Bruce Allen [EMAIL PROTECTED] writes: Important note: we ARE able to get full duplex wire speed (over 900 Mb/s simultaneously in both directions) using UDP. The problems occur only with TCP connections. Another issue with full duplex TCP not mentioned yet is that if TSO is used the output

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi Andi! Important note: we ARE able to get full duplex wire speed (over 900 Mb/s simultaneously in both directions) using UDP. The problems occur only with TCP connections. Another issue with full duplex TCP not mentioned yet is that if TSO is used the output will be somewhat bursty and

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bill Fink
On Wed, 30 Jan 2008, SANGTAE HA wrote: On Jan 30, 2008 5:25 PM, Bruce Allen [EMAIL PROTECTED] wrote: In our application (cluster computing) we use a very tightly coupled high-speed low-latency network. There is no 'wide area traffic'. So it's hard for me to understand why any

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Carsten Aulbert
Good morning (my TZ), I'll try to answer all questions, however if I miss something big, please point my nose to it again. Rick Jones wrote: As asked in LKML thread, please post the exact netperf command used to start the client/server, whether or not you're using irqbalanced (aka

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread David Acker
Bill Fink wrote: If the receive direction uses a different GigE NIC that's part of the same quad-GigE, all is fine: [EMAIL PROTECTED] ~]$ nuttcp -f-beta -Itx -w2m 192.168.6.79 nuttcp -f-beta -Irx -r -w2m 192.168.5.79 tx: 1186.5051 MB / 10.05 sec = 990.2250 Mbps 12 %TX 13 %RX 0 retrans rx:
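
The quoted output interleaves two nuttcp invocations; laid out separately, the bidirectional test appears to be the following (flags and addresses taken from the quote; running the two commands concurrently, e.g. by backgrounding the first, is an assumption about how they were launched):

  nuttcp -f-beta -Itx -w2m 192.168.6.79 &     # transmit test ("tx"), 2 MB window
  nuttcp -f-beta -Irx -r -w2m 192.168.5.79    # receive test ("rx"); -r reverses the direction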

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Carsten Aulbert
Hi all, slowly crawling through the mails. Brandeburg, Jesse wrote: The test was done with various mtu sizes ranging from 1500 to 9000, with ethernet flow control switched on and off, and using reno and cubic as a TCP congestion control. As asked in LKML thread, please post the exact netperf
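
For reference, the three knobs mentioned in the quote are typically toggled like this on a Linux host (a sketch only; the interface name eth0 is assumed):

  ip link set dev eth0 mtu 9000                       # MTU, anywhere from 1500 to 9000
  ethtool -A eth0 rx on tx on                         # ethernet (pause-frame) flow control on/off
  sysctl -w net.ipv4.tcp_congestion_control=cubic     # or reno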

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Carsten Aulbert
Brief question I forgot to ask: Right now we are using the old version 7.3.20-k2. To save some effort on your end, shall we upgrade this to 7.6.15 or should our version be good enough? Thanks Carsten -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi Bill, I see similar results on my test systems Thanks for this report and for confirming our observations. Could you please confirm that a single-port bidirectional UDP link runs at wire speed? This helps to localize the problem to the TCP stack or interaction of the TCP stack with the

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi David, Could this be an issue with pause frames? At a previous job I remember having issues with a similar configuration using two broadcom sb1250 3 gigE port devices. If I ran bidirectional tests on a single pair of ports connected via cross over, it was slower than when I gave each

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Carsten Aulbert
Hi Andi, Andi Kleen wrote: Another issue with full duplex TCP not mentioned yet is that if TSO is used the output will be somewhat bursty and might cause problems with the TCP ACK clock of the other direction because the ACKs would need to squeeze in between full TSO bursts. You could try
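
Trying Andi's suggestion presumably means turning TSO off for the test; a minimal sketch, assuming the interface is eth0:

  ethtool -k eth0            # show current offload settings (TSO among them)
  ethtool -K eth0 tso off    # disable TCP segmentation offload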

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Carsten Aulbert
Hi all, Brandeburg, Jesse wrote: I would suggest you try TCP_RR with a command line something like this: netperf -t TCP_RR -H hostname -C -c -- -b 4 -r 64K I did that and the results can be found here: https://n0.aei.uni-hannover.de/wiki/index.php/NetworkTest seems something went wrong and
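
Spelled out, the suggested burst-mode request/response test is the following (hostname is a placeholder; note that in netperf of that era the -b burst option generally requires a build configured with --enable-burst):

  netperf -t TCP_RR -H <hostname> -C -c -- -b 4 -r 64K
  # -c / -C   report local / remote CPU utilization
  # -b 4      keep several transactions in flight at once (burst mode)
  # -r 64K    use 64 KB transactions, so substantial data flows in both directions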

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bill Fink
Hi Bruce, On Thu, 31 Jan 2008, Bruce Allen wrote: I see similar results on my test systems Thanks for this report and for confirming our observations. Could you please confirm that a single-port bidirectional UDP link runs at wire speed? This helps to localize the problem to the TCP

RE: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Brandeburg, Jesse
Carsten Aulbert wrote: PS: Am I right that the TCP_RR tests should only be run on a single node at a time, not on both ends simultaneously? yes, they are a request/response test, and so perform the bidirectional test with a single node starting the test. -- To unsubscribe from this list: send

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Rick Jones
netperf was used without any special tuning parameters. Usually we start two processes on two hosts which start (almost) simultaneously, last for 20-60 seconds and simply use UDP_STREAM (works well) and TCP_STREAM, i.e. on 192.168.0.202: netperf -H 192.168.2.203 -t TCP_STREAM -l 20 on
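
The test as described is simply two unidirectional streams started at (almost) the same time, one from each host; presumably the mirror-image command runs on the other box:

  # on 192.168.0.202:
  netperf -H 192.168.2.203 -t TCP_STREAM -l 20
  # on 192.168.2.203, started at (almost) the same time:
  netperf -H 192.168.0.202 -t TCP_STREAM -l 20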

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Kok, Auke
Carsten Aulbert wrote: Hi Andi, Andi Kleen wrote: Another issue with full duplex TCP not mentioned yet is that if TSO is used the output will be somewhat bursty and might cause problems with the TCP ACK clock of the other direction because the ACKs would need to squeeze in between full

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Rick Jones
Carsten Aulbert wrote: Hi all, slowly crawling through the mails. Brandeburg, Jesse wrote: The test was done with various mtu sizes ranging from 1500 to 9000, with ethernet flow control switched on and off, and using reno and cubic as a TCP congestion control. As asked in LKML thread,

RE: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Brandeburg, Jesse
Bill Fink wrote: a 2.6.15.4 kernel. The GigE NICs are Intel PRO/1000 82546EB_QUAD_COPPER, on a 64-bit/133-MHz PCI-X bus, using version 6.1.16-k2 of the e1000 driver, and running with 9000-byte jumbo frames. The TCP congestion control is BIC. Bill, FYI, there was a known issue with e1000

running aggregate netperf TCP_RR Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Rick Jones
PS: Am I right that the TCP_RR tests should only be run on a single node at a time, not on both ends simultaneously? It depends on what you want to measure. In this specific case since the goal is to saturate the link in both directions it is unlikely you should need a second instance

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Rick Jones
A lot of people tend to forget that the pci-express bus has enough bandwidth on first glance - 2.5gbit/sec for 1gbit of traffic, but apart from data going over it there is significant overhead going on: each packet requires transmit, cleanup and buffer transactions, and there are many irq
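
As a rough back-of-the-envelope check of the quoted numbers (an approximation, not a measurement):

  2.5 Gbit/s raw PCIe x1 signalling x 8/10 (8b/10b encoding) = 2.0 Gbit/s usable per direction
  ~1 Gbit/s of wire data per direction + descriptor fetches/writebacks, TLP headers and IRQ traffic
  => the nominal 2.5:1 headroom shrinks considerably once encoding and per-packet overhead are counted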

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Rick Jones
Sounds like tools to show PCI* bus utilization would be helpful... that would be a hardware profiling thing and highly dependent on the part sticking out of the slot, vendor bus implementation etc... Perhaps Intel has some tools for this already but I personally do not know of any :/ Small

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi Auke, Important note: we ARE able to get full duplex wire speed (over 900 Mb/s simultaneously in both directions) using UDP. The problems occur only with TCP connections. That eliminates bus bandwidth issues, probably, but small packets take up a lot of extra descriptors, bus bandwidth,

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi Auke, Based on the discussion in this thread, I am inclined to believe that lack of PCI-e bus bandwidth is NOT the issue. The theory is that the extra packet handling associated with TCP acknowledgements is pushing the PCI-e x1 bus past its limits. However the evidence seems to show

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Kok, Auke
Bruce Allen wrote: Hi Auke, Important note: we ARE able to get full duplex wire speed (over 900 Mb/s simultaneously in both directions) using UDP. The problems occur only with TCP connections. That eliminates bus bandwidth issues, probably, but small packets take up a lot of extra

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi Bill, I see similar results on my test systems Thanks for this report and for confirming our observations. Could you please confirm that a single-port bidirectional UDP link runs at wire speed? This helps to localize the problem to the TCP stack or interaction of the TCP stack with the

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bill Fink
On Thu, 31 Jan 2008, Bruce Allen wrote: Based on the discussion in this thread, I am inclined to believe that lack of PCI-e bus bandwidth is NOT the issue. The theory is that the extra packet handling associated with TCP acknowledgements is pushing the PCI-e x1 bus past its limits.

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi Bill, I started musing if once one side's transmitter got the upper hand, it might somehow defer the processing of received packets, causing the resultant ACKs to be delayed and thus further slowing down the other end's transmitter. I began to wonder if the txqueuelen could have an

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Bruce Allen
Hi David, Thanks for your note. (The performance of a full duplex stream should be close to 1Gb/s in both directions.) This is not a reasonable expectation. ACKs take up space on the link in the opposite direction of the transfer. So the link usage in the opposite direction of the transfer

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread David Miller
From: Bruce Allen [EMAIL PROTECTED] Date: Wed, 30 Jan 2008 03:51:51 -0600 (CST) [ netdev@vger.kernel.org added to CC: list, that is where kernel networking issues are discussed. ] (The performance of a full duplex stream should be close to 1Gb/s in both directions.) This is not a reasonable

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Stephen Hemminger
On Wed, 30 Jan 2008 08:01:46 -0600 (CST) Bruce Allen [EMAIL PROTECTED] wrote: Hi David, Thanks for your note. (The performance of a full duplex stream should be close to 1Gb/s in both directions.) This is not a reasonable expectation. ACKs take up space on the link in the

RE: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Brandeburg, Jesse
Bruce Allen wrote: Details: Kernel version: 2.6.23.12 ethernet NIC: Intel 82573L ethernet driver: e1000 version 7.3.20-k2 motherboard: Supermicro PDSML-LN2+ (one quad core Intel Xeon X3220, Intel 3000 chipset, 8GB memory) Hi Bruce, The 82573L (a client NIC, regardless of the class of
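
For anyone reproducing the report, those details can be collected with standard tools (eth0 assumed):

  uname -r                      # kernel version
  ethtool -i eth0               # driver name, version and bus-info
  lspci | grep -i ethernet      # NIC model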

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Rick Jones
As asked in LKML thread, please post the exact netperf command used to start the client/server, whether or not you're using irqbalanced (aka irqbalance) and what cat /proc/interrupts looks like (you ARE using MSI, right?) In particular, it would be good to know if you are doing two concurrent

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Ben Greear
Bruce Allen wrote: (Pádraig Brady has suggested that I post this to Netdev. It was originally posted to LKML here: http://lkml.org/lkml/2008/1/30/141 ) Dear NetDev, We've connected a pair of modern high-performance boxes with integrated copper Gb/s Intel NICS, with an ethernet crossover

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Bruce Allen
Hi Stephen, Thanks for your helpful reply and especially for the literature pointers. Indeed, we are not asking to see 1000 Mb/s. We'd be happy to see 900 Mb/s. Netperf is transmitting a large buffer in MTU-sized packets (min 1500 bytes). Since the acks are only about 60 bytes in size, they
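
The ~4% figure follows from simple arithmetic (the delayed-ACK variant below is an assumption about typical Linux behaviour, not from the quote):

  one ~60-byte ACK per 1500-byte segment:        60 / 1500 ≈ 4%
  one ACK per two segments (delayed ACKs):       60 / 3000 ≈ 2%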

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Bruce Allen
Hi Ben, Thank you for the suggestions and questions. We've connected a pair of modern high-performance boxes with integrated copper Gb/s Intel NICS, with an ethernet crossover cable, and have run some netperf full duplex TCP tests. The transfer rates are well below wire speed. We're

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Stephen Hemminger
On Wed, 30 Jan 2008 16:25:12 -0600 (CST) Bruce Allen [EMAIL PROTECTED] wrote: Hi Stephen, Thanks for your helpful reply and especially for the literature pointers. Indeed, we are not asking to see 1000 Mb/s. We'd be happy to see 900 Mb/s. Netperf is transmitting a large buffer in

RE: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Bruce Allen
Hi Jesse, It's good to be talking directly to one of the e1000 developers and maintainers. Although at this point I am starting to think that the issue may be TCP stack related and nothing to do with the NIC. Am I correct that these are quite distinct parts of the kernel? The 82573L (a

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Bruce Allen
Hi Rick, First off, thanks for netperf. I've used it a lot and find it an extremely useful tool. As asked in LKML thread, please post the exact netperf command used to start the client/server, whether or not you're using irqbalanced (aka irqbalance) and what cat /proc/interrupts looks like

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Bruce Allen
Hi Stephen, Indeed, we are not asking to see 1000 Mb/s. We'd be happy to see 900 Mb/s. Netperf is transmitting a large buffer in MTU-sized packets (min 1500 bytes). Since the acks are only about 60 bytes in size, they should be around 4% of the total traffic. Hence we would not expect to see

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread SANGTAE HA
Hi Bruce, On Jan 30, 2008 5:25 PM, Bruce Allen [EMAIL PROTECTED] wrote: In our application (cluster computing) we use a very tightly coupled high-speed low-latency network. There is no 'wide area traffic'. So it's hard for me to understand why any networking components or software layers

RE: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Brandeburg, Jesse
Bruce Allen wrote: Hi Jesse, It's good to be talking directly to one of the e1000 developers and maintainers. Although at this point I am starting to think that the issue may be TCP stack related and nothing to do with the NIC. Am I correct that these are quite distinct parts of the