On Mon, Mar 20, 2006 at 02:37:04AM -0800, David S. Miller wrote:
> From: "Michael S. Tsirkin" <[EMAIL PROTECTED]>
> Date: Mon, 20 Mar 2006 12:22:34 +0200
>
> > Quoting r. David S. Miller <[EMAIL PROTECTED]>:
> > > The path an SKB can take is opaque and unknown until the very last
> > > moment it i
Wouldn't the appropriate place to add the tunable for Stretch ACKs be as
a route attribute similar to RTAX_ADVMSS? Then system administrators
who are aware of the local network topology, netfilters, etc, could use
an "ip route" or whatever command to enable it on the route entry for
the loc
From: Benjamin LaHaise <[EMAIL PROTECTED]>
Date: Mon, 20 Mar 2006 10:09:42 -0500
> Wouldn't it make sense to stretch the ACK when the previous ACK is still in
> the TX queue of the device? I know that sort of behaviour was always an
> issue on modem links where you don't want to send out redunda
Wouldn't it make sense to stretch the ACK when the previous ACK is still in
the TX queue of the device? I know that sort of behaviour was always an
issue on modem links where you don't want to send out redundant ACKs.
Perhaps, but it isn't clear that it would be worth the cycles to check.
I
On Mon, Mar 20, 2006 at 02:04:07PM +0200, Michael S. Tsirkin wrote:
> does not stretch ACKs anymore. RFC 2581 does mention that it might be OK to
> stretch ACKs "after careful consideration", and we are seeing that it helps
> IP over InfiniBand, so recent Linux kernels perform worse in that respect
Quoting Arjan van de Ven <[EMAIL PROTECTED]>:
> > I read it as if he was proposing to have a sysctl knob to turn off
> > TCP congestion control completely (which has so many issues it's not
> > even funny.)
>
> owww that's so bad I didn't even consider that
No, I think that comment was taken out
Quoting r. Lennert Buytenhek <[EMAIL PROTECTED]>:
> > > > I disagree with Linux changing its behavior. It would be great to
> > > > turn off congestion control completely over local gigabit networks,
> > > > but that isn't determinable in any way, so we don't do that.
> > >
> > > Interesting. Wo
On Mon, 2006-03-20 at 12:49 +0100, Lennert Buytenhek wrote:
> On Mon, Mar 20, 2006 at 12:47:03PM +0100, Arjan van de Ven wrote:
>
> > > > I disagree with Linux changing its behavior. It would be great to
> > > > turn off congestion control completely over local gigabit networks,
> > > > but that
On Mon, Mar 20, 2006 at 12:47:03PM +0100, Arjan van de Ven wrote:
> > > I disagree with Linux changing its behavior. It would be great to
> > > turn off congestion control completely over local gigabit networks,
> > > but that isn't determinable in any way, so we don't do that.
> >
> > Interest
On Mon, 2006-03-20 at 13:27 +0200, Michael S. Tsirkin wrote:
> Quoting David S. Miller <[EMAIL PROTECTED]>:
> > I disagree with Linux changing its behavior. It would be great to
> > turn off congestion control completely over local gigabit networks,
> > but that isn't determinable in any way, so
Quoting David S. Miller <[EMAIL PROTECTED]>:
> I disagree with Linux changing its behavior. It would be great to
> turn off congestion control completely over local gigabit networks,
> but that isn't determinable in any way, so we don't do that.
Interesting. Would it make sense to make it anothe
From: "Michael S. Tsirkin" <[EMAIL PROTECTED]>
Date: Mon, 20 Mar 2006 12:22:34 +0200
> Quoting r. David S. Miller <[EMAIL PROTECTED]>:
> > The path an SKB can take is opaque and unknown until the very last
> > moment it is actually given to the device transmit function.
>
> Why, I was proposing l
Quoting r. David S. Miller <[EMAIL PROTECTED]>:
> The path an SKB can take is opaque and unknown until the very last
> moment it is actually given to the device transmit function.
Why, I was proposing looking at dst cache. If that's NULL, well,
we won't stretch ACKs. Worst case we apply the wrong
From: "Michael S. Tsirkin" <[EMAIL PROTECTED]>
Date: Mon, 20 Mar 2006 11:06:29 +0200
> Is it the case then that this requirement is less essential on
> networks such as IP over InfiniBand, which are very low latency
> and essentially lossless (with explicit congestion notifications
> in hardware)
Quoting r. David S. Miller <[EMAIL PROTECTED]>:
> > well, there are stacks which do "stretch acks" (after a fashion) that
> > make sure when they see packet loss to "do the right thing" wrt sending
> > enough acks to allow cwnds to open again in a timely fashion.
>
> Once a loss happens, it's to
David S. Miller wrote:
From: Rick Jones <[EMAIL PROTECTED]>
Date: Thu, 09 Mar 2006 16:21:05 -0800
well, there are stacks which do "stretch acks" (after a fashion) that
make sure when they see packet loss to "do the right thing" wrt sending
enough acks to allow cwnds to open again in a timely
From: Rick Jones <[EMAIL PROTECTED]>
Date: Thu, 09 Mar 2006 16:21:05 -0800
> well, there are stacks which do "stretch acks" (after a fashion) that
> make sure when they see packet loss to "do the right thing" wrt sending
> enough acks to allow cwnds to open again in a timely fashion.
Once a los
From: "Michael S. Tsirkin" <[EMAIL PROTECTED]>
Date: Fri, 10 Mar 2006 02:10:31 +0200
> But with the change we are discussing, could an ack now be sent even
> sooner than we have at least two full sized segments? Or does
> __tcp_ack_snd_check delay until we have at least two full sized
> segments?
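
For readers following the question above, here is a minimal sketch of the kind of receiver-side test being asked about: ACK immediately once more than one full-sized segment's worth of data has arrived since the last ACK, otherwise delay. It is a simplified model, not the kernel's `__tcp_ack_snd_check`; the struct `rx_state` and the function `should_ack_now` are hypothetical names, and quick-ACK mode, receive-window checks, and out-of-order data are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receiver state for illustration; not a kernel structure. */
struct rx_state {
    uint32_t rcv_nxt;  /* next sequence number expected from the peer */
    uint32_t rcv_wup;  /* value of rcv_nxt when the last ACK was sent */
    uint32_t rcv_mss;  /* receiver's estimate of a full-sized segment */
};

/*
 * Return true if an ACK should go out now, false if it may be delayed.
 * Assumed rule: ACK once more than one full-sized segment of data is
 * still unacknowledged.
 */
static bool should_ack_now(const struct rx_state *rx)
{
    assert(rx->rcv_mss > 0);

    /* Unsigned subtraction stays correct across sequence wraparound. */
    uint32_t unacked = rx->rcv_nxt - rx->rcv_wup;

    return unacked > rx->rcv_mss;
}
```

Under this assumed rule the decision depends on bytes received rather than on a count of segments, so a run of small segments triggers an ACK as soon as their total exceeds one `rcv_mss`, which may be before two full-sized segments have arrived.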
Quoting r. Michael S. Tsirkin <[EMAIL PROTECTED]>:
> Or does __tcp_ack_snd_check delay until we have at least two full sized
> segments?
What I'm trying to say, since RFC 2525, 2.13 talks about
"every second full-sized segment", so following the code from
__tcp_ack_snd_check, why does it do
David S. Miller wrote:
From: "Michael S. Tsirkin" <[EMAIL PROTECTED]>
Date: Wed, 8 Mar 2006 14:53:11 +0200
What I was trying to figure out was, how can we re-enable the trick
without hurting TSO? Could a solution be to simply look at the frame
size, and call tcp_send_delayed_ack if the frame s
Quoting David S. Miller <[EMAIL PROTECTED]>:
> Description
> To improve efficiency (both computer and network) a data receiver
> may refrain from sending an ACK for each incoming segment,
> according to [RFC1122]. However, an ACK should not be delayed an
> inordinate amo
From: "Michael S. Tsirkin" <[EMAIL PROTECTED]>
Date: Wed, 8 Mar 2006 14:53:11 +0200
> What I was trying to figure out was, how can we re-enable the trick
> without hurting TSO? Could a solution be to simply look at the frame
> size, and call tcp_send_delayed_ack if the frame size is small?
The ch
From: "Michael S. Tsirkin" <[EMAIL PROTECTED]>
Date: Wed, 8 Mar 2006 14:53:11 +0200
> What I was trying to figure out was, how can we re-enable the trick without
> hurting TSO? Could a solution be to simply look at the frame size, and call
> tcp_send_delayed_ack if the frame size is small?
The pr
Quoting r. David S. Miller <[EMAIL PROTECTED]>:
> Subject: Re: Re: TSO and IPoIB performance degradation
>
> From: Roland Dreier <[EMAIL PROTECTED]>
> Date: Tue, 07 Mar 2006 17:17:30 -0800
>
> > The reason TSO comes up is that reverting the patch described b
David> I wish you had started the thread by mentioning this
David> specific patch, we wasted an enormous amount of precious
David> developer time speculating and asking for arbitrary tests
David> to be run in order to narrow down the problem, yet you knew
David> the specific cha
From: Roland Dreier <[EMAIL PROTECTED]>
Date: Tue, 07 Mar 2006 17:17:30 -0800
> The reason TSO comes up is that reverting the patch described below
> helps (or helped at some point at least) IPoIB throughput quite a bit.
I wish you had started the thread by mentioning this specific
patch, we wast
David> How limited are the IPoIB devices, TX descriptor wise?
David> One side effect of the TSO changes is that one extra
David> descriptor will be used for outgoing packets. This is
David> because we have to put the headers as well as the user
David> data, into page based buf
From: Matt Leininger <[EMAIL PROTECTED]>
Date: Tue, 07 Mar 2006 16:11:37 -0800
> I used the standard setting for tcp_rmem and tcp_wmem. Here are a
> few other runs that change those variables. I was able to improve
> performance by ~30MB/s to 403 MB/s, but this is still a ways from the
> 474
On Tue, 2006-03-07 at 13:49 -0800, Stephen Hemminger wrote:
> On Tue, 07 Mar 2006 13:44:51 -0800
> Matt Leininger <[EMAIL PROTECTED]> wrote:
>
> > On Mon, 2006-03-06 at 19:13 -0800, Shirley Ma wrote:
> > >
> > > > More likely you are getting hit by the fact that TSO prevents the
> > > > congestion
Quoting r. Stephen Hemminger <[EMAIL PROTECTED]>:
> Is IB using NAPI or just doing netif_rx()?
No, IPoIB doesn't use NAPI.
--
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies
On Tue, 07 Mar 2006 13:44:51 -0800
Matt Leininger <[EMAIL PROTECTED]> wrote:
> On Mon, 2006-03-06 at 19:13 -0800, Shirley Ma wrote:
> >
> > > More likely you are getting hit by the fact that TSO prevents the
> > > congestion window from increasing properly. This was fixed in 2.6.15 (around mid
On Mon, 2006-03-06 at 19:13 -0800, Shirley Ma wrote:
>
> > More likely you are getting hit by the fact that TSO prevents the
> > congestion window from increasing properly. This was fixed in 2.6.15
> > (around mid of Nov 2005).
>
> Yep, I noticed the same problem. After updating to the new kernel,
On Tue, 7 Mar 2006 00:34:38 +0200
"Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote:
> Hello, Dave!
> As you might know, the TSO patches merged into the mainline kernel
> since 2.6.11 have hurt performance for the simple (non-TSO)
> high-speed netdevice that is the IPoIB driver.
>
> This was discussed at le
From: "Michael S. Tsirkin" <[EMAIL PROTECTED]>
Date: Tue, 7 Mar 2006 00:34:38 +0200
> So I'm trying to get a handle on it: could a solution be to simply
> look at the frame size, and call tcp_send_delayed_ack from
> if the frame size is no larger than 1/8?
>
> Does this make sense?
The comment y
Hello, Dave!
As you might know, the TSO patches merged into the mainline kernel
since 2.6.11 have hurt performance for the simple (non-TSO)
high-speed netdevice that is the IPoIB driver.
This was discussed at length here
http://openib.org/pipermail/openib-general/2005-October/012271.html
I'm trying to fi