Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-12 Thread Stephen Hemminger
On Fri, 12 Oct 2007 09:08:58 -0700 Brandeburg, Jesse [EMAIL PROTECTED] wrote: Andi Kleen wrote: When the hw TX queue gains space, the driver self-batches packets from the sw queue to the hw queue. I don't really see the advantage over the qdisc in that scheme. It's certainly not
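For readers skimming the thread: the scheme being debated has the driver hold its own software list and drain it into the hardware ring as TX completions free descriptors. A minimal userspace model of that self-batching step might look like the sketch below; struct pkt, struct txq, hw_ring_post() and tx_completion_batch() are invented names for illustration, not the NET_BATCH patch's actual API.

/* Hypothetical model of driver self-batching: drain a software
 * queue into the hardware TX ring whenever descriptors free up.
 * Names are illustrative only; this is not the NET_BATCH code. */
#include <stdbool.h>
#include <stddef.h>

struct pkt {
    struct pkt *next;
    /* ... payload descriptor fields ... */
};

struct txq {
    struct pkt *sw_head, *sw_tail;  /* driver-private software queue   */
    unsigned int hw_free;           /* free slots in the hardware ring */
};

static bool hw_ring_post(struct txq *q, struct pkt *p)
{
    if (q->hw_free == 0)
        return false;           /* ring full, caller keeps the packet */
    q->hw_free--;               /* a real driver would write a descriptor here */
    (void)p;
    return true;
}

/* Called from the TX-completion path once 'freed' descriptors are back. */
static void tx_completion_batch(struct txq *q, unsigned int freed)
{
    q->hw_free += freed;

    while (q->sw_head && q->hw_free) {
        struct pkt *p = q->sw_head;

        q->sw_head = p->next;
        if (!q->sw_head)
            q->sw_tail = NULL;
        hw_ring_post(q, p);     /* batch packets straight into the ring */
    }
}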

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-12 Thread Andi Kleen
Use RCU? or write a generic version and get it reviewed. You really want someone with knowledge of all the possible barrier impacts to review it. I guess he was thinking of using cmpxchg; but we don't support this in portable code. RCU is not really suitable for this because it assume

RE: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-12 Thread Brandeburg, Jesse
Andi Kleen wrote: When the hw TX queue gains space, the driver self-batches packets from the sw queue to the hw queue. I don't really see the advantage over the qdisc in that scheme. It's certainly not simpler and probably more code and would likely also not require fewer locks (e.g. a

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-12 Thread Andi Kleen
related to this comment, does Linux have a lockless (using atomics) singly linked list element? That would be very useful in a driver hot path. No; it doesn't. At least not a portable one. Besides, they tend not to be faster anyway because e.g. cmpxchg tends to be as slow as an explicit
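To make the cmpxchg discussion concrete, here is what a lock-free singly linked push looks like in plain C11 userspace code (a Treiber-style list). As Andi notes, the kernel had no portable equivalent at the time; the pop shown here detaches the whole list in one atomic exchange, which sidesteps the ABA problem and is closer to what a completion path would actually want. All names are illustrative, not a kernel API.

/* Userspace illustration of a cmpxchg-based lock-free push and a
 * detach-all pop.  One atomic op per operation is still paid, which
 * is part of why such lists are less of a win than they first appear. */
#include <stdatomic.h>
#include <stddef.h>

struct node {
    struct node *next;
};

static _Atomic(struct node *) list_head;

static void llist_push(struct node *n)
{
    struct node *old = atomic_load_explicit(&list_head, memory_order_relaxed);

    do {
        n->next = old;          /* publish our link before the CAS */
    } while (!atomic_compare_exchange_weak_explicit(&list_head, &old, n,
                                                    memory_order_release,
                                                    memory_order_relaxed));
}

static struct node *llist_pop_all(void)
{
    /* Detaching the whole list in one exchange avoids ABA and costs
     * a single atomic op per batch of packets. */
    return atomic_exchange_explicit(&list_head, (struct node *)NULL,
                                    memory_order_acquire);
}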

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-11 Thread Krishna Kumar2
Hi Dave, David Miller wrote on 10/10/2007 02:13:31 AM: Hopefully that new qdisc will just use the TX rings of the hardware directly. They are typically large enough these days. That might avoid some locking in this critical path. Indeed, I also realized last night that for the default

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-10 Thread Andi Kleen
A 256 entry TX hw queue fills up trivially on 1GB and 10GB, but if you With TSO really? increase the size much more performance starts to go down due to L2 cache thrashing. Another possibility would be to consider using cache avoidance instructions while updating the TX ring (e.g. write

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-10 Thread David Miller
From: Andi Kleen [EMAIL PROTECTED] Date: Wed, 10 Oct 2007 11:16:44 +0200 A 256 entry TX hw queue fills up trivially on 1GB and 10GB, but if you With TSO really? Yes. increase the size much more performance starts to go down due to L2 cache thrashing. Another possibility would be to

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-10 Thread Herbert Xu
On Wed, Oct 10, 2007 at 11:16:44AM +0200, Andi Kleen wrote: A 256 entry TX hw queue fills up trivially on 1GB and 10GB, but if you With TSO really? Hardware queues are generally per-page rather than per-skb so it'd fill up quicker than a software queue even with TSO. Cheers, -- Visit
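To put rough numbers on Herbert's point: a full 64 KB TSO skb carried in 4 KB page fragments consumes on the order of 16-17 descriptors plus one for the headers, so a 256-entry hardware ring holds only roughly 15 such skbs in flight, whereas a per-skb software queue of the same length holds 256. These figures are illustrative; actual descriptor usage depends on the NIC and on how the skb's frags happen to be laid out.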

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-10 Thread Andi Kleen
On Wed, Oct 10, 2007 at 02:25:50AM -0700, David Miller wrote: The chip I was working with at the time (UltraSPARC-IIi) compressed all the linear stores into 64-byte full cacheline transactions via the store buffer. That's a pretty old CPU. Conclusions on more modern ones might be different.

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-10 Thread David Miller
From: Andi Kleen [EMAIL PROTECTED] Date: Wed, 10 Oct 2007 12:23:31 +0200 On Wed, Oct 10, 2007 at 02:25:50AM -0700, David Miller wrote: The chip I was working with at the time (UltraSPARC-IIi) compressed all the linear stores into 64-byte full cacheline transactions via the store buffer.

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-10 Thread Andi Kleen
We've done similar testing with ixgbe to push maximum descriptor counts, and we lost performance very quickly in the same range you're quoting on NIU. Did you try it with WC writes to the ring or CLFLUSH? -Andi
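For anyone unfamiliar with the "WC writes / cache avoidance" idea: on x86 the descriptor writes can be streamed past the cache hierarchy with non-temporal stores, followed by a store fence before the doorbell MMIO write. The userspace sketch below shows the mechanics with SSE2 intrinsics; the descriptor layout and function names are invented for the example and are not taken from any of the drivers discussed here.

/* Illustration (x86 userspace, SSE2) of writing a 16-byte TX
 * descriptor with a non-temporal store so it bypasses the caches,
 * then fencing before the doorbell write.  A real driver would do
 * this against a DMA-mapped descriptor ring. */
#include <emmintrin.h>
#include <stdint.h>

struct tx_desc {                /* hypothetical 16-byte descriptor */
    uint64_t addr;
    uint32_t len;
    uint32_t flags;
} __attribute__((aligned(16)));

static void write_desc_nt(struct tx_desc *slot, uint64_t addr,
                          uint32_t len, uint32_t flags)
{
    __m128i d = _mm_set_epi32((int)flags, (int)len,
                              (int)(addr >> 32), (int)addr);

    /* movntdq: streams the descriptor through a write-combining buffer. */
    _mm_stream_si128((__m128i *)slot, d);
}

static void tx_doorbell_barrier(void)
{
    /* Make all streamed descriptor writes globally visible before
     * the MMIO doorbell write that tells the NIC to fetch them. */
    _mm_sfence();
}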

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-10 Thread Bill Fink
On Tue, 09 Oct 2007, David Miller wrote: From: jamal [EMAIL PROTECTED] Date: Tue, 09 Oct 2007 17:56:46 -0400 if the h/ware queues are full because of link pressure etc, you drop. We drop today when the s/ware queues are full. The driver txmit lock takes place of the qdisc queue lock

RE: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-10 Thread Waskiewicz Jr, Peter P

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-10 Thread David Miller
From: jamal [EMAIL PROTECTED] Date: Wed, 10 Oct 2007 09:08:48 -0400 On Wed, 2007-10-10 at 03:44 -0700, David Miller wrote: I've always gotten very poor results when increasing the TX queue a lot, for example with NIU the point of diminishing returns seems to be in the range of 256-512 TX

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-10 Thread David Miller
From: Bill Fink [EMAIL PROTECTED] Date: Wed, 10 Oct 2007 12:02:15 -0400 On Tue, 09 Oct 2007, David Miller wrote: We have to keep in mind, however, that the sw queue right now is 1000 packets. I heavily discourage any driver author to try and use any single TX queue of that size. Which

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread Stephen Hemminger
On 09 Oct 2007 18:51:51 +0200 Andi Kleen [EMAIL PROTECTED] wrote: David Miller [EMAIL PROTECTED] writes: 2) Switch the default qdisc away from pfifo_fast to a new DRR fifo with load balancing using the code in #1. I think this is kind of in the territory of what Peter said he is
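For reference, the DRR (deficit round-robin) idea mentioned here services several sub-queues by granting each a byte quantum per round and letting it dequeue only what its accumulated deficit covers. The sketch below is a generic userspace rendering of one DRR round, not the qdisc code Peter was working on; all names and the quantum value are illustrative. A caller would pass in the address of an empty output list head and call drr_round() repeatedly until it stops producing packets.

/* One round of deficit round-robin over NQUEUES sub-queues
 * (illustrative only).  Each active queue earns QUANTUM bytes of
 * credit per round and may send as many head packets as its
 * accumulated deficit covers. */
#include <stddef.h>

struct drr_pkt { struct drr_pkt *next; unsigned int len; };

struct drr_queue {
    struct drr_pkt *head;
    unsigned int deficit;
};

#define NQUEUES 4
#define QUANTUM 1514u           /* roughly one full-size Ethernet frame */

/* Appends dequeued packets at *out_tail; returns the new tail pointer. */
static struct drr_pkt **drr_round(struct drr_queue q[NQUEUES],
                                  struct drr_pkt **out_tail)
{
    for (int i = 0; i < NQUEUES; i++) {
        if (!q[i].head) {
            q[i].deficit = 0;   /* idle queues do not bank credit */
            continue;
        }
        q[i].deficit += QUANTUM;
        while (q[i].head && q[i].head->len <= q[i].deficit) {
            struct drr_pkt *p = q[i].head;

            q[i].head = p->next;
            q[i].deficit -= p->len;
            p->next = NULL;
            *out_tail = p;      /* hand the packet to the output list */
            out_tail = &p->next;
        }
    }
    return out_tail;
}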

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread Andi Kleen
I wonder about the whole idea of queueing in general at such high speeds. Given the normal bi-modal distribution of packets, and the predominance of the 1500-byte MTU, does it make sense to even have any queueing in software at all? Yes, that is my point -- it should just pass it through directly

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread David Miller
From: Andi Kleen [EMAIL PROTECTED] Date: 09 Oct 2007 18:51:51 +0200 Hopefully that new qdisc will just use the TX rings of the hardware directly. They are typically large enough these days. That might avoid some locking in this critical path. Indeed, I also realized last night that for the

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread Stephen Hemminger
On Tue, 09 Oct 2007 13:43:31 -0700 (PDT) David Miller [EMAIL PROTECTED] wrote: From: Andi Kleen [EMAIL PROTECTED] Date: 09 Oct 2007 18:51:51 +0200 Hopefully that new qdisc will just use the TX rings of the hardware directly. They are typically large enough these days. That might avoid

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread David Miller
From: Stephen Hemminger [EMAIL PROTECTED] Date: Tue, 9 Oct 2007 13:53:40 -0700 I was thinking why not have a default transmit queue len of 0 like the virtual devices. I'm not so sure. Even if the device has huge queues I still think we need a software queue for when the hardware one backs up.
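The trade-off Dave describes here, trying the hardware ring first and falling back to a software queue only when the driver reports it full, can be modeled in a few lines. The sketch below uses invented types and return codes rather than the kernel's qdisc or dev_queue_xmit machinery; only the 1000-entry default length comes from the thread.

/* Model of "hardware ring first, software queue as overflow".
 * Everything here is a stand-in for illustration. */
#include <stdbool.h>

enum tx_status { TX_OK, TX_BUSY };

#define SW_QUEUE_LEN 1000       /* the default qdisc length under discussion */

struct netdev_model {
    enum tx_status (*start_xmit)(struct netdev_model *dev, void *skb);
    void *sw_queue[SW_QUEUE_LEN];
    unsigned int sw_len;
};

static bool xmit_or_queue(struct netdev_model *dev, void *skb)
{
    /* Fast path: hand the packet straight to the driver ring. */
    if (dev->start_xmit(dev, skb) == TX_OK)
        return true;

    /* Hardware ring backed up: park the packet in the software queue
     * so it can be retried once TX completions free ring space. */
    if (dev->sw_len < SW_QUEUE_LEN) {
        dev->sw_queue[dev->sw_len++] = skb;
        return true;
    }

    return false;               /* both queues full: drop */
}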

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread jamal
On Tue, 2007-09-10 at 14:22 -0700, David Miller wrote: Even if the device has huge queues I still think we need a software queue for when the hardware one backs up. It should be fine to just pretend the qdisc exists despite it sitting in the driver and not have s/ware queues at all to avoid

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread David Miller
From: jamal [EMAIL PROTECTED] Date: Tue, 09 Oct 2007 17:56:46 -0400 if the h/ware queues are full because of link pressure etc, you drop. We drop today when the s/ware queues are full. The driver txmit lock takes place of the qdisc queue lock etc. I am assuming there is still need for that

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread Andi Kleen
On Tue, Oct 09, 2007 at 05:04:35PM -0700, David Miller wrote: We have to keep in mind, however, that the sw queue right now is 1000 packets. I heavily discourage any driver author to try and use any single TX queue of that size. Why would you discourage them? If 1000 is ok for a software

Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

2007-10-09 Thread David Miller
From: Andi Kleen [EMAIL PROTECTED] Date: Wed, 10 Oct 2007 02:37:16 +0200 On Tue, Oct 09, 2007 at 05:04:35PM -0700, David Miller wrote: We have to keep in mind, however, that the sw queue right now is 1000 packets. I heavily discourage any driver author to try and use any single TX queue