Re: [RFC] New driver API to speed up small packets xmits

2007-05-22 Thread David Miller
From: Shirley Ma [EMAIL PROTECTED] Date: Tue, 22 May 2007 15:22:35 -0700 Yep, for any NIC that supports SG but not TSO then software GSO will be a big win. When the NIC doesn't support SG then the win is mostly offset by the need to copy the packet again. Cheers, -- We could

Re: [RFC] New driver API to speed up small packets xmits

2007-05-22 Thread David Miller
From: Shirley Ma [EMAIL PROTECTED] Date: Tue, 22 May 2007 15:58:05 -0700 Sorry for the confusion. I am thinking to avoid copy in skb_segment() for GSO. The way could be in tcp_sendmsg() to allocate small discontiguous buffers (equal = MTU) instead of allocating pages. The SKB splitting

Re: [RFC] New driver API to speed up small packets xmits

2007-05-22 Thread Herbert Xu
On Tue, May 22, 2007 at 03:36:36PM -0700, David Miller wrote: Yep, for any NIC that supports SG but not TSO then software GSO will be a big win. When the NIC doesn't support SG then the win is mostly offset by the need to copy the packet again. ... SKB's from TSO are composed of

Re: [RFC] New driver API to speed up small packets xmits

2007-05-21 Thread Herbert Xu
David Miller [EMAIL PROTECTED] wrote: From: Shirley Ma [EMAIL PROTECTED] Date: Tue, 15 May 2007 14:22:57 -0700 I just wonder without TSO support in HW, how much benefit we can get by pushing GSO from interface layer to device layer besides we can do multiple packets in IPoIB. I bet

Re: [WIP] [PATCH] WAS Re: [RFC] New driver API to speed up small packets xmits

2007-05-18 Thread jamal
On Wed, 2007-16-05 at 23:25 -0400, jamal wrote: This patch now includes two changed drivers (tun and e1000). I have tested tun with this patch. I tested e1000 earlier and i couldn't find any issues - although as the title says it's a WIP. As before you need net-2.6. You also need the qdisc

Re: [WIP] [PATCH] WAS Re: [RFC] New driver API to speed up small packets xmits

2007-05-16 Thread Sridhar Samudrala
Jamal, Here are some comments i have on your patch. See them inline. Thanks Sridhar

+static int try_get_tx_pkts(struct net_device *dev, struct Qdisc *q, int count)
+{
+	struct sk_buff *skb;
+	struct sk_buff_head *skbs = dev->blist;
+	int tdq = count;
+
+	/*
+	 *

Re: [WIP] [PATCH] WAS Re: [RFC] New driver API to speed up small packets xmits

2007-05-16 Thread jamal
On Wed, 2007-16-05 at 15:12 -0700, Sridhar Samudrala wrote: Jamal, Here are some comments i have on your patch. See them inline. Thanks for taking the time Sridhar. try_tx_pkts() is directly calling the device's batch xmit routine. Don't we need to call dev_hard_start_xmit() to handle

Re: [WIP] [PATCH] WAS Re: [RFC] New driver API to speed up small packets xmits

2007-05-16 Thread jamal
On Wed, 2007-16-05 at 18:52 -0400, jamal wrote: On Wed, 2007-16-05 at 15:12 -0700, Sridhar Samudrala wrote: I will have to think a bit about this; i may end up coalescing when grabbing the packets but call the nit from the driver using a helper. That's what i did. This would hopefully work

Re: [WIP] [PATCH] WAS Re: [RFC] New driver API to speed up small packets xmits

2007-05-16 Thread Krishna Kumar2
Hi Sridhar, Sridhar Samudrala [EMAIL PROTECTED] wrote on 05/17/2007 03:42:03 AM: AFAIK, gso_skb can be a list of skb's. Can we add a list to another list using __skb_queue_head()? Also, if gso_skb is a list of multiple skb's, i think the count needs to be decremented by the number of

Re: [WIP] [PATCH] WAS Re: [RFC] New driver API to speed up small packets xmits

2007-05-16 Thread Sridhar Samudrala
Krishna Kumar2 wrote: Hi Sridhar, Sridhar Samudrala [EMAIL PROTECTED] wrote on 05/17/2007 03:42:03 AM: AFAIK, gso_skb can be a list of skb's. Can we add a list to another list using __skb_queue_head()? Also, if gso_skb is a list of multiple skb's, i think the count needs to be decremented by

Re: [WIP] [PATCH] WAS Re: [RFC] New driver API to speed up small packets xmits

2007-05-16 Thread Krishna Kumar2
Sridhar Samudrala [EMAIL PROTECTED] wrote on 05/17/2007 03:14:41 AM: Krishna Kumar2 wrote: Hi Sridhar, Sridhar Samudrala [EMAIL PROTECTED] wrote on 05/17/2007 03:42:03 AM: AFAIK, gso_skb can be a list of skb's. Can we add a list to another list using __skb_queue_head()? Also, if

Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread Roland Dreier
As I said before, getting multiple packets in one call to xmit would be nice for amortizing per-xmit overhead in IPoIB. So it would be nice if the cases where the stack does GSO ended up passing all the segments into the driver in one go. Well TCP does upto 64k -- that is what

Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread David Miller
From: Roland Dreier [EMAIL PROTECTED] Date: Tue, 15 May 2007 09:25:28 -0700 I'll have to think about implementing that for IPoIB. One issue I see is if I have, say, 4 free entries in my send queue and skb_gso_segment() gives me back 5 packets to send. It's not clear I can recover at that

Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread Roland Dreier
I'll have to think about implementing that for IPoIB. One issue I see is if I have, say, 4 free entries in my send queue and skb_gso_segment() gives me back 5 packets to send. It's not clear I can recover at that point -- I guess I have to check against gso_segs in the xmit routine
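The concern above (4 free send-queue entries, 5 segments coming back from skb_gso_segment()) amounts to an all-or-nothing reservation made before any segment is posted. A minimal user-space sketch of that check follows; struct tx_ring, struct gso_pkt and tx_ring_reserve() are hypothetical stand-ins for illustration, not kernel or IPoIB APIs.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a driver's send queue. */
struct tx_ring {
    unsigned int size;   /* total entries in the ring */
    unsigned int used;   /* entries currently posted */
};

/* Hypothetical stand-in for an skb carrying a segment count. */
struct gso_pkt {
    unsigned int gso_segs;   /* 0 means "not a GSO packet" */
};

static unsigned int tx_ring_free(const struct tx_ring *r)
{
    return r->size - r->used;
}

/*
 * Reserve room for every segment up front. If the whole packet does
 * not fit, leave the ring untouched so the caller can stop the queue
 * and retry later instead of failing half-way through the segments.
 */
static bool tx_ring_reserve(struct tx_ring *r, const struct gso_pkt *p)
{
    unsigned int need = p->gso_segs ? p->gso_segs : 1;

    if (need > tx_ring_free(r))
        return false;
    r->used += need;
    return true;
}
```

With 4 free entries, a packet with gso_segs == 5 is refused without consuming anything, while one with gso_segs == 4 is accepted and fills the ring.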

Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread Michael Chan
On Tue, 2007-05-15 at 13:52 -0700, Roland Dreier wrote: Well, IPoIB doesn't do netif_wake_queue() until half the device's TX queue is free, so we should get batching. However, I'm not sure that I can count on a fudge factor ensuring that there's enough space to handle everything

Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread Roland Dreier
Well, IPoIB doesn't do netif_wake_queue() until half the device's TX queue is free, so we should get batching. However, I'm not sure that I can count on a fudge factor ensuring that there's enough space to handle everything skb_gso_segment() gives me -- is there any reliable way to
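The wake-at-half-free behaviour described above can be modelled in user space roughly as follows. This is only a sketch: struct txq, txq_post() and txq_complete() are hypothetical names, and the real driver's bookkeeping is not shown in the thread.

```c
#include <stdbool.h>

/* Hypothetical model of a TX queue with a stop/wake flag. */
struct txq {
    unsigned int size;   /* total entries */
    unsigned int used;   /* entries in flight */
    bool stopped;        /* true once the queue filled up */
};

/* Post one packet; stop the queue when the last entry is taken. */
static bool txq_post(struct txq *q)
{
    if (q->stopped || q->used == q->size)
        return false;
    q->used++;
    if (q->used == q->size)
        q->stopped = true;
    return true;
}

/*
 * Process 'done' completions. The queue is only woken once at least
 * half of it is free again, so the stack finds room for a burst of
 * packets rather than a single slot. Returns true if a wake happened.
 */
static bool txq_complete(struct txq *q, unsigned int done)
{
    if (done > q->used)
        done = q->used;
    q->used -= done;
    if (q->stopped && q->size - q->used >= q->size / 2) {
        q->stopped = false;
        return true;
    }
    return false;
}
```

The threshold guarantees a minimum amount of free space at wake time, but as the message notes, that alone does not prove a given GSO burst will fit.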

Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread Roland Dreier
I thought to enable GSO, device driver actually does nothing other than enabling the flag. GSO moved TCP offloading to interface layer before device xmit. It's a different idea with multiple packets per xmit. GSO still queues the packets one by one in QDISC and xmits them one by one. The

Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread Michael Chan
On Tue, 2007-05-15 at 14:08 -0700, Roland Dreier wrote: Well, IPoIB doesn't do netif_wake_queue() until half the device's TX queue is free, so we should get batching. However, I'm not sure that I can count on a fudge factor ensuring that there's enough space to handle everything

Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread David Miller
From: Michael Chan [EMAIL PROTECTED] Date: Tue, 15 May 2007 15:05:28 -0700 On Tue, 2007-05-15 at 14:08 -0700, Roland Dreier wrote: Well, IPoIB doesn't do netif_wake_queue() until half the device's TX queue is free, so we should get batching. However, I'm not sure that I can count

Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread David Miller
From: Shirley Ma [EMAIL PROTECTED] Date: Tue, 15 May 2007 14:22:57 -0700 I just wonder without TSO support in HW, how much benefit we can get by pushing GSO from interface layer to device layer besides we can do multiple packets in IPoIB. I bet the gain is non-trivial. I'd say about

Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread Roland Dreier
Shirley I just wonder without TSO support in HW, how much Shirley benefit we can get by pushing GSO from interface layer to Shirley device layer besides we can do multiple packets in IPoIB. The entire benefit comes from having multiple packets to queue in one call to the xmit

[WIP] [PATCH] WAS Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread jamal
On Tue, 2007-15-05 at 14:32 -0700, David Miller wrote: An efficient qdisc-->driver transfer during netif_wake_queue() could help solve some of that, as is being discussed here. Ok, here's the approach i discussed at netconf. It needs net-2.6 and the patch i posted earlier to clean up

Re: [WIP] [PATCH] WAS Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread jamal
On Tue, 2007-15-05 at 18:17 -0400, jamal wrote: I will post a patch for tun device in a few minutes that i use to test on my laptop (i need to remove some debugs) to show an example. Ok, here it is. The way i test is to point packets at a tun device. [One way i do it is attach an ingress

Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread David Miller
From: Shirley Ma [EMAIL PROTECTED] Date: Tue, 15 May 2007 16:33:22 -0700 That's interesting. So a generic LRO in interface layer will benefit the performance more, right? Receiving path TCP N times is more expensive than sending, I think. If you look at some of the drivers doing LRO,

Re: [WIP] [PATCH] WAS Re: [RFC] New driver API to speed up small packets xmits

2007-05-15 Thread jamal
On Tue, 2007-15-05 at 18:48 -0400, jamal wrote: I will try to post the e1000 patch tonight or tomorrow morning. I have the e1000 path done; a few features from the 2.6.18 missing (mainly the one mucking with tx ring pruning on the tx path). While it compiles and looks right - i haven't tested it

Re: [RFC] New driver API to speed up small packets xmits

2007-05-13 Thread Andi Kleen
On Friday 11 May 2007 13:16:44 Roland Dreier wrote: I wasn't talking about sending. But there actually is :- TSO/GSO. As I said before, getting multiple packets in one call to xmit would be nice for amortizing per-xmit overhead in IPoIB. So it would be nice if the cases where the

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Gagan Arneja
Krishna Kumar2 wrote: What about a race between trying to reacquire queue_lock and another failed transmit? That is not possible too. I hold the QDISC_RUNNING bit in dev->state and am the only sender for this device, so there is no other failed transmit. Also, on failure of dev_hard_start_xmit,

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Krishna Kumar2
Hi Dave, David Miller [EMAIL PROTECTED] wrote on 05/11/2007 02:27:07 AM: I don't understand how transmitting already batched up packets in one go introduce latency. Keep thinking :-) The only case where these ideas can be seriously considered is during netif_wake_queue(). In all other

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Krishna Kumar2
Hi Gagan, Gagan Arneja [EMAIL PROTECTED] wrote on 05/11/2007 11:27:54 AM: Right, but I am the sole dequeue'r, and on failure, I requeue those packets to the beginning of the queue (just as it would happen in the regular case of one packet xmit/failure/requeue). What about a race

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Krishna Kumar2
(Mistakenly didn't reply-all the previous time) Hi Dave, David Stevens [EMAIL PROTECTED] wrote on 05/11/2007 02:57:56 AM: The word small is coming up a lot in this discussion, and I think packet size really has nothing to do with it. Multiple streams generating packets of any size would benefit;

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Krishna Kumar2
Hi Gagan, I have to claim incomplete familiarity for the code. But still, if you're out there running with no locks for a period, there's no assumption you can make. The lock could be held quickly assertion is a fallacy. I will try to explain since the code is pretty complicated. Packets

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Krishna Kumar2
Hi Roland, Roland Dreier [EMAIL PROTECTED] wrote on 05/11/2007 01:51:50 AM: This is pretty interesting to me for IP-over-InfiniBand, for a couple of reasons. First of all, I can push multiple send requests to the underlying adapter in one go, which saves taking and dropping the same lock

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Andi Kleen
Krishna Kumar [EMAIL PROTECTED] writes: Doing some measurements, I found that for small packets like 128 bytes, the bandwidth is approximately 60% of the line speed. To possibly speed up performance of small packet xmits, a method of linking skbs was thought of - where two pointers

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Krishna Kumar2
Hi Andi, [EMAIL PROTECTED] wrote on 05/11/2007 02:35:05 PM: You don't need that. You can just use the normal next/prev pointers. In general it's a good idea to lower lock overhead etc., the VM has used similar tricks very successfully in the past. Does this mean each skb should be for the

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Krishna Kumar2
Hi Andi, Andi Kleen [EMAIL PROTECTED] wrote on 05/11/2007 03:07:14 PM: But without it aggregation on RX is much less useful because the packets cannot be kept together after socket demux which happens relatively early in the packet processing path. Then I misunderstood you, my proposal is

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Evgeniy Polyakov
On Fri, May 11, 2007 at 10:34:22AM +0530, Krishna Kumar2 ([EMAIL PROTECTED]) wrote: Not combining packets, I am sending them out in the same sequence it was queued. If the xmit failed, the driver's new API returns the skb which failed to be sent. This skb and all other linked skbs are

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Krishna Kumar2
Hi Evgeniy, Evgeniy Polyakov [EMAIL PROTECTED] wrote on 05/11/2007 02:31:38 PM: On Fri, May 11, 2007 at 10:34:22AM +0530, Krishna Kumar2 ([EMAIL PROTECTED]) wrote: Not combining packets, I am sending them out in the same sequence it was queued. If the xmit failed, the driver's new API

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Krishna Kumar2
Evgeniy Polyakov [EMAIL PROTECTED] wrote on 05/11/2007 03:02:02 PM: On Fri, May 11, 2007 at 02:48:14PM +0530, Krishna Kumar2 ([EMAIL PROTECTED]) wrote: And what if you have thousand(s) of packets queued and first one has failed, requeuing all the rest one-by-one is not a solution. If it is

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Evgeniy Polyakov
On Fri, May 11, 2007 at 03:22:13PM +0530, Krishna Kumar2 ([EMAIL PROTECTED]) wrote: No locks, no requeues? Seems simple imho. I will analyze this in more detail when I return (leaving just now, so got really no time). The only issue that I see quickly is No locks, since to get things off

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Krishna Kumar2
Hi all, Very preliminary testing with 20 procs on E1000 driver gives me following result:

skbsz   Org BW   New BW   %       Org demand   New Demand   %
32      315.98   347.48   9.97%   21090        20958        0.62%
96
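The percentage columns in the result above follow from simple relative-change arithmetic. The sketch below reproduces the first row, reading it as bandwidth 315.98 before and 347.48 after, and demand 21090 before and 20958 after; pct_change() is a hypothetical helper, and treating the demand column as "lower is better" is an assumption.

```c
/*
 * Relative change from 'before' to 'after', in percent.
 * Positive means an increase, negative a decrease.
 */
static double pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}

/* Bandwidth gain for the 32-byte row: a little under 10 percent. */
static double row32_bw_gain(void)
{
    return pct_change(315.98, 347.48);
}

/* Demand reduction for the 32-byte row: well under 1 percent. */
static double row32_demand_drop(void)
{
    return -pct_change(21090.0, 20958.0);
}
```

Only the first row is complete in the excerpt, so no other rows are evaluated here.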

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Evgeniy Polyakov
On Fri, May 11, 2007 at 02:48:14PM +0530, Krishna Kumar2 ([EMAIL PROTECTED]) wrote: And what if you have thousand(s) of packets queued and first one has failed, requeuing all the rest one-by-one is not a solution. If it is being done under heavy lock (with disabled irqs especially) it

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Roland Dreier
Sounds a good idea. I had a question on error handling. What happens if the driver asynchronously returns an error for this WR (single WR containing multiple skbs) ? Does it mean all the skbs failed to be sent ? Requeuing all of them is a bad idea since it leads to infinitely doing the

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread jamal
On Fri, 2007-11-05 at 10:52 +0530, Krishna Kumar2 wrote: I didn't try to optimize the driver to take any real advantage, I coded it as simply as : top: next = skb->skb_flink; Original driver code here, or another option is to remove the locking and put it before the

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread jamal
On Fri, 2007-11-05 at 13:56 +0400, Evgeniy Polyakov wrote: I meant no locks during processing of the packets (pci read/write, dma setup and so on), of course it is needed to dequeue a packet, but only for that operation. I don't think you can avoid the lock Evgeniy. You need to protect against

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Roland Dreier
I wasn't talking about sending. But there actually is :- TSO/GSO. As I said before, getting multiple packets in one call to xmit would be nice for amortizing per-xmit overhead in IPoIB. So it would be nice if the cases where the stack does GSO ended up passing all the segments into the

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread Evgeniy Polyakov
On Fri, May 11, 2007 at 07:30:02AM -0400, jamal ([EMAIL PROTECTED]) wrote: I meant no locks during processing of the packets (pci read/write, dma setup and so on), of course it is needed to dequeue a packet, but only for that operation. I don't think you can avoid the lock Evgeniy. You

Re: [RFC] New driver API to speed up small packets xmits

2007-05-11 Thread jamal
On Fri, 2007-11-05 at 15:53 +0400, Evgeniy Polyakov wrote: As I said there might be another lock, if interrupt handler is shared, or registers are accessed, but it is private driver's business, which has nothing in common with stack itself. Ok, we are saying the same thing then. eg in e1000

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Gagan Arneja
Right, but I am the sole dequeue'r, and on failure, I requeue those packets to the beginning of the queue (just as it would happen in the regular case of one packet xmit/failure/requeue). What about a race between trying to reacquire queue_lock and another failed transmit? -- Gagan - KK

[RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Krishna Kumar
Hi all, While looking at common packet sizes on xmits, I found that most of the packets are small. On my personal system, the statistics of packets after using (browsing, mail, ftp'ing two linux kernels from www.kernel.org) for about 6 hours is :

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Evgeniy Polyakov
On Thu, May 10, 2007 at 08:23:51PM +0530, Krishna Kumar ([EMAIL PROTECTED]) wrote: The reason to implement the same was to speed up IPoIB driver. But before doing that, a proof of concept for E1000/AMSO drivers was considered (as most of the code is generic) before implementing for IPoIB. I

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Krishna Kumar2
Hi Evgeniy, Evgeniy Polyakov [EMAIL PROTECTED] wrote on 05/10/2007 08:38:33 PM: On Thu, May 10, 2007 at 08:23:51PM +0530, Krishna Kumar ([EMAIL PROTECTED]) wrote: The reason to implement the same was to speed up IPoIB driver. But before doing that, a proof of concept for E1000/AMSO drivers

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Evgeniy Polyakov
On Thu, May 10, 2007 at 08:52:12PM +0530, Krishna Kumar2 ([EMAIL PROTECTED]) wrote: The reason to implement the same was to speed up IPoIB driver. But before doing that, a proof of concept for E1000/AMSO drivers was considered (as most of the code is generic) before implementing for

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread jamal
On Thu, 2007-10-05 at 19:48 +0400, Evgeniy Polyakov wrote: IMHO if you do not see in profile anything related to driver's xmit function, it does not require to be fixed. True, but i think there may be value in amortizing the cost towards the driver. i.e If you grab a lock and send X packets
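The amortization argument above (grab a lock once and send X packets) can be made concrete by counting lock acquisitions. The following is a user-space sketch under stated assumptions: struct mock_lock only counts acquisitions and provides no real mutual exclusion, and both send functions are hypothetical, not driver code.

```c
#include <stddef.h>

/* Counting stand-in for a TX lock; no real locking is performed. */
struct mock_lock {
    unsigned long acquisitions;
};

static void mock_lock_acquire(struct mock_lock *l) { l->acquisitions++; }
static void mock_lock_release(struct mock_lock *l) { (void)l; }

/* Classic path: one lock round-trip per packet. */
static void send_one_by_one(struct mock_lock *l, size_t npkts)
{
    for (size_t i = 0; i < npkts; i++) {
        mock_lock_acquire(l);
        /* hand exactly one packet to the hardware here */
        mock_lock_release(l);
    }
}

/* Batched path: one lock round-trip per batch of up to 'batch' packets. */
static void send_batched(struct mock_lock *l, size_t npkts, size_t batch)
{
    size_t sent = 0;

    if (batch == 0)
        batch = 1;   /* avoid an endless loop on a zero batch size */
    while (sent < npkts) {
        size_t n = npkts - sent;

        if (n > batch)
            n = batch;
        mock_lock_acquire(l);
        sent += n;   /* hand n packets to the hardware here */
        mock_lock_release(l);
    }
}
```

For 64 packets, the per-packet path takes the lock 64 times while a batch size of 16 takes it 4 times; whether that translates into a measurable win depends on how expensive the lock and the driver entry really are, which is what the thread is debating.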

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Rick Jones
It is the reverse - GSO will segment one super-packet just before calling the driver so that the stack is traversed only once. In my case, I am trying to send out multiple skbs, possibly small packets, in one shot. GSO will not help for small packets. If there are small packets that implies

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Sridhar Samudrala
On Thu, 2007-05-10 at 10:19 -0700, Rick Jones wrote: It is the reverse - GSO will segment one super-packet just before calling the driver so that the stack is traversed only once. In my case, I am trying to send out multiple skbs, possibly small packets, in one shot. GSO will not help for

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Vlad Yasevich
Rick Jones wrote: It is the reverse - GSO will segment one super-packet just before calling the driver so that the stack is traversed only once. In my case, I am trying to send out multiple skbs, possibly small packets, in one shot. GSO will not help for small packets. If there are small

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Rick Jones
Vlad Yasevich wrote: Rick Jones wrote: It is the reverse - GSO will segment one super-packet just before calling the driver so that the stack is traversed only once. In my case, I am trying to send out multiple skbs, possibly small packets, in one shot. GSO will not help for small packets.

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Vlad Yasevich
Rick Jones wrote: Vlad Yasevich wrote: Rick Jones wrote: It is the reverse - GSO will segment one super-packet just before calling the driver so that the stack is traversed only once. In my case, I am trying to send out multiple skbs, possibly small packets, in one shot. GSO will not help

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Rick Jones
Not sure if DCCP might fall into this category as well... I think the idea of this patch is gather some number of these small packets and shove them at the driver in one go instead of each small packet at a time. This reminds me... (rick starts waxing rhapsodic about old HP-UX behaviour :)

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Ian McDonald
On 5/11/07, Vlad Yasevich [EMAIL PROTECTED] wrote: May be for TCP? What about other protocols? There are other protocols?-) True, UDP, and I suppose certain modes of SCTP might be sending streams of small packets, as might TCP with TCP_NODELAY set. Do they often queue-up outside the

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Vlad Yasevich
Ian McDonald wrote: On 5/11/07, Vlad Yasevich [EMAIL PROTECTED] wrote: May be for TCP? What about other protocols? There are other protocols?-) True, UDP, and I suppose certain modes of SCTP might be sending streams of small packets, as might TCP with TCP_NODELAY set. Do they

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Ian McDonald
On 5/11/07, Vlad Yasevich [EMAIL PROTECTED] wrote: The win might be biggest on a system where a lot of applications send a lot of small packets. Some number will aggregate in the prio queue and then get shoved into a driver in one go. That's assuming that the device doesn't run out of things

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Gagan Arneja
small packets belonging to the same connection could be coalesced by TCP, but this may help the case where multiple parallel connections are sending small packets. It's not just small packets. The cost of calling hard_start_xmit/byte was rather high on your particular device. I've seen PCI

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread jamal
The discussion seems to have steered into protocol coalescing. My tests for example were related to forwarding and not specific to any protocol. On Thu, 2007-10-05 at 12:43 -0700, Gagan Arneja wrote: It's not just small packets. The cost of calling hard_start_xmit/byte was rather high on

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Rick Jones
jamal wrote: The discussion seems to have steered into protocol coalescing. My tests for example were related to forwarding and not specific to any protocol. Just the natural tendency of end-system types to think of end-system things rather than router things. rick jones - To unsubscribe

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Gagan Arneja
jamal wrote: You would need to almost re-write the driver to make sure it does IO which is taking advantage of the batching. Really! It's just the transmit routine. How radical can that be? -- Gagan

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread jamal
On Thu, 2007-10-05 at 13:14 -0700, Rick Jones wrote: Just the natural tendency of end-system types to think of end-system things rather than router things. Well router types felt they were being left out ;- cheers, jamal

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread jamal
On Thu, 2007-10-05 at 13:15 -0700, Gagan Arneja wrote: Really! It's just the transmit routine. How radical can that be? Ok, you have a point there, but it could be challenging with many tunables: For example: my biggest challenge with the e1000 was just hacking up the DMA setup path - i seem

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Roland Dreier
This is pretty interesting to me for IP-over-InfiniBand, for a couple of reasons. First of all, I can push multiple send requests to the underlying adapter in one go, which saves taking and dropping the same lock and also probably allows fewer MMIO writes for doorbells. However the second reason

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Gagan Arneja
For example: my biggest challenge with the e1000 was just hacking up the DMA setup path - i seem to get better numbers if i don't kick the DMA until i stash all the packets on the ring first etc. It seemed counter-intuitive. That seems to make sense. The rings are(?) in system memory and you

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread David Miller
From: Vlad Yasevich [EMAIL PROTECTED] Date: Thu, 10 May 2007 15:21:30 -0400 The win might be biggest on a system where a lot of applications send a lot of small packets. Some number will aggregate in the prio queue and then get shoved into a driver in one go. But... this is all conjecture

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread David Miller
From: Gagan Arneja [EMAIL PROTECTED] Date: Thu, 10 May 2007 12:43:53 -0700 It's not just small packets. The cost of calling hard_start_xmit/byte was rather high on your particular device. I've seen PCI read transaction in hard_start_xmit taking ~10,000 cycles on one particular device.

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Gagan Arneja
David Miller wrote: If the qdisc is packed with packets and we would just loop sending them to the device, yes it might make sense. But if that isn't the case, which frankly is the usual case, you add a non-trivial amount of latency by batching and that's bad exactly for the kind of

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Rick Jones
David Miller wrote: From: Vlad Yasevich [EMAIL PROTECTED] Date: Thu, 10 May 2007 15:21:30 -0400 The win might be biggest on a system where a lot of applications send a lot of small packets. Some number will aggregate in the prio queue and then get shoved into a driver in one go. But... this

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread David Miller
From: Gagan Arneja [EMAIL PROTECTED] Date: Thu, 10 May 2007 13:40:22 -0700 David Miller wrote: If the qdisc is packed with packets and we would just loop sending them to the device, yes it might make sense. But if that isn't the case, which frankly is the usual case, you add a

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread David Miller
From: Rick Jones [EMAIL PROTECTED] Date: Thu, 10 May 2007 13:49:44 -0700 I'd think one would only do this in those situations/places where a natural out of driver queue develops in the first place wouldn't one? Indeed.

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Gagan Arneja
David Miller wrote: From: Rick Jones [EMAIL PROTECTED] Date: Thu, 10 May 2007 13:49:44 -0700 I'd think one would only do this in those situations/places where a natural out of driver queue develops in the first place wouldn't one? Indeed. And one builds in qdisc because your device sink is

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread David Stevens
The word small is coming up a lot in this discussion, and I think packet size really has nothing to do with it. Multiple streams generating packets of any size would benefit; the key ingredient is a queue length greater than 1. I think the intent is to remove queue lock cycles by taking the whole

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread David Miller
From: David Stevens [EMAIL PROTECTED] Date: Thu, 10 May 2007 14:27:56 -0700 The word small is coming up a lot in this discussion, and I think packet size really has nothing to do with it. Multiple streams generating packets of any size would benefit; the key ingredient is a queue length

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Eric Dumazet
David Stevens a écrit : The word small is coming up a lot in this discussion, and I think packet size really has nothing to do with it. Multiple streams generating packets of any size would benefit; the key ingredient is a queue length greater than 1. I think the intent is to remove queue lock

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Rick Jones
David Stevens wrote: The word small is coming up a lot in this discussion, and I think packet size really has nothing to do with it. Multiple streams generating packets of any size would benefit; the key ingredient is a queue length greater than 1. I think the intent is to remove queue lock

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Gagan Arneja
David Miller wrote: Right. But I think it's critical to do two things: 1) Do this when netif_wake_queue() is triggered and thus the TX is locked already. 2) Have some way for the driver to say how many free TX slots there are in order to minimize if not eliminate requeueing during
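Point 2 above, letting the driver report its free TX slots so the stack never hands over more than fits, can be sketched as a simple budget. Everything here is hypothetical and user-space only: struct swq models the qdisc backlog as a count, struct hwring models the hardware ring, and batch_budget() and xmit_batch() are illustrative names.

```c
#include <stddef.h>

/* Hypothetical software queue: just a backlog count. */
struct swq {
    size_t len;
};

/* Hypothetical hardware TX ring. */
struct hwring {
    size_t size;   /* total slots */
    size_t used;   /* slots in flight */
};

/* How many packets can be moved right now without any requeue. */
static size_t batch_budget(const struct swq *q, const struct hwring *r)
{
    size_t free_slots = r->size - r->used;

    return q->len < free_slots ? q->len : free_slots;
}

/*
 * Move at most 'budget' packets from the software queue to the ring.
 * Because the budget never exceeds the free slots, nothing handed to
 * the driver can bounce back for requeueing.
 */
static size_t xmit_batch(struct swq *q, struct hwring *r)
{
    size_t n = batch_budget(q, r);

    q->len -= n;
    r->used += n;
    return n;
}
```

This only holds while the slot count cannot change underneath the caller, which is why the message ties the scheme to the TX path already being locked.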

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread David Stevens
Which worked _very_ well (the whole list) going in the other direction for the netisr queue(s) in HP-UX 10.20. OK, I promise no more old HP-UX stories for the balance of the week :) Yes, OSes I worked on in other lives usually took the whole queue and then took responsibility for

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread David Miller
From: Gagan Arneja [EMAIL PROTECTED] Date: Thu, 10 May 2007 14:50:19 -0700 David Miller wrote: If you drop the TX lock, the number of free slots can change as another cpu gets in there queuing packets. Can you ever have more than one thread inside the driver? Isn't xmit_lock held

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Rick Jones
Eric Dumazet wrote: David Stevens a écrit : The word small is coming up a lot in this discussion, and I think packet size really has nothing to do with it. Multiple streams generating packets of any size would benefit; the key ingredient is a queue length greater than 1. I think the intent is

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Stephen Hemminger
On Thu, 10 May 2007 14:14:05 -0700 Gagan Arneja [EMAIL PROTECTED] wrote: David Miller wrote: From: Rick Jones [EMAIL PROTECTED] Date: Thu, 10 May 2007 13:49:44 -0700 I'd think one would only do this in those situations/places where a natural out of driver queue develops in the first

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Gagan Arneja
If you have braindead slow hardware, there is nothing that says your start_xmit routine can't do its own coalescing. The cost of calling the transmit routine is the responsibility of the driver, not the core network code. Yes, except you very likely run the risk of the driver introducing

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Krishna Kumar2
Ian McDonald [EMAIL PROTECTED] wrote on 05/11/2007 12:29:08 AM: As I see this proposed patch it is about reducing the number of task switches between the driver and the protocol. I use task switch in speech marks as it isn't really as is in the kernel. So in other words we are hoping that

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Krishna Kumar2
Gagan Arneja [EMAIL PROTECTED] wrote on 05/11/2007 01:13:53 AM: Also, I think, you don't have to chain skbs, they're already chained in Qdisc->q. All you have to do is take the whole q and try to shove it at the device hoping for better results. But then, if you have rather big backlog, you

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Krishna Kumar2
J Hadi Salim [EMAIL PROTECTED] wrote on 05/11/2007 01:41:27 AM: It's not just small packets. The cost of calling hard_start_xmit/byte was rather high on your particular device. I've seen PCI read transaction in hard_start_xmit taking ~10,000 cycles on one particular device. Count the

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Krishna Kumar2
David Miller [EMAIL PROTECTED] wrote on 05/11/2007 02:07:10 AM: From: Gagan Arneja [EMAIL PROTECTED] Date: Thu, 10 May 2007 12:43:53 -0700 Also, I think, you don't have to chain skbs, they're already chained in Qdisc->q. All you have to do is take the whole q and try to shove it at

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Gagan Arneja
Krishna Kumar2 wrote: I haven't seen reordering packets (I did once when I was having a bug in the requeue code, some TCP messages on receiver indicating packets out of order). When a send fails, the packets are requeued in reverse (go to end of the failed skb and traverse back to the failed skb
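The reverse requeue described above keeps the original order because each packet is pushed at the head of the queue, starting from the last unsent one. A minimal user-space sketch follows; struct pkt, struct pktq and requeue_from() are hypothetical, and the plain next/prev links merely stand in for whatever linkage the patch uses.

```c
#include <stddef.h>

/* Hypothetical packet with plain doubly linked pointers. */
struct pkt {
    int id;
    struct pkt *next;
    struct pkt *prev;
};

/* Hypothetical queue with head and tail pointers. */
struct pktq {
    struct pkt *head;
    struct pkt *tail;
};

static void pktq_push_head(struct pktq *q, struct pkt *p)
{
    p->prev = NULL;
    p->next = q->head;
    if (q->head)
        q->head->prev = p;
    else
        q->tail = p;
    q->head = p;
}

/*
 * 'failed' is the first packet of a linked batch that was not sent.
 * Walk to the end of the batch, then push packets at the queue head
 * while moving back towards 'failed'. The failed packet ends up first,
 * followed by the rest of the batch in its original order.
 */
static void requeue_from(struct pktq *q, struct pkt *failed)
{
    struct pkt *cur = failed;

    while (cur->next)
        cur = cur->next;
    for (;;) {
        /* Save the predecessor before the push rewrites cur->prev. */
        struct pkt *before = (cur == failed) ? NULL : cur->prev;

        pktq_push_head(q, cur);
        if (!before)
            break;
        cur = before;
    }
}
```

For a batch 1, 2, 3, 4 where packet 1 was sent and packet 2 failed, and a queue that already holds packet 5, the queue afterwards reads 2, 3, 4, 5.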

Re: [RFC] New driver API to speed up small packets xmits

2007-05-10 Thread Krishna Kumar2
Gagan Arneja [EMAIL PROTECTED] wrote on 05/11/2007 11:05:47 AM: Krishna Kumar2 wrote: I haven't seen reordering packets (I did once when I was having a bug in the requeue code, some TCP messages on receiver indicating packets out of order). When a send fails, the packets are requeued in