Re: Who's allowed to set a skb destructor?
On Thu, Jul 05, 2007 at 03:06:40PM +0200, Andi Kleen wrote:
> On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski wrote:
> > I wonder if it's very unsound to think about a one-way list of destructors. Of course, non-owners could only clean up their private allocations. Wouldn't this save some skb cloning, copying or adding new fields for private info?
>
> skb cloning isn't very expensive when you need it. And they've got a little private area you can use for your own stuff while you have it queued (skb->cb).
>
> As a historical note, one of the big changes during the Linux 2.0 and 2.1 TCP rewrite was that TCP was changed to always clone for the retransmit queue. This cleaned up the code greatly and fixed many problems. Cloning was also especially optimized for this. When TCP, which is one of the most performance-critical protocols around, can afford it, other code likely can too.

I've thought about this a bit more and, unless I'm missing something, it is possible to use these things together. Let's imagine a simplified API like this:

- a driver which needs a bit of space to track skbs in a few places registers itself with some function, giving the size, maybe a callback/destructor and maybe a protocol id; some index is returned;
- if this is the first driver registered, the API allocates new space using skb cloning or some similar slab pool, to get blank space, and reserves space for this driver according to the index (internally mapped to some offset); from this moment every new skb is automatically 'cloned', and the driver can read/write its area using the API to map the requests;
- subsequently registered drivers use the same 'clone' until there is no more space, at which point further 'clones' are generated;
- the lifetime of such 'clones' is controlled similarly to that of 'real clones'; in the most basic version destructors could be avoided;
- some indexes could be made public constants to allow sharing.

Is this wrong?

Regards,
Jarek P.
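To make the registration idea in this message concrete, here is a minimal userspace sketch in C. Nothing like it exists in the kernel: skb_priv_register(), skb_priv(), struct fake_skb and the size limits are all hypothetical names chosen for illustration. It covers only the index-to-offset mapping; the optional callback, the protocol id, the lifetime handling and the generation of further 'clones' when the area is full are left out (registration simply fails instead).

```c
#include <stddef.h>
#include <stdlib.h>

#define PRIV_AREA_SIZE 64   /* assumed size of the shared per-skb area */
#define PRIV_MAX_USERS 8    /* assumed limit on registered drivers */
#define PRIV_ALIGN     8    /* every reservation starts on an 8-byte boundary */

/* One registration: where a driver's private bytes live inside the area. */
struct priv_slot {
    size_t offset;
    size_t size;
};

static struct priv_slot priv_slots[PRIV_MAX_USERS];
static int priv_nr_slots;
static size_t priv_used;

/* Hypothetical: reserve 'size' bytes for a driver.
 * Returns the driver's index, or -1 if the request cannot be satisfied. */
int skb_priv_register(size_t size)
{
    size_t rounded;

    if (size == 0 || size > PRIV_AREA_SIZE || priv_nr_slots == PRIV_MAX_USERS)
        return -1;

    /* Round up so the next driver's area stays aligned. */
    rounded = (size + PRIV_ALIGN - 1) & ~(size_t)(PRIV_ALIGN - 1);
    if (rounded > PRIV_AREA_SIZE - priv_used)
        return -1;

    priv_slots[priv_nr_slots].offset = priv_used;
    priv_slots[priv_nr_slots].size = size;
    priv_used += rounded;
    return priv_nr_slots++;
}

/* Stand-in for an skb: only the shared private area is modelled. */
struct fake_skb {
    _Alignas(PRIV_ALIGN) unsigned char priv[PRIV_AREA_SIZE];
};

/* Every new skb automatically carries a zeroed private area. */
struct fake_skb *fake_skb_alloc(void)
{
    return calloc(1, sizeof(struct fake_skb));
}

/* Map a registration index to that driver's bytes inside this skb. */
void *skb_priv(struct fake_skb *skb, int index)
{
    if (!skb || index < 0 || index >= priv_nr_slots)
        return NULL;
    return skb->priv + priv_slots[index].offset;
}
```

A driver would call skb_priv_register() once at load time, keep the returned index, and then use skb_priv(skb, index) wherever it needs its per-packet state. Whether the extra area and the indirection cost less than a plain clone is exactly the open question of this thread.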
- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Who's allowed to set a skb destructor?
On Thu, Jul 05, 2007 at 04:28:47PM +0400, Evgeniy Polyakov wrote:
> Hi, Jarek.
>
> On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski ([EMAIL PROTECTED]) wrote:
> > I wonder if it's very unsound to think about a one-way list of destructors. Of course, non-owners could only clean up their private allocations. Wouldn't this save some skb cloning, copying or adding new fields for private info?
>
> There should not be any additional allocations, since they are very slow. That part of mbuf is really horrible for performance: OpenBSD hackers removed the additional allocation of an mbuf tag in the PF code during the last hackathon, which doubled its performance. That is why skb has only one control structure and data area, which incorporates the additional control information, so there is no need for multiple destructors.

I'd like to add a few words about performance-oriented thinking. Some time ago I mainly read networking/admin lists. One of the most frequent questions was: which should I choose, Linux or BSD? Very often BSD was praised for better performance, but Linux was almost always recommended as more universal (even by people who said they used both). BSDs were sometimes recommended for specific jobs like mail etc., but Linux usually fitted the needs better. Linux looked especially good as an internet gateway/router/firewall/antispam box, and the main reasons were netfilter with additional, unofficial patches, e.g. l-7 filtering and IMQ. BSD was not an option here.

Some time later, reading this list, I found that many people almost hate netfilter for its performance. You can imagine what l-7 adds to that. IMQ isn't even mentioned here; it looks like some dirty word (a lack of programmers affects its quality and doesn't help Linux either). But it is nowhere near good performance either. I can also remember quite a lot of questions like: how can I avoid tc/ip and do this with netfilter only?

Probably most of the readers/writers were admins of small or medium networks (though quite often servicing hundreds or thousands of boxes too), probably not always advanced enough, but you know what, they made up 99% of those interested. So, I understand something could have changed with VoIP, and there are high-performance Linux servers too (whose admins have never heard of IMQ), and thinking about them could probably pay off better, but there could be some cost to such thinking too.

Regards,
Jarek P.

PS: in my opinion, lack of Linux performance wasn't even the second most frequently asked question there; it was rather this: why does my beautiful new Linux box sometimes lock up?
Re: Who's allowed to set a skb destructor?
On Fri, Jul 06, 2007 at 11:08:35AM +0200, Jarek Poplawski wrote:
...
> BSDs were sometimes recommended for specific jobs like mail etc., but Linux usually fitted the needs better. Linux looked especially good as an internet gateway/router/firewall/antispam box, and the main reasons were netfilter with additional, unofficial patches, e.g. l-7 filtering and IMQ. BSD was not an option here.

I forgot to mention two other performance boosters which very often completed these solutions: htb or hfsc.

Jarek P.
Re: Who's allowed to set a skb destructor?
Brice Goglin [EMAIL PROTECTED] writes:
> I am trying to understand whether I can set up a skb destructor in my code (which is basically a protocol above dev_queue_xmit() and co). From what I see in many parts of the current kernel code, the protocol (I mean, the one who actually creates the skb) may set up a destructor.

The socket layer generally needs it for its own accounting. Unless you never pass it up, you can't use it.

> However, I also see some places where some low-level drivers might be using a destructor too, without apparently checking whether an upper layer already uses one. For instance, write_ofld_wr() in cxgb3/sge.c.

Likely a bug. Normally that should not slip past code review.

> I found some old threads about adding support for multiple destructors but I don't see anything like this in the current kernel. So, I'd like to have a clear statement about who's allowed to use a destructor :)

The traditional standpoint was that having your own large skb pools is not recommended, because you won't interact well with the rest of the system when it is running low on memory while you are tying up memory. Essentially you would recreate all the problems traditional Unix systems have with fixed-size mbuf pools. Linux has always used a more dynamic and flexible allocate-only-as-you-need approach, even when it can have a little more overhead in managing IOMMUs etc.

These days there are shrinker callbacks that would in theory allow you to handle this, but it would likely still be hard to implement correctly.

-Andi
Re: Who's allowed to set a skb destructor?
Andi Kleen wrote:
> Brice Goglin [EMAIL PROTECTED] writes:
> > I am trying to understand whether I can set up a skb destructor in my code (which is basically a protocol above dev_queue_xmit() and co). From what I see in many parts of the current kernel code, the protocol (I mean, the one who actually creates the skb) may set up a destructor.
>
> The socket layer generally needs it for its own accounting. Unless you never pass it up, you can't use it.
>
> > However, I also see some places where some low-level drivers might be using a destructor too, without apparently checking whether an upper layer already uses one. For instance, write_ofld_wr() in cxgb3/sge.c.
>
> Likely a bug. Normally that should not slip past code review.

Andi,

The destructor method is set and used for skbs originating from the RDMA driver sitting above cxgb3. The patch introducing this code was discussed at the time:
http://marc.info/?l=linux-netdev&m=117029329230969&w=2

Cheers,
Divy
Re: Who's allowed to set a skb destructor?
On 05-07-2007 12:08, Andi Kleen wrote:
...
> The traditional standpoint was that having your own large skb pools is not recommended, because you won't interact well with the rest of the system when it is running low on memory while you are tying up memory. Essentially you would recreate all the problems traditional Unix systems have with fixed-size mbuf pools. Linux has always used a more dynamic and flexible allocate-only-as-you-need approach, even when it can have a little more overhead in managing IOMMUs etc.

I wonder if it's very unsound to think about a one-way list of destructors. Of course, non-owners could only clean up their private allocations. Wouldn't this save some skb cloning, copying or adding new fields for private info?

Regards,
Jarek P.
Re: Who's allowed to set a skb destructor?
Hi, Jarek.

On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski ([EMAIL PROTECTED]) wrote:
> I wonder if it's very unsound to think about a one-way list of destructors. Of course, non-owners could only clean up their private allocations. Wouldn't this save some skb cloning, copying or adding new fields for private info?

There should not be any additional allocations, since they are very slow. That part of mbuf is really horrible for performance: OpenBSD hackers removed the additional allocation of an mbuf tag in the PF code during the last hackathon, which doubled its performance. That is why skb has only one control structure and data area, which incorporates the additional control information, so there is no need for multiple destructors.

> Regards,
> Jarek P.

--
	Evgeniy Polyakov
Re: Who's allowed to set a skb destructor?
On Thu, Jul 05, 2007 at 04:28:47PM +0400, Evgeniy Polyakov wrote:
> Hi, Jarek.
>
> On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski ([EMAIL PROTECTED]) wrote:
> > I wonder if it's very unsound to think about a one-way list of destructors. Of course, non-owners could only clean up their private allocations. Wouldn't this save some skb cloning, copying or adding new fields for private info?
>
> There should not be any additional allocations, since they are very slow. That part of mbuf is really horrible for performance: OpenBSD hackers removed the additional allocation of an mbuf tag in the PF code during the last hackathon, which doubled its performance. That is why skb has only one control structure and data area, which incorporates the additional control information, so there is no need for multiple destructors.

Of course, my knowledge of this is far from sufficient, and maybe I got this reversed, but from Andi's words I understood that Linux prefers another (mixed) approach, so I thought such a list would be a consequence...

Thanks,
Jarek P.
Re: Who's allowed to set a skb destructor?
On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski wrote:
> I wonder if it's very unsound to think about a one-way list of destructors. Of course, non-owners could only clean up their private allocations. Wouldn't this save some skb cloning, copying or adding new fields for private info?

skb cloning isn't very expensive when you need it. And they've got a little private area you can use for your own stuff while you have it queued (skb->cb).

As a historical note, one of the big changes during the Linux 2.0 and 2.1 TCP rewrite was that TCP was changed to always clone for the retransmit queue. This cleaned up the code greatly and fixed many problems. Cloning was also especially optimized for this. When TCP, which is one of the most performance-critical protocols around, can afford it, other code likely can too.

-Andi
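As an illustration of the private area mentioned above, here is a small userspace sketch in C of keeping per-packet state in a control buffer shaped like skb->cb. struct fake_skb is only a stand-in for the kernel's struct sk_buff (whose cb[] is 48 bytes), and struct mydrv_cb and its helpers are hypothetical names. Kernel code usually casts skb->cb straight to its own structure; the sketch copies with memcpy() instead so that it stays valid as ordinary portable C.

```c
#include <string.h>

/* Userspace stand-in for struct sk_buff: only the control buffer is modelled. */
struct fake_skb {
    _Alignas(8) char cb[48];
};

/* Hypothetical state a driver wants to keep while it has the skb queued. */
struct mydrv_cb {
    unsigned int seq;
    unsigned int retries;
    unsigned long enqueue_time;
};

/* The private state has to fit in the control buffer; fail the build if not. */
_Static_assert(sizeof(struct mydrv_cb) <= sizeof(((struct fake_skb *)0)->cb),
               "mydrv_cb does not fit in the control buffer");

/* Store the driver's state in the skb's control buffer. */
static inline void mydrv_cb_store(struct fake_skb *skb, const struct mydrv_cb *state)
{
    memcpy(skb->cb, state, sizeof(*state));
}

/* Read the driver's state back out of the control buffer. */
static inline struct mydrv_cb mydrv_cb_load(const struct fake_skb *skb)
{
    struct mydrv_cb state;

    memcpy(&state, skb->cb, sizeof(state));
    return state;
}
```

The area belongs to whichever layer currently holds the skb, so anything stored there is only meaningful while that layer has the packet queued; once the skb is handed on, the next layer may overwrite it.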
Re: Who's allowed to set a skb destructor?
> The destructor method is set and used for skbs originating from the RDMA driver sitting above cxgb3.

If these skbs never reach the normal sockets-based stack it might be ok.

-Andi
Re: Who's allowed to set a skb destructor?
On Thu, Jul 05, 2007 at 03:06:40PM +0200, Andi Kleen wrote:
> On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski wrote:
> > I wonder if it's very unsound to think about a one-way list of destructors. Of course, non-owners could only clean up their private allocations. Wouldn't this save some skb cloning, copying or adding new fields for private info?
>
> skb cloning isn't very expensive when you need it. And they've got a little private area you can use for your own stuff while you have it queued (skb->cb).

Not expensive in speed, but allocating the size of a whole skb when you need e.g. 2 or 3 integers looks a little expensive.

> As a historical note, one of the big changes during the Linux 2.0 and 2.1 TCP rewrite was that TCP was changed to always clone for the retransmit queue. This cleaned up the code greatly and fixed many problems. Cloning was also especially optimized for this. When TCP, which is one of the most performance-critical protocols around, can afford it, other code likely can too.

I've read opinions that the current skb structure is far from optimal. So it seems cloning wasn't enough in many situations, and fields were added. Of course, that's only part of the story: some other clients couldn't count on the structure being changed for them, so they probably did it another, more expensive way?

Jarek P.
Who's allowed to set a skb destructor?
Hi,

I am trying to understand whether I can set up a skb destructor in my code (which is basically a protocol above dev_queue_xmit() and co). From what I see in many parts of the current kernel code, the protocol (I mean, the one who actually creates the skb) may set up a destructor. However, I also see some places where some low-level drivers might be using a destructor too, without apparently checking whether an upper layer already uses one. For instance, write_ofld_wr() in cxgb3/sge.c.

I found some old threads about adding support for multiple destructors but I don't see anything like this in the current kernel. So, I'd like to have a clear statement about who's allowed to use a destructor :)

Thanks,
Brice
Re: Who's allowed to set a skb destructor?
On Wed, Jul 04, 2007 at 10:04:54AM +0200, Brice Goglin ([EMAIL PROTECTED]) wrote:
> So, I'd like to have a clear statement about who's allowed to use a destructor :)

Whoever allocates the skb: if it is the socket layer, it sets its own socket destructor; netlink has its own too, and so on.

> Thanks,
> Brice

--
	Evgeniy Polyakov
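The rule stated here, together with the earlier remark that the socket layer needs the destructor for its own accounting, can be sketched in userspace C as follows. This is a simplified model and not kernel code: struct fake_sock, struct fake_skb and the fake_*() functions are hypothetical stand-ins, loosely modelled on how the socket layer charges an skb's size to its socket and relies on the destructor to give it back.

```c
#include <stdlib.h>

/* Stand-in for a socket: only the send-memory accounting is modelled. */
struct fake_sock {
    unsigned int wmem_alloc;   /* bytes currently charged to this socket */
};

/* Stand-in for an skb with a single destructor slot. */
struct fake_skb {
    struct fake_sock *sk;
    unsigned int truesize;
    void (*destructor)(struct fake_skb *skb);
};

/* The allocating layer's destructor: return the charge to the socket. */
static void fake_sock_wfree(struct fake_skb *skb)
{
    skb->sk->wmem_alloc -= skb->truesize;
}

/* The socket layer allocates the skb, so it is the one that sets the destructor. */
struct fake_skb *fake_sock_alloc_skb(struct fake_sock *sk, unsigned int size)
{
    struct fake_skb *skb = calloc(1, sizeof(*skb));

    if (!skb)
        return NULL;
    skb->sk = sk;
    skb->truesize = size;
    skb->destructor = fake_sock_wfree;
    sk->wmem_alloc += size;
    return skb;
}

/* Freeing runs whatever destructor is installed at that point. */
void fake_kfree_skb(struct fake_skb *skb)
{
    if (!skb)
        return;
    if (skb->destructor)
        skb->destructor(skb);
    free(skb);
}
```

If a lower layer overwrote skb->destructor before the free, fake_sock_wfree() would never run and the socket would stay charged for memory that no longer exists. That is why a driver which did not allocate the skb cannot simply install its own destructor.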