Re: Who's allowed to set a skb destructor?

2007-07-06 Thread Jarek Poplawski
On Thu, Jul 05, 2007 at 03:06:40PM +0200, Andi Kleen wrote:
 On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski wrote:
  I wonder if it's very unsound to think about a one-way list
  of destructors. Of course, non-owners could only clean their
  private allocations. Wouldn't this save some skb cloning,
  copying or adding new fields for private info?
 
 skb cloning isn't very expensive when you need it. And they
 got a little private area you can use for your own stuff
 while you have it queued (skb->cb)
 
 As a historical note one of the big changes during the Linux 2.0
 and 2.1 TCP rewrite was that TCP was changed to always clone for the
 retransmit queue. This cleaned up the code greatly and fixed
 many problems. Cloning was also especially optimized for this. When TCP,
 which is about one of the most performance critical protocols around, can
 afford it, likely other code can too.

I've thought about this a bit more and, unless I'm missing something,
it is possible to use these things together. Let's imagine a
simplified api like this:

- a driver which needs a bit of space to track skbs in a few places
  registers itself with some function, giving the size, maybe a
  callback/destructor and maybe a protocol id; some index is returned;
- if this is the first one registered, the api allocates new space using
  skb cloning or some similar slab pool, to get blank space, and
  reserves space for this driver according to the index (internally
  mapped to some offset); from this moment on every new skb is
  automatically 'cloned' and the driver can read/write its place using
  the api to map the requests;
- drivers registered later use the same 'clone' unless there is no
  more space, in which case further 'clones' are generated;
- the lifetime of such 'clones' is controlled similarly to the 'real
  clones'; with the most basic version destructors could be avoided;
- some indexes could be made public constants to allow sharing.
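
The steps above could be sketched roughly as follows. This is only a
user-space mock under stated assumptions: priv_register(), priv_slot(),
struct mock_skb and the 64-byte area size are hypothetical names and
values made up for illustration, and only the index-to-offset mapping is
modelled (no callbacks, no further 'clone' when the space runs out).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Everything here is hypothetical; none of it exists in the kernel. */
#define PRIV_AREA_SIZE 64   /* size of one shared 'clone' area (assumed) */
#define PRIV_MAX_USERS 8

struct priv_user {
    size_t offset;          /* where this user's slot starts in the area */
    size_t size;            /* bytes reserved for this user */
};

static struct priv_user priv_users[PRIV_MAX_USERS];
static int priv_nr_users;
static size_t priv_next_offset;

/* Mock packet: only the shared private area is modelled. */
struct mock_skb {
    unsigned char priv_area[PRIV_AREA_SIZE];
};

/* Register a user needing 'size' bytes; returns an index, or -1 if full. */
static int priv_register(size_t size)
{
    if (priv_nr_users == PRIV_MAX_USERS ||
        size > PRIV_AREA_SIZE - priv_next_offset)
        return -1;          /* the proposal would start a new 'clone' here */
    priv_users[priv_nr_users].offset = priv_next_offset;
    priv_users[priv_nr_users].size = size;
    priv_next_offset += size;
    return priv_nr_users++;
}

/* Map an index to that user's slot inside a given packet. */
static void *priv_slot(struct mock_skb *skb, int idx)
{
    if (idx < 0 || idx >= priv_nr_users)
        return NULL;
    return skb->priv_area + priv_users[idx].offset;
}

/* Two users share one area without touching each other's slot. */
static int example_two_users(void)
{
    struct mock_skb skb;
    int a = priv_register(sizeof(int));
    int b = priv_register(2 * sizeof(int));
    int va = 7, vb[2] = { 1, 2 }, out = 0;

    assert(a >= 0 && b >= 0);
    memset(&skb, 0, sizeof(skb));
    /* memcpy avoids alignment assumptions about the byte area */
    memcpy(priv_slot(&skb, a), &va, sizeof(va));
    memcpy(priv_slot(&skb, b), vb, sizeof(vb));
    memcpy(&out, priv_slot(&skb, a), sizeof(out));
    return out;             /* user a's value is intact */
}
```

The index is the only thing a driver keeps; the offset stays internal to
the api, which is what would allow some indexes to be made public
constants.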

Is this wrong?

Regards,
Jarek P.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Who's allowed to set a skb destructor?

2007-07-06 Thread Jarek Poplawski
On Thu, Jul 05, 2007 at 04:28:47PM +0400, Evgeniy Polyakov wrote:
 Hi, Jarek.
 
 On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski ([EMAIL PROTECTED]) 
 wrote:
  I wonder if it's very unsound to think about a one-way list
  of destructors. Of course, non-owners could only clean their
  private allocations. Wouldn't this save some skb cloning,
  copying or adding new fields for private info?
 
 There should not be any additional allocations, since they are very
 slow. That part of mbuf is really horrible for performance: openbsd
 hackers removed the additional allocation of the mbuf tag in PF code
 during the last hackathon, which doubled its performance. That is why
 skb has only one control structure and data area, which incorporates
 additional control information; thus there is no need for multiple
 destructors.

I'd like to add a few words about performance-first thinking.

Some time ago I mainly read networking/admin lists. One of the
most frequent questions was: which should I choose, linux or bsd?
Very often bsd was praised for better performance, but linux was
almost always recommended as more universal (even by people who
said they used both).

BSDs were sometimes recommended for specific jobs like mail etc.,
but usually linux fitted the needs better. Linux looked especially
good as an internet gateway/router/firewall/antispam box,
and the main reasons were netfilter with additional, unofficial
patches, e.g. l-7 filtering and imq. BSD was not an option here.

Some time later, reading this list, I found that many people almost
hate netfilter for its performance. You can imagine how l-7 adds to
this performance. IMQ isn't even mentioned here; it looks like
some dirty word (the lack of programmers affects its quality and
doesn't help linux either). And it's nowhere near performant either.
I can also remember quite a lot of questions like: how can I avoid
tc/ip and do this with netfilter only?

Most of the readers/writers were probably admins of small or mid-sized
networks (though quite often servicing hundreds or thousands of
boxes), probably not always advanced enough, but, you know what,
they made up 99% of those interested. So I understand that something
may have changed with voip, and that there are high performance linux
servers too (whose admins have never heard of imq), and that thinking
about them could probably pay off better, but such thinking could
have some cost too.

Regards,
Jarek P.

PS: in my opinion, lack of linux performance wasn't even the second
most frequently asked question there; rather it was this: why does my
new, beautiful linux box sometimes lock up?


Re: Who's allowed to set a skb destructor?

2007-07-06 Thread Jarek Poplawski
On Fri, Jul 06, 2007 at 11:08:35AM +0200, Jarek Poplawski wrote:
...
 BSDs were sometimes recommended for specific jobs like mail etc.,
 but usually linux fitted the needs better. Linux looked especially
 good as an internet gateway/router/firewall/antispam box,
 and the main reasons were netfilter with additional, unofficial
 patches, e.g. l-7 filtering and imq. BSD was not an option here.

I forgot to mention two other performance boosters which
very often rounded out these solutions: htb or hfsc.

Jarek P.


Re: Who's allowed to set a skb destructor?

2007-07-05 Thread Andi Kleen
Brice Goglin [EMAIL PROTECTED] writes:

 I am trying to understand whether I can set up a skb destructor in my
 code (which is basically a protocol above dev_queue_xmit() and co). From
 what I see in many parts of the current kernel code, the protocol (I
 mean, the one who actually creates the skb) may set up a destructor.

The socket layer generally needs it for its own accounting.
Unless you never pass it up you can't use it.

 However, I also see some places where some low-level drivers might be
 using a destructor too, without apparently checking whether an upper
 layer already uses one. For instance, write_ofld_wr() in cxgb3/sge.c.

Likely a bug. Normally that should not slip past code review.
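
A minimal user-space sketch of why an unchecked overwrite would be a
bug, assuming the upper layer relies on its destructor for accounting.
All names here (struct mock_skb, owner_uncharge(),
driver_try_set_destructor()) are hypothetical mocks, not kernel code,
and the guard is only one possible way to illustrate the check.

```c
#include <assert.h>
#include <stddef.h>

/* Mock packet: one destructor slot, plus what the owner charged. */
struct mock_skb {
    int *charged;                       /* owner's accounting counter */
    int truesize;                       /* bytes charged for this skb */
    void (*destructor)(struct mock_skb *skb);
};

/* Upper layer's destructor: gives the charged bytes back. */
static void owner_uncharge(struct mock_skb *skb)
{
    *skb->charged -= skb->truesize;
}

/* Lower layer's destructor: only cares about its own state. */
static void driver_done(struct mock_skb *skb)
{
    (void)skb;
}

/* Hypothetical guard: refuse to replace a destructor that is already set. */
static int driver_try_set_destructor(struct mock_skb *skb)
{
    if (skb->destructor)
        return -1;
    skb->destructor = driver_done;
    return 0;
}

/* Returns the bytes still charged to the owner after the skb is freed. */
static int example_overwrite(int guarded)
{
    int charged = 100;
    struct mock_skb skb = { &charged, 100, owner_uncharge };

    if (guarded)
        (void)driver_try_set_destructor(&skb);  /* refused, owner kept */
    else
        skb.destructor = driver_done;           /* owner silently lost */

    if (skb.destructor)
        skb.destructor(&skb);                   /* what freeing would do */
    return charged;
}
```

With the blind overwrite the owner's 100 bytes are never returned; with
the guard the owner's destructor still runs and the counter drops to 0.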

 I found some old threads about adding support for multiple destructors but
 I don't see anything like this in the current kernel.
 
 So, I'd like to have a clear statement about who's allowed to use a
 destructor :)

The traditional standpoint was that having your own large skb pools
is not recommended because you won't interact well with the
rest of the system when it is running low on memory and you are
tying up memory.

Essentially you would recreate all the problems traditional Unix
systems have with fixed size mbuf pools. Linux always used a more
dynamic and flexible allocate-only-as-you-need approach even when it
can have a little more overhead in managing IOMMUs etc.

These days there are shrinker callbacks that would in theory
allow you to handle this, but it would likely still be hard to implement
correctly.

-Andi


Re: Who's allowed to set a skb destructor?

2007-07-05 Thread Divy Le Ray

Andi Kleen wrote:
 Brice Goglin [EMAIL PROTECTED] writes:
  I am trying to understand whether I can set up a skb destructor in my
  code (which is basically a protocol above dev_queue_xmit() and co). From
  what I see in many parts of the current kernel code, the protocol (I
  mean, the one who actually creates the skb) may set up a destructor.
 
 The socket layer generally needs it for its own accounting.
 Unless you never pass it up you can't use it.
 
  However, I also see some places where some low-level drivers might be
  using a destructor too, without apparently checking whether an upper
  layer already uses one. For instance, write_ofld_wr() in cxgb3/sge.c.
 
 Likely a bug. Normally that should not slip past code review.

Andi,

The destructor method is set and used for skbs originating from the RDMA
driver sitting above cxgb3.

The patch introducing this code was discussed at the time.
http://marc.info/?l=linux-netdev&m=117029329230969&w=2

Cheers,
Divy


Re: Who's allowed to set a skb destructor?

2007-07-05 Thread Jarek Poplawski
On 05-07-2007 12:08, Andi Kleen wrote:
...
 The traditional standpoint was that having your own large skb pools
 is not recommended because you won't interact well with the
 rest of the system when it is running low on memory and you are
 tying up memory.
 
 Essentially you would recreate all the problems traditional Unix
 systems have with fixed size mbuf pools. Linux always used a more
 dynamic and flexible allocate-only-as-you-need approach even when it
 can have a little more overhead in managing IOMMUs etc.

I wonder if it's very unsound to think about a one-way list
of destructors. Of course, non-owners could only clean their
private allocations. Wouldn't this save some skb cloning,
copying or adding new fields for private info?
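
To make the question concrete, here is a rough user-space sketch of
what such a one-way (singly linked) list could look like. It is only an
illustration: struct dtor_node, dtor_push() and mock_free() are
hypothetical names, the skb is a mock with nothing but the list head,
and nothing like this exists in the kernel.

```c
#include <assert.h>
#include <stdlib.h>

struct mock_skb;

/* One entry per interested party; each cleans only its own 'priv'. */
struct dtor_node {
    void (*fn)(struct mock_skb *skb, void *priv);
    void *priv;
    struct dtor_node *next;
};

/* Mock packet: only the head of the destructor list is modelled. */
struct mock_skb {
    struct dtor_node *dtors;
};

/* Add a destructor at the head of the list; returns 0 on success. */
static int dtor_push(struct mock_skb *skb,
                     void (*fn)(struct mock_skb *, void *), void *priv)
{
    struct dtor_node *n = malloc(sizeof(*n));

    if (!n)
        return -1;
    n->fn = fn;
    n->priv = priv;
    n->next = skb->dtors;
    skb->dtors = n;
    return 0;
}

/* On free, walk the list once and run every destructor. */
static void mock_free(struct mock_skb *skb)
{
    struct dtor_node *n = skb->dtors;

    while (n) {
        struct dtor_node *next = n->next;

        n->fn(skb, n->priv);
        free(n);
        n = next;
    }
    skb->dtors = NULL;
}

/* Test callback: counts how often it was called. */
static void count_call(struct mock_skb *skb, void *priv)
{
    (void)skb;
    ++*(int *)priv;
}

/* Owner and driver both register; both run exactly once on free. */
static int example_list(void)
{
    struct mock_skb skb = { NULL };
    int owner = 0, driver = 0;

    if (dtor_push(&skb, count_call, &owner) ||
        dtor_push(&skb, count_call, &driver)) {
        mock_free(&skb);
        return -1;
    }
    mock_free(&skb);
    return owner + driver;
}
```

Note that each registration in this sketch costs one extra allocation
per skb, so any saving would have to be weighed against that.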

Regards,
Jarek P.


Re: Who's allowed to set a skb destructor?

2007-07-05 Thread Evgeniy Polyakov
Hi, Jarek.

On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski ([EMAIL PROTECTED]) 
wrote:
 I wonder if it's very unsound to think about a one-way list
 of destructors. Of course, non-owners could only clean their
 private allocations. Wouldn't this save some skb cloning,
 copying or adding new fields for private info?

There should not be any additional allocations, since they are very
slow. That part of mbuf is really horrible for performance: openbsd
hackers removed the additional allocation of the mbuf tag in PF code
during the last hackathon, which doubled its performance. That is why
skb has only one control structure and data area, which incorporates
additional control information; thus there is no need for multiple
destructors.

 Regards,
 Jarek P.

-- 
Evgeniy Polyakov


Re: Who's allowed to set a skb destructor?

2007-07-05 Thread Jarek Poplawski
On Thu, Jul 05, 2007 at 04:28:47PM +0400, Evgeniy Polyakov wrote:
 Hi, Jarek.
 
 On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski ([EMAIL PROTECTED]) 
 wrote:
  I wonder if it's very unsound to think about a one-way list
  of destructors. Of course, non-owners could only clean their
  private allocations. Wouldn't this save some skb cloning,
  copying or adding new fields for private info?
 
 There should not be any additional allocations, since they are very
 slow. That part of mbuf is really horrible for performance: openbsd
 hackers removed the additional allocation of the mbuf tag in PF code
 during the last hackathon, which doubled its performance. That is why
 skb has only one control structure and data area, which incorporates
 additional control information; thus there is no need for multiple
 destructors.

Of course, my knowledge of this is far from sufficient, and maybe
I got this reversed, but from Andi's words I understood
that linux prefers another (mixed) approach, so I thought
such a list would follow from that...

Thanks,
Jarek P.


Re: Who's allowed to set a skb destructor?

2007-07-05 Thread Andi Kleen
On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski wrote:
 I wonder if it's very unsound to think about a one-way list
 of destructors. Of course, non-owners could only clean their
 private allocations. Wouldn't this save some skb cloning,
 copying or adding new fields for private info?

skb cloning isn't very expensive when you need it. And they
got a little private area you can use for your own stuff
while you have it queued (skb->cb)
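
As a rough illustration of the private area, here is a user-space mock
of the usual pattern of overlaying a small struct on cb[]. The mock
struct, the 48-byte size, struct my_cb and the MY_CB() macro are
assumptions made for this sketch, not definitions taken from the
kernel headers.

```c
#include <assert.h>
#include <string.h>

/* User-space stand-in for struct sk_buff; only cb[] is modelled.
 * The 48-byte size and 8-byte alignment are assumptions of this sketch. */
struct mock_skb {
    _Alignas(8) char cb[48];
};

/* Hypothetical per-packet state a layer keeps while it has the skb queued. */
struct my_cb {
    unsigned int seq;
    unsigned int retries;
};

/* The private struct must fit into the control buffer. */
_Static_assert(sizeof(struct my_cb) <= sizeof(((struct mock_skb *)0)->cb),
               "my_cb must fit in cb[]");

/* Accessor in the style layers commonly use for their cb overlay. */
#define MY_CB(skb) ((struct my_cb *)(skb)->cb)

/* Store and update private state without any extra allocation. */
static unsigned int example_cb(void)
{
    struct mock_skb skb;

    memset(&skb, 0, sizeof(skb));
    MY_CB(&skb)->seq = 42;
    MY_CB(&skb)->retries++;
    return MY_CB(&skb)->seq + MY_CB(&skb)->retries;
}
```

The contents are only meaningful to whoever currently has the skb
queued; the next layer is free to overwrite the area.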

As a historical note one of the big changes during the Linux 2.0
and 2.1 TCP rewrite was that TCP was changed to always clone for the
retransmit queue. This cleaned up the code greatly and fixed
many problems. Cloning was also especially optimized for this. When TCP,
which is about one of the most performance critical protocols around, can
afford it, likely other code can too.
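
The idea of cloning for the retransmit queue can be sketched in user
space as below: a clone is a new header that shares the same reference
counted data, so the queued original survives after the transmitted
clone is freed. This is a simplified mock with hypothetical names
(mock_alloc(), mock_clone(), mock_free()), not the kernel
implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Shared packet data with a reference count. */
struct mock_data {
    int refcnt;
    char payload[64];
};

/* Mock header: a clone gets its own header but shares 'data'. */
struct mock_skb {
    struct mock_data *data;
};

static struct mock_skb *mock_alloc(void)
{
    struct mock_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->data = calloc(1, sizeof(*skb->data));
    if (!skb->data) {
        free(skb);
        return NULL;
    }
    skb->data->refcnt = 1;
    return skb;
}

/* Clone: allocate only a header, take another reference on the data. */
static struct mock_skb *mock_clone(const struct mock_skb *skb)
{
    struct mock_skb *c = malloc(sizeof(*c));

    if (!c)
        return NULL;
    c->data = skb->data;
    c->data->refcnt++;
    return c;
}

/* Free a header; returns 1 if this also released the shared data. */
static int mock_free(struct mock_skb *skb)
{
    int last = (--skb->data->refcnt == 0);

    if (last)
        free(skb->data);
    free(skb);
    return last;
}

/* Keep the original queued, send a clone, free the original on "ACK". */
static int example_retransmit(void)
{
    struct mock_skb *queued = mock_alloc();
    struct mock_skb *sent;
    int freed_early, freed_late;

    if (!queued)
        return -1;
    sent = mock_clone(queued);
    if (!sent) {
        mock_free(queued);
        return -1;
    }
    freed_early = mock_free(sent);    /* driver is done: data must stay */
    freed_late = mock_free(queued);   /* acknowledged: data goes away */
    return freed_early * 10 + freed_late;
}
```

The payload is never copied; only the small header is duplicated.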

-Andi


Re: Who's allowed to set a skb destructor?

2007-07-05 Thread Andi Kleen
 The destructor method is set and used for skbs originating from the RDMA 
 driver sitting above cxgb3.

If these skbs never reach the normal sockets-based stack it might be ok.

-Andi


Re: Who's allowed to set a skb destructor?

2007-07-05 Thread Jarek Poplawski
On Thu, Jul 05, 2007 at 03:06:40PM +0200, Andi Kleen wrote:
 On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski wrote:
  I wonder if it's very unsound to think about a one-way list
  of destructors. Of course, non-owners could only clean their
  private allocations. Wouldn't this save some skb cloning,
  copying or adding new fields for private info?
 
 skb cloning isn't very expensive when you need it. And they
 got a little private area you can use for your own stuff
 while you have it queued (skb->cb)

Not expensive in speed, but allocating sizeof(skb) when
you need e.g. 2 or 3 integers looks a little expensive.

 
 As a historical note one of the big changes during the Linux 2.0
 and 2.1 TCP rewrite was that TCP was changed to always clone for the
 retransmit queue. This cleaned up the code greatly and fixed
 many problems. Cloning was also especially optimized for this. When TCP,
 which is about one of the most performance critical protocols around, can
 afford it, likely other code can too.

I've read opinions that the current skb structure is far from
optimal. So it seems cloning wasn't enough in many situations,
and fields were added. Of course, that's only part of the story:
some other clients couldn't count on the structure being changed
for them, so they probably did it another, more expensive way?
 
Jarek P.


Who's allowed to set a skb destructor?

2007-07-04 Thread Brice Goglin
Hi,

I am trying to understand whether I can set up a skb destructor in my
code (which is basically a protocol above dev_queue_xmit() and co). From
what I see in many parts of the current kernel code, the protocol (I
mean, the one who actually creates the skb) may set up a destructor.

However, I also see some places where some low-level drivers might be
using a destructor too, without apparently checking whether an upper
layer already uses one. For instance, write_ofld_wr() in cxgb3/sge.c. I
found some old threads about adding support for multiple destructors but
I don't see anything like this in the current kernel.

So, I'd like to have a clear statement about who's allowed to use a
destructor :)

Thanks,
Brice



Re: Who's allowed to set a skb destructor?

2007-07-04 Thread Evgeniy Polyakov
On Wed, Jul 04, 2007 at 10:04:54AM +0200, Brice Goglin ([EMAIL PROTECTED]) 
wrote:
 So, I'd like to have a clear statement about who's allowed to use a
 destructor :)

Whoever allocates the skb: if it is the socket layer, it sets its own
socket destructor; netlink has its own too, and so on.
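
A small user-space sketch of that rule, loosely modelled on how the
socket layer charges an skb to its socket and uncharges it from the
destructor. The mock_* names and the simplified structs are assumptions
made for illustration; the real kernel code uses atomics and many more
fields.

```c
#include <assert.h>
#include <stddef.h>

/* Mock socket: only the write-memory accounting counter is modelled. */
struct mock_sock {
    int wmem_alloc;
};

/* Mock packet: owner, charged size and the single destructor slot. */
struct mock_skb {
    struct mock_sock *sk;
    int truesize;
    void (*destructor)(struct mock_skb *skb);
};

/* Destructor installed by the allocating layer: undo the charge. */
static void mock_sock_wfree(struct mock_skb *skb)
{
    skb->sk->wmem_alloc -= skb->truesize;
}

/* The allocator takes ownership: set owner, destructor, and charge. */
static void mock_set_owner_w(struct mock_skb *skb, struct mock_sock *sk)
{
    skb->sk = sk;
    skb->destructor = mock_sock_wfree;
    sk->wmem_alloc += skb->truesize;
}

/* Freeing the skb runs the destructor, if one was set. */
static void mock_kfree_skb(struct mock_skb *skb)
{
    if (skb->destructor)
        skb->destructor(skb);
}

/* Returns how many bytes were given back to the socket on free. */
static int example_accounting(void)
{
    struct mock_sock sk = { 0 };
    struct mock_skb skb = { NULL, 256, NULL };
    int charged;

    mock_set_owner_w(&skb, &sk);
    charged = sk.wmem_alloc;          /* charged while in flight */
    mock_kfree_skb(&skb);
    return charged - sk.wmem_alloc;   /* difference is the uncharge */
}
```

Because there is only one destructor slot, the layer that allocated the
skb and set it is the one whose accounting depends on it staying there.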

 Thanks,
 Brice

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html