Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-09 Thread David Miller
From: Hans Westgaard Ry 
Date: Wed,  3 Feb 2016 09:26:57 +0100

> Devices may have limits on the number of fragments in an skb they support.
> Current codebase uses a constant as maximum for number of fragments one
> skb can hold and use.
> When enabling scatter/gather and running traffic with many small messages
> the codebase uses the maximum number of fragments and may thereby violate
> the max for certain devices.
> The patch introduces a global variable as max number of fragments.
> 
> Signed-off-by: Hans Westgaard Ry 
> Reviewed-by: Håkon Bugge 

I know some people don't like this patch, but no better solution exists
at this time.

Like others, I would personally rather this be a per-device attribute,
but that currently would not work at all.

The device that TCP and other elements see when they build packets is
not necessarily the one that is going to send the frame.  Encapsulation
and other structures hide the truly transmitting device.

And we lack a foolproof way to propagate attributes like this through
the stack of devices up to the top.

So for now this is what we have to use, as unfortunate as it may be.

If someone is suitably angry about this state of affairs, I encourage
them to direct that energy at a better long term solution :-)

Applied and queued up for -stable, thanks.


Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Herbert Xu
On Wed, Feb 03, 2016 at 09:26:57AM +0100, Hans Westgaard Ry wrote:
> Devices may have limits on the number of fragments in an skb they support.
> Current codebase uses a constant as maximum for number of fragments one
> skb can hold and use.
> When enabling scatter/gather and running traffic with many small messages
> the codebase uses the maximum number of fragments and may thereby violate
> the max for certain devices.
> The patch introduces a global variable as max number of fragments.
> 
> Signed-off-by: Hans Westgaard Ry 
> Reviewed-by: Håkon Bugge 

I have to say this seems rather dirty.  I mean if taken to the
extreme wouldn't this mean that we should disable frags altogether
if some NIC can't handle them at all?

Someone suggested earlier to partially linearise the skb, why
couldn't we do that? IOW let's handle this craziness in the crazy
drivers and not in the general stack.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Hannes Frederic Sowa

On 03.02.2016 12:25, Herbert Xu wrote:
> On Wed, Feb 03, 2016 at 09:26:57AM +0100, Hans Westgaard Ry wrote:
>> Devices may have limits on the number of fragments in an skb they support.
>> Current codebase uses a constant as maximum for number of fragments one
>> skb can hold and use.
>> When enabling scatter/gather and running traffic with many small messages
>> the codebase uses the maximum number of fragments and may thereby violate
>> the max for certain devices.
>> The patch introduces a global variable as max number of fragments.
>>
>> Signed-off-by: Hans Westgaard Ry
>> Reviewed-by: Håkon Bugge
>
> I have to say this seems rather dirty.  I mean if taken to the
> extreme wouldn't this mean that we should disable frags altogether
> if some NIC can't handle them at all?
>
> Someone suggested earlier to partially linearise the skb, why
> couldn't we do that? IOW let's handle this craziness in the crazy
> drivers and not in the general stack.

Agreed that it feels like a hack, but a rather simple one. I would
consider this to be just a performance improvement. We certainly need
a slow-path when virtio drivers submit gso packets to the stack (and
already discussed with Hans). The sysctl can't help here. But without
the sysctl the packets would constantly hit the slow-path in case of
e.g. IPoIB and that would also be rather bad.

Thanks,
Hannes


[PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Hans Westgaard Ry
Devices may have limits on the number of fragments in an skb they support.
Current codebase uses a constant as maximum for number of fragments one
skb can hold and use.
When enabling scatter/gather and running traffic with many small messages
the codebase uses the maximum number of fragments and may thereby violate
the max for certain devices.
The patch introduces a global variable as max number of fragments.

Signed-off-by: Hans Westgaard Ry 
Reviewed-by: Håkon Bugge 

---
 include/linux/skbuff.h     |  1 +
 net/core/skbuff.c          |  2 ++
 net/core/sysctl_net_core.c | 10 ++++++++++
 net/ipv4/tcp.c             |  4 ++--
 4 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4355129..fe47ad3 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -219,6 +219,7 @@ struct sk_buff;
 #else
 #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 1)
 #endif
+extern int sysctl_max_skb_frags;
 
 typedef struct skb_frag_struct skb_frag_t;
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 152b9c7..c336b97 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -79,6 +79,8 @@
 
 struct kmem_cache *skbuff_head_cache __read_mostly;
 static struct kmem_cache *skbuff_fclone_cache __read_mostly;
+int sysctl_max_skb_frags __read_mostly = MAX_SKB_FRAGS;
+EXPORT_SYMBOL(sysctl_max_skb_frags);
 
 /**
  * skb_panic - private function for out-of-line support
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 95b6139..a6beb7b 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -26,6 +26,7 @@ static int zero = 0;
 static int one = 1;
 static int min_sndbuf = SOCK_MIN_SNDBUF;
 static int min_rcvbuf = SOCK_MIN_RCVBUF;
+static int max_skb_frags = MAX_SKB_FRAGS;
 
 static int net_msg_warn;   /* Unused, but still a sysctl */
 
@@ -392,6 +393,15 @@ static struct ctl_table net_core_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec
 	},
+	{
+		.procname	= "max_skb_frags",
+		.data		= &sysctl_max_skb_frags,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &one,
+		.extra2		= &max_skb_frags,
+	},
 	{ }
 };
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index c82cca1..3dc7a2fd 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -938,7 +938,7 @@ new_segment:
 
 		i = skb_shinfo(skb)->nr_frags;
 		can_coalesce = skb_can_coalesce(skb, i, page, offset);
-		if (!can_coalesce && i >= MAX_SKB_FRAGS) {
+		if (!can_coalesce && i >= sysctl_max_skb_frags) {
 			tcp_mark_push(tp, skb);
 			goto new_segment;
 		}
@@ -1211,7 +1211,7 @@ new_segment:
 
 				if (!skb_can_coalesce(skb, i, pfrag->page,
 						      pfrag->offset)) {
-					if (i == MAX_SKB_FRAGS || !sg) {
+					if (i == sysctl_max_skb_frags || !sg) {
 						tcp_mark_push(tp, skb);
 						goto new_segment;
 					}
-- 
2.4.3
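
The range check that the new table entry asks proc_dointvec_minmax to
enforce can be sketched in userspace as below. This is only a model,
assuming 4 KiB pages: every model_* / MODEL_* name is hypothetical and
none of this is kernel code. It shows that a write outside
[1, MAX_SKB_FRAGS] is refused with -EINVAL and leaves the old value
in place.

```c
#include <assert.h>
#include <errno.h>

/* Assumption: 4 KiB pages, so the non-overridden MAX_SKB_FRAGS formula applies. */
#define MODEL_PAGE_SIZE     4096
#define MODEL_MAX_SKB_FRAGS (65536 / MODEL_PAGE_SIZE + 1)   /* 17 with 4 KiB pages */

/* Stand-in for sysctl_max_skb_frags; defaults to the compile-time maximum. */
static int model_max_skb_frags = MODEL_MAX_SKB_FRAGS;

/*
 * Model of a write to the sysctl: accept only values in
 * [1, MODEL_MAX_SKB_FRAGS], mirroring .extra1 = &one and
 * .extra2 = &max_skb_frags in the table entry.
 */
int model_set_max_skb_frags(int val)
{
	if (val < 1 || val > MODEL_MAX_SKB_FRAGS)
		return -EINVAL;		/* out of range: keep the old value */
	model_max_skb_frags = val;
	return 0;
}

/* Model of a read; the setter keeps the value inside the allowed range. */
int model_get_max_skb_frags(void)
{
	assert(model_max_skb_frags >= 1 &&
	       model_max_skb_frags <= MODEL_MAX_SKB_FRAGS);
	return model_max_skb_frags;
}
```

With this model, setting 13 succeeds, while 0 and 18 are both rejected
and the stored limit stays at 13.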



Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Alexander Duyck
On Wed, Feb 3, 2016 at 11:23 AM, Eric Dumazet  wrote:
> On Wed, 2016-02-03 at 10:24 -0800, Alexander Duyck wrote:
>
>> If this is only meant to be a performance modification and is only
>> really targeted at TCP TSO/GRO then all I ask is that we use a name
>> like tcp_max_gso_frags and relocate the sysctl to the TCP section.
>> Otherwise if we are actually going to try to scope this out on a wider
>> level and limit all frags which is what the name implies then the
>> patch set needs to make a better attempt at covering all cases where
>> it may apply.
>
>
> This is the goal.
>
> Other skb providers (like tun and af_packet) will also use this optional
> limit.
>
> I fail to see why Hans should send a complete patch series.

You realize that conflicts with what anybody else would be told.  What
was provided in this patch is a half solution, and it may cause bigger
messes since it is unclear exactly how this sysctl is meant to be
used.

> We will send followup patches, as we always did.
>
> I will send the GRO change for example.
>
> So please keep a sysctl name _without_ TCP in it, it really has nothing
> to do with TCP.

In the end I am not the one you have to convince.  I have simply
stated my opinion, and I guess we will have to agree to disagree.  It
is entirely up to Dave if he wants to apply it or not.  I have slides
I need to work on for next week.. :-)

- Alex


Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Eric Dumazet
On Wed, 2016-02-03 at 20:20 +0800, Herbert Xu wrote:
> On Wed, Feb 03, 2016 at 12:36:21PM +0100, Hannes Frederic Sowa wrote:
> >
> > Agreed that it feels like a hack, but a rather simple one. I would
> > consider this to be just a performance improvement. We certainly need
> > a slow-path when virtio drivers submit gso packets to the stack (and
> > already discussed with Hans). The sysctl can't help here. But without
> > the sysctl the packets would constantly hit the slow-path in case of
> > e.g. IPoIB and that would also be rather bad.
> 
> So you want to penalise every NIC in the system if just one of
> them is broken? This is insane.  Just do the partial linearisation
> in that one driver that needs it and not only won't you have to
> penalise anyone else but you still get the best result for that
> driver that needs it.

No penalization :

- default is the optimal value

- TCP stack tends to build skb with 32KB frags anyway. It is very rare
to actually get to 17 frags per skb (pathological sendpage() with tiny
parts, or tiny write() on many sockets from one thread). 

> 
> Besides, you have to implement the linearisation anyway because
> of virtualisation.

Sure.

We use a similar patch here at Google, since bnx2x has in some cases a
limit of 13 frags per skb. This driver calls linearize, which can fail
under memory fragmentation. TCP usually retransmits, so the only effect
of failures is extra latency.

I am actually okay with this patch.

Acked-by: Eric Dumazet 
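
The pathological case mentioned above (many tiny pieces that cannot be
coalesced) can be modelled roughly as follows. This is a userspace
sketch, not the TCP code: it assumes every chunk lands in its own
fragment, and model_count_skbs() is a hypothetical name. It mirrors the
"goto new_segment" decision from the patch and counts how many skbs a
given fragment limit produces.

```c
#include <assert.h>

/*
 * Count the skbs needed for n_chunks tiny chunks when each chunk takes
 * one fragment (no coalescing) and an skb may hold at most frag_limit
 * fragments.
 */
int model_count_skbs(int n_chunks, int frag_limit)
{
	int skbs = 0;
	int nr_frags = 0;	/* fragments in the skb currently being built */

	assert(frag_limit >= 1);

	for (int c = 0; c < n_chunks; c++) {
		if (skbs == 0 || nr_frags >= frag_limit) {
			/* like "goto new_segment": start a fresh skb */
			skbs++;
			nr_frags = 0;
		}
		nr_frags++;
	}
	return skbs;
}
```

Under these assumptions, 34 tiny chunks fit in 2 skbs with the default
limit of 17 and need 3 skbs with a limit of 13, so a lower limit costs
extra skbs only in this corner case.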






Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Hannes Frederic Sowa

On 03.02.2016 13:20, Herbert Xu wrote:
> On Wed, Feb 03, 2016 at 12:36:21PM +0100, Hannes Frederic Sowa wrote:
>>
>> Agreed that it feels like a hack, but a rather simple one. I would
>> consider this to be just a performance improvement. We certainly need
>> a slow-path when virtio drivers submit gso packets to the stack (and
>> already discussed with Hans). The sysctl can't help here. But without
>> the sysctl the packets would constantly hit the slow-path in case of
>> e.g. IPoIB and that would also be rather bad.
>
> So you want to penalise every NIC in the system if just one of
> them is broken? This is insane.  Just do the partial linearisation
> in that one driver that needs it and not only won't you have to
> penalise anyone else but you still get the best result for that
> driver that needs it.

Most normal Ethernet systems and drivers currently don't need to tweak
this knob at all, only some special kinds of installations. This patch
referred to IPoIB as a possible user, where drivers/firmware/cards seem
to have this problem. Current behavior just leaves everything as-is.

If you use IPoIB you probably use it quite regularly, and *always*
linearizing skbs seems to be much more work than simply capping the
number of frags globally.

> Besides, you have to implement the linearisation anyway because
> of virtualisation.

Yes, the slow-path is necessary. But instead of writing a new
complicated linearizing function to just reduce the fragments we could
also simply linearize it completely and ask the admin to also tune the
VM guests.

I only see this kind of tuning in very specific environments where the
admins know what they are doing.


Bye,
Hannes



Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Alexander Duyck
On Wed, Feb 3, 2016 at 12:26 AM, Hans Westgaard Ry
 wrote:
> Devices may have limits on the number of fragments in an skb they support.
> Current codebase uses a constant as maximum for number of fragments one
> skb can hold and use.
> When enabling scatter/gather and running traffic with many small messages
> the codebase uses the maximum number of fragments and may thereby violate
> the max for certain devices.
> The patch introduces a global variable as max number of fragments.
>
> Signed-off-by: Hans Westgaard Ry 
> Reviewed-by: Håkon Bugge 
>
> ---
>  include/linux/skbuff.h     |  1 +
>  net/core/skbuff.c          |  2 ++
>  net/core/sysctl_net_core.c | 10 ++++++++++
>  net/ipv4/tcp.c             |  4 ++--
>  4 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 4355129..fe47ad3 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -219,6 +219,7 @@ struct sk_buff;
>  #else
>  #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 1)
>  #endif
> +extern int sysctl_max_skb_frags;
>
>  typedef struct skb_frag_struct skb_frag_t;
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 152b9c7..c336b97 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -79,6 +79,8 @@
>
>  struct kmem_cache *skbuff_head_cache __read_mostly;
>  static struct kmem_cache *skbuff_fclone_cache __read_mostly;
> +int sysctl_max_skb_frags __read_mostly = MAX_SKB_FRAGS;
> +EXPORT_SYMBOL(sysctl_max_skb_frags);
>
>  /**
>   * skb_panic - private function for out-of-line support
> diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
> index 95b6139..a6beb7b 100644
> --- a/net/core/sysctl_net_core.c
> +++ b/net/core/sysctl_net_core.c

I really don't think these changes belong in the core. Below you only
modify the TCP code path so this more likely belongs in the TCP path
unless you are going to guarantee that all other code paths obey the
sysctl.  It probably belongs in net/ipv4/sysctl_net_ipv4.c

> @@ -26,6 +26,7 @@ static int zero = 0;
>  static int one = 1;
>  static int min_sndbuf = SOCK_MIN_SNDBUF;
>  static int min_rcvbuf = SOCK_MIN_RCVBUF;
> +static int max_skb_frags = MAX_SKB_FRAGS;
>
>  static int net_msg_warn;   /* Unused, but still a sysctl */
>
> @@ -392,6 +393,15 @@ static struct ctl_table net_core_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec
>  	},
> +	{
> +		.procname	= "max_skb_frags",
> +		.data		= &sysctl_max_skb_frags,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= &one,
> +		.extra2		= &max_skb_frags,
> +	},
>  	{ }
>  };

I'm not really a fan of this name either.  Maybe it should be
something like tcp_max_gso_frags.

> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index c82cca1..3dc7a2fd 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -938,7 +938,7 @@ new_segment:
>
>  		i = skb_shinfo(skb)->nr_frags;
>  		can_coalesce = skb_can_coalesce(skb, i, page, offset);
> -		if (!can_coalesce && i >= MAX_SKB_FRAGS) {
> +		if (!can_coalesce && i >= sysctl_max_skb_frags) {
>  			tcp_mark_push(tp, skb);
>  			goto new_segment;
>  		}
> @@ -1211,7 +1211,7 @@ new_segment:
>
>  				if (!skb_can_coalesce(skb, i, pfrag->page,
>  						      pfrag->offset)) {
> -					if (i == MAX_SKB_FRAGS || !sg) {
> +					if (i == sysctl_max_skb_frags || !sg) {
>  						tcp_mark_push(tp, skb);
>  						goto new_segment;
>  					}

This bit looks good.

I was wondering.  Have you considered looking at something like what
was done with gso_max_size?  It seems like it is meant to address a
problem similar to what you have described where the NICs only support
a certain layout for the GSO frame.  Though now that I look over the
code it seems like it might be flawed in that I don't see bridges or
tunnels really respecting the value so it seems like they could cause
issues.

- Alex


Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Herbert Xu
On Wed, Feb 03, 2016 at 12:36:21PM +0100, Hannes Frederic Sowa wrote:
>
> Agreed that it feels like a hack, but a rather simple one. I would
> consider this to be just a performance improvement. We certainly need
> a slow-path when virtio drivers submit gso packets to the stack (and
> already discussed with Hans). The sysctl can't help here. But without
> the sysctl the packets would constantly hit the slow-path in case of
> e.g. IPoIB and that would also be rather bad.

So you want to penalise every NIC in the system if just one of
them is broken? This is insane.  Just do the partial linearisation
in that one driver that needs it and not only won't you have to
penalise anyone else but you still get the best result for that
driver that needs it.

Besides, you have to implement the linearisation anyway because
of virtualisation.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Eric Dumazet
On Wed, 2016-02-03 at 09:43 -0800, Alexander Duyck wrote:

> Read the history.  I still say it is best if we don't accept a partial
> solution.  If we are going to introduce the sysctl as a core item it
> should function as a core item and not as something that belongs to
> TCP only.


But this patch is the base, adding both the core sysctl and its first
usage.

Do we really need to split it in 2 patches ? Really ?

The goal is to use it in all skb providers where it might be a
performance gain, once they are identified.

Your points were already raised and will be addressed, by either me or
you. And maybe others.





Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Eric Dumazet
On Wed, 2016-02-03 at 07:58 -0800, Alexander Duyck wrote:
> > +++ b/net/core/sysctl_net_core.c
> 
> I really don't think these changes belong in the core. Below you only
> modify the TCP code path so this more likely belongs in the TCP path
> unless you are going to guarantee that all other code paths obey the
> sysctl.  It probably belongs in net/ipv4/sysctl_net_ipv4.c


Alexander, this is a v3.

We rejected prior attempts doing exactly what you suggest.

Think about GRO : These people also need to use the same sysctl in GRO
to limit number of frags.

Limiting the stuff at the egress is useless in forwarding setups.
It will be too late as they'll need to linearize -> huge performance
drop.

This is why we wanted a global setup so that these guys can tweak the
default limit.

Please read netdev history about this stuff.

Plan of action :

1) This patch, adding a core sysctl.
2) Use it in TCP (already done in this patch)
3) Use it in GRO




Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Alexander Duyck
On Wed, Feb 3, 2016 at 8:07 AM, Eric Dumazet  wrote:
> On Wed, 2016-02-03 at 07:58 -0800, Alexander Duyck wrote:
>> > +++ b/net/core/sysctl_net_core.c
>>
>> I really don't think these changes belong in the core. Below you only
>> modify the TCP code path so this more likely belongs in the TCP path
>> unless you are going to guarantee that all other code paths obey the
>> sysctl.  It probably belongs in net/ipv4/sysctl_net_ipv4.c
>
>
> Alexander, this is a v3.

Well I guess that means that a v4 might be needed.  I get that others
have reviewed it but obviously their opinions differed from mine as I
have a few objections to parts of this patch.

> We rejected prior attempts doing exactly what you suggest.

Okay so it sounds like there are some other opinions on this then that
I am not aware of.

> Think about GRO : These people also need to use the same sysctl in GRO
> to limit number of frags.

Okay, well without the GRO changes this patch set is incomplete then.

> Limiting the stuff at the egress is useless in forwarding setups.
> It will be too late as they'll need to linearize -> huge performance
> drop.
>
> This is why we wanted a global setup so that these guys can tweak the
> default limit.
>
> Please read netdev history about this stuff.

Read the history.  I still say it is best if we don't accept a partial
solution.  If we are going to introduce the sysctl as a core item it
should function as a core item and not as something that belongs to
TCP only.

Also I wasn't saying to go the gso_max_size route.  As I commented I
think that probably needs to be fixed as well.  Maybe turned into a
sysctl as is being proposed here since I have found scenarios such as
tunnels where the gso_max_size may not be observed.

> Plan of action :
>
> 1) This patch, adding a core sysctl.
> 2) Use it in TCP (already done in this patch)
> 3) Use it in GRO

What you are talking about are TCP offloads, one on the transmit side
and one on the receive side.  The name max_skb_frags implies that this
value is going to cover ALL users of fragments and it doesn't.

If you are going to try and pass this off as a core change, how about
covering other cases such as __ip_append_data(), skb_append_datato_frags()
and the rest of the functions out there that will totally ignore this
current change and still put together a frame with MAX_SKB_FRAGS
instead of the sysctl value?

In addition it makes sense to have things set up so that you have both
the sysctl and the device value.  Then if someone wants to they can
leave the value set large and just let the one NIC sit there and
linearize frames because NETIF_F_SG gets cleared in netif_skb_features
if the number of frags used exceeds the value for max_frags reported
in the netdev.

- Alex
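
The per-device fallback suggested above could look roughly like the
sketch below. It is a userspace model under loud assumptions: the
max_frags field, struct model_netdev, the MODEL_F_* flags and
model_skb_features() are all hypothetical stand-ins, not existing
kernel members or the real netif_skb_features(). It only shows the
idea of dropping scatter/gather for a frame whose fragment count
exceeds what the device reports, so that this one frame gets linearized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bits standing in for NETIF_F_SG and one other flag. */
#define MODEL_F_SG   0x1u
#define MODEL_F_CSUM 0x2u

/* Hypothetical device model; max_frags is the proposed per-device limit. */
struct model_netdev {
	unsigned int features;
	int max_frags;
};

/*
 * Return the feature set usable for a frame with nr_frags fragments.
 * If the frame has more fragments than the device supports, clear the
 * scatter/gather bit so the caller would linearize the frame.
 */
unsigned int model_skb_features(const struct model_netdev *dev, int nr_frags)
{
	unsigned int features;

	assert(dev != NULL);
	features = dev->features;
	if (nr_frags > dev->max_frags)
		features &= ~MODEL_F_SG;	/* only SG is dropped */
	return features;
}
```

With a device that reports max_frags = 13, a 13-fragment frame keeps
scatter/gather while a 14-fragment frame loses it; the other feature
bit is untouched in both cases.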


RE: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread David Laight
From: Herbert Xu
> Sent: 03 February 2016 12:21
> On Wed, Feb 03, 2016 at 12:36:21PM +0100, Hannes Frederic Sowa wrote:
> >
> > Agreed that it feels like a hack, but a rather simple one. I would
> > consider this to be just a performance improvement. We certainly need
> > a slow-path when virtio drivers submit gso packets to the stack (and
> > already discussed with Hans). The sysctl can't help here. But without
> > the sysctl the packets would constantly hit the slow-path in case of
> > e.g. IPoIB and that would also be rather bad.
> 
> So you want to penalise every NIC in the system if just one of
> them is broken? This is insane.  Just do the partial linearisation
> in that one driver that needs it and not only won't you have to
> penalise anyone else but you still get the best result for that
> driver that needs it.
> 
> Besides, you have to implement the linearisation anyway because
> of virtualisation.

And if a MAC driver needs to linearize a tx frame it might as well
copy it into a separately allocated tx buffer area.
Indeed it can copy fragments until the number left is less than the
fragment limit.

David
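
The copy-until-it-fits idea above can be sketched in userspace like
this. It is a model, not driver code: struct model_frag, the bounce
buffer and both function names are hypothetical. Leading fragments are
copied into one separately allocated buffer until the number left is
less than the limit, so the copied buffer plus the remaining fragments
fit within the limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fragment descriptor: a pointer and a length. */
struct model_frag {
	const unsigned char *data;
	size_t len;
};

/* How many leading fragments must be copied so that fewer than limit remain. */
int model_frags_to_copy(int nr_frags, int limit)
{
	assert(limit >= 1 && nr_frags >= 0);
	if (nr_frags <= limit)
		return 0;			/* frame already fits */
	return nr_frags - limit + 1;		/* copied buffer takes one slot */
}

/*
 * Copy the leading fragments into bounce. Returns the number of
 * fragments copied, or -1 if bounce is too small. *copied_len receives
 * the number of bytes written on success and 0 otherwise.
 */
int model_partial_linearize(const struct model_frag *frags, int nr_frags,
			    int limit, unsigned char *bounce,
			    size_t bounce_size, size_t *copied_len)
{
	int to_copy = model_frags_to_copy(nr_frags, limit);
	size_t off = 0;

	*copied_len = 0;
	for (int k = 0; k < to_copy; k++) {
		if (frags[k].len > bounce_size - off)
			return -1;		/* bounce buffer exhausted */
		memcpy(bounce + off, frags[k].data, frags[k].len);
		off += frags[k].len;
	}
	*copied_len = off;
	return to_copy;
}
```

For example, a 17-fragment frame on a device limited to 13 would have
its first 5 fragments copied, leaving 12 fragments plus the copied
buffer.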



Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Eric Dumazet
On Wed, 2016-02-03 at 10:24 -0800, Alexander Duyck wrote:

> If this is only meant to be a performance modification and is only
> really targeted at TCP TSO/GRO then all I ask is that we use a name
> like tcp_max_gso_frags and relocate the sysctl to the TCP section.
> Otherwise if we are actually going to try to scope this out on a wider
> level and limit all frags which is what the name implies then the
> patch set needs to make a better attempt at covering all cases where
> it may apply.


This is the goal.

Other skb providers (like tun and af_packet) will also use this optional
limit.

I fail to see why Hans should send a complete patch series.

We will send followup patches, as we always did.

I will send the GRO change for example.

So please keep a sysctl name _without_ TCP in it, it really has nothing
to do with TCP.






Re: [PATCH v3] net:Add sysctl_max_skb_frags

2016-02-03 Thread Alexander Duyck
On Wed, Feb 3, 2016 at 9:54 AM, Eric Dumazet  wrote:
> On Wed, 2016-02-03 at 09:43 -0800, Alexander Duyck wrote:
>
>> Read the history.  I still say it is best if we don't accept a partial
>> solution.  If we are going to introduce the sysctl as a core item it
>> should function as a core item and not as something that belongs to
>> TCP only.
>
>
> But this patch is the base, adding both the core sysctl and its first
> usage.
>
> Do we really need to split it in 2 patches ? Really ?
>
> The goal is to use it in all skb providers where it might be a
> performance gain, once they are identified.

That is what I thought.  So why are we trying to sell this as a core
change then.  All I am asking for is the sysctl to be moved and
renamed since based on all of your descriptions this clearly only
impacts TCP.

> Your points were already raised and will be addressed, by either me or
> you. And maybe others.

Please don't sign me up for work I didn't volunteer for.  I already
have enough broken code to try and fix.  I'm pretty sure I need to go
in and fix the gso_max_size code for starters.

If this is only meant to be a performance modification and is only
really targeted at TCP TSO/GRO then all I ask is that we use a name
like tcp_max_gso_frags and relocate the sysctl to the TCP section.
Otherwise if we are actually going to try to scope this out on a wider
level and limit all frags which is what the name implies then the
patch set needs to make a better attempt at covering all cases where
it may apply.

- Alex