Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-09 Thread Shmulik Ladkani
On Sat, 9 Jul 2016 11:35:03 -0400 Hannes Frederic Sowa 
 wrote:
> On 09.07.2016 11:18, Shmulik Ladkani wrote:
> > On Fri, 8 Jul 2016 19:04:27 -0400 Hannes Frederic Sowa 
> >  wrote:  
>  I really do wonder if GRO on top of fragmentation does have any effect.
>  Would be great if someone has data for that already?
> >>>
> >>> I think that logic is kind of backwards.  It is already there.
> >>> Instead of asking people to prove that this change is invalid the onus
> >>> should be on the submitter to prove the change causes no harm.
> >>
> >> Of course, sorry, I didn't want to make the impression others should do
> >> that. I asked because Shmulik made the impression on me he had
> >> experience with GRO+fragmentation on vxlan and/or geneve and could
> >> provide some data, maybe even just anecdotal.  
> > 
> > Few anecdotal updates.
> > 
> > I don't have ready-made data as the systems are not using this exact
> > kind of of setup.
> > 
> > However, by performing some quick experimentations, it reveals that GRO
> > on top of the tunnels, where tunnel datagrams are fragmented, has some
> > effect. The packets indeed get aggregated, although not aggresively as
> > in the non-fragmented case.
> > 
> > Whether the effect is significant depends on the system.
> > 
> > In a system that is very sensitive to non-aggregated skbs (due to a cpu
> > bottleneck during further processing of the decapsulated packets), the
> > effect of aggregation is indeed significant.  
> 
> Cool, thanks. I thought it wouldn't happen because of the packet pacing.
> We will also do some more tests ourselves. Maybe it is time to add
> fragmentation support to inet_gro_receive to handle those cases much
> more easily without going through fragmentation engine at all, would
> probably speed up your usage significantly?

Indeed, that seems beneficial. I wondered about this back ago. I found
it not trivial, though. Without the transport headers available per
received SKB, it makes GRO complex than currently is :)

> Talking about ip fragmentation in general, are you end-host or
> mid-router fragmented? 

Currently dealing with end-host fragmentation.
(follow the thread at [1] - usecase is better explained there)

[1] http://www.spinics.net/lists/netdev/msg385085.html

Regards,
Shmulik


Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-09 Thread Hannes Frederic Sowa
On 09.07.2016 11:18, Shmulik Ladkani wrote:
> On Fri, 8 Jul 2016 19:04:27 -0400 Hannes Frederic Sowa 
>  wrote:
 I really do wonder if GRO on top of fragmentation does have any effect.
 Would be great if someone has data for that already?  
>>>
>>> I think that logic is kind of backwards.  It is already there.
>>> Instead of asking people to prove that this change is invalid the onus
>>> should be on the submitter to prove the change causes no harm.  
>>
>> Of course, sorry, I didn't want to make the impression others should do
>> that. I asked because Shmulik made the impression on me he had
>> experience with GRO+fragmentation on vxlan and/or geneve and could
>> provide some data, maybe even just anecdotal.
> 
> Few anecdotal updates.
> 
> I don't have ready-made data as the systems are not using this exact
> kind of of setup.
> 
> However, by performing some quick experimentations, it reveals that GRO
> on top of the tunnels, where tunnel datagrams are fragmented, has some
> effect. The packets indeed get aggregated, although not aggresively as
> in the non-fragmented case.
> 
> Whether the effect is significant depends on the system.
> 
> In a system that is very sensitive to non-aggregated skbs (due to a cpu
> bottleneck during further processing of the decapsulated packets), the
> effect of aggregation is indeed significant.

Cool, thanks. I thought it wouldn't happen because of the packet pacing.
We will also do some more tests ourselves. Maybe it is time to add
fragmentation support to inet_gro_receive to handle those cases much
more easily without going through fragmentation engine at all, would
probably speed up your usage significantly?

Talking about ip fragmentation in general, are you end-host or
mid-router fragmented? Do you know if there are different
characteristics if linux fragments vs. some kind of hw-router in the
middle (do fragments get paced?).

Thanks,
Hannes



Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-09 Thread Shmulik Ladkani
On Fri, 8 Jul 2016 19:04:27 -0400 Hannes Frederic Sowa 
 wrote:
> >> I really do wonder if GRO on top of fragmentation does have any effect.
> >> Would be great if someone has data for that already?  
> > 
> > I think that logic is kind of backwards.  It is already there.
> > Instead of asking people to prove that this change is invalid the onus
> > should be on the submitter to prove the change causes no harm.  
> 
> Of course, sorry, I didn't want to make the impression others should do
> that. I asked because Shmulik made the impression on me he had
> experience with GRO+fragmentation on vxlan and/or geneve and could
> provide some data, maybe even just anecdotal.

Few anecdotal updates.

I don't have ready-made data as the systems are not using this exact
kind of of setup.

However, by performing some quick experimentations, it reveals that GRO
on top of the tunnels, where tunnel datagrams are fragmented, has some
effect. The packets indeed get aggregated, although not aggresively as
in the non-fragmented case.

Whether the effect is significant depends on the system.

In a system that is very sensitive to non-aggregated skbs (due to a cpu
bottleneck during further processing of the decapsulated packets), the
effect of aggregation is indeed significant.

Regards,
Shmulik


Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-08 Thread Alexander Duyck
On Fri, Jul 8, 2016 at 4:04 PM, Hannes Frederic Sowa
 wrote:
> On 08.07.2016 18:11, Alexander Duyck wrote:
>> On Fri, Jul 8, 2016 at 2:51 PM, Hannes Frederic Sowa
>>  wrote:
>>> On 08.07.2016 17:27, Alexander Duyck wrote:
 On Fri, Jul 8, 2016 at 1:57 PM, Hannes Frederic Sowa
  wrote:
> On 08.07.2016 16:17, Shmulik Ladkani wrote:
>> On Fri, 8 Jul 2016 09:21:40 -0700 Alexander Duyck 
>>  wrote:
>>> On Thu, Jul 7, 2016 at 8:58 AM, Paolo Abeni  wrote:
 With udp tunnel offload in place, the kernel can do GRO for some udp 
 tunnels
 at the ingress device level. Currently both the geneve and the vxlan 
 drivers
 implement an additional GRO aggregation point via gro_cells.
 The latter takes effect for tunnels using zero checksum udp packets, 
 which are
 currently explicitly not aggregated by the udp offload layer.

 This patch series adapts the udp tunnel offload to process also zero 
 checksum
 udp packets, if the tunnel's socket allow it. Aggregation, if possible 
 is always
 performed at the ingress device level.

 Then the gro_cells hooks, in both vxlan and geneve driver are removed.
>>>
>>> I think removing the gro_cells hooks may be taking things one step too 
>>> far.
>>
>> +1
>>
>>> I get that there is an impression that it is redundant but there are a
>>> number of paths that could lead to VXLAN or GENEVE frames being
>>> received that are not aggregated via GRO.
>>
>> There's the case where the vxlan/geneve datagrams get IP fragmented, and
>> IP frags are not GROed.
>> GRO aggregation at the vxlan/geneve level is beneficial for this case.
>
> Isn't this a misconfiguration? TCP should not fragment at all, not even
> in vxlan/geneve if one cares about performance? And UDP is not GRO'ed
> anyway.

 The problem is that the DF bit of the outer header is what matters,
 not the inner one.  I believe the default for many of the UDP tunnels
 is to not set the DF bit on the outer header.  The fact is
 fragmentation shouldn't happen, but it can and we need to keep that in
 mind when we work on these sort of things.
>>>
>>> "Old" tunnel protocols inherit the outer DF bit from the inner one,
>>> geneve and vxlan do not. I think we simply never should set DF bit
>>> because vxlan pmtu with source port rss perturbation breaks pmtu
>>> discovery anyway. On the other hand doing so and not having end-to-end
>>> protection of the VNI because of proposed 0 checksum is also...
>>> otherwise at least the Ethernet FCS could protect the frame.
>>
>> The "Old" tunnel protocols can be configured to behave the same way as
>> well.  It is just that most of them default to a mode where they
>> support getting a path MTU if I recall correctly.
>
> That actually only happend very recently, about one month ago (for gre
> at least). The other (ipip) doesn't support that.
>
 There have been a number of cases in the past with tunnels where "it
 works for me" has been how things have been implemented and we need to
 avoid that as it creates a significant amount of pain for end users
 when they do things and they don't work because they have strayed off
 of the one use case that has been tested.
>>>
>>> We certainly don't want to break fragmentation with vxlan and this patch
>>> doesn't change so.
>>>
>>> I really do wonder if GRO on top of fragmentation does have any effect.
>>> Would be great if someone has data for that already?
>>
>> I think that logic is kind of backwards.  It is already there.
>> Instead of asking people to prove that this change is invalid the onus
>> should be on the submitter to prove the change causes no harm.
>
> Of course, sorry, I didn't want to make the impression others should do
> that. I asked because Shmulik made the impression on me he had
> experience with GRO+fragmentation on vxlan and/or geneve and could
> provide some data, maybe even just anecdotal.
>
>> The whole argument is kind of moot anyway based on the comment from
>> Tom.  This is based around being able to aggregate frames with a zero
>> checksum which we can already do, but it requires that the hardware
>> has validated the inner checksum.  What this patch set is doing is
>> trying to take frames that have no checksum and force us to perform a
>> checksum for the inner headers in software.  Really we don't do that
>> now for general GRO, why should we do it in the case of tunnels?  Also
>> I think there is a test escape for the case of a frame with an outer
>> checksum that was not validated by hardware.  In that case the
>> checksum isn't converted until you hit the UDP handler which will then
>> convert it to checksum complete and it then would rely 

Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-08 Thread Hannes Frederic Sowa
On 08.07.2016 18:11, Alexander Duyck wrote:
> On Fri, Jul 8, 2016 at 2:51 PM, Hannes Frederic Sowa
>  wrote:
>> On 08.07.2016 17:27, Alexander Duyck wrote:
>>> On Fri, Jul 8, 2016 at 1:57 PM, Hannes Frederic Sowa
>>>  wrote:
 On 08.07.2016 16:17, Shmulik Ladkani wrote:
> On Fri, 8 Jul 2016 09:21:40 -0700 Alexander Duyck 
>  wrote:
>> On Thu, Jul 7, 2016 at 8:58 AM, Paolo Abeni  wrote:
>>> With udp tunnel offload in place, the kernel can do GRO for some udp 
>>> tunnels
>>> at the ingress device level. Currently both the geneve and the vxlan 
>>> drivers
>>> implement an additional GRO aggregation point via gro_cells.
>>> The latter takes effect for tunnels using zero checksum udp packets, 
>>> which are
>>> currently explicitly not aggregated by the udp offload layer.
>>>
>>> This patch series adapts the udp tunnel offload to process also zero 
>>> checksum
>>> udp packets, if the tunnel's socket allow it. Aggregation, if possible 
>>> is always
>>> performed at the ingress device level.
>>>
>>> Then the gro_cells hooks, in both vxlan and geneve driver are removed.
>>
>> I think removing the gro_cells hooks may be taking things one step too 
>> far.
>
> +1
>
>> I get that there is an impression that it is redundant but there are a
>> number of paths that could lead to VXLAN or GENEVE frames being
>> received that are not aggregated via GRO.
>
> There's the case where the vxlan/geneve datagrams get IP fragmented, and
> IP frags are not GROed.
> GRO aggregation at the vxlan/geneve level is beneficial for this case.

 Isn't this a misconfiguration? TCP should not fragment at all, not even
 in vxlan/geneve if one cares about performance? And UDP is not GRO'ed
 anyway.
>>>
>>> The problem is that the DF bit of the outer header is what matters,
>>> not the inner one.  I believe the default for many of the UDP tunnels
>>> is to not set the DF bit on the outer header.  The fact is
>>> fragmentation shouldn't happen, but it can and we need to keep that in
>>> mind when we work on these sort of things.
>>
>> "Old" tunnel protocols inherit the outer DF bit from the inner one,
>> geneve and vxlan do not. I think we simply never should set DF bit
>> because vxlan pmtu with source port rss perturbation breaks pmtu
>> discovery anyway. On the other hand doing so and not having end-to-end
>> protection of the VNI because of proposed 0 checksum is also...
>> otherwise at least the Ethernet FCS could protect the frame.
> 
> The "Old" tunnel protocols can be configured to behave the same way as
> well.  It is just that most of them default to a mode where they
> support getting a path MTU if I recall correctly.

That actually only happend very recently, about one month ago (for gre
at least). The other (ipip) doesn't support that.

>>> There have been a number of cases in the past with tunnels where "it
>>> works for me" has been how things have been implemented and we need to
>>> avoid that as it creates a significant amount of pain for end users
>>> when they do things and they don't work because they have strayed off
>>> of the one use case that has been tested.
>>
>> We certainly don't want to break fragmentation with vxlan and this patch
>> doesn't change so.
>>
>> I really do wonder if GRO on top of fragmentation does have any effect.
>> Would be great if someone has data for that already?
> 
> I think that logic is kind of backwards.  It is already there.
> Instead of asking people to prove that this change is invalid the onus
> should be on the submitter to prove the change causes no harm.

Of course, sorry, I didn't want to make the impression others should do
that. I asked because Shmulik made the impression on me he had
experience with GRO+fragmentation on vxlan and/or geneve and could
provide some data, maybe even just anecdotal.

> The whole argument is kind of moot anyway based on the comment from
> Tom.  This is based around being able to aggregate frames with a zero
> checksum which we can already do, but it requires that the hardware
> has validated the inner checksum.  What this patch set is doing is
> trying to take frames that have no checksum and force us to perform a
> checksum for the inner headers in software.  Really we don't do that
> now for general GRO, why should we do it in the case of tunnels?  Also
> I think there is a test escape for the case of a frame with an outer
> checksum that was not validated by hardware.  In that case the
> checksum isn't converted until you hit the UDP handler which will then
> convert it to checksum complete and it then would rely on the GRO
> cells to merge the frames after we have already validated the checksum
> for the outer header.

I do agree that Tom's comment changes things a little bit, but currently
I tend 

Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-08 Thread Alexander Duyck
On Fri, Jul 8, 2016 at 2:51 PM, Hannes Frederic Sowa
 wrote:
> On 08.07.2016 17:27, Alexander Duyck wrote:
>> On Fri, Jul 8, 2016 at 1:57 PM, Hannes Frederic Sowa
>>  wrote:
>>> On 08.07.2016 16:17, Shmulik Ladkani wrote:
 On Fri, 8 Jul 2016 09:21:40 -0700 Alexander Duyck 
  wrote:
> On Thu, Jul 7, 2016 at 8:58 AM, Paolo Abeni  wrote:
>> With udp tunnel offload in place, the kernel can do GRO for some udp 
>> tunnels
>> at the ingress device level. Currently both the geneve and the vxlan 
>> drivers
>> implement an additional GRO aggregation point via gro_cells.
>> The latter takes effect for tunnels using zero checksum udp packets, 
>> which are
>> currently explicitly not aggregated by the udp offload layer.
>>
>> This patch series adapts the udp tunnel offload to process also zero 
>> checksum
>> udp packets, if the tunnel's socket allow it. Aggregation, if possible 
>> is always
>> performed at the ingress device level.
>>
>> Then the gro_cells hooks, in both vxlan and geneve driver are removed.
>
> I think removing the gro_cells hooks may be taking things one step too 
> far.

 +1

> I get that there is an impression that it is redundant but there are a
> number of paths that could lead to VXLAN or GENEVE frames being
> received that are not aggregated via GRO.

 There's the case where the vxlan/geneve datagrams get IP fragmented, and
 IP frags are not GROed.
 GRO aggregation at the vxlan/geneve level is beneficial for this case.
>>>
>>> Isn't this a misconfiguration? TCP should not fragment at all, not even
>>> in vxlan/geneve if one cares about performance? And UDP is not GRO'ed
>>> anyway.
>>
>> The problem is that the DF bit of the outer header is what matters,
>> not the inner one.  I believe the default for many of the UDP tunnels
>> is to not set the DF bit on the outer header.  The fact is
>> fragmentation shouldn't happen, but it can and we need to keep that in
>> mind when we work on these sort of things.
>
> "Old" tunnel protocols inherit the outer DF bit from the inner one,
> geneve and vxlan do not. I think we simply never should set DF bit
> because vxlan pmtu with source port rss perturbation breaks pmtu
> discovery anyway. On the other hand doing so and not having end-to-end
> protection of the VNI because of proposed 0 checksum is also...
> otherwise at least the Ethernet FCS could protect the frame.

The "Old" tunnel protocols can be configured to behave the same way as
well.  It is just that most of them default to a mode where they
support getting a path MTU if I recall correctly.

>> There have been a number of cases in the past with tunnels where "it
>> works for me" has been how things have been implemented and we need to
>> avoid that as it creates a significant amount of pain for end users
>> when they do things and they don't work because they have strayed off
>> of the one use case that has been tested.
>
> We certainly don't want to break fragmentation with vxlan and this patch
> doesn't change so.
>
> I really do wonder if GRO on top of fragmentation does have any effect.
> Would be great if someone has data for that already?

I think that logic is kind of backwards.  It is already there.
Instead of asking people to prove that this change is invalid the onus
should be on the submitter to prove the change causes no harm.

The whole argument is kind of moot anyway based on the comment from
Tom.  This is based around being able to aggregate frames with a zero
checksum which we can already do, but it requires that the hardware
has validated the inner checksum.  What this patch set is doing is
trying to take frames that have no checksum and force us to perform a
checksum for the inner headers in software.  Really we don't do that
now for general GRO, why should we do it in the case of tunnels?  Also
I think there is a test escape for the case of a frame with an outer
checksum that was not validated by hardware.  In that case the
checksum isn't converted until you hit the UDP handler which will then
convert it to checksum complete and it then would rely on the GRO
cells to merge the frames after we have already validated the checksum
for the outer header.

- Alex


Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-08 Thread Hannes Frederic Sowa
On 08.07.2016 17:19, Shmulik Ladkani wrote:
> On Fri, 8 Jul 2016 16:57:10 -0400 Hannes Frederic Sowa 
>  wrote:
>> On 08.07.2016 16:17, Shmulik Ladkani wrote:
>>> On Fri, 8 Jul 2016 09:21:40 -0700 Alexander Duyck 
>>>  wrote:  
 I get that there is an impression that it is redundant but there are a
 number of paths that could lead to VXLAN or GENEVE frames being
 received that are not aggregated via GRO.  
>>>
>>> There's the case where the vxlan/geneve datagrams get IP fragmented, and
>>> IP frags are not GROed.
>>> GRO aggregation at the vxlan/geneve level is beneficial for this case.  
>>
>> Isn't this a misconfiguration? TCP should not fragment at all, not even
>> in vxlan/geneve if one cares about performance? And UDP is not GRO'ed
>> anyway.
> 
> It's not an ideal configuration, but it is a valid one.
> 
> Imagine TCP within vxlan/geneve, that gets properly segmented and
> encapsulated.
> 
> The vxlan/geneve datagrams go out the wire, and these can occasionally
> be fragmented on the way (e.g. when we can't control the MTUs along the
> path, or if unable to use PMTUD for whatever reason).

PMTUD doesn't work with vxlan in most situations anyway. But you can
still control the mtu/mss with ip route and you don't need to modify the
mid-hosts.

> At the receiving vxlan/geneve termination, these IP frags are not GROed.
> 
> Instead they get reassembled by the IP stack, then handed to UDP and to
> the vxlan/geneve drivers.
> 
> From that point, GROing at the vxlan/geneve device, which aggregates
> the TCP segments into a TCP super-packet still make sense and has
> benefits.

Given the spreading with which those fragments will be send, I wonder if
the GRO ontop of the tunnels will really aggregate them?

Bye,
Hannes




Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-08 Thread Hannes Frederic Sowa
On 08.07.2016 17:27, Alexander Duyck wrote:
> On Fri, Jul 8, 2016 at 1:57 PM, Hannes Frederic Sowa
>  wrote:
>> On 08.07.2016 16:17, Shmulik Ladkani wrote:
>>> On Fri, 8 Jul 2016 09:21:40 -0700 Alexander Duyck 
>>>  wrote:
 On Thu, Jul 7, 2016 at 8:58 AM, Paolo Abeni  wrote:
> With udp tunnel offload in place, the kernel can do GRO for some udp 
> tunnels
> at the ingress device level. Currently both the geneve and the vxlan 
> drivers
> implement an additional GRO aggregation point via gro_cells.
> The latter takes effect for tunnels using zero checksum udp packets, 
> which are
> currently explicitly not aggregated by the udp offload layer.
>
> This patch series adapts the udp tunnel offload to process also zero 
> checksum
> udp packets, if the tunnel's socket allow it. Aggregation, if possible is 
> always
> performed at the ingress device level.
>
> Then the gro_cells hooks, in both vxlan and geneve driver are removed.

 I think removing the gro_cells hooks may be taking things one step too far.
>>>
>>> +1
>>>
 I get that there is an impression that it is redundant but there are a
 number of paths that could lead to VXLAN or GENEVE frames being
 received that are not aggregated via GRO.
>>>
>>> There's the case where the vxlan/geneve datagrams get IP fragmented, and
>>> IP frags are not GROed.
>>> GRO aggregation at the vxlan/geneve level is beneficial for this case.
>>
>> Isn't this a misconfiguration? TCP should not fragment at all, not even
>> in vxlan/geneve if one cares about performance? And UDP is not GRO'ed
>> anyway.
> 
> The problem is that the DF bit of the outer header is what matters,
> not the inner one.  I believe the default for many of the UDP tunnels
> is to not set the DF bit on the outer header.  The fact is
> fragmentation shouldn't happen, but it can and we need to keep that in
> mind when we work on these sort of things.

"Old" tunnel protocols inherit the outer DF bit from the inner one,
geneve and vxlan do not. I think we simply never should set DF bit
because vxlan pmtu with source port rss perturbation breaks pmtu
discovery anyway. On the other hand doing so and not having end-to-end
protection of the VNI because of proposed 0 checksum is also...
otherwise at least the Ethernet FCS could protect the frame.

> There have been a number of cases in the past with tunnels where "it
> works for me" has been how things have been implemented and we need to
> avoid that as it creates a significant amount of pain for end users
> when they do things and they don't work because they have strayed off
> of the one use case that has been tested.

We certainly don't want to break fragmentation with vxlan and this patch
doesn't change so.

I really do wonder if GRO on top of fragmentation does have any effect.
Would be great if someone has data for that already?

Bye,
Hannes



Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-08 Thread Alexander Duyck
On Fri, Jul 8, 2016 at 1:57 PM, Hannes Frederic Sowa
 wrote:
> On 08.07.2016 16:17, Shmulik Ladkani wrote:
>> On Fri, 8 Jul 2016 09:21:40 -0700 Alexander Duyck 
>>  wrote:
>>> On Thu, Jul 7, 2016 at 8:58 AM, Paolo Abeni  wrote:
 With udp tunnel offload in place, the kernel can do GRO for some udp 
 tunnels
 at the ingress device level. Currently both the geneve and the vxlan 
 drivers
 implement an additional GRO aggregation point via gro_cells.
 The latter takes effect for tunnels using zero checksum udp packets, which 
 are
 currently explicitly not aggregated by the udp offload layer.

 This patch series adapts the udp tunnel offload to process also zero 
 checksum
 udp packets, if the tunnel's socket allow it. Aggregation, if possible is 
 always
 performed at the ingress device level.

 Then the gro_cells hooks, in both vxlan and geneve driver are removed.
>>>
>>> I think removing the gro_cells hooks may be taking things one step too far.
>>
>> +1
>>
>>> I get that there is an impression that it is redundant but there are a
>>> number of paths that could lead to VXLAN or GENEVE frames being
>>> received that are not aggregated via GRO.
>>
>> There's the case where the vxlan/geneve datagrams get IP fragmented, and
>> IP frags are not GROed.
>> GRO aggregation at the vxlan/geneve level is beneficial for this case.
>
> Isn't this a misconfiguration? TCP should not fragment at all, not even
> in vxlan/geneve if one cares about performance? And UDP is not GRO'ed
> anyway.

The problem is that the DF bit of the outer header is what matters,
not the inner one.  I believe the default for many of the UDP tunnels
is to not set the DF bit on the outer header.  The fact is
fragmentation shouldn't happen, but it can and we need to keep that in
mind when we work on these sort of things.

There have been a number of cases in the past with tunnels where "it
works for me" has been how things have been implemented and we need to
avoid that as it creates a significant amount of pain for end users
when they do things and they don't work because they have strayed off
of the one use case that has been tested.

- Alex


Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-08 Thread Shmulik Ladkani
On Fri, 8 Jul 2016 16:57:10 -0400 Hannes Frederic Sowa 
 wrote:
> On 08.07.2016 16:17, Shmulik Ladkani wrote:
> > On Fri, 8 Jul 2016 09:21:40 -0700 Alexander Duyck 
> >  wrote:  
> >> I get that there is an impression that it is redundant but there are a
> >> number of paths that could lead to VXLAN or GENEVE frames being
> >> received that are not aggregated via GRO.  
> > 
> > There's the case where the vxlan/geneve datagrams get IP fragmented, and
> > IP frags are not GROed.
> > GRO aggregation at the vxlan/geneve level is beneficial for this case.  
> 
> Isn't this a misconfiguration? TCP should not fragment at all, not even
> in vxlan/geneve if one cares about performance? And UDP is not GRO'ed
> anyway.

It's not an ideal configuration, but it is a valid one.

Imagine TCP within vxlan/geneve, that gets properly segmented and
encapsulated.

The vxlan/geneve datagrams go out the wire, and these can occasionally
be fragmented on the way (e.g. when we can't control the MTUs along the
path, or if unable to use PMTUD for whatever reason).

At the receiving vxlan/geneve termination, these IP frags are not GROed.

Instead they get reassembled by the IP stack, then handed to UDP and to
the vxlan/geneve drivers.

>From that point, GROing at the vxlan/geneve device, which aggregates
the TCP segments into a TCP super-packet still make sense and has
benefits.

Regards,
Shmulik


Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-08 Thread Hannes Frederic Sowa
On 08.07.2016 16:17, Shmulik Ladkani wrote:
> On Fri, 8 Jul 2016 09:21:40 -0700 Alexander Duyck  
> wrote:
>> On Thu, Jul 7, 2016 at 8:58 AM, Paolo Abeni  wrote:
>>> With udp tunnel offload in place, the kernel can do GRO for some udp tunnels
>>> at the ingress device level. Currently both the geneve and the vxlan drivers
>>> implement an additional GRO aggregation point via gro_cells.
>>> The latter takes effect for tunnels using zero checksum udp packets, which 
>>> are
>>> currently explicitly not aggregated by the udp offload layer.
>>>
>>> This patch series adapts the udp tunnel offload to process also zero 
>>> checksum
>>> udp packets, if the tunnel's socket allow it. Aggregation, if possible is 
>>> always
>>> performed at the ingress device level.
>>>
>>> Then the gro_cells hooks, in both vxlan and geneve driver are removed.  
>>
>> I think removing the gro_cells hooks may be taking things one step too far.
> 
> +1
> 
>> I get that there is an impression that it is redundant but there are a
>> number of paths that could lead to VXLAN or GENEVE frames being
>> received that are not aggregated via GRO.
> 
> There's the case where the vxlan/geneve datagrams get IP fragmented, and
> IP frags are not GROed.
> GRO aggregation at the vxlan/geneve level is beneficial for this case.

Isn't this a misconfiguration? TCP should not fragment at all, not even
in vxlan/geneve if one cares about performance? And UDP is not GRO'ed
anyway.

Bye,
Hannes



Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-08 Thread Shmulik Ladkani
On Fri, 8 Jul 2016 09:21:40 -0700 Alexander Duyck  
wrote:
> On Thu, Jul 7, 2016 at 8:58 AM, Paolo Abeni  wrote:
> > With udp tunnel offload in place, the kernel can do GRO for some udp tunnels
> > at the ingress device level. Currently both the geneve and the vxlan drivers
> > implement an additional GRO aggregation point via gro_cells.
> > The latter takes effect for tunnels using zero checksum udp packets, which 
> > are
> > currently explicitly not aggregated by the udp offload layer.
> >
> > This patch series adapts the udp tunnel offload to process also zero 
> > checksum
> > udp packets, if the tunnel's socket allow it. Aggregation, if possible is 
> > always
> > performed at the ingress device level.
> >
> > Then the gro_cells hooks, in both vxlan and geneve driver are removed.  
> 
> I think removing the gro_cells hooks may be taking things one step too far.

+1

> I get that there is an impression that it is redundant but there are a
> number of paths that could lead to VXLAN or GENEVE frames being
> received that are not aggregated via GRO.

There's the case where the vxlan/geneve datagrams get IP fragmented, and
IP frags are not GROed.
GRO aggregation at the vxlan/geneve level is beneficial for this case.

Regards,
Shmulik


Re: [PATCH net-next 0/4] net: cleanup for UDP tunnel's GRO

2016-07-08 Thread Alexander Duyck
On Thu, Jul 7, 2016 at 8:58 AM, Paolo Abeni  wrote:
> With udp tunnel offload in place, the kernel can do GRO for some udp tunnels
> at the ingress device level. Currently both the geneve and the vxlan drivers
> implement an additional GRO aggregation point via gro_cells.
> The latter takes effect for tunnels using zero checksum udp packets, which are
> currently explicitly not aggregated by the udp offload layer.
>
> This patch series adapts the udp tunnel offload to process also zero checksum
> udp packets, if the tunnel's socket allow it. Aggregation, if possible is 
> always
> performed at the ingress device level.
>
> Then the gro_cells hooks, in both vxlan and geneve driver are removed.

I think removing the gro_cells hooks may be taking things one step too far.

I get that there is an impression that it is redundant but there are a
number of paths that could lead to VXLAN or GENEVE frames being
received that are not aggregated via GRO.  My concern here is that you
are optimizing for one specific use case while at the same time
possibly negatively impacting other use cases.  It would be useful to
provide some data on what the advantages and disadvantages are
expected to be.

- Alex