Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-02 Thread Rick Jones

On 04/01/2016 07:21 PM, Eric Dumazet wrote:

On Fri, 2016-04-01 at 22:16 -0400, David Miller wrote:

From: Alexander Duyck 
Date: Fri, 1 Apr 2016 12:58:41 -0700


RFC 6864 is pretty explicit about this, IPv4 ID used only for
fragmentation.  https://tools.ietf.org/html/rfc6864#section-4.1

The goal with this change is to try and keep most of the existing
behavior in tact without violating this rule?  I would think the
sequence number should give you the ability to infer a drop in the
case of TCP.  In the case of UDP tunnels we are now getting a bit more
data since we were ignoring the outer IP header ID before.


When retransmits happen, the sequence numbers are the same.  But you
can then use the IP ID to see exactly what happened.  You can even
tell if multiple retransmits got reordered.

Eric's use case is extremely useful, and flat out eliminates ambiguity
when analyzing TCP traces.


Yes, our team (including Van Jacobson ;) ) would be sad to not have
sequential IP ID (but then we don't have them for IPv6 ;) )


Your team would not be the only one sad to see that go away.

rick jones


Since the cost of generating them is pretty small (inet->inet_id
counter), we probably should keep them in linux. Their usage will phase
out as IPv6 wins the Internet war...






Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-02 Thread Alexander Duyck
On Fri, Apr 1, 2016 at 7:16 PM, David Miller  wrote:
> From: Alexander Duyck 
> Date: Fri, 1 Apr 2016 12:58:41 -0700
>
>> RFC 6864 is pretty explicit about this, IPv4 ID used only for
>> fragmentation.  https://tools.ietf.org/html/rfc6864#section-4.1
>>
>> The goal with this change is to try and keep most of the existing
>> behavior in tact without violating this rule?  I would think the
>> sequence number should give you the ability to infer a drop in the
>> case of TCP.  In the case of UDP tunnels we are now getting a bit more
>> data since we were ignoring the outer IP header ID before.
>
> When retransmits happen, the sequence numbers are the same.  But you
> can then use the IP ID to see exactly what happened.  You can even
> tell if multiple retransmits got reordered.
>
> Eric's use case is extremely useful, and flat out eliminates ambiguity
> when analyzing TCP traces.

I'm not really sure the IP ID is really all that useful.  On a 10G
link it takes about 80ms for it to wrap using an MTU of 1500.  That
value is going to keep dropping as we move up to 40G and 100G.  That
was one of the motivations behind RFC 6864 because it starts becoming
a bottle-neck if you want to keep the IP IDs truly unique.  In
addition while this change would allow you to combined disjointed IP
IDs I don't think you would lose the re-transmission as there would
likely be a gap in sequence numbers that would cause the flow to be
flushed from GRO, and it isn't as if we can prepend the retransmit to
the aggregated frame, we are always adding to the tail.

I would think the most likely case where this change would foul up any
IP IDs would be the garbage-in/garbage-out case like the IPv6 to IPv4
translation that is using the fixed IP ID of 0.  I agree that it isn't
desirable, however at the same time per RFC 6864 there is nothing
there to say we cannot do that.  In addition it would likely help to
mitigate some of the impact of IP ID on things like SLHC compression
since the resegmented frames would be incrementing so it would reduce
the number of times the IP ID would have to be updated.

- Alex


Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-01 Thread Eric Dumazet
On Sat, 2016-04-02 at 10:19 +0800, Herbert Xu wrote:
> On Fri, Apr 01, 2016 at 07:15:33PM -0700, Eric Dumazet wrote:
> > On Sat, 2016-04-02 at 09:57 +0800, Herbert Xu wrote:
> > > 
> > > We could easily fix that by adding a feature bit to control this,
> > > something like SKB_GSO_TCP_FIXEDID.
> > 
> > I understood the patch allowed to aggregate 4 segments having
> > 
> > ID=12   ID=40   ID=80  ID=1000
> 
> Right.  But I haven't seen any justification that we should aggregate
> such packets at all.  The only valid case that I have seen is for
> the case of unchanging IDs, no?

Presumably repeats of "DF=1 ID=0" should be what we really need to
handle.

On my wish list, having some reordering logic in GRO would be far more
interesting than these minor details ;)





Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-01 Thread Eric Dumazet
On Fri, 2016-04-01 at 22:16 -0400, David Miller wrote:
> From: Alexander Duyck 
> Date: Fri, 1 Apr 2016 12:58:41 -0700
> 
> > RFC 6864 is pretty explicit about this, IPv4 ID used only for
> > fragmentation.  https://tools.ietf.org/html/rfc6864#section-4.1
> > 
> > The goal with this change is to try and keep most of the existing
> > behavior in tact without violating this rule?  I would think the
> > sequence number should give you the ability to infer a drop in the
> > case of TCP.  In the case of UDP tunnels we are now getting a bit more
> > data since we were ignoring the outer IP header ID before.
> 
> When retransmits happen, the sequence numbers are the same.  But you
> can then use the IP ID to see exactly what happened.  You can even
> tell if multiple retransmits got reordered.
> 
> Eric's use case is extremely useful, and flat out eliminates ambiguity
> when analyzing TCP traces.

Yes, our team (including Van Jacobson ;) ) would be sad to not have
sequential IP ID (but then we don't have them for IPv6 ;) )

Since the cost of generating them is pretty small (inet->inet_id
counter), we probably should keep them in linux. Their usage will phase
out as IPv6 wins the Internet war...





Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-01 Thread Herbert Xu
On Fri, Apr 01, 2016 at 07:15:33PM -0700, Eric Dumazet wrote:
> On Sat, 2016-04-02 at 09:57 +0800, Herbert Xu wrote:
> > 
> > We could easily fix that by adding a feature bit to control this,
> > something like SKB_GSO_TCP_FIXEDID.
> 
> I understood the patch allowed to aggregate 4 segments having
> 
> ID=12   ID=40   ID=80  ID=1000

Right.  But I haven't seen any justification that we should aggregate
such packets at all.  The only valid case that I have seen is for
the case of unchanging IDs, no?

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-01 Thread David Miller
From: Alexander Duyck 
Date: Fri, 1 Apr 2016 12:58:41 -0700

> RFC 6864 is pretty explicit about this, IPv4 ID used only for
> fragmentation.  https://tools.ietf.org/html/rfc6864#section-4.1
> 
> The goal with this change is to try and keep most of the existing
> behavior in tact without violating this rule?  I would think the
> sequence number should give you the ability to infer a drop in the
> case of TCP.  In the case of UDP tunnels we are now getting a bit more
> data since we were ignoring the outer IP header ID before.

When retransmits happen, the sequence numbers are the same.  But you
can then use the IP ID to see exactly what happened.  You can even
tell if multiple retransmits got reordered.

Eric's use case is extremely useful, and flat out eliminates ambiguity
when analyzing TCP traces.


Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-01 Thread Eric Dumazet
On Sat, 2016-04-02 at 09:57 +0800, Herbert Xu wrote:
> Eric Dumazet  wrote:
> >
> > I do not particularly care, but it is worth mentioning that GRO+TSO
> > would not be idempotent anymore.
> 
> We could easily fix that by adding a feature bit to control this,
> something like SKB_GSO_TCP_FIXEDID.

I understood the patch allowed to aggregate 4 segments having

ID=12   ID=40   ID=80  ID=1000

-> resulting GRO packet with 4 segments and ID=12.  TSO would generate
12,13,14,15   or 12,12,12,12 with your flag ?

(Before the patch no aggregation took place and exact same packets were
forwarded with 12, 40, 80, 1000)

As I said, I am not sure we should care :)





Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-01 Thread Herbert Xu
Eric Dumazet  wrote:
>
> I do not particularly care, but it is worth mentioning that GRO+TSO
> would not be idempotent anymore.

We could easily fix that by adding a feature bit to control this,
something like SKB_GSO_TCP_FIXEDID.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


RE: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-01 Thread Subash Abhinov Kasiviswanathan
|  For transmit we can leave the IP ID code as is.  For receive we should not be
|  snooping into the IP ID for any frames that have the DF bit set as devices
|  that have adopted RFC 6864 on their transmit path will end up causing issues.

Currently, GRO does not coalesce TCP packets originating from nodes which do 
IPv6 to IPv4 translation.
These packets have DF set to 1 and IP ID set 0 according to 
https://tools.ietf.org/html/rfc6145#page-17 

With this patch, we should be able to see coalescing happen for those cases as 
well.



Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-01 Thread Alexander Duyck
On Fri, Apr 1, 2016 at 12:24 PM, David Miller  wrote:
> From: Eric Dumazet 
> Date: Fri, 01 Apr 2016 11:49:03 -0700
>
>> For example, TCP stack tracks per socket ID generation, even if it
>> sends DF=1 frames. Damn useful for tcpdump analysis and drop
>> inference.
>
> Thanks for mentioning this, I never considered this use case.

RFC 6864 is pretty explicit about this, IPv4 ID used only for
fragmentation.  https://tools.ietf.org/html/rfc6864#section-4.1

The goal with this change is to try and keep most of the existing
behavior in tact without violating this rule?  I would think the
sequence number should give you the ability to infer a drop in the
case of TCP.  In the case of UDP tunnels we are now getting a bit more
data since we were ignoring the outer IP header ID before.

>> With your change, the resulting GRO packet would propagate the ID of
>> first frag. Most GSO/GSO engines will then provide a ID sequence,
>> which might not match original packets.

Right.  But that is only in the case where the IP IDs did not already
increment or were left uninitialized meaning the transmitter was
probably already following RFC 6864 and chose a fixed value.  Odds are
in such a case we end up improving the performance if anything as
there are plenty of legacy systems out there that still require the
IPv4 ID increment in order to get LRO/GRO.

>> I do not particularly care, but it is worth mentioning that GRO+TSO
>> would not be idempotent anymore.

In the patch I mentioned we had already broken that.  I'm basically
just going through and fixing the cases for tunnels where we were
doing the outer header wrong while at the same time relaxing the
requirements for the inner header if DF is set.  I'll probably add
some documentation do the Documentation folder about it as well.  I'm
currently in the process of writing up documentation for GSO and GSO
partial for the upcoming patchset.  I can pretty easily throw in a few
comments about GRO as well.

> Our eventual plan was to start emitting zero in the ID field for
> outgoing TCP datagrams with DF set, since the issue that caused us to
> generate incrementing IDs in the first place (buggy Microsoft SLHC
> compression) we decided is not relevant and important enough to
> accommodate any more.

For the GSO partial stuff I was probably just going to have the IP ID
on the inner headers lurch forward in chunks equal to gso_segs when we
are doing the segmentation.  I didn't want to use a fixed value just
because that would likely make it easy to identify Linux devices being
a bump in the wire.  I figure if there are already sources that
weren't updating IP ID for their segmentation offloads then if we just
take that approach odds are we will blend in with the other devices
and be more difficult to single out.

Another reason for doing it this way is that different devices are
going to have different behaviors with GSO partial.  In the case of
the i40e driver it recognizes both inner and outer network headers so
it can increment both correctly.  In the case of igb and ixgbe they
only can support the outer header so the inner IP ID value would be
lurching by gso_size every time we move from one GSO frame to the
next.

> So outside of your TCP behavior analysis case, there isn't a
> compelling argument to keeping that code around any more, rather than
> just put zero in the ID field.
>
> I suppose we could keep the counter code around and allow it to be
> enabled using a sysctl or socket option, but how strongly do you
> really feel about this?

I'm not suggesting we drop the counter code for transmit.  What RFC
6864 says is "Originating sources MAY set the IPv4 ID field of atomic
datagrams to any value."

For transmit we can leave the IP ID code as is.  For receive we should
not be snooping into the IP ID for any frames that have the DF bit set
as devices that have adopted RFC 6864 on their transmit path will end
up causing issues.

- Alex


Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-01 Thread David Miller
From: Eric Dumazet 
Date: Fri, 01 Apr 2016 11:49:03 -0700

> For example, TCP stack tracks per socket ID generation, even if it
> sends DF=1 frames. Damn useful for tcpdump analysis and drop
> inference.

Thanks for mentioning this, I never considered this use case.

> With your change, the resulting GRO packet would propagate the ID of
> first frag. Most GSO/GSO engines will then provide a ID sequence,
> which might not match original packets.
> 
> I do not particularly care, but it is worth mentioning that GRO+TSO
> would not be idempotent anymore.

Our eventual plan was to start emitting zero in the ID field for
outgoing TCP datagrams with DF set, since the issue that caused us to
generate incrementing IDs in the first place (buggy Microsoft SLHC
compression) we decided is not relevant and important enough to
accommodate any more.

So outside of your TCP behavior analysis case, there isn't a
compelling argument to keeping that code around any more, rather than
just put zero in the ID field.

I suppose we could keep the counter code around and allow it to be
enabled using a sysctl or socket option, but how strongly do you
really feel about this?


Re: [net PATCH 2/2] ipv4/GRO: Make GRO conform to RFC 6864

2016-04-01 Thread Eric Dumazet
On Fri, 2016-04-01 at 11:05 -0700, Alexander Duyck wrote:
> RFC 6864 states that the IPv4 ID field MUST NOT be used for purposes other
> than fragmentation and reassembly.  Currently we are looking at this field
> as a way of identifying what frames can be aggregated and  which cannot for
> GRO.  While this is valid for frames that do not have DF set, it is invalid
> to do so if the bit is set.
> 
> In addition we were generating IPv4 ID collisions when 2 or more flows were
> interleaved over the same tunnel.  To prevent that we store the result of
> all IP ID checks via a "|=" instead of overwriting previous values.

Note that for atomic datagrams (DF = 1), since fragmentation and
reassembly can not occur, maybe some people use ID field for other
purposes.

For example, TCP stack tracks per socket ID generation, even if it sends
DF=1 frames. Damn useful for tcpdump analysis and drop inference.


With your change, the resulting GRO packet would propagate the ID of
first frag. Most GSO/GSO engines will then provide a ID sequence, which
might not match original packets.

I do not particularly care, but it is worth mentioning that GRO+TSO
would not be idempotent anymore.