[IPV6] Q: corrupt checksums when transferring data

2006-08-25 Thread Brandeburg, Jesse
I'm enabling e1000 to offload IPv6 since the 2.6.18+ kernels support it.
The kernel I'm testing is 2.6.18-rc4.  

Everything with the hardware offload is working fine, but it appears
that the GSO code sometimes does not segment frames correctly for IPv6
traffic.
I did a tcpdump on both ends with all hardware offloading disabled
through ethtool.  Here is what I got; note the long frame,
1:2857(2856), in the sender trace and then the retransmit.

Has this problem been addressed already?  I'll compile and test a couple
of newer kernels; any suggested target patches or kernels would be
appreciated.

Sender:
===
15:56:28.769034 bk1-6.33541 > lh2-6.12865: S 3200244805:3200244805(0)
win 5760 <mss 1440,sackOK,timestamp 64767859 0,nop,wscale 7>
15:56:28.769042 lh2-6.12865 > bk1-6.33541: S 1558653050:1558653050(0)
ack 3200244806 win 5712 <mss 1440,sackOK,timestamp 172654320
64767859,nop,wscale 7>
15:56:28.769102 bk1-6.33541 > lh2-6.12865: . ack 1 win 45
<nop,nop,timestamp 64767859 172654320>
15:56:28.769350 bk1-6.33541 > lh2-6.12865: P 1:257(256) ack 1 win 45
<nop,nop,timestamp 64767859 172654320>
15:56:28.769381 lh2-6.12865 > bk1-6.33541: . ack 257 win 53
<nop,nop,timestamp 172654320 64767859>
15:56:28.769731 lh2-6.12865 > bk1-6.33541: P 1:257(256) ack 257 win 53
<nop,nop,timestamp 172654320 64767859>
15:56:28.769851 bk1-6.33541 > lh2-6.12865: . ack 257 win 54
<nop,nop,timestamp 64767860 172654320>
15:56:28.769860 bk1-6.46315 > lh2-6.35704: S 3205139672:3205139672(0)
win 5760 <mss 1440,sackOK,timestamp 64767860 0,nop,wscale 7>
15:56:28.769873 lh2-6.35704 > bk1-6.46315: S 1557432368:1557432368(0)
ack 3205139673 win 5712 <mss 1440,sackOK,timestamp 172654320
64767860,nop,wscale 7>
15:56:28.769975 bk1-6.46315 > lh2-6.35704: . ack 1 win 45
<nop,nop,timestamp 64767860 172654320>
15:56:28.770009 lh2-6.35704 > bk1-6.46315: . 1:2857(2856) ack 1 win 45
<nop,nop,timestamp 172654320 64767860>
15:56:28.972354 lh2-6.35704 > bk1-6.46315: . 1:1429(1428) ack 1 win 45
<nop,nop,timestamp 172654371 64767860>
15:56:28.972478 bk1-6.46315 > lh2-6.35704: . ack 1429 win 68
<nop,nop,timestamp 64767910 172654371>
15:56:28.972493 lh2-6.35704 > bk1-6.46315: . 1429:2857(1428) ack 1 win
45 <nop,nop,timestamp 172654371 64767910>
15:56:28.972602 bk1-6.46315 > lh2-6.35704: . ack 2857 win 90
<nop,nop,timestamp 64767910 172654371>
15:56:28.972611 lh2-6.35704 > bk1-6.46315: . 2857:4285(1428) ack 1 win
45 <nop,nop,timestamp 172654371 64767910>
15:56:28.972727 bk1-6.46315 > lh2-6.35704: . ack 4285 win 112
<nop,nop,timestamp 64767910 172654371>
15:56:28.972735 lh2-6.35704 > bk1-6.46315: . 4285:5713(1428) ack 1 win
45 <nop,nop,timestamp 172654371 64767910>
15:56:28.972742 lh2-6.35704 > bk1-6.46315: . 5713:7141(1428) ack 1 win
45 <nop,nop,timestamp 172654371 64767910>
15:56:28.972853 bk1-6.46315 > lh2-6.35704: . ack 5713 win 135
<nop,nop,timestamp 64767910 172654371>
15:56:28.972862 lh2-6.35704 > bk1-6.46315: . 7141:8569(1428) ack 1 win
45 <nop,nop,timestamp 172654371 64767910>
15:56:28.972868 lh2-6.35704 > bk1-6.46315: . 8569:9997(1428) ack 1 win
45 <nop,nop,timestamp 172654371 64767910>

Receiver:
===
15:56:28.764058 bk1-6.33541 > lh2-6.12865: S 3200244805:3200244805(0)
win 5760 <mss 1440,sackOK,timestamp 64767859 0,nop,wscale 7>
15:56:28.764181 lh2-6.12865 > bk1-6.33541: S 1558653050:1558653050(0)
ack 3200244806 win 5712 <mss 1440,sackOK,timestamp 172654320
64767859,nop,wscale 7>
15:56:28.764205 bk1-6.33541 > lh2-6.12865: . ack 1 win 45
<nop,nop,timestamp 64767859 172654320>
15:56:28.764441 bk1-6.33541 > lh2-6.12865: P 1:257(256) ack 1 win 45
<nop,nop,timestamp 64767859 172654320>
15:56:28.764552 lh2-6.12865 > bk1-6.33541: . ack 257 win 53
<nop,nop,timestamp 172654320 64767859>
15:56:28.764926 lh2-6.12865 > bk1-6.33541: P 1:257(256) ack 257 win 53
<nop,nop,timestamp 172654320 64767859>
15:56:28.764936 bk1-6.33541 > lh2-6.12865: . ack 257 win 54
<nop,nop,timestamp 64767860 172654320>
15:56:28.764962 bk1-6.46315 > lh2-6.35704: S 3205139672:3205139672(0)
win 5760 <mss 1440,sackOK,timestamp 64767860 0,nop,wscale 7>
15:56:28.765052 lh2-6.35704 > bk1-6.46315: S 1557432368:1557432368(0)
ack 3205139673 win 5712 <mss 1440,sackOK,timestamp 172654320
64767860,nop,wscale 7>
15:56:28.765061 bk1-6.46315 > lh2-6.35704: . ack 1 win 45
<nop,nop,timestamp 64767860 172654320>
15:56:28.765300 lh2-6.35704 > bk1-6.46315: . 1:1429(1428) ack 1 win 45
<nop,nop,timestamp 172654320 64767860>
15:56:28.765306 lh2-6.35704 > bk1-6.46315: . 1429:2857(1428) ack 1 win
45 <nop,nop,timestamp 172654320 64767860>
15:56:28.967565 lh2-6.35704 > bk1-6.46315: . 1:1429(1428) ack 1 win 45
<nop,nop,timestamp 172654371 64767860>
15:56:28.967581 bk1-6.46315 > lh2-6.35704: . ack 1429 win 68
<nop,nop,timestamp 64767910 172654371>
15:56:28.967691 lh2-6.35704 > bk1-6.46315: . 1429:2857(1428) ack 1 win
45 <nop,nop,timestamp 172654371 64767910>
15:56:28.967702 bk1-6.46315 > lh2-6.35704: . ack 2857 win 90
<nop,nop,timestamp 64767910 172654371>
15:56:28.967816 lh2-6.35704 > bk1-6.46315: . 2857:4285(1428) ack 1 win
45 <nop,nop,timestamp 172654371 64767910>
15:56:28.967826 bk1-6.46315 > lh2-6.35704: . ack 4285 win 112
<nop,nop,timestamp 64767910

Re: [IPV6] Q: corrupt checksums when transferring data

2006-08-25 Thread Stephen Hemminger
On Fri, 25 Aug 2006 11:13:48 -0700
Brandeburg, Jesse [EMAIL PROTECTED] wrote:

> I'm enabling e1000 to offload IPv6 since the 2.6.18+ kernels support it.
> The kernel I'm testing is 2.6.18-rc4.

Yes, something is wrong with the GSO code. I am bisecting this bug
http://bugzilla.kernel.org/show_bug.cgi?id=7050


It looks like GSO is handing an IPV6 segment down to the sky2 driver
even though it asks for only NETIF_F_TSO.

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [IPV6] Q: corrupt checksums when transferring data

2006-08-25 Thread Brandeburg, Jesse
Stephen Hemminger wrote:
> On Fri, 25 Aug 2006 11:13:48 -0700
> Brandeburg, Jesse [EMAIL PROTECTED] wrote:
>
> > I'm enabling e1000 to offload IPv6 since the 2.6.18+ kernels support
> > it. The kernel I'm testing is 2.6.18-rc4.
>
> Yes, something is wrong with the GSO code. I am bisecting this bug
>   http://bugzilla.kernel.org/show_bug.cgi?id=7050
>
>
> It looks like GSO is handing an IPV6 segment down to the sky2 driver
> even though it asks for only NETIF_F_TSO.

Ah ha, I was wondering if that bug report on sky2 might be related to
this issue.  I think e1000 actually sends the data when handed a
too-long frame (it just has a bad checksum).  It seems like the stack
should never give us something longer than the MTU + Ethernet header,
especially with all hardware offloads disabled.

So I have a very easy repro with netperf:
  on remote: netserver -4 -6
  on local:  netperf -H lh2-6,6 -t TCP_MAERTS -- -m4K -S128K -s128K

The remote will generate the bad frames.

Jesse


RE: [IPV6] Q: corrupt checksums when transferring data

2006-08-25 Thread Brandeburg, Jesse
Stephen Hemminger wrote:
> I think this is the problem. Does it fix e1000? I am testing now.
>
> TCP over IPV6 would incorrectly inherit the GSO settings on accepted
> children.
>
> --- linux-2.6.orig/net/ipv6/tcp_ipv6.c	2006-08-03 09:09:16.0 -0700
> +++ linux-2.6/net/ipv6/tcp_ipv6.c	2006-08-25 15:30:31.0 -0700
> @@ -944,7 +944,7 @@
> 	 * comment in that function for the gory details. -acme
> 	 */
>
> -	sk->sk_gso_type = SKB_GSO_TCPV6;
> +	newsk->sk_gso_type = SKB_GSO_TCPV6;
> 	__ip6_dst_store(newsk, dst, NULL);
>
> 	newtcp6sk = (struct tcp6_sock *)newsk;

Ah, no more errors.  I didn't go through and validate much beyond
that, but I'm now able to do hardware offloads with no errors.

I think it's a good patch; at least it makes sense to me and works for
me.

Thanks!
 Jesse