Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-19 Thread Eric Dumazet



On 06/19/2018 03:32 PM, Andreas Schwab wrote:
> On Jun 19 2018, Eric Dumazet  wrote:
> 
>> diff --git a/drivers/net/ethernet/sun/sungem.c 
>> b/drivers/net/ethernet/sun/sungem.c
>> index 
>> 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..672d6748ab44f0890e92d5ca55d6ff6834c20dc9
>>  100644
>> --- a/drivers/net/ethernet/sun/sungem.c
>> +++ b/drivers/net/ethernet/sun/sungem.c
>> @@ -60,8 +60,7 @@
>>  #include 
>>  #include "sungem.h"
>>  
>> -/* Stripping FCS is causing problems, disabled for now */
>> -#undef STRIP_FCS
>> +#define STRIP_FCS
>>  
>>  #define DEFAULT_MSG(NETIF_MSG_DRV  | \
>>  NETIF_MSG_PROBE| \
>> @@ -435,7 +434,7 @@ static int gem_rxmac_reset(struct gem *gp)
>> writel(desc_dma & 0x, gp->regs + RXDMA_DBLOW);
>> writel(RX_RING_SIZE - 4, gp->regs + RXDMA_KICK);
>> val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
>> -  ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
>> +  (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
>> writel(val, gp->regs + RXDMA_CFG);
>> if (readl(gp->regs + GREG_BIFCFG) & GREG_BIFCFG_M66EN)
>> writel(((5 & RXDMA_BLANK_IPKTS) |
>> @@ -857,6 +856,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
>>  
>> csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
>> 0x);
>> skb->csum = csum_unfold(csum);
>> +   {
>> +   __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - 
>> ETH_HLEN, 0);
>> +   if (csum != csum_fold(rsum) && net_ratelimit())
>> +   pr_err("sungem wrong csum : %x/%x, len %u bytes\n",
>> +   csum, csum_fold(rsum), len);
>> +   print_hex_dump(KERN_ERR, "raw data: ", 
>> DUMP_PREFIX_OFFSET,
>> +   16, 1, skb->data, len, true);
>> +   }
>> skb->ip_summed = CHECKSUM_COMPLETE;
>> skb->protocol = eth_type_trans(skb, gp->dev);
>>  
>> @@ -1761,7 +1768,7 @@ static void gem_init_dma(struct gem *gp)
>> writel(0, gp->regs + TXDMA_KICK);
>>  
>> val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
>> -  ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
>> +  (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
>> writel(val, gp->regs + RXDMA_CFG);
>>  
>> writel(desc_dma >> 32, gp->regs + RXDMA_DBHI);
> 
> With that patch I still get the wrong csum messages, but no longer the
> hw csum failure messages (tested on a PowerMac G5).
> 
> [  662.659767] sungem: sungem wrong csum : 8359/7ca6, len 86 bytes, 
> c001fee9cc02
> [  662.659775] raw data: : 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 
> 61 01  ...C.b.=~LH...a.
> [  662.659778] raw data: 0010: 1c 1e 00 20 06 40 20 01 0a 62 17 11 88 01 
> 00 00  ... .@ ..b..
> [  662.659780] raw data: 0020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 
> 00 00  .8 ..b..
> [  662.659783] raw data: 0030: 00 00 00 00 00 07 9a 18 00 16 c1 9a 7e ea 
> ea 44  ~..D
> [  662.659785] raw data: 0040: fb 4a 80 10 05 93 44 08 00 00 01 01 08 0a 
> 59 68  .JD...Yh
> [  662.659788] raw data: 0050: ba e2 0e bb ac ae  
>   ..
> 
> Andreas.
> 

Note that 8359 and 7ca6 are the same really (a missing ~ to invert 
csum_partial())

So the bug was that :

1) Driver programmed a wrong start offset for the csum  (7 bytes instead of 14 
bytes to skip Ethernet Header)

2) FCS was not stripped.

Basically CHECKSUM_COMPLETE support never worked, but this was hidden by the 
fact that linux stack
had to throw away this CHECKSUM_COMPLETE because the FCS had to be removed.

Thanks !


Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-19 Thread Andreas Schwab
On Jun 19 2018, Eric Dumazet  wrote:

> diff --git a/drivers/net/ethernet/sun/sungem.c 
> b/drivers/net/ethernet/sun/sungem.c
> index 
> 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..672d6748ab44f0890e92d5ca55d6ff6834c20dc9
>  100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -60,8 +60,7 @@
>  #include 
>  #include "sungem.h"
>  
> -/* Stripping FCS is causing problems, disabled for now */
> -#undef STRIP_FCS
> +#define STRIP_FCS
>  
>  #define DEFAULT_MSG(NETIF_MSG_DRV  | \
>  NETIF_MSG_PROBE| \
> @@ -435,7 +434,7 @@ static int gem_rxmac_reset(struct gem *gp)
> writel(desc_dma & 0x, gp->regs + RXDMA_DBLOW);
> writel(RX_RING_SIZE - 4, gp->regs + RXDMA_KICK);
> val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
> -  ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
> +  (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
> writel(val, gp->regs + RXDMA_CFG);
> if (readl(gp->regs + GREG_BIFCFG) & GREG_BIFCFG_M66EN)
> writel(((5 & RXDMA_BLANK_IPKTS) |
> @@ -857,6 +856,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
>  
> csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
> 0x);
> skb->csum = csum_unfold(csum);
> +   {
> +   __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - 
> ETH_HLEN, 0);
> +   if (csum != csum_fold(rsum) && net_ratelimit())
> +   pr_err("sungem wrong csum : %x/%x, len %u bytes\n",
> +   csum, csum_fold(rsum), len);
> +   print_hex_dump(KERN_ERR, "raw data: ", 
> DUMP_PREFIX_OFFSET,
> +   16, 1, skb->data, len, true);
> +   }
> skb->ip_summed = CHECKSUM_COMPLETE;
> skb->protocol = eth_type_trans(skb, gp->dev);
>  
> @@ -1761,7 +1768,7 @@ static void gem_init_dma(struct gem *gp)
> writel(0, gp->regs + TXDMA_KICK);
>  
> val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
> -  ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
> +  (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
> writel(val, gp->regs + RXDMA_CFG);
>  
> writel(desc_dma >> 32, gp->regs + RXDMA_DBHI);

With that patch I still get the wrong csum messages, but no longer the
hw csum failure messages (tested on a PowerMac G5).

[  662.659767] sungem: sungem wrong csum : 8359/7ca6, len 86 bytes, 
c001fee9cc02
[  662.659775] raw data: : 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 
01  ...C.b.=~LH...a.
[  662.659778] raw data: 0010: 1c 1e 00 20 06 40 20 01 0a 62 17 11 88 01 00 
00  ... .@ ..b..
[  662.659780] raw data: 0020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 
00  .8 ..b..
[  662.659783] raw data: 0030: 00 00 00 00 00 07 9a 18 00 16 c1 9a 7e ea ea 
44  ~..D
[  662.659785] raw data: 0040: fb 4a 80 10 05 93 44 08 00 00 01 01 08 0a 59 
68  .JD...Yh
[  662.659788] raw data: 0050: ba e2 0e bb ac ae
..

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-19 Thread Eric Dumazet



On 06/19/2018 01:10 PM, Eric Dumazet wrote:
> 
> 
> On 06/19/2018 12:10 PM, Andreas Schwab wrote:
>> On Jun 18 2018, Eric Dumazet  wrote:
>>
>>> DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, 
>>> or crossing page boundaries)
>>
>> DUMP_PREFIX_ADDRESS is useless for that purpose.
>>
>> Here are some samples of broken csums:
>>
>> [  853.849225] sungem: sungem wrong csum : 9886/07be, len 94 bytes, 
>> c001fa187e02
>> [  853.849232] raw data: : 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 
>> 45 10  ...C.b...QE.
>> [  853.849235] raw data: 0010: 00 4c cb a0 40 00 40 11 d9 97 c0 a8 0a 01 
>> c0 a8  .L..@.@.
>> [  853.849237] raw data: 0020: 0a 07 00 7b 00 7b 00 38 69 e1 1c 03 0c f7 
>> 00 00  ...{.{.8i...
>> [  853.849240] raw data: 0030: 08 f0 00 00 15 f0 c0 35 67 67 de d3 ca c9 
>> d9 5b  ...5gg.[
>> [  853.849242] raw data: 0040: 1f ff de d3 d2 86 8f 67 fa f2 de d3 d2 86 
>> 8f 38  ...g...8
>> [  853.849244] raw data: 0050: 2f ff de d3 d2 86 8f 3b ff ff d1 93 bc 50 
>>/..;.P
> 
> Thanks.
> 
> 4 bytes in excess.
> 
> Might be the FCS, and it does not look like provided csum has a relation with 
> it.
> 
> For some reason FCS stripping was disabled by :
> 
> commit 3e32011d4da6424b3bc65b1e1a047e30ac9882c7
> Author: Benjamin Herrenschmidt 
> Date:   Mon May 19 09:39:11 2003 -0700
> 
> [SUNGEM]: Updates from PowerPC people.
> 
> Support more chips and split out all the complex PHY
> handling into a seperate file.
> 
> 
> Since this NIC never had CHECKSUM_COMPLETE support (since we have to trim 
> each skb,
> thus were forcing ip_summed to CHECKSUM_NONE) we probably should remove it 
> and be happy.
> 
> Unless you guys find a way to let the NIC strip the FCS, and double check the 
> csum is a real csum ;)
> 
> diff --git a/drivers/net/ethernet/sun/sungem.c 
> b/drivers/net/ethernet/sun/sungem.c
> index 
> 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..85439308375e95c3854e4a1561697d69ec85399b
>  100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -760,7 +760,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
> struct net_device *dev = gp->dev;
> int entry, drops, work_done = 0;
> u32 done;
> -   __sum16 csum;
>  
> if (netif_msg_rx_status(gp))
> printk(KERN_DEBUG "%s: rx interrupt, done: %d, rx_new: %d\n",
> @@ -855,9 +854,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
> skb = copy_skb;
> }
>  
> -   csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
> 0x);
> -   skb->csum = csum_unfold(csum);
> -   skb->ip_summed = CHECKSUM_COMPLETE;
> skb->protocol = eth_type_trans(skb, gp->dev);
>  
> napi_gro_receive(>napi, skb);
> 


If you have time, you also could check if changing the suspect (14 / 2) to 
ETH_HLEN would help
(and also enabling STRIP_FCS)



diff --git a/drivers/net/ethernet/sun/sungem.c 
b/drivers/net/ethernet/sun/sungem.c
index 
7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..672d6748ab44f0890e92d5ca55d6ff6834c20dc9
 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -60,8 +60,7 @@
 #include 
 #include "sungem.h"
 
-/* Stripping FCS is causing problems, disabled for now */
-#undef STRIP_FCS
+#define STRIP_FCS
 
 #define DEFAULT_MSG(NETIF_MSG_DRV  | \
 NETIF_MSG_PROBE| \
@@ -435,7 +434,7 @@ static int gem_rxmac_reset(struct gem *gp)
writel(desc_dma & 0x, gp->regs + RXDMA_DBLOW);
writel(RX_RING_SIZE - 4, gp->regs + RXDMA_KICK);
val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
-  ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
+  (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
writel(val, gp->regs + RXDMA_CFG);
if (readl(gp->regs + GREG_BIFCFG) & GREG_BIFCFG_M66EN)
writel(((5 & RXDMA_BLANK_IPKTS) |
@@ -857,6 +856,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
 
csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
0x);
skb->csum = csum_unfold(csum);
+   {
+   __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - 
ETH_HLEN, 0);
+   if (csum != csum_fold(rsum) && net_ratelimit())
+   pr_err("sungem wrong csum : %x/%x, len %u bytes\n",
+   csum, csum_fold(rsum), len);
+   print_hex_dump(KERN_ERR, "raw data: ", 
DUMP_PREFIX_OFFSET,
+   16, 1, skb->data, len, true);
+   }
skb->ip_summed = CHECKSUM_COMPLETE;
skb->protocol = eth_type_trans(skb, gp->dev);
 
@@ -1761,7 +1768,7 @@ static void gem_init_dma(struct gem *gp)
writel(0, gp->regs + TXDMA_KICK);
 
val = 

Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-19 Thread Eric Dumazet



On 06/19/2018 12:10 PM, Andreas Schwab wrote:
> On Jun 18 2018, Eric Dumazet  wrote:
> 
>> DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, 
>> or crossing page boundaries)
> 
> DUMP_PREFIX_ADDRESS is useless for that purpose.
> 
> Here are some samples of broken csums:
> 
> [  853.849225] sungem: sungem wrong csum : 9886/07be, len 94 bytes, 
> c001fa187e02
> [  853.849232] raw data: : 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 
> 45 10  ...C.b...QE.
> [  853.849235] raw data: 0010: 00 4c cb a0 40 00 40 11 d9 97 c0 a8 0a 01 
> c0 a8  .L..@.@.
> [  853.849237] raw data: 0020: 0a 07 00 7b 00 7b 00 38 69 e1 1c 03 0c f7 
> 00 00  ...{.{.8i...
> [  853.849240] raw data: 0030: 08 f0 00 00 15 f0 c0 35 67 67 de d3 ca c9 
> d9 5b  ...5gg.[
> [  853.849242] raw data: 0040: 1f ff de d3 d2 86 8f 67 fa f2 de d3 d2 86 
> 8f 38  ...g...8
> [  853.849244] raw data: 0050: 2f ff de d3 d2 86 8f 3b ff ff d1 93 bc 50  
>   /..;.P

Thanks.

4 bytes in excess.

Might be the FCS, and it does not look like provided csum has a relation with 
it.

For some reason FCS stripping was disabled by :

commit 3e32011d4da6424b3bc65b1e1a047e30ac9882c7
Author: Benjamin Herrenschmidt 
Date:   Mon May 19 09:39:11 2003 -0700

[SUNGEM]: Updates from PowerPC people.

Support more chips and split out all the complex PHY
handling into a seperate file.


Since this NIC never had CHECKSUM_COMPLETE support (since we have to trim each 
skb,
thus were forcing ip_summed to CHECKSUM_NONE) we probably should remove it and 
be happy.

Unless you guys find a way to let the NIC strip the FCS, and double check the 
csum is a real csum ;)

diff --git a/drivers/net/ethernet/sun/sungem.c 
b/drivers/net/ethernet/sun/sungem.c
index 
7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..85439308375e95c3854e4a1561697d69ec85399b
 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -760,7 +760,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
struct net_device *dev = gp->dev;
int entry, drops, work_done = 0;
u32 done;
-   __sum16 csum;
 
if (netif_msg_rx_status(gp))
printk(KERN_DEBUG "%s: rx interrupt, done: %d, rx_new: %d\n",
@@ -855,9 +854,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
skb = copy_skb;
}
 
-   csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
0x);
-   skb->csum = csum_unfold(csum);
-   skb->ip_summed = CHECKSUM_COMPLETE;
skb->protocol = eth_type_trans(skb, gp->dev);
 
napi_gro_receive(>napi, skb);



Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-19 Thread Andreas Schwab
On Jun 18 2018, Eric Dumazet  wrote:

> DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, or 
> crossing page boundaries)

DUMP_PREFIX_ADDRESS is useless for that purpose.

Here are some samples of broken csums:

[  853.849225] sungem: sungem wrong csum : 9886/07be, len 94 bytes, 
c001fa187e02
[  853.849232] raw data: : 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 45 
10  ...C.b...QE.
[  853.849235] raw data: 0010: 00 4c cb a0 40 00 40 11 d9 97 c0 a8 0a 01 c0 
a8  .L..@.@.
[  853.849237] raw data: 0020: 0a 07 00 7b 00 7b 00 38 69 e1 1c 03 0c f7 00 
00  ...{.{.8i...
[  853.849240] raw data: 0030: 08 f0 00 00 15 f0 c0 35 67 67 de d3 ca c9 d9 
5b  ...5gg.[
[  853.849242] raw data: 0040: 1f ff de d3 d2 86 8f 67 fa f2 de d3 d2 86 8f 
38  ...g...8
[  853.849244] raw data: 0050: 2f ff de d3 d2 86 8f 3b ff ff d1 93 bc 50
/..;.P

[  857.883052] sungem: sungem wrong csum : dbb4/c48f, len 94 bytes, 
c001fa185882
[  857.883058] raw data: : 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 45 
00  ...C.b...QE.
[  857.883070] raw data: 0010: 00 4c a1 97 40 00 3a 11 ce ed d9 5b 2c 11 c0 
a8  .L..@.:[,...
[  857.883080] raw data: 0020: 0a 07 00 7b 00 7b 00 38 14 4b 24 02 06 ea 00 
00  ...{.{.8.K$.
[  857.883085] raw data: 0030: 00 0b 00 00 02 99 c0 a8 64 09 de d3 d2 5a 36 
e4  dZ6.
[  857.883090] raw data: 0040: bc f5 de d3 d2 8a 8f 2c 17 44 de d3 d2 8a 93 
8b  ...,.D..
[  857.883094] raw data: 0050: d7 b7 de d3 d2 8a 93 97 69 6e 39 7b d2 5a
in9{.Z

[  858.124689] sungem: sungem wrong csum : 1f4f/02d0, len 118 bytes, 
c001fa185602
[  858.124700] raw data: : 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 
01  ...C.b.=~LH...a.
[  858.124705] raw data: 0010: 1e b1 00 3c 06 40 20 01 0a 62 17 11 88 01 00 
00  ...<.@ ..b..
[  858.124709] raw data: 0020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 
00  .8 ..b..
[  858.124714] raw data: 0030: 00 00 00 00 00 07 94 b4 00 16 86 f5 29 e8 36 
cb  ).6.
[  858.124718] raw data: 0040: 50 49 80 18 05 93 9a 53 00 00 01 01 08 0a 58 
b2  PI.S..X.
[  858.124723] raw data: 0050: de 54 61 5f 2f 3c 00 00 00 10 cc 08 55 f7 da 
21  .Ta_/<..U..!
[  858.124727] raw data: 0060: f4 60 0a 6b 3c aa b9 b3 7e 61 10 b8 c2 be 9a 
0b  .`.k<...~a..
[  858.124731] raw data: 0070: c7 e9 5b 97 1b ac
..[...

[  858.126522] sungem: sungem wrong csum : 0836/19e9, len 90 bytes, 
c001fa185382
[  858.126530] raw data: : 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 
01  ...C.b.=~LH...a.
[  858.126532] raw data: 0010: 1e b1 00 20 06 40 20 01 0a 62 17 11 88 01 00 
00  ... .@ ..b..
[  858.126535] raw data: 0020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 
00  .8 ..b..
[  858.126537] raw data: 0030: 00 00 00 00 00 07 94 b4 00 16 86 f5 2a 04 36 
cb  *.6.
[  858.126540] raw data: 0040: 50 65 80 10 05 93 3e 56 00 00 01 01 08 0a 58 
b2  Pe>V..X.
[  858.126542] raw data: 0050: de 56 61 5f 30 4d 1d 58 42 d2
.Va_0M.XB.

[  858.131559] sungem: sungem wrong csum : 5891/c98d, len 90 bytes, 
c001fa185102
[  858.131567] raw data: : 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 
01  ...C.b.=~LH...a.
[  858.131570] raw data: 0010: 1e b1 00 20 06 40 20 01 0a 62 17 11 88 01 00 
00  ... .@ ..b..
[  858.131572] raw data: 0020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 
00  .8 ..b..
[  858.131574] raw data: 0030: 00 00 00 00 00 07 94 b4 00 16 86 f5 2a 04 36 
cb  *.6.
[  858.131577] raw data: 0040: 50 a1 80 10 05 93 3e 10 00 00 01 01 08 0a 58 
b2  P.>...X.
[  858.131579] raw data: 0050: de 5b 61 5f 30 52 3f ea 70 9b
.[a_0R?.p.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-18 Thread Eric Dumazet



On 06/18/2018 04:29 PM, Eric Dumazet wrote:
> 
> 
> On 06/18/2018 11:45 AM, Mathieu Malaterre wrote:
> 
>>
>> Here is what I get on my side
>>
>> [   53.628847] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
>> [   53.667063] sungem: sungem wrong csum : eea8/6eec, len 149 bytes
>> [   58.648952] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
>> [   58.669414] sungem: sungem wrong csum : 5245/b50, len 149 bytes
>> [   63.674451] sungem: sungem wrong csum : 2d8/5abd, len 149 bytes
>> [   68.678233] sungem: sungem wrong csum : b8fc/a498, len 149 bytes
>> [   73.685771] sungem: sungem wrong csum : 374/5a21, len 149 bytes
>> [   78.689089] sungem: sungem wrong csum : d81/5014, len 149 bytes
>> [   83.683261] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
>> [   83.690193] sungem: sungem wrong csum : c2f7/9a9d, len 149 bytes
>> [   88.692511] sungem: sungem wrong csum : f4d8/68bc, len 149 bytes
>> [   93.699915] sungem: sungem wrong csum : 1370/4a25, len 149 bytes
>> [   98.703136] sungem: sungem wrong csum : e1b5/7bdf, len 149 bytes
>> [  103.704230] sungem: sungem wrong csum : 5321/a74, len 149 bytes
>> [  108.688912] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
>> [  108.706559] sungem: sungem wrong csum : ddbc/7fd8, len 149 bytes
>> [  113.713189] sungem: sungem wrong csum : 5a65/330, len 149 bytes
>> [  113.891697] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
>> [  118.717151] sungem: sungem wrong csum : f7c8/65cc, len 149 bytes
>> [  123.722680] sungem: sungem wrong csum : 3d7a/201b, len 149 bytes
>> [  128.726524] sungem: sungem wrong csum : c8fd/9497, len 149 bytes
>> [  133.732045] sungem: sungem wrong csum : de0d/7f87, len 149 bytes
>> [  135.529163] sungem: sungem wrong csum : 3089/b6dd, len 96 bytes
>> [  135.529208] eth0: hw csum failure
>> [  135.529220] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #7
>> [  135.529226] Call Trace:
>> [  135.529243] [dffedbe0] [c069ddac]
>> __skb_checksum_complete+0xf0/0x108 (unreliable)
> 
> Thanks, then I guess next step would be to dump the content of the frames
> having a wrong checksum, hoping we find an easy way to discard the 
> CHECKSUM_COMPLETE
> in a selective way.
> 
> Otherwise, we will need to remove CHECKSUM_COMPLETE setting in this driver.
> 
> diff --git a/drivers/net/ethernet/sun/sungem.c 
> b/drivers/net/ethernet/sun/sungem.c
> index 
> 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..77a761f95be788bb86c8d917f613c9084818f826
>  100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -857,6 +857,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
>  
> csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
> 0x);
> skb->csum = csum_unfold(csum);
> +   {
> +   __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - 
> ETH_HLEN, 0);
> +   if (csum != csum_fold(rsum) && net_ratelimit())
> +   pr_err("sungem wrong csum : %04x/%04x, len %u 
> bytes\n",
> +   csum, csum_fold(rsum), len);
> +   print_hex_dump(KERN_ERR, "raw data: ", 
> DUMP_PREFIX_OFFSET,


DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, or 
crossing page boundaries)

> +   16, 1, skb->data, len, true);
> +   }
> skb->ip_summed = CHECKSUM_COMPLETE;
> skb->protocol = eth_type_trans(skb, gp->dev);
>  
> 


Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-18 Thread Eric Dumazet



On 06/18/2018 11:45 AM, Mathieu Malaterre wrote:

> 
> Here is what I get on my side
> 
> [   53.628847] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
> [   53.667063] sungem: sungem wrong csum : eea8/6eec, len 149 bytes
> [   58.648952] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
> [   58.669414] sungem: sungem wrong csum : 5245/b50, len 149 bytes
> [   63.674451] sungem: sungem wrong csum : 2d8/5abd, len 149 bytes
> [   68.678233] sungem: sungem wrong csum : b8fc/a498, len 149 bytes
> [   73.685771] sungem: sungem wrong csum : 374/5a21, len 149 bytes
> [   78.689089] sungem: sungem wrong csum : d81/5014, len 149 bytes
> [   83.683261] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
> [   83.690193] sungem: sungem wrong csum : c2f7/9a9d, len 149 bytes
> [   88.692511] sungem: sungem wrong csum : f4d8/68bc, len 149 bytes
> [   93.699915] sungem: sungem wrong csum : 1370/4a25, len 149 bytes
> [   98.703136] sungem: sungem wrong csum : e1b5/7bdf, len 149 bytes
> [  103.704230] sungem: sungem wrong csum : 5321/a74, len 149 bytes
> [  108.688912] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
> [  108.706559] sungem: sungem wrong csum : ddbc/7fd8, len 149 bytes
> [  113.713189] sungem: sungem wrong csum : 5a65/330, len 149 bytes
> [  113.891697] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
> [  118.717151] sungem: sungem wrong csum : f7c8/65cc, len 149 bytes
> [  123.722680] sungem: sungem wrong csum : 3d7a/201b, len 149 bytes
> [  128.726524] sungem: sungem wrong csum : c8fd/9497, len 149 bytes
> [  133.732045] sungem: sungem wrong csum : de0d/7f87, len 149 bytes
> [  135.529163] sungem: sungem wrong csum : 3089/b6dd, len 96 bytes
> [  135.529208] eth0: hw csum failure
> [  135.529220] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #7
> [  135.529226] Call Trace:
> [  135.529243] [dffedbe0] [c069ddac]
> __skb_checksum_complete+0xf0/0x108 (unreliable)

Thanks, then I guess next step would be to dump the content of the frames
having a wrong checksum, hoping we find an easy way to discard the 
CHECKSUM_COMPLETE
in a selective way.

Otherwise, we will need to remove CHECKSUM_COMPLETE setting in this driver.

diff --git a/drivers/net/ethernet/sun/sungem.c 
b/drivers/net/ethernet/sun/sungem.c
index 
7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..77a761f95be788bb86c8d917f613c9084818f826
 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -857,6 +857,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
 
csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
0x);
skb->csum = csum_unfold(csum);
+   {
+   __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - 
ETH_HLEN, 0);
+   if (csum != csum_fold(rsum) && net_ratelimit())
+   pr_err("sungem wrong csum : %04x/%04x, len %u bytes\n",
+   csum, csum_fold(rsum), len);
+   print_hex_dump(KERN_ERR, "raw data: ", 
DUMP_PREFIX_OFFSET,
+   16, 1, skb->data, len, true);
+   }
skb->ip_summed = CHECKSUM_COMPLETE;
skb->protocol = eth_type_trans(skb, gp->dev);
 



Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-18 Thread Mathieu Malaterre
On Mon, Jun 18, 2018 at 8:18 PM Eric Dumazet  wrote:
>
>
>
> On 06/18/2018 10:54 AM, Andreas Schwab wrote:
> > On Jun 17 2018, Eric Dumazet  wrote:
> >
> >> Oh this is silly, please try :
> >>
> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >> index 
> >> c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe
> >>  100644
> >> --- a/net/core/skbuff.c
> >> +++ b/net/core/skbuff.c
> >> @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
> >>  int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
> >>  {
> >> if (skb->ip_summed == CHECKSUM_COMPLETE) {
> >> -   int delta = skb->len - len;
> >> +   __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 
> >> 0);
> >>
> >> -   skb->csum = csum_sub(skb->csum,
> >> -skb_checksum(skb, len, delta, 0));
> >> +   skb->csum = csum_block_sub(skb->csum, csumdiff, len);
> >> }
> >> return __pskb_trim(skb, len);
> >>  }
> >
> > That doesn't help either.
> >
> > Andreas.
> >
>
> Then maybe NIC provided csum is not correct.
>
> It does not compute a checksum on all the frame, but part of it.
>
> You could use this patch to double check.
>
> diff --git a/drivers/net/ethernet/sun/sungem.c 
> b/drivers/net/ethernet/sun/sungem.c
> index 
> 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..277859ea73e35271a10b02011120fca248ec8e71
>  100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -857,6 +857,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
>
> csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
> 0x);
> skb->csum = csum_unfold(csum);
> +   {
> +   __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - 
> ETH_HLEN, 0);
> +   if (csum != csum_fold(rsum))
> +   pr_err_ratelimited("sungem wrong csum : %x/%x, len %u 
> bytes\n", csum, csum_fold(rsum), len);
> +   }
> skb->ip_summed = CHECKSUM_COMPLETE;
> skb->protocol = eth_type_trans(skb, gp->dev);
>
>

Here is what I get on my side

[   53.628847] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
[   53.667063] sungem: sungem wrong csum : eea8/6eec, len 149 bytes
[   58.648952] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
[   58.669414] sungem: sungem wrong csum : 5245/b50, len 149 bytes
[   63.674451] sungem: sungem wrong csum : 2d8/5abd, len 149 bytes
[   68.678233] sungem: sungem wrong csum : b8fc/a498, len 149 bytes
[   73.685771] sungem: sungem wrong csum : 374/5a21, len 149 bytes
[   78.689089] sungem: sungem wrong csum : d81/5014, len 149 bytes
[   83.683261] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
[   83.690193] sungem: sungem wrong csum : c2f7/9a9d, len 149 bytes
[   88.692511] sungem: sungem wrong csum : f4d8/68bc, len 149 bytes
[   93.699915] sungem: sungem wrong csum : 1370/4a25, len 149 bytes
[   98.703136] sungem: sungem wrong csum : e1b5/7bdf, len 149 bytes
[  103.704230] sungem: sungem wrong csum : 5321/a74, len 149 bytes
[  108.688912] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
[  108.706559] sungem: sungem wrong csum : ddbc/7fd8, len 149 bytes
[  113.713189] sungem: sungem wrong csum : 5a65/330, len 149 bytes
[  113.891697] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
[  118.717151] sungem: sungem wrong csum : f7c8/65cc, len 149 bytes
[  123.722680] sungem: sungem wrong csum : 3d7a/201b, len 149 bytes
[  128.726524] sungem: sungem wrong csum : c8fd/9497, len 149 bytes
[  133.732045] sungem: sungem wrong csum : de0d/7f87, len 149 bytes
[  135.529163] sungem: sungem wrong csum : 3089/b6dd, len 96 bytes
[  135.529208] eth0: hw csum failure
[  135.529220] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #7
[  135.529226] Call Trace:
[  135.529243] [dffedbe0] [c069ddac]
__skb_checksum_complete+0xf0/0x108 (unreliable)


>
>
>


Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-18 Thread Mathieu Malaterre
On Mon, Jun 18, 2018 at 7:54 PM Andreas Schwab  wrote:
>
> On Jun 17 2018, Eric Dumazet  wrote:
>
> > Oh this is silly, please try :
> >
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 
> > c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe
> >  100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
> >  int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
> >  {
> > if (skb->ip_summed == CHECKSUM_COMPLETE) {
> > -   int delta = skb->len - len;
> > +   __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
> >
> > -   skb->csum = csum_sub(skb->csum,
> > -skb_checksum(skb, len, delta, 0));
> > +   skb->csum = csum_block_sub(skb->csum, csumdiff, len);
> > }
> > return __pskb_trim(skb, len);
> >  }
>
> That doesn't help either.

seconded (setup g4+sungem):

[  100.272676] eth0: hw csum failure
[  100.272710] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #6
[  100.272716] Call Trace:
[  100.272733] [dffedbd0] [c069ddb8]
__skb_checksum_complete+0xf0/0x108 (unreliable)
[  100.272752] [dffedbf0] [c078ea28] __udp4_lib_rcv+0x238/0xf98
[  100.272767] [dffedc70] [c0731630] ip_local_deliver_finish+0xa8/0x3c4
[  100.272777] [dffedcb0] [c073243c] ip_local_deliver+0xf0/0x154
[  100.272786] [dffedcf0] [c07328e8] ip_rcv+0x448/0x774
[  100.272800] [dffedd50] [c06aeaec] __netif_receive_skb_core+0x5e8/0x1184
[  100.272811] [dffedde0] [c06bba2c] napi_gro_receive+0x160/0x22c
[  100.272828] [dffede10] [e1571590] gem_poll+0x7fc/0x1ac0 [sungem]
[  100.272837] [dffedee0] [c06bacfc] net_rx_action+0x34c/0x618
[  100.272849] [dffedf60] [c07fd28c] __do_softirq+0x16c/0x5f0
[  100.272863] [dffedfd0] [c0064c7c] irq_exit+0x110/0x1a8
[  100.272877] [dffedff0] [c0016170] call_do_irq+0x24/0x3c
[  100.272890] [c0cf7e80] [c0009a84] do_IRQ+0x98/0x1a0
[  100.272900] [c0cf7eb0] [c001b474] ret_from_except+0x0/0x14
[  100.272911] --- interrupt: 501 at arch_cpu_idle+0x30/0x78
   LR = arch_cpu_idle+0x30/0x78
[  100.272920] [c0cf7f70] [c0cf6000] 0xc0cf6000 (unreliable)
[  100.272935] [c0cf7f80] [c00a3868] do_idle+0xc4/0x158
[  100.272944] [c0cf7fb0] [c00a3ab4] cpu_startup_entry+0x24/0x28
[  100.272955] [c0cf7fc0] [c0998820] start_kernel+0x47c/0x490
[  100.272963] [c0cf7ff0] [3444] 0x3444


> Andreas.
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."


Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-18 Thread Eric Dumazet



On 06/18/2018 10:54 AM, Andreas Schwab wrote:
> On Jun 17 2018, Eric Dumazet  wrote:
> 
>> Oh this is silly, please try :
>>
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index 
>> c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe
>>  100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
>>  int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
>>  {
>> if (skb->ip_summed == CHECKSUM_COMPLETE) {
>> -   int delta = skb->len - len;
>> +   __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
>>  
>> -   skb->csum = csum_sub(skb->csum,
>> -skb_checksum(skb, len, delta, 0));
>> +   skb->csum = csum_block_sub(skb->csum, csumdiff, len);
>> }
>> return __pskb_trim(skb, len);
>>  }
> 
> That doesn't help either.
> 
> Andreas.
> 

Then maybe NIC provided csum is not correct.

It does not compute a checksum on all the frame, but part of it.

You could use this patch to double check.

diff --git a/drivers/net/ethernet/sun/sungem.c 
b/drivers/net/ethernet/sun/sungem.c
index 
7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..277859ea73e35271a10b02011120fca248ec8e71
 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -857,6 +857,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
 
csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
0x);
skb->csum = csum_unfold(csum);
+   {
+   __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - 
ETH_HLEN, 0);
+   if (csum != csum_fold(rsum))
+   pr_err_ratelimited("sungem wrong csum : %x/%x, len %u 
bytes\n", csum, csum_fold(rsum), len);
+   }
skb->ip_summed = CHECKSUM_COMPLETE;
skb->protocol = eth_type_trans(skb, gp->dev);
 






Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-18 Thread Andreas Schwab
On Jun 17 2018, Eric Dumazet  wrote:

> Oh this is silly, please try :
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 
> c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe
>  100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
>  int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
>  {
> if (skb->ip_summed == CHECKSUM_COMPLETE) {
> -   int delta = skb->len - len;
> +   __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
>  
> -   skb->csum = csum_sub(skb->csum,
> -skb_checksum(skb, len, delta, 0));
> +   skb->csum = csum_block_sub(skb->csum, csumdiff, len);
> }
> return __pskb_trim(skb, len);
>  }

That doesn't help either.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-17 Thread Eric Dumazet



On 06/17/2018 03:27 AM, Andreas Schwab wrote:

> 
> That doesn't change anything.

OK, thanks !

Oh this is silly, please try :

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 
c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe
 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
 int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
 {
if (skb->ip_summed == CHECKSUM_COMPLETE) {
-   int delta = skb->len - len;
+   __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
 
-   skb->csum = csum_sub(skb->csum,
-skb_checksum(skb, len, delta, 0));
+   skb->csum = csum_block_sub(skb->csum, csumdiff, len);
}
return __pskb_trim(skb, len);
 }




Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-17 Thread Andreas Schwab
On Jun 16 2018, Eric Dumazet  wrote:

> I would try something like :
>
> Basically do not bother using CHECKSUM_COMPLETE for small frames that might 
> have been padded.
>
> Since we need to bring one cache line in eth_type_trans() and further header 
> processing,
> computing the checksum in software will be almost free anyway.
>
> diff --git a/drivers/net/ethernet/sun/sungem.c 
> b/drivers/net/ethernet/sun/sungem.c
> index 
> 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..071039f211a8a33153e888bd4014314ba5e686a4
>  100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -855,9 +855,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
> skb = copy_skb;
> }
>  
> -   csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
> 0x);
> -   skb->csum = csum_unfold(csum);
> -   skb->ip_summed = CHECKSUM_COMPLETE;
> +   if (len > ETH_ZLEN) {
> +   csum = (__force __sum16)htons((status & 
> RXDCTRL_TCPCSUM) ^ 0x);
> +   skb->csum = csum_unfold(csum);
> +   skb->ip_summed = CHECKSUM_COMPLETE;
> +   }
> skb->protocol = eth_type_trans(skb, gp->dev);
>  
> napi_gro_receive(>napi, skb);

That doesn't change anything.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-17 Thread Andreas Schwab
On Jun 16 2018, Mathieu Malaterre  wrote:

> That's odd since it seems to only affect g4+sungem user. None of the
> ppc64 seems to be having it.

I'm also seeing it on a PowerMac G5.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-16 Thread Eric Dumazet



On 06/16/2018 12:14 AM, Mathieu Malaterre wrote:
> Hi Eric,
> 
> On Fri, Jun 15, 2018 at 9:14 PM Eric Dumazet  wrote:
>>
>>
>>
>> On 06/15/2018 11:56 AM, Mathieu Malaterre wrote:
>>> This reverts commit 88078d98d1bb085d72af8437707279e203524fa5.
>>>
>>> It causes regressions for people using chips driven by the sungem
>>> driver. Suspicion is that the skb->csum value isn't being adjusted
>>> properly.
>>>
>>> Symptoms as seen on G4+sungem are:
>>>
>>> [   34.023281] eth0: hw csum failure
>>> [   34.023438] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #2
>>> [   34.023618] Call Trace:
>>> [   34.023707] [dffedbd0] [c069ddac] __skb_checksum_complete+0xf0/0x108 
>>> (unreliable)
>>> [   34.023948] [dffedbf0] [c0777a70] tcp_v4_rcv+0x604/0xe00
>>> [   34.024118] [dffedc70] [c0731624] ip_local_deliver_finish+0xa8/0x3c4
>>> [   34.024315] [dffedcb0] [c0732430] ip_local_deliver+0xf0/0x154
>>> [   34.024493] [dffedcf0] [c07328dc] ip_rcv+0x448/0x774
>>> [   34.024653] [dffedd50] [c06aeae0] __netif_receive_skb_core+0x5e8/0x1184
>>> [   34.024857] [dffedde0] [c06bba20] napi_gro_receive+0x160/0x22c
>>> [   34.025044] [dffede10] [e14b2590] gem_poll+0x7fc/0x1ac0 [sungem]
>>> [   34.025228] [dffedee0] [c06bacf0] net_rx_action+0x34c/0x618
>>> [   34.025402] [dffedf60] [c07fd27c] __do_softirq+0x16c/0x5f0
>>> [   34.025575] [dffedfd0] [c0064c7c] irq_exit+0x110/0x1a8
>>> [   34.025738] [dffedff0] [c0016170] call_do_irq+0x24/0x3c
>>> [   34.025903] [c0cf7e80] [c0009a84] do_IRQ+0x98/0x1a0
>>> [   34.026055] [c0cf7eb0] [c001b474] ret_from_except+0x0/0x14
>>> [   34.026225] --- interrupt: 501 at arch_cpu_idle+0x30/0x78
>>>LR = arch_cpu_idle+0x30/0x78
>>> [   34.026510] [c0cf7f70] [c0cf6000] 0xc0cf6000 (unreliable)
>>> [   34.026682] [c0cf7f80] [c00a3868] do_idle+0xc4/0x158
>>> [   34.026835] [c0cf7fb0] [c00a3ab0] cpu_startup_entry+0x20/0x28
>>> [   34.027013] [c0cf7fc0] [c0998820] start_kernel+0x47c/0x490
>>> [   34.027181] [c0cf7ff0] [3444] 0x3444
>>>
>>> See commit 7ce5a27f2ef8 ("Revert "net: Handle CHECKSUM_COMPLETE more
>>> adequately in pskb_trim_rcsum()."") for previous reference.
>>
>> This fix seems to hide a bug in csum functions on this architecture.
> 
> That's odd since it seems to only affect g4+sungem user. None of the
> ppc64 seems to be having it. And some ppc32 users are not even seeing
> it.
> 
>> Or a bug on this NIC when receiving a small packet (less than 60 bytes).
>> Maybe the padding bytes are not included in NIC provided csum, and not 0.
> 
> Ok in that case the bug is located in
> ./drivers/net/ethernet/sun/sungem.c that seems more likely. I'll try
> to understand that code, then.


I would try something like :

Basically do not bother using CHECKSUM_COMPLETE for small frames that might 
have been padded.

Since we need to bring one cache line in eth_type_trans() and further header 
processing,
computing the checksum in software will be almost free anyway.

diff --git a/drivers/net/ethernet/sun/sungem.c 
b/drivers/net/ethernet/sun/sungem.c
index 
7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..071039f211a8a33153e888bd4014314ba5e686a4
 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -855,9 +855,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
skb = copy_skb;
}
 
-   csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
0x);
-   skb->csum = csum_unfold(csum);
-   skb->ip_summed = CHECKSUM_COMPLETE;
+   if (len > ETH_ZLEN) {
+   csum = (__force __sum16)htons((status & 
RXDCTRL_TCPCSUM) ^ 0x);
+   skb->csum = csum_unfold(csum);
+   skb->ip_summed = CHECKSUM_COMPLETE;
+   }
skb->protocol = eth_type_trans(skb, gp->dev);
 
napi_gro_receive(>napi, skb);



Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"

2018-06-16 Thread Mathieu Malaterre
Hi Eric,

On Fri, Jun 15, 2018 at 9:14 PM Eric Dumazet  wrote:
>
>
>
> On 06/15/2018 11:56 AM, Mathieu Malaterre wrote:
> > This reverts commit 88078d98d1bb085d72af8437707279e203524fa5.
> >
> > It causes regressions for people using chips driven by the sungem
> > driver. Suspicion is that the skb->csum value isn't being adjusted
> > properly.
> >
> > Symptoms as seen on G4+sungem are:
> >
> > [   34.023281] eth0: hw csum failure
> > [   34.023438] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #2
> > [   34.023618] Call Trace:
> > [   34.023707] [dffedbd0] [c069ddac] __skb_checksum_complete+0xf0/0x108 
> > (unreliable)
> > [   34.023948] [dffedbf0] [c0777a70] tcp_v4_rcv+0x604/0xe00
> > [   34.024118] [dffedc70] [c0731624] ip_local_deliver_finish+0xa8/0x3c4
> > [   34.024315] [dffedcb0] [c0732430] ip_local_deliver+0xf0/0x154
> > [   34.024493] [dffedcf0] [c07328dc] ip_rcv+0x448/0x774
> > [   34.024653] [dffedd50] [c06aeae0] __netif_receive_skb_core+0x5e8/0x1184
> > [   34.024857] [dffedde0] [c06bba20] napi_gro_receive+0x160/0x22c
> > [   34.025044] [dffede10] [e14b2590] gem_poll+0x7fc/0x1ac0 [sungem]
> > [   34.025228] [dffedee0] [c06bacf0] net_rx_action+0x34c/0x618
> > [   34.025402] [dffedf60] [c07fd27c] __do_softirq+0x16c/0x5f0
> > [   34.025575] [dffedfd0] [c0064c7c] irq_exit+0x110/0x1a8
> > [   34.025738] [dffedff0] [c0016170] call_do_irq+0x24/0x3c
> > [   34.025903] [c0cf7e80] [c0009a84] do_IRQ+0x98/0x1a0
> > [   34.026055] [c0cf7eb0] [c001b474] ret_from_except+0x0/0x14
> > [   34.026225] --- interrupt: 501 at arch_cpu_idle+0x30/0x78
> >LR = arch_cpu_idle+0x30/0x78
> > [   34.026510] [c0cf7f70] [c0cf6000] 0xc0cf6000 (unreliable)
> > [   34.026682] [c0cf7f80] [c00a3868] do_idle+0xc4/0x158
> > [   34.026835] [c0cf7fb0] [c00a3ab0] cpu_startup_entry+0x20/0x28
> > [   34.027013] [c0cf7fc0] [c0998820] start_kernel+0x47c/0x490
> > [   34.027181] [c0cf7ff0] [3444] 0x3444
> >
> > See commit 7ce5a27f2ef8 ("Revert "net: Handle CHECKSUM_COMPLETE more
> > adequately in pskb_trim_rcsum()."") for previous reference.
>
> This fix seems to hide a bug in csum functions on this architecture.

That's odd since it seems to only affect g4+sungem user. None of the
ppc64 seems to be having it. And some ppc32 users are not even seeing
it.

> Or a bug on this NIC when receiving a small packet (less than 60 bytes).
> Maybe the padding bytes are not included in NIC provided csum, and not 0.

Ok in that case the bug is located in
./drivers/net/ethernet/sun/sungem.c that seems more likely. I'll try
to understand that code, then.

Thanks

>
>
>