Re: [PATCH net] net: sungem: fix rx checksum support

2018-08-27 Thread Eric Dumazet
On Sun, Aug 26, 2018 at 6:14 AM  wrote:
>
> > BTW, removing the FCS also means GRO is going to work, finally on this NIC 
> > ;)
> >
> > GRO does not like packets with padding.
>
> As a follow-up, I am seeing hw csum failures on Sun V440 that has
> onboard Sun Cassini with sungem driver. First tested version was 4.18
> (it happened there once) and now that I tried 4.18+git, it still
> happens:
>
> [   21.563282] libphy: Fixed MDIO Bus: probed
> [   21.617116] cassini: cassini.c:v1.6 (21 May 2008)
> [   21.678962] cassini :00:02.0: enabling device (0144 -> 0146)
> [   21.761931] cassini :00:02.0 eth0: Sun Cassini+ (64bit/66MHz PCI/Cu) 
> Ethernet[6] 00:03:ba:6f:14:39
> [   21.884952] cassini 0003:00:01.0: enabling device (0144 -> 0146)
> [   21.967868] cassini 0003:00:01.0 eth1: Sun Cassini+ (64bit/66MHz PCI/Cu) 
> Ethernet[29] 00:03:ba:6f:14:3a
> [...]
> [   54.341212] eth0: hw csum failure
> [   54.384725] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 
> 4.18.0-12952-g2923b27 #1397
> [   54.483167] Call Trace:
> [   54.515209]  [0077838c] __skb_checksum_complete+0xcc/0xe0
> [   54.595272]  [0080fc84] igmp_rcv+0x224/0x920
> [   54.660475]  [007ca3d0] ip_local_deliver+0xb0/0x240
> [   54.733675]  [007ca5c0] ip_rcv+0x60/0xa0
> [   54.794304]  [00781a30] __netif_receive_skb_one_core+0x30/0x60
> [   54.880094]  [00782914] process_backlog+0x94/0x140
> [   54.952161]  [00788f6c] net_rx_action+0x1ec/0x320
> [   55.023083]  [00870de8] __do_softirq+0xc8/0x200
> [   55.091719]  [0042c4cc] do_softirq_own_stack+0x2c/0x40
> [   55.168362]  [004662d8] irq_exit+0xb8/0xe0
> [   55.231266]  [00870ac0] handler_irq+0xc0/0x100
> [   55.298756]  [004208b4] tl0_irq5+0x14/0x20
> [   55.361670]  [0042cafc] arch_cpu_idle+0x9c/0xa0
> [   55.447055]  [0048a254] cpu_startup_entry+0x14/0x40
> [   55.536998]  [0095f4b4] 0x95f4b4
> [   55.588471]  [4000] 0x4000
> [  179.780371] eth0: hw csum failure
> [  179.823878] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 
> 4.18.0-12952-g2923b27 #1397
> [  179.922230] Call Trace:
> [  179.954267]  [0077838c] __skb_checksum_complete+0xcc/0xe0
> [  180.034335]  [0080fc84] igmp_rcv+0x224/0x920
> [  180.099536]  [007ca3d0] ip_local_deliver+0xb0/0x240
> [  180.172740]  [007ca5c0] ip_rcv+0x60/0xa0
> [  180.233368]  [00781a30] __netif_receive_skb_one_core+0x30/0x60
> [  180.319159]  [00782914] process_backlog+0x94/0x140
> [  180.391225]  [00788f6c] net_rx_action+0x1ec/0x320
> [  180.462148]  [00870de8] __do_softirq+0xc8/0x200
> [  180.530782]  [0042c4cc] do_softirq_own_stack+0x2c/0x40
> [  180.607422]  [004662d8] irq_exit+0xb8/0xe0
> [  180.670331]  [00870ac0] handler_irq+0xc0/0x100
> [  180.737822]  [004208b4] tl0_irq5+0x14/0x20
> [  180.800735]  [0042caf8] arch_cpu_idle+0x98/0xa0
> [  180.869373]  [00489f60] do_idle+0xe0/0x1c0
> [  180.932281]  [0048a25c] cpu_startup_entry+0x1c/0x40
> [  181.005491]  [0098e9b4] start_kernel+0x3b8/0x3c8


Note that these traces are for non TCP packets.

I suspect the driver should not provide CHECKSUM_COMPLETE for non TCP
packets, maybe the NIC is mishandling this case.


Re: [PATCH net] net: sungem: fix rx checksum support

2018-08-26 Thread mroos
> BTW, removing the FCS also means GRO is going to work, finally on this NIC ;)
> 
> GRO does not like packets with padding.

As a follow-up, I am seeing hw csum failures on Sun V440 that has 
onboard Sun Cassini with sungem driver. First tested version was 4.18 
(it happened there once) and now that I tried 4.18+git, it still 
happens:

[   21.563282] libphy: Fixed MDIO Bus: probed
[   21.617116] cassini: cassini.c:v1.6 (21 May 2008)
[   21.678962] cassini :00:02.0: enabling device (0144 -> 0146)
[   21.761931] cassini :00:02.0 eth0: Sun Cassini+ (64bit/66MHz PCI/Cu) 
Ethernet[6] 00:03:ba:6f:14:39
[   21.884952] cassini 0003:00:01.0: enabling device (0144 -> 0146)
[   21.967868] cassini 0003:00:01.0 eth1: Sun Cassini+ (64bit/66MHz PCI/Cu) 
Ethernet[29] 00:03:ba:6f:14:3a
[...]
[   54.341212] eth0: hw csum failure
[   54.384725] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.18.0-12952-g2923b27 
#1397
[   54.483167] Call Trace:
[   54.515209]  [0077838c] __skb_checksum_complete+0xcc/0xe0
[   54.595272]  [0080fc84] igmp_rcv+0x224/0x920
[   54.660475]  [007ca3d0] ip_local_deliver+0xb0/0x240
[   54.733675]  [007ca5c0] ip_rcv+0x60/0xa0
[   54.794304]  [00781a30] __netif_receive_skb_one_core+0x30/0x60
[   54.880094]  [00782914] process_backlog+0x94/0x140
[   54.952161]  [00788f6c] net_rx_action+0x1ec/0x320
[   55.023083]  [00870de8] __do_softirq+0xc8/0x200
[   55.091719]  [0042c4cc] do_softirq_own_stack+0x2c/0x40
[   55.168362]  [004662d8] irq_exit+0xb8/0xe0
[   55.231266]  [00870ac0] handler_irq+0xc0/0x100
[   55.298756]  [004208b4] tl0_irq5+0x14/0x20
[   55.361670]  [0042cafc] arch_cpu_idle+0x9c/0xa0
[   55.447055]  [0048a254] cpu_startup_entry+0x14/0x40
[   55.536998]  [0095f4b4] 0x95f4b4
[   55.588471]  [4000] 0x4000
[  179.780371] eth0: hw csum failure
[  179.823878] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.18.0-12952-g2923b27 
#1397
[  179.922230] Call Trace:
[  179.954267]  [0077838c] __skb_checksum_complete+0xcc/0xe0
[  180.034335]  [0080fc84] igmp_rcv+0x224/0x920
[  180.099536]  [007ca3d0] ip_local_deliver+0xb0/0x240
[  180.172740]  [007ca5c0] ip_rcv+0x60/0xa0
[  180.233368]  [00781a30] __netif_receive_skb_one_core+0x30/0x60
[  180.319159]  [00782914] process_backlog+0x94/0x140
[  180.391225]  [00788f6c] net_rx_action+0x1ec/0x320
[  180.462148]  [00870de8] __do_softirq+0xc8/0x200
[  180.530782]  [0042c4cc] do_softirq_own_stack+0x2c/0x40
[  180.607422]  [004662d8] irq_exit+0xb8/0xe0
[  180.670331]  [00870ac0] handler_irq+0xc0/0x100
[  180.737822]  [004208b4] tl0_irq5+0x14/0x20
[  180.800735]  [0042caf8] arch_cpu_idle+0x98/0xa0
[  180.869373]  [00489f60] do_idle+0xe0/0x1c0
[  180.932281]  [0048a25c] cpu_startup_entry+0x1c/0x40
[  181.005491]  [0098e9b4] start_kernel+0x3b8/0x3c8


-- 
Meelis Roos (mr...@linux.ee)


Re: [PATCH net] net: sungem: fix rx checksum support

2018-06-21 Thread Eric Dumazet



On 06/19/2018 10:30 PM, David Miller wrote:
> Tested-by: Andreas Schwab 
> 
> Applied and queued up for -stable, thanks Eric.
> 

BTW, removing the FCS also means GRO is going to work, finally on this NIC ;)

GRO does not like packets with padding.



Re: [PATCH net] net: sungem: fix rx checksum support

2018-06-20 Thread Eric Dumazet



On 06/20/2018 04:22 AM, Mathieu Malaterre wrote:
 
> Thanks for the stable tag.
> 
> IMHO the commit message should have also reference commit 7ce5a27f2ef8
> ("Revert "net: Handle CHECKSUM_COMPLETE more adequately in
> pskb_trim_rcsum().""). Which means that commit 88078d98d1bb ("net:
> pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends") should be ready
> for stable also, since it seems that the only driver responsible for
> the revert was sungem.
> 
> my 2cts
> 

I disagree.

commit 88078d98d1bb is a performance improvement, and not needed for stable.

We might also discover another buggy driver later.

Thanks.


Re: [PATCH net] net: sungem: fix rx checksum support

2018-06-20 Thread Mathieu Malaterre
On Wed, Jun 20, 2018 at 7:31 AM David Miller  wrote:
>
> From: Eric Dumazet 
> Date: Tue, 19 Jun 2018 19:18:50 -0700
>
> > After commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
> > are friends"), sungem owners reported the infamous "eth0: hw csum failure"
> > message.
> >
> > CHECKSUM_COMPLETE has in fact never worked for this driver, but this
> > was masked by the fact that upper stacks had to strip the FCS, and
> > therefore skb->ip_summed was set back to CHECKSUM_NONE before
> > my recent change.
> >
> > Driver configures a number of bytes to skip when the chip computes
> > the checksum, and for some reason only half of the Ethernet header
> > was skipped.
> >
> > Then a second problem is that we should strip the FCS by default,
> > unless the driver is updated to eventually support NETIF_F_RXFCS in
> > the future.
> >
> > Finally, a driver should check if NETIF_F_RXCSUM feature is enabled
> > or not, so that the admin can turn off rx checksum if wanted.
> >
> > Many thanks to Andreas Schwab and Mathieu Malaterre for their
> > help in debugging this issue.
> >
> > Signed-off-by: Eric Dumazet 
> > Reported-by: Meelis Roos 
> > Reported-by: Mathieu Malaterre 
> > Reported-by: Andreas Schwab 
> > Tested-by: Andreas Schwab 
>
> Applied and queued up for -stable, thanks Eric.

Thanks for the stable tag.

IMHO the commit message should have also reference commit 7ce5a27f2ef8
("Revert "net: Handle CHECKSUM_COMPLETE more adequately in
pskb_trim_rcsum().""). Which means that commit 88078d98d1bb ("net:
pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends") should be ready
for stable also, since it seems that the only driver responsible for
the revert was sungem.

my 2cts


Re: [PATCH net] net: sungem: fix rx checksum support

2018-06-19 Thread David Miller
From: Eric Dumazet 
Date: Tue, 19 Jun 2018 19:18:50 -0700

> After commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
> are friends"), sungem owners reported the infamous "eth0: hw csum failure"
> message.
> 
> CHECKSUM_COMPLETE has in fact never worked for this driver, but this
> was masked by the fact that upper stacks had to strip the FCS, and
> therefore skb->ip_summed was set back to CHECKSUM_NONE before
> my recent change.
> 
> Driver configures a number of bytes to skip when the chip computes
> the checksum, and for some reason only half of the Ethernet header
> was skipped.
> 
> Then a second problem is that we should strip the FCS by default,
> unless the driver is updated to eventually support NETIF_F_RXFCS in
> the future.
> 
> Finally, a driver should check if NETIF_F_RXCSUM feature is enabled
> or not, so that the admin can turn off rx checksum if wanted.
> 
> Many thanks to Andreas Schwab and Mathieu Malaterre for their
> help in debugging this issue.
> 
> Signed-off-by: Eric Dumazet 
> Reported-by: Meelis Roos 
> Reported-by: Mathieu Malaterre 
> Reported-by: Andreas Schwab 
> Tested-by: Andreas Schwab 

Applied and queued up for -stable, thanks Eric.


[PATCH net] net: sungem: fix rx checksum support

2018-06-19 Thread Eric Dumazet
After commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
are friends"), sungem owners reported the infamous "eth0: hw csum failure"
message.

CHECKSUM_COMPLETE has in fact never worked for this driver, but this
was masked by the fact that upper stacks had to strip the FCS, and
therefore skb->ip_summed was set back to CHECKSUM_NONE before
my recent change.

Driver configures a number of bytes to skip when the chip computes
the checksum, and for some reason only half of the Ethernet header
was skipped.

Then a second problem is that we should strip the FCS by default,
unless the driver is updated to eventually support NETIF_F_RXFCS in
the future.

Finally, a driver should check if NETIF_F_RXCSUM feature is enabled
or not, so that the admin can turn off rx checksum if wanted.

Many thanks to Andreas Schwab and Mathieu Malaterre for their
help in debugging this issue.

Signed-off-by: Eric Dumazet 
Reported-by: Meelis Roos 
Reported-by: Mathieu Malaterre 
Reported-by: Andreas Schwab 
Tested-by: Andreas Schwab 
---
 drivers/net/ethernet/sun/sungem.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/sun/sungem.c 
b/drivers/net/ethernet/sun/sungem.c
index 
7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..b9221fc1674dfa0ef17a43f8ff86d700a1ae514f
 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -60,8 +60,7 @@
 #include 
 #include "sungem.h"
 
-/* Stripping FCS is causing problems, disabled for now */
-#undef STRIP_FCS
+#define STRIP_FCS
 
 #define DEFAULT_MSG(NETIF_MSG_DRV  | \
 NETIF_MSG_PROBE| \
@@ -435,7 +434,7 @@ static int gem_rxmac_reset(struct gem *gp)
writel(desc_dma & 0x, gp->regs + RXDMA_DBLOW);
writel(RX_RING_SIZE - 4, gp->regs + RXDMA_KICK);
val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
-  ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
+  (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
writel(val, gp->regs + RXDMA_CFG);
if (readl(gp->regs + GREG_BIFCFG) & GREG_BIFCFG_M66EN)
writel(((5 & RXDMA_BLANK_IPKTS) |
@@ -760,7 +759,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
struct net_device *dev = gp->dev;
int entry, drops, work_done = 0;
u32 done;
-   __sum16 csum;
 
if (netif_msg_rx_status(gp))
printk(KERN_DEBUG "%s: rx interrupt, done: %d, rx_new: %d\n",
@@ -855,9 +853,13 @@ static int gem_rx(struct gem *gp, int work_to_do)
skb = copy_skb;
}
 
-   csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 
0x);
-   skb->csum = csum_unfold(csum);
-   skb->ip_summed = CHECKSUM_COMPLETE;
+   if (likely(dev->features & NETIF_F_RXCSUM)) {
+   __sum16 csum;
+
+   csum = (__force __sum16)htons((status & 
RXDCTRL_TCPCSUM) ^ 0x);
+   skb->csum = csum_unfold(csum);
+   skb->ip_summed = CHECKSUM_COMPLETE;
+   }
skb->protocol = eth_type_trans(skb, gp->dev);
 
napi_gro_receive(>napi, skb);
@@ -1761,7 +1763,7 @@ static void gem_init_dma(struct gem *gp)
writel(0, gp->regs + TXDMA_KICK);
 
val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
-  ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
+  (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
writel(val, gp->regs + RXDMA_CFG);
 
writel(desc_dma >> 32, gp->regs + RXDMA_DBHI);
@@ -2985,8 +2987,8 @@ static int gem_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
pci_set_drvdata(pdev, dev);
 
/* We can do scatter/gather and HW checksum */
-   dev->hw_features = NETIF_F_SG | NETIF_F_HW_CSUM;
-   dev->features |= dev->hw_features | NETIF_F_RXCSUM;
+   dev->hw_features = NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_RXCSUM;
+   dev->features = dev->hw_features;
if (pci_using_dac)
dev->features |= NETIF_F_HIGHDMA;
 
-- 
2.18.0.rc1.244.gcf134e6275-goog