Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-11-03 Thread Andre Tomt

On 31.10.2018 05:08, Andre Tomt wrote:

On 30.10.2018 12:04, Andre Tomt wrote:

On 30.10.2018 11:58, Andre Tomt wrote:

On 27.10.2018 23:41, Andre Tomt wrote:

On 26.10.2018 13:45, Andre Tomt wrote:

On 25.10.2018 19:38, Eric Dumazet wrote:



On 10/24/2018 12:41 PM, Andre Tomt wrote:


It eventually showed up again with mlx4, on 4.18.16 + fix and 
also on 4.19. I still do not have a useful packet capture.


It is running a torrent client serving up various linux 
distributions.




Have you also applied this fix ?

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 





No. I've applied it now to 4.19 and will report back if anything 
shows up.


Just hit it on the simpler server; no VRF, no tunnels, no 
nat/conntrack. Only a basic stateless nftables ruleset and a vlan 
netdev (unlikely to be the one triggering this I guess; it has only 
v4 traffic).


I'm currently testing 4.19 with the recommended commit added, plus 
these to sort out some GRO issues (on a hunch, unsure if related):
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=a8305bff685252e80b7c60f4f5e7dd2e63e38218 

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=992cba7e276d438ac8b0a8c17b147b37c8c286f7 

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=ece23711dd956cd5053c9cb03e9fe0668f9c8894 



and I *think* it is behaving better now? it's not conclusive as it 
could take a while to trip in this environment but some of the test 
servers have not shown anything bad in almost 24h.


Sorry, s/some of the/none of the


I think it is fairly safe to say 4.19 + mlx4 + these 4 commits is OK. At 
least for my workload. Servers are now 51-61 hours in, no splats. I also 
added ntp pool traffic to one of them to make things a little more 
exciting.


Not sure what is needed for 4.18, I don't have the mental bandwidth to 
test that right now. Also no idea about the similar looking mlx5 splats 
reported elsewhere.


As expected, conntrack/nat + vlan + forwarding still splats.
sch_cake, IFB and VRF were removed from this setup.

Here is a conntrack splat without IFB/VRF/Cake interference:

[34458.506346] wanib: hw csum failure
[34458.506371] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.19.0-1 #1
[34458.506374] Hardware name: Supermicro Super Server/X10SDV-4C-TLN2F, BIOS 2.0 
06/13/2018
[34458.506377] Call Trace:
[34458.506381]  <IRQ>
[34458.506388]  dump_stack+0x5c/0x80
[34458.506392]  __skb_checksum_complete+0xac/0xc0
[34458.506402]  icmp_error+0x1c8/0x1f0 [nf_conntrack]
[34458.506406]  ? skb_copy_bits+0x13d/0x220
[34458.506411]  nf_conntrack_in+0xd8/0x390 [nf_conntrack]
[34458.506416]  ? ___pskb_trim+0x192/0x330
[34458.506421]  nf_hook_slow+0x43/0xc0
[34458.506426]  ip_rcv+0x90/0xb0
[34458.506430]  ? ip_rcv_finish_core.isra.0+0x310/0x310
[34458.506435]  __netif_receive_skb_one_core+0x42/0x50
[34458.506438]  netif_receive_skb_internal+0x24/0xb0
[34458.506441]  napi_gro_frags+0x177/0x210
[34458.506446]  mlx4_en_process_rx_cq+0x8df/0xb50 [mlx4_en]
[34458.506459]  ? mlx4_eq_int+0x38f/0xcb0 [mlx4_core]
[34458.506463]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
[34458.506466]  net_rx_action+0xe1/0x2c0
[34458.506469]  __do_softirq+0xe7/0x2d3
[34458.506475]  irq_exit+0x96/0xd0
[34458.506478]  do_IRQ+0x85/0xd0
[34458.506483]  common_interrupt+0xf/0xf
[34458.506486]  </IRQ>
[34458.506491] RIP: 0010:cpuidle_enter_state+0xb9/0x320
[34458.506495] Code: e8 3c 16 bc ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 
02 0f 85 3b 02 00 00 31 ff e8 5e fb c0 ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff 
ff f3 01 00 00 48 2b 1c 24 ba ff ff ff 7f 48 39 c3
[34458.506497] RSP: 0018:978d41943ea8 EFLAGS: 0246 ORIG_RAX: 
ffdb
[34458.506500] RAX: 8d8f6fa60fc0 RBX: 1f56ff07af28 RCX: 001f
[34458.506501] RDX: 1f56ff07af28 RSI: 3a2e90d6 RDI: 
[34458.506503] RBP: 8d8f6fa698c0 R08: 0002 R09: 00020840
[34458.506504] R10: 0004ea58f2899595 R11: 8d8f6fa601e8 R12: 0001
[34458.506505] R13: 8a0ac638 R14: 0001 R15: 
[34458.506509]  ? cpuidle_enter_state+0x94/0x320
[34458.506512]  do_idle+0x1e4/0x220
[34458.506515]  cpu_startup_entry+0x5f/0x70
[34458.506519]  start_secondary+0x185/0x1a0
[34458.506521]  secondary_startup_64+0xa4/0xb0


Stateless filtered non-forwarding host still looks like it has been 
fixed (the udp6_gro_* splats are still all gone). Also seems fine when 
moving the traffic over a vlan device. These fixes went into 4.19.1-rc1 
(checksum_complete + unlink gro packets on overflow fixes)


Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-30 Thread Andre Tomt

On 30.10.2018 12:04, Andre Tomt wrote:

On 30.10.2018 11:58, Andre Tomt wrote:

On 27.10.2018 23:41, Andre Tomt wrote:

On 26.10.2018 13:45, Andre Tomt wrote:

On 25.10.2018 19:38, Eric Dumazet wrote:



On 10/24/2018 12:41 PM, Andre Tomt wrote:


It eventually showed up again with mlx4, on 4.18.16 + fix and also 
on 4.19. I still do not have a useful packet capture.


It is running a torrent client serving up various linux 
distributions.




Have you also applied this fix ?

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 





No. I've applied it now to 4.19 and will report back if anything 
shows up.


Just hit it on the simpler server; no VRF, no tunnels, no 
nat/conntrack. Only a basic stateless nftables ruleset and a vlan 
netdev (unlikely to be the one triggering this I guess; it has only 
v4 traffic).


I'm currently testing 4.19 with the recommended commit added, plus 
these to sort out some GRO issues (on a hunch, unsure if related):
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=a8305bff685252e80b7c60f4f5e7dd2e63e38218 

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=992cba7e276d438ac8b0a8c17b147b37c8c286f7 

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=ece23711dd956cd5053c9cb03e9fe0668f9c8894 



and I *think* it is behaving better now? it's not conclusive as it 
could take a while to trip in this environment but some of the test 
servers have not shown anything bad in almost 24h.


Sorry, s/some of the/none of the


I think it is fairly safe to say 4.19 + mlx4 + these 4 commits is OK. At 
least for my workload. Servers are now 51-61 hours in, no splats. I also 
added ntp pool traffic to one of them to make things a little more exciting.


Not sure what is needed for 4.18, I don't have the mental bandwidth to 
test that right now. Also no idea about the similar looking mlx5 splats 
reported elsewhere.


Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-30 Thread Fabio Rossi
> On 10/16/2018 06:00 AM, Eric Dumazet wrote:
> > On Mon, Oct 15, 2018 at 11:30 PM Andre Tomt  wrote:
> >>
> >> On 15.10.2018 17:41, Eric Dumazet wrote:
> >>> On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger
>  Something is changed between 4.17.12 and 4.18, after bisecting the 
>  problem I
>  got the following first bad commit:
> 
>  commit 88078d98d1bb085d72af8437707279e203524fa5
>  Author: Eric Dumazet 
>  Date:   Wed Apr 18 11:43:15 2018 -0700
> 
>   net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
> 
>   After working on IP defragmentation lately, I found that some large
>   packets defeat CHECKSUM_COMPLETE optimization because of NIC adding
>   zero paddings on the last (small) fragment.
> 
>   While removing the padding with pskb_trim_rcsum(), we set 
>  skb->ip_summed
>   to CHECKSUM_NONE, forcing a full csum validation, even if all prior
>   fragments had CHECKSUM_COMPLETE set.
> 
>   We can instead compute the checksum of the part we are trimming,
>   usually smaller than the part we keep.
> 
>   Signed-off-by: Eric Dumazet 
>   Signed-off-by: David S. Miller 
> 
> >>>
> >>> Thanks for bisecting !
> >>>
> >>> This commit is known to expose some NIC/driver bugs.
> >>>
> >>> Look at commit 12b03558cef6d655d0d394f5e98a6fd07c1f6c0f
> >>> ("net: sungem: fix rx checksum support")  for one driver needing a fix.
> >>>
> >>> I assume SKY2_HW_NEW_LE is not set on your NIC ?
> >>>
> >>
> >> I've seen similar on several systems with mlx4 cards when using 4.18.x -
> >> that is hw csum failure followed by some backtrace.
> >>
> >> Only seems to happen on systems dealing with quite a bit of UDP.
> >>
> > 
> > Strange, because mlx4 on IPv6+UDP should not use CHECKSUM_COMPLETE,
> > but CHECKSUM_UNNECESSARY
> > 
> > It would be nice to track this a bit further, maybe by providing the
> > full packet content.
> > 
> >> Example from 4.18.10:
> >>> [635607.740574] p0xe0: hw csum failure
> >>> [635607.740598] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
> >>> [635607.740599] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 
> >>> 2.0b 05/02/2017
> >>> [635607.740599] Call Trace:
> >>> [635607.740602]  <IRQ>
> >>> [635607.740611]  dump_stack+0x5c/0x7b
> >>> [635607.740617]  __skb_gro_checksum_complete+0x9a/0xa0
> >>> [635607.740621]  udp6_gro_receive+0x211/0x290
> >>> [635607.740624]  ipv6_gro_receive+0x1a8/0x390
> >>> [635607.740627]  dev_gro_receive+0x33e/0x550
> >>> [635607.740628]  napi_gro_frags+0xa2/0x210
> >>> [635607.740635]  mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
> >>> [635607.740648]  ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
> >>> [635607.740654]  ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
> >>> [635607.740657]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
> >>> [635607.740658]  net_rx_action+0xe0/0x2e0
> >>> [635607.740662]  __do_softirq+0xd8/0x2e5
> >>> [635607.740666]  irq_exit+0xb4/0xc0
> >>> [635607.740667]  do_IRQ+0x85/0xd0
> >>> [635607.740670]  common_interrupt+0xf/0xf
> >>> [635607.740671]  </IRQ>
> >>> [635607.740675] RIP: 0010:cpuidle_enter_state+0xb4/0x2a0
> >>> [635607.740675] Code: 31 ff e8 df a6 ba ff 45 84 f6 74 17 9c 58 0f 1f 44 
> >>> 00 00 f6 c4 02 0f 85 d8 01 00 00 31 ff e8 13 81 bf ff fb 66 0f 1f 44 00 
> >>> 00 <4c> 29 fb 48 ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
> >>> [635607.740701] RSP: 0018:a5c206353ea8 EFLAGS: 0246 ORIG_RAX: 
> >>> ffd9
> >>> [635607.740703] RAX: 8d72ffd20f00 RBX: 00024214f597c5b0 RCX: 
> >>> 001f
> >>> [635607.740703] RDX: 00024214f597c5b0 RSI: 00020780 RDI: 
> >>> 
> >>> [635607.740704] RBP: 0004 R08: 002542bfbefa99fa R09: 
> >>> 
> >>> [635607.740705] R10: a5c206353e88 R11: 00c5 R12: 
> >>> af0aaf78
> >>> [635607.740706] R13: 8d72ffd297d8 R14:  R15: 
> >>> 00024214f58c2ed5
> >>> [635607.740709]  ? cpuidle_enter_state+0x91/0x2a0
> >>> [635607.740712]  do_idle+0x1d0/0x240
> >>> [635607.740715]  cpu_startup_entry+0x5f/0x70
> >>> [635607.740719]  start_secondary+0x185/0x1a0
> >>> [635607.740722]  secondary_startup_64+0xa5/0xb0
> >>> [635607.740731] p0xe0: hw csum failure
> >>> [635607.740745] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
> >>> [635607.740746] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 
> >>> 2.0b 05/02/2017
> >>> [635607.740746] Call Trace:
> >>> [635607.740747]  <IRQ>
> >>> [635607.740750]  dump_stack+0x5c/0x7b
> >>> [635607.740755]  __skb_checksum_complete+0xb8/0xd0
> >>> [635607.740760]  __udp6_lib_rcv+0xa6b/0xa70
> >>> [635607.740767]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
> >>> [635607.740770]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
> >>> [635607.740774]  ip6_input_finish+0xc0/0x460
> >>> [635607.740776]  ip6_input+0x2b/0x90
> >>> [635607.740778]  ? ip6_rcv_finish+0x110/0x110
> >>> [635607.740780]  ipv6_rcv+0x2cd/0x4b0
> >>> 

Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-30 Thread Andre Tomt

On 30.10.2018 11:58, Andre Tomt wrote:

On 27.10.2018 23:41, Andre Tomt wrote:

On 26.10.2018 13:45, Andre Tomt wrote:

On 25.10.2018 19:38, Eric Dumazet wrote:



On 10/24/2018 12:41 PM, Andre Tomt wrote:


It eventually showed up again with mlx4, on 4.18.16 + fix and also 
on 4.19. I still do not have a useful packet capture.


It is running a torrent client serving up various linux distributions.



Have you also applied this fix ?

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 





No. I've applied it now to 4.19 and will report back if anything 
shows up.


Just hit it on the simpler server; no VRF, no tunnels, no 
nat/conntrack. Only a basic stateless nftables ruleset and a vlan 
netdev (unlikely to be the one triggering this I guess; it has only v4 
traffic).


I'm currently testing 4.19 with the recommended commit added, plus these 
to sort out some GRO issues (on a hunch, unsure if related):
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=a8305bff685252e80b7c60f4f5e7dd2e63e38218 

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=992cba7e276d438ac8b0a8c17b147b37c8c286f7 

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=ece23711dd956cd5053c9cb03e9fe0668f9c8894 



and I *think* it is behaving better now? it's not conclusive as it could 
take a while to trip in this environment but some of the test servers 
have not shown anything bad in almost 24h.


Sorry, s/some of the/none of the


Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-30 Thread Andre Tomt

On 27.10.2018 23:41, Andre Tomt wrote:

On 26.10.2018 13:45, Andre Tomt wrote:

On 25.10.2018 19:38, Eric Dumazet wrote:



On 10/24/2018 12:41 PM, Andre Tomt wrote:


It eventually showed up again with mlx4, on 4.18.16 + fix and also 
on 4.19. I still do not have a useful packet capture.


It is running a torrent client serving up various linux distributions.



Have you also applied this fix ?

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 





No. I've applied it now to 4.19 and will report back if anything shows 
up.


Just hit it on the simpler server; no VRF, no tunnels, no nat/conntrack. 
Only a basic stateless nftables ruleset and a vlan netdev (unlikely to 
be the one triggering this I guess; it has only v4 traffic).


I'm currently testing 4.19 with the recommended commit added, plus these 
to sort out some GRO issues (on a hunch, unsure if related):

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=a8305bff685252e80b7c60f4f5e7dd2e63e38218
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=992cba7e276d438ac8b0a8c17b147b37c8c286f7
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=ece23711dd956cd5053c9cb03e9fe0668f9c8894

and I *think* it is behaving better now? it's not conclusive as it could 
take a while to trip in this environment but some of the test servers 
have not shown anything bad in almost 24h.


Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-27 Thread Andre Tomt

On 26.10.2018 13:45, Andre Tomt wrote:

On 25.10.2018 19:38, Eric Dumazet wrote:



On 10/24/2018 12:41 PM, Andre Tomt wrote:


It eventually showed up again with mlx4, on 4.18.16 + fix and also on 
4.19. I still do not have a useful packet capture.


It is running a torrent client serving up various linux distributions.



Have you also applied this fix ?

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 





No. I've applied it now to 4.19 and will report back if anything shows up.


Just hit it on the simpler server; no VRF, no tunnels, no nat/conntrack. 
Only a basic stateless nftables ruleset and a vlan netdev (unlikely to 
be the one triggering this I guess; it has only v4 traffic).


On 4.19 + above commit:

[158269.360271] p0xe0: hw csum failure
[158269.360286] CPU: 3 PID: 0 Comm: swapper/3 Tainted: P   O  
4.19.0-1 #1
[158269.360287] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 
05/02/2017
[158269.360288] Call Trace:
[158269.360290]  <IRQ>
[158269.360295]  dump_stack+0x5c/0x7b
[158269.360299]  __skb_gro_checksum_complete+0x9a/0xa0
[158269.360301]  udp6_gro_receive+0x211/0x290
[158269.360303]  ipv6_gro_receive+0x1b1/0x3a0
[158269.360306]  ? ip_sublist_rcv_finish+0x70/0x70
[158269.360307]  dev_gro_receive+0x3a0/0x620
[158269.360309]  ? __build_skb+0x25/0xe0
[158269.360310]  napi_gro_frags+0xa8/0x220
[158269.360314]  mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
[158269.360322]  ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
[158269.360325]  ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
[158269.360327]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
[158269.360329]  net_rx_action+0xe0/0x2e0
[158269.360330]  __do_softirq+0xd8/0x2ff
[158269.360333]  irq_exit+0xbd/0xd0
[158269.360334]  do_IRQ+0x85/0xd0
[158269.360336]  common_interrupt+0xf/0xf
[158269.360337]  </IRQ>
[158269.360339] RIP: 0010:cpuidle_enter_state+0xb3/0x310
[158269.360340] Code: 31 ff e8 e0 e0 bb ff 45 84 f6 74 17 9c 58 0f 1f 44 00 00 f6 c4 
02 0f 85 3f 02 00 00 31 ff e8 64 cc c0 ff fb 66 0f 1f 44 00 00 <4c> 29 fb 48 ba 
cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
[158269.360341] RSP: 0018:af28c634bea8 EFLAGS: 0246 ORIG_RAX: 
ffd9
[158269.360342] RAX: 9a9f7fae0fc0 RBX: 8ff1f4ff622a RCX: 
001f
[158269.360343] RDX: 8ff1f4ff622a RSI: 22983893 RDI: 

[158269.360343] RBP: 0001 R08: 0002 R09: 
00020840
[158269.360344] R10: af28c634be88 R11: 0036 R12: 
9a9f7fae9aa8
[158269.360344] R13: aa0ac638 R14:  R15: 
8ff1f4f09d43
[158269.360347]  ? cpuidle_enter_state+0x90/0x310
[158269.360349]  do_idle+0x1d0/0x240
[158269.360351]  cpu_startup_entry+0x5f/0x70
[158269.360352]  start_secondary+0x185/0x1a0
[158269.360354]  secondary_startup_64+0xa4/0xb0


Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-26 Thread Andre Tomt

On 26.10.2018 14:59, Eric Dumazet wrote:

On Fri, Oct 26, 2018 at 5:38 AM Andre Tomt  wrote:

And it tripped again with that commit; however on another box with a
much more complicated setup (VRFs, sch_cake, ifb, conntrack/nat, 6in4
tunnel, VF device on mlx4)


[ 8197.348260] wanib: hw csum failure
[ 8197.348288] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.0-1 #1
[ 8197.348289] Hardware name: Supermicro SYS-5018D-FN8T/X10SDV-TP8F, BIOS 1.3 
03/19/2018
[ 8197.348290] Call Trace:
[ 8197.348296]  <IRQ>
[ 8197.348304]  dump_stack+0x5c/0x80
[ 8197.348308]  __skb_checksum_complete+0xac/0xc0
[ 8197.348318]  icmp_error+0x1c8/0x1f0 [nf_conntrack]
[ 8197.348325]  ? ip_output+0x61/0xc0
[ 8197.348328]  ? skb_copy_bits+0x13d/0x220
[ 8197.348334]  nf_conntrack_in+0xd8/0x390 [nf_conntrack]
[ 8197.348339]  ? ___pskb_trim+0x192/0x330
[ 8197.348343]  nf_hook_slow+0x43/0xc0
[ 8197.348346]  ip_rcv+0x90/0xb0
[ 8197.348349]  ? ip_rcv_finish_core.isra.0+0x310/0x310
[ 8197.348354]  __netif_receive_skb_one_core+0x42/0x50
[ 8197.348357]  netif_receive_skb_internal+0x24/0xb0
[ 8197.348361]  ifb_ri_tasklet+0x167/0x260 [ifb]
[ 8197.348365]  tasklet_action_common.isra.3+0x49/0xb0
[ 8197.348369]  __do_softirq+0xe7/0x2d3
[ 8197.348372]  irq_exit+0x96/0xd0
[ 8197.348375]  do_IRQ+0x85/0xd0
[ 8197.348378]  common_interrupt+0xf/0xf
[ 8197.348379]  </IRQ>
[ 8197.348382] RIP: 0010:cpuidle_enter_state+0xb9/0x320
[ 8197.348384] Code: e8 1c 16 bc ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 
02 0f 85 3b 02 00 00 31 ff e8 3e fb c0 ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff 
ff f3 01 00 00 48 2b 1c 24 ba ff ff ff 7f 48 39 c3
[ 8197.348386] RSP: 0018:9f0441953ea8 EFLAGS: 0246 ORIG_RAX: 
ffd5
[ 8197.348388] RAX: 9759efae0fc0 RBX: 07749807d911 RCX: 001f
[ 8197.348390] RDX: 07749807d911 RSI: 3a2e8670 RDI: 
[ 8197.348393] RBP: 9759efae98a8 R08: 0002 R09: 00020840
[ 8197.348396] R10: 00626b4810384abc R11: 9759efae01e8 R12: 0001
[ 8197.348398] R13: 8d0ac638 R14: 0001 R15: 
[ 8197.348402]  ? cpuidle_enter_state+0x94/0x320
[ 8197.348407]  do_idle+0x1e4/0x220
[ 8197.348411]  cpu_startup_entry+0x5f/0x70
[ 8197.348415]  start_secondary+0x185/0x1a0
[ 8197.348417]  secondary_startup_64+0xa4/0xb0



Very different trace, yet another bug to track.

If you can, try to remove some components from this setup.



Will do. Just remembered I took out the VF stuff a few days ago and that 
netdev is just a normal vlan device now. Going to eliminate VRF and 
cake/ifb as well.
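
One way to keep the different splats in this thread apart is to group them by the call path that led to the checksum check. Below is a rough Python sketch, assuming the traces look like the ones posted here. The `frames()` and `classify()` helpers and the category names are invented for illustration and are not part of any kernel tooling.

```python
import re

# A frame line looks like "[ 8197.348308]  __skb_checksum_complete+0xac/0xc0".
# A leading "? " marks a frame the unwinder considers unreliable.
FRAME = re.compile(r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(trace: str) -> list[str]:
    """Return the reliable (non-'?') function names, in the order printed."""
    names = []
    for line in trace.splitlines():
        match = FRAME.search(line)
        if match and not match.group(1):
            names.append(match.group(2))
    return names


def classify(trace: str) -> str:
    """Assign a coarse, hypothetical label based on which functions appear."""
    names = frames(trace)
    if "udp6_gro_receive" in names:
        return "udp6-gro"            # failure seen while GRO handles UDP over IPv6
    if "icmp_error" in names and "nf_conntrack_in" in names:
        return "conntrack-icmp"      # failure seen in conntrack's ICMP handling
    if "__udp6_lib_rcv" in names:
        return "udp6-rcv"            # failure seen in the normal UDPv6 receive path
    return "unknown"


if __name__ == "__main__":
    import sys
    # Example use: dmesg | python3 classify_splat.py
    print(classify(sys.stdin.read()))
```

Frames marked with "?" are skipped on purpose, since they may be stale stack contents rather than real callers.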


Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-26 Thread Eric Dumazet
On Fri, Oct 26, 2018 at 5:38 AM Andre Tomt  wrote:
>
> On 26.10.2018 13:45, Andre Tomt wrote:
> > On 25.10.2018 19:38, Eric Dumazet wrote:
> >>
> >>
> >> On 10/24/2018 12:41 PM, Andre Tomt wrote:
> >>>
> >>> It eventually showed up again with mlx4, on 4.18.16 + fix and also on
> >>> 4.19. I still do not have a useful packet capture.
> >>>
> >>> It is running a torrent client serving up various linux distributions.
> >>>
> >>
> >> Have you also applied this fix ?
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913
> >>
> >>
> >
> > No. I've applied it now to 4.19 and will report back if anything shows up.
>
> And it tripped again with that commit; however on another box with a
> much more complicated setup (VRFs, sch_cake, ifb, conntrack/nat, 6in4
> tunnel, VF device on mlx4)
>
> > [ 8197.348260] wanib: hw csum failure
> > [ 8197.348288] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.0-1 #1
> > [ 8197.348289] Hardware name: Supermicro SYS-5018D-FN8T/X10SDV-TP8F, BIOS 
> > 1.3 03/19/2018
> > [ 8197.348290] Call Trace:
> > [ 8197.348296]  <IRQ>
> > [ 8197.348304]  dump_stack+0x5c/0x80
> > [ 8197.348308]  __skb_checksum_complete+0xac/0xc0
> > [ 8197.348318]  icmp_error+0x1c8/0x1f0 [nf_conntrack]
> > [ 8197.348325]  ? ip_output+0x61/0xc0
> > [ 8197.348328]  ? skb_copy_bits+0x13d/0x220
> > [ 8197.348334]  nf_conntrack_in+0xd8/0x390 [nf_conntrack]
> > [ 8197.348339]  ? ___pskb_trim+0x192/0x330
> > [ 8197.348343]  nf_hook_slow+0x43/0xc0
> > [ 8197.348346]  ip_rcv+0x90/0xb0
> > [ 8197.348349]  ? ip_rcv_finish_core.isra.0+0x310/0x310
> > [ 8197.348354]  __netif_receive_skb_one_core+0x42/0x50
> > [ 8197.348357]  netif_receive_skb_internal+0x24/0xb0
> > [ 8197.348361]  ifb_ri_tasklet+0x167/0x260 [ifb]
> > [ 8197.348365]  tasklet_action_common.isra.3+0x49/0xb0
> > [ 8197.348369]  __do_softirq+0xe7/0x2d3
> > [ 8197.348372]  irq_exit+0x96/0xd0
> > [ 8197.348375]  do_IRQ+0x85/0xd0
> > [ 8197.348378]  common_interrupt+0xf/0xf
> > [ 8197.348379]  </IRQ>
> > [ 8197.348382] RIP: 0010:cpuidle_enter_state+0xb9/0x320
> > [ 8197.348384] Code: e8 1c 16 bc ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 
> > 00 f6 c4 02 0f 85 3b 02 00 00 31 ff e8 3e fb c0 ff fb 66 0f 1f 44 00 00 
> > <48> b8 ff ff ff ff f3 01 00 00 48 2b 1c 24 ba ff ff ff 7f 48 39 c3
> > [ 8197.348386] RSP: 0018:9f0441953ea8 EFLAGS: 0246 ORIG_RAX: 
> > ffd5
> > [ 8197.348388] RAX: 9759efae0fc0 RBX: 07749807d911 RCX: 
> > 001f
> > [ 8197.348390] RDX: 07749807d911 RSI: 3a2e8670 RDI: 
> > 
> > [ 8197.348393] RBP: 9759efae98a8 R08: 0002 R09: 
> > 00020840
> > [ 8197.348396] R10: 00626b4810384abc R11: 9759efae01e8 R12: 
> > 0001
> > [ 8197.348398] R13: 8d0ac638 R14: 0001 R15: 
> > 
> > [ 8197.348402]  ? cpuidle_enter_state+0x94/0x320
> > [ 8197.348407]  do_idle+0x1e4/0x220
> > [ 8197.348411]  cpu_startup_entry+0x5f/0x70
> > [ 8197.348415]  start_secondary+0x185/0x1a0
> > [ 8197.348417]  secondary_startup_64+0xa4/0xb0



Very different trace, yet another bug to track.

If you can, try to remove some components from this setup.


Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-26 Thread Andre Tomt

On 26.10.2018 13:45, Andre Tomt wrote:

On 25.10.2018 19:38, Eric Dumazet wrote:



On 10/24/2018 12:41 PM, Andre Tomt wrote:


It eventually showed up again with mlx4, on 4.18.16 + fix and also on 
4.19. I still do not have a useful packet capture.


It is running a torrent client serving up various linux distributions.



Have you also applied this fix ?

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 





No. I've applied it now to 4.19 and will report back if anything shows up.


And it tripped again with that commit; however on another box with a 
much more complicated setup (VRFs, sch_cake, ifb, conntrack/nat, 6in4 
tunnel, VF device on mlx4)



[ 8197.348260] wanib: hw csum failure
[ 8197.348288] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.0-1 #1
[ 8197.348289] Hardware name: Supermicro SYS-5018D-FN8T/X10SDV-TP8F, BIOS 1.3 
03/19/2018
[ 8197.348290] Call Trace:
[ 8197.348296]  <IRQ>
[ 8197.348304]  dump_stack+0x5c/0x80
[ 8197.348308]  __skb_checksum_complete+0xac/0xc0
[ 8197.348318]  icmp_error+0x1c8/0x1f0 [nf_conntrack]
[ 8197.348325]  ? ip_output+0x61/0xc0
[ 8197.348328]  ? skb_copy_bits+0x13d/0x220
[ 8197.348334]  nf_conntrack_in+0xd8/0x390 [nf_conntrack]
[ 8197.348339]  ? ___pskb_trim+0x192/0x330
[ 8197.348343]  nf_hook_slow+0x43/0xc0
[ 8197.348346]  ip_rcv+0x90/0xb0
[ 8197.348349]  ? ip_rcv_finish_core.isra.0+0x310/0x310
[ 8197.348354]  __netif_receive_skb_one_core+0x42/0x50
[ 8197.348357]  netif_receive_skb_internal+0x24/0xb0
[ 8197.348361]  ifb_ri_tasklet+0x167/0x260 [ifb]
[ 8197.348365]  tasklet_action_common.isra.3+0x49/0xb0
[ 8197.348369]  __do_softirq+0xe7/0x2d3
[ 8197.348372]  irq_exit+0x96/0xd0
[ 8197.348375]  do_IRQ+0x85/0xd0
[ 8197.348378]  common_interrupt+0xf/0xf
[ 8197.348379]  </IRQ>
[ 8197.348382] RIP: 0010:cpuidle_enter_state+0xb9/0x320
[ 8197.348384] Code: e8 1c 16 bc ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 
02 0f 85 3b 02 00 00 31 ff e8 3e fb c0 ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff 
ff f3 01 00 00 48 2b 1c 24 ba ff ff ff 7f 48 39 c3
[ 8197.348386] RSP: 0018:9f0441953ea8 EFLAGS: 0246 ORIG_RAX: 
ffd5
[ 8197.348388] RAX: 9759efae0fc0 RBX: 07749807d911 RCX: 001f
[ 8197.348390] RDX: 07749807d911 RSI: 3a2e8670 RDI: 
[ 8197.348393] RBP: 9759efae98a8 R08: 0002 R09: 00020840
[ 8197.348396] R10: 00626b4810384abc R11: 9759efae01e8 R12: 0001
[ 8197.348398] R13: 8d0ac638 R14: 0001 R15: 
[ 8197.348402]  ? cpuidle_enter_state+0x94/0x320
[ 8197.348407]  do_idle+0x1e4/0x220
[ 8197.348411]  cpu_startup_entry+0x5f/0x70
[ 8197.348415]  start_secondary+0x185/0x1a0
[ 8197.348417]  secondary_startup_64+0xa4/0xb0


Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-26 Thread Andre Tomt

On 25.10.2018 19:38, Eric Dumazet wrote:



On 10/24/2018 12:41 PM, Andre Tomt wrote:


It eventually showed up again with mlx4, on 4.18.16 + fix and also on 4.19. I 
still do not have a useful packet capture.

It is running a torrent client serving up various linux distributions.



Have you also applied this fix ?

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913



No. I've applied it now to 4.19 and will report back if anything shows up.


Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-25 Thread Eric Dumazet



On 10/24/2018 12:41 PM, Andre Tomt wrote:
> 
> It eventually showed up again with mlx4, on 4.18.16 + fix and also on 4.19. I 
> still do not have a useful packet capture.
> 
> It is running a torrent client serving up various linux distributions.
>

Have you also applied this fix ?

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=db4f1be3ca9b0ef7330763d07bf4ace83ad6f913



Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-24 Thread Andre Tomt

On 21.10.2018 15:34, Andre Tomt wrote:

On 20.10.2018 00:25, Eric Dumazet wrote:

On 10/19/2018 02:58 PM, Eric Dumazet wrote:

On 10/16/2018 06:00 AM, Eric Dumazet wrote:

On Mon, Oct 15, 2018 at 11:30 PM Andre Tomt  wrote:
I've seen similar on several systems with mlx4 cards when using 4.18.x -
that is hw csum failure followed by some backtrace.

Only seems to happen on systems dealing with quite a bit of UDP.



Strange, because mlx4 on IPv6+UDP should not use CHECKSUM_COMPLETE,
but CHECKSUM_UNNECESSARY

It would be nice to track this a bit further, maybe by providing the
full packet content.
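
For context, here is a small Python sketch of what validating a UDP/IPv6 packet from a device-supplied full sum (the CHECKSUM_COMPLETE idea) might look like, as opposed to CHECKSUM_UNNECESSARY where the device simply vouches for the packet. It is a simplified model, not kernel code. All helper names (`ipv6_header`, `build_udp`, `verify_complete`) are hypothetical, and it assumes the device sums the IPv6 header plus everything after it.

```python
import struct


def fold(csum: int) -> int:
    """Fold carries back into the low 16 bits (end-around carry)."""
    while csum >> 16:
        csum = (csum & 0xFFFF) + (csum >> 16)
    return csum


def csum_partial(data: bytes, csum: int = 0) -> int:
    """Add the big-endian 16-bit words of data to csum and fold."""
    if len(data) % 2:
        data += b"\x00"  # an odd trailing byte is padded with zero
    for i in range(0, len(data), 2):
        csum += (data[i] << 8) | data[i + 1]
    return fold(csum)


def csum_sub(a: int, b: int) -> int:
    """Ones' complement subtraction: a - b is a + ~b."""
    return fold(a + (~b & 0xFFFF))


def pseudo_header(src: bytes, dst: bytes, length: int, next_header: int = 17) -> bytes:
    """IPv6 pseudo-header: addresses, 32-bit length, 3 zero bytes, next header."""
    return src + dst + struct.pack("!I3xB", length, next_header)


def ipv6_header(src: bytes, dst: bytes, payload_len: int) -> bytes:
    """Fixed 40-byte IPv6 header carrying UDP (next header 17), hop limit 64."""
    return struct.pack("!IHBB", 0x60000000, payload_len, 17, 64) + src + dst


def build_udp(src: bytes, dst: bytes, sport: int, dport: int, payload: bytes) -> bytes:
    """Build a UDP datagram with a correct checksum for the given addresses."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    csum = ~csum_partial(pseudo_header(src, dst, length) + header + payload) & 0xFFFF
    csum = csum or 0xFFFF  # a computed zero is transmitted as all ones
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


def verify_complete(hw_csum: int, ip6_hdr: bytes, src: bytes, dst: bytes, udp_len: int) -> bool:
    """Check the datagram using only the device-provided sum over header + payload."""
    csum = csum_sub(hw_csum, csum_partial(ip6_hdr))       # drop the IPv6 header bytes
    csum = csum_partial(pseudo_header(src, dst, udp_len), csum)  # add the pseudo-header
    return csum % 0xFFFF == 0                              # zero in ones' complement


if __name__ == "__main__":
    src, dst = bytes(15) + b"\x01", bytes(15) + b"\x02"
    udp = build_udp(src, dst, 1234, 5678, b"hello")
    ip6 = ipv6_header(src, dst, len(udp))
    hw = csum_partial(ip6 + udp)  # what the modelled device would report
    print(verify_complete(hw, ip6, src, dst, len(udp)))
```

In this model, a device sum that does not match the bytes the stack ends up looking at makes the check fail even when the packet itself is fine. As far as I understand it, that is the kind of disagreement the "hw csum failure" message reports.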





As a matter of fact Dimitris found the issue in the patch and is 
working on a fix involving csum_block_sub()


The problem comes from trimming an odd number of bytes.


More exactly, trimming bytes starting at an odd offset.
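
To illustrate why the parity of the trim offset matters, here is a minimal Python sketch (not kernel code). It assumes the usual 16-bit ones' complement Internet checksum over big-endian words. The helper names `trim_naive` and `trim_offset_aware` are made up for this example and only loosely mirror the idea behind csum_block_sub().

```python
def fold(csum: int) -> int:
    """Fold carries back into the low 16 bits (end-around carry)."""
    while csum >> 16:
        csum = (csum & 0xFFFF) + (csum >> 16)
    return csum


def csum_partial(data: bytes, csum: int = 0) -> int:
    """Add the big-endian 16-bit words of data to csum and fold."""
    if len(data) % 2:
        data += b"\x00"  # an odd trailing byte is padded with zero
    for i in range(0, len(data), 2):
        csum += (data[i] << 8) | data[i + 1]
    return fold(csum)


def csum_sub(a: int, b: int) -> int:
    """Ones' complement subtraction: a - b is a + ~b."""
    return fold(a + (~b & 0xFFFF))


def swap16(x: int) -> int:
    """Swap the two bytes of a 16-bit value."""
    return ((x & 0xFF) << 8) | (x >> 8)


def trim_naive(csum: int, tail: bytes) -> int:
    """Subtract the tail's sum, ignoring where the tail sat in the packet."""
    return csum_sub(csum, csum_partial(tail))


def trim_offset_aware(csum: int, tail: bytes, offset: int) -> int:
    """Subtract the tail's sum, byte-swapping it if the tail began at an odd offset."""
    tail_sum = csum_partial(tail)
    if offset & 1:
        # At an odd offset each tail byte sits in the other half of its
        # 16-bit word, so its contribution to the full sum is byte-swapped.
        tail_sum = swap16(tail_sum)
    return csum_sub(csum, tail_sum)


if __name__ == "__main__":
    packet = bytes([0x45, 0x00, 0x12, 0x34, 0xAB, 0xCD, 0xEF])
    whole = csum_partial(packet)
    for offset in (3, 4):
        kept, tail = packet[:offset], packet[offset:]
        print(offset,
              trim_naive(whole, tail) == csum_partial(kept),
              trim_offset_aware(whole, tail, offset) == csum_partial(kept))
```

The naive version only matches a fresh sum of the kept bytes when the trimmed part starts at an even offset; the offset-aware version matches in both cases.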


No hw csum failures here since I deployed Dimitris' fix on top of 4.18.16 
32 hours ago.


Thanks


It eventually showed up again with mlx4, on 4.18.16 + fix and also on 
4.19. I still do not have a useful packet capture.


It is running a torrent client serving up various linux distributions.


[116116.994519] p0xe0: hw csum failure
[116116.994550] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.19.0-1 #1
[116116.994551] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 
05/02/2017
[116116.994555] Call Trace:
[116116.994558]  <IRQ>
[116116.994567]  dump_stack+0x5c/0x7b
[116116.994574]  __skb_gro_checksum_complete+0x9a/0xa0
[116116.994580]  udp6_gro_receive+0x211/0x290
[116116.994585]  ipv6_gro_receive+0x1b1/0x3a0
[116116.994588]  dev_gro_receive+0x3a0/0x620
[116116.994590]  ? __build_skb+0x25/0xe0
[116116.994592]  napi_gro_frags+0xa8/0x220
[116116.994598]  mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
[116116.994611]  ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
[116116.994621]  ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
[116116.994629]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
[116116.994635]  net_rx_action+0xe0/0x2e0
[116116.994641]  __do_softirq+0xd8/0x2ff
[116116.994646]  irq_exit+0xbd/0xd0
[116116.994650]  do_IRQ+0x85/0xd0
[116116.994656]  common_interrupt+0xf/0xf
[116116.994659]  </IRQ>
[116116.994665] RIP: 0010:cpuidle_enter_state+0xb3/0x310
[116116.994668] Code: 31 ff e8 e0 e0 bb ff 45 84 f6 74 17 9c 58 0f 1f 44 00 00 f6 c4 
02 0f 85 3f 02 00 00 31 ff e8 64 cc c0 ff fb 66 0f 1f 44 00 00 <4c> 29 fb 48 ba 
cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
[116116.994669] RSP: 0018:924a0635bea8 EFLAGS: 0246 ORIG_RAX: 
ffda
[116116.994671] RAX: 9016ffb60fc0 RBX: 699b9835d616 RCX: 
001f
[116116.994673] RDX: 699b9835d616 RSI: 229837f7 RDI: 

[116116.994674] RBP: 0001 R08: 0002 R09: 
00020840
[116116.994675] R10: 924a0635be88 R11: 0367 R12: 
9016ffb69aa8
[116116.994676] R13: a50ac638 R14:  R15: 
699b981c63b9
[116116.994680]  ? cpuidle_enter_state+0x90/0x310
[116116.994685]  do_idle+0x1d0/0x240
[116116.994687]  cpu_startup_entry+0x5f/0x70
[116116.994690]  start_secondary+0x185/0x1a0
[116116.994693]  secondary_startup_64+0xa4/0xb0
[116116.994709] p0xe0: hw csum failure
[116116.994739] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.19.0-1 #1
[116116.994740] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 
05/02/2017
[116116.994741] Call Trace:
[116116.994743]  <IRQ>
[116116.994746]  dump_stack+0x5c/0x7b
[116116.994751]  __skb_checksum_complete+0xb8/0xd0
[116116.994755]  __udp6_lib_rcv+0xa0e/0xa20
[116116.994764]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
[116116.994768]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
[116116.994771]  ip6_input_finish+0xc0/0x460
[116116.994774]  ip6_input+0x2b/0x90
[116116.994776]  ? ip6_make_skb+0x1b0/0x1b0
[116116.994778]  ipv6_rcv+0x54/0xb0
[116116.994781]  __netif_receive_skb_one_core+0x42/0x50
[116116.994784]  netif_receive_skb_internal+0x24/0xb0
[116116.994786]  napi_gro_frags+0x171/0x220
[116116.994790]  mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
[116116.994798]  ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
[116116.994803]  ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
[116116.994806]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
[116116.994808]  net_rx_action+0xe0/0x2e0
[116116.994810]  __do_softirq+0xd8/0x2ff
[116116.994812]  irq_exit+0xbd/0xd0
[116116.994814]  do_IRQ+0x85/0xd0
[116116.994816]  common_interrupt+0xf/0xf
[116116.994818]  </IRQ>
[116116.994821] RIP: 0010:cpuidle_enter_state+0xb3/0x310
[116116.994823] Code: 31 ff e8 e0 e0 bb ff 45 84 f6 74 17 9c 58 0f 1f 44 00 00 f6 c4 
02 0f 85 3f 02 00 00 31 ff e8 64 cc c0 ff fb 66 0f 1f 44 00 00 <4c> 29 fb 48 ba 
cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
[116116.994824] RSP: 0018:924a0635bea8 EFLAGS: 0246 ORIG_RAX: 
ffda
[116116.994825] RAX: 9016ffb60fc0 RBX: 699b9835d616 RCX: 
001f
[116116.994826] RDX: 699b9835d616 RSI: 229837f7 RDI: 

[116116.994827] RBP: 0001 R08: 0002 R09: 

Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-21 Thread Andre Tomt

On 20.10.2018 00:25, Eric Dumazet wrote:

On 10/19/2018 02:58 PM, Eric Dumazet wrote:

On 10/16/2018 06:00 AM, Eric Dumazet wrote:

On Mon, Oct 15, 2018 at 11:30 PM Andre Tomt  wrote:

I've seen similar on several systems with mlx4 cards when using 4.18.x -
that is hw csum failure followed by some backtrace.

Only seems to happen on systems dealing with quite a bit of UDP.



Strange, because mlx4 on IPv6+UDP should not use CHECKSUM_COMPLETE,
but CHECKSUM_UNNECESSARY

It would be nice to track this a bit further, maybe by providing the
full packet content.





As a matter of fact Dimitris found the issue in the patch and is working on a 
fix involving csum_block_sub()

The problem comes from trimming an odd number of bytes.


More exactly, trimming bytes starting at an odd offset.
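
A minimal sketch of the arithmetic behind the two remarks above, for readers
who want to see why the offset matters. It is not the kernel code: the helper
names (fold, csum, swap16, csum_sub, trim_naive, trim_offset_aware, same) and
the sample bytes are made up for illustration, and the only kernel symbol
taken from the thread is csum_block_sub(). The assumption is the usual 16-bit
ones-complement Internet checksum. When the trimmed bytes start at an odd
offset, they sit in the opposite byte lanes of the 16-bit words compared with
a checksum computed over them on their own, so their sum has to be
byte-swapped before it is subtracted.

```python
def fold(x: int) -> int:
    """Fold carries back in until the value fits in 16 bits."""
    while x >> 16:
        x = (x & 0xFFFF) + (x >> 16)
    return x


def csum(data: bytes) -> int:
    """Ones-complement sum over big-endian 16-bit words (odd tail zero-padded)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return fold(total)


def swap16(x: int) -> int:
    """Swap the two bytes of a 16-bit value."""
    return ((x & 0xFF) << 8) | (x >> 8)


def csum_sub(a: int, b: int) -> int:
    """Ones-complement subtraction: add the complement of b."""
    return fold(a + (~b & 0xFFFF))


def same(a: int, b: int) -> bool:
    """Compare two sums, treating 0x0000 and 0xFFFF as the same zero."""
    return a % 0xFFFF == b % 0xFFFF


def trim_naive(full_sum: int, data: bytes, new_len: int) -> int:
    """Subtract the sum of the trimmed tail, ignoring where the tail starts."""
    return csum_sub(full_sum, csum(data[new_len:]))


def trim_offset_aware(full_sum: int, data: bytes, new_len: int) -> int:
    """Same, but byte-swap the tail sum when the tail starts at an odd offset."""
    tail = csum(data[new_len:])
    if new_len % 2:
        tail = swap16(tail)
    return csum_sub(full_sum, tail)


# Illustrative bytes only; the last four stand in for non-zero trailing data.
packet = bytes([0x45, 0x00, 0x12, 0x34, 0xAB, 0xCD, 0xEF, 0x01, 0x02, 0x03, 0x04])
full = csum(packet)

if __name__ == "__main__":
    for keep in (8, 7):  # even offset, then odd offset
        want = csum(packet[:keep])
        print(keep,
              same(trim_naive(full, packet, keep), want),
              same(trim_offset_aware(full, packet, keep), want))
```

With all-zero padding the trimmed sum is zero and the swap makes no
difference, which may be why the problem only showed up on some traffic.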


No hw csum failures here since I deployed Dimitris' fix on top of 4.18.16 
32 hours ago.


Thanks


Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-19 Thread Eric Dumazet



On 10/19/2018 02:58 PM, Eric Dumazet wrote:
> 
> 
> On 10/16/2018 06:00 AM, Eric Dumazet wrote:
>> On Mon, Oct 15, 2018 at 11:30 PM Andre Tomt  wrote:
>>>
>>> On 15.10.2018 17:41, Eric Dumazet wrote:
>>>> On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger
>>>>> Something is changed between 4.17.12 and 4.18, after bisecting the problem I
>>>>> got the following first bad commit:
>>>>>
>>>>> commit 88078d98d1bb085d72af8437707279e203524fa5
>>>>> Author: Eric Dumazet
>>>>> Date:   Wed Apr 18 11:43:15 2018 -0700
>>>>>
>>>>>  net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
>>>>>
>>>>>  After working on IP defragmentation lately, I found that some large
>>>>>  packets defeat CHECKSUM_COMPLETE optimization because of NIC adding
>>>>>  zero paddings on the last (small) fragment.
>>>>>
>>>>>  While removing the padding with pskb_trim_rcsum(), we set skb->ip_summed
>>>>>  to CHECKSUM_NONE, forcing a full csum validation, even if all prior
>>>>>  fragments had CHECKSUM_COMPLETE set.
>>>>>
>>>>>  We can instead compute the checksum of the part we are trimming,
>>>>>  usually smaller than the part we keep.
>>>>>
>>>>>  Signed-off-by: Eric Dumazet
>>>>>  Signed-off-by: David S. Miller
>>>>>
>>>>
>>>> Thanks for bisecting !
>>>>
>>>> This commit is known to expose some NIC/driver bugs.
>>>>
>>>> Look at commit 12b03558cef6d655d0d394f5e98a6fd07c1f6c0f
>>>> ("net: sungem: fix rx checksum support")  for one driver needing a fix.
>>>>
>>>> I assume SKY2_HW_NEW_LE is not set on your NIC ?
>>>>
>>>
>>> I've seen similar on several systems with mlx4 cards when using 4.18.x -
>>> that is hw csum failure followed by some backtrace.
>>>
>>> Only seems to happen on systems dealing with quite a bit of UDP.
>>>
>>
>> Strange, because mlx4 on IPv6+UDP should not use CHECKSUM_COMPLETE,
>> but CHECKSUM_UNNECESSARY
>>
>> It would be nice to track this a bit further, maybe by providing the
>> full packet content.
>>
>>> Example from 4.18.10:
>>>> [635607.740574] p0xe0: hw csum failure
>>>> [635607.740598] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
>>>> [635607.740599] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 
>>>> 05/02/2017
>>>> [635607.740599] Call Trace:
>>>> [635607.740602]  
>>>> [635607.740611]  dump_stack+0x5c/0x7b
>>>> [635607.740617]  __skb_gro_checksum_complete+0x9a/0xa0
>>>> [635607.740621]  udp6_gro_receive+0x211/0x290
>>>> [635607.740624]  ipv6_gro_receive+0x1a8/0x390
>>>> [635607.740627]  dev_gro_receive+0x33e/0x550
>>>> [635607.740628]  napi_gro_frags+0xa2/0x210
>>>> [635607.740635]  mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
>>>> [635607.740648]  ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
>>>> [635607.740654]  ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
>>>> [635607.740657]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
>>>> [635607.740658]  net_rx_action+0xe0/0x2e0
>>>> [635607.740662]  __do_softirq+0xd8/0x2e5
>>>> [635607.740666]  irq_exit+0xb4/0xc0
>>>> [635607.740667]  do_IRQ+0x85/0xd0
>>>> [635607.740670]  common_interrupt+0xf/0xf
>>>> [635607.740671]  
>>>> [635607.740675] RIP: 0010:cpuidle_enter_state+0xb4/0x2a0
>>>> [635607.740675] Code: 31 ff e8 df a6 ba ff 45 84 f6 74 17 9c 58 0f 1f 44 
>>>> 00 00 f6 c4 02 0f 85 d8 01 00 00 31 ff e8 13 81 bf ff fb 66 0f 1f 44 00 00 
>>>> <4c> 29 fb 48 ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
>>>> [635607.740701] RSP: 0018:a5c206353ea8 EFLAGS: 0246 ORIG_RAX: 
>>>> ffd9
>>>> [635607.740703] RAX: 8d72ffd20f00 RBX: 00024214f597c5b0 RCX: 
>>>> 001f
>>>> [635607.740703] RDX: 00024214f597c5b0 RSI: 00020780 RDI: 
>>>> 
>>>> [635607.740704] RBP: 0004 R08: 002542bfbefa99fa R09: 
>>>> 
>>>> [635607.740705] R10: a5c206353e88 R11: 00c5 R12: 
>>>> af0aaf78
>>>> [635607.740706] R13: 8d72ffd297d8 R14:  R15: 
>>>> 00024214f58c2ed5
>>>> [635607.740709]  ? cpuidle_enter_state+0x91/0x2a0
>>>> [635607.740712]  do_idle+0x1d0/0x240
>>>> [635607.740715]  cpu_startup_entry+0x5f/0x70
>>>> [635607.740719]  start_secondary+0x185/0x1a0
>>>> [635607.740722]  secondary_startup_64+0xa5/0xb0
>>>> [635607.740731] p0xe0: hw csum failure
>>>> [635607.740745] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
>>>> [635607.740746] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 
>>>> 05/02/2017
>>>> [635607.740746] Call Trace:
>>>> [635607.740747]  
>>>> [635607.740750]  dump_stack+0x5c/0x7b
>>>> [635607.740755]  __skb_checksum_complete+0xb8/0xd0
>>>> [635607.740760]  __udp6_lib_rcv+0xa6b/0xa70
>>>> [635607.740767]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
>>>> [635607.740770]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
>>>> [635607.740774]  ip6_input_finish+0xc0/0x460
>>>> [635607.740776]  ip6_input+0x2b/0x90
>>>> [635607.740778]  ? ip6_rcv_finish+0x110/0x110
>>>> [635607.740780]  ipv6_rcv+0x2cd/0x4b0
>>>> [635607.740783]  ? udp6_lib_lookup_skb+0x59/0x80
 

Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-19 Thread Eric Dumazet



On 10/16/2018 06:00 AM, Eric Dumazet wrote:
> On Mon, Oct 15, 2018 at 11:30 PM Andre Tomt  wrote:
>>
>> On 15.10.2018 17:41, Eric Dumazet wrote:
>>> On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger
>>>> Something is changed between 4.17.12 and 4.18, after bisecting the problem I
>>>> got the following first bad commit:
>>>>
>>>> commit 88078d98d1bb085d72af8437707279e203524fa5
>>>> Author: Eric Dumazet
>>>> Date:   Wed Apr 18 11:43:15 2018 -0700
>>>>
>>>>  net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
>>>>
>>>>  After working on IP defragmentation lately, I found that some large
>>>>  packets defeat CHECKSUM_COMPLETE optimization because of NIC adding
>>>>  zero paddings on the last (small) fragment.
>>>>
>>>>  While removing the padding with pskb_trim_rcsum(), we set skb->ip_summed
>>>>  to CHECKSUM_NONE, forcing a full csum validation, even if all prior
>>>>  fragments had CHECKSUM_COMPLETE set.
>>>>
>>>>  We can instead compute the checksum of the part we are trimming,
>>>>  usually smaller than the part we keep.
>>>>
>>>>  Signed-off-by: Eric Dumazet
>>>>  Signed-off-by: David S. Miller
>>>>
>>>
>>> Thanks for bisecting !
>>>
>>> This commit is known to expose some NIC/driver bugs.
>>>
>>> Look at commit 12b03558cef6d655d0d394f5e98a6fd07c1f6c0f
>>> ("net: sungem: fix rx checksum support")  for one driver needing a fix.
>>>
>>> I assume SKY2_HW_NEW_LE is not set on your NIC ?
>>>
>>
>> I've seen similar on several systems with mlx4 cards when using 4.18.x -
>> that is hw csum failure followed by some backtrace.
>>
>> Only seems to happen on systems dealing with quite a bit of UDP.
>>
> 
> Strange, because mlx4 on IPv6+UDP should not use CHECKSUM_COMPLETE,
> but CHECKSUM_UNNECESSARY
> 
> It would be nice to track this a bit further, maybe by providing the
> full packet content.
> 
>> Example from 4.18.10:
>>> [635607.740574] p0xe0: hw csum failure
>>> [635607.740598] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
>>> [635607.740599] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 
>>> 05/02/2017
>>> [635607.740599] Call Trace:
>>> [635607.740602]  
>>> [635607.740611]  dump_stack+0x5c/0x7b
>>> [635607.740617]  __skb_gro_checksum_complete+0x9a/0xa0
>>> [635607.740621]  udp6_gro_receive+0x211/0x290
>>> [635607.740624]  ipv6_gro_receive+0x1a8/0x390
>>> [635607.740627]  dev_gro_receive+0x33e/0x550
>>> [635607.740628]  napi_gro_frags+0xa2/0x210
>>> [635607.740635]  mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
>>> [635607.740648]  ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
>>> [635607.740654]  ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
>>> [635607.740657]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
>>> [635607.740658]  net_rx_action+0xe0/0x2e0
>>> [635607.740662]  __do_softirq+0xd8/0x2e5
>>> [635607.740666]  irq_exit+0xb4/0xc0
>>> [635607.740667]  do_IRQ+0x85/0xd0
>>> [635607.740670]  common_interrupt+0xf/0xf
>>> [635607.740671]  
>>> [635607.740675] RIP: 0010:cpuidle_enter_state+0xb4/0x2a0
>>> [635607.740675] Code: 31 ff e8 df a6 ba ff 45 84 f6 74 17 9c 58 0f 1f 44 00 
>>> 00 f6 c4 02 0f 85 d8 01 00 00 31 ff e8 13 81 bf ff fb 66 0f 1f 44 00 00 
>>> <4c> 29 fb 48 ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
>>> [635607.740701] RSP: 0018:a5c206353ea8 EFLAGS: 0246 ORIG_RAX: 
>>> ffd9
>>> [635607.740703] RAX: 8d72ffd20f00 RBX: 00024214f597c5b0 RCX: 
>>> 001f
>>> [635607.740703] RDX: 00024214f597c5b0 RSI: 00020780 RDI: 
>>> 
>>> [635607.740704] RBP: 0004 R08: 002542bfbefa99fa R09: 
>>> 
>>> [635607.740705] R10: a5c206353e88 R11: 00c5 R12: 
>>> af0aaf78
>>> [635607.740706] R13: 8d72ffd297d8 R14:  R15: 
>>> 00024214f58c2ed5
>>> [635607.740709]  ? cpuidle_enter_state+0x91/0x2a0
>>> [635607.740712]  do_idle+0x1d0/0x240
>>> [635607.740715]  cpu_startup_entry+0x5f/0x70
>>> [635607.740719]  start_secondary+0x185/0x1a0
>>> [635607.740722]  secondary_startup_64+0xa5/0xb0
>>> [635607.740731] p0xe0: hw csum failure
>>> [635607.740745] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
>>> [635607.740746] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 
>>> 05/02/2017
>>> [635607.740746] Call Trace:
>>> [635607.740747]  
>>> [635607.740750]  dump_stack+0x5c/0x7b
>>> [635607.740755]  __skb_checksum_complete+0xb8/0xd0
>>> [635607.740760]  __udp6_lib_rcv+0xa6b/0xa70
>>> [635607.740767]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
>>> [635607.740770]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
>>> [635607.740774]  ip6_input_finish+0xc0/0x460
>>> [635607.740776]  ip6_input+0x2b/0x90
>>> [635607.740778]  ? ip6_rcv_finish+0x110/0x110
>>> [635607.740780]  ipv6_rcv+0x2cd/0x4b0
>>> [635607.740783]  ? udp6_lib_lookup_skb+0x59/0x80
>>> [635607.740785]  __netif_receive_skb_core+0x455/0xb30
>>> [635607.740788]  ? ipv6_gro_receive+0x1a8/0x390
>>> [635607.740790]  ? netif_receive_skb_internal+0x24/0xb0
>>> 

Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-16 Thread Eric Dumazet
On Mon, Oct 15, 2018 at 11:30 PM Andre Tomt  wrote:
>
> On 15.10.2018 17:41, Eric Dumazet wrote:
> > On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger
> >> Something is changed between 4.17.12 and 4.18, after bisecting the problem 
> >> I
> >> got the following first bad commit:
> >>
> >> commit 88078d98d1bb085d72af8437707279e203524fa5
> >> Author: Eric Dumazet 
> >> Date:   Wed Apr 18 11:43:15 2018 -0700
> >>
> >>  net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
> >>
> >>  After working on IP defragmentation lately, I found that some large
> >>  packets defeat CHECKSUM_COMPLETE optimization because of NIC adding
> >>  zero paddings on the last (small) fragment.
> >>
> >>  While removing the padding with pskb_trim_rcsum(), we set 
> >> skb->ip_summed
> >>  to CHECKSUM_NONE, forcing a full csum validation, even if all prior
> >>  fragments had CHECKSUM_COMPLETE set.
> >>
> >>  We can instead compute the checksum of the part we are trimming,
> >>  usually smaller than the part we keep.
> >>
> >>  Signed-off-by: Eric Dumazet 
> >>  Signed-off-by: David S. Miller 
> >>
> >
> > Thanks for bisecting !
> >
> > This commit is known to expose some NIC/driver bugs.
> >
> > Look at commit 12b03558cef6d655d0d394f5e98a6fd07c1f6c0f
> > ("net: sungem: fix rx checksum support")  for one driver needing a fix.
> >
> > I assume SKY2_HW_NEW_LE is not set on your NIC ?
> >
>
> I've seen similar on several systems with mlx4 cards when using 4.18.x -
> that is hw csum failure followed by some backtrace.
>
> Only seems to happen on systems dealing with quite a bit of UDP.
>

Strange, because mlx4 on IPv6+UDP should not use CHECKSUM_COMPLETE,
but CHECKSUM_UNNECESSARY

It would be nice to track this a bit further, maybe by providing the
full packet content.

> Example from 4.18.10:
> > [635607.740574] p0xe0: hw csum failure
> > [635607.740598] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
> > [635607.740599] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 
> > 05/02/2017
> > [635607.740599] Call Trace:
> > [635607.740602]  
> > [635607.740611]  dump_stack+0x5c/0x7b
> > [635607.740617]  __skb_gro_checksum_complete+0x9a/0xa0
> > [635607.740621]  udp6_gro_receive+0x211/0x290
> > [635607.740624]  ipv6_gro_receive+0x1a8/0x390
> > [635607.740627]  dev_gro_receive+0x33e/0x550
> > [635607.740628]  napi_gro_frags+0xa2/0x210
> > [635607.740635]  mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
> > [635607.740648]  ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
> > [635607.740654]  ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
> > [635607.740657]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
> > [635607.740658]  net_rx_action+0xe0/0x2e0
> > [635607.740662]  __do_softirq+0xd8/0x2e5
> > [635607.740666]  irq_exit+0xb4/0xc0
> > [635607.740667]  do_IRQ+0x85/0xd0
> > [635607.740670]  common_interrupt+0xf/0xf
> > [635607.740671]  
> > [635607.740675] RIP: 0010:cpuidle_enter_state+0xb4/0x2a0
> > [635607.740675] Code: 31 ff e8 df a6 ba ff 45 84 f6 74 17 9c 58 0f 1f 44 00 
> > 00 f6 c4 02 0f 85 d8 01 00 00 31 ff e8 13 81 bf ff fb 66 0f 1f 44 00 00 
> > <4c> 29 fb 48 ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
> > [635607.740701] RSP: 0018:a5c206353ea8 EFLAGS: 0246 ORIG_RAX: 
> > ffd9
> > [635607.740703] RAX: 8d72ffd20f00 RBX: 00024214f597c5b0 RCX: 
> > 001f
> > [635607.740703] RDX: 00024214f597c5b0 RSI: 00020780 RDI: 
> > 
> > [635607.740704] RBP: 0004 R08: 002542bfbefa99fa R09: 
> > 
> > [635607.740705] R10: a5c206353e88 R11: 00c5 R12: 
> > af0aaf78
> > [635607.740706] R13: 8d72ffd297d8 R14:  R15: 
> > 00024214f58c2ed5
> > [635607.740709]  ? cpuidle_enter_state+0x91/0x2a0
> > [635607.740712]  do_idle+0x1d0/0x240
> > [635607.740715]  cpu_startup_entry+0x5f/0x70
> > [635607.740719]  start_secondary+0x185/0x1a0
> > [635607.740722]  secondary_startup_64+0xa5/0xb0
> > [635607.740731] p0xe0: hw csum failure
> > [635607.740745] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
> > [635607.740746] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 
> > 05/02/2017
> > [635607.740746] Call Trace:
> > [635607.740747]  
> > [635607.740750]  dump_stack+0x5c/0x7b
> > [635607.740755]  __skb_checksum_complete+0xb8/0xd0
> > [635607.740760]  __udp6_lib_rcv+0xa6b/0xa70
> > [635607.740767]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
> > [635607.740770]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
> > [635607.740774]  ip6_input_finish+0xc0/0x460
> > [635607.740776]  ip6_input+0x2b/0x90
> > [635607.740778]  ? ip6_rcv_finish+0x110/0x110
> > [635607.740780]  ipv6_rcv+0x2cd/0x4b0
> > [635607.740783]  ? udp6_lib_lookup_skb+0x59/0x80
> > [635607.740785]  __netif_receive_skb_core+0x455/0xb30
> > [635607.740788]  ? ipv6_gro_receive+0x1a8/0x390
> > [635607.740790]  ? netif_receive_skb_internal+0x24/0xb0
> > [635607.740792]  netif_receive_skb_internal+0x24/0xb0
> > [635607.740793]  

Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-16 Thread Andre Tomt

On 15.10.2018 17:41, Eric Dumazet wrote:

On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger

Something is changed between 4.17.12 and 4.18, after bisecting the problem I
got the following first bad commit:

commit 88078d98d1bb085d72af8437707279e203524fa5
Author: Eric Dumazet 
Date:   Wed Apr 18 11:43:15 2018 -0700

 net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends

 After working on IP defragmentation lately, I found that some large
 packets defeat CHECKSUM_COMPLETE optimization because of NIC adding
 zero paddings on the last (small) fragment.

 While removing the padding with pskb_trim_rcsum(), we set skb->ip_summed
 to CHECKSUM_NONE, forcing a full csum validation, even if all prior
 fragments had CHECKSUM_COMPLETE set.

 We can instead compute the checksum of the part we are trimming,
 usually smaller than the part we keep.

 Signed-off-by: Eric Dumazet 
 Signed-off-by: David S. Miller 



Thanks for bisecting !

This commit is known to expose some NIC/driver bugs.

Look at commit 12b03558cef6d655d0d394f5e98a6fd07c1f6c0f
("net: sungem: fix rx checksum support")  for one driver needing a fix.

I assume SKY2_HW_NEW_LE is not set on your NIC ?



I've seen similar on several systems with mlx4 cards when using 4.18.x - 
that is hw csum failure followed by some backtrace.


Only seems to happen on systems dealing with quite a bit of UDP.

Example from 4.18.10:

[635607.740574] p0xe0: hw csum failure
[635607.740598] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
[635607.740599] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 
05/02/2017
[635607.740599] Call Trace:
[635607.740602]  
[635607.740611]  dump_stack+0x5c/0x7b
[635607.740617]  __skb_gro_checksum_complete+0x9a/0xa0
[635607.740621]  udp6_gro_receive+0x211/0x290
[635607.740624]  ipv6_gro_receive+0x1a8/0x390
[635607.740627]  dev_gro_receive+0x33e/0x550
[635607.740628]  napi_gro_frags+0xa2/0x210
[635607.740635]  mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
[635607.740648]  ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
[635607.740654]  ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
[635607.740657]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
[635607.740658]  net_rx_action+0xe0/0x2e0
[635607.740662]  __do_softirq+0xd8/0x2e5
[635607.740666]  irq_exit+0xb4/0xc0
[635607.740667]  do_IRQ+0x85/0xd0
[635607.740670]  common_interrupt+0xf/0xf
[635607.740671]  
[635607.740675] RIP: 0010:cpuidle_enter_state+0xb4/0x2a0
[635607.740675] Code: 31 ff e8 df a6 ba ff 45 84 f6 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d8 01 00 00 31 ff e8 13 81 bf ff fb 66 0f 1f 44 00 00 <4c> 29 fb 48 ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7 
[635607.740701] RSP: 0018:a5c206353ea8 EFLAGS: 0246 ORIG_RAX: ffd9

[635607.740703] RAX: 8d72ffd20f00 RBX: 00024214f597c5b0 RCX: 
001f
[635607.740703] RDX: 00024214f597c5b0 RSI: 00020780 RDI: 

[635607.740704] RBP: 0004 R08: 002542bfbefa99fa R09: 

[635607.740705] R10: a5c206353e88 R11: 00c5 R12: 
af0aaf78
[635607.740706] R13: 8d72ffd297d8 R14:  R15: 
00024214f58c2ed5
[635607.740709]  ? cpuidle_enter_state+0x91/0x2a0
[635607.740712]  do_idle+0x1d0/0x240
[635607.740715]  cpu_startup_entry+0x5f/0x70
[635607.740719]  start_secondary+0x185/0x1a0
[635607.740722]  secondary_startup_64+0xa5/0xb0
[635607.740731] p0xe0: hw csum failure
[635607.740745] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
[635607.740746] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 
05/02/2017
[635607.740746] Call Trace:
[635607.740747]  
[635607.740750]  dump_stack+0x5c/0x7b
[635607.740755]  __skb_checksum_complete+0xb8/0xd0
[635607.740760]  __udp6_lib_rcv+0xa6b/0xa70
[635607.740767]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
[635607.740770]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
[635607.740774]  ip6_input_finish+0xc0/0x460
[635607.740776]  ip6_input+0x2b/0x90
[635607.740778]  ? ip6_rcv_finish+0x110/0x110
[635607.740780]  ipv6_rcv+0x2cd/0x4b0
[635607.740783]  ? udp6_lib_lookup_skb+0x59/0x80
[635607.740785]  __netif_receive_skb_core+0x455/0xb30
[635607.740788]  ? ipv6_gro_receive+0x1a8/0x390
[635607.740790]  ? netif_receive_skb_internal+0x24/0xb0
[635607.740792]  netif_receive_skb_internal+0x24/0xb0
[635607.740793]  napi_gro_frags+0x165/0x210
[635607.740796]  mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
[635607.740802]  ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
[635607.740807]  ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
[635607.740810]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
[635607.740811]  net_rx_action+0xe0/0x2e0
[635607.740813]  __do_softirq+0xd8/0x2e5
[635607.740816]  irq_exit+0xb4/0xc0
[635607.740817]  do_IRQ+0x85/0xd0
[635607.740820]  common_interrupt+0xf/0xf
[635607.740821]  
[635607.740823] RIP: 0010:cpuidle_enter_state+0xb4/0x2a0
[635607.740823] Code: 31 ff e8 df a6 ba ff 45 84 f6 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d8 01 00 00 31 ff e8 13 81 bf ff fb 66 

Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-15 Thread Fabio Rossi



On 15 October 2018 17:41:47 CEST, Eric Dumazet  wrote:
>On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger
> wrote:
>>
>>
>>
>> Begin forwarded message:
>>
>> Date: Sun, 14 Oct 2018 10:42:48 +
>> From: bugzilla-dae...@bugzilla.kernel.org
>> To: step...@networkplumber.org
>> Subject: [Bug 201423] New: eth0: hw csum failure
>>
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=201423
>>
>> Bug ID: 201423
>>Summary: eth0: hw csum failure
>>Product: Networking
>>Version: 2.5
>> Kernel Version: 4.19.0-rc7
>>   Hardware: Intel
>> OS: Linux
>>   Tree: Mainline
>> Status: NEW
>>   Severity: normal
>>   Priority: P1
>>  Component: Other
>>   Assignee: step...@networkplumber.org
>>   Reporter: ross...@inwind.it
>> Regression: No
>>
>> I have a P6T DELUXE V2 motherboard and am using the sky2 driver for the
>> ethernet ports. I get the following error message:
>>
>> [  433.727397] eth0: hw csum failure
>> [  433.727406] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.19.0-rc7
>#19
>> [  433.727406] Hardware name: System manufacturer System Product
>Name/P6T
>> DELUXE V2, BIOS 120212/22/2010
>> [  433.727407] Call Trace:
>> [  433.727409]  
>> [  433.727415]  dump_stack+0x46/0x5b
>> [  433.727419]  __skb_checksum_complete+0xb0/0xc0
>> [  433.727423]  tcp_v4_rcv+0x528/0xb60
>> [  433.727426]  ? ipt_do_table+0x2d0/0x400
>> [  433.727429]  ip_local_deliver_finish+0x5a/0x110
>> [  433.727430]  ip_local_deliver+0xe1/0xf0
>> [  433.727431]  ? ip_sublist_rcv_finish+0x60/0x60
>> [  433.727432]  ip_rcv+0xca/0xe0
>> [  433.727434]  ? ip_rcv_finish_core.isra.0+0x300/0x300
>> [  433.727436]  __netif_receive_skb_one_core+0x4b/0x70
>> [  433.727438]  netif_receive_skb_internal+0x4e/0x130
>> [  433.727439]  napi_gro_receive+0x6a/0x80
>> [  433.727442]  sky2_poll+0x707/0xd20
>> [  433.727446]  ? rcu_check_callbacks+0x1b4/0x900
>> [  433.727447]  net_rx_action+0x237/0x380
>> [  433.727449]  __do_softirq+0xdc/0x1e0
>> [  433.727452]  irq_exit+0xa9/0xb0
>> [  433.727453]  do_IRQ+0x45/0xc0
>> [  433.727455]  common_interrupt+0xf/0xf
>> [  433.727456]  
>> [  433.727459] RIP: 0010:cpuidle_enter_state+0x124/0x200
>> [  433.727461] Code: 53 60 89 c3 e8 dd 90 ad ff 65 8b 3d 96 58 a7 7e
>e8 d1 8f
>> ad ff 31 ff 49 89 c4 e8 27 99 ad ff fb 48 ba cf f7 53 e3 a5 9b c4 20
><4c> 89 e1
>> 4c 29 e9 48 89 c8 48 c1 f9 3f 48 f7 ea b8 ff ff ff 7f 48
>> [  433.727462] RSP: :c90a3e98 EFLAGS: 0282 ORIG_RAX:
>> ffde
>> [  433.727463] RAX: 880237b1f280 RBX: 0004 RCX:
>> 001f
>> [  433.727464] RDX: 20c49ba5e353f7cf RSI: 2fe419c1 RDI:
>> 
>> [  433.727465] RBP: 880237b263a0 R08: 0714 R09:
>> 00650512105d
>> [  433.727465] R10:  R11: 0342 R12:
>> 0064fc2a8b1c
>> [  433.727466] R13: 0064fc25b35f R14: 0004 R15:
>> 8204af20
>> [  433.727468]  ? cpuidle_enter_state+0x119/0x200
>> [  433.727471]  do_idle+0x1bf/0x200
>> [  433.727473]  cpu_startup_entry+0x6a/0x70
>> [  433.727475]  start_secondary+0x17f/0x1c0
>> [  433.727476]  secondary_startup_64+0xa4/0xb0
>> [  441.662954] eth0: hw csum failure
>> [  441.662959] CPU: 4 PID: 4347 Comm: radeon_cs:0 Not tainted
>4.19.0-rc7 #19
>> [  441.662960] Hardware name: System manufacturer System Product
>Name/P6T
>> DELUXE V2, BIOS 120212/22/2010
>> [  441.662960] Call Trace:
>> [  441.662963]  
>> [  441.662968]  dump_stack+0x46/0x5b
>> [  441.662972]  __skb_checksum_complete+0xb0/0xc0
>> [  441.662975]  tcp_v4_rcv+0x528/0xb60
>> [  441.662979]  ? ipt_do_table+0x2d0/0x400
>> [  441.662981]  ip_local_deliver_finish+0x5a/0x110
>> [  441.662983]  ip_local_deliver+0xe1/0xf0
>> [  441.662985]  ? ip_sublist_rcv_finish+0x60/0x60
>> [  441.662986]  ip_rcv+0xca/0xe0
>> [  441.662988]  ? ip_rcv_finish_core.isra.0+0x300/0x300
>> [  441.662990]  __netif_receive_skb_one_core+0x4b/0x70
>> [  441.662993]  netif_receive_skb_internal+0x4e/0x130
>> [  441.662994]  napi_gro_receive+0x6a/0x80
>> [  441.662998]  sky2_poll+0x707/0xd20
>> [  441.663000]  net_rx_action+0x237/0x380
>> [  441.663002]  __do_softirq+0xdc/0x1e0
>> [  441.663005]  irq_exit+0xa9/0xb0
>> [  441.663007]  do_IRQ+0x45/0xc0
>> [  441.

Re: [Bug 201423] New: eth0: hw csum failure

2018-10-15 Thread Stephen Hemminger
On Mon, 15 Oct 2018 08:41:47 -0700
Eric Dumazet  wrote:

> On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger
>  wrote:
> >
> >
> >
> > Begin forwarded message:
> >
> > Date: Sun, 14 Oct 2018 10:42:48 +
> > From: bugzilla-dae...@bugzilla.kernel.org
> > To: step...@networkplumber.org
> > Subject: [Bug 201423] New: eth0: hw csum failure
> >
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=201423
> >
> > Bug ID: 201423
> >Summary: eth0: hw csum failure
> >Product: Networking
> >Version: 2.5
> > Kernel Version: 4.19.0-rc7
> >   Hardware: Intel
> > OS: Linux
> >   Tree: Mainline
> > Status: NEW
> >   Severity: normal
> >   Priority: P1
> >  Component: Other
> >   Assignee: step...@networkplumber.org
> >   Reporter: ross...@inwind.it
> > Regression: No
> >
> > I have a P6T DELUXE V2 motherboard and am using the sky2 driver for the
> > ethernet ports. I get the following error message:
> >
> > [  433.727397] eth0: hw csum failure
> > [  433.727406] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.19.0-rc7 #19
> > [  433.727406] Hardware name: System manufacturer System Product Name/P6T
> > DELUXE V2, BIOS 120212/22/2010
> > [  433.727407] Call Trace:
> > [  433.727409]  
> > [  433.727415]  dump_stack+0x46/0x5b
> > [  433.727419]  __skb_checksum_complete+0xb0/0xc0
> > [  433.727423]  tcp_v4_rcv+0x528/0xb60
> > [  433.727426]  ? ipt_do_table+0x2d0/0x400
> > [  433.727429]  ip_local_deliver_finish+0x5a/0x110
> > [  433.727430]  ip_local_deliver+0xe1/0xf0
> > [  433.727431]  ? ip_sublist_rcv_finish+0x60/0x60
> > [  433.727432]  ip_rcv+0xca/0xe0
> > [  433.727434]  ? ip_rcv_finish_core.isra.0+0x300/0x300
> > [  433.727436]  __netif_receive_skb_one_core+0x4b/0x70
> > [  433.727438]  netif_receive_skb_internal+0x4e/0x130
> > [  433.727439]  napi_gro_receive+0x6a/0x80
> > [  433.727442]  sky2_poll+0x707/0xd20
> > [  433.727446]  ? rcu_check_callbacks+0x1b4/0x900
> > [  433.727447]  net_rx_action+0x237/0x380
> > [  433.727449]  __do_softirq+0xdc/0x1e0
> > [  433.727452]  irq_exit+0xa9/0xb0
> > [  433.727453]  do_IRQ+0x45/0xc0
> > [  433.727455]  common_interrupt+0xf/0xf
> > [  433.727456]  
> > [  433.727459] RIP: 0010:cpuidle_enter_state+0x124/0x200
> > [  433.727461] Code: 53 60 89 c3 e8 dd 90 ad ff 65 8b 3d 96 58 a7 7e e8 d1 
> > 8f
> > ad ff 31 ff 49 89 c4 e8 27 99 ad ff fb 48 ba cf f7 53 e3 a5 9b c4 20 <4c> 
> > 89 e1
> > 4c 29 e9 48 89 c8 48 c1 f9 3f 48 f7 ea b8 ff ff ff 7f 48
> > [  433.727462] RSP: :c90a3e98 EFLAGS: 0282 ORIG_RAX:
> > ffde
> > [  433.727463] RAX: 880237b1f280 RBX: 0004 RCX:
> > 001f
> > [  433.727464] RDX: 20c49ba5e353f7cf RSI: 2fe419c1 RDI:
> > 
> > [  433.727465] RBP: 880237b263a0 R08: 0714 R09:
> > 00650512105d
> > [  433.727465] R10:  R11: 0342 R12:
> > 0064fc2a8b1c
> > [  433.727466] R13: 0064fc25b35f R14: 0004 R15:
> > 8204af20
> > [  433.727468]  ? cpuidle_enter_state+0x119/0x200
> > [  433.727471]  do_idle+0x1bf/0x200
> > [  433.727473]  cpu_startup_entry+0x6a/0x70
> > [  433.727475]  start_secondary+0x17f/0x1c0
> > [  433.727476]  secondary_startup_64+0xa4/0xb0
> > [  441.662954] eth0: hw csum failure
> > [  441.662959] CPU: 4 PID: 4347 Comm: radeon_cs:0 Not tainted 4.19.0-rc7 #19
> > [  441.662960] Hardware name: System manufacturer System Product Name/P6T
> > DELUXE V2, BIOS 120212/22/2010
> > [  441.662960] Call Trace:
> > [  441.662963]  
> > [  441.662968]  dump_stack+0x46/0x5b
> > [  441.662972]  __skb_checksum_complete+0xb0/0xc0
> > [  441.662975]  tcp_v4_rcv+0x528/0xb60
> > [  441.662979]  ? ipt_do_table+0x2d0/0x400
> > [  441.662981]  ip_local_deliver_finish+0x5a/0x110
> > [  441.662983]  ip_local_deliver+0xe1/0xf0
> > [  441.662985]  ? ip_sublist_rcv_finish+0x60/0x60
> > [  441.662986]  ip_rcv+0xca/0xe0
> > [  441.662988]  ? ip_rcv_finish_core.isra.0+0x300/0x300
> > [  441.662990]  __netif_receive_skb_one_core+0x4b/0x70
> > [  441.662993]  netif_receive_skb_internal+0x4e/0x130
> > [  441.662994]  napi_gro_receive+0x6a/0x80
> > [  441.662998]  sky2_poll+0x707/0xd20
> > [  441.663000]  net_rx_action+0x237/0x380
> > [  441.663002]  __do_softirq+0xdc/

Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-15 Thread Dave Stevenson
Hi Eric.

On Mon, 15 Oct 2018 at 16:42, Eric Dumazet  wrote:
>
> On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger
>  wrote:
> >
> >
> >
> > Begin forwarded message:
> >
> > Date: Sun, 14 Oct 2018 10:42:48 +
> > From: bugzilla-dae...@bugzilla.kernel.org
> > To: step...@networkplumber.org
> > Subject: [Bug 201423] New: eth0: hw csum failure
> >
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=201423
> >
> > Bug ID: 201423
> >Summary: eth0: hw csum failure
> >Product: Networking
> >Version: 2.5
> > Kernel Version: 4.19.0-rc7
> >   Hardware: Intel
> > OS: Linux
> >   Tree: Mainline
> > Status: NEW
> >   Severity: normal
> >   Priority: P1
> >  Component: Other
> >   Assignee: step...@networkplumber.org
> >   Reporter: ross...@inwind.it
> > Regression: No
> >
> > I have a P6T DELUXE V2 motherboard and am using the sky2 driver for the
> > ethernet ports. I get the following error message:
> >
> > [  433.727397] eth0: hw csum failure
> > [  433.727406] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.19.0-rc7 #19
> > [  433.727406] Hardware name: System manufacturer System Product Name/P6T
> > DELUXE V2, BIOS 120212/22/2010
> > [  433.727407] Call Trace:
> > [  433.727409]  
> > [  433.727415]  dump_stack+0x46/0x5b
> > [  433.727419]  __skb_checksum_complete+0xb0/0xc0
> > [  433.727423]  tcp_v4_rcv+0x528/0xb60
> > [  433.727426]  ? ipt_do_table+0x2d0/0x400
> > [  433.727429]  ip_local_deliver_finish+0x5a/0x110
> > [  433.727430]  ip_local_deliver+0xe1/0xf0
> > [  433.727431]  ? ip_sublist_rcv_finish+0x60/0x60
> > [  433.727432]  ip_rcv+0xca/0xe0
> > [  433.727434]  ? ip_rcv_finish_core.isra.0+0x300/0x300
> > [  433.727436]  __netif_receive_skb_one_core+0x4b/0x70
> > [  433.727438]  netif_receive_skb_internal+0x4e/0x130
> > [  433.727439]  napi_gro_receive+0x6a/0x80
> > [  433.727442]  sky2_poll+0x707/0xd20
> > [  433.727446]  ? rcu_check_callbacks+0x1b4/0x900
> > [  433.727447]  net_rx_action+0x237/0x380
> > [  433.727449]  __do_softirq+0xdc/0x1e0
> > [  433.727452]  irq_exit+0xa9/0xb0
> > [  433.727453]  do_IRQ+0x45/0xc0
> > [  433.727455]  common_interrupt+0xf/0xf
> > [  433.727456]  
> > [  433.727459] RIP: 0010:cpuidle_enter_state+0x124/0x200
> > [  433.727461] Code: 53 60 89 c3 e8 dd 90 ad ff 65 8b 3d 96 58 a7 7e e8 d1 
> > 8f
> > ad ff 31 ff 49 89 c4 e8 27 99 ad ff fb 48 ba cf f7 53 e3 a5 9b c4 20 <4c> 
> > 89 e1
> > 4c 29 e9 48 89 c8 48 c1 f9 3f 48 f7 ea b8 ff ff ff 7f 48
> > [  433.727462] RSP: :c90a3e98 EFLAGS: 0282 ORIG_RAX:
> > ffde
> > [  433.727463] RAX: 880237b1f280 RBX: 0004 RCX:
> > 001f
> > [  433.727464] RDX: 20c49ba5e353f7cf RSI: 2fe419c1 RDI:
> > 
> > [  433.727465] RBP: 880237b263a0 R08: 0714 R09:
> > 00650512105d
> > [  433.727465] R10:  R11: 0342 R12:
> > 0064fc2a8b1c
> > [  433.727466] R13: 0064fc25b35f R14: 0004 R15:
> > 8204af20
> > [  433.727468]  ? cpuidle_enter_state+0x119/0x200
> > [  433.727471]  do_idle+0x1bf/0x200
> > [  433.727473]  cpu_startup_entry+0x6a/0x70
> > [  433.727475]  start_secondary+0x17f/0x1c0
> > [  433.727476]  secondary_startup_64+0xa4/0xb0
> > [  441.662954] eth0: hw csum failure
> > [  441.662959] CPU: 4 PID: 4347 Comm: radeon_cs:0 Not tainted 4.19.0-rc7 #19
> > [  441.662960] Hardware name: System manufacturer System Product Name/P6T
> > DELUXE V2, BIOS 120212/22/2010
> > [  441.662960] Call Trace:
> > [  441.662963]  
> > [  441.662968]  dump_stack+0x46/0x5b
> > [  441.662972]  __skb_checksum_complete+0xb0/0xc0
> > [  441.662975]  tcp_v4_rcv+0x528/0xb60
> > [  441.662979]  ? ipt_do_table+0x2d0/0x400
> > [  441.662981]  ip_local_deliver_finish+0x5a/0x110
> > [  441.662983]  ip_local_deliver+0xe1/0xf0
> > [  441.662985]  ? ip_sublist_rcv_finish+0x60/0x60
> > [  441.662986]  ip_rcv+0xca/0xe0
> > [  441.662988]  ? ip_rcv_finish_core.isra.0+0x300/0x300
> > [  441.662990]  __netif_receive_skb_one_core+0x4b/0x70
> > [  441.662993]  netif_receive_skb_internal+0x4e/0x130
> > [  441.662994]  napi_gro_receive+0x6a/0x80
> > [  441.662998]  sky2_poll+0x707/0xd20
> > [  441.663000]  net_rx_action+0x237/0x380
> > [  441.663002

Re: Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-15 Thread Eric Dumazet
On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger wrote:
>
>
>
> Begin forwarded message:
>
> Date: Sun, 14 Oct 2018 10:42:48 +
> From: bugzilla-dae...@bugzilla.kernel.org
> To: step...@networkplumber.org
> Subject: [Bug 201423] New: eth0: hw csum failure
>
>
> https://bugzilla.kernel.org/show_bug.cgi?id=201423
>
>             Bug ID: 201423
>            Summary: eth0: hw csum failure
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 4.19.0-rc7
>           Hardware: Intel
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>           Assignee: step...@networkplumber.org
>           Reporter: ross...@inwind.it
>         Regression: No
>
> I have a P6T DELUXE V2 motherboard and am using the sky2 driver for the
> Ethernet ports. I get the following error message:
>
> [  433.727397] eth0: hw csum failure
> [  433.727406] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.19.0-rc7 #19
> [  433.727406] Hardware name: System manufacturer System Product Name/P6T
> DELUXE V2, BIOS 1202 12/22/2010
> [  433.727407] Call Trace:
> [  433.727409]  <IRQ>
> [  433.727415]  dump_stack+0x46/0x5b
> [  433.727419]  __skb_checksum_complete+0xb0/0xc0
> [  433.727423]  tcp_v4_rcv+0x528/0xb60
> [  433.727426]  ? ipt_do_table+0x2d0/0x400
> [  433.727429]  ip_local_deliver_finish+0x5a/0x110
> [  433.727430]  ip_local_deliver+0xe1/0xf0
> [  433.727431]  ? ip_sublist_rcv_finish+0x60/0x60
> [  433.727432]  ip_rcv+0xca/0xe0
> [  433.727434]  ? ip_rcv_finish_core.isra.0+0x300/0x300
> [  433.727436]  __netif_receive_skb_one_core+0x4b/0x70
> [  433.727438]  netif_receive_skb_internal+0x4e/0x130
> [  433.727439]  napi_gro_receive+0x6a/0x80
> [  433.727442]  sky2_poll+0x707/0xd20
> [  433.727446]  ? rcu_check_callbacks+0x1b4/0x900
> [  433.727447]  net_rx_action+0x237/0x380
> [  433.727449]  __do_softirq+0xdc/0x1e0
> [  433.727452]  irq_exit+0xa9/0xb0
> [  433.727453]  do_IRQ+0x45/0xc0
> [  433.727455]  common_interrupt+0xf/0xf
> [  433.727456]  </IRQ>
> [  433.727459] RIP: 0010:cpuidle_enter_state+0x124/0x200
> [  433.727461] Code: 53 60 89 c3 e8 dd 90 ad ff 65 8b 3d 96 58 a7 7e e8 d1 8f
> ad ff 31 ff 49 89 c4 e8 27 99 ad ff fb 48 ba cf f7 53 e3 a5 9b c4 20 <4c> 89 e1
> 4c 29 e9 48 89 c8 48 c1 f9 3f 48 f7 ea b8 ff ff ff 7f 48
> [  433.727462] RSP: :c90a3e98 EFLAGS: 0282 ORIG_RAX:
> ffde
> [  433.727463] RAX: 880237b1f280 RBX: 0004 RCX:
> 001f
> [  433.727464] RDX: 20c49ba5e353f7cf RSI: 2fe419c1 RDI:
> 
> [  433.727465] RBP: 880237b263a0 R08: 0714 R09:
> 00650512105d
> [  433.727465] R10:  R11: 0342 R12:
> 0064fc2a8b1c
> [  433.727466] R13: 0064fc25b35f R14: 0004 R15:
> 8204af20
> [  433.727468]  ? cpuidle_enter_state+0x119/0x200
> [  433.727471]  do_idle+0x1bf/0x200
> [  433.727473]  cpu_startup_entry+0x6a/0x70
> [  433.727475]  start_secondary+0x17f/0x1c0
> [  433.727476]  secondary_startup_64+0xa4/0xb0
> [  441.662954] eth0: hw csum failure
> [  441.662959] CPU: 4 PID: 4347 Comm: radeon_cs:0 Not tainted 4.19.0-rc7 #19
> [  441.662960] Hardware name: System manufacturer System Product Name/P6T
> DELUXE V2, BIOS 1202 12/22/2010
> [  441.662960] Call Trace:
> [  441.662963]  <IRQ>
> [  441.662968]  dump_stack+0x46/0x5b
> [  441.662972]  __skb_checksum_complete+0xb0/0xc0
> [  441.662975]  tcp_v4_rcv+0x528/0xb60
> [  441.662979]  ? ipt_do_table+0x2d0/0x400
> [  441.662981]  ip_local_deliver_finish+0x5a/0x110
> [  441.662983]  ip_local_deliver+0xe1/0xf0
> [  441.662985]  ? ip_sublist_rcv_finish+0x60/0x60
> [  441.662986]  ip_rcv+0xca/0xe0
> [  441.662988]  ? ip_rcv_finish_core.isra.0+0x300/0x300
> [  441.662990]  __netif_receive_skb_one_core+0x4b/0x70
> [  441.662993]  netif_receive_skb_internal+0x4e/0x130
> [  441.662994]  napi_gro_receive+0x6a/0x80
> [  441.662998]  sky2_poll+0x707/0xd20
> [  441.663000]  net_rx_action+0x237/0x380
> [  441.663002]  __do_softirq+0xdc/0x1e0
> [  441.663005]  irq_exit+0xa9/0xb0
> [  441.663007]  do_IRQ+0x45/0xc0
> [  441.663009]  common_interrupt+0xf/0xf
> [  441.663010]  </IRQ>
> [  441.663012] RIP: 0010:merge+0x22/0xb0
> [  441.663014] Code: c3 31 c0 c3 90 90 90 90 41 56 41 55 41 54 55 48 89 d5 53
> 48 89 cb 48 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 <48> 85 c9
> 74 70 48 85 d2 74 6b 49 89 fd 49 89 f6 49 89 e4 eb 14 48
> [  441.663015] RSP: 0018:c990b988 EFLAGS: 0246 ORIG_RAX:
> ffde
> [  441.663017] RAX:  RBX: 88021

Fw: [Bug 201423] New: eth0: hw csum failure

2018-10-15 Thread Stephen Hemminger



Begin forwarded message:

Date: Sun, 14 Oct 2018 10:42:48 +
From: bugzilla-dae...@bugzilla.kernel.org
To: step...@networkplumber.org
Subject: [Bug 201423] New: eth0: hw csum failure


https://bugzilla.kernel.org/show_bug.cgi?id=201423

            Bug ID: 201423
           Summary: eth0: hw csum failure
           Product: Networking
           Version: 2.5
    Kernel Version: 4.19.0-rc7
          Hardware: Intel
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: step...@networkplumber.org
          Reporter: ross...@inwind.it
        Regression: No

I have a P6T DELUXE V2 motherboard and am using the sky2 driver for the
Ethernet ports. I get the following error message:

[  433.727397] eth0: hw csum failure
[  433.727406] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.19.0-rc7 #19
[  433.727406] Hardware name: System manufacturer System Product Name/P6T
DELUXE V2, BIOS 1202 12/22/2010
[  433.727407] Call Trace:
[  433.727409]  <IRQ>
[  433.727415]  dump_stack+0x46/0x5b
[  433.727419]  __skb_checksum_complete+0xb0/0xc0
[  433.727423]  tcp_v4_rcv+0x528/0xb60
[  433.727426]  ? ipt_do_table+0x2d0/0x400
[  433.727429]  ip_local_deliver_finish+0x5a/0x110
[  433.727430]  ip_local_deliver+0xe1/0xf0
[  433.727431]  ? ip_sublist_rcv_finish+0x60/0x60
[  433.727432]  ip_rcv+0xca/0xe0
[  433.727434]  ? ip_rcv_finish_core.isra.0+0x300/0x300
[  433.727436]  __netif_receive_skb_one_core+0x4b/0x70
[  433.727438]  netif_receive_skb_internal+0x4e/0x130
[  433.727439]  napi_gro_receive+0x6a/0x80
[  433.727442]  sky2_poll+0x707/0xd20
[  433.727446]  ? rcu_check_callbacks+0x1b4/0x900
[  433.727447]  net_rx_action+0x237/0x380
[  433.727449]  __do_softirq+0xdc/0x1e0
[  433.727452]  irq_exit+0xa9/0xb0
[  433.727453]  do_IRQ+0x45/0xc0
[  433.727455]  common_interrupt+0xf/0xf
[  433.727456]  </IRQ>
[  433.727459] RIP: 0010:cpuidle_enter_state+0x124/0x200
[  433.727461] Code: 53 60 89 c3 e8 dd 90 ad ff 65 8b 3d 96 58 a7 7e e8 d1 8f
ad ff 31 ff 49 89 c4 e8 27 99 ad ff fb 48 ba cf f7 53 e3 a5 9b c4 20 <4c> 89 e1
4c 29 e9 48 89 c8 48 c1 f9 3f 48 f7 ea b8 ff ff ff 7f 48
[  433.727462] RSP: :c90a3e98 EFLAGS: 0282 ORIG_RAX:
ffde
[  433.727463] RAX: 880237b1f280 RBX: 0004 RCX:
001f
[  433.727464] RDX: 20c49ba5e353f7cf RSI: 2fe419c1 RDI:

[  433.727465] RBP: 880237b263a0 R08: 0714 R09:
00650512105d
[  433.727465] R10:  R11: 0342 R12:
0064fc2a8b1c
[  433.727466] R13: 0064fc25b35f R14: 0004 R15:
8204af20
[  433.727468]  ? cpuidle_enter_state+0x119/0x200
[  433.727471]  do_idle+0x1bf/0x200
[  433.727473]  cpu_startup_entry+0x6a/0x70
[  433.727475]  start_secondary+0x17f/0x1c0
[  433.727476]  secondary_startup_64+0xa4/0xb0
[  441.662954] eth0: hw csum failure
[  441.662959] CPU: 4 PID: 4347 Comm: radeon_cs:0 Not tainted 4.19.0-rc7 #19
[  441.662960] Hardware name: System manufacturer System Product Name/P6T
DELUXE V2, BIOS 1202 12/22/2010
[  441.662960] Call Trace:
[  441.662963]  <IRQ>
[  441.662968]  dump_stack+0x46/0x5b
[  441.662972]  __skb_checksum_complete+0xb0/0xc0
[  441.662975]  tcp_v4_rcv+0x528/0xb60
[  441.662979]  ? ipt_do_table+0x2d0/0x400
[  441.662981]  ip_local_deliver_finish+0x5a/0x110
[  441.662983]  ip_local_deliver+0xe1/0xf0
[  441.662985]  ? ip_sublist_rcv_finish+0x60/0x60
[  441.662986]  ip_rcv+0xca/0xe0
[  441.662988]  ? ip_rcv_finish_core.isra.0+0x300/0x300
[  441.662990]  __netif_receive_skb_one_core+0x4b/0x70
[  441.662993]  netif_receive_skb_internal+0x4e/0x130
[  441.662994]  napi_gro_receive+0x6a/0x80
[  441.662998]  sky2_poll+0x707/0xd20
[  441.663000]  net_rx_action+0x237/0x380
[  441.663002]  __do_softirq+0xdc/0x1e0
[  441.663005]  irq_exit+0xa9/0xb0
[  441.663007]  do_IRQ+0x45/0xc0
[  441.663009]  common_interrupt+0xf/0xf
[  441.663010]  </IRQ>
[  441.663012] RIP: 0010:merge+0x22/0xb0
[  441.663014] Code: c3 31 c0 c3 90 90 90 90 41 56 41 55 41 54 55 48 89 d5 53
48 89 cb 48 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 <48> 85 c9
74 70 48 85 d2 74 6b 49 89 fd 49 89 f6 49 89 e4 eb 14 48
[  441.663015] RSP: 0018:c990b988 EFLAGS: 0246 ORIG_RAX:
ffde
[  441.663017] RAX:  RBX: 88021ab2d408 RCX:
88021ab2d408
[  441.663018] RDX: 88021ab2d388 RSI: a021c440 RDI:

[  441.663019] RBP: 88021ab2d388 R08: 5ecf R09:
8500
[  441.663020] R10: ea000877ec00 R11: 880236803500 R12:
a021c440
[  441.663021] R13: 88021ab2d448 R14: 0004 R15:
c990b9e0
[  441.663048]  ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
[  441.663063]  ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
[  441.663065]  ? merge+0x57/0xb0
[  441.663080]  ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
[  441.663082