Hi Matthew,
Sorry for the delay in responding.

I tried your kernel, but something seems to be missing from the kernel
you provided above: I still see the Call Trace after running traffic
with padding. Can you please check whether the kernel contains the needed
patches? I tried to check the changelog, but it does not seem to have been
updated, because I couldn't find the patches in it.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1854842

Title:
  mlx5_core reports hardware checksum error for padded packets on
  Mellanox NICs

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Committed

Bug description:
  BugLink: https://bugs.launchpad.net/bugs/1854842

  [Impact]

  On machines equipped with Mellanox NICs (in this particular case,
  Mellanox 5 series NICs using the mlx5_core driver), there is a kernel
  splat when sending large IP packets which have padding at the end.

  enp6s0f0: hw csum failure
  CPU: 19 PID: 0 Comm: swapper/19 Not tainted 4.15.0-72-generic
  Call Trace:
  <IRQ>
  dump_stack+0x63/0x8e
  netdev_rx_csum_fault+0x38/0x40
  __skb_checksum_complete+0xbc/0xd0
  nf_ip_checksum+0xc3/0xf0
  icmp_error+0x27d/0x310 [nf_conntrack_ipv4]
  nf_conntrack_in+0x15a/0x510 [nf_conntrack]
  ? __skb_checksum+0x68/0x330
  ipv4_conntrack_in+0x1c/0x20 [nf_conntrack_ipv4]
  nf_hook_slow+0x48/0xc0
  ? skb_send_sock+0x50/0x50
  ip_rcv+0x301/0x360
  ? inet_del_offload+0x40/0x40
  __netif_receive_skb_core+0x432/0xb80
  __netif_receive_skb+0x18/0x60
  ? __netif_receive_skb+0x18/0x60
  netif_receive_skb_internal+0x45/0xe0
  napi_gro_receive+0xc5/0xf0
  mlx5e_handle_rx_cqe+0x48d/0x5e0 [mlx5_core]
  ? enqueue_task_rt+0x1b4/0x2e0
  mlx5e_poll_rx_cq+0xd1/0x8c0 [mlx5_core]
  mlx5e_napi_poll+0x9d/0x290 [mlx5_core]
  net_rx_action+0x140/0x3a0
  __do_softirq+0xe4/0x2d4
  irq_exit+0xc5/0xd0
  do_IRQ+0x86/0xe0
  common_interrupt+0x8c/0x8c
  </IRQ>

  This bug is a further attempt to fix these splats, as there have been
  previous fixes in LP #1840854 and a series of commits which landed in
  4.15.0-67 (LP #1847155) as part of upstream -stable patches.

  This bug will also fix the same problems on the new Mellanox CX6 and
  Bluefield hardware, which has been enabled already via previous
  upstream -stable patches which landed in LP #1847155.

  [Fix]

  This particular issue was fixed for Mellanox series 5 drivers in the
  following commits:

  commit 0aa1d18615c163f92935b806dcaff9157645233a
  Author: Saeed Mahameed <sae...@mellanox.com>
  Date:   Tue Mar 12 00:24:52 2019 -0700
  Subject: net/mlx5e: Rx, Fixup skb checksum for packets with tail padding

  This commit required a minor backport.

  This commit was selected for upstream -stable in 4.19.76 and 5.0.10.
  This commit appears to be omitted from "Bionic update: upstream stable
  patchset 2019-10-07", which is LP #1847155, probably due to requiring a
  backport.

  commit db849faa9bef993a1379dc510623f750a72fa7ce
  Author: Saeed Mahameed <sae...@mellanox.com>
  Date:   Fri May 3 13:14:59 2019 -0700
  Subject: net/mlx5e: Rx, Fix checksum calculation for new hardware

  This commit required a minor backport.

  This commit was selected for upstream -stable in 5.1.21 and 5.2.4.
  This commit has already been applied to the disco kernel, as part of
  stable updates.
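
  As background on why tail padding matters for checksums, the sketch
  below shows the 16-bit one's-complement sum (RFC 1071) over a packet
  with and without padding. It is only an illustration of the arithmetic
  and not the mlx5_core driver logic; the payload bytes and the padding
  length are made-up values.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum as used by the Internet checksum."""
    if len(data) % 2:
        data += b"\x00"  # odd-length input is extended with one zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


# Hypothetical stand-in for an IP packet (even length, arbitrary bytes).
payload = bytes(range(20))

base = ones_complement_sum(payload)
with_zero_pad = ones_complement_sum(payload + b"\x00" * 8)
with_fe_pad = ones_complement_sum(payload + b"\xfe" * 8)

print(base == with_zero_pad)  # zero padding leaves the sum unchanged
print(base == with_fe_pad)    # non-zero padding changes the sum
```

  Zero padding is invisible to the sum, while non-zero padding such as
  the 0xfe bytes used in the test case is not, so a checksum that covers
  the padding has to be corrected before it is compared against one that
  does not.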

  [Testcase]

  The following scapy script will reproduce this issue. Run from the
  machine with the Mellanox series 5 NIC:

  1) Build the padded ICMP packet:

a=Ether(dst='ff:ff:ff:ff:ff:ff')/IP(dst='127.0.0.1')/ICMP()/Padding(load='\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe')

  2) sendp(a, iface='enp6s0f0')

  3) Check dmesg on the receiver side. The example uses localhost, so
  check dmesg on the same machine.
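
  Checking dmesg in step 3 can be scripted. The following is a minimal
  sketch, assuming the splat always starts with a line of the form
  "<iface>: hw csum failure" as in the trace above. The function name and
  the timestamps in the sample text are made up for illustration.

```python
import re

# Matches lines such as "enp6s0f0: hw csum failure" and captures the interface.
CSUM_FAULT = re.compile(r"(\S+): hw csum failure")


def find_csum_faults(log_text):
    """Return the interface names that report 'hw csum failure' in log_text."""
    return [match.group(1) for match in CSUM_FAULT.finditer(log_text)]


# Sample kernel log text; the timestamps are hypothetical.
sample = (
    "[  123.456789] enp6s0f0: hw csum failure\n"
    "[  123.456790] CPU: 19 PID: 0 Comm: swapper/19 Not tainted 4.15.0-72-generic\n"
)

print(find_csum_faults(sample))
```

  On a real machine the text could come from
  subprocess.run(["dmesg"], capture_output=True, text=True).stdout,
  which may require root privileges depending on the system
  configuration. An empty result after sending the padded packet
  suggests the kernel under test contains the fix.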

  I have built some test kernels, which are available here:

  https://launchpad.net/~mruffell/+archive/ubuntu/lp1854842-test
  This kernel contains 0aa1d18615c163f92935b806dcaff9157645233a.

  and

  https://launchpad.net/~mruffell/+archive/ubuntu/lp1854842-test-2
  This kernel contains db849faa9bef993a1379dc510623f750a72fa7ce.

  If you install the test kernels the issue is resolved.

  [Regression Potential]

  The changes are limited to the mlx5_core driver, and only modify how
  packet checksums are calculated when padding is involved.

  Both patches have been accepted and published by upstream -stable, and
  are widely accepted by the community.

  Because of this, I believe the risk of regression is low.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1854842/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
