[Bug 1909062] Re: qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP tx csum offload

2022-01-26 Thread Brian Murray
The Hirsute Hippo has reached End of Life, so this bug will not be fixed
for that release.

** Changed in: linux (Ubuntu Hirsute)
   Status: Fix Committed => Won't Fix

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1909062

Title:
  qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not
  supporting IPIP tx csum offload

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1909062/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1909062] Re: qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP tx csum offload

2021-02-23 Thread Launchpad Bug Tracker
This bug was fixed in the package linux - 5.4.0-66.74

---
linux (5.4.0-66.74) focal; urgency=medium

  * focal/linux: 5.4.0-66.74 -proposed tracker (LP: #1913152)

  * Add support for selective build of special drivers (LP: #1912789)
- [Packaging] Add support for ODM drivers
- [Packaging] Turn on ODM support for amd64

  * Packaging resync (LP: #1786013)
- update dkms package versions
- update dkms package versions

  * Introduce the new NVIDIA 460-server series and update the 460 series
(LP: #1913200)
- [Config] dkms-versions -- drop NVIDIA 435 455 and 440-server
- [Config] dkms-versions -- add the 460-server nvidia driver

  * Enable mute and micmute LED on HP EliteBook 850 G7 (LP: #1910102)
- ALSA: hda/realtek: Enable mute and micmute LED on HP EliteBook 850 G7

  * SYNA30B4:00 06CB:CE09 Mouse  on HP EliteBook 850 G7 not working at all
(LP: #1908992)
- HID: multitouch: Enable multi-input for Synaptics pointstick/touchpad 
device

  * HD Audio Device PCI ID for the Intel Cometlake-R platform (LP: #1912427)
- SAUCE: ALSA: hda: Add Cometlake-R PCI ID

  * switch to an autogenerated nvidia series based core via dkms-versions
(LP: #1912803)
- [Packaging] nvidia -- use dkms-versions to define versions built
- [Packaging] update-version-dkms -- maintain flags fields
- [Config] dkms-versions -- add transitional/skip information for nvidia
  packages

  * udpgro.sh in net from ubuntu_kernel_selftests seems not reflecting sub-test
result (LP: #1908499)
- selftests: fix the return value for UDP GRO test

  * qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP
tx csum offload (LP: #1909062)
- qede: fix offload for IPIP tunnel packets

  * Use DCPD to control HP DreamColor panel (LP: #1911001)
- SAUCE: drm/dp: Another HP DreamColor panel brigntness fix

  * kvm: Windows 2k19 with Hyper-v role gets stuck on pending hypervisor
requests on cascadelake based kvm hosts (LP: #1911848)
- KVM: x86: Set KVM_REQ_EVENT if run is canceled with req_immediate_exit set

  * Ubuntu 20.10 four needed fixes to 'Add driver for Mellanox Connect-IB
adapters' (LP: #1905574)
- net/mlx5: Fix a race when moving command interface to polling mode

  * Fix right sounds and mute/micmute LEDs for HP ZBook Fury 15/17 G7 Mobile
Workstation (LP: #1910561)
- ALSA: hda/realtek: fix right sounds and mute/micmute LEDs for HP machines

  * Ubuntu 20.04 - multicast counter is not increased in ip -s (LP: #1901842)
- net/mlx5e: Fix multicast counter not up-to-date in "ip -s"

  * eeh-basic.sh in powerpc from ubuntu_kernel_selftests timeout with 5.4 P8 /
P9 (LP: #1882503)
- selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic

  * DMI entry syntax fix for Pegatron / ByteSpeed C15B (LP: #1910639)
- Input: i8042 - unbreak Pegatron C15B

  * CVE-2020-29372
- mm: check that mm is still valid in madvise()

  * update ENA driver, incl. new ethtool stats (LP: #1910291)
- net: ena: Change WARN_ON expression in ena_del_napi_in_range()
- net: ena: ethtool: convert stat_offset to 64 bit resolution
- net: ena: ethtool: Add new device statistics
- net: ena: ethtool: add stats printing to XDP queues
- net: ena: xdp: add queue counters for xdp actions
- net: ena: Change license into format to SPDX in all files
- net: ena: Change log message to netif/dev function
- net: ena: Capitalize all log strings and improve code readability
- net: ena: Remove redundant print of placement policy
- net: ena: Change RSS related macros and variables names
- net: ena: Fix all static chekers' warnings
- drivers/net/ethernet: remove incorrectly formatted doc
- net: ena: handle bad request id in ena_netdev
- net: ena: fix packet's addresses for rx_offset feature

  * s390x broken with unknown syscall number on kernels < 5.8 (LP: #1895132)
- s390/ptrace: return -ENOSYS when invalid syscall is supplied

  * Focal update: v5.4.86 upstream stable release (LP: #1910822)
- ARM: dts: sun7i: bananapi: Enable RGMII RX/TX delay on Ethernet PHY
- ARM: dts: sun8i: r40: bananapi-m2-berry: Fix dcdc1 regulator
- ARM: dts: sun8i: v40: bananapi-m2-berry: Fix ethernet node
- pinctrl: merrifield: Set default bias in case no particular value given
- pinctrl: baytrail: Avoid clearing debounce value when turning it off
- ARM: dts: sun8i: v3s: fix GIC node memory range
- ARM: dts: sun7i: pcduino3-nano: enable RGMII RX/TX delay on PHY
- ARM: dts: imx6qdl-wandboard-revd1: Remove PAD_GPIO_6 from enetgrp
- ARM: dts: imx6qdl-kontron-samx6i: fix I2C_PM scl pin
- PM: runtime: Add pm_runtime_resume_and_get to deal with usage counter
- gpio: zynq: fix reference leak in zynq_gpio functions
- gpio: mvebu: fix potential user-after-free on probe
- scsi: bnx2i: Requires MMU
- xsk: Fix xsk_poll()'s return type
- xsk: Replace 

[Bug 1909062] Re: qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP tx csum offload

2021-02-10 Thread Matthew Ruffell
Performing verification for Focal.

The affected user enabled -proposed and installed 5.4.0-66-generic on a
system with a QLogic FastLinQ QL41000 Series 10/25/40/50GbE
Controller.

They then set all interfaces down, and brought up the QLogic NIC only.

# uname -rv
5.4.0-66-generic #74~18.04.2-Ubuntu SMP Fri Feb 5 11:17:31 UTC 2021

# nslookup internal.kubernetes.domain.example 10.1.0.10
Server: 10.1.0.10
Address: 10.1.0.10#53
Name: internal.kubernetes.domain.example
Address: 10.48.24.11

# ethtool -k eno1 | grep tx-checksumming
tx-checksumming: on
# ethtool -k enp94s0f0 | grep tx-checksumming
tx-checksumming: on

DNS lookups to an internal Kubernetes domain, which travel as IPIP
tunnelled packets, work as intended with tx checksumming enabled.

The kernel in -proposed fixes the issue, marking as verified.
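
For anyone repeating this verification, the tx-checksumming check above
can be scripted. The following is only a minimal sketch: the helper name
tx_csum_state is hypothetical and not part of this report, and it assumes
the usual "feature: value" layout of `ethtool -k` output.

```shell
#!/bin/sh
# Hypothetical helper (not part of the bug report): read `ethtool -k` output
# on stdin and print the value reported for the tx-checksumming feature.
tx_csum_state() {
    # Split each line on ": " and match the top-level feature name exactly,
    # so indented sub-features such as tx-checksum-ipv4 are ignored.
    awk -F': *' '$1 == "tx-checksumming" { print $2; exit }'
}

# On a live system (interface name taken from the verification above):
#   ethtool -k enp94s0f0 | tx_csum_state
# Sample input, for illustration only:
printf 'rx-checksumming: on\ntx-checksumming: on\n' | tx_csum_state
```

With the fixed kernel the helper should report "on" while internal DNS
lookups still succeed, which matches the result described above.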

** Tags removed: verification-needed-focal
** Tags added: verification-done-focal


[Bug 1909062] Re: qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP tx csum offload

2021-02-05 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag 'verification-needed-
focal' to 'verification-done-focal'. If the problem still exists, change
the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-focal


[Bug 1909062] Re: qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP tx csum offload

2021-02-02 Thread Matthew Ruffell
Hi Manish,

Thanks for testing the Groovy 5.8 kernel in -proposed!

The Focal 5.4 / Bionic 5.4 HWE kernel will be in -proposed sometime
next week, if you are looking to verify that as well.

I will also ask our customer in common to verify the Bionic HWE kernel
when it becomes available.

Thanks for making the patch!
Matthew


[Bug 1909062] Re: qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP tx csum offload

2021-02-02 Thread Manish Chopra
Hi Matthew,

We have verified the fix with the -proposed kernel.
I hope that I have corrected the "tags" appropriately.

Thanks,
Manish

** Tags removed: verification-needed-groovy
** Tags added: verification-done-groovy


[Bug 1909062] Re: qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP tx csum offload

2021-01-28 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag 'verification-needed-
groovy' to 'verification-done-groovy'. If the problem still exists,
change the tag 'verification-needed-groovy' to 'verification-failed-
groovy'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-groovy


[Bug 1909062] Re: qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP tx csum offload

2021-01-22 Thread Kelsey Skunberg
** Changed in: linux (Ubuntu Groovy)
   Status: In Progress => Fix Committed

** Changed in: linux (Ubuntu Focal)
   Status: In Progress => Fix Committed

** Changed in: linux (Ubuntu Hirsute)
   Status: In Progress => Fix Committed


[Bug 1909062] Re: qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP tx csum offload

2021-01-14 Thread Matthew Ruffell
** Description changed:

  BugLink: https://bugs.launchpad.net/bugs/1909062
  
  [Impact]
  
  For users with QLogic QL41xxx series NICs, such as the FastLinQ QL41000
  Series 10/25/40/50GbE Controller, when they upgrade from the 4.15 kernel
  to the 5.4 kernel, Kubernetes Internal DNS requests will fail, due to
  these packets getting corrupted.
  
  Kubernetes uses IPIP tunnelled packets for internal DNS resolution, and
  this particular packet type is not supported for hardware tx checksum
  offload, and the packets end up corrupted when the qede driver attempts
  to checksum them.
  
  This only affects internal Kubernetes DNS, as regular DNS lookups to
  regular external domains will succeed, due to them not using IPIP packet
  types.
  
  [Fix]
  
  Marvell has developed a fix for the qede driver, which checks the packet
  type, and if it is IPPROTO_IPIP, then csum offloads are disabled for
  socket buffers of type IPIP.
  
  commit 5d5647dad259bb416fd5d3d87012760386d97530
  Author: Manish Chopra 
  Date: Mon Dec 21 06:55:30 2020 -0800
  Subject: qede: fix offload for IPIP tunnel packets
- Link: https://github.com/torvalds/linux/commit/5d5647dad259bb416fd5d3d87012760386d97530
+ Link: https://github.com/torvalds/linux/commit/5d5647dad259bb416fd5d3d87012760386d97530
  
- This commit landed in mainline in 5.11-rc3. The commit is queued for
- upstream stable.
+ This commit landed in mainline in 5.11-rc3. The commit was accepted into
+ update stable 4.14.215, 4.19.167, 5.4.89 and 5.10.7.
+ 
+ Note, this SRU isn't targeted for Bionic due to tx csum offload support
+ only landing in 5.0 and onward, meaning the 4.15 kernel still works even
+ without this patch. Because of this, Bionic can pick the patch up
+ naturally from upstream stable.
  
  [Testcase]
  
  The system must have a QLogic QL41xxx series NIC fitted, and needs to be
  a part of a Kubernetes cluster.
  
  Firstly, get a list of all devices in the system:
  
  $ sudo ifconfig
  
  Next, set all devices down with:
  
  $ sudo ifconfig  down
  
  Next, bring up the QLogic QL41xxx device:
  
  $ sudo ifconfig  up
  
  Then, attempt to lookup an internal Kubernetes domain:
  
  $ nslookup 
  
  Without the patch, the connection will time out:
  
  ;; connection timed out; no servers could be reached
  
  If we look at packet traces with tcpdump, we see it leaves the source,
  but never arrives at the destination.
  
  There is a test kernel available in the following ppa:
  
  https://launchpad.net/~mruffell/+archive/ubuntu/sf297772-test
  
  If you install it, then Kubernetes internal DNS lookups will succeed.
  
  [Where problems could occur]
  
  If a regression were to occur, then users of the qede driver would be
  affected. This is limited to those with QLogic QL41xxx series NICs. The
  patch explicitly checks for IPIP type packets, so only those particular
  packets would be affected.
  
  Since IPIP type packets are uncommon, it would not cause a total outage
  on regression, since most packets are not IPIP tunnelled. It could
  potentially cause problems for users who frequently handle VPN or
  Kubernetes internal DNS traffic.
  
  A workaround would be to use ethtool to disable tx csum offload for all
  packet types, or to revert to an older kernel.
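
The test case above mentions looking at packet traces with tcpdump. One
way to narrow such a capture to the affected traffic is to filter on IP
protocol number 4, which is the value of IPPROTO_IPIP. This is only a
sketch: the helper name ipip_capture_command is hypothetical, the
interface name is borrowed from the later verification comment, and the
command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: print a tcpdump command that captures only IPIP
# traffic (outer IP header with protocol number 4) on the given interface.
ipip_capture_command() {
    iface="$1"
    # -n disables name resolution, which matters while DNS is the thing
    # under test; -i selects the capture interface.
    echo "sudo tcpdump -ni $iface ip proto 4"
}

ipip_capture_command enp94s0f0
```

Running the printed command on both the sending and the receiving node
should show whether the tunnelled DNS query leaves one side and reaches
the other.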

** Also affects: linux (Ubuntu Hirsute)
   Importance: Undecided
   Status: Confirmed

** Changed in: linux (Ubuntu Hirsute)
   Status: Confirmed => In Progress

** Changed in: linux (Ubuntu Hirsute)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu Hirsute)
 Assignee: (unassigned) => Matthew Ruffell (mruffell)

** Description changed:

  BugLink: https://bugs.launchpad.net/bugs/1909062
  
  [Impact]
  
  For users with QLogic QL41xxx series NICs, such as the FastLinQ QL41000
  Series 10/25/40/50GbE Controller, when they upgrade from the 4.15 kernel
  to the 5.4 kernel, Kubernetes Internal DNS requests will fail, due to
  these packets getting corrupted.
  
  Kubernetes uses IPIP tunnelled packets for internal DNS resolution, and
  this particular packet type is not supported for hardware tx checksum
  offload, and the packets end up corrupted when the qede driver attempts
  to checksum them.
  
  This only affects internal Kubernetes DNS, as regular DNS lookups to
  regular external domains will succeed, due to them not using IPIP packet
  types.
  
  [Fix]
  
  Marvell has developed a fix for the qede driver, which checks the packet
  type, and if it is IPPROTO_IPIP, then csum offloads are disabled for
  socket buffers of type IPIP.
  
  commit 5d5647dad259bb416fd5d3d87012760386d97530
  Author: Manish Chopra 
  Date: Mon Dec 21 06:55:30 2020 -0800
  Subject: qede: fix offload for IPIP tunnel packets
  Link: https://github.com/torvalds/linux/commit/5d5647dad259bb416fd5d3d87012760386d97530
  
  This commit landed in mainline in 5.11-rc3. The commit was accepted into
- update stable 4.14.215, 4.19.167, 5.4.89 and 5.10.7.
+ upstream stable 4.14.215, 4.19.167, 

[Bug 1909062] Re: qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP tx csum offload

2021-01-10 Thread Matthew Ruffell
** Description changed:

  BugLink: https://bugs.launchpad.net/bugs/1909062
  
  [Impact]
  
  For users with QLogic QL41xxx series NICs, such as the FastLinQ QL41000
  Series 10/25/40/50GbE Controller, when they upgrade from the 4.15 kernel
  to the 5.4 kernel, Kubernetes Internal DNS requests will fail, due to
  these packets getting corrupted.
  
  Kubernetes uses IPIP tunnelled packets for internal DNS resolution, and
  this particular packet type is not supported for hardware tx checksum
  offload, and the packets end up corrupted when the qede driver attempts
  to checksum them.
  
  This only affects internal Kubernetes DNS, as regular DNS lookups to
  regular external domains will succeed, due to them not using IPIP packet
  types.
  
  [Fix]
  
  Marvell has developed a fix for the qede driver, which checks the packet
  type, and if it is IPPROTO_IPIP, then csum offloads are disabled for
  socket buffers of type IPIP.
  
  commit 5d5647dad259bb416fd5d3d87012760386d97530
  Author: Manish Chopra 
  Date: Mon Dec 21 06:55:30 2020 -0800
  Subject: qede: fix offload for IPIP tunnel packets
- Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=5d5647dad259bb416fd5d3d87012760386d97530
+ Link: https://github.com/torvalds/linux/commit/5d5647dad259bb416fd5d3d87012760386d97530
  
- This commit is currently in the netdev tree, awaiting merge to mainline.
- The commit is queued for upstream stable.
+ This commit landed in mainline in 5.11-rc3. The commit is queued for
+ upstream stable.
  
  [Testcase]
  
  The system must have a QLogic QL41xxx series NIC fitted, and needs to be
  a part of a Kubernetes cluster.
  
  Firstly, get a list of all devices in the system:
  
  $ sudo ifconfig
  
  Next, set all devices down with:
  
  $ sudo ifconfig  down
  
  Next, bring up the QLogic QL41xxx device:
  
  $ sudo ifconfig  up
  
  Then, attempt to lookup an internal Kubernetes domain:
  
  $ nslookup 
  
  Without the patch, the connection will time out:
  
  ;; connection timed out; no servers could be reached
  
  If we look at packet traces with tcpdump, we see it leaves the source,
  but never arrives at the destination.
  
  There is a test kernel available in the following ppa:
  
  https://launchpad.net/~mruffell/+archive/ubuntu/sf297772-test
  
  If you install it, then Kubernetes internal DNS lookups will succeed.
  
  [Where problems could occur]
  
  If a regression were to occur, then users of the qede driver would be
  affected. This is limited to those with QLogic QL41xxx series NICs. The
  patch explicitly checks for IPIP type packets, so only those particular
  packets would be affected.
  
  Since IPIP type packets are uncommon, it would not cause a total outage
  on regression, since most packets are not IPIP tunnelled. It could
  potentially cause problems for users who frequently handle VPN or
  Kubernetes internal DNS traffic.
  
  A workaround would be to use ethtool to disable tx csum offload for all
  packet types, or to revert to an older kernel.
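
The ethtool workaround mentioned above could look roughly like the
following. This is a sketch under assumptions: the helper name
disable_tx_csum and the DRY_RUN variable are hypothetical, and the
interface name comes from the later verification comment. It uses the
standard `ethtool -K <interface> tx off` form, which turns tx checksum
offload off for all packet types on that interface.

```shell
#!/bin/sh
# Sketch of the workaround: turn off tx checksum offload on one interface.
# DRY_RUN defaults to 1, so the command is only printed, not executed.
disable_tx_csum() {
    iface="$1"
    if [ -z "$iface" ]; then
        echo "usage: disable_tx_csum <interface>" >&2
        return 1
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "ethtool -K $iface tx off"
    else
        # Needs root privileges; the setting does not persist across reboots.
        ethtool -K "$iface" tx off
    fi
}

disable_tx_csum enp94s0f0
```

Setting DRY_RUN=0 runs the command for real. Because the change is lost
on reboot, it would have to be reapplied at boot until a fixed kernel is
installed.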


[Bug 1909062] Re: qede: Kubernetes Internal DNS Failure due to QL41xxx NIC not supporting IPIP tx csum offload

2021-01-02 Thread Matthew Ruffell
Thanks Manish! I am building a test kernel now, and I will let you know
once it is ready to test.

If we get good test results, I will submit the patch for SRU to the
Ubuntu kernels once the patch has hit mainline.

** Description changed:

- With QL41xxx and Ubuntu DNS server DNS failures are seen when updated to
- the latest Ubuntu kernel 20.04.1 LTS version 5.4.0-52-generic. Issue was
- not observed with 4.5 ubuntu-linux.
+ BugLink: https://bugs.launchpad.net/bugs/1909062
  
- Problem Definition:
- OS Version: /etc/os-release shows Ubuntu 18.04.4 LTS, but Booted kernel is the latest Ubuntu 20.04.1 LTS version 5.4.0-52-generic
- NIC: 2 dual-port (4) QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller [1077:8070] (rev 02)
- Inbox driver qede v8.37.0.20
+ [Impact]
  
- Complete Detailed Problem Description:
- Anything that uses the internal Kubernetes DNS server fails. If an external DNS server is used resolution works for non-Kubernetes IPs.
+ For users with QLogic QL41xxx series NICs, such as the FastLinQ QL41000
+ Series 10/25/40/50GbE Controller, when they upgrade from the 4.15 kernel
+ to the 5.4 kernel, Kubernetes Internal DNS requests will fail, due to
+ these packets getting corrupted.
  
- Similar issue is described in this article.
- https://github.com/kubernetes/kubernetes/issues/95365
+ Kubernetes uses IPIP tunnelled packets for internal DNS resolution, and
+ this particular packet type is not supported for hardware tx checksum
+ offload, and the packets end up corrupted when the qede driver attempts
+ to checksum them.
  
- Below patch recently on upstream fixes this -
- [Note that issue was introduced by driver's tunnel offload support which was added in after 4.5 kernel]
+ This only affects internal Kubernetes DNS, as regular DNS lookups to
+ regular external domains will succeed, due to them not using IPIP packet
+ types.
+ 
+ [Fix]
+ 
+ Marvell has developed a fix for the qede driver, which checks the packet
+ type, and if it is IPPROTO_IPIP, then csum offloads are disabled for
+ socket buffers of type IPIP.
  
  commit 5d5647dad259bb416fd5d3d87012760386d97530
  Author: Manish Chopra 
- Date:   Mon Dec 21 06:55:30 2020 -0800
+ Date: Mon Dec 21 06:55:30 2020 -0800
+ Subject: qede: fix offload for IPIP tunnel packets
+ Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=5d5647dad259bb416fd5d3d87012760386d97530
  
- qede: fix offload for IPIP tunnel packets
+ This commit is currently in the netdev tree, awaiting merge to mainline.
+ The commit is queued for upstream stable.
  
- IPIP tunnels packets are unknown to device,
- hence these packets are incorrectly parsed and
- caused the packet corruption, so disable offlods
- for such packets at run time.
+ [Testcase]
  
- Signed-off-by: Manish Chopra 
- Signed-off-by: Sudarsana Kalluru 
- Signed-off-by: Igor Russkikh 
- Link: https://lore.kernel.org/r/20201221145530.7771-1-mani...@marvell.com
- Signed-off-by: Jakub Kicinski 
+ The system must have a QLogic QL41xxx series NIC fitted, and needs to be
+ a part of a Kubernetes cluster.
  
- Thanks,
- Manish
+ Firstly, get a list of all devices in the system:
+ 
+ $ sudo ifconfig
+ 
+ Next, set all devices down with:
+ 
+ $ sudo ifconfig  down
+ 
+ Next, bring up the QLogic QL41xxx device:
+ 
+ $ sudo ifconfig  up
+ 
+ Then, attempt to lookup an internal Kubernetes domain:
+ 
+ $ nslookup 
+ 
+ Without the patch, the connection will time out:
+ 
+ ;; connection timed out; no servers could be reached
+ 
+ If we look at packet traces with tcpdump, we see it leaves the source,
+ but never arrives at the destination.
+ 
+ There is a test kernel available in the following ppa:
+ 
+ https://launchpad.net/~mruffell/+archive/ubuntu/sf297772-test
+ 
+ If you install it, then Kubernetes internal DNS lookups will succeed.
+ 
+ [Where problems could occur]
+ 
+ If a regression were to occur, then users of the qede driver would be
+ affected. This is limited to those with QLogic QL41xxx series NICs. The
+ patch explicitly checks for IPIP type packets, so only those particular
+ packets would be affected.
+ 
+ Since IPIP type packets are uncommon, it would not cause a total outage
+ on regression, since most packets are not IPIP tunnelled. It could
+ potentially cause problems for users who frequently handle VPN or
+ Kubernetes internal DNS traffic.
+ 
+ A workaround would be to use ethtool to disable tx csum offload for all
+ packet types, or to revert to an older kernel.
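
The test case steps above can be strung together as a small script. This
is only a sketch: the function name testcase_commands is hypothetical,
and the interface names, domain and DNS server address are the ones from
the later Focal verification comment, used here as placeholders. The
commands are printed rather than executed, since taking every other
interface down will cut off a remote session that runs over one of them.

```shell
#!/bin/sh
# Print the commands for the test case: take every listed interface down
# except the QLogic one, bring that one up, then query the cluster DNS.
# Arguments: <qlogic-iface> <domain> <dns-server> <iface>...
testcase_commands() {
    qlogic="$1"; domain="$2"; server="$3"; shift 3
    for dev in "$@"; do
        # Skip the QLogic interface; every other device is set down.
        [ "$dev" = "$qlogic" ] || echo "sudo ifconfig $dev down"
    done
    echo "sudo ifconfig $qlogic up"
    echo "nslookup $domain $server"
}

testcase_commands enp94s0f0 internal.kubernetes.domain.example 10.1.0.10 \
    eno1 enp94s0f0
```

On an affected kernel the final nslookup is expected to time out, and on
a fixed kernel it should return an address for the internal domain.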

** Tags added: sts
