[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
** Changed in: linux (Ubuntu) Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1748408 Title: Servers going OOM after updating kernel from 4.10 to 4.13 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1748408/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
This bug was fixed in the package linux - 4.13.0-38.43

---------------
linux (4.13.0-38.43) artful; urgency=medium

  * linux: 4.13.0-38.43 -proposed tracker (LP: #1755762)

  * Servers going OOM after updating kernel from 4.10 to 4.13 (LP: #1748408)
    - i40e: Fix memory leak related filter programming status
    - i40e: Add programming descriptors to cleaned_count

  * [SRU] Lenovo E41 Mic mute hotkey is not responding (LP: #1753347)
    - platform/x86: ideapad-laptop: Increase timeout to wait for EC answer

  * fails to dump with latest kpti fixes (LP: #1750021)
    - kdump: write correct address of mem_section into vmcoreinfo

  * headset mic can't be detected on two Dell machines (LP: #1748807)
    - ALSA: hda/realtek - Support headset mode for ALC215/ALC285/ALC289
    - ALSA: hda - Fix headset mic detection problem for two Dell machines
    - ALSA: hda - Fix a wrong FIXUP for alc289 on Dell machines

  * CIFS SMB2/SMB3 does not work for domain based DFS (LP: #1747572)
    - CIFS: make IPC a regular tcon
    - CIFS: use tcon_ipc instead of use_ipc parameter of SMB2_ioctl
    - CIFS: dump IPC tcon in debug proc file

  * i2c-thunderx: erroneous error message "unhandled state: 0" (LP: #1754076)
    - i2c: octeon: Prevent error message on bus error

  * hisi_sas: Add disk LED support (LP: #1752695)
    - scsi: hisi_sas: directly attached disk LED feature for v2 hw

  * EDAC, sb_edac: Backport 1 patch to Ubuntu 17.10 (Fix missing DIMM sysfs
    entries with KNL SNC2/SNC4 mode) (LP: #1743856)
    - EDAC, sb_edac: Fix missing DIMM sysfs entries with KNL SNC2/SNC4 mode

  * [regression] Colour banding and artefacts appear system-wide on an Asus
    Zenbook UX303LA with Intel HD 4400 graphics (LP: #1749420)
    - drm/edid: Add 6 bpc quirk for CPT panel in Asus UX303LA

  * DVB Card with SAA7146 chipset not working (LP: #1742316)
    - vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems

  * [Asus UX360UA] battery status in unity-panel is not changing when battery
    is being charged (LP: #1661876) // AC adapter status not detected on Asus
    ZenBook UX410UAK (LP: #1745032)
    - ACPI / battery: Add quirk for Asus UX360UA and UX410UAK

  * ASUS UX305LA - Battery state not detected correctly (LP: #1482390)
    - ACPI / battery: Add quirk for Asus GL502VSK and UX305LA

  * support thunderx2 vendor pmu events (LP: #1747523)
    - perf pmu: Extract function to get JSON alias map
    - perf pmu: Pass pmu as a parameter to get_cpuid_str()
    - perf tools arm64: Add support for get_cpuid_str function.
    - perf pmu: Add helper function is_pmu_core to detect PMU CORE devices
    - perf vendor events arm64: Add ThunderX2 implementation defined pmu core
      events
    - perf pmu: Add check for valid cpuid in perf_pmu__find_map()

  * lpfc.ko module doesn't work (LP: #1746970)
    - scsi: lpfc: Fix loop mode target discovery

  * Ubuntu 17.10 crashes on vmalloc.c (LP: #1739498)
    - powerpc/mm/book3s64: Make KERN_IO_START a variable
    - powerpc/mm/slb: Move comment next to the code it's referring to
    - powerpc/mm/hash64: Make vmalloc 56T on hash

  * ethtool -p fails to light NIC LED on HiSilicon D05 systems (LP: #1748567)
    - net: hns: add ACPI mode support for ethtool -p

  * CVE-2017-17807
    - KEYS: add missing permission check for request_key() destination

  * [Artful SRU] Fix capsule update regression (LP: #1746019)
    - efi/capsule-loader: Reinstate virtual capsule mapping

  * [Artful/Bionic] [Config] enable EDAC_GHES for ARM64 (LP: #1747746)
    - Ubuntu: [Config] enable EDAC_GHES for ARM64

  * linux-tools: perf incorrectly linking libbfd (LP: #1748922)
    - SAUCE: tools -- add ability to disable libbfd
    - [Packaging] correct disablement of libbfd

  * Cherry pick c96f5471ce7d for delayacct fix (LP: #1747769)
    - delayacct: Account blkio completion on the correct task

  * Error in CPU frequency reporting when nominal and min pstates are same
    (cpufreq) (LP: #1746174)
    - cpufreq: powernv: Dont assume distinct pstate values for nominal and pmin

  * retpoline abi files are empty on i386 (LP: #1751021)
    - [Packaging] retpoline-extract -- instantiate retpoline files for i386
    - [Packaging] final-checks -- sanity checking ABI contents
    - [Packaging] final-checks -- check for empty retpoline files

  * [P9,Power NV][WSP][Ubuntu 1804] : "Kernel access of bad area " when
    grouping different pmu events using perf fuzzer . (perf:) (LP: #1746225)
    - powerpc/perf: Fix oops when grouping different pmu events

  * bnx2x_attn_int_deasserted3:4323 MC assert! (LP: #1715519) // CVE-2018-126
    - net: create skb_gso_validate_mac_len()
    - bnx2x: disable GSO where gso_size is too big for hardware

  * Ubuntu16.04.03: ISAv3 initialize MMU registers before setting partition
    table (LP: #1736145)
    - powerpc/64s: Initialize ISAv3 MMU registers before setting partition table

  * powerpc/powernv: Flush console before platform error reboot (LP: #1735159)
    - po
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
How long do the Xenial HWE kernels stay in -proposed?
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
Proposed kernels show the same improved behaviour as the earlier test kernels.

** Tags removed: verification-needed-artful
** Tags added: verification-done-artful
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
Running the -proposed kernel on two machines now, will provide the results in a couple of days.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
** Changed in: linux (Ubuntu) Status: In Progress => Fix Committed
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
Together with the new Artful kernel there was also a new HWE kernel that is based on it (4.13.0-38.43~16.04.1). Verification can be done with that kernel as well. The automatically generated messages only refer to the base kernels to which the patch was applied. The HWE kernel is currently a backport of the Artful kernel.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
@Stefan: I haven't reproduced the issue on Artful and I don't have an environment to do so. The original issue is for the HWE kernel on Xenial and only for that I can perform verification.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-artful' to 'verification-done-artful'. If the problem still exists, change the tag 'verification-needed-artful' to 'verification-failed-artful'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: verification-needed-artful
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
** Changed in: linux (Ubuntu Artful) Status: In Progress => Fix Committed
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
SRU request submitted:
https://lists.ubuntu.com/archives/kernel-team/2018-March/090701.html

** Description changed:

- We are seeing this on multiple servers after upgrading from previous
- 4.10 series HWE kernels to the new 4.13 HWE series. With the new kernel,
- free memory is continuously decreasing at a high rate and the servers
- start swapping and finally OOMing services within days. With the 4.10
- kernel, decrease of free memory is slower and stabilizes after a while.
+ == SRU Justification ==
+ We are seeing this on multiple servers after upgrading from previous 4.10 series HWE kernels to the new 4.13 HWE series. With the new kernel, free memory is continuously decreasing at a high rate and the servers start swapping and finally OOMing services within days. With the 4.10 kernel, decrease of free memory is slower and stabilizes after a while.

  Latest kernel tested is linux-image-4.13.0-32-generic but the issue also
  affects older kernels from that series, tested back to
  linux-image-4.13.0-19-generic. No issue with linux-image-4.10.0-42-generic.

  The servers are running as OpenStack controller nodes using either Ocata
  or Pike UCA plus ceph.

  See attached graph for the memory behaviour.
+
+ == Fix ==
+ 2b9478ffc550("i40e: Fix memory leak related filter programming status")
+ 62b4c6694dfd("i40e: Add programming descriptors to cleaned_count")
+
+ == Regression Potential ==
+ Low. Limited to i40e and fixes an existing regression.
+
+ == Test Case ==
+ A test kernel was built with these patches and tested by the original bug reporter.
+ The bug reporter states the test kernel resolved the bug.
+
  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.13.0-32-generic 4.13.0-32.35~16.04.1
  ProcVersionSignature: Ubuntu 4.13.0-32.35~16.04.1-generic 4.13.13
  Uname: Linux 4.13.0-32-generic x86_64
  ApportVersion: 2.20.1-0ubuntu2.15
  Architecture: amd64
  Date: Fri Feb 9 09:45:50 2018
  ProcEnviron:
-  LANGUAGE=en_US:
-  TERM=screen
-  PATH=(custom, no user)
-  LANG=en_US.utf8
-  SHELL=/bin/bash
+  LANGUAGE=en_US:
+  TERM=screen
+  PATH=(custom, no user)
+  LANG=en_US.utf8
+  SHELL=/bin/bash
  SourcePackage: linux-hwe
  UpgradeStatus: No upgrade log present (probably fresh install)
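The symptom in the description (free memory shrinking steadily until the machine swaps) can be put into numbers by sampling /proc/meminfo over time. The following is only a minimal sketch and was not used in this bug report: the function names are hypothetical, and tracking MemFree is an assumption (MemAvailable may be the better indicator, since page cache growth also lowers MemFree).

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are in kB; counters such as HugePages_Total have no unit.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def read_field(field="MemFree", path="/proc/meminfo"):
    """Return the current value of one meminfo field (hypothetical helper)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())[field]


def rate_kb_per_hour(samples):
    """Least-squares slope of (seconds, kB) samples, scaled to kB per hour.

    A negative result means the tracked value is shrinking.
    """
    count = len(samples)
    if count < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / count
    mean_v = sum(v for _, v in samples) / count
    spread = sum((t - mean_t) ** 2 for t, _ in samples)
    if spread == 0:
        raise ValueError("samples must span more than one timestamp")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / spread
    return slope * 3600.0
```

Collecting `(time.time(), read_field())` pairs every few minutes and passing them to `rate_kb_per_hour` gives a rough figure for comparing two kernels on the same workload; it says nothing about where the memory goes.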
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
** Changed in: linux (Ubuntu) Status: Triaged => In Progress

** Changed in: linux (Ubuntu Artful) Status: Triaged => In Progress
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
The slow leak will probably be tolerable for the time being. Having those two patches added to the kernel would surely be a pretty valuable step that I think should be done now. My target still is Xenial with the HWE kernel, though. If you need to go via Artful to fix that, well, go ahead.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
Do you think an Artful SRU request should be sent for commits 2b9478 and 62b4c66? Or would you like to investigate the slow memory leak further?
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
The test kernel solves the issue in the same way as my own kernel earlier, i.e. we still seem to have a very slow running memory leak with this kernel. I'm also seeing this slow leak when I replace the in-tree i40e driver by an upstream version (2.3.4), so either it is unrelated or contained in both.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
I built a test kernel with commits 2b9478 and 62b4c66. The test kernel can be downloaded from: http://kernel.ubuntu.com/~jsalisbury/lp1748408

Can you test this kernel and see if it resolves this bug?

Note, to test this kernel, you need to install both the linux-image and linux-image-extra .deb packages.

Thanks in advance!
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
After running the patched kernel for a couple of days, it seems that we are still seeing a slow memory leak, similar to what was noticed earlier in >= 4.14. But it won't be possible for me to bisect at that rate.

@Joseph: Getting a patched current 4.13 still would be nice, getting instructions for how to build such a kernel would be even nicer.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
I have a significant memory leak after upgrading from previous 4.10 series HWE kernels to the new 4.13 HWE series for Ubuntu 16.04 server with Ethernet controller Intel X710 for 10GbE SFP+

# dmesg | grep i40e
[1.625565] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.14-k
[1.625565] i40e: Copyright (c) 2013 - 2014 Intel Corporation.
[1.688509] i40e :02:00.0: fw 5.40.47690 api 1.5 nvm 5.40 0x80002d35 18.0.17
[1.959126] i40e :02:00.0: MAC address: 3c:fd:fe:1a:1d:e0
[2.060021] i40e :02:00.0: PCI-Express: Speed 8.0GT/s Width x4
[2.060091] i40e :02:00.0: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
[2.060096] i40e :02:00.0: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
[2.085931] i40e :02:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 8 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[2.140793] i40e :02:00.1: fw 5.40.47690 api 1.5 nvm 5.40 0x80002d35 18.0.17
[2.422817] i40e :02:00.1: MAC address: 3c:fd:fe:1a:1d:e2
[2.442684] i40e :02:00.1: PCI-Express: Speed 8.0GT/s Width x4
[2.442696] i40e :02:00.1: PCI-Express bandwidth available for this device may be insufficient for optimal performance.
[2.442715] i40e :02:00.1: Please move the device to a different PCI-e link with more lanes and/or higher transfer rate.
[2.443043] i40e :02:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 8 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[2.480205] i40e :02:00.0 enp2s0f0: renamed from eth1
[2.512183] i40e :02:00.1 enp2s0f1: renamed from eth0
[5.800514] i40e :02:00.0 enp2s0f0: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None

** Attachment added: "201802_nperf_memory-week.png"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1748408/+attachment/5063316/+files/201802_nperf_memory-week.png
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
O.k., confirming that this series of patches fixes the issue:

~/linux$ git log --oneline|head -3
bc6d6fd2f916 i40e: Add programming descriptors to cleaned_count
69949b3bd674 i40e: Fix memory leak related filter programming status
b32038eb34ee UBUNTU: Ubuntu-4.13.0-32.35

Can you build the same thing on top of the latest 4.13 set? Seems some special gcc foo is needed to make the retpoline stuff working there
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
Reading the thread further, we seem to need two patches, see https://www.spinics.net/lists/netdev/msg462051.html, so I'm going to add bc6d6fd2f916a0794ae4c44b28e14e2d172e05e0 into the build, too. Will try that on top of b32038eb34ee42fd8056f99f88652270f6667996 (tag: Ubuntu-4.13.0-32.35).

I also tested the "ethtool --set-priv-flags flow-director-atr off" option and it seems to slow down the leak similar to >= 4.14 kernels. So either that fixes only part of the issue or we have a different one that only got masked up to now.

Third option would be using the upstream i40e driver instead, testing with 2.3.4 currently and that also seems to resolve the issue.
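The ethtool workaround quoted above leaves out the device name, which ethtool expects right after `--set-priv-flags`. As a hedged sketch of how that could be applied to several ports, the snippet below only builds the argument list by default. The helper names are hypothetical, and the interface names are taken from the dmesg output elsewhere in this thread purely as examples.

```python
import subprocess


def atr_off_command(interface):
    """Build the ethtool call that turns off the i40e flow-director-atr flag."""
    return ["ethtool", "--set-priv-flags", interface, "flow-director-atr", "off"]


def disable_atr(interfaces, dry_run=True):
    """Return the commands for all interfaces; run them unless dry_run is set.

    Running for real needs root and an i40e port, so dry_run defaults to True.
    """
    commands = [atr_off_command(name) for name in interfaces]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands


# Example interface names only; substitute the ports of the affected host.
planned = disable_atr(["enp2s0f0", "enp2s0f1"])
```

Note that, per the message above, this only slowed the leak down in testing, so it is a mitigation to experiment with rather than a fix.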
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
I built a test kernel with commit 2b9478ffc550f1. The test kernel can be downloaded from: http://kernel.ubuntu.com/~jsalisbury/lp1748408

Can you test this kernel and see if it resolves this bug?

Note, to test this kernel, you need to install both the linux-image and linux-image-extra .deb packages.

Thanks in advance!
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
A colleague found that this seems to be a known issue:

https://www.spinics.net/lists/netdev/msg458258.html

and the fix should be

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/drivers/net/ethernet/intel/i40e/i40e_txrx.c?id=2b9478ffc550f17c6cd8c69057234e91150f5972

I will try cherry-picking this onto 4.13, not sure why it never seems to have been pulled into the stable branch. Also not sure why we are still seeing issues with >= 4.14, very likely a completely different issue there, but I think we'll be fine if we get 4.13 fixed for now.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
Sorry for the delay, bisecting took longer than planned, but I now have the result:

6964e53f55837b0c49ed60d36656d2e0ee4fc27b is the first bad commit
commit 6964e53f55837b0c49ed60d36656d2e0ee4fc27b
Author: Jacob Keller
Date: Mon Jun 12 15:38:36 2017 -0700

    i40e: fix handling of HW ATR eviction

The bad news is that this patch pretty certainly isn't directly the culprit, as it only fixes (and re-enables) features that seem to have been messed up earlier. So not sure how to proceed now, probably need to discuss this with upstream developers?
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
@Dr. Jens Harbott, Just let me know if you need assistance with the bisect between 4.11 and 4.12. I can build the kernels for you.

I would say the next step would be to test the 4.12 release candidates to narrow down the issue further. 4.12-rc1 is available here: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12-rc1/

You can just change the 'rc' part of that link to test the other release candidates, such as rc2, rc3, etc.
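The URL pattern described above (swap the 'rc' suffix to reach other release candidates) can be written down as a small helper. This is just an illustrative sketch: the function names are hypothetical, and how many release candidates actually exist for a given version has to be checked on the mainline archive itself.

```python
MAINLINE_BASE = "http://kernel.ubuntu.com/~kernel-ppa/mainline/"


def rc_url(version, rc):
    """Return the mainline build URL for one release candidate, e.g. v4.12-rc1."""
    if rc < 1:
        raise ValueError("release candidates are numbered from 1")
    return f"{MAINLINE_BASE}v{version}-rc{rc}/"


def rc_urls(version, last_rc):
    """Return URLs for rc1 up to and including last_rc (caller supplies last_rc)."""
    return [rc_url(version, number) for number in range(1, last_rc + 1)]


for url in rc_urls("4.12", 3):
    print(url)
```

Testing the candidates in order narrows the regression to the window between two rc tags before any commit-level bisect is needed.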
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
** Tags added: kernel-bug-exists-upstream
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
Hi guys. I'm experiencing problems with 4.13.0-36 on a cloud server (x64) with dynamic RAM management. The server is provisioned to use up to 12GB of RAM, but it got so bad that only 1GB was visible, bringing everything to a halt while swap usage went through the roof. Could it be related?

And here is another instance of kernel 4.13 showing memory problems similar to what is being described here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1722778

I have reverted to 4.11.0-041100-generic #201705041534 for now.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
Oh, and I have also recently experienced random, unexplained OOM crashes of the squid process on two separate machines (one i386 and one amd64), which seem to coincide in time with the upgrade from kernel 4.10 to 4.13.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
So here are the first results:

4.11.0-041100-generic #201705041534 - not affected
4.12.0-041200-generic #201707022031 - affected
4.13.0-041300-generic #201709031731 - affected
4.13.16-041316-generic #201711240901 - affected

Results for newer kernels are not so clear, they do not fail as fast as previous ones, but they do still fill up memory and - later - swap slowly. The rate is so slow however, that it will probably take weeks to come to some definitive results here.

Thus my next step, unless there is a better proposal, will be starting to bisect from 4.11 to 4.12, git expects 13 steps for that.
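The step count mentioned above follows from bisection halving the candidate range at every test. As a rough worked example (a simplified model, not git's exact estimate, which can differ by a step because history is not linear), the hypothetical helpers below relate a commit count to the number of builds and to the elapsed time when each test has to run for days before the leak shows.

```python
def bisect_steps(candidates):
    """Worst-case number of tests to isolate one bad commit among `candidates`.

    Equals ceil(log2(candidates)), computed with integer arithmetic.
    """
    if candidates < 1:
        raise ValueError("need at least one candidate commit")
    return (candidates - 1).bit_length()


def bisect_duration_days(steps, days_per_test):
    """Total elapsed days if every step needs `days_per_test` of observation."""
    return steps * days_per_test


# In this simplified model, 13 steps cover anything from 4097 to 8192 commits.
steps = bisect_steps(8192)
# Assumed observation time of 2 days per kernel, chosen only for illustration.
total_days = bisect_duration_days(steps, 2)
```

This is also why the slower leak on newer kernels is hard to bisect: the number of steps stays about the same, but the observation time per step grows from days to weeks.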
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
Thanks for the update. I requested testing of the mainline kernel to see if there is a commit in mainline that fixes the bug, which we could backport to 4.13. If the bug is not fixed in mainline, we can perform a kernel bisect to identify the commit that introduced this regression.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
Ok, never mind the aufs issue, I got that resolved. Should have some results with mainline kernels in a couple of days.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
@Joseph: I did test 4.15.2, but some things are failing, in particular docker because of lacking AUFS support, so I need to build a kernel myself I guess, which will take a bit.

Also note that I'm seeing this on Xenial machines, didn't test with Artful. We used to run them with the 4.10 HWE kernels because they offer improved performance in some areas compared to the stock 4.4 kernel. Started seeing these issues after HWE switched to 4.13 a couple of weeks ago.
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.15 kernel[0].

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.15

** Package changed: linux-hwe (Ubuntu) => linux (Ubuntu)

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Also affects: linux (Ubuntu Artful)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Artful)
   Importance: Undecided => Critical

** Changed in: linux (Ubuntu Artful)
   Importance: Critical => High

** Changed in: linux (Ubuntu Artful)
   Assignee: (unassigned) => Joseph Salisbury (jsalisbury)

** Changed in: linux (Ubuntu)
   Assignee: (unassigned) => Joseph Salisbury (jsalisbury)

** Changed in: linux (Ubuntu Artful)
   Status: New => In Progress

** Changed in: linux (Ubuntu)
   Status: New => Triaged

** Changed in: linux (Ubuntu Artful)
   Status: In Progress => Triaged
[Bug 1748408] Re: Servers going OOM after updating kernel from 4.10 to 4.13
** Summary changed:

- Servers going OOM
+ Servers going OOM after updating kernel from 4.10 to 4.13