[Kernel-packages] [Bug 1662378] Re: many OOMs on busy xenial IMAP server with lots of available memory
Zero OOMs after 4 days of uptime (previously they started after 2 days), so I'd say this is fixed.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1662378

Title:
  many OOMs on busy xenial IMAP server with lots of available memory

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Xenial:
  Triaged

Bug description:
  We recently noticed that a busy xenial IMAP server with about 22 days
  of uptime has been logging a lot of OOM messages. The machine has 24G
  of memory. Below please find some typical memory info.

  I noted that about 11G of memory was allocated to slab, and since all
  of the oom-killer invoked messages report order=2 or order=3 (see
  below for gfp_mask values), I thought that fragmentation might be a
  factor. After doing echo 2 > /proc/sys/vm/drop_caches to release
  reclaimable slab memory, no OOMs were logged for around 30 minutes,
  but then they started up again, although perhaps not as frequently as
  before, and the amount of memory in slab was back up around its
  former size.

  To the best of my knowledge we do not have any custom VM-related
  sysctl tweaks on this machine. Attached please find version.log and
  lspci-vnvn.log. And here's a link to a kern.log from a little before
  boot time onwards, containing all of the oom-killer messages:

  https://people.canonical.com/~pjdc/grenadilla-sanitized-kern.log.xz

  == Breakdown of Failed Allocations ==

  pjdc@grenadilla:~$ grep -o 'gfp_mask=.*, order=.' kern.log-version-with-oom-killer-invoked | sort | uniq -c | sort -n
     1990 gfp_mask=0x26000c0, order=2
     4043 gfp_mask=0x240c0c0, order=3
  pjdc@grenadilla:~$ _

  == Representative (Probably) Memory Info ==

  pjdc@grenadilla:~$ free -m
                total        used        free      shared  buff/cache   available
  Mem:          24097        1762         213         266       22121       21087
  Swap:         17492         101       17391

  pjdc@grenadilla:~$ cat /proc/meminfo
  MemTotal:       24676320 kB
  MemFree:          219440 kB
  MemAvailable:   21593416 kB
  Buffers:         6186648 kB
  Cached:          4255608 kB
  SwapCached:         3732 kB
  Active:          7593140 kB
  Inactive:        4404824 kB
  Active(anon):    1319736 kB
  Inactive(anon):   508544 kB
  Active(file):    6273404 kB
  Inactive(file):  3896280 kB
  Unevictable:           0 kB
  Mlocked:               0 kB
  SwapTotal:      17912436 kB
  SwapFree:       17808972 kB
  Dirty:               524 kB
  Writeback:             0 kB
  AnonPages:       1553244 kB
  Mapped:           219868 kB
  Shmem:            272576 kB
  Slab:           12209796 kB
  SReclaimable:   11572836 kB
  SUnreclaim:       636960 kB
  KernelStack:       14464 kB
  PageTables:        54864 kB
  NFS_Unstable:          0 kB
  Bounce:                0 kB
  WritebackTmp:          0 kB
  CommitLimit:    30250596 kB
  Committed_AS:    2640808 kB
  VmallocTotal:   34359738367 kB
  VmallocUsed:           0 kB
  VmallocChunk:          0 kB
  HardwareCorrupted:     0 kB
  AnonHugePages:     18432 kB
  CmaTotal:              0 kB
  CmaFree:               0 kB
  HugePages_Total:       0
  HugePages_Free:        0
  HugePages_Rsvd:        0
  HugePages_Surp:        0
  Hugepagesize:       2048 kB
  DirectMap4k:     2371708 kB
  DirectMap2M:    22784000 kB
  pjdc@grenadilla:~$

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1662378/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
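Since most of the ~12G of Slab above is SReclaimable, a natural next diagnostic step is to see which caches dominate. A minimal sketch, assuming the standard slabinfo 2.1 layout (two header lines, then name, active_objs, num_objs, objsize as the first four fields; reading /proc/slabinfo needs root):

```shell
# Rank slab caches by approximate memory use (num_objs * objsize),
# skipping the two header lines of /proc/slabinfo.
sudo awk 'NR > 2 { printf "%10.1f MiB  %s\n", $3 * $4 / 1048576, $1 }' \
    /proc/slabinfo | sort -rn | head
```

On a server like this one would expect dentry and inode caches near the top; `slabtop` shows the same data interactively.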
[Kernel-packages] [Bug 1662378] Re: many OOMs on busy xenial IMAP server with lots of available memory
I've scheduled install/reboot for Sunday evening UTC, and should be able to report back a few days after that.
[Kernel-packages] [Bug 1662378] Re: many OOMs on busy xenial IMAP server with lots of available memory
** Attachment added: "lspci-vnvn.log"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1662378/+attachment/4814448/+files/lspci-vnvn.log
[Kernel-packages] [Bug 1662378] [NEW] many OOMs on busy xenial IMAP server with lots of available memory
Public bug reported:

We recently noticed that a busy xenial IMAP server with about 22 days
of uptime has been logging a lot of OOM messages. The machine has 24G
of memory. Below please find some typical memory info.

I noted that about 11G of memory was allocated to slab, and since all
of the oom-killer invoked messages report order=2 or order=3 (see below
for gfp_mask values), I thought that fragmentation might be a factor.
After doing echo 2 > /proc/sys/vm/drop_caches to release reclaimable
slab memory, no OOMs were logged for around 30 minutes, but then they
started up again, although perhaps not as frequently as before, and the
amount of memory in slab was back up around its former size.

To the best of my knowledge we do not have any custom VM-related sysctl
tweaks on this machine. Attached please find version.log and
lspci-vnvn.log. And here's a link to a kern.log from a little before
boot time onwards, containing all of the oom-killer messages:

https://people.canonical.com/~pjdc/grenadilla-sanitized-kern.log.xz

== Breakdown of Failed Allocations ==

pjdc@grenadilla:~$ grep -o 'gfp_mask=.*, order=.' kern.log-version-with-oom-killer-invoked | sort | uniq -c | sort -n
   1990 gfp_mask=0x26000c0, order=2
   4043 gfp_mask=0x240c0c0, order=3
pjdc@grenadilla:~$ _

== Representative (Probably) Memory Info ==

pjdc@grenadilla:~$ free -m
              total        used        free      shared  buff/cache   available
Mem:          24097        1762         213         266       22121       21087
Swap:         17492         101       17391

pjdc@grenadilla:~$ cat /proc/meminfo
MemTotal:       24676320 kB
MemFree:          219440 kB
MemAvailable:   21593416 kB
Buffers:         6186648 kB
Cached:          4255608 kB
SwapCached:         3732 kB
Active:          7593140 kB
Inactive:        4404824 kB
Active(anon):    1319736 kB
Inactive(anon):   508544 kB
Active(file):    6273404 kB
Inactive(file):  3896280 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      17912436 kB
SwapFree:       17808972 kB
Dirty:               524 kB
Writeback:             0 kB
AnonPages:       1553244 kB
Mapped:           219868 kB
Shmem:            272576 kB
Slab:           12209796 kB
SReclaimable:   11572836 kB
SUnreclaim:       636960 kB
KernelStack:       14464 kB
PageTables:        54864 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    30250596 kB
Committed_AS:    2640808 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:     18432 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:     2371708 kB
DirectMap2M:    22784000 kB
pjdc@grenadilla:~$

** Affects: linux (Ubuntu)
   Importance: Undecided
       Status: Confirmed

** Attachment added: "version.log"
   https://bugs.launchpad.net/bugs/1662378/+attachment/4814447/+files/version.log
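Because the failing allocations are all order=2 or order=3 (16K and 32K contiguous blocks), per-order free-page counts are the relevant metric here, not total free memory. A minimal sketch using /proc/buddyinfo, assuming its standard layout where the numeric columns after the zone name are free-block counts for orders 0 through 10:

```shell
# Sum free blocks of order >= 2 in each zone (order-0 and order-1
# counts are fields 5 and 6, so order >= 2 starts at field 7).
# Near-zero sums while MemAvailable is large would confirm that
# fragmentation, not lack of memory, is driving the OOMs.
awk '{ sum = 0; for (i = 7; i <= NF; i++) sum += $i; printf "%s %s %s %s: %d free blocks of order >= 2\n", $1, $2, $3, $4, sum }' /proc/buddyinfo
```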
[Kernel-packages] [Bug 1602577] Re: [arm64] compute nodes unstable after upgrading from 4.2 to 4.4 kernel
Kernel from #24 fails to boot with the following, and keeps looping.
It's unclear whether this is due to a problem with our system or with
the kernel. I've sought advice from someone more familiar with the
hardware we're using and will update with any further info.

306 bytes read in 18 ms (16.6 KiB/s)
## Executing script at 400400
19403840 bytes read in 499 ms (37.1 MiB/s)
28340037 bytes read in 723 ms (37.4 MiB/s)
## Booting kernel from Legacy Image at 400200 ...
   Image Name:   kernel 4.8.0-040800rc1-generic
   Created:      2016-08-11  4:06:14 UTC
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    19403776 Bytes = 18.5 MiB
   Load Address: 0008
   Entry Point:  0008
   Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 400500 ...
   Image Name:   ramdisk 4.8.0-040800rc1-generic
   Created:      2016-08-11  4:06:15 UTC
   Image Type:   ARM Linux RAMDisk Image (gzip compressed)
   Data Size:    28339973 Bytes = 27 MiB
   Load Address:
   Entry Point:
   Verifying Checksum ... OK
ERROR: Did not find a cmdline Flattened Device Tree
Could not find a valid device tree

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1602577

Title:
  [arm64] compute nodes unstable after upgrading from 4.2 to 4.4 kernel

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Xenial:
  Triaged

Bug description:
  Hi,

  In order to investigate bug LP#1531768, we upgraded some arm64
  compute nodes (swirlices) to a 4.4 kernel. I think it made the VMs
  work better, but the hosts became extremely unstable. After some
  time, getting a shell on them would be impossible. Connecting on the
  VSP, you'd get a prompt, and once you typed your username and
  password, you'd see the motd but the shell would never spawn.

  Because of these instability issues, all the arm64 compute nodes are
  now back on 4.2. However, we managed to capture "perf record" data
  when a host was failing. I'll attach it to the bug. Perhaps it will
  give you hints as to what we can do to help you troubleshoot this bug
  further. Once we have your instructions, we'll happily reboot one (or
  a few) nodes to 4.4 to continue troubleshooting.

  Thanks!

  ---
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Jul 12 12:54 seq
   crw-rw 1 root audio 116, 33 Jul 12 12:54 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.14.1-0ubuntu3.21
  Architecture: arm64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory
  DistroRelease: Ubuntu 14.04
  Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
  Package: linux (not installed)
  PciMultimedia:
  ProcEnviron:
   TERM=screen-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB:
  ProcKernelCmdLine: console=ttyS0,9600n8r ro compat_uts_machine=armv7l
  ProcVersionSignature: Ubuntu 4.2.0-41.48~14.04.1-generic 4.2.8-ckt11
  RelatedPackageVersions:
   linux-restricted-modules-4.2.0-41-generic N/A
   linux-backports-modules-4.2.0-41-generic  N/A
   linux-firmware                            1.127.22
  RfKill: Error: [Errno 2] No such file or directory
  Tags: trusty uec-images
  Uname: Linux 4.2.0-41-generic aarch64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
  _MarkForUpload: True

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1602577/+subscriptions
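For context on the boot failure above: "Did not find a cmdline Flattened Device Tree" is U-Boot reporting that its bootm invocation had no usable device-tree blob, and arm64 kernels cannot boot without one. A hedged sketch of the shape the boot script would need; the dtb filename and the use of the conventional `*_addr_r` environment variables are placeholders, not values taken from this system:

```
# bootm takes kernel, initrd, and FDT load addresses; if no valid FDT
# is present at the third address, boot aborts with the error above.
load mmc 0:1 ${fdt_addr_r} placeholder-board.dtb
bootm ${kernel_addr_r} ${ramdisk_addr_r} ${fdt_addr_r}
```

Whether the #24 test kernel's boot script omitted the FDT argument or the blob at that address was invalid is not determinable from this log alone.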