[Kernel-packages] [Bug 1662378] Re: many OOMs on busy xenial IMAP server with lots of available memory

2017-02-23 Thread Paul Collins
Zero OOMs after 4 days of uptime (they started after 2 days previously)
so I'd say this is fixed.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1662378

Title:
  many OOMs on busy xenial IMAP server with lots of available memory

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Xenial:
  Triaged

Bug description:
  We recently noticed that a busy xenial IMAP server with about 22 days
  uptime has been logging a lot of OOM messages.  The machine has 24G of
  memory.  Below please find some typical memory info.

  I noted that about 11G of memory was allocated to slab, and since
  all of the oom-killer messages report order=2 or order=3 (see below
  for the gfp_mask values) I thought that fragmentation might be a
  factor.  After doing echo 2 > /proc/sys/vm/drop_caches to release
  reclaimable slab memory, no OOMs were logged for around 30 minutes,
  but then they started up again, although perhaps not as frequently as
  before, and the amount of memory in slab was back up around its former
  size.  To the best of my knowledge we do not have any custom VM-
  related sysctl tweaks on this machine.
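
  The sequence described above can be sketched roughly as follows.
  This is a minimal sketch, assuming a Linux /proc layout; the
  slab_usage helper name is made up for illustration, and the
  cache-drop step is guarded because it requires root:

```shell
# Hypothetical helper: show the slab counters quoted in the meminfo below.
slab_usage() {
    grep -E '^(Slab|SReclaimable|SUnreclaim):' "${1:-/proc/meminfo}"
}

slab_usage                              # before
if [ "$(id -u)" -eq 0 ]; then           # dropping caches needs root
    sync
    echo 2 > /proc/sys/vm/drop_caches   # 2 = free reclaimable slab only (dentries, inodes)
fi
slab_usage                              # after: SReclaimable should shrink sharply
```

  Note that echo 2 (as used here) releases only reclaimable slab
  objects; echo 1 would drop the page cache and echo 3 both.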

  Attached please find version.log and lspci-vnvn.log.  And here's a
  link to a kern.log from a little before time of boot onwards,
  containing all of the oom-killer messages:
  https://people.canonical.com/~pjdc/grenadilla-sanitized-kern.log.xz

  == Breakdown of Failed Allocations ==

  pjdc@grenadilla:~$ grep -o 'gfp_mask=.*, order=.' kern.log-version-with-oom-killer-invoked | sort | uniq -c | sort -n
     1990 gfp_mask=0x26000c0, order=2
     4043 gfp_mask=0x240c0c0, order=3
  pjdc@grenadilla:~$ _
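
  One way to confirm that order-2/order-3 pages really were scarce is
  /proc/buddyinfo, which lists free block counts per order for each
  zone.  A small illustrative helper (buddy_orders is a hypothetical
  name; the column positions assume the standard buddyinfo layout,
  where the fields after the zone name are free counts for order 0, 1,
  2, ...):

```shell
# Summarize order-2 and order-3 free blocks in the Normal zone per node.
# Sustained zeros here would line up with the order=2/order=3 failures above.
buddy_orders() {
    awk '$3 == "zone" && $4 == "Normal" {
        sub(",", "", $2)    # strip trailing comma from the node number
        printf "node %s order2=%s order3=%s\n", $2, $7, $8
    }' "${1:-/proc/buddyinfo}"
}
buddy_orders
```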

  == Representative (Probably) Memory Info ==

  pjdc@grenadilla:~$ free -m
                total        used        free      shared  buff/cache   available
  Mem:          24097        1762         213         266       22121       21087
  Swap:         17492         101        17391
  pjdc@grenadilla:~$ cat /proc/meminfo 
  MemTotal:   24676320 kB
  MemFree:  219440 kB
  MemAvailable:   21593416 kB
  Buffers: 6186648 kB
  Cached:  4255608 kB
  SwapCached: 3732 kB
  Active:  7593140 kB
  Inactive:4404824 kB
  Active(anon):1319736 kB
  Inactive(anon):   508544 kB
  Active(file):6273404 kB
  Inactive(file):  3896280 kB
  Unevictable:   0 kB
  Mlocked:   0 kB
  SwapTotal:  17912436 kB
  SwapFree:   17808972 kB
  Dirty:   524 kB
  Writeback: 0 kB
  AnonPages:   1553244 kB
  Mapped:   219868 kB
  Shmem:272576 kB
  Slab:   12209796 kB
  SReclaimable:   11572836 kB
  SUnreclaim:   636960 kB
  KernelStack:   14464 kB
  PageTables:54864 kB
  NFS_Unstable:  0 kB
  Bounce:0 kB
  WritebackTmp:  0 kB
  CommitLimit:30250596 kB
  Committed_AS:2640808 kB
  VmallocTotal:   34359738367 kB
  VmallocUsed:   0 kB
  VmallocChunk:  0 kB
  HardwareCorrupted: 0 kB
  AnonHugePages: 18432 kB
  CmaTotal:  0 kB
  CmaFree:   0 kB
  HugePages_Total:   0
  HugePages_Free:0
  HugePages_Rsvd:0
  HugePages_Surp:0
  Hugepagesize:   2048 kB
  DirectMap4k: 2371708 kB
  DirectMap2M:22784000 kB
  pjdc@grenadilla:~$

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1662378/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1662378] Re: many OOMs on busy xenial IMAP server with lots of available memory

2017-02-14 Thread Paul Collins
I've scheduled install/reboot for Sunday evening UTC, and should be able
to report back a few days after that.



[Kernel-packages] [Bug 1662378] Re: many OOMs on busy xenial IMAP server with lots of available memory

2017-02-06 Thread Paul Collins
** Attachment added: "lspci-vnvn.log"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1662378/+attachment/4814448/+files/lspci-vnvn.log



[Kernel-packages] [Bug 1662378] [NEW] many OOMs on busy xenial IMAP server with lots of available memory

2017-02-06 Thread Paul Collins
Public bug reported:

We recently noticed that a busy xenial IMAP server with about 22 days
uptime has been logging a lot of OOM messages.  The machine has 24G of
memory.  Below please find some typical memory info.

I noted that about 11G of memory was allocated to slab, and since all
of the oom-killer messages report order=2 or order=3 (see below for the
gfp_mask values) I thought that fragmentation might be a factor.  After
doing echo 2 > /proc/sys/vm/drop_caches to release reclaimable slab
memory, no OOMs were logged for around 30 minutes, but then they
started up again, although perhaps not as frequently as before, and the
amount of memory in slab was back up around its former size.  To the
best of my knowledge we do not have any custom VM-related sysctl tweaks
on this machine.

Attached please find version.log and lspci-vnvn.log.  And here's a
link to a kern.log from a little before time of boot onwards,
containing all of the oom-killer messages:
https://people.canonical.com/~pjdc/grenadilla-sanitized-kern.log.xz

== Breakdown of Failed Allocations ==

pjdc@grenadilla:~$ grep -o 'gfp_mask=.*, order=.' kern.log-version-with-oom-killer-invoked | sort | uniq -c | sort -n
   1990 gfp_mask=0x26000c0, order=2
   4043 gfp_mask=0x240c0c0, order=3
pjdc@grenadilla:~$ _

== Representative (Probably) Memory Info ==

pjdc@grenadilla:~$ free -m
              total        used        free      shared  buff/cache   available
Mem:          24097        1762         213         266       22121       21087
Swap:         17492         101        17391
pjdc@grenadilla:~$ cat /proc/meminfo 
MemTotal:   24676320 kB
MemFree:  219440 kB
MemAvailable:   21593416 kB
Buffers: 6186648 kB
Cached:  4255608 kB
SwapCached: 3732 kB
Active:  7593140 kB
Inactive:4404824 kB
Active(anon):1319736 kB
Inactive(anon):   508544 kB
Active(file):6273404 kB
Inactive(file):  3896280 kB
Unevictable:   0 kB
Mlocked:   0 kB
SwapTotal:  17912436 kB
SwapFree:   17808972 kB
Dirty:   524 kB
Writeback: 0 kB
AnonPages:   1553244 kB
Mapped:   219868 kB
Shmem:272576 kB
Slab:   12209796 kB
SReclaimable:   11572836 kB
SUnreclaim:   636960 kB
KernelStack:   14464 kB
PageTables:54864 kB
NFS_Unstable:  0 kB
Bounce:0 kB
WritebackTmp:  0 kB
CommitLimit:30250596 kB
Committed_AS:2640808 kB
VmallocTotal:   34359738367 kB
VmallocUsed:   0 kB
VmallocChunk:  0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 18432 kB
CmaTotal:  0 kB
CmaFree:   0 kB
HugePages_Total:   0
HugePages_Free:0
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB
DirectMap4k: 2371708 kB
DirectMap2M:22784000 kB
pjdc@grenadilla:~$

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: Confirmed

** Attachment added: "version.log"
   https://bugs.launchpad.net/bugs/1662378/+attachment/4814447/+files/version.log


[Kernel-packages] [Bug 1602577] Re: [arm64] compute nodes unstable after upgrading from 4.2 to 4.4 kernel

2016-08-10 Thread Paul Collins
Kernel from #24 fails to boot with the following, and keeps looping.
Unclear if it's due to a problem with our system or with the kernel.
I've sought advice from someone more familiar with the hardware we're
using and will update with any further info.

306 bytes read in 18 ms (16.6 KiB/s)
## Executing script at 400400
19403840 bytes read in 499 ms (37.1 MiB/s)
28340037 bytes read in 723 ms (37.4 MiB/s)
## Booting kernel from Legacy Image at 400200 ...
   Image Name:   kernel 4.8.0-040800rc1-generic
   Created:  2016-08-11   4:06:14 UTC
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:19403776 Bytes = 18.5 MiB
   Load Address: 0008
   Entry Point:  0008
   Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 400500 ...
   Image Name:   ramdisk 4.8.0-040800rc1-generic
   Created:  2016-08-11   4:06:15 UTC
   Image Type:   ARM Linux RAMDisk Image (gzip compressed)
   Data Size:28339973 Bytes = 27 MiB
   Load Address: 
   Entry Point:  
   Verifying Checksum ... OK
ERROR: Did not find a cmdline Flattened Device Tree
Could not find a valid device tree

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1602577

Title:
  [arm64] compute nodes unstable after upgrading from 4.2 to 4.4 kernel

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Xenial:
  Triaged

Bug description:
  Hi,

  In order to investigate bug LP#1531768, we upgraded some arm64 compute
  nodes (swirlices) to a 4.4 kernel. I think it made the VMs work
  better, but the hosts became extremely unstable.

  After some time, getting a shell on them would be impossible.
  Connecting on the VSP, you'd get a prompt, and once you typed your
  username and password, you'd see the motd but the shell would never
  spawn.

  Because of these instability issues, all the arm64 compute nodes are
  now back on 4.2. However, we managed to capture "perf record" data
  when a host was failing. I'll attach it to the bug. Perhaps it will
  give you hints as to what we can do to help you troubleshoot this bug
  further.

  Once we have your instructions, we'll happily reboot one (or a few)
  nodes to 4.4 to continue troubleshooting.

  Thanks!
  --- 
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Jul 12 12:54 seq
   crw-rw 1 root audio 116, 33 Jul 12 12:54 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.14.1-0ubuntu3.21
  Architecture: arm64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory
  DistroRelease: Ubuntu 14.04
  Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=screen-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB:
   
  ProcKernelCmdLine: console=ttyS0,9600n8r ro compat_uts_machine=armv7l
  ProcVersionSignature: Ubuntu 4.2.0-41.48~14.04.1-generic 4.2.8-ckt11
  RelatedPackageVersions:
   linux-restricted-modules-4.2.0-41-generic N/A
   linux-backports-modules-4.2.0-41-generic  N/A
   linux-firmware1.127.22
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  trusty uec-images
  Uname: Linux 4.2.0-41-generic aarch64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: True

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1602577/+subscriptions
