** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Bionic)
       Status: New => Triaged

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu)
   Importance: Medium => High

** Changed in: linux (Ubuntu Bionic)
   Importance: Medium => High

** Tags removed: kernel-da-key
** Tags added: kernel-key

** Also affects: linux (Ubuntu Cosmic)
   Importance: High
       Status: Triaged

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1792349

Title:
  Memory leaking when running kubernetes cronjobs

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Bionic:
  Triaged
Status in linux source package in Cosmic:
  Triaged

Bug description:
  We are using Kubernetes V1.8.15 with docker 18.03.1-ce.
  We schedule 50 Kubernetes cronjobs to run every 5 minutes. Each cronjob
  will create a simple busybox container, echo hello world, then terminate.

  In the data attached to the bug I let this run for 1 hour, and in this
  time the available memory (MemAvailable) fell from 31256704 kB to
  30461224 kB, a loss of about 776 MB. In previous, longer runs we
  observed that the available memory continues to drop.
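
  As a rough check of the figure above, here is a minimal sketch that reads
  MemAvailable from /proc/meminfo-style text and converts a difference in kB
  to MiB. The helper names mem_available_kb and loss_mib are hypothetical
  and only illustrate the arithmetic; they are not part of any tool used in
  this report.

```python
import re


def mem_available_kb(meminfo_text):
    """Return the MemAvailable value (in kB) from /proc/meminfo-style text."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no MemAvailable line found")
    return int(match.group(1))


def loss_mib(before_kb, after_kb):
    """Memory lost between two readings, converted from kB to MiB."""
    return (before_kb - after_kb) / 1024


if __name__ == "__main__":
    # The two readings quoted in the report.
    print(f"{loss_mib(31256704, 30461224):.1f} MiB lost")  # prints: 776.8 MiB lost
```

  The difference is 795480 kB, which matches the roughly 776 MB quoted
  above when 1 MB is taken as 1024 kB.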

  There don't appear to be any processes left behind, nor any growth in
  other processes, to explain where the memory has gone.

  echo 3 > /proc/sys/vm/drop_caches causes some of the memory to be
  returned, but the majority remains leaked, and the only way to free it
  appears to be to reboot the system.

  We are currently running Ubuntu 4.15.0-32.35-generic 4.15.18 and have
  previously observed similar issues on Ubuntu 16.04 with Kernel
  4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 and
  Debian 9.4 running 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3
  (2018-03-02)

  The leak was more severe on the Debian system, and investigations
  there showed leaks in pcpu_get_vm_areas that were related to memory
  cgroups. Running kernel 4.17 on Debian showed a leak at a similar
  rate to what we now observe on Ubuntu 18.04. This leak causes us issues
  because we need to run the cronjobs regularly and want the systems to
  remain up for months.

  Kubernetes will create a new cgroup each time the cronjob runs, but
  these are removed when the job completes (which takes a few seconds).
  If I use systemd-cgtop I don't see any increase in cgroups over time -
  but if I monitor /proc/cgroups over time I can see num_cgroups for
  memory increases.
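
  The num_cgroups check described above could be scripted roughly as
  follows. This is a sketch only: it assumes the usual four-column layout of
  /proc/cgroups (subsys_name, hierarchy, num_cgroups, enabled), and the
  function name num_cgroups is a hypothetical helper, not an existing API.

```python
def num_cgroups(cgroups_text, subsystem="memory"):
    """Return the num_cgroups column for one controller from /proc/cgroups text."""
    for line in cgroups_text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip the header line and any blank lines
        name, _hierarchy, count, _enabled = line.split()
        if name == subsystem:
            return int(count)
    raise KeyError(subsystem)


if __name__ == "__main__":
    try:
        with open("/proc/cgroups") as f:
            print("memory cgroups:", num_cgroups(f.read()))
    except (FileNotFoundError, KeyError):
        # Not on Linux, or the memory controller is not listed on this system.
        print("memory controller not found in /proc/cgroups")
```

  Running this periodically while the cronjobs execute should show whether
  the count for the memory controller keeps growing after the jobs have
  completed.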

  For the duration of the test I collected slabinfo, meminfo,
  vmallocinfo & cgroups - which I will attach to the bug. Each file is
  suffixed with the number of seconds since the start.
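
  For anyone wanting to reproduce the collection, a minimal sketch of the
  naming scheme is shown below. It is not the script that was actually used;
  the names snapshot and collect, the 600-second interval and the
  4800-second duration are assumptions inferred from the file suffixes
  mentioned in this report. Reading /proc/slabinfo and /proc/vmallocinfo
  normally requires root.

```python
import time
from pathlib import Path

# The four files named in the report.
SOURCES = ["/proc/slabinfo", "/proc/meminfo", "/proc/vmallocinfo", "/proc/cgroups"]


def snapshot(out_dir, elapsed_seconds, sources=SOURCES):
    """Copy each source to out_dir as <name>.<elapsed_seconds>; return the new paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sources:
        dest = out_dir / f"{Path(src).name}.{elapsed_seconds}"
        # Read and rewrite rather than copy, since procfs files report size 0.
        dest.write_text(Path(src).read_text())
        written.append(dest)
    return written


def collect(out_dir, interval=600, duration=4800):
    """Take a snapshot every `interval` seconds (assumed values, see above)."""
    start = time.monotonic()
    for elapsed in range(0, duration + 1, interval):
        # Sleep until the next scheduled sample so the suffixes stay aligned.
        time.sleep(max(0.0, start + elapsed - time.monotonic()))
        snapshot(out_dir, elapsed)
```

  Calling collect("leak-data") as root would produce files such as
  meminfo.0, meminfo.600 and so on in the leak-data directory.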

  *.0 & *.600 were taken before the test was started. The test was
  stopped shortly after the *.4200 files were generated. I then left the
  system idle for 10 minutes. I then ran echo 3 >
  /proc/sys/vm/drop_caches after *.4800 was generated. This seemed to
  free ~240MB - but this still leaves ~500MB lost. I then left the
  system idle for a further 20 minutes, and MemAvailable didn't seem
  to be increasing significantly.

  Note, the data attached is from running on kernel 4.18.7-041807-generic
  #201809090930 SMP Sun Sep 9 09:33:16 UTC 2018 (which I ran to verify the
  issue still exists in the latest kernel). However, I was unable to run
  ubuntu-bug linux on this kernel, as it complained about:
  *** Problem in linux-image-4.18.7-041807-generic

  The problem cannot be reported:

  This report is about a package that is not installed.

  So I switched back to 4.15.0-32.35-generic to raise the bug.

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-32-generic 4.15.0-32.35
  ProcVersionSignature: Ubuntu 4.15.0-32.35-generic 4.15.18
  Uname: Linux 4.15.0-32-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Sep 13 08:55 seq
   crw-rw---- 1 root audio 116, 33 Sep 13 08:55 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.2
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  Date: Thu Sep 13 08:55:46 2018
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb:
   Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd 
   Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
  MachineType: Xen HVM domU
  PciMultimedia:
   
  ProcEnviron:
   LANG=C.UTF-8
   SHELL=/bin/bash
   TERM=xterm
   PATH=(custom, no user)
  ProcFB:
   
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-32-generic root=UUID=6a84f0e4-8522-41cd-8ecb-d4a6fbecef8a ro earlyprintk
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-32-generic N/A
   linux-backports-modules-4.15.0-32-generic  N/A
   linux-firmware                             N/A
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  WifiSyslog:
   
  dmi.bios.date: 08/13/2018
  dmi.bios.vendor: Xen
  dmi.bios.version: 4.7.5-1.21
  dmi.chassis.type: 1
  dmi.chassis.vendor: Xen
  dmi.modalias: dmi:bvnXen:bvr4.7.5-1.21:bd08/13/2018:svnXen:pnHVMdomU:pvr4.7.5-1.21:cvnXen:ct1:cvr:
  dmi.product.name: HVM domU
  dmi.product.version: 4.7.5-1.21
  dmi.sys.vendor: Xen

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1792349/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
