** Description changed:

  ==Case Study==
  I was trying to figure out why Ubuntu was killing my program even though
  real memory pressure on the system was not that high; virtual memory usage,
  however, seemed abnormally high compared to other platforms (which, as it
  turns out, just do opportunistic page reclaim).  It turned out to be a
  quirk of the lazy allocator: if there is not enough real memory, the kernel
  is trigger-happy with the OOM killer instead of first trying to solve the
  problem by reclaiming pages.  I understand that the memory pressure
  watermark exists so heavy scans aren't taking place all the time, but if
  the system encounters a potential OOM, there should be a policy setting to
  try a heavy scan before going straight to OOM killing.
  
  ==Summary==
  The kernel will leave memory mapped into a program's address space as part
  of lazy allocation, and will only reclaim pages when an allocation request
  both pushes total free memory (real + swap) below `vm.min_free_kbytes` and
  does not exceed the free memory available.
  
  Because the kernel prefers to do this, any long-lived process will
  accumulate a huge pool of over-committed virtual memory.
  
  If the amount of memory available is above the watermark, and any program
  in the system then makes an allocation request which exceeds the total free
  and available memory, the OOM killer is launched to kill programs.
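  
  For context, the policy and watermark described above, and the gap between
  committed and resident memory, can be inspected directly (a read-only
  sketch; `$$` just inspects the current shell and would be replaced with the
  PID of the long-running process of interest):
  ```
  # Overcommit policy (0 = heuristic, 1 = always grant, 2 = strict) and watermark
  cat /proc/sys/vm/overcommit_memory
  cat /proc/sys/vm/min_free_kbytes
  
  # Commit accounting versus what is actually available
  grep -E 'MemAvailable|CommitLimit|Committed_AS' /proc/meminfo
  
  # Lazy allocation shows up as VmSize far larger than VmRSS
  grep -E 'VmSize|VmRSS' /proc/$$/status
  ```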
  
  ==The ideal case==
  Introduce a `vm.overcommit_memory` policy that attempts to reclaim or
  relocate memory before treating the system as OOM.
  
  If reclaim/relocate takes longer than a timeout, or if after compacting
  there is still not enough memory (or could not feasibly be enough, judged
  by a quick preflight sum), then treat the system as OOM, instead of
  immediately killing the most memory-hungry program.
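  
  For what it's worth, the closest manual approximation of such a heavy scan
  today is to trigger compaction and cache reclaim by hand through the
  existing /proc interfaces (a sketch; whether this frees enough memory in
  any given situation is workload-dependent):
  ```
  # Request compaction of all memory zones
  echo 1 | sudo tee /proc/sys/vm/compact_memory
  
  # Drop the clean page cache plus reclaimable dentries and inodes
  echo 3 | sudo tee /proc/sys/vm/drop_caches
  ```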
  
  ==The workaround==
  Setting
  ```
  sysctl -w vm.overcommit_memory=1  # always grant memory
  sysctl -w vm.min_free_kbytes=$A_LARGE_NUMBER_SAY_A_GIG  # e.g. 1048576 KiB for ~1 GiB
  ```
  allows the system to recover by sidestepping the issue, but Ubuntu's
  default `vm.min_free_kbytes` is too low for a memory-hungry program and
  something that wants periodic large allocations running at the same time.
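  
  To keep the workaround across reboots, the same values can live in a sysctl
  drop-in (a sketch; the file name and the 1 GiB figure are arbitrary choices
  for this example):
  ```
  # /etc/sysctl.d/90-oom-workaround.conf
  vm.overcommit_memory = 1
  vm.min_free_kbytes = 1048576
  ```
  followed by `sudo sysctl --system` to apply it without a reboot.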
  
  In particular, I was running a background scientific job and tried to watch
  a YouTube video in Firefox.  Either the database server wanted a large
  allocation for a transaction or Firefox wanted a large allocation for a new
  window or video buffer, but one of those allocations was too large and
  prompted the OOM killer to fire, even though by all accounts the amount of
  real memory in use was small, and neither task was actually using anywhere
  close to all of the memory mapped to it, because each had freed that memory
  and had simply been running for a while.
  
  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-image-5.8.0-53-generic 5.8.0-53.60~20.04.1
  ProcVersionSignature: Ubuntu 5.8.0-53.60~20.04.1-generic 5.8.18
  Uname: Linux 5.8.0-53-generic x86_64
  NonfreeKernelModules: nvidia_modeset nvidia
  ApportVersion: 2.20.11-0ubuntu27.17
  Architecture: amd64
  CasperMD5CheckResult: skip
  CurrentDesktop: ubuntu:GNOME
  Date: Tue May 25 13:12:15 2021
  InstallationDate: Installed on 2021-03-12 (74 days ago)
  InstallationMedia: Ubuntu 20.04.2.0 LTS "Focal Fossa" - Release amd64 (20210209.1)
  SourcePackage: linux-signed-hwe-5.8
  UpgradeStatus: No upgrade log present (probably fresh install)
+ 
+ 
+ -----------------------------
+ 
+ 
+ Update 2022-12-14
+ 
+ The bug seems to have been caused by a second OOM killer daemon
+ (systemd-oomd) supplied by systemd, which kills long-running programs too
+ aggressively.
+ 
+ sudo systemctl disable systemd-oomd.service
+ sudo systemctl mask systemd-oomd
+ 
+ https://askubuntu.com/questions/1404888/how-do-i-disable-the-systemd-oom-process-killer-in-ubuntu-22-04
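+ 
+ To confirm systemd-oomd is really out of the picture and to see what it
+ had been killing, the following checks should work on any release that
+ ships it (the grep is just an illustrative filter on the journal output):
+ 
+ ```
+ # Should report the unit as inactive and masked after the commands above
+ systemctl status systemd-oomd.service
+ 
+ # Review which cgroups/processes systemd-oomd killed in the past
+ journalctl -u systemd-oomd.service --no-pager | grep -i killed
+ ```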

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed-hwe-5.8 in Ubuntu.
https://bugs.launchpad.net/bugs/1929612

Title:
  OOMKiller dispatched when memory free but not reclaimed

Status in linux-signed-hwe-5.8 package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe-5.8/+bug/1929612/+subscriptions

