https://fedoraproject.org/wiki/Changes/KDEEarlyOOM

== Summary ==
As [[Changes/EnableEarlyoom|Fedora Workstation did in F32]], install the
earlyoom package in Fedora KDE and enable it by default. If both RAM and
swap go below 10% free, earlyoom sends SIGTERM to the process with the
largest oom_score. If both RAM and swap go below 5% free, earlyoom sends
SIGKILL to the process with the largest oom_score. The idea is to recover
from out-of-memory situations sooner, rather than suffer the typical
complete system hang in which the user has no choice but to force power off.

== Owner ==
* Name: [[User:bcotton|Ben Cotton]]
* Email: bcot...@redhat.com

== Detailed Description ==
Shamelessly copied from Workstation, which did it in the last release:

Certain workloads have heavy memory demands, quickly consume all of RAM,
and start to page out heavily to swap. (Heavy paging is often called "swap
thrashing" for added descriptive effect, probably because it's noticeable
and annoying.) Incidental swap usage is a good thing: it frees up memory
for active pages used by a process. Heavy swap usage quickly leads to a
very negative UX, because it's slow, even on modern SSDs. Due to installer
defaults, the swap partition is made the same size as available memory (at
install time), which can be huge. This just extends swap thrashing time.

On the one hand, we want this resource-hungry job to complete. On the other
hand, we want our system to stay responsive while that work is going
on. But once the GUI stutters or even comes to an apparent standstill
(hang), we're really wishing the kernel oom-killer would kick in and free
up memory, so we can start over (maybe using memory- or thread-limiting
options, which arguably should be figured out more intelligently; that
too is a work in progress, but beyond the scope of this feature).

However, once in a heavy swap scenario, it's relatively common for the
system to get stuck there: GUI interactivity is terrible to non-existent,
and the kernel oom-killer still doesn't trigger. From a certain point of
view, this is working as intended. The kernel oom-killer is concerned with
keeping the kernel running. It's not at all concerned with user space
responsiveness.

Instead of the system becoming completely unresponsive for tens of minutes,
hours or days, this feature expects that an offending process (determined
by oom_score, same as the kernel oom-killer) will be killed off within
seconds or a few minutes.
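
To make the selection rule concrete, here is a minimal Python sketch of picking the process with the largest oom_score from /proc. It is an illustration only, not earlyoom's actual implementation; the function names read_oom_scores and pick_victim are made up for this example.

```python
from pathlib import Path


def read_oom_scores(proc_root="/proc"):
    """Return {pid: oom_score} for every process whose score is readable."""
    scores = {}
    for entry in Path(proc_root).iterdir():
        # Only numeric directory names under /proc are process IDs.
        if not entry.name.isdigit():
            continue
        try:
            text = (entry / "oom_score").read_text()
            scores[int(entry.name)] = int(text.strip())
        except (OSError, ValueError):
            # The process exited, or the file was missing or unreadable.
            continue
    return scores


def pick_victim(scores):
    """Return the PID with the largest oom_score, or None if there is none."""
    if not scores:
        return None
    return max(scores, key=scores.get)
```

On a running Linux system, pick_victim(read_oom_scores()) returns the PID that would be the first candidate under this simplified rule.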

== Benefit to Fedora ==

KDE users will be able to take advantage of the benefits Workstation users
got from enabling earlyoom in Fedora 32:

* Improved user experience by more quickly regaining control over one's
system, rather than having to force power off in low-memory situations
where there's aggressive swapping. Once a system becomes unresponsive, it's
completely reasonable for the user to assume the system is lost, but
forcing power off carries a high potential for data loss.
* Reducing forced poweroff as the main workaround will increase data
collection, improving understanding of low-memory situations and how to
handle them better
* earlyoom first sends SIGTERM to the chosen process, so it has a chance of
a proper shutdown, unlike the kernel's oom-killer
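
As an illustration of the last point, an application can only use that grace period if it handles SIGTERM. The following Python sketch shows one common pattern, where the handler only sets a flag and the main loop saves state and exits. The class name GracefulShutdown is hypothetical and is not part of earlyoom or any Fedora package.

```python
import signal


class GracefulShutdown:
    """Records that SIGTERM arrived so the main loop can save and exit."""

    def __init__(self):
        self.requested = False

    def handle(self, signum, frame):
        # Keep the handler tiny: just note the request.
        self.requested = True

    def install(self):
        # Route SIGTERM to handle(); SIGKILL cannot be caught at all.
        signal.signal(signal.SIGTERM, self.handle)
```

A long-running program would call install() once at startup and check the requested attribute between units of work, writing out unsaved data before exiting.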

== Scope ==
* Proposal owners:
** Modify {{code|https://pagure.io/fedora-comps/blob/master/f/comps-f33.xml.in}}
to include the earlyoom package in the {{code|kde-desktop}} section.
** Add {{code|https://src.fedoraproject.org/rpms/fedora-release/blob/master/f/80-kde.preset}}
to include:
<pre>
# enable earlyoom by default on KDE
enable earlyoom.service
</pre>

* Other developers: None, unless KDE-based Spins/Labs want to opt out

* Release engineering: N/A
* Policies and guidelines: N/A
* Trademark approval: N/A
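
For context on the preset entry in the list above: systemd preset files are read line by line, and the first directive whose pattern matches a unit decides its default state. A rough Python sketch of that lookup for a single file follows. The helper name preset_action is hypothetical, and real preset handling (several files across directories, instance lists) is more involved than this.

```python
from fnmatch import fnmatchcase


def preset_action(preset_text, unit):
    """Return "enable" or "disable" for unit, or None if no line matches."""
    for raw in preset_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) < 2 or fields[0] not in ("enable", "disable"):
            continue
        # Patterns are shell-style globs; the first match wins.
        if fnmatchcase(unit, fields[1]):
            return fields[0]
    return None
```

With the two preset lines proposed above as input, preset_action(text, "earlyoom.service") returns "enable".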

== Upgrade/compatibility impact ==
earlyoom.service will be enabled on upgrade. An upgraded system should
exhibit the same behaviors as a newly-installed system.

== How To Test ==
* Fedora 31/32 KDE users can test today:
** {{code|sudo dnf install earlyoom}}
** {{code|sudo systemctl enable --now earlyoom}}

* Then attempt to cause an out-of-memory situation. Examples:
** {{code|tail /dev/zero}}
** https://lkml.org/lkml/2019/8/4/15
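
While running the tests above, it can help to watch how close the system is to the thresholds. The Python sketch below computes the percentage of available RAM and free swap from the contents of /proc/meminfo. It assumes that MemAvailable and SwapFree are the relevant fields; the function names are made up for this example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into {field name: value} (mostly kiB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def percent_free(values):
    """Return (mem_pct, swap_pct); swap_pct is None when there is no swap."""
    mem_pct = 100.0 * values["MemAvailable"] / values["MemTotal"]
    swap_total = values.get("SwapTotal", 0)
    swap_pct = 100.0 * values["SwapFree"] / swap_total if swap_total else None
    return mem_pct, swap_pct
```

To use it, read /proc/meminfo into a string, pass it through parse_meminfo, and print the result of percent_free in a loop while the test workload runs.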

== User Experience ==
earlyoom sends SIGTERM to processes based on oom_score when both memory and
swap have less than 10% free and SIGKILL when below 5%.
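
The two-threshold rule can be written down as a small decision function. This Python sketch only mirrors the percentages stated in this proposal. It is not earlyoom's code, the name choose_signal is hypothetical, and it does not cover how a system without swap is treated.

```python
import signal

TERM_PERCENT = 10.0  # SIGTERM threshold from this proposal
KILL_PERCENT = 5.0   # SIGKILL threshold from this proposal


def choose_signal(mem_free_pct, swap_free_pct,
                  term_pct=TERM_PERCENT, kill_pct=KILL_PERCENT):
    """Return the signal to send, or None if no action is needed."""
    # Both RAM and swap must be below a threshold for it to apply.
    if mem_free_pct < kill_pct and swap_free_pct < kill_pct:
        return signal.SIGKILL
    if mem_free_pct < term_pct and swap_free_pct < term_pct:
        return signal.SIGTERM
    return None
```

Note that low RAM alone does not trigger anything in this sketch: with 3% RAM free but 50% swap free, choose_signal returns None.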

== Dependencies ==
None

== Contingency Plan ==

* Contingency mechanism: Owner reverts changes
* Contingency deadline: Final freeze
* Blocks release? No

== Documentation ==
* {{code|man earlyoom}}
* https://github.com/rfjakob/earlyoom
* https://www.kernel.org/doc/gorman/html/understand/understand016.html

== Release Notes ==
The earlyoom service is now enabled by default in Fedora KDE.

The earlyoom service monitors system memory usage. If free memory falls
below a set limit, earlyoom terminates an appropriate process to free up
memory. As a result, the system does not become unresponsive for long
periods of time in low-memory situations.

The following is the default earlyoom configuration:

* If both RAM and swap go below 10% free, earlyoom sends the SIGTERM signal
to the process with the largest oom_score.
* If both RAM and swap go below 5% free, earlyoom sends the SIGKILL signal
to the process with the largest oom_score.

For more information, see the earlyoom man page.

-- 
Ben Cotton
He / Him / His
Senior Program Manager, Fedora & CentOS Stream
Red Hat
TZ=America/Indiana/Indianapolis
_______________________________________________
devel-announce mailing list -- devel-announce@lists.fedoraproject.org