Public bug reported:
[Impact]
* On MGLRU-enabled systems, high memory pressure on NUMA nodes will cause page
allocation failures
* This happens because page reclaim does not wake up the flusher threads, so
dirty file pages are not written back quickly enough to be reclaimed
* OOM can be triggered even if the system has enough available memory
[Test Plan]
* To trigger the bug reliably, uninstall apport and use the attached
alloc_and_crash.c reproducer
* alloc_and_crash will mmap a huge range of memory, memset it and forcibly
SEGFAULT
* The attached bash script will membind alloc_and_crash to NUMA node 0, so we
can see the allocation failures in dmesg
$ sudo apt remove --purge apport
$ sudo dmesg -c; ./repro.bash; sleep 2; sudo dmesg
[Fix]
* The upstream patch wakes up flusher threads if there are too many dirty
entries in the coldest LRU generation
* This happens when trying to shrink lruvecs, so the flusher threads only get
woken up during high memory pressure
* Fix was introduced by commit:
1bc542c6a0d1 mm/vmscan: wake up flushers conditionally to avoid cgroup OOM
[Regression Potential]
* This commit fixes the memory reclaim path, so regressions would likely show
up during increased system memory pressure
* According to the upstream patch, increased SSD/disk wear is possible due
to waking up flusher threads, although this has not been observed in testing
** Affects: linux (Ubuntu)
Importance: High
Assignee: Heitor Alves de Siqueira (halves)
Status: Confirmed
** Affects: linux (Ubuntu Noble)
Importance: High
Assignee: Heitor Alves de Siqueira (halves)
Status: Confirmed
** Affects: linux (Ubuntu Oracular)
Importance: Medium
Assignee: Heitor Alves de Siqueira (halves)
Status: Confirmed
** Affects: linux (Ubuntu Plucky)
Importance: High
Assignee: Heitor Alves de Siqueira (halves)
Status: Confirmed
** Also affects: linux (Ubuntu Noble)
Importance: Undecided
Status: New
** Also affects: linux (Ubuntu Oracular)
Importance: Undecided
Status: New
** Also affects: linux (Ubuntu Plucky)
Importance: High
Assignee: Heitor Alves de Siqueira (halves)
Status: Confirmed
** Changed in: linux (Ubuntu Oracular)
Assignee: (unassigned) => Heitor Alves de Siqueira (halves)
** Changed in: linux (Ubuntu Noble)
Assignee: (unassigned) => Heitor Alves de Siqueira (halves)
** Changed in: linux (Ubuntu Oracular)
Importance: Undecided => High
** Changed in: linux (Ubuntu Noble)
Importance: Undecided => High
** Changed in: linux (Ubuntu Oracular)
Importance: High => Medium
** Changed in: linux (Ubuntu Oracular)
Status: New => Confirmed
** Changed in: linux (Ubuntu Noble)
Status: New => Confirmed
--
https://bugs.launchpad.net/bugs/2097214
Title:
MGLRU: page allocation failure on NUMA-enabled systems