Introduce hierarchical per-cpu counters and use them for rss tracking to
fix the per-mm RSS tracking which has become too inaccurate for OOM
killer purposes on large many-core systems.

The approach proposed here is to replace this by the hierarchical
per-cpu counters, which bounds the inaccuracy based on the system
topology with O(N*logN).

Relevant delta since v7: Initialize the subsystem earlier in
start_kernel() so it is ready before any mm is created. Introduce and
use a precise sum positive API to cover the scenario where an unlucky
precise sum iteration happens concurrently with a sequence of counter
updates that makes it observe a negative sum.

Testing and feedback are welcome!

Thanks,

Mathieu

Cc: Andrew Morton <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Martin Liu <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: [email protected]
Cc: Shakeel Butt <[email protected]>
Cc: SeongJae Park <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Sweet Tea Dorminy <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: "Liam R . Howlett" <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Al Viro <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Yu Zhao <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Mateusz Guzik <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Aboorva Devarajan <[email protected]>

Mathieu Desnoyers (2):
  lib: Introduce hierarchical per-cpu counters
  mm: Fix OOM killer inaccuracy on large many-core systems

 include/linux/mm.h                  |  10 +-
 include/linux/mm_types.h            |   4 +-
 include/linux/percpu_counter_tree.h | 217 +++++++++++++++
 include/trace/events/kmem.h         |   2 +-
 init/main.c                         |   2 +
 kernel/fork.c                       |  32 ++-
 lib/Makefile                        |   1 +
 lib/percpu_counter_tree.c           | 392 ++++++++++++++++++++++++++++
 8 files changed, 641 insertions(+), 19 deletions(-)
 create mode 100644 include/linux/percpu_counter_tree.h
 create mode 100644 lib/percpu_counter_tree.c

-- 
2.39.5

Reply via email to