Valera, Vasya, do we need this patch in vz8?

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 06/04/2021 06:18 PM, Valeriy Vdovin wrote:
From: Vasily Averin <[email protected]>

vz7 kernel can remove in-use memcg from parent-chield hierarchy tree,
and skip resize of its shrinker_map during registration of new shrinkers.
It allows memcg user to access memory beyond of end non-resized
shrinker_map, and corrupt next slab element.

CPU A                    CPU B                       CPU C
got memcg            cgroup_rmdir()                 sget_userns()
shrink_slab_memcg()  cgroup_destroy_locked()        register_shrinker()
  .                  cgroup_css_killed()            register_memcg_shrinker()
  .                   cgroup_offline_fn()           down_write(&shrinker_rwsem)
  .                   list_del_rcu(&cgrp->sibling)  memcg_expand_shrinker_maps()
wait for shrinker_rwsem                             re-allocs shrinker_maps of 
all memcg
    .                                                but ignores memcg used on 
CPU A
    .                                                because it was removed by 
CPU B
    .                                               releases shrinker_rwsem
accesses and modifies memory
 beyond end of own shrinker_map
  while using of new memcg_shrinkers
  registered on CPU C

v4:
- corrected WARN_ON condition

v3:
- corrected map->nr_max value, "- 1" was removed
- nr_max update inside shrink_slab_memcg()

v2 changes:
- map_size was in bytes, shrinker_nr_max was in bytes (thanks to ktkhai@)
- map_size was replaced by nr_max -- maximal allowed bit number
- added WARN_ONs to detect another possible corruption places

https://jira.sw.ru/browse/PSBM-100509
https://jira.sw.ru/browse/PSBM-100694

Signed-off-by: Vasily Averin <[email protected]>
Acked-by: Kirill Tkhai <[email protected]>
(cherry-picked from 2ad49c8b916379544fa9dd35f5059dd9ffb0ce1b)
https://jira.sw.ru/browse/PSBM-127849
Signed-off-by: Valeriy Vdovin <[email protected]>
---
 mm/vmscan.c | 8 ++++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 95e23c6b4590..23113db62eef 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -540,7 +540,7 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int 
nid,
 {
        struct memcg_shrinker_map *map;
        unsigned long freed = 0;
-       int ret, i;
+       int ret, i, nr_max;

        if (!memcg_kmem_enabled() || !mem_cgroup_online(memcg))
                return 0;
@@ -553,7 +553,9 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int 
nid,
        if (unlikely(!map))
                goto unlock;

-       for_each_set_bit(i, map->map, shrinker_nr_max) {
+       nr_max = min(shrinker_nr_max, map->nr_max);
+
+       for_each_set_bit(i, map->map, nr_max) {
                struct shrink_control sc = {
                        .gfp_mask = gfp_mask,
                        .nid = nid,
@@ -594,6 +596,7 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int 
nid,
                                memcg_set_shrinker_bit(memcg, nid, i);
                }
                freed += ret;
+               nr_max = min(shrinker_nr_max, map->nr_max);
        }
 unlock:
        up_read(&shrinker_rwsem);
@@ -663,6 +666,7 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
                if (ret == SHRINK_EMPTY)
                        ret = 0;
                freed += ret;
                /*
                 * Bail out if someone want to register a new shrinker to
                 * prevent the regsitration from being stalled for long periods

_______________________________________________
Devel mailing list
[email protected]
https://lists.openvz.org/mailman/listinfo/devel

Reply via email to