Radix tree nodes preallocated using radix_tree_preload will not
necessarily be used by the process that preallocated them. As a result, a
radix tree node can be accounted to one memory cgroup while actually
being used by another. This discrepancy can have nasty
consequences: consider what happens if a lot of radix tree nodes are
accounted to a cgroup that is constantly tight on memory, but are used for
storing elements that belong to a cgroup that never experiences memory
pressure. This can easily happen with the page cache.

I believe the true fix would be to assign a radix tree preload to a
memcg, but this should be discussed upstream first. This patch is a
quick fix that simply disables preload accounting. Since most radix
tree allocations go through preloads, this effectively means that we
will hardly ever account radix tree nodes. This is not a big deal though,
because we never accounted them before Vz7.
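
To illustrate the mismatch described above, here is a minimal user-space
sketch. It is only a model: cgroup_model, node_model, model_preload,
model_insert and model_scenario are hypothetical names, not kernel APIs,
and the pool size of 4 is an arbitrary assumption. It shows a shared pool
filled by one group and drained by another, with and without accounting.

```c
/* User-space model only: none of these names are kernel APIs. */
#include <stdio.h>

#define POOL_SIZE 4	/* arbitrary size chosen for the illustration */

struct cgroup_model {
	const char *name;
	int charged;	/* nodes accounted to this group */
	int in_use;	/* nodes actually holding this group's data */
};

struct node_model {
	struct cgroup_model *charged_to;
};

/* Stand-in for the per-CPU preload pool shared by all tasks on a CPU. */
static struct node_model pool[POOL_SIZE];
static int pool_nr;

/* Top up the pool; with accounting on, each new node is charged to the caller. */
static void model_preload(struct cgroup_model *caller, int account)
{
	while (pool_nr < POOL_SIZE) {
		pool[pool_nr].charged_to = account ? caller : NULL;
		if (account)
			caller->charged++;
		pool_nr++;
	}
}

/* Take one node from the pool on behalf of 'user'; the charge does not move. */
static int model_insert(struct cgroup_model *user)
{
	if (pool_nr == 0)
		return -1;
	pool_nr--;
	user->in_use++;
	return 0;
}

/* Group a fills the pool and uses one node; group b tops up and uses three. */
static void model_scenario(int account, struct cgroup_model *a,
			   struct cgroup_model *b)
{
	int i;

	pool_nr = 0;
	model_preload(a, account);
	model_insert(a);
	model_preload(b, account);
	for (i = 0; i < 3; i++)
		model_insert(b);
}

#ifdef DEMO_MAIN
int main(void)
{
	struct cgroup_model a = { "A", 0, 0 }, b = { "B", 0, 0 };

	model_scenario(1, &a, &b);
	printf("%s: charged %d, in use %d\n", a.name, a.charged, a.in_use);
	printf("%s: charged %d, in use %d\n", b.name, b.charged, b.in_use);
	return 0;
}
#endif
```

Building with -DDEMO_MAIN runs the accounted case, where group A ends up
charged for more nodes than it uses while group B uses more than it is
charged for. Passing 0 as the first argument of model_scenario models the
quick fix: nothing is charged to either group.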

https://jira.sw.ru/browse/PSBM-35205

Signed-off-by: Vladimir Davydov <[email protected]>
---
 lib/radix-tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index d5c2fa1a4102..a8d10c4516d7 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -273,7 +273,8 @@ int radix_tree_preload(gfp_t gfp_mask)
        rtp = &__get_cpu_var(radix_tree_preloads);
        while (rtp->nr < ARRAY_SIZE(rtp->nodes)) {
                preempt_enable();
-               node = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
+               node = kmem_cache_alloc(radix_tree_node_cachep,
+                                       gfp_mask | __GFP_NOACCOUNT);
                if (node == NULL)
                        goto out;
                preempt_disable();
-- 
2.1.4
