Radix tree nodes preallocated using radix_tree_preload will not necessarily be used by the process that preallocated them. As a result, a radix tree node can be accounted to one memory cgroup while actually being used by another. This discrepancy can have nasty consequences: think what will happen if a lot of radix tree nodes are accounted to a cgroup which is constantly tight on memory, but are used for storing elements that belong to a cgroup which never experiences memory pressure. This can easily happen with the page cache.
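To make the mismatch concrete, here is a toy userspace model of it. This is only a sketch, not kernel code: toy_pool, toy_preload, toy_take and the two counter arrays are hypothetical names invented for illustration, and the pool stands in for the per-CPU preload array.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, not kernel code. All names are hypothetical. */
#define POOL_SIZE	4
#define NR_CGROUPS	2

struct toy_node {
	int charged_to;	/* cgroup that paid for the allocation */
	int used_by;	/* cgroup whose data the node indexes, -1 if unused */
};

struct toy_pool {
	struct toy_node nodes[POOL_SIZE];
	size_t nr;	/* number of preloaded nodes currently in the pool */
};

static long charged[NR_CGROUPS];	/* nodes accounted to each cgroup */
static long in_use[NR_CGROUPS];		/* nodes actually serving each cgroup */

/* Fill the pool on behalf of cgroup cg, charging every new node to it. */
static void toy_preload(struct toy_pool *pool, int cg)
{
	while (pool->nr < POOL_SIZE) {
		struct toy_node *n = &pool->nodes[pool->nr++];

		n->charged_to = cg;
		n->used_by = -1;
		charged[cg]++;
	}
}

/* Take a node from the pool for cgroup cg; the charge does not follow. */
static struct toy_node *toy_take(struct toy_pool *pool, int cg)
{
	struct toy_node *n;

	assert(pool->nr > 0);
	n = &pool->nodes[--pool->nr];
	n->used_by = cg;
	in_use[cg]++;
	return n;
}
```

If cgroup 0 calls toy_preload() and cgroup 1 later calls toy_take() on the same pool, the returned node has charged_to == 0 and used_by == 1: cgroup 0 keeps paying for memory that only cgroup 1 benefits from.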
I believe the true fix would be assigning a radix tree preload to a memcg, but this should be discussed upstream first. This patch is a quick fix, which simply disables preload accounting. Since most radix tree allocations proceed via preloads, this effectively means that we will hardly ever account radix tree nodes. Not a big deal though, because we never accounted them before Vz7.

https://jira.sw.ru/browse/PSBM-35205

Signed-off-by: Vladimir Davydov <[email protected]>
---
 lib/radix-tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index d5c2fa1a4102..a8d10c4516d7 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -273,7 +273,8 @@ int radix_tree_preload(gfp_t gfp_mask)
 	rtp = &__get_cpu_var(radix_tree_preloads);
 	while (rtp->nr < ARRAY_SIZE(rtp->nodes)) {
 		preempt_enable();
-		node = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
+		node = kmem_cache_alloc(radix_tree_node_cachep,
+					gfp_mask | __GFP_NOACCOUNT);
 		if (node == NULL)
 			goto out;
 		preempt_disable();
-- 
2.1.4
