So, this one is tricky. Right now all but one of the memory allocations in the netlink code happen in process context and go through kmalloc/slub, so they are automatically accounted into kmem.
The single exception is netlink_alloc_large_skb(), where big outgoing packets are allocated with vmalloc. The good news is that the only use case for it right now seems to be the newest netfilter user-space code, which tries to load HUGE netfilter tables into the kernel via the netlink API. Since this is very likely not the case for our containers, we can just disable this new feature (it appeared in 3.10 with c05cdb1b86) for everyone but the host.

One more pain here is mapped sockets. They are also relatively new, and pages that belong to the kernel but are mapped into a process VM are not tracked. This is like memory that is vmsplice-d into a pipe and then unmapped: it is unaccounted, but still occupies space.

Both issues are worth revisiting.

Signed-off-by: Pavel Emelyanov <xe...@parallels.com>
---
 net/netlink/af_netlink.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 94d635f..734a68a 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1561,7 +1561,13 @@ static struct sk_buff *netlink_alloc_large_skb(unsigned int size,
 	struct sk_buff *skb;
 	void *data;
 
-	if (size <= NLMSG_GOODSIZE || broadcast)
+	if (size <= NLMSG_GOODSIZE || broadcast ||
+			/*
+			 * Once we have vmalloc_kmem() that would account
+			 * allocated pages into memcg, this check can be
+			 * removed.
+			 */
+			!ve_is_super(get_exec_env()))
 		return alloc_skb(size, GFP_KERNEL);
 
 	size = SKB_DATA_ALIGN(size) +
-- 
1.8.3.1
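
For a quick sanity check of the condition in the hunk above, its decision logic can be modelled in userspace. This is only a sketch, not kernel code: MODEL_GOODSIZE, enum alloc_path, pick_alloc_path(), run_examples() and the is_host flag are hypothetical stand-ins (for NLMSG_GOODSIZE and ve_is_super(get_exec_env()) respectively), and the threshold value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical threshold standing in for NLMSG_GOODSIZE; the value is arbitrary. */
#define MODEL_GOODSIZE 4096u

/* Which allocator the modelled netlink_alloc_large_skb() would pick. */
enum alloc_path {
	ALLOC_SKB,	/* plain alloc_skb(), i.e. the kmalloc path */
	ALLOC_VMALLOC	/* large vmalloc-backed skb */
};

/*
 * Model of the patched condition: small messages, broadcasts and
 * anything sent from outside the host (is_host == false stands in for
 * !ve_is_super(get_exec_env())) take the alloc_skb() path.
 */
enum alloc_path pick_alloc_path(unsigned int size, bool broadcast, bool is_host)
{
	if (size <= MODEL_GOODSIZE || broadcast || !is_host)
		return ALLOC_SKB;
	return ALLOC_VMALLOC;
}

/* Walks through the four interesting input combinations. */
void run_examples(void)
{
	unsigned int big = MODEL_GOODSIZE * 4;

	/* Small message on the host: unchanged, alloc_skb(). */
	assert(pick_alloc_path(128, false, true) == ALLOC_SKB);
	/* Large broadcast on the host: unchanged, alloc_skb(). */
	assert(pick_alloc_path(big, true, true) == ALLOC_SKB);
	/* Large unicast on the host: still allowed to use vmalloc. */
	assert(pick_alloc_path(big, false, true) == ALLOC_VMALLOC);
	/* Large unicast from a container: now forced onto alloc_skb(). */
	assert(pick_alloc_path(big, false, false) == ALLOC_SKB);
}
```

In this model only a large unicast message sent from the host still reaches the vmalloc path; the same message sent from a container falls back to alloc_skb(), which per the description above goes through the accounted kmalloc path.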