[
https://issues.apache.org/jira/browse/MESOS-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373837#comment-15373837
]
John Garcia commented on MESOS-5836:
------------------------------------
[~vinodkone] [Docker issue #24559
filed|https://github.com/docker/docker/issues/24559]
> Memory cgroup leakage in 4.2, 4.4, 4.5 kernels
> ----------------------------------------------
>
> Key: MESOS-5836
> URL: https://issues.apache.org/jira/browse/MESOS-5836
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Affects Versions: 0.28.1, 0.28.2, 1.0.0, 1.1.0
> Reporter: John Garcia
> Labels: mesosphere
>
> We've noticed an issue with kernel versions 4.2, 4.4, and 4.5 where memory
> cgroups are not cleaned up by the system. Once the register fills up with
> 65336 cgroups, additional cgroups cannot be created because there are no IDs
> left for them, and ENOSPC is returned. This is a concern for the Mesos
> project because Mesos cannot create any further containers in this state.
> In our tests, Docker 1.8.3 silently fails to create the memory cgroup,
> resulting in rogue containers that run with no memory limit.
> h3. Steps to reproduce:
> *NOTE: Mesos is not required to reproduce this issue*
> - Start a new instance using kernel 4.2, 4.4, or 4.5 (CoreOS 766-1010, Ubuntu
> 16.04)
> - ssh to the machine
> - {{cat /proc/cgroups}} to determine the number of memory cgroups
> - Run several docker containers using the {{--memory}} or {{-m}} option to
> set a memory limit, either in parallel or in series
> - Stop all containers
> - {{cat /proc/cgroups}} to review the number of memory cgroups and compare to
> previous run
> - Optional: Run 65,336 docker containers using memory isolation and then try
> to launch a Mesos container
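The before/after comparison in the steps above can be sketched as a small helper that extracts the memory subsystem's `num_cgroups` column from `/proc/cgroups`. This is only an illustrative sketch: the function names are hypothetical and not part of Mesos or Docker, and it assumes the usual four-column layout (`subsys_name`, `hierarchy`, `num_cgroups`, `enabled`).

```python
def memory_cgroup_count(cgroups_text: str) -> int:
    """Return the num_cgroups value for the memory subsystem.

    `cgroups_text` is the full content of /proc/cgroups.
    """
    for line in cgroups_text.splitlines():
        # Skip the "#subsys_name ..." header and any blank lines.
        if line.startswith("#") or not line.strip():
            continue
        # Columns: subsys_name, hierarchy, num_cgroups, enabled
        fields = line.split()
        if fields[0] == "memory":
            return int(fields[2])
    raise ValueError("memory subsystem not listed")


def read_memory_cgroup_count(path: str = "/proc/cgroups") -> int:
    """Read the file at `path` and return the memory cgroup count."""
    with open(path) as f:
        return memory_cgroup_count(f.read())
```

Calling `read_memory_cgroup_count()` once before starting the containers and once after stopping them gives the two numbers to compare. On an affected kernel the second value would be expected to stay higher than the first.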
> h3. Differential diagnosis:
> When the cgroup limit is exceeded, subsequent container terminations will
> draw the following error in {{dmesg}}:
> {code}idr_remove called for id=65536 which is not allocated.{code}
> Subsequent efforts to create a cgroup folder will fail:
> {code}/sys/fs/cgroup/memory/mesos $ df .
> Filesystem 1K-blocks Used Available Use% Mounted on
> cgroup 0 0 0 - /sys/fs/cgroup/memory
> /sys/fs/cgroup/memory/mesos $ sudo mkdir foo
> mkdir: cannot create directory 'foo': No space left on device{code}
> Subsequently launched Docker containers will fail to utilize memory
> isolation:
> {code}/sys/fs/cgroup/memory/mesos $ docker run -m 32m -d example/busybox sleep 10000
> ...
> /sys/fs/cgroup/memory/mesos $ docker ps | grep busybox
> 849c66081229 example/busybox "sleep 10000" 6 seconds ago Up 4 seconds suspicious_mahavira
> /sys/fs/cgroup/memory/mesos $ find /sys/fs/cgroup -name "*849c66081229*"
> /sys/fs/cgroup/blkio/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/freezer/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/devices/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/cpu,cpuacct/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/cpuset/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/net_cls,net_prio/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/systemd/system.slice/docker-849c6608122989f1bc9ae39a5c70281228a304092baa0d73d9430ed94223f554.scope
> /sys/fs/cgroup/memory/mesos $ {code}
> The Mesos containerizer will fail with {{No space left on device}}:
> {code}E0707 20:17:29.091142 105665 slave.cpp:3802] Container
> 'ef5419cf-9d00-425a-a9ee-a848d330bfb2' for executor
> 'node-0_executor__42a4fafe-f64d-4b41-91d2-efc20a86a6a3' of framework
> d6ab251a-064a-46a0-a1c8-9ee559f3b44a-0023 failed to start: Failed to prepare
> isolator: Failed to create directory
> '/sys/fs/cgroup/memory/mesos/ef5419cf-9d00-425a-a9ee-a848d330bfb2': No space
> left on device
> {code}
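The `mkdir` failure shown above can also be probed programmatically. The following is a minimal sketch, assuming a probe directory may be created and removed under the given parent (for example `/sys/fs/cgroup/memory/mesos`, which normally requires root). The function name and the `probe-` prefix are hypothetical.

```python
import errno
import os
import uuid


def can_create_cgroup(parent: str) -> bool:
    """Try to create and remove a probe directory under `parent`.

    Returns False when mkdir fails with ENOSPC ("No space left on
    device"); any other error is re-raised to the caller.
    """
    probe = os.path.join(parent, "probe-" + uuid.uuid4().hex)
    try:
        os.mkdir(probe)
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return False
        raise
    # Creation succeeded, so remove the probe again.
    os.rmdir(probe)
    return True
```

A `False` result corresponds to the state described in this issue, in which no new memory cgroup can be created.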
> h3. Remediation
> Once a system is found to be affected, the following command can be used to
> drop all page caches, which allows the system to reap all of the old cgroups
> and return to normal operation.
> {code}echo 1 > /proc/sys/vm/drop_caches{code}
> We suspect that [patch 9184539|https://patchwork.kernel.org/patch/9184539/]
> could fix it, but we have not yet tested it.
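The remediation command quoted above could be wrapped in a guarded helper along these lines. This is a sketch, not a tested fix: the `threshold` parameter and both function names are hypothetical, and writing to `/proc/sys/vm/drop_caches` requires root.

```python
DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_page_cache(path: str = DROP_CACHES) -> None:
    """Write "1" to `path`, equivalent to `echo 1 > /proc/sys/vm/drop_caches`."""
    with open(path, "w") as f:
        f.write("1\n")


def remediate_if_needed(count: int, threshold: int, path: str = DROP_CACHES) -> bool:
    """Drop the page cache when `count` has reached `threshold`.

    `count` is the current number of memory cgroups; `threshold` is a
    caller-chosen limit (an assumption, not a value from the report).
    Returns True if the write was performed.
    """
    if count < threshold:
        return False
    drop_page_cache(path)
    return True
```

The `path` argument exists only so the helper can be exercised against an ordinary file without root privileges.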