[
https://issues.apache.org/jira/browse/MESOS-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158045#comment-16158045
]
Qian Zhang commented on MESOS-7927:
-----------------------------------
After some investigation, I think this issue is even worse than we thought.
It can happen in several cases:
# Run {{mesos-execute}} to launch a command task (e.g., {{sleep 5}}). When the
task finishes, the executor container is not removed from the {{containers_}}
map of the composing containerizer.
# Run {{mesos-execute}} to launch a long-running command task (e.g., {{sleep
1000}}), and then kill {{mesos-execute}} with {{Ctrl+C}}. The executor container
is not removed from the {{containers_}} map of the composing containerizer.
# Run {{mesos-execute}} to launch a task group that has one task (e.g.,
{{sleep 5}}). When the task finishes, neither the nested container nor the
executor container is removed from the {{containers_}} map of the composing
containerizer.
# Run {{mesos-execute}} to launch a task group that has one long-running task
(e.g., {{sleep 1000}}), and then kill {{mesos-execute}} with {{Ctrl+C}}. The
nested container is removed from the {{containers_}} map of the composing
containerizer, but the executor container is not.
So the executor container is not removed in any of these cases. The nested
container is removed in case 4 because we chain the {{destroy()}} of the
composing containerizer when killing the nested container:
https://github.com/apache/mesos/blob/1.4.0-rc4/src/slave/containerizer/composing.cpp#L654:L657.
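To make the pattern concrete, here is a minimal, self-contained sketch of the leak and of the fix suggested in the issue description (chaining the cleanup at launch time). It is plain C++ and does not use the real Mesos or libprocess APIs: {{FakeContainerizer}}, {{ComposingModel}}, {{onTermination}}, and {{trackedAfterExit}} are hypothetical names introduced only for illustration, and the real code is asynchronous.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an underlying containerizer; NOT the Mesos API.
class FakeContainerizer {
public:
  // Registers a callback to run when the container terminates.
  void onTermination(const std::string& id, std::function<void()> callback) {
    callbacks_[id] = std::move(callback);
  }

  // Simulates the container exiting (on its own or because it was killed).
  void terminate(const std::string& id) {
    auto it = callbacks_.find(id);
    if (it == callbacks_.end()) {
      return;  // Nobody asked to be notified.
    }
    std::function<void()> callback = std::move(it->second);
    callbacks_.erase(it);
    callback();
  }

private:
  std::unordered_map<std::string, std::function<void()>> callbacks_;
};

// Hypothetical model of the composing containerizer's bookkeeping.
class ComposingModel {
public:
  ComposingModel(FakeContainerizer* underlying, bool chainOnLaunch)
    : underlying_(underlying), chainOnLaunch_(chainOnLaunch) {}

  void launch(const std::string& id) {
    containers_[id] = underlying_;
    if (chainOnLaunch_) {
      // Assumed fix: chain the cleanup when the container is launched, so
      // the entry is erased however the container ends.
      underlying_->onTermination(id, [this, id]() { containers_.erase(id); });
    }
  }

  // Explicit destroy: the only cleanup path when nothing is chained.
  void destroy(const std::string& id) {
    containers_.erase(id);
    underlying_->terminate(id);
  }

  std::size_t tracked() const { return containers_.size(); }

private:
  FakeContainerizer* underlying_;
  bool chainOnLaunch_;
  std::unordered_map<std::string, FakeContainerizer*> containers_;
};

// Launches one container, lets it exit by itself (destroy() is never
// called), and reports how many entries remain in the map.
std::size_t trackedAfterExit(bool chainOnLaunch) {
  FakeContainerizer underlying;
  ComposingModel composing(&underlying, chainOnLaunch);
  composing.launch("executor-1");
  assert(composing.tracked() == 1);
  underlying.terminate("executor-1");
  return composing.tracked();
}
```

Under these assumptions, {{trackedAfterExit(false)}} returns 1 (the entry leaks, as in case 1 above), while {{trackedAfterExit(true)}} returns 0 because the cleanup was chained at launch.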
> The composing containerizer leaks memory in some scenarios.
> -----------------------------------------------------------
>
> Key: MESOS-7927
> URL: https://issues.apache.org/jira/browse/MESOS-7927
> Project: Mesos
> Issue Type: Bug
> Reporter: Anand Mazumdar
> Assignee: Qian Zhang
> Priority: Critical
>
> In some cases, the composing containerizer does not remove an active
> container from its internal {{containers}} hashmap of known active
> containers. This can happen when the container terminates on its own, which
> means that {{destroy()}} is not invoked for such containers.
> Ideally, we should chain the {{destroy}} callback when launching the
> container itself.