[
https://issues.apache.org/jira/browse/MESOS-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15485880#comment-15485880
]
Qian Zhang commented on MESOS-6149:
-----------------------------------
I think MESOS-6063 already handles the case where the agent is restarted with
more cgroups subsystems enabled, and this ticket is meant to handle the case
where the agent is restarted with fewer cgroups subsystems enabled. For
example, before the restart the enabled subsystems are
{{cgroups/cpu,cgroups/mem,cgroups/net_cls}}, and after the restart they are
{{cgroups/cpu,cgroups/mem}}, i.e., {{cgroups/net_cls}} is disabled after the
agent is restarted.
However, I am not sure that checkpointing the used subsystems for a container
helps in this case. Once a subsystem is disabled after the agent is restarted,
even if we have checkpointed the container's used subsystems, we still have no
chance to do any cleanup for the disabled subsystem when the container
terminates (because the agent will not call that subsystem at all), so the
cgroups created for the container will remain there as garbage.
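To illustrate the concern, here is a minimal Python sketch. It is not Mesos
code: the function and variable names are hypothetical, and it only models
cleanup as being driven by the subsystems currently enabled on the agent.

```python
def cleanup_by_enabled(created, enabled):
    """Model cleanup that only visits subsystems enabled on the agent.

    `created` is the set of subsystems in which the container has cgroups;
    `enabled` is the set the agent loads after restart (both hypothetical).
    Returns the subsystems whose cgroups are left behind.
    """
    remaining = set(created)
    for subsystem in enabled:
        # Only enabled subsystems get a chance to remove their cgroup.
        remaining.discard(subsystem)
    return remaining


created_before_restart = {"cgroups/cpu", "cgroups/mem", "cgroups/net_cls"}
enabled_after_restart = {"cgroups/cpu", "cgroups/mem"}

# The cgroup under the disabled subsystem is never visited.
leaked = cleanup_by_enabled(created_before_restart, enabled_after_restart)
```

Under these assumptions, the {{cgroups/net_cls}} cgroup is the one left behind,
regardless of whether the used subsystems were checkpointed.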
One possible solution I have in mind: for a container created before the agent
restarts, we still use its checkpointed subsystems (even if some of them are
disabled after the restart), but for new containers created after the agent
restarts, we just use the subsystems enabled in the agent.
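A rough sketch of that selection rule in Python, for illustration only. The
names {{subsystems_for}}, {{checkpointed}} and {{enabled}} are hypothetical and
do not correspond to the actual Mesos isolator code.

```python
def subsystems_for(container_id, checkpointed, enabled):
    """Pick the subsystems to use for a container after an agent restart.

    `checkpointed` maps container IDs recovered from a checkpoint to the
    subsystems they were created with; `enabled` is the set of subsystems
    enabled on the restarted agent. Both are assumptions of this sketch.
    """
    if container_id in checkpointed:
        # Recovered container: keep using what it was created with,
        # even if some of those subsystems are now disabled.
        return set(checkpointed[container_id])
    # New container: use whatever the agent currently enables.
    return set(enabled)


checkpointed = {"c1": {"cgroups/cpu", "cgroups/mem", "cgroups/net_cls"}}
enabled = {"cgroups/cpu", "cgroups/mem"}

old_container = subsystems_for("c1", checkpointed, enabled)
new_container = subsystems_for("c2", checkpointed, enabled)
```

Here {{c1}} (created before the restart) keeps {{cgroups/net_cls}}, so its
cleanup can still cover that subsystem, while {{c2}} only gets the two
subsystems enabled after the restart.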
> Checkpoint used subsystems for containers
> -----------------------------------------
>
> Key: MESOS-6149
> URL: https://issues.apache.org/jira/browse/MESOS-6149
> Project: Mesos
> Issue Type: Improvement
> Reporter: haosdent
> Assignee: haosdent
>
> In MESOS-6063, we have tracked recovered and prepared subsystems for
> containers. To make it work better, we could checkpoint this information and
> recover it after the agent restarts.