[ https://issues.apache.org/jira/browse/MESOS-10126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Qian Zhang reassigned MESOS-10126:
----------------------------------

          Sprint: Studio 1: RI-23 68
    Story Points: 3
        Assignee: Qian Zhang

RR: [https://reviews.apache.org/r/72516/]

> Docker volume isolator needs to clean up the `info` struct regardless the result of unmount operation
> -----------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-10126
>                 URL: https://issues.apache.org/jira/browse/MESOS-10126
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>            Reporter: Qian Zhang
>            Assignee: Qian Zhang
>            Priority: Critical
>
> Currently, when
> [DockerVolumeIsolatorProcess::cleanup()|https://github.com/apache/mesos/blob/1.9.0/src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp#L610]
> is called, we unmount the volume first. If the unmount operation fails, we
> do not remove the container's checkpoint directory and do NOT erase the
> container's `info` struct from `infos`. This is problematic, because the
> `info` left behind in `infos` keeps the reference count of the volume
> greater than 0 even though the volume is not actually being used by any
> container. The next time another container using this volume is destroyed,
> we will NOT unmount the volume, since its reference count will be greater
> than 1 (it will be 2; see
> [here|https://github.com/apache/mesos/blob/1.9.0/src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp#L631:L651]
> for details), so we will never have a chance to unmount this volume.
> This issue has existed since the Mesos 1.0.0 release, when the Docker
> volume isolator was introduced.
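
The reference-counting problem described in this issue can be illustrated with a minimal, self-contained sketch. This is not the actual Mesos code: `Info`, `Infos`, `references()`, `cleanup()` and the `unmount` callback are hypothetical, simplified stand-ins that only assume the behavior described in the issue (a volume is unmounted only when exactly one container references it). The sketch erases the container's `info` regardless of the unmount result, which is the behavior the issue title asks for.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>

// Hypothetical, simplified stand-ins for the isolator's state; these are
// not the actual Mesos types.
struct Info {
  std::set<std::string> volumes;  // Volumes mounted for this container.
};

using Infos = std::map<std::string, Info>;  // Container ID -> Info.

// Number of containers in `infos` that reference `volume`.
int references(const Infos& infos, const std::string& volume) {
  int count = 0;
  for (const auto& entry : infos) {
    count += static_cast<int>(entry.second.volumes.count(volume));
  }
  return count;
}

// Cleans up a container and erases its `info` regardless of the unmount
// result. `unmount` is a hypothetical callback returning true on success.
// Returns true only if every attempted unmount succeeded.
bool cleanup(
    Infos& infos,
    const std::string& containerId,
    const std::function<bool(const std::string&)>& unmount) {
  auto it = infos.find(containerId);
  if (it == infos.end()) {
    return true;  // Unknown container: nothing to do.
  }

  bool ok = true;
  for (const std::string& volume : it->second.volumes) {
    // Only unmount when this container is the last user of the volume.
    if (references(infos, volume) == 1) {
      ok = unmount(volume) && ok;
    }
  }

  // Erase unconditionally so a failed unmount cannot leave a stale reference.
  infos.erase(it);
  return ok;
}

// Replays the scenario from the issue: the first unmount fails, then a
// second container using the same volume is destroyed. Returns the number
// of unmount attempts that were made.
int unmountAttemptsAfterFailure() {
  Infos infos;
  infos["c1"].volumes.insert("vol");

  int attempts = 0;
  bool succeed = false;
  auto unmount = [&](const std::string&) { ++attempts; return succeed; };

  cleanup(infos, "c1", unmount);  // Unmount fails, but `c1` is still erased.
  assert(infos.empty());

  infos["c2"].volumes.insert("vol");
  succeed = true;
  cleanup(infos, "c2", unmount);  // Count is 1, so unmount is attempted again.

  return attempts;
}
```

If `infos.erase(it)` were skipped when `unmount` fails, the stale `c1` entry would make `references(infos, "vol")` return 2 during the cleanup of `c2`, and the second unmount would never be attempted.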