Qian Zhang created MESOS-10126:
----------------------------------
Summary: Docker volume isolator needs to clean up the `info`
struct regardless of the result of the unmount operation
Key: MESOS-10126
URL: https://issues.apache.org/jira/browse/MESOS-10126
Project: Mesos
Issue Type: Task
Components: containerization
Reporter: Qian Zhang
Currently, when
[DockerVolumeIsolatorProcess::cleanup()|https://github.com/apache/mesos/blob/1.9.0/src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp#L610]
is called, we unmount the volume first. If the unmount operation fails, we do
not remove the container's checkpoint directory and do NOT erase the
container's `info` struct from `infos`. This is problematic because the
stale `info` left in `infos` keeps the reference count of the volume greater
than 0 even though no container is actually using the volume. The next time
another container using this volume is destroyed, we will NOT unmount the
volume, since its reference count will be greater than 1 (see
[here|https://github.com/apache/mesos/blob/1.9.0/src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp#L631:L651]
for details): it will be 2 because of the stale entry. As a result, we will
never get a chance to unmount this volume.
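The scenario above can be reproduced with a small self-contained model. This is
only a sketch and not the actual isolator code: `VolumeTracker`,
`failNextUnmount`, `alwaysErase` and `leaksVolume` are hypothetical names made
up for illustration, and the real `cleanup()` also handles the checkpoint
directory, which is omitted here.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical, simplified model of the isolator's bookkeeping; the names
// below are illustrative and are not the actual Mesos types.
struct VolumeTracker {
  // Container ID -> volumes mounted for it (stands in for `infos`).
  std::map<std::string, std::set<std::string>> infos;
  std::set<std::string> mounted;  // Volumes currently mounted on the host.
  bool failNextUnmount = false;   // Simulates a single failing unmount.

  void mount(const std::string& containerId, const std::string& volume) {
    infos[containerId].insert(volume);
    mounted.insert(volume);
  }

  // Number of `infos` entries that reference the volume.
  int references(const std::string& volume) const {
    int count = 0;
    for (const auto& entry : infos) {
      count += static_cast<int>(entry.second.count(volume));
    }
    return count;
  }

  bool unmount(const std::string& volume) {
    assert(mounted.count(volume) == 1);
    if (failNextUnmount) {
      failNextUnmount = false;
      return false;
    }
    mounted.erase(volume);
    return true;
  }

  // `alwaysErase == false` models the current behavior described above;
  // `alwaysErase == true` models erasing `info` regardless of the result.
  void cleanup(const std::string& containerId, bool alwaysErase) {
    auto it = infos.find(containerId);
    if (it == infos.end()) {
      return;
    }

    bool ok = true;
    for (const std::string& volume : it->second) {
      if (references(volume) > 1) {
        continue;  // Believed to be in use by another container.
      }
      ok = unmount(volume) && ok;
    }

    if (ok || alwaysErase) {
      infos.erase(it);
    }
  }
};

// Returns true if volume "v" is still mounted after both containers are gone.
bool leaksVolume(bool alwaysErase) {
  VolumeTracker tracker;
  tracker.mount("c1", "v");
  tracker.failNextUnmount = true;
  tracker.cleanup("c1", alwaysErase);  // The unmount fails here.

  tracker.mount("c2", "v");
  tracker.cleanup("c2", alwaysErase);  // Skipped if a stale entry remains.
  return tracker.mounted.count("v") > 0;
}
```

In this model `leaksVolume(false)` returns true, because the stale entry for
`c1` keeps the count at 2 when `c2` is destroyed, while `leaksVolume(true)`
returns false, because the count drops back to 1 and the unmount is retried.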
This issue has existed since the Mesos 1.0.0 release, when the Docker volume
isolator was introduced.