Qian Zhang created MESOS-10126:
----------------------------------

             Summary: Docker volume isolator needs to clean up the `info` struct regardless the result of unmount operation
                 Key: MESOS-10126
                 URL: https://issues.apache.org/jira/browse/MESOS-10126
             Project: Mesos
          Issue Type: Task
          Components: containerization
            Reporter: Qian Zhang

Currently, when [DockerVolumeIsolatorProcess::cleanup()|https://github.com/apache/mesos/blob/1.9.0/src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp#L610] is called, we unmount the volume first. If the unmount operation fails, we neither remove the container's checkpoint directory nor erase the container's `info` struct from `infos`.

This is problematic because the stale `info` left in `infos` keeps the reference count of the volume above 0, even though no container is actually using the volume. The next time another container using this volume is destroyed, we will NOT unmount the volume, because its reference count will be larger than 1 (it will be 2, counting the stale entry; see [here|https://github.com/apache/mesos/blob/1.9.0/src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp#L631:L651] for details). As a result, we will never get a chance to unmount this volume.

This issue has existed since the Mesos 1.0.0 release, when the Docker volume isolator was introduced.
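
The reference-count leak described above can be illustrated with a small standalone model. This is only a sketch and not the Mesos implementation: `VolumeTracker`, `prepare`, `references`, `failing`, and the `eraseOnFailure` flag are hypothetical names introduced here, and a container's `info` is reduced to the set of volume names it uses. With `eraseOnFailure == false` the model mimics the behavior reported in this issue; with `true` it mimics always erasing the `info` regardless of the unmount result.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the isolator's bookkeeping (not Mesos code).
struct VolumeTracker {
  std::map<std::string, std::set<std::string>> infos;  // containerId -> volumes
  std::set<std::string> mounted;  // volumes currently mounted
  std::set<std::string> failing;  // volumes whose unmount is forced to fail

  // Record that `containerId` uses `volume` and mark the volume as mounted.
  void prepare(const std::string& containerId, const std::string& volume) {
    infos[containerId].insert(volume);
    mounted.insert(volume);
  }

  // Number of containers whose `info` still references `volume`.
  int references(const std::string& volume) const {
    int count = 0;
    for (const auto& entry : infos) {
      count += static_cast<int>(entry.second.count(volume));
    }
    return count;
  }

  // Simulated unmount; returns false if the volume is in `failing`.
  bool unmount(const std::string& volume) {
    assert(mounted.count(volume) > 0);  // model invariant: it was mounted
    if (failing.count(volume) > 0) {
      return false;
    }
    mounted.erase(volume);
    return true;
  }

  // Returns true if every attempted unmount succeeded.
  // eraseOnFailure == false: keep the `info` when an unmount fails (reported behavior).
  // eraseOnFailure == true:  always erase the `info` (behavior this issue asks for).
  bool cleanup(const std::string& containerId, bool eraseOnFailure) {
    auto it = infos.find(containerId);
    if (it == infos.end()) {
      return true;  // unknown container, nothing to do
    }

    bool ok = true;
    for (const std::string& volume : it->second) {
      if (references(volume) > 1) {
        continue;  // another `info` still references it, so skip the unmount
      }
      if (!unmount(volume)) {
        ok = false;
      }
    }

    if (ok || eraseOnFailure) {
      infos.erase(it);
    }
    return ok;
  }
};
```

In this model, if cleanup of one container fails to unmount a volume and `eraseOnFailure` is `false`, the stale entry stays in `infos`. When a second container using the same volume is later cleaned up, `references` returns 2, the unmount is skipped, and the volume stays in `mounted`. With `eraseOnFailure` set to `true`, the second cleanup sees a count of 1 and unmounts the volume.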