[ 
https://issues.apache.org/jira/browse/MESOS-10126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qian Zhang reassigned MESOS-10126:
----------------------------------

          Sprint: Studio 1: RI-23 68
    Story Points: 3
        Assignee: Qian Zhang

RR:

[https://reviews.apache.org/r/72516/]

> Docker volume isolator needs to clean up the `info` struct regardless of the 
> result of the unmount operation
> -----------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-10126
>                 URL: https://issues.apache.org/jira/browse/MESOS-10126
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>            Reporter: Qian Zhang
>            Assignee: Qian Zhang
>            Priority: Critical
>
> Currently, when 
> [DockerVolumeIsolatorProcess::cleanup()|https://github.com/apache/mesos/blob/1.9.0/src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp#L610]
>  is called, we unmount the volume first. If the unmount operation fails, we 
> neither remove the container's checkpoint directory nor erase the 
> container's `info` struct from `infos`. This is problematic because the 
> stale `info` left in `infos` keeps the volume's reference count above 0 even 
> though no container is actually using the volume. The next time another 
> container using this volume is destroyed, we will NOT unmount the volume, 
> because its reference count will be larger than 1 (it will be 2: the stale 
> `info` plus the container being destroyed; see 
> [here|https://github.com/apache/mesos/blob/1.9.0/src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp#L631:L651]
>  for details). As a result, we never get a chance to unmount this volume.
> This issue has existed since the Mesos 1.0.0 release, when the Docker volume 
> isolator was introduced.
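
The behavior the issue title asks for can be sketched in isolation. The following is a minimal, self-contained model and not the actual Mesos code: `VolumeTracker`, `Info`, `add()`, `references()`, `tracks()` and the injected `unmount` callback are all hypothetical names made up for illustration, and the real isolator also has to deal with checkpoint directories and asynchronous futures, which are left out here. It assumes a volume is unmounted only when the container being cleaned up is its last user, and it erases the container's entry whether or not the unmount succeeds.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for the isolator's per-container state.
struct Info {
  std::set<std::string> volumes;  // Volumes mounted for this container.
};

// Hypothetical model of the `infos` bookkeeping; not the Mesos API.
class VolumeTracker {
public:
  // `unmount` is an injected callback that returns true on success.
  explicit VolumeTracker(std::function<bool(const std::string&)> unmount)
    : unmount_(std::move(unmount)) {}

  void add(const std::string& containerId,
           const std::set<std::string>& volumes) {
    infos_[containerId].volumes = volumes;
  }

  // Number of tracked containers that reference `volume`.
  int references(const std::string& volume) const {
    int count = 0;
    for (const auto& entry : infos_) {
      count += static_cast<int>(entry.second.volumes.count(volume));
    }
    return count;
  }

  bool tracks(const std::string& containerId) const {
    return infos_.count(containerId) > 0;
  }

  // Returns true only if every attempted unmount succeeded. The container's
  // entry is erased in either case, so a failed unmount cannot leave a stale
  // reference behind.
  bool cleanup(const std::string& containerId) {
    auto it = infos_.find(containerId);
    if (it == infos_.end()) {
      return true;  // Nothing is tracked for this container.
    }

    bool ok = true;
    for (const std::string& volume : it->second.volumes) {
      // The container being cleaned up still counts as one reference here.
      assert(references(volume) >= 1);

      // Unmount only when this container is the last user of the volume.
      if (references(volume) == 1 && !unmount_(volume)) {
        ok = false;  // Remember the failure, but keep cleaning up.
      }
    }

    infos_.erase(it);  // Erase regardless of the unmount result.
    return ok;
  }

private:
  std::function<bool(const std::string&)> unmount_;
  std::map<std::string, Info> infos_;
};
```

With this shape, a failed unmount is still reported to the caller through the return value, but the reference count for the volume drops back to 0, so the next container that uses the volume triggers a fresh unmount attempt when it is destroyed.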



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
