> On May 26, 2020, 1:57 a.m., Andrei Budnik wrote:
> > src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp
> > Lines 666 (patched)
> > <https://reviews.apache.org/r/72516/diff/1/?file=2231872#file2231872line666>
> >
> >     Since the isolator doesn’t remove the checkpoint directory on unmount 
> > failure (`_cleanup`), wouldn’t the info entry get restored on the agent 
> > restart (`_recover`)?

Yes, the info entry will be restored on agent recovery. Since the container
process has already been killed, `MesosContainerizerProcess::reaped` will be
called for the container during recovery, and it will call
`MesosContainerizerProcess::destroy` to destroy the container. That gives us
another chance to unmount the volume in the `docker/volume` isolator.
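
To make the ordering concrete, here is a rough, simplified model of the flow
described above. It is only a sketch and not the isolator's actual code:
`VolumeTracker`, `launch`, `references`, the `checkpoints` map and the
`unmount` hook are all hypothetical names made up for illustration, and the
checkpoint directory is modeled as an in-memory map that survives `recover()`.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the isolator's bookkeeping (not Mesos code).
struct VolumeTracker {
  // Container ID -> volumes it uses; loosely mirrors `infos`.
  std::map<std::string, std::set<std::string>> infos;

  // Models the on-disk checkpoint directories that survive an agent restart.
  std::map<std::string, std::set<std::string>> checkpoints;

  // Hypothetical unmount hook; returns false when the unmount fails.
  std::function<bool(const std::string&)> unmount;

  void launch(const std::string& id, const std::set<std::string>& volumes) {
    infos[id] = volumes;
    checkpoints[id] = volumes;
  }

  // Number of containers in `infos` that still reference `volume`.
  int references(const std::string& volume) const {
    int count = 0;
    for (const auto& entry : infos) {
      count += static_cast<int>(entry.second.count(volume));
    }
    return count;
  }

  // Erase the container's entry BEFORE unmounting, so a failed unmount
  // cannot leave a stale reference behind.
  bool cleanup(const std::string& id) {
    assert(unmount && "unmount hook must be set");
    auto it = infos.find(id);
    if (it == infos.end()) {
      return true;
    }
    const std::set<std::string> volumes = it->second;
    infos.erase(it);

    bool ok = true;
    for (const auto& volume : volumes) {
      // Only unmount volumes that no other container references.
      if (references(volume) == 0 && !unmount(volume)) {
        ok = false;
      }
    }

    // The checkpoint is kept on failure, so recovery can retry later.
    if (ok) {
      checkpoints.erase(id);
    }
    return ok;
  }

  // Models agent recovery: entries are restored from the checkpoints.
  void recover() { infos = checkpoints; }
};
```

In this model, a failed `cleanup("c1")` leaves the reference count of the
volume at 0 while keeping the checkpoint, so a later `recover()` followed by
another `cleanup("c1")` gets a second chance to unmount the volume.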


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72516/#review220856
-----------------------------------------------------------


On May 26, 2020, 9:41 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72516/
> -----------------------------------------------------------
> 
> (Updated May 26, 2020, 9:41 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10126
>     https://issues.apache.org/jira/browse/MESOS-10126
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Currently, when `DockerVolumeIsolatorProcess::cleanup()` is called, we
> unmount the volume first, and if the unmount operation fails we do NOT
> erase the container's `Info` struct from `infos`. This is problematic
> because the remaining `Info` in `infos` keeps the reference count of the
> volume greater than 0 even though the volume is not actually being used by
> any container. That means we may never get a chance to unmount this volume
> on this agent. Furthermore, if it is an EBS volume, it cannot be used by any
> tasks launched on other agents, since an EBS volume can only be attached
> to one node at a time. The only workaround would be to manually unmount the
> volume.
> 
> So in this patch `DockerVolumeIsolatorProcess::cleanup()` is updated to erase
> the container's `Info` struct before unmounting volumes.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp 
> c547696f50a4df9cce4ee9078b5fe90b93fd91d2 
> 
> 
> Diff: https://reviews.apache.org/r/72516/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>
