-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65518/#review197203
-----------------------------------------------------------

src/docker/executor.cpp
Lines 273-302 (patched)
<https://reviews.apache.org/r/65518/#comment277355>

Could we validate this with a test? I think it would be possible to use a `MockDocker` which intercepts the call to `run()`, so that we can prevent it from returning when the container exits. I think something like the following should work:

```
MockDocker* mockDocker =
  new MockDocker(tests::flags.docker, tests::flags.docker_socket);

Promise<Option<int>> runPromise;
Future<Option<int>> runFuture;

auto interceptedRun = [mockDocker, &runPromise, &runFuture](
    const Docker::RunOptions& runOptions,
    const process::Subprocess::IO& _stdout,
    const process::Subprocess::IO& _stderr) {
  runFuture = mockDocker->_run(
      runOptions,
      _stdout,
      _stderr);

  return runPromise.future();
};

EXPECT_CALL(*mockDocker, run(_, _, _))
  .WillRepeatedly(Invoke(interceptedRun));
```

Then we have control over when the `Future` associated with the 'docker run' call is satisfied. See the Docker containerizer tests for examples of how we use the MockDocker and MockDockerContainerizer.

What do you think?


src/docker/executor.cpp
Lines 275 (patched)
<https://reviews.apache.org/r/65518/#comment277330>

s/will/can/


src/docker/executor.cpp
Lines 276 (patched)
<https://reviews.apache.org/r/65518/#comment277333>

Nit: s/"docker run"/'docker run'/


src/docker/executor.cpp
Lines 277 (patched)
<https://reviews.apache.org/r/65518/#comment277332>

s/returns/returning/


src/docker/executor.cpp
Lines 280 (patched)
<https://reviews.apache.org/r/65518/#comment277328>

Should we log an error in the case that `container.pid.isNone()`? AFAIK this means that the Docker daemon was not able to locate a process associated with the container. This makes me wonder if we should actually do CHECK_SOME(container.pid)?

It looks like the only place where we use `containerPid` is in the health checker, where the PID is used to enter the appropriate namespaces. For now, I would suggest that we log an error to help in debugging.

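To illustrate the logging suggestion above, here is a minimal standalone sketch. It is not the code in the patch: `Container` and `canReapContainer` are hypothetical names, `std::optional<int>` stands in for stout's `Option<pid_t>`, and `std::cerr` stands in for the `LOG(ERROR)` call the executor would presumably use.

```cpp
#include <iostream>
#include <optional>
#include <string>

// Hypothetical stand-in for the relevant part of `Docker::Container`.
// The real field is an `Option<pid_t>`; `std::optional<int>` is used
// here only to keep the sketch self-contained.
struct Container
{
  std::optional<int> pid;
};

// Hypothetical helper: returns true if there is a PID that could be
// reaped directly, and logs an error otherwise so that a missing PID
// is visible when debugging.
bool canReapContainer(const Container& container, const std::string& name)
{
  if (!container.pid.has_value()) {
    // Assumption: in the executor this would be a `LOG(ERROR)`.
    std::cerr << "Failed to find the PID of container '" << name
              << "'; cannot reap the container process directly"
              << std::endl;
    return false;
  }

  return true;
}
```

If a missing PID should instead be treated as an invariant violation, the `if` branch would be replaced by the `CHECK_SOME(container.pid)` mentioned above.
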

src/docker/executor.cpp
Lines 283 (patched)
<https://reviews.apache.org/r/65518/#comment277337>

s/can not/cannot/


- Greg Mann


On Feb. 9, 2018, 1:03 a.m., Qian Zhang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65518/
> -----------------------------------------------------------
>
> (Updated Feb. 9, 2018, 1:03 a.m.)
>
>
> Review request for mesos, Gaston Kleiman, Gilbert Song, Greg Mann, and Vinod Kone.
>
>
> Bugs: MESOS-8488
>     https://issues.apache.org/jira/browse/MESOS-8488
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Due to a Docker issue (https://github.com/moby/moby/issues/33820),
> the Docker daemon will fail to catch a container exit, i.e., the
> container process has already exited but `docker ps` still shows the
> container as running. As a result, the "docker run" command that we
> execute in the Docker executor never returns, and the `docker stop`
> command has no effect, i.e., it returns without error but `docker ps`
> still shows the container as running, so the task gets stuck in the
> `TASK_KILLING` state.
>
> To work around this Docker issue, this patch makes the Docker executor
> reap the container process directly, so that the Docker executor is
> notified once the container process exits.
>
>
> Diffs
> -----
>
>   src/docker/executor.cpp e4c53d558e414e50b1c429fba8e31e504c63744a
>
>
> Diff: https://reviews.apache.org/r/65518/diff/2/
>
>
> Testing
> -------
>
> sudo make check
>
>
> Thanks,
>
> Qian Zhang
>
>