----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33249/#review83310 -----------------------------------------------------------
Thanks Jay the changes looks reasonable to me, will wait for Jie to comment. - Timothy Chen On May 11, 2015, 10:06 p.m., Jay Buffington wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/33249/ > ----------------------------------------------------------- > > (Updated May 11, 2015, 10:06 p.m.) > > > Review request for mesos, Benjamin Hindman, Jie Yu, and Timothy Chen. > > > Bugs: MESOS-2020 and MESOS-2656 > https://issues.apache.org/jira/browse/MESOS-2020 > https://issues.apache.org/jira/browse/MESOS-2656 > > > Repository: mesos > > > Description > ------- > > Send statusUpdate to scheduler on containerizer launch failure > > This review is actually for three commits. The three commits are separated > out on github here: > https://github.com/apache/mesos/compare/master...jaybuff:report-launch-failure?expand=1 > > Commit #1: > docker-no-executor-framework example should send ContainerInfo > > docker-no-executor-framework stopped working when the protobuf message > changed during development. Apparently noone noticed because we don't > run it as part of our test suite. I used it to develop a fix for > proper error reporting, so I fixed it. > > Commit #2: > serialize wait and destroy in containerizer > > Containerizers keep track of all the containers they have created. When > a container is destroy()'ed the containerizer removes the container from > its internal hashmap. This means that wait() must be called before the > container is destroyed. > > When the docker containerizer fails because of a "docker pull" or > "docker run" failure, we end up with a race condition where wait() does > not properly return the Termination, so we ended up with a default > failure message of "Abnormal Executor Termination" which isn't very > useful. > > This commit solves this issue by not destroying the container until > after wait is called in the failure case. > > Fixes MESOS-2656 > > Commit #3: > send scheduler containerizer launch failure errors > > If a method in the launch sequence returns a failure the error message > wasn't making it back to the scheduler. > > Fixes MESOS-2020 > > > Diffs > ----- > > include/mesos/containerizer/containerizer.proto > 95c84dfbecef28ca9a220a98293c5fc6215a692f > src/examples/docker_no_executor_framework.cpp > d5385d9295d30a6039bab9938f3270006582d3da > src/slave/containerizer/containerizer.hpp > 56c088abb036aa26c323c3142fd27ea29ed51d4f > src/slave/containerizer/docker.hpp b25ec55bf3cd30d6e8a804d09d90c632a7d12e3f > src/slave/containerizer/docker.cpp f9fc89ad7e3c853c3f9f6dcf9aa68e54dc1888c6 > src/slave/containerizer/mesos/containerizer.hpp > 5e5f13ed8a71ff9510b40b6032d00fd16d312622 > src/slave/containerizer/mesos/containerizer.cpp > f2587280dc0e1d566d2b856a80358c7b3896c603 > src/slave/slave.cpp c78ee3c9e7fc38ad364e83f4abe267e86bfbbc13 > src/tests/docker_containerizer_tests.cpp > c9d66b3fbc7d081f36c26781573dca50de823c44 > > Diff: https://reviews.apache.org/r/33249/diff/ > > > Testing > ------- > > I used examples/docker_no_executor_framework.cpp, and wrote a few tests and > ran make check > > > Thanks, > > Jay Buffington > >