> On Sept. 6, 2017, 9 p.m., Qian Zhang wrote:
> > src/tests/containerizer/ports_isolator_tests.cpp
> > Lines 1051-1052 (patched)
> > <https://reviews.apache.org/r/62003/diff/1/?file=1808287#file1808287line1051>
> >
> >     Can you please elaborate a bit on this? What do you mean by `the
> >     original nested container status gets swallowed`?
>
> James Peach wrote:
>     When we are not using nested containers, we get a
>     `REASON_CONTAINER_LIMITATION` status update which comes directly from
>     the isolator. However, when we use nested containers, we actually get a
>     `REASON_COMMAND_EXECUTOR_FAILED` reason; the task status that comes from
>     the isolator never goes anywhere.
>
> Qian Zhang wrote:
>     > However, when we use nested containers, we actually get a
>     > REASON_COMMAND_EXECUTOR_FAILED reason; the task status that comes from
>     > the isolator never goes anywhere.
>
>     Is it because there is only one nested container, and when that nested
>     container is destroyed by the Mesos containerizer, the default executor
>     learns of it and terminates itself, so it has no chance to send a
>     REASON_CONTAINER_LIMITATION status for that nested container before its
>     self-termination? If so, what if we launch a task group which has two
>     tasks? When one of the tasks is destroyed due to the limitation, will we
>     get a `REASON_CONTAINER_LIMITATION` status update for it?
>
> James Peach wrote:
>     I can reproduce this with `mesos-execute`, so I filed
>     [MESOS-7963](https://issues.apache.org/jira/browse/MESOS-7963) to handle
>     this.
>
> James Peach wrote:
>     The nested container status gets lost
>     [here](https://github.com/apache/mesos/blob/master/src/slave/http.cpp#L2496),
>     so I think the right approach is to always trigger the resource
>     limitation on the root container.
I agree the nested container status gets lost
[here](https://github.com/apache/mesos/blob/1.3.1/src/slave/http.cpp#L2420):
the protobuf message `WaitNestedContainer` has only one field, `exit_status`,
while the protobuf message `ContainerTermination` has other fields (`reasons`
and `message`) which are dropped when `ContainerTermination` is converted to
`WaitNestedContainer` in that code. So I think the right approach is to add
those fields (`reasons` and `message`) to `WaitNestedContainer` and set them
from the corresponding fields in `ContainerTermination`; then, in
`DefaultExecutor::waited()`, we should read `reasons` and `message` from
`WaitNestedContainer` and set them in the task status update so that the
scheduler can receive them.


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62003/#review184660
-----------------------------------------------------------


On Sept. 9, 2017, 6:44 a.m., James Peach wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62003/
> -----------------------------------------------------------
>
> (Updated Sept. 9, 2017, 6:44 a.m.)
>
>
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
>
>
> Bugs: MESOS-7675
>     https://issues.apache.org/jira/browse/MESOS-7675
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Added `network/ports` isolator nested container tests using the v1
> TaskGroups API. This tests that rogue port usage by a nested task is
> detected both with and without agent recovery, and that a well-behaved
> task is preserved across agent recovery.
>
>
> Diffs
> -----
>
>   src/tests/containerizer/ports_isolator_tests.cpp PRE-CREATION
>   src/tests/mesos.hpp 444c75763aea995708db0b17eafee2d22c912554
>
>
> Diff: https://reviews.apache.org/r/62003/diff/4/
>
>
> Testing
> -------
>
> make check (Fedora 26)
>
>
> Thanks,
>
> James Peach
>
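To make the proposed fix above concrete, here is a minimal, self-contained C++ sketch of the conversion Qian describes. The structs are simplified stand-ins for the real Mesos protobuf messages (`mesos::ContainerTermination` and `agent::Response::WaitNestedContainer`); the field names mirror the discussion, but the actual generated protobuf API differs.

```cpp
#include <string>
#include <vector>

// Illustrative stand-in for mesos::ContainerTermination: carries the
// exit status plus the limitation details the isolator produced.
struct ContainerTermination {
  int status;                // wait(2) exit status of the container
  std::vector<int> reasons;  // e.g. a REASON_CONTAINER_LIMITATION code
  std::string message;       // human-readable limitation message
};

// Illustrative stand-in for agent::Response::WaitNestedContainer.
// Today it only carries `exit_status`; `reasons` and `message` are the
// proposed additions from the review comment above.
struct WaitNestedContainer {
  int exit_status;
  std::vector<int> reasons;  // proposed field
  std::string message;       // proposed field
};

// Sketch of the conversion in the agent's HTTP handler: instead of
// dropping `reasons` and `message`, copy them through so the default
// executor can surface them in the task status update.
WaitNestedContainer convert(const ContainerTermination& termination)
{
  WaitNestedContainer wait;
  wait.exit_status = termination.status;
  wait.reasons = termination.reasons;  // previously lost
  wait.message = termination.message;  // previously lost
  return wait;
}
```

With this shape, `DefaultExecutor::waited()` would have enough information to set the limitation reason and message on the status update it sends, rather than the scheduler only ever seeing `REASON_COMMAND_EXECUTOR_FAILED`.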
