> On Sept. 6, 2017, 9 p.m., Qian Zhang wrote:
> > src/tests/containerizer/ports_isolator_tests.cpp
> > Lines 1051-1052 (patched)
> > <https://reviews.apache.org/r/62003/diff/1/?file=1808287#file1808287line1051>
> >
> >     Can you please elaborate a bit about this? What do you mean for `the 
> > original nested container status gets swallowed`?
> 
> James Peach wrote:
>     When we are not using nested containers, we get a 
> `REASON_CONTAINER_LIMITATION` status update which comes directly from the 
> isolator. However, when we use nested containers, we actually get a 
> `REASON_COMMAND_EXECUTOR_FAILED` reason; the task status that comes from the 
> isolator never goes anywhere.
> 
> Qian Zhang wrote:
>     > However, when we use nested containers, we actually get a 
> REASON_COMMAND_EXECUTOR_FAILED reason; the task status that comes from the 
> isolator never goes anywhere.
>     
>     Is it because there is only one nested container, and when that nested 
> container is destroyed by the Mesos containerizer, the default executor will 
> notice and terminate itself, so it has no chance to send a 
> `REASON_CONTAINER_LIMITATION` status for that nested container before its own 
> termination? If so, what if we launch a task group which has two tasks? When 
> one of the tasks is destroyed due to the limitation, will we get a 
> `REASON_CONTAINER_LIMITATION` status update for it?
> 
> James Peach wrote:
>     I can reproduce this with `mesos-execute` so I filed 
> [MESOS-7963](https://issues.apache.org/jira/browse/MESOS-7963) to handle this.
> 
> James Peach wrote:
>     The nested container status gets lost 
> [here](https://github.com/apache/mesos/blob/master/src/slave/http.cpp#L2496), 
> so I think the right approach is to always trigger the resource limitation on 
> the root container.

I agree the nested container status gets lost 
[here](https://github.com/apache/mesos/blob/1.3.1/src/slave/http.cpp#L2420): 
the protobuf message `WaitNestedContainer` has only one field (`exit_status`), 
while the protobuf message `ContainerTermination` has additional fields 
(`reasons` and `message`) that are dropped when `ContainerTermination` is 
converted to `WaitNestedContainer` in that code.
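As a sketch of what gets dropped (field numbers and exact layout here are 
illustrative, not copied from the real `.proto` files):

```protobuf
// Illustrative only; see the Mesos .proto files for the real definitions.
message WaitNestedContainer {
  // The only field carried back to the executor today.
  optional int32 exit_status = 1;
}

message ContainerTermination {
  optional int32 status = 1;               // exit status
  repeated TaskStatus.Reason reasons = 2;  // e.g. REASON_CONTAINER_LIMITATION
  optional string message = 3;             // human-readable explanation
}
```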

So I think the right approach is to add the corresponding fields (`reasons` 
and `message`) to `WaitNestedContainer` and populate them from 
`ContainerTermination`; then, in `DefaultExecutor::waited()`, we should read 
`reasons` and `message` from `WaitNestedContainer` and set them in the task 
status update so that the scheduler can receive them.


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62003/#review184660
-----------------------------------------------------------


On Sept. 9, 2017, 6:44 a.m., James Peach wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62003/
> -----------------------------------------------------------
> 
> (Updated Sept. 9, 2017, 6:44 a.m.)
> 
> 
> Review request for mesos, Qian Zhang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7675
>     https://issues.apache.org/jira/browse/MESOS-7675
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added `network/ports` isolator nested container tests using the v1
> TaskGroups API. This tests that rogue port usage by a nested task is
> detected both with and without agent recovery, and that a well-behaved
> task is preserved across agent recovery.
> 
> 
> Diffs
> -----
> 
>   src/tests/containerizer/ports_isolator_tests.cpp PRE-CREATION 
>   src/tests/mesos.hpp 444c75763aea995708db0b17eafee2d22c912554 
> 
> 
> Diff: https://reviews.apache.org/r/62003/diff/4/
> 
> 
> Testing
> -------
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>
