Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-24 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203778
---


Ship it!




Ship It!

- Greg Mann


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> Moreover, we added a clean up for terminated containers that have
> been recovered after agent's restart.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/9/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-24 Thread Qian Zhang


> On May 24, 2018, 4:24 p.m., Qian Zhang wrote:
> > Ship It!

A minor comment, since we fixed a tech debt (memory leak) in 
`ComposingContainerizerProcess::__recover()` in this patch, can you please 
mention it in the commit message?


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203734
---


On April 17, 2018, 11:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 11:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/8/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-24 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203734
---


Ship it!




Ship It!

- Qian Zhang


On April 17, 2018, 11:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 11:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/8/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-23 Thread Andrei Budnik


> On May 21, 2018, 1:21 p.m., Qian Zhang wrote:
> > Can you please explain how the container will be cleaned up from the 
> > `containers_` map after agent recovery?
> 
> Andrei Budnik wrote:
> In the current implementation, a recovered container can be cleaned up 
> only when someone calls `destroy()`. I didn't change that behaviour.
> 
> Qian Zhang wrote:
> So if no one calls `destroy()` on a recovered container (e.g., it just 
> terminates itself), the container will never be cleaned up from the 
> `containers_` map which seems a memory leak. I know you did not change that 
> behaviour, but I think it is a tech debt which we need to fix.

Fixed.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203490
---


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/8/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-23 Thread Andrei Budnik


> On May 21, 2018, 1:21 p.m., Qian Zhang wrote:
> > src/slave/containerizer/composing.cpp
> > Lines 396-400 (original)
> > 
> >
> > Previously, in the case that `destroy-in-progress stopped an 
> > launch-in-progress`, the `destroy()` will return `ContainerTermination()` 
> > which is set here. But now with this implementation, it will return 
> > `None()` which is set by the underlying containerizer. This seems not 
> > correct to me, in this case, the destroy should considered as succeeded 
> > (return `ContainerTermination()`) rather than failed (return `None()`).
> 
> Andrei Budnik wrote:
> Before https://reviews.apache.org/r/66510/ patch was committed, 
> `destroy()` returned `Future`. There was a reason to set a `true` to 
> the `destroyed` promise, because we successfully destroyed a container (in 
> the case that destroy-in-progress stopped an launch-in-progress).
> 
> >the destroy should considered as succeeded (return 
> ContainerTermination()) rather than failed (return None()).
> 
> At the moment I don't think that a client should interpret the result of 
> `destroy()` as failed/succeeded relying on whether or not 
> `Option` is empty. If there was no actual attempt to 
> launch a container using an underlying containerizer, why do we need to 
> return some virtual `ContainerTermination`?
> 
> Greg Mann wrote:
> Hm, so we don't have any callers who need to recognize the 
> destroy-during-launch case? If not, then I think this sounds OK to me, WDYT 
> Qian? Andrei, if we decide to keep this approach, could you update the 
> comment on `destroy()` in the following patch to better reflect what happens 
> when it's called during container launch? https://reviews.apache.org/r/67130/
> 
> Qian Zhang wrote:
> OK to me as well.

Updated comments:
https://reviews.apache.org/r/8/diff/6-7/
https://reviews.apache.org/r/67130/diff/3-4/


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203490
---


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/7/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-22 Thread Qian Zhang


> On May 21, 2018, 9:21 p.m., Qian Zhang wrote:
> > Can you please explain how the container will be cleaned up from the 
> > `containers_` map after agent recovery?
> 
> Andrei Budnik wrote:
> In the current implementation, a recovered container can be cleaned up 
> only when someone calls `destroy()`. I didn't change that behaviour.

So if no one calls `destroy()` on a recovered container (e.g., it just 
terminates itself), the container will never be cleaned up from the 
`containers_` map which seems a memory leak. I know you did not change that 
behaviour, but I think it is a tech debt which we need to fix.


> On May 21, 2018, 9:21 p.m., Qian Zhang wrote:
> > src/slave/containerizer/composing.cpp
> > Lines 396-400 (original)
> > 
> >
> > Previously, in the case that `destroy-in-progress stopped an 
> > launch-in-progress`, the `destroy()` will return `ContainerTermination()` 
> > which is set here. But now with this implementation, it will return 
> > `None()` which is set by the underlying containerizer. This seems not 
> > correct to me, in this case, the destroy should considered as succeeded 
> > (return `ContainerTermination()`) rather than failed (return `None()`).
> 
> Andrei Budnik wrote:
> Before https://reviews.apache.org/r/66510/ patch was committed, 
> `destroy()` returned `Future`. There was a reason to set a `true` to 
> the `destroyed` promise, because we successfully destroyed a container (in 
> the case that destroy-in-progress stopped an launch-in-progress).
> 
> >the destroy should considered as succeeded (return 
> ContainerTermination()) rather than failed (return None()).
> 
> At the moment I don't think that a client should interpret the result of 
> `destroy()` as failed/succeeded relying on whether or not 
> `Option` is empty. If there was no actual attempt to 
> launch a container using an underlying containerizer, why do we need to 
> return some virtual `ContainerTermination`?
> 
> Greg Mann wrote:
> Hm, so we don't have any callers who need to recognize the 
> destroy-during-launch case? If not, then I think this sounds OK to me, WDYT 
> Qian? Andrei, if we decide to keep this approach, could you update the 
> comment on `destroy()` in the following patch to better reflect what happens 
> when it's called during container launch? https://reviews.apache.org/r/67130/

OK to me as well.


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203490
---


On April 17, 2018, 11:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 11:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/6/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-22 Thread Andrei Budnik


> On May 21, 2018, 1:21 p.m., Qian Zhang wrote:
> > Can you please explain how the container will be cleaned up from the 
> > `containers_` map after agent recovery?

In the current implementation, a recovered container can be cleaned up only 
when someone calls `destroy()`. I didn't change that behaviour.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203490
---


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/6/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-22 Thread Greg Mann


> On May 21, 2018, 1:21 p.m., Qian Zhang wrote:
> > src/slave/containerizer/composing.cpp
> > Lines 396-400 (original)
> > 
> >
> > Previously, in the case that `destroy-in-progress stopped an 
> > launch-in-progress`, the `destroy()` will return `ContainerTermination()` 
> > which is set here. But now with this implementation, it will return 
> > `None()` which is set by the underlying containerizer. This seems not 
> > correct to me, in this case, the destroy should considered as succeeded 
> > (return `ContainerTermination()`) rather than failed (return `None()`).
> 
> Andrei Budnik wrote:
> Before https://reviews.apache.org/r/66510/ patch was committed, 
> `destroy()` returned `Future`. There was a reason to set a `true` to 
> the `destroyed` promise, because we successfully destroyed a container (in 
> the case that destroy-in-progress stopped an launch-in-progress).
> 
> >the destroy should considered as succeeded (return 
> ContainerTermination()) rather than failed (return None()).
> 
> At the moment I don't think that a client should interpret the result of 
> `destroy()` as failed/succeeded relying on whether or not 
> `Option` is empty. If there was no actual attempt to 
> launch a container using an underlying containerizer, why do we need to 
> return some virtual `ContainerTermination`?

Hm, so we don't have any callers who need to recognize the 
destroy-during-launch case? If not, then I think this sounds OK to me, WDYT 
Qian? Andrei, if we decide to keep this approach, could you update the comment 
on `destroy()` in the following patch to better reflect what happens when it's 
called during container launch? https://reviews.apache.org/r/67130/


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203490
---


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/5/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-22 Thread Andrei Budnik


> On May 21, 2018, 1:21 p.m., Qian Zhang wrote:
> > src/slave/containerizer/composing.cpp
> > Line 361 (original), 360-365 (patched)
> > 
> >
> > So besides removing the container from the `containers_` map, we also 
> > eliminate a call to `ComposingContainerizerProcess::destroy()`, My 
> > understanding is we assume the underlying containerizers will always do its 
> > own destroy to clean up its internal data, so here in composing 
> > containerizer we just need to remove the container from the `containers_` 
> > map, right?

>we assume the underlying containerizers will always do its own destroy to 
>clean up its internal data, so here in composing containerizer we just need to 
>remove the container from the containers_ map, right?


Yes, correct
https://github.com/apache/mesos/blob/d7d7cfbc3e5609fc9a4e8de8203a6ecb11afeac7/src/slave/containerizer/containerizer.hpp#L129-L130


> On May 21, 2018, 1:21 p.m., Qian Zhang wrote:
> > src/slave/containerizer/composing.cpp
> > Lines 396-400 (original)
> > 
> >
> > Previously, in the case that `destroy-in-progress stopped an 
> > launch-in-progress`, the `destroy()` will return `ContainerTermination()` 
> > which is set here. But now with this implementation, it will return 
> > `None()` which is set by the underlying containerizer. This seems not 
> > correct to me, in this case, the destroy should considered as succeeded 
> > (return `ContainerTermination()`) rather than failed (return `None()`).

Before https://reviews.apache.org/r/66510/ patch was committed, `destroy()` 
returned `Future`. There was a reason to set a `true` to the `destroyed` 
promise, because we successfully destroyed a container (in the case that 
destroy-in-progress stopped an launch-in-progress).

>the destroy should considered as succeeded (return ContainerTermination()) 
>rather than failed (return None()).

At the moment I don't think that a client should interpret the result of 
`destroy()` as failed/succeeded relying on whether or not 
`Option` is empty. If there was no actual attempt to 
launch a container using an underlying containerizer, why do we need to return 
some virtual `ContainerTermination`?


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203490
---


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/5/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-22 Thread Qian Zhang


> On May 19, 2018, 1:26 a.m., Greg Mann wrote:
> > src/slave/containerizer/composing.cpp
> > Line 669 (original), 618 (patched)
> > 
> >
> > Why return `wait()` when the state is DESTROYING, rather than just 
> > forwarding the call to the underlying containerizer's `destroy()`?
> > 
> > With this implementation, the composing containerizer's `wait()` will 
> > behave differently than the underlying containerizer's `wait()`, is that 
> > what we want?
> 
> Andrei Budnik wrote:
> Good point!
> It looks like `container->containerizer->destroy(containerId).onAny(...)` 
> can be moved from `if` condition to the bottom of the method. That will 
> simplify `ComposingContainerizerProcess::destroy()` implementation.
> WDYT?
> 
> Qian Zhang wrote:
> > Why return wait() when the state is DESTROYING, rather than just 
> forwarding the call to the underlying containerizer's destroy()?
> 
> `DESTROYING` state means a call to the underlying containerizer's 
> `destroy()` is already going on, so why do we want to call it again in this 
> case?
> 
> > With this implementation, the composing containerizer's wait() will 
> behave differently than the underlying containerizer's wait(), is that what 
> we want?
> 
> How will the composing containerizer's `wait()` behave differently than 
> the underlying containerizer's `wait()`? I see it is actually just a wrapper 
> of the underlying containerizer's `wait()`.
> 
> Greg Mann wrote:
> > DESTROYING state means a call to the underlying containerizer's 
> destroy() is already going on, so why do we want to call it again in this 
> case?
> 
> My logic was that the behavior of the composing containerizer should be 
> the same as the individual containerizers. So, if the caller calls 
> `destroy()` a second time, we can just forward the call to the individual 
> containerizer.
> 
> > How will the composing containerizer's wait() behave differently than 
> the underlying containerizer's wait()? I see it is actually just a wrapper of 
> the underlying containerizer's wait().
> 
> Sorry, what I meant to type was: With this implementation, the composing 
> containerizer's `destroy()` will behave differently than the underlying 
> containerizer's `destroy()`, is that what we want?
> 
> Greg Mann wrote:
> > It looks like container->containerizer->destroy(containerId).onAny(...) 
> can be moved from if condition to the bottom of the method. That will 
> simplify ComposingContainerizerProcess::destroy() implementation.
> 
> Andrei that sounds good to me - Qian what do you think?

I am OK with it, but I have a general comment for removing `destroyed`:
https://reviews.apache.org/r/8/#comment285784


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203428
---


On April 17, 2018, 11:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 11:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/4/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-21 Thread Greg Mann


> On May 18, 2018, 5:26 p.m., Greg Mann wrote:
> > src/slave/containerizer/composing.cpp
> > Line 669 (original), 618 (patched)
> > 
> >
> > Why return `wait()` when the state is DESTROYING, rather than just 
> > forwarding the call to the underlying containerizer's `destroy()`?
> > 
> > With this implementation, the composing containerizer's `wait()` will 
> > behave differently than the underlying containerizer's `wait()`, is that 
> > what we want?
> 
> Andrei Budnik wrote:
> Good point!
> It looks like `container->containerizer->destroy(containerId).onAny(...)` 
> can be moved from `if` condition to the bottom of the method. That will 
> simplify `ComposingContainerizerProcess::destroy()` implementation.
> WDYT?
> 
> Qian Zhang wrote:
> > Why return wait() when the state is DESTROYING, rather than just 
> forwarding the call to the underlying containerizer's destroy()?
> 
> `DESTROYING` state means a call to the underlying containerizer's 
> `destroy()` is already going on, so why do we want to call it again in this 
> case?
> 
> > With this implementation, the composing containerizer's wait() will 
> behave differently than the underlying containerizer's wait(), is that what 
> we want?
> 
> How will the composing containerizer's `wait()` behave differently than 
> the underlying containerizer's `wait()`? I see it is actually just a wrapper 
> of the underlying containerizer's `wait()`.
> 
> Greg Mann wrote:
> > DESTROYING state means a call to the underlying containerizer's 
> destroy() is already going on, so why do we want to call it again in this 
> case?
> 
> My logic was that the behavior of the composing containerizer should be 
> the same as the individual containerizers. So, if the caller calls 
> `destroy()` a second time, we can just forward the call to the individual 
> containerizer.
> 
> > How will the composing containerizer's wait() behave differently than 
> the underlying containerizer's wait()? I see it is actually just a wrapper of 
> the underlying containerizer's wait().
> 
> Sorry, what I meant to type was: With this implementation, the composing 
> containerizer's `destroy()` will behave differently than the underlying 
> containerizer's `destroy()`, is that what we want?

> It looks like container->containerizer->destroy(containerId).onAny(...) can 
> be moved from if condition to the bottom of the method. That will simplify 
> ComposingContainerizerProcess::destroy() implementation.

Andrei that sounds good to me - Qian what do you think?


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203428
---


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/4/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-21 Thread Greg Mann


> On May 18, 2018, 5:26 p.m., Greg Mann wrote:
> > src/slave/containerizer/composing.cpp
> > Line 669 (original), 618 (patched)
> > 
> >
> > Why return `wait()` when the state is DESTROYING, rather than just 
> > forwarding the call to the underlying containerizer's `destroy()`?
> > 
> > With this implementation, the composing containerizer's `wait()` will 
> > behave differently than the underlying containerizer's `wait()`, is that 
> > what we want?
> 
> Andrei Budnik wrote:
> Good point!
> It looks like `container->containerizer->destroy(containerId).onAny(...)` 
> can be moved from `if` condition to the bottom of the method. That will 
> simplify `ComposingContainerizerProcess::destroy()` implementation.
> WDYT?
> 
> Qian Zhang wrote:
> > Why return wait() when the state is DESTROYING, rather than just 
> forwarding the call to the underlying containerizer's destroy()?
> 
> `DESTROYING` state means a call to the underlying containerizer's 
> `destroy()` is already going on, so why do we want to call it again in this 
> case?
> 
> > With this implementation, the composing containerizer's wait() will 
> behave differently than the underlying containerizer's wait(), is that what 
> we want?
> 
> How will the composing containerizer's `wait()` behave differently than 
> the underlying containerizer's `wait()`? I see it is actually just a wrapper 
> of the underlying containerizer's `wait()`.

> DESTROYING state means a call to the underlying containerizer's destroy() is 
> already going on, so why do we want to call it again in this case?

My logic was that the behavior of the composing containerizer should be the 
same as the individual containerizers. So, if the caller calls `destroy()` a 
second time, we can just forward the call to the individual containerizer.

> How will the composing containerizer's wait() behave differently than the 
> underlying containerizer's wait()? I see it is actually just a wrapper of the 
> underlying containerizer's wait().

Sorry, what I meant to type was: With this implementation, the composing 
containerizer's `destroy()` will behave differently than the underlying 
containerizer's `destroy()`, is that what we want?


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203428
---


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/4/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-21 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203490
---



Can you please explain how the container will be cleaned up from the 
`containers_` map after agent recovery?


src/slave/containerizer/composing.cpp
Line 361 (original), 360-365 (patched)


So besides removing the container from the `containers_` map, we also 
eliminate a call to `ComposingContainerizerProcess::destroy()`, My 
understanding is we assume the underlying containerizers will always do its own 
destroy to clean up its internal data, so here in composing containerizer we 
just need to remove the container from the `containers_` map, right?



src/slave/containerizer/composing.cpp
Lines 396-400 (original)


Previously, in the case that `destroy-in-progress stopped an 
launch-in-progress`, the `destroy()` will return `ContainerTermination()` which 
is set here. But now with this implementation, it will return `None()` which is 
set by the underlying containerizer. This seems not correct to me, in this 
case, the destroy should considered as succeeded (return 
`ContainerTermination()`) rather than failed (return `None()`).



src/slave/containerizer/composing.cpp
Lines 401-402 (original), 395-396 (patched)


Do we need these two lines? Since there is a destroy going on, the 
container will be removed from `containers_` when the destroy is done.


- Qian Zhang


On April 17, 2018, 11:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 11:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/4/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-21 Thread Qian Zhang


> On May 19, 2018, 1:26 a.m., Greg Mann wrote:
> > src/slave/containerizer/composing.cpp
> > Line 669 (original), 618 (patched)
> > 
> >
> > Why return `wait()` when the state is DESTROYING, rather than just 
> > forwarding the call to the underlying containerizer's `destroy()`?
> > 
> > With this implementation, the composing containerizer's `wait()` will 
> > behave differently than the underlying containerizer's `wait()`, is that 
> > what we want?
> 
> Andrei Budnik wrote:
> Good point!
> It looks like `container->containerizer->destroy(containerId).onAny(...)` 
> can be moved from `if` condition to the bottom of the method. That will 
> simplify `ComposingContainerizerProcess::destroy()` implementation.
> WDYT?

> Why return wait() when the state is DESTROYING, rather than just forwarding 
> the call to the underlying containerizer's destroy()?

`DESTROYING` state means a call to the underlying containerizer's `destroy()` 
is already going on, so why do we want to call it again in this case?

> With this implementation, the composing containerizer's wait() will behave 
> differently than the underlying containerizer's wait(), is that what we want?

How will the composing containerizer's `wait()` behave differently than the 
underlying containerizer's `wait()`? I see it is actually just a wrapper of the 
underlying containerizer's `wait()`.


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203428
---


On April 17, 2018, 11:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 11:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/4/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-18 Thread Andrei Budnik


> On May 18, 2018, 5:26 p.m., Greg Mann wrote:
> > src/slave/containerizer/composing.cpp
> > Line 669 (original), 618 (patched)
> > 
> >
> > Why return `wait()` when the state is DESTROYING, rather than just 
> > forwarding the call to the underlying containerizer's `destroy()`?
> > 
> > With this implementation, the composing containerizer's `wait()` will 
> > behave differently than the underlying containerizer's `wait()`, is that 
> > what we want?

Good point!
It looks like `container->containerizer->destroy(containerId).onAny(...)` can 
be moved from `if` condition to the bottom of the method. That will simplify 
`ComposingContainerizerProcess::destroy()` implementation.
WDYT?


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203428
---


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/4/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-18 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203428
---




src/slave/containerizer/composing.cpp
Line 669 (original), 618 (patched)


Why return `wait()` when the state is DESTROYING, rather than just 
forwarding the call to the underlying containerizer's `destroy()`?

With this implementation, the composing containerizer's `wait()` will 
behave differently than the underlying containerizer's `wait()`, is that what 
we want?


- Greg Mann


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/4/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-18 Thread Greg Mann


> On May 15, 2018, 11:05 p.m., Greg Mann wrote:
> > src/slave/containerizer/composing.cpp
> > Line 361 (original), 360-365 (patched)
> > 
> >
> > Moving our previous discussion from https://reviews.apache.org/r/9/ 
> > to this RR, since the two have been merged.
> > 
> > You said:
> > ```
> > Composing c'zer adds a container to the containers_ map and then calls 
> > launch() on underlying c'zer. However, composing c'zer doesn't handle the 
> > case when an underlying containerizer returns a failure during the launch, 
> > so the container will not be removed from the containers_ map. If we want 
> > to call destroy() without the callback, then there is no chance to get the 
> > container removed from the containers_ map in this case.
> > 
> > To check that, you can remove the callback from destroy() (and 
> > optionally copy-paste containerizer->wait(containerId).onAny(...) to 
> > _recover()), then run tests and you'll get a few tests failing, including 
> > ParentChildContainerTypeAndContentType/AgentContainerAPITest.NestedContainerFailLaunch/0.
> > 
> > We could address the issue by handling the launch failures, but there 
> > are 3 places where we should handle it. That change would make composing 
> > c'zer more complex IMO.
> > ```
> > 
> > I'm a bit confused. Are you saying that in the case of launch failure, 
> > we rely on the caller to call `destroy()` in order to remove the container 
> > from the `containers_` map? It looks to me like we handle the launch 
> > failure case with the following code:
> > ```
> >   // If we are here, the launch is not supported by `containerizer`.
> > 
> >   // Try the next containerizer.
> >   ++containerizer;
> > 
> >   if (containerizer == containerizers_.end()) {
> > // If we are here none of the containerizers support the launch.
> > 
> > // We set this to `None` because the container has no chance of
> > // getting launched by any containerizer. This is similar to what
> > // would happen if the destroy "started" after launch returned 
> > false.
> > container->destroyed.set(Option::none());
> > 
> > // We destroy the container irrespective whether
> > // a destroy is already in progress, for simplicity.
> > containers_.erase(containerId);
> > delete container;
> > 
> > // None of the containerizers support the launch.
> > return Containerizer::LaunchResult::NOT_SUPPORTED;
> >   }
> > ```
> > 
> > My hope would be that if the caller attempts to launch a task with the 
> > composing containerizer and the launch fails, then the `containers_` map 
> > will be updated so that the container does not exist anymore. Is that not 
> > the case?
> > 
> > I don't quite understand why it's not simple to remove the 
> > `containers_.erase(containerId)` code from `destroy()` and simply rely on 
> > the code here to erase it whenever `wait()` returns.
> 
> Andrei Budnik wrote:
> >we handle the launch failure case with the following code
> 
> We have 3 places in composing where we call `containerizer->launch()` and 
> then subscribe `_launch()` callback using `.then()` which doesn't call the 
> callback if a future returned by `containerizer->launch()` is failed. See 
> test `NestedContainerFailLaunch` from my previous comment that reproduces 
> this case.
> The following code handles only the case when an underlying c'zer doesn't 
> support the launch.
> 
> I tried to compare two (?) possible solutions:
> 1) Add `.onFailed()` callback in all 3 places where we call 
> `containerizer->launch().next(...)` to clean `containers_` map.
> 2) Add `.onAny()` callback for `containerizer->destroy()` in 
> `ComposingContainer::destroy()` to clean `containers_` map.
> 
> Also, first solution requires adding `containerizer->wait()` code to the 
> `ComposingContainerizerProcess::__recover()` in order to be able to clean 
> `containers_` map after recovery.
> Finally, I decided to implement second solution, because first one "would 
> make composing c'zer more complex IMO".
> 
> Greg Mann wrote:
> OK that makes sense, thanks!!
> 
> So, the current implementation requires that `destroy()` must be called 
> after a failed launch in order to clean up the `containers_` map. So, are we 
> relying on the fact that the caller (Mesos agent) will always call 
> `destroy()` after a failed launch? Do the Mesos and Docker containerizers 
> also require that the agent must do this in order to maintain consistent 
> state in the containerizer?
> 
> Andrei Budnik wrote:
> >Do the Mesos and Docker containerizers also require that the agent must 
> do this in order to maintain consistent state in the containerizer?
> 
> If an attempt to launch a 
> 

Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-18 Thread Andrei Budnik


> On May 15, 2018, 11:05 p.m., Greg Mann wrote:
> > src/slave/containerizer/composing.cpp
> > Line 361 (original), 360-365 (patched)
> > 
> >
> > Moving our previous discussion from https://reviews.apache.org/r/9/ 
> > to this RR, since the two have been merged.
> > 
> > You said:
> > ```
> > Composing c'zer adds a container to the containers_ map and then calls 
> > launch() on underlying c'zer. However, composing c'zer doesn't handle the 
> > case when an underlying containerizer returns a failure during the launch, 
> > so the container will not be removed from the containers_ map. If we want 
> > to call destroy() without the callback, then there is no chance to get the 
> > container removed from the containers_ map in this case.
> > 
> > To check that, you can remove the callback from destroy() (and 
> > optionally copy-paste containerizer->wait(containerId).onAny(...) to 
> > _recover()), then run tests and you'll get a few tests failing, including 
> > ParentChildContainerTypeAndContentType/AgentContainerAPITest.NestedContainerFailLaunch/0.
> > 
> > We could address the issue by handling the launch failures, but there 
> > are 3 places where we should handle it. That change would make composing 
> > c'zer more complex IMO.
> > ```
> > 
> > I'm a bit confused. Are you saying that in the case of launch failure, 
> > we rely on the caller to call `destroy()` in order to remove the container 
> > from the `containers_` map? It looks to me like we handle the launch 
> > failure case with the following code:
> > ```
> >   // If we are here, the launch is not supported by `containerizer`.
> > 
> >   // Try the next containerizer.
> >   ++containerizer;
> > 
> >   if (containerizer == containerizers_.end()) {
> > // If we are here none of the containerizers support the launch.
> > 
> > // We set this to `None` because the container has no chance of
> > // getting launched by any containerizer. This is similar to what
> > // would happen if the destroy "started" after launch returned 
> > false.
> > container->destroyed.set(Option::none());
> > 
> > // We destroy the container irrespective whether
> > // a destroy is already in progress, for simplicity.
> > containers_.erase(containerId);
> > delete container;
> > 
> > // None of the containerizers support the launch.
> > return Containerizer::LaunchResult::NOT_SUPPORTED;
> >   }
> > ```
> > 
> > My hope would be that if the caller attempts to launch a task with the 
> > composing containerizer and the launch fails, then the `containers_` map 
> > will be updated so that the container does not exist anymore. Is that not 
> > the case?
> > 
> > I don't quite understand why it's not simple to remove the 
> > `containers_.erase(containerId)` code from `destroy()` and simply rely on 
> > the code here to erase it whenever `wait()` returns.
> 
> Andrei Budnik wrote:
> >we handle the launch failure case with the following code
> 
> We have 3 places in composing where we call `containerizer->launch()` and 
> then subscribe `_launch()` callback using `.then()` which doesn't call the 
> callback if a future returned by `containerizer->launch()` is failed. See 
> test `NestedContainerFailLaunch` from my previous comment that reproduces 
> this case.
> The following code handles only the case when an underlying c'zer doesn't 
> support the launch.
> 
> I tried to compare two (?) possible solutions:
> 1) Add `.onFailed()` callback in all 3 places where we call 
> `containerizer->launch().next(...)` to clean `containers_` map.
> 2) Add `.onAny()` callback for `containerizer->destroy()` in 
> `ComposingContainer::destroy()` to clean `containers_` map.
> 
> Also, first solution requires adding `containerizer->wait()` code to the 
> `ComposingContainerizerProcess::__recover()` in order to be able to clean 
> `containers_` map after recovery.
> Finally, I decided to implement second solution, because first one "would 
> make composing c'zer more complex IMO".
> 
> Greg Mann wrote:
> OK that makes sense, thanks!!
> 
> So, the current implementation requires that `destroy()` must be called 
> after a failed launch in order to clean up the `containers_` map. So, are we 
> relying on the fact that the caller (Mesos agent) will always call 
> `destroy()` after a failed launch? Do the Mesos and Docker containerizers 
> also require that the agent must do this in order to maintain consistent 
> state in the containerizer?

>Do the Mesos and Docker containerizers also require that the agent must do 
>this in order to maintain consistent state in the containerizer?

If an attempt to launch a 

Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-17 Thread Greg Mann


> On May 15, 2018, 11:05 p.m., Greg Mann wrote:
> > src/slave/containerizer/composing.cpp
> > Line 361 (original), 360-365 (patched)
> > 
> >
> > Moving our previous discussion from https://reviews.apache.org/r/9/ 
> > to this RR, since the two have been merged.
> > 
> > You said:
> > ```
> > Composing c'zer adds a container to the containers_ map and then calls 
> > launch() on underlying c'zer. However, composing c'zer doesn't handle the 
> > case when an underlying containerizer returns a failure during the launch, 
> > so the container will not be removed from the containers_ map. If we want 
> > to call destroy() without the callback, then there is no chance to get the 
> > container removed from the containers_ map in this case.
> > 
> > To check that, you can remove the callback from destroy() (and 
> > optionally copy-paste containerizer->wait(containerId).onAny(...) to 
> > _recover()), then run tests and you'll get a few tests failing, including 
> > ParentChildContainerTypeAndContentType/AgentContainerAPITest.NestedContainerFailLaunch/0.
> > 
> > We could address the issue by handling the launch failures, but there 
> > are 3 places where we should handle it. That change would make composing 
> > c'zer more complex IMO.
> > ```
> > 
> > I'm a bit confused. Are you saying that in the case of launch failure, 
> > we rely on the caller to call `destroy()` in order to remove the container 
> > from the `containers_` map? It looks to me like we handle the launch 
> > failure case with the following code:
> > ```
> >   // If we are here, the launch is not supported by `containerizer`.
> > 
> >   // Try the next containerizer.
> >   ++containerizer;
> > 
> >   if (containerizer == containerizers_.end()) {
> > // If we are here none of the containerizers support the launch.
> > 
> > // We set this to `None` because the container has no chance of
> > // getting launched by any containerizer. This is similar to what
> > // would happen if the destroy "started" after launch returned 
> > false.
> > container->destroyed.set(Option::none());
> > 
> > // We destroy the container irrespective whether
> > // a destroy is already in progress, for simplicity.
> > containers_.erase(containerId);
> > delete container;
> > 
> > // None of the containerizers support the launch.
> > return Containerizer::LaunchResult::NOT_SUPPORTED;
> >   }
> > ```
> > 
> > My hope would be that if the caller attempts to launch a task with the 
> > composing containerizer and the launch fails, then the `containers_` map 
> > will be updated so that the container does not exist anymore. Is that not 
> > the case?
> > 
> > I don't quite understand why it's not simple to remove the 
> > `containers_.erase(containerId)` code from `destroy()` and simply rely on 
> > the code here to erase it whenever `wait()` returns.
> 
> Andrei Budnik wrote:
> >we handle the launch failure case with the following code
> 
> We have 3 places in composing where we call `containerizer->launch()` and 
> then subscribe `_launch()` callback using `.then()` which doesn't call the 
> callback if a future returned by `containerizer->launch()` is failed. See 
> test `NestedContainerFailLaunch` from my previous comment that reproduces 
> this case.
> The following code handles only the case when an underlying c'zer doesn't 
> support the launch.
> 
> I tried to compare two (?) possible solutions:
> 1) Add `.onFailed()` callback in all 3 places where we call 
> `containerizer->launch().next(...)` to clean `containers_` map.
> 2) Add `.onAny()` callback for `containerizer->destroy()` in 
> `ComposingContainer::destroy()` to clean `containers_` map.
> 
> Also, first solution requires adding `containerizer->wait()` code to the 
> `ComposingContainerizerProcess::__recover()` in order to be able to clean 
> `containers_` map after recovery.
> Finally, I decided to implement second solution, because first one "would 
> make composing c'zer more complex IMO".

OK that makes sense, thanks!!

So, the current implementation requires that `destroy()` must be called after a 
failed launch in order to clean up the `containers_` map. So, are we relying on 
the fact that the caller (Mesos agent) will always call `destroy()` after a 
failed launch? Do the Mesos and Docker containerizers also require that the 
agent must do this in order to maintain consistent state in the containerizer?


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203185
---


On April 17, 2018, 3:23 p.m., Andrei 

Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-16 Thread Andrei Budnik


> On May 15, 2018, 11:05 p.m., Greg Mann wrote:
> > src/slave/containerizer/composing.cpp
> > Line 361 (original), 360-365 (patched)
> > 
> >
> > Moving our previous discussion from https://reviews.apache.org/r/9/ 
> > to this RR, since the two have been merged.
> > 
> > You said:
> > ```
> > Composing c'zer adds a container to the containers_ map and then calls 
> > launch() on underlying c'zer. However, composing c'zer doesn't handle the 
> > case when an underlying containerizer returns a failure during the launch, 
> > so the container will not be removed from the containers_ map. If we want 
> > to call destroy() without the callback, then there is no chance to get the 
> > container removed from the containers_ map in this case.
> > 
> > To check that, you can remove the callback from destroy() (and 
> > optionally copy-paste containerizer->wait(containerId).onAny(...) to 
> > _recover()), then run tests and you'll get a few tests failing, including 
> > ParentChildContainerTypeAndContentType/AgentContainerAPITest.NestedContainerFailLaunch/0.
> > 
> > We could address the issue by handling the launch failures, but there 
> > are 3 places where we should handle it. That change would make composing 
> > c'zer more complex IMO.
> > ```
> > 
> > I'm a bit confused. Are you saying that in the case of launch failure, 
> > we rely on the caller to call `destroy()` in order to remove the container 
> > from the `containers_` map? It looks to me like we handle the launch 
> > failure case with the following code:
> > ```
> >   // If we are here, the launch is not supported by `containerizer`.
> > 
> >   // Try the next containerizer.
> >   ++containerizer;
> > 
> >   if (containerizer == containerizers_.end()) {
> > // If we are here none of the containerizers support the launch.
> > 
> > // We set this to `None` because the container has no chance of
> > // getting launched by any containerizer. This is similar to what
> > // would happen if the destroy "started" after launch returned 
> > false.
> > container->destroyed.set(Option::none());
> > 
> > // We destroy the container irrespective whether
> > // a destroy is already in progress, for simplicity.
> > containers_.erase(containerId);
> > delete container;
> > 
> > // None of the containerizers support the launch.
> > return Containerizer::LaunchResult::NOT_SUPPORTED;
> >   }
> > ```
> > 
> > My hope would be that if the caller attempts to launch a task with the 
> > composing containerizer and the launch fails, then the `containers_` map 
> > will be updated so that the container does not exist anymore. Is that not 
> > the case?
> > 
> > I don't quite understand why it's not simple to remove the 
> > `containers_.erase(containerId)` code from `destroy()` and simply rely on 
> > the code here to erase it whenever `wait()` returns.

>we handle the launch failure case with the following code

We have 3 places in composing where we call `containerizer->launch()` and then 
subscribe `_launch()` callback using `.then()` which doesn't call the callback 
if a future returned by `containerizer->launch()` is failed. See test 
`NestedContainerFailLaunch` from my previous comment that reproduces this case.
The following code handles only the case when an underlying c'zer doesn't 
support the launch.

I tried to compare two (?) possible solutions:
1) Add `.onFailed()` callback in all 3 places where we call 
`containerizer->launch().next(...)` to clean `containers_` map.
2) Add `.onAny()` callback for `containerizer->destroy()` in 
`ComposingContainer::destroy()` to clean `containers_` map.

Also, first solution requires adding `containerizer->wait()` code to the 
`ComposingContainerizerProcess::__recover()` in order to be able to clean 
`containers_` map after recovery.
Finally, I decided to implement second solution, because first one "would make 
composing c'zer more complex IMO".


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203185
---


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored 

Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-15 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review203185
---




src/slave/containerizer/composing.cpp
Line 361 (original), 360-365 (patched)


Moving our previous discussion from https://reviews.apache.org/r/9/ to 
this RR, since the two have been merged.

You said:
```
Composing c'zer adds a container to the containers_ map and then calls 
launch() on underlying c'zer. However, composing c'zer doesn't handle the case 
when an underlying containerizer returns a failure during the launch, so the 
container will not be removed from the containers_ map. If we want to call 
destroy() without the callback, then there is no chance to get the container 
removed from the containers_ map in this case.

To check that, you can remove the callback from destroy() (and optionally 
copy-paste containerizer->wait(containerId).onAny(...) to _recover()), then run 
tests and you'll get a few tests failing, including 
ParentChildContainerTypeAndContentType/AgentContainerAPITest.NestedContainerFailLaunch/0.

We could address the issue by handling the launch failures, but there are 3 
places where we should handle it. That change would make composing c'zer more 
complex IMO.
```

I'm a bit confused. Are you saying that in the case of launch failure, we 
rely on the caller to call `destroy()` in order to remove the container from 
the `containers_` map? It looks to me like we handle the launch failure case 
with the following code:
```
  // If we are here, the launch is not supported by `containerizer`.

  // Try the next containerizer.
  ++containerizer;

  if (containerizer == containerizers_.end()) {
// If we are here none of the containerizers support the launch.

// We set this to `None` because the container has no chance of
// getting launched by any containerizer. This is similar to what
// would happen if the destroy "started" after launch returned false.
container->destroyed.set(Option::none());

// We destroy the container irrespective whether
// a destroy is already in progress, for simplicity.
containers_.erase(containerId);
delete container;

// None of the containerizers support the launch.
return Containerizer::LaunchResult::NOT_SUPPORTED;
  }
```

My hope would be that if the caller attempts to launch a task with the 
composing containerizer and the launch fails, then the `containers_` map will 
be updated so that the container does not exist anymore. Is that not the case?

I don't quite understand why it's not simple to remove the 
`containers_.erase(containerId)` code from `destroy()` and simply rely on the 
code here to erase it whenever `wait()` returns.


- Greg Mann


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we stored `destroyed` promise for each container and used
> it to guarantee that `destroy()` returns a non-empty value when the
> destroy-in-progress stops an launch-in-progress using the next
> containerizer. Since `wait()` and `destroy()` return the same
> `ContainerTermination` value when called with the same ContainerID
> argument, we can remove `destroyed` promise and add callbacks to
> clean up `containers_` map instead.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 1fb79f53b48154ecbd3b0165b6a8002c871e6dce 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/4/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-11 Thread Andrei Budnik


> On May 9, 2018, 3:15 p.m., Qian Zhang wrote:
> > src/slave/containerizer/composing.cpp
> > Lines 658-661 (original)
> > 
> >
> > If we remove these codes, will the container be missed to delete and 
> > erased from `containers_`?
> 
> Qian Zhang wrote:
> Ah, Just saw the code to delete it in the next patch. Anyway I think it 
> may be better to put that change into this patch :-)
> 
> Andrei Budnik wrote:
> `container_` cleanup is implemented in the following patch `/r/9/`.

Merged the next patch into this one.


- Andrei


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review202746
---


On April 17, 2018, 3:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 3:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we were using `destroyed` promise for each container to
> guarantee that calling `destroy()` on a container returns a non-empty
> value when the destroy-in-progress stops an launch-in-progress using
> the next containerizer. Since we are unifying the semantics of the
> return type for both `wait()` and `destroy()` operations, we can
> remove the `destroyed` promise.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 186102c66d373dcd799cadd9fed7d1c8cb894971 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/2/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-11 Thread Andrei Budnik


> On May 9, 2018, 3:15 p.m., Qian Zhang wrote:
> > I checked the callers of `Containerizer::destroy()`, it seems no one 
> > actually cares about its return value (`Option`), so 
> > why do we need to return that? Can we just make it return `Future`?
> 
> Qian Zhang wrote:
> And then for the last line of `ComposingContainerizerProcess::destroy`:
> ```
> return container->containerizer->wait(containerId);
> ```
> We could change it to something like:
> ```
> return container->containerizer->wait(containerId)
> .onAny([](const Future

Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-09 Thread Qian Zhang


> On May 9, 2018, 11:15 p.m., Qian Zhang wrote:
> > src/slave/containerizer/composing.cpp
> > Lines 658-661 (original)
> > 
> >
> > If we remove these codes, will the container be missed to delete and 
> > erased from `containers_`?

Ah, Just saw the code to delete it in the next patch. Anyway I think it may be 
better to put that change into this patch :-)


- Qian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review202746
---


On April 17, 2018, 11:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 11:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we were using `destroyed` promise for each container to
> guarantee that calling `destroy()` on a container returns a non-empty
> value when the destroy-in-progress stops an launch-in-progress using
> the next containerizer. Since we are unifying the semantics of the
> return type for both `wait()` and `destroy()` operations, we can
> remove the `destroyed` promise.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 186102c66d373dcd799cadd9fed7d1c8cb894971 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/2/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-09 Thread Qian Zhang


> On May 9, 2018, 11:15 p.m., Qian Zhang wrote:
> > I checked the callers of `Containerizer::destroy()`, it seems no one 
> > actually cares about its return value (`Option`), so 
> > why do we need to return that? Can we just make it return `Future`?

And then for the last line of `ComposingContainerizerProcess::destroy`:
```
return container->containerizer->wait(containerId);
```
We could change it to something like:
```
return container->containerizer->wait(containerId)
.onAny([](const Future

Re: Review Request 66668: Removed `destroyed` from `Container` struct in composing containerizer.

2018-05-09 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8/#review202746
---



I checked the callers of `Containerizer::destroy()`, it seems no one actually 
cares about its return value (`Option`), so why do we 
need to return that? Can we just make it return `Future`?


src/slave/containerizer/composing.cpp
Line 610 (original), 594 (patched)


Do we still need the `switch case`? It seems an `if else` is more readable.



src/slave/containerizer/composing.cpp
Lines 658-661 (original)


If we remove these codes, will the container be missed to delete and erased 
from `containers_`?


- Qian Zhang


On April 17, 2018, 11:23 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8/
> ---
> 
> (Updated April 17, 2018, 11:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Greg Mann, Jie Yu, and Qian 
> Zhang.
> 
> 
> Bugs: MESOS-8712
> https://issues.apache.org/jira/browse/MESOS-8712
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we were using `destroyed` promise for each container to
> guarantee that calling `destroy()` on a container returns a non-empty
> value when the destroy-in-progress stops an launch-in-progress using
> the next containerizer. Since we are unifying the semantics of the
> return type for both `wait()` and `destroy()` operations, we can
> remove the `destroyed` promise.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.cpp 
> 186102c66d373dcd799cadd9fed7d1c8cb894971 
> 
> 
> Diff: https://reviews.apache.org/r/8/diff/2/
> 
> 
> Testing
> ---
> 
> internal CI
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>