Re: Review Request 52149: Added support for waiting on child containers to the default executor.

2016-09-26 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52149/
---

(Updated Sept. 26, 2016, 10:42 p.m.)


Review request for mesos and Vinod Kone.


Changes
---

Review comments, NNFR.


Bugs: MESOS-6227
https://issues.apache.org/jira/browse/MESOS-6227


Repository: mesos


Description
---

This change adds support for waiting on child containers via the
`WAIT_NESTED_CONTAINER` call on the Agent API. If the connection
fails due to a temporary network blip, it reconnects with the
agent.
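
For readers without the diff at hand, the wait-and-reconnect behaviour described above can be sketched roughly as follows. This is a minimal standalone illustration, not the code under review: `WaitResult`, `WaitCall`, `Reconnect`, `waitWithReconnect` and the `maxAttempts` bound are all hypothetical names, and the real executor works with asynchronous futures rather than a blocking loop.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical stand-in for the outcome of one WAIT_NESTED_CONTAINER call:
// an exit status on success, or nothing if the connection failed.
using WaitResult = std::optional<int>;

// Hypothetical transport that issues one wait call for a child container.
using WaitCall = std::function<WaitResult(const std::string&)>;

// Hypothetical hook that re-establishes the connection with the agent.
using Reconnect = std::function<void()>;

// Waits on a child container, reconnecting after every failed attempt,
// until a status arrives or `maxAttempts` attempts have been made.
WaitResult waitWithReconnect(
    const std::string& containerId,
    const WaitCall& wait,
    const Reconnect& reconnect,
    int maxAttempts)
{
  assert(maxAttempts > 0);

  for (int attempt = 0; attempt < maxAttempts; ++attempt) {
    WaitResult result = wait(containerId);
    if (result.has_value()) {
      return result;
    }

    // The connection failed (e.g. a temporary network blip), so
    // reconnect before issuing the next wait call.
    reconnect();
  }

  return std::nullopt;
}
```

A caller would pass its own transport and reconnect hooks; the attempt bound exists only to keep the sketch finite.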


Diffs (updated)
---

  src/launcher/default_executor.cpp 2102fe8d70f0960fed669e1c4f0d6b6cd4af261c 

Diff: https://reviews.apache.org/r/52149/diff/


Testing
---

make check


Thanks,

Anand Mazumdar



Re: Review Request 52149: Added support for waiting on child containers to the default executor.

2016-09-26 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52149/#review150433
---


Fix it, then Ship it!





src/launcher/default_executor.cpp (line 103)


log a statement here?



src/launcher/default_executor.cpp (line 462)


LOG(INFO) maybe because it is making an API call?



src/launcher/default_executor.cpp (line 827)


Add a comment here for why the delay is needed?


- Vinod Kone





Re: Review Request 52149: Added support for waiting on child containers to the default executor.

2016-09-25 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52149/
---

(Updated Sept. 26, 2016, 4:31 a.m.)


Review request for mesos and Vinod Kone.


Changes
---

Added handling for the case where the agent is still recovering when we
make the `WAIT_NESTED_CONTAINER` call after an agent failure.


Bugs: MESOS-6227
https://issues.apache.org/jira/browse/MESOS-6227


Repository: mesos


Description
---

This change adds support for waiting on child containers via the
`WAIT_NESTED_CONTAINER` call on the Agent API. If the connection
fails due to a temporary network blip, it reconnects with the
agent.


Diffs (updated)
---

  src/launcher/default_executor.cpp 2102fe8d70f0960fed669e1c4f0d6b6cd4af261c 

Diff: https://reviews.apache.org/r/52149/diff/


Testing
---

make check


Thanks,

Anand Mazumdar



Re: Review Request 52149: Added support for waiting on child containers to the default executor.

2016-09-25 Thread Anand Mazumdar


> On Sept. 24, 2016, 9:57 p.m., Vinod Kone wrote:
> > src/launcher/default_executor.cpp, lines 797-803
> > 
> >
> > Would be great if these can be merged into one struct `Container`
> > 
> > struct Container
> > {
> >   ContainerID containerId;
> >   TaskID taskId;
> >   Option waiting;
> > };

I left a `TODO` to address this.


- Anand


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52149/#review150318
---





Re: Review Request 52149: Added support for waiting on child containers to the default executor.

2016-09-25 Thread Anand Mazumdar


> On Sept. 24, 2016, 9:53 p.m., Vinod Kone wrote:
> > src/launcher/default_executor.cpp, line 427
> > 
> >
> > just inline this function.

This is also invoked asynchronously by the `retry()` handler, so it can't be
inlined.


> On Sept. 24, 2016, 9:53 p.m., Vinod Kone wrote:
> > src/launcher/default_executor.cpp, line 132
> > 
> >
> > don't you want to check `if (!killed)`?

hmm, take this scenario:

- We had sent `KILL_NESTED_CONTAINER` call(s) for killing child containers. 
While we were waiting on those containers to terminate, we got disconnected 
from the agent.
- We ignored all `waited()` callbacks thereafter owing to being disconnected.
- When the agent process was up again and after the executor subscribed again 
with the agent, we still want to `wait()` on the child containers to correctly 
forward the TASK_KILLED status updates to the scheduler. If we have the boolean 
`killed` check, we won't be able to `wait()` on the child containers any more.
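
The scenario above can be modelled with a small standalone sketch. It is only an illustration under simplifying assumptions: `ExecutorModel` and all of its members are hypothetical, and the real executor tracks far more state. The point it tries to show is that `subscribed()` re-establishes the waits without consulting `killed`.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, heavily simplified model of the executor state.
struct ExecutorModel
{
  bool connected = false;
  bool killed = false;             // Set once the kill calls were sent.
  std::set<std::string> children;  // Child containers not yet terminated.
  std::set<std::string> waiting;   // Children with a wait in flight.

  void kill() { killed = true; }

  void disconnected()
  {
    connected = false;
    waiting.clear();  // In-flight waits are lost with the connection.
  }

  // Called once the executor has subscribed again with the agent.
  // Deliberately does NOT check `killed`: the children still have to
  // be waited on so that their terminal updates can be forwarded.
  void subscribed()
  {
    connected = true;
    for (const std::string& id : children) {
      waiting.insert(id);
    }
  }

  // Returns true if a terminal status update would be forwarded.
  bool waited(const std::string& id)
  {
    if (!connected || waiting.erase(id) == 0) {
      return false;  // Callback ignored.
    }

    assert(children.count(id) == 1);
    children.erase(id);
    return true;
  }
};
```

If `subscribed()` returned early whenever `killed` was set, `waiting` would stay empty after the reconnect and every later `waited()` callback would be ignored.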


- Anand


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52149/#review150316
---





Re: Review Request 52149: Added support for waiting on child containers to the default executor.

2016-09-25 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52149/
---

(Updated Sept. 25, 2016, 10:05 p.m.)


Review request for mesos and Vinod Kone.


Changes
---

Review comments.


Bugs: MESOS-6227
https://issues.apache.org/jira/browse/MESOS-6227


Repository: mesos


Description
---

This change adds support for waiting on child containers via the
`WAIT_NESTED_CONTAINER` call on the Agent API. If the connection
fails due to a temporary network blip, it reconnects with the
agent.


Diffs (updated)
---

  src/launcher/default_executor.cpp 2102fe8d70f0960fed669e1c4f0d6b6cd4af261c 

Diff: https://reviews.apache.org/r/52149/diff/


Testing
---

make check


Thanks,

Anand Mazumdar



Re: Review Request 52149: Added support for waiting on child containers to the default executor.

2016-09-24 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52149/#review150318
---




src/launcher/default_executor.cpp (lines 796 - 802)


Would be great if these can be merged into one struct `Container`

struct Container
{
  ContainerID containerId;
  TaskID taskId;
  Option waiting;
};
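
As a rough illustration of how such a merged struct might be used, here is a standalone sketch. Everything in it is an assumption made for the example: the string aliases stand in for the Mesos `ContainerID` and `TaskID` types, `std::optional` stands in for `Option`, and since the comment above does not show what `Option` wraps, a hypothetical `Connection` handle is used.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-ins for the Mesos types named in the comment.
using ContainerID = std::string;
using TaskID = std::string;

// Hypothetical handle for an in-flight wait; the wrapped type is an
// assumption of this sketch.
struct Connection
{
  int id;
};

struct Container
{
  ContainerID containerId;
  TaskID taskId;
  std::optional<Connection> waiting;  // Empty when no wait is in flight.
};

// A single map keyed by task ID replaces several parallel members.
using Containers = std::map<TaskID, Container>;

// Counts the child containers that currently have no wait in flight.
std::size_t countNotWaiting(const Containers& containers)
{
  std::size_t count = 0;
  for (const auto& [taskId, container] : containers) {
    assert(taskId == container.taskId);
    if (!container.waiting.has_value()) {
      ++count;
    }
  }
  return count;
}
```

Keeping the per-container fields together means a lookup by task ID yields the container ID and the wait state in one step.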


- Vinod Kone





Re: Review Request 52149: Added support for waiting on child containers to the default executor.

2016-09-24 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52149/#review150316
---




src/launcher/default_executor.cpp (line 132)


don't you want to check `if (!killed)`?



src/launcher/default_executor.cpp (line 427)


just inline this function.



src/launcher/default_executor.cpp (line 482)


log a statement here.



src/launcher/default_executor.cpp (line 501)


Log here?



src/launcher/default_executor.cpp (line 753)


log a statement here.



src/launcher/default_executor.cpp (lines 762 - 775)


I think it is worth logging inside the onFailed and onDiscarded handlers
before firing a delay.
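
A minimal sketch of that idea, written as standalone C++ rather than against libprocess: `Outcome`, `RetryPlan`, `handleWaitOutcome`, the one-second delay and the log wording are all hypothetical, and the log is collected in a vector only so the example can be checked.

```cpp
#include <cassert>
#include <chrono>
#include <optional>
#include <string>
#include <vector>

// Hypothetical outcomes of the wait on a child container.
enum class Outcome { READY, FAILED, DISCARDED };

// Hypothetical description of a scheduled retry.
struct RetryPlan
{
  std::chrono::seconds delay;
};

// Logs a line for failed and discarded waits before scheduling the
// retry; returns nothing when no retry is needed.
std::optional<RetryPlan> handleWaitOutcome(
    Outcome outcome,
    const std::string& containerId,
    std::vector<std::string>& log)
{
  assert(!containerId.empty());

  const std::chrono::seconds delay(1);  // Assumed value for the sketch.
  const std::string suffix =
    "; retrying in " + std::to_string(delay.count()) + "s";

  switch (outcome) {
    case Outcome::READY:
      return std::nullopt;
    case Outcome::FAILED:
      log.push_back("Wait on container " + containerId + " failed" + suffix);
      return RetryPlan{delay};
    case Outcome::DISCARDED:
      log.push_back("Wait on container " + containerId + " discarded" + suffix);
      return RetryPlan{delay};
  }

  return std::nullopt;
}
```

Logging before the delay is armed leaves a record of why the retry happened even if the executor exits before the retry fires.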


- Vinod Kone





Re: Review Request 52149: Added support for waiting on child containers to the default executor.

2016-09-22 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52149/
---

(Updated Sept. 23, 2016, 3:28 a.m.)


Review request for mesos and Vinod Kone.


Changes
---

Review comments


Bugs: MESOS-6227
https://issues.apache.org/jira/browse/MESOS-6227


Repository: mesos


Description
---

This change adds support for waiting on child containers via the
`WAIT_NESTED_CONTAINER` call on the Agent API. If the connection
fails due to a temporary network blip, it reconnects with the
agent.


Diffs (updated)
---

  src/launcher/default_executor.cpp 2102fe8d70f0960fed669e1c4f0d6b6cd4af261c 

Diff: https://reviews.apache.org/r/52149/diff/


Testing
---

make check


Thanks,

Anand Mazumdar



Re: Review Request 52149: Added support for waiting on child containers to the default executor.

2016-09-22 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52149/
---

(Updated Sept. 22, 2016, 7:33 p.m.)


Review request for mesos and Vinod Kone.


Bugs: MESOS-6227
https://issues.apache.org/jira/browse/MESOS-6227


Repository: mesos


Description
---

This change adds support for waiting on child containers via the
`WAIT_NESTED_CONTAINER` call on the Agent API. If the connection
fails due to a temporary network blip, it reconnects with the
agent.


Diffs (updated)
---

  src/launcher/default_executor.cpp 2102fe8d70f0960fed669e1c4f0d6b6cd4af261c 

Diff: https://reviews.apache.org/r/52149/diff/


Testing
---

make check


Thanks,

Anand Mazumdar