Re: Review Request 66259: Added `max_completion_time` support to command executor.

2018-05-01 Thread Zhitao Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66259/
---

(Updated May 1, 2018, 9:41 a.m.)


Review request for mesos, Jason Lai and James Peach.


Changes
---

Review comments.


Bugs: MESOS-8725
https://issues.apache.org/jira/browse/MESOS-8725


Repository: mesos


Description
---

If `TaskInfo.max_completion_time` is set, command executor will kill
the task with `SIGKILL` immediately. Note that no KillPolicy will be
observed. Framework should only received a `TASK_FAILED` state with
`REASON_MAX_COMPLETION_TIME_REACHED` reason.


Diffs (updated)
-

  src/launcher/executor.cpp 8d0869cfcdfc693301d0e185a036e97204bad17f 


Diff: https://reviews.apache.org/r/66259/diff/6/

Changes: https://reviews.apache.org/r/66259/diff/5-6/


Testing
---


Thanks,

Zhitao Li



Re: Review Request 66259: Added `max_completion_time` support to command executor.

2018-04-30 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66259/#review202150
---


Fix it, then Ship it!





src/launcher/executor.cpp
Lines 1052 (patched)


```
"its maximum completion time of " << duration;
```



src/launcher/executor.cpp
Lines 1187 (patched)


This comment looks stale?


- James Peach


On April 23, 2018, 10:18 p.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66259/
> ---
> 
> (Updated April 23, 2018, 10:18 p.m.)
> 
> 
> Review request for mesos, Jason Lai and James Peach.
> 
> 
> Bugs: MESOS-8725
> https://issues.apache.org/jira/browse/MESOS-8725
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> If `TaskInfo.max_completion_time` is set, command executor will kill
> the task with `SIGKILL` immediately. Note that no KillPolicy will be
> observed. Framework should only received a `TASK_FAILED` state with
> `REASON_MAX_COMPLETION_TIME_REACHED` reason.
> 
> 
> Diffs
> -
> 
>   src/launcher/executor.cpp 8d0869cfcdfc693301d0e185a036e97204bad17f 
> 
> 
> Diff: https://reviews.apache.org/r/66259/diff/5/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Re: Review Request 66259: Added `max_completion_time` support to command executor.

2018-04-23 Thread Zhitao Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66259/
---

(Updated April 23, 2018, 3:18 p.m.)


Review request for mesos, Jason Lai and James Peach.


Changes
---

Review comments.


Bugs: MESOS-8725
https://issues.apache.org/jira/browse/MESOS-8725


Repository: mesos


Description
---

If `TaskInfo.max_completion_time` is set, command executor will kill
the task with `SIGKILL` immediately. Note that no KillPolicy will be
observed. Framework should only received a `TASK_FAILED` state with
`REASON_MAX_COMPLETION_TIME_REACHED` reason.


Diffs (updated)
-

  src/launcher/executor.cpp 8d0869cfcdfc693301d0e185a036e97204bad17f 


Diff: https://reviews.apache.org/r/66259/diff/5/

Changes: https://reviews.apache.org/r/66259/diff/4-5/


Testing
---


Thanks,

Zhitao Li



Re: Review Request 66259: Added `max_completion_time` support to command executor.

2018-04-13 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66259/#review201112
---




src/launcher/executor.cpp
Lines 590 (patched)


I think that it makes more sense to move this block to just before we call 
`launchTaskSubprocess`.

I wonder whether we should also make the log message show `Starting task 
$TASKID with maximum completion time of $DURATION`?



src/launcher/executor.cpp
Lines 750 (patched)


Move this into the `if` above



src/launcher/executor.cpp
Lines 936 (patched)


Should set maxCompletionTimer to `None()` as well, for consistency.



src/launcher/executor.cpp
Lines 1008 (patched)


Can `pid` ever be validly `None()` due to your changes? If it can, I'm not 
seeing the reason ...



src/launcher/executor.cpp
Lines 1036 (patched)


How about calling this `taskCompletionTimeout`?



src/launcher/executor.cpp
Lines 1037 (patched)


Consider adding a `CHECK(!killed)` here, since we should never do 
completion time handling after receiving a scheduler kill.



src/launcher/executor.cpp
Lines 1038 (patched)


I think this can be a `CHECK(!terminated)` since you cancel the completion 
timeout in `reaped()`.



src/launcher/executor.cpp
Lines 1043 (patched)


Suggest that we phrase this more naturally:
```
Killing task $TASKID which exceeded its maximum completion time
```

I'm assuming that you wanted to depend on the log message in `escalated` to 
capture the completion time?



src/launcher/executor.cpp
Lines 1049 (patched)


What happens if the scheduler sends a `kill` after the completion timeout 
fires? In this code, it seems like the `kill` will still be processed; there is 
also some inconsistency because we are killing the task but don't set the 
`killed` flag.

Maybe a better approach would be to call `kill()` from here with a 0 grace 
period? Then in `kill()` I'd argue that a 0 grace period should skip the 
`SIGTERM` phase.


- James Peach


On April 8, 2018, 12:51 a.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66259/
> ---
> 
> (Updated April 8, 2018, 12:51 a.m.)
> 
> 
> Review request for mesos, Jason Lai and James Peach.
> 
> 
> Bugs: MESOS-8725
> https://issues.apache.org/jira/browse/MESOS-8725
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> If `TaskInfo.max_completion_time` is set, command executor will kill
> the task with `SIGKILL` immediately. Note that no KillPolicy will be
> observed. Framework should only received a `TASK_FAILED` state with
> `REASON_MAX_COMPLETION_TIME_REACHED` reason.
> 
> 
> Diffs
> -
> 
>   src/launcher/executor.cpp 8d0869cfcdfc693301d0e185a036e97204bad17f 
> 
> 
> Diff: https://reviews.apache.org/r/66259/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Re: Review Request 66259: Added `max_completion_time` support to command executor.

2018-04-07 Thread Zhitao Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66259/
---

(Updated April 7, 2018, 5:51 p.m.)


Review request for mesos, Jason Lai and James Peach.


Changes
---

Rename and proper state.


Summary (updated)
-

Added `max_completion_time` support to command executor.


Bugs: MESOS-8725
https://issues.apache.org/jira/browse/MESOS-8725


Repository: mesos


Description (updated)
---

If `TaskInfo.max_completion_time` is set, command executor will kill
the task with `SIGKILL` immediately. Note that no KillPolicy will be
observed. Framework should only received a `TASK_FAILED` state with
`REASON_MAX_COMPLETION_TIME_REACHED` reason.


Diffs (updated)
-

  src/launcher/executor.cpp 383655230fcdad30b652ea984cb9f5fc9c49dd38 


Diff: https://reviews.apache.org/r/66259/diff/2/

Changes: https://reviews.apache.org/r/66259/diff/1-2/


Testing
---


Thanks,

Zhitao Li