-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48453/#review136777
-----------------------------------------------------------




src/sched/sched.cpp (line 1001)
<https://reviews.apache.org/r/48453/#comment201868>

    What happens in the following scenario:
    
    * framework launches task with executor (=> add UPID to `taskPids`)
    * agent where the task is running fails health checks (=> framework
      receives `TASK_LOST`, which is considered a terminal state per
      `isTerminalState()`, so we remove the UPID from `taskPids`)
    * master fails over and we reregister with a new master
    * agent reregisters with the master; this is allowed, per non-strict
      registry
    * we get `TASK_RUNNING` for the task
    
    ISTM we won't track the executor in `executorPids`, although we should.
    
    In general, the logic here seems pretty complicated and a little
    arbitrary...
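
    To make the scenario above concrete, here is a minimal sketch of how I
    read the tracking logic. It is a toy model, not the code in the patch:
    `Tracker`, `Pending`, `launch()`, `statusUpdate()` and `replayScenario()`
    are hypothetical names, and I am assuming that a `TASK_RUNNING` update
    promotes the entry from `taskPids` to `executorPids`.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified model of the tracking logic under review.
// Only the two task states that matter for the scenario are modelled.
enum class TaskState { TASK_RUNNING, TASK_LOST };

// Stand-in for the real `isTerminalState()`; here only TASK_LOST is terminal.
bool isTerminalState(TaskState state) {
  return state == TaskState::TASK_LOST;
}

// What is remembered for a task between launch and its first TASK_RUNNING.
struct Pending {
  std::string executorId;
  std::string pid;  // executor UPID, kept as a plain string in this model
};

struct Tracker {
  std::map<std::string, Pending> taskPids;          // task ID -> pending entry
  std::map<std::string, std::string> executorPids;  // executor ID -> UPID

  // Launching a task records the executor UPID under the task ID.
  void launch(const std::string& taskId,
              const std::string& executorId,
              const std::string& pid) {
    taskPids[taskId] = Pending{executorId, pid};
  }

  // Assumed behaviour: a terminal update drops the pending entry, and
  // TASK_RUNNING promotes it to `executorPids`.
  void statusUpdate(const std::string& taskId, TaskState state) {
    auto it = taskPids.find(taskId);
    if (it == taskPids.end()) {
      return;  // nothing is remembered for this task any more
    }
    if (isTerminalState(state)) {
      taskPids.erase(it);
      return;
    }
    if (state == TaskState::TASK_RUNNING) {
      executorPids[it->second.executorId] = it->second.pid;
      taskPids.erase(it);
    }
  }
};

// Replays the scenario: launch, TASK_LOST, (master failover and agent
// re-registration, which do not touch the maps in this model), TASK_RUNNING.
Tracker replayScenario() {
  Tracker tracker;
  tracker.launch("task-1", "executor-1", "executor(1)@10.0.0.1:5051");
  tracker.statusUpdate("task-1", TaskState::TASK_LOST);
  tracker.statusUpdate("task-1", TaskState::TASK_RUNNING);
  return tracker;
}
```

    Under these assumptions, `replayScenario().executorPids` ends up empty
    even though the executor is still running, which is the gap I am asking
    about. If the patch behaves differently, it would help to spell out where.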



src/sched/sched.cpp (line 1134)
<https://reviews.apache.org/r/48453/#comment201866>

    Can this actually occur?



src/sched/sched.cpp (line 1669)
<https://reviews.apache.org/r/48453/#comment201867>

    Is `taskPids` the best name here? Seems like we use this only to store
    task PIDs in the transient period between launching a task and getting a
    `TASK_RUNNING` update for it.


- Neil Conway


On June 9, 2016, 1:08 a.m., Anindya Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48453/
> -----------------------------------------------------------
> 
> (Updated June 9, 2016, 1:08 a.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-5143
>     https://issues.apache.org/jira/browse/MESOS-5143
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Since UPIDs are tracked in the scheduler driver so that it can send a
> FrameworkMessage directly to an executor, we now track UPIDs for an
> executor running on an agent (instead of for a slave). We track this
> mapping only for the life of the executor (instead of the life of the
> agent). This lets us send the lost slave message only to the relevant
> frameworks instead of to all frameworks.
> 
> 
> Diffs
> -----
> 
>   src/sched/sched.cpp 9f561d73a2e591afdc3ba4adb35a11763dced402 
> 
> Diff: https://reviews.apache.org/r/48453/diff/
> 
> 
> Testing
> -------
> 
> All tests passed.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>
