-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61473/#review191976
-----------------------------------------------------------



Almost there!


include/mesos/mesos.proto
Lines 344-346 (original), 342 (patched)
<https://reviews.apache.org/r/61473/#comment269950>

    Given the potential confusion about the chagned behavior, at the API level 
I think there's value in explaining the history a little bit:
    
    ```
    Prior to Mesos 1.5, such tasks will be killed by Mesos when the agent 
reregisters (unless the master has failed over). However due to the lack of 
benefit in maintaining different behaviors depending on whether the master has 
failed over (see MESOS-7215), as of 1.5, Mesos will not kill these tasks in 
either case.
    ```
    
    How does it soud?



include/mesos/v1/mesos.proto
Lines 342-344 (original), 340 (patched)
<https://reviews.apache.org/r/61473/#comment269957>

    Ditto.



src/master/http.cpp
Line 322 (original), 323 (patched)
<https://reviews.apache.org/r/61473/#comment269983>

    Do we need a 
    
    ```
    if (task.get()->state() != TASK_UNREACHABLE) {
      continue;
    }
    ```
    
    here?



src/master/master.hpp
Line 873 (original), 873 (patched)
<https://reviews.apache.org/r/61473/#comment269994>

    We need the comment about `unreachable` here as well?



src/master/master.hpp
Line 2644 (original), 2645 (patched)
<https://reviews.apache.org/r/61473/#comment269995>

    Are there two cases here or `unreachable` is the only variable that matters?
    
    We can assert the following invariant inside an `if (unreachable) {}`:
    
    ```
    CHECK(task->state() == TASK_UNREACHABLE || task->state() == TASK_LOST) << 
task->state();
    ```



src/master/master.hpp
Line 3160 (original), 3157 (patched)
<https://reviews.apache.org/r/61473/#comment269996>

    s/all tasks/tasks runing/



src/master/master.cpp
Lines 6804-6807 (original), 6779-6782 (patched)
<https://reviews.apache.org/r/61473/#comment269997>

    Fix indentation.



src/master/master.cpp
Line 6849 (original), 6820 (patched)
<https://reviews.apache.org/r/61473/#comment269998>

    s/Determine which frameworks on the slave to shutdown, if any.//
    
    Previously when there are two cases, this sentence is a summary. Now that 
there is only one case, this sentence feels redudant.



src/master/master.cpp
Line 9370 (original), 9322 (patched)
<https://reviews.apache.org/r/61473/#comment270000>

    I have the similar feeling as Ilya, without context it's hard to understand 
this line. Although the context is not too far above, why not just define it 
within the context?
    
    ```
        TaskState newTaskState = TASK_UNREACHABLE;
        TaskStatus::Reason newTaskReason = TaskStatus::REASON_SLAVE_REMOVED;
        
        // Needed to convey task unreachablility because we lose this 
information
        // from the task state if `TASK_LOST` is used.
        bool unreachable = true;
    
        if (!framework->capabilities.partitionAware) {
          newTaskState = TASK_LOST;
        } else if (unreachableTime.isNone()) {
          unreachable = false;
          newTaskState = TASK_GONE_BY_OPERATOR;
          newTaskReason = TaskStatus::REASON_SLAVE_REMOVED_BY_OPERATOR;
        }
    ```
    
    Note that I simplied the nested if else structure. Does it look right?



src/master/master.cpp
Line 9582 (original), 9535 (patched)
<https://reviews.apache.org/r/61473/#comment270001>

    Assert this invariant inside this `if`?
    
    ```
    CHECK(!unreachable) << task->task_id();
    ```



src/master/master.cpp
Line 10318 (original), 10271 (patched)
<https://reviews.apache.org/r/61473/#comment270002>

    Use `foreachvalue`. `values()` call will copy out the values.


- Jiang Yan Xu


On Nov. 27, 2017, 4:56 p.m., Megha Sharma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61473/
> -----------------------------------------------------------
> 
> (Updated Nov. 27, 2017, 4:56 p.m.)
> 
> 
> Review request for mesos, James Peach, Vinod Kone, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7215
>     https://issues.apache.org/jira/browse/MESOS-7215
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Master will not kill the tasks for non-Partition aware frameworks
> when an unreachable agent re-registers with the master.
> Master used to send a ShutdownFrameworkMessages to the agent
> to kill the tasks from non partition aware frameworks including
> the ones that are still registered which was problematic because
> the offer from this agent could still go to the same framework which
> could then launch new tasks. The agent would then receive tasks
> of the same framework and ignore them because it thinks the
> framework is shutting down. The framework is not shutting down of
> course, so from the master and the scheduler's perspective the task
> is pending in STAGING forever until the next agent reregistration,
> which could happen much later. This commit fixes the problem by
> not shutting down the non-partition aware frameworks on such an
> agent.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto e194093e490741acc552fd3ad328fd710b4b4435 
>   include/mesos/v1/mesos.proto 6fb1139683952877667abbcf8bf84b5b31bcd29e 
>   src/master/http.cpp 10084125deb839a9846a4f64d2e433ff02754c02 
>   src/master/master.hpp a309fc78ee2613762f3d5d22ac7559afc7aac4a3 
>   src/master/master.cpp 2ddd67ada3731803b00883b6a1f32b20c1bb238f 
>   src/tests/partition_tests.cpp e49c474167076b4136a161ed29b11db9a13455a7 
> 
> 
> Diff: https://reviews.apache.org/r/61473/diff/22/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Megha Sharma
> 
>

Reply via email to