-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69451/#review210904
-----------------------------------------------------------



Thanks for jumping on this and producing a fix! Some small edit suggestions to 
the description:

```
The Framework::send function assumes that either http or pid is
set, which is not true for a unregistered framework recovered from agent  // 
s/unregistered//
reregistration. As a result, the master would crash when a recovered      // 
s/reregistration/that hasn't yet re-registered/
executor tries to send a message to such a framework. This patch fixes    // 
s/framework/framework (see MESOS-XXXX).
this crash bug.
```


src/master/master.hpp
Line 2595 (original), 2597 (patched)
<https://reviews.apache.org/r/69451/#comment295720>

    attempting?



src/master/master.hpp
Line 2599 (original), 2600-2604 (patched)
<https://reviews.apache.org/r/69451/#comment295722>

    Hm.. based on this comment, should we only be sending in the 
disconnected+pid case?
    
    Also, which messages is this referring to? It seems to suggest we let all 
messages through, but in the call sites we seem to guard this call to avoid 
calling it for the disconnected case?



src/master/master.hpp
Line 2609 (original), 2614 (patched)
<https://reviews.apache.org/r/69451/#comment295721>

    Do we want a similar log warning here for the else case?
    
    ```
    } else {
      LOG(WARNING) << "Unable to send event to framework " << *this << ":"
                   << " framework is recovered but hasn't re-registered";
    }
    ```
    
    Is this accurate or are there other cases? My read of the http/pid state 
comment is that it will be set based on the last connection, therefore the only 
time one of them won't be set is the recovered case (the change that broke the 
original one-being-set invariant).


- Benjamin Mahler


On Nov. 27, 2018, 4:58 a.m., Chun-Hung Hsiao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69451/
> -----------------------------------------------------------
> 
> (Updated Nov. 27, 2018, 4:58 a.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Benjamin Mahler, Greg Mann, 
> and Till Toenshoff.
> 
> 
> Bugs: MESOS-9419
>     https://issues.apache.org/jira/browse/MESOS-9419
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The `Framework::send` function assumes that either `http` or `pid` is
> set, which is not true for a unregistered framework recovered from agent
> reregistration. As a result, the master would crash when a recovered
> executor tries to send a message to such a framework. This patch fixes
> this crash bug.
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 3b3c1a4e61de9503c8d038dd3bee623ded5914c9 
>   src/master/master.cpp b4b02d8b4d7d6d1aabda1f97b9bf824419f76a9e 
> 
> 
> Diff: https://reviews.apache.org/r/69451/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>

Reply via email to