> On Feb. 8, 2018, 8:22 p.m., Vinod Kone wrote:
> > src/master/master.cpp
> > Lines 2186 (patched)
> > <https://reviews.apache.org/r/65571/diff/1/?file=1954571#file1954571line2186>
> >
> >     s/Leader detector indicated no master elected/No master was elected/
> >     
> >     More importantly, this changes the semantics a bit. Previously if this 
> > master was the current leader it committed suicide even in this case. But 
> > we don't anymore. Is that what we want?
> >     
> >     
> >     Also, where in the interface does it say that None() is retryable. It 
> > says retryable errors are handled internally by the detector?
> 
> Benno Evers wrote:
>     Ok, I guess I misinterpreted the documentation on 
> `MasterDetector::detect()`:
>     
>     ```
>        * Returns MasterInfo after an election has occurred and the elected
>        * master is different than that specified (if any), or NONE if an
>        * election occurs and no master is elected (e.g., all masters are
>        * lost). A failed future is returned if the detector is unable to
>        * detect the leading master due to a non-retryable error.
>     ```
>     
>     Since electing no master sounds like an error, and the future is not 
> failed, I assumed that this case was implicitly classifid as retryable error.
>     
>     Anyways, for the semantics I assume that not aborting is at least what 
> the original author intended, otherwise there would be no point to 
> differentiate between `Error` and `None` in the API.
>     
>     Additionally, the code that causes the master to crash in this situation 
> was introduced relatively recently (11 July 2017, a8c7ae44c8), before that 
> the `detected()`-handler would have just set `leader` to `None` and quietly 
> continued. So I would argue that this fix is actually restoring the 
> previously existing behaviour, not changing it.
> 
> Vinod Kone wrote:
>     "...before that the detected()-handler would have just set leader to None 
> and quietly continued". Is this true? AFAICT the commit you mentioned only 
> adds leader domain related changes, doesn't change the behavior we are 
> talking about? See: https://reviews.apache.org/r/59763/

Oh, sorry, I thought you were talking about the case where a non-leading master 
is passed `None`, which also would have crashed before this fix but doesn't 
crash anymore.

I've updated the code path to restore the suicide in the case where `None` is 
passed to a leading master.


- Benno


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65571/#review197103
-----------------------------------------------------------


On Feb. 13, 2018, 8:05 p.m., Benno Evers wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65571/
> -----------------------------------------------------------
> 
> (Updated Feb. 13, 2018, 8:05 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Vinod Kone.
> 
> 
> Bugs: MESOS-8550
>     https://issues.apache.org/jira/browse/MESOS-8550
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The function `MasterDetector::detect()` returns a value of type
> `Future<Option<MasterInfo>>`, which, according to its documentation,
> can be `None` if an election occured and no master is elected.
> 
> However, the code in `Master::detected()` was only handling the
> cases of a failed future or a valid `MasterInfo` object.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp d7d22866f7a4eb87bd8949efafc97e828e7d4b94 
>   src/tests/cluster.hpp ad2b80e658d2f8afcefe9969d62cd33f0c475ce9 
>   src/tests/master_tests.cpp 28663c7a77096943949350abb3d13f9c04505f5b 
> 
> 
> Diff: https://reviews.apache.org/r/65571/diff/3/
> 
> 
> Testing
> -------
> 
> `./mesos-tests`
> 
> 
> Thanks,
> 
> Benno Evers
> 
>

Reply via email to