> On March 4, 2014, 12:49 a.m., Vinod Kone wrote:
> > src/master/registrar.cpp, lines 222-227
> > <https://reviews.apache.org/r/18675/diff/1/?file=507859#file507859line222>
> >
> >     The semantics here are a bit hard to follow.
> >     
> >     How about instead of "recovering" variable you make 'recovered' 
> > Option<Promise<Registry> >? One side benefit is that when some one calls 
> > admit() etc with calling recover() first you could immediately return a 
> > failure if 'recovered.isNone()'. I think this explicitness is better than 
> > having those calls waiting forever?
> >     
> >     Also, why does it have to be MasterInfo? IIUC, all you want is someone 
> > to update the version of the registry. Can we just use a "bool" or "UUID" 
> > instead of MasterInfo?
> 
> Ben Mahler wrote:
>     1. Promises are not copyable or assignable, so you cannot use a Promise 
> inside an Option, unless you make it a pointer.
>     
>     2. The calls wait forever only if a call to recover() does not occur. We 
> could, as you said, fail any operations that are initiated by introducing 
> another check in the admit/readmit/remove methods:
>     
>     Future<bool> RegistrarProcess::admit(const SlaveInfo& info)
>     {
>       if (!info.has_id()) {
>         return Failure("SlaveInfo is missing the 'id' field");
>       }
>     
>       if (recovered.isPending() && !recovering) {
>         return Failure("Attempted to admit prior to recovering");
>       }
>     
>       return recovered.future()
>         .then(defer(self(), &Self::_admit, info));
>     }
>     
>     However, the old semantics were to provide "auto-recovery". We could 
> still provide auto-recovery if we had a factory function that allows the 
> Master to construct a Registrar by passing MasterInfo.
>     
>     3. You are correct, we technically do not need MasterInfo, but flipping a 
> boolean seems hacky, and a UUID is a bit out of place. The UUID would change 
> with each leader election, but what also changes with each leader election? 
> The MasterInfo. This was why I went with MasterInfo.
>     
>     Let me know your thoughts on these points!
> 
> Ben Mahler wrote:
>     We could use an Option<Owned<Promise>> to eliminate the need for a 
> 'recovering' boolean:
>     
>     if (!recovering && recovered.future.isPending()) // Before.
>     if (recovered.isNone()) // After.
>     
>     I'll look at cleaning this up.
> 
> Vinod Kone wrote:
>     The argument for 3) seems a bit backwards. You actually had to complicate 
> the code (needing factoring functions and 2)) because you are using 
> MasterInfo. Why use it in the first place when you don't have to and go 
> through hoops? I don't see why a UUID is out of place?
>     
>     Looking at it in a different way, a registry is a place to store 
> persistent information that other masters can use to recover state. A new 
> master doesn't need to recovery an old master's MasterInfo. The MasterInfo is 
> just present to resolve conflicts and seems like an overkill.

Thanks for bringing this up, I just realized that it is sufficient to simply 
'store' what we 'fetch' in order to cause a UUID change on the underlying state 
entry, note the comment here:

inline process::Future<Option<Variable> > State::store(const Variable& variable)
{
  // Note that we try and swap an entry even if the value didn't change!
  UUID uuid = UUID::fromBytes(variable.entry.uuid());

  // Create a new entry to replace the existing entry provided the
  // UUID matches.
  Entry entry;
  entry.set_name(variable.entry.name());
  entry.set_uuid(UUID::random().toBytes());
  entry.set_value(variable.entry.value());

  return storage->set(entry, uuid)
    .then(lambda::bind(&State::_store, entry, lambda::_1));
}

Surprised we didn't notice this earlier! Perhaps we should make this more clear 
in the state API.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18675/#review36069
-----------------------------------------------------------


On March 6, 2014, 8:12 p.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18675/
> -----------------------------------------------------------
> 
> (Updated March 6, 2014, 8:12 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Vinod Kone.
> 
> 
> Bugs: MESOS-764
>     https://issues.apache.org/jira/browse/MESOS-764
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As a safety measure, we would like to force a version change on the Registry 
> as a result of the recovery process. This helps prevent a "rogue" Master 
> (that still believes it is leading) from performing writes to the Registry.
> 
> This change adds a 'Recover' operation which adds the latest MasterInfo to 
> the Registry. Unfortunately, since the Registrar is injected to the Master, I 
> had to add 'MasterInfo' as an argument to the 'recover' function, thus 
> altering the design somewhat.
> 
> The current semantics are that there one must call 'recover' before calling 
> other operations, otherwise these other operations will fail.
> 
> 
> Diffs
> -----
> 
>   src/master/registrar.hpp 20734afc69055197e9ab90d42253c56e4af4b97c 
>   src/master/registrar.cpp 37337c07b24a96e71910b7c83085d159361a1188 
>   src/tests/registrar_tests.cpp 3bf42bd77a10470a2afc6fd8e1da30d6134e792c 
> 
> Diff: https://reviews.apache.org/r/18675/diff/
> 
> 
> Testing
> -------
> 
> Added a small test.
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>

Reply via email to