> On March 4, 2014, 12:49 a.m., Vinod Kone wrote:
> > src/master/registrar.cpp, lines 222-227
> > <https://reviews.apache.org/r/18675/diff/1/?file=507859#file507859line222>
> >
> >     The semantics here are a bit hard to follow.
> >     
> >     How about instead of "recovering" variable you make 'recovered' 
> > Option<Promise<Registry> >? One side benefit is that when some one calls 
> > admit() etc with calling recover() first you could immediately return a 
> > failure if 'recovered.isNone()'. I think this explicitness is better than 
> > having those calls waiting forever?
> >     
> >     Also, why does it have to be MasterInfo? IIUC, all you want is someone 
> > to update the version of the registry. Can we just use a "bool" or "UUID" 
> > instead of MasterInfo?
> 
> Ben Mahler wrote:
>     1. Promises are not copyable or assignable, so you cannot use a Promise 
> inside an Option, unless you make it a pointer.
>     
>     2. The calls wait forever only if a call to recover() does not occur. We 
> could, as you said, fail any operations that are initiated by introducing 
> another check in the admit/readmit/remove methods:
>     
>     Future<bool> RegistrarProcess::admit(const SlaveInfo& info)
>     {
>       if (!info.has_id()) {
>         return Failure("SlaveInfo is missing the 'id' field");
>       }
>     
>       if (recovered.isPending() && !recovering) {
>         return Failure("Attempted to admit prior to recovering");
>       }
>     
>       return recovered.future()
>         .then(defer(self(), &Self::_admit, info));
>     }
>     
>     However, the old semantics were to provide "auto-recovery". We could 
> still provide auto-recovery if we had a factory function that allows the 
> Master to construct a Registrar by passing MasterInfo.
>     
>     3. You are correct, we technically do not need MasterInfo, but flipping a 
> boolean seems hacky, and a UUID is a bit out of place. The UUID would change 
> with each leader election, but what also changes with each leader election? 
> The MasterInfo. This was why I went with MasterInfo.
>     
>     Let me know your thoughts on these points!
> 
> Ben Mahler wrote:
>     We could use an Option<Owned<Promise>> to eliminate the need for a 
> 'recovering' boolean:
>     
>     if (!recovering && recovered.future.isPending()) // Before.
>     if (recovered.isNone()) // After.
>     
>     I'll look at cleaning this up.
> 
> Vinod Kone wrote:
>     The argument for 3) seems a bit backwards. You actually had to complicate 
> the code (needing factoring functions and 2)) because you are using 
> MasterInfo. Why use it in the first place when you don't have to and go 
> through hoops? I don't see why a UUID is out of place?
>     
>     Looking at it in a different way, a registry is a place to store 
> persistent information that other masters can use to recover state. A new 
> master doesn't need to recovery an old master's MasterInfo. The MasterInfo is 
> just present to resolve conflicts and seems like an overkill.
> 
> Ben Mahler wrote:
>     Thanks for bringing this up, I just realized that it is sufficient to 
> simply 'store' what we 'fetch' in order to cause a UUID change on the 
> underlying state entry, note the comment here:
>     
>     inline process::Future<Option<Variable> > State::store(const Variable& 
> variable)
>     {
>       // Note that we try and swap an entry even if the value didn't change!
>       UUID uuid = UUID::fromBytes(variable.entry.uuid());
>     
>       // Create a new entry to replace the existing entry provided the
>       // UUID matches.
>       Entry entry;
>       entry.set_name(variable.entry.name());
>       entry.set_uuid(UUID::random().toBytes());
>       entry.set_value(variable.entry.value());
>     
>       return storage->set(entry, uuid)
>         .then(lambda::bind(&State::_store, entry, lambda::_1));
>     }
>     
>     Surprised we didn't notice this earlier! Perhaps we should make this more 
> clear in the state API.
> 
> Benjamin Hindman wrote:
>     I'd like us to caution assuming these _implementation_ semantics. What's 
> the problem with storing the MasterInfo? Are we worried about performance?

1. Can these implementation semantics be made into API semantics? 
   Ideally we can leave State::fetch free to make optimizations but I
   imagine that there will be others who want to cause a version change
   without mutating the data.

2. Vinod: did you mean "factory" functions?
   I didn't introduce a factory function yet. It's important to note that
   the complexity introduced in this change is primarily due to the
   introduction of 'recover' as well as the need to change the version of
   the Registry.

   The use of 'MasterInfo' to induce a Registry mutation results in only the
   following altered semantics:
     (a) The admit/readmit/remove operations no longer imply recovery, because
         they do not have the MasterInfo available. To deal with this we
         introduce a small if condition to return a Failure if an operation is
         attempted prior to recovery.

So, having a way to change the version without mutation would be ideal. 
Alternatively, I don't think introducing MasterInfo complicates the code.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18675/#review36069
-----------------------------------------------------------


On March 6, 2014, 8:12 p.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18675/
> -----------------------------------------------------------
> 
> (Updated March 6, 2014, 8:12 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Vinod Kone.
> 
> 
> Bugs: MESOS-764
>     https://issues.apache.org/jira/browse/MESOS-764
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As a safety measure, we would like to force a version change on the Registry 
> as a result of the recovery process. This helps prevent a "rogue" Master 
> (that still believes it is leading) from performing writes to the Registry.
> 
> This change adds a 'Recover' operation which adds the latest MasterInfo to 
> the Registry. Unfortunately, since the Registrar is injected to the Master, I 
> had to add 'MasterInfo' as an argument to the 'recover' function, thus 
> altering the design somewhat.
> 
> The current semantics are that there one must call 'recover' before calling 
> other operations, otherwise these other operations will fail.
> 
> 
> Diffs
> -----
> 
>   src/master/registrar.hpp 20734afc69055197e9ab90d42253c56e4af4b97c 
>   src/master/registrar.cpp 37337c07b24a96e71910b7c83085d159361a1188 
>   src/tests/registrar_tests.cpp 3bf42bd77a10470a2afc6fd8e1da30d6134e792c 
> 
> Diff: https://reviews.apache.org/r/18675/diff/
> 
> 
> Testing
> -------
> 
> Added a small test.
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>

Reply via email to