> On March 4, 2014, 12:49 a.m., Vinod Kone wrote:
> > src/master/registrar.cpp, lines 222-227
> > <https://reviews.apache.org/r/18675/diff/1/?file=507859#file507859line222>
> >
> > The semantics here are a bit hard to follow.
> >
> > How about instead of "recovering" variable you make 'recovered'
> > Option<Promise<Registry> >? One side benefit is that when some one calls
> > admit() etc with calling recover() first you could immediately return a
> > failure if 'recovered.isNone()'. I think this explicitness is better than
> > having those calls waiting forever?
> >
> > Also, why does it have to be MasterInfo? IIUC, all you want is someone
> > to update the version of the registry. Can we just use a "bool" or "UUID"
> > instead of MasterInfo?
1. Promises are not copyable or assignable, so you cannot use a Promise inside
an Option, unless you make it a pointer.
2. The calls wait forever only if a call to recover() does not occur. We could,
as you said, fail any operations that are initiated by introducing another
check in the admit/readmit/remove methods:
Future<bool> RegistrarProcess::admit(const SlaveInfo& info)
{
if (!info.has_id()) {
return Failure("SlaveInfo is missing the 'id' field");
}
if (recovered.isPending() && !recovering) {
return Failure("Attempted to admit prior to recovering");
}
return recovered.future()
.then(defer(self(), &Self::_admit, info));
}
However, the old semantics were to provide "auto-recovery". We could still
provide auto-recovery if we had a factory function that allows the Master to
construct a Registrar by passing MasterInfo.
3. You are correct, we technically do not need MasterInfo, but flipping a
boolean seems hacky, and a UUID is a bit out of place. The UUID would change
with each leader election, but what also changes with each leader election? The
MasterInfo. This was why I went with MasterInfo.
Let me know your thoughts on these points!
- Ben
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18675/#review36069
-----------------------------------------------------------
On March 3, 2014, 3:08 a.m., Ben Mahler wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18675/
> -----------------------------------------------------------
>
> (Updated March 3, 2014, 3:08 a.m.)
>
>
> Review request for mesos, Benjamin Hindman and Vinod Kone.
>
>
> Bugs: MESOS-764
> https://issues.apache.org/jira/browse/MESOS-764
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> As a safety measure, we would like to force a version change on the Registry
> as a result of the recovery process. This helps prevent a "rogue" Master
> (that still believes it is leading) from performing writes to the Registry.
>
> This change adds a 'Recover' operation which adds the latest MasterInfo to
> the Registry. Unfortunately, since the Registrar is injected to the Master, I
> had to add 'MasterInfo' as an argument to the 'recover' function, thus
> altering the design somewhat.
>
> The current semantics are that there is no strict ordering required between
> the 'recover' call and all other calls, however, all other calls are gated on
> recovery completing. That is, if the caller omits a call to 'recover', any
> other operations will be forever pending.
>
>
> Diffs
> -----
>
> src/master/registrar.hpp 20734afc69055197e9ab90d42253c56e4af4b97c
> src/master/registrar.cpp 37337c07b24a96e71910b7c83085d159361a1188
> src/tests/registrar_tests.cpp 3bf42bd77a10470a2afc6fd8e1da30d6134e792c
>
> Diff: https://reviews.apache.org/r/18675/diff/
>
>
> Testing
> -------
>
> Added a small test.
>
>
> Thanks,
>
> Ben Mahler
>
>