> On Feb. 2, 2015, 8:13 p.m., Jie Yu wrote: > > src/master/flags.hpp, line 152 > > <https://reviews.apache.org/r/30514/diff/1/?file=843908#file843908line152> > > > > The maximum rate? Does it make sense to you to rename the flag name as > > well? > > > > slave_removal_rate_limit?
sgtm. fixed. > On Feb. 2, 2015, 8:13 p.m., Jie Yu wrote: > > src/master/master.hpp, lines 575-577 > > <https://reviews.apache.org/r/30514/diff/1/?file=843909#file843909line575> > > > > The moment you pass it to the SlaveObserver, the ownership of the > > pointer is transferred and you cannot access to that anymore. > > > > I guess here you really don't have many options but using a raw pointer > > (and let the slave manages its lifecycle). I guess we did the same thing > > for Registrar, WhitelistWatcher, Allocator, etc. was using Owned as Shared with non-const because i remembered we did that before in our code base. will use shared_ptr with a NOTE instead. > On Feb. 2, 2015, 8:13 p.m., Jie Yu wrote: > > src/master/master.cpp, lines 185-207 > > <https://reviews.apache.org/r/30514/diff/1/?file=843910#file843910line185> > > > > Can we simply the timeout logic and move all the limiter related stuff > > to `shutdown` function? > > > > ``` > > void timeout() > > { > > if (pinged && ++timeouts >= XXX) { > > // Schedule a shutdown of the slave. > > shutdown(); > > } > > > > // NOTE: We keep sending PINGs even if a shutdown has > > // been scheduled so that we can cancel the shutdown > > // if the slave becomes reachable again. > > ping(); > > } > > > > void shutdown() > > { > > if (limiter.isNone()) { > > LOG(INFO) << "..."; > > > > dispatch(master, > > &Master::shutdownSlave, > > ...); > > return; > > } > > > > if (shuttingDown.isSome()) { > > // A shutdown has been scheduled already. > > return; > > } > > > > shuttingDown = limiter.get()->acquire() > > .onAny(defer(self(), &Self::_shutdown, ...); > > } > > > > void _shutdown(const Future<Nothing> future) > > { > > if (future.isReady()) { > > // Shutdown the slave. > > } else if (future.isFailed()) { > > // Not expected to happen? > > } else if (future.isDiscarded()) { > > // Do nothing here, maybe print a message. > > } > > > > shuttingDown = None(); > > } > > ``` looks good. thanks. - Vinod ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30514/#review70603 ----------------------------------------------------------- On Feb. 2, 2015, 6:51 p.m., Vinod Kone wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/30514/ > ----------------------------------------------------------- > > (Updated Feb. 2, 2015, 6:51 p.m.) > > > Review request for mesos, Ben Mahler, David Robinson, and Jie Yu. > > > Bugs: MESOS-1148 > https://issues.apache.org/jira/browse/MESOS-1148 > > > Repository: mesos > > > Description > ------- > > The algorithm is simple. All the slave observers share a rate limiter whose > rate is configured via command line. When a slave times out on health check, > a permit is acquired to shutdown the slave *but* the pings are continued to > be sent. If a pong arrives before the permit is satisifed, the shutdown is > cancelled. > > > Diffs > ----- > > src/master/flags.hpp 6c18a1af625311ef149b5877b08f63c2b12c040d > src/master/master.hpp 337e00aa46ea127f3667e3383d631c3fb8e22f30 > src/master/master.cpp 10056861b95ed9453c971787982db7d09f09f323 > src/tests/partition_tests.cpp fea78016268b007590516798eb30ff423fd0ae58 > src/tests/slave_tests.cpp e7e2af63da785644f3f7e6e23607c02be962a2c6 > > Diff: https://reviews.apache.org/r/30514/diff/ > > > Testing > ------- > > make check > > Ran the new tests 100 times > > > Thanks, > > Vinod Kone > >
