> On Feb. 4, 2015, 8:01 p.m., Ben Mahler wrote: > > src/master/master.cpp, lines 197-246 > > <https://reviews.apache.org/r/30514/diff/5/?file=847032#file847032line197> > > > > It looks like we only log when we're shutting down the slave, when > > there's no rate limiter? > > > > Have you considered having a single shutdown code path? This gets you a > > single dispatch, and a single log statement and it looks like it's less > > control flow? > > > > ``` > > void shutdown() > > { > > if (shuttingDown.isNone()) { > > Future<Nothing> future = Nothing(); > > > > if (limiter.isSome()) { > > LOG(INFO) << "Scheduling shutdown of slave " << slaveId > > << " due to health check timeout"; > > > > future = limiter.get()->acquire(); > > } > > > > shuttingDown = future.onAny(defer(self(), &Self::_shutdown)); > > > > ++metrics->slave_shutdowns_scheduled; > > } > > } > > > > void _shutdown() > > { > > CHECK_SOME(shuttingDown); > > > > const Future<Nothing>& future = shuttingDown.get(); > > > > CHECK(!future.isFailed()) << future.failure(); > > > > if (future.isReady()) { > > LOG(INFO) << "Shutting down slave " << slaveId > > << " due to health check timeout"; > > > > dispatch(master, > > &Master::shutdownSlave, > > slaveId, > > "health check timed out"); > > } else if (future.isDiscarded()) { > > // Shutdown was canceled. > > LOG(INFO) << "Canceling shutdown of slave " << slaveId > > << " since a pong is received!"; > > > > ++metrics->slave_shutdowns_canceled; > > } > > > > shuttingDown = None(); > > } > > ``` > > Jie Yu wrote: > Having both 'future' and 'shuttingDown' in 'shutdown' method looks > confusing to me:) You need to comment about why you need that future. Or, I > think having dup code is not a too bad thing in this case as the logic is > more clear IMO. > > Jie Yu wrote: > Another option to avoid dup code is: > > 1) s/shutdown/scheduleShutdown > 2) let 'shutdown' be the actual shutdown (sending message)
Imagine it is s/future/limited/: ``` Future<Nothing> limited = Nothing(); if (limiter.isSome()) { limited = limiter.get()->acquire(); } shuttingDown = limited.onAny(defer(self(), &Self::_shutdown)); ``` Seems pretty clear? Can you flush out the second option you mentioned in a code snippet? I didn't follow. - Ben ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30514/#review71021 ----------------------------------------------------------- On Feb. 4, 2015, 1:51 a.m., Vinod Kone wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/30514/ > ----------------------------------------------------------- > > (Updated Feb. 4, 2015, 1:51 a.m.) > > > Review request for mesos, Ben Mahler, David Robinson, and Jie Yu. > > > Bugs: MESOS-1148 > https://issues.apache.org/jira/browse/MESOS-1148 > > > Repository: mesos > > > Description > ------- > > The algorithm is simple. All the slave observers share a rate limiter whose > rate is configured via command line. When a slave times out on health check, > a permit is acquired to shutdown the slave *but* the pings are continued to > be sent. If a pong arrives before the permit is satisifed, the shutdown is > cancelled. > > > Diffs > ----- > > src/master/flags.hpp 6c18a1af625311ef149b5877b08f63c2b12c040d > src/master/master.hpp cd37ee9d3c57bcd91f08cd402ec1e4a09d9e42ee > src/master/master.cpp d04b2c4041d8fe8978b877f07579a6f907903e1b > src/tests/partition_tests.cpp fea78016268b007590516798eb30ff423fd0ae58 > src/tests/slave_tests.cpp e7e2af63da785644f3f7e6e23607c02be962a2c6 > > Diff: https://reviews.apache.org/r/30514/diff/ > > > Testing > ------- > > make check > > Ran the new tests 100 times > > > Thanks, > > Vinod Kone > >