----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/64098/#review192895 -----------------------------------------------------------
Awesome! Only minor comments below. src/master/master.cpp Lines 6806 (patched) <https://reviews.apache.org/r/64098/#comment271242> src/master/master.cpp Lines 6806-6810 (patched) <https://reviews.apache.org/r/64098/#comment271244> The fact that an unknown agent is allowed to register is already logged: https://github.com/apache/mesos/blob/c9462f4927cfffb1f3a90827467ded730c0f40b9/src/master/registry_operations.cpp#L133 Here I was orignally thinking we needed to add some information about the fact that the update is for an unknown agent. However since the logging in `forward` already logs `update.status().message()`, perhaps it's fine and so we don't need a separate log line after all. :) src/master/master.cpp Lines 6812-6814 (patched) <https://reviews.apache.org/r/64098/#comment271261> Our convention if the two conditions are on separate lines: ``` const string message = slaves.unreachable.contains(slaveInfo.id()) ? "Unreachable agent re-reregistered" : "Unknown agent reregistered"; ``` Note that I s/re-registration/reregistered/ because it's more consistent with other status messages. src/master/master.cpp Line 6809 (original), 6816 (patched) <https://reviews.apache.org/r/64098/#comment271251> Should this line be the first thing done under the if condition since the rest of the logic is about sending status updates? src/master/master.cpp Lines 6833 (patched) <https://reviews.apache.org/r/64098/#comment271264> So we don't log anything when `framework == nullptr` but log a warning when `!framework->connected()`, should we treat the two cases the same? src/master/master.cpp Lines 6834-6835 (patched) <https://reviews.apache.org/r/64098/#comment271296> Let's log the message of the update as well (the same way `forward` does it). FWIW I think it's probably the best to log it in `ostream& operator<<(ostream& stream, const StatusUpdate& update)` but it probably requires a separate commit to refactor. src/master/master.cpp Lines 6842-6843 (patched) <https://reviews.apache.org/r/64098/#comment271266> This comment comes from https://github.com/apache/mesos/commit/678b864cb78c74cc98b960f921d07869ce3300c5 which I don't think is relevant anymore and can be removed. src/tests/master_allocator_tests.cpp Lines 1362 (patched) <https://reviews.apache.org/r/64098/#comment271269> Not yours but the use of `this->` is annoying here (we generally don't do it unless for disambiguation) so perhaps don't add more occurrences for now and keep the existing ocurrences for a future sweep. src/tests/master_tests.cpp Lines 2881-2883 (original), 2885-2887 (patched) <https://reviews.apache.org/r/64098/#comment271274> Can we also set the expectation for the new reason? src/tests/master_tests.cpp Lines 7188-7190 (patched) <https://reviews.apache.org/r/64098/#comment271291> You have put the expectations above `detector.appoint(master.get()->pid);` in the test above. I think it's a better and clearer alternative to prevent race conditions. Even though there's a `Clock::advance(agentFlags.registration_backoff_factor);` below to prevent it when the clock is paused, it's clearer and more foolproof if we don't insert the expectation between `detector.appoint(master.get()->pid);` and the clock advancement. Plus ``` detector.appoint(master.get()->pid); Clock::advance(agentFlags.registration_backoff_factor); ``` can be thought of as the same group of statements so better to be kept together. Here and below. src/tests/master_tests.cpp Lines 7198-7199 (patched) <https://reviews.apache.org/r/64098/#comment271272> Can we also set the expectation on the reason? src/tests/master_tests.cpp Lines 7661-7663 (patched) <https://reviews.apache.org/r/64098/#comment271292> Ditto on pulling it above `detector.appoint(master.get()->pid);`. src/tests/master_tests.cpp Lines 7669-7671 (patched) <https://reviews.apache.org/r/64098/#comment271276> Can we also set the expectation on the reason? Can we do this for other places (when reasonable)? src/tests/partition_tests.cpp Line 2181 (original), 2146 (patched) <https://reviews.apache.org/r/64098/#comment271286> How come the commit hook didn't catch this over 80 char line? src/tests/partition_tests.cpp Lines 2299 (patched) <https://reviews.apache.org/r/64098/#comment271288> Here it seems that we'll receive two status updates, one from the agent and one sent by the master. Setting `WillRepeatedly` here is probably racy due to ``` EXPECT_CALL(sched, statusUpdate(&driver, _)) .WillOnce(FutureArg<1>(&reconcileUpdate)); ``` below, not to mention the case could be interesting to test. We can set the expectation explicitly: The first update will come from the master, the reason being "agent_reregistered". The second update comes from the agent. - Jiang Yan Xu On Dec. 1, 2017, 4:29 p.m., Megha Sharma wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/64098/ > ----------------------------------------------------------- > > (Updated Dec. 1, 2017, 4:29 p.m.) > > > Review request for mesos, Ilya Pronin, James Peach, and Jiang Yan Xu. > > > Bugs: MESOS-6406 > https://issues.apache.org/jira/browse/MESOS-6406 > > > Repository: mesos > > > Description > ------- > > Master will send task status updates to frameworks when an agent > which has been previously removed by the master for being unreachable > or is unknown to the master due to the garbage collection of > the unreachable and gone agents in the registry re-registers. > > > Diffs > ----- > > src/master/master.cpp dfe60ef670edcaefa0c1241df2e2870f650fcf9e > src/tests/master_allocator_tests.cpp > 3400d70bb0ba564eac43c4639eee0efd4d8059e6 > src/tests/master_tests.cpp 57eae320a7a398527cd3623c89bf67f319a8e955 > src/tests/partition_tests.cpp 31ebfe1655438eceae74d72a223df03a9dbd282d > src/tests/persistent_volume_tests.cpp > 4aa3c2e8b0f461cd78053707cff8bcb2e6f2b0d7 > src/tests/slave_recovery_tests.cpp f14c6ef69eb20a03454c8197df79b572a3c6d050 > src/tests/upgrade_tests.cpp 7f434dbba858f636719eec24e92b306b76430c4c > > > Diff: https://reviews.apache.org/r/64098/diff/11/ > > > Testing > ------- > > with make check > > > Thanks, > > Megha Sharma > >
