> On May 23, 2017, 9:32 p.m., Vinod Kone wrote:
> > No tests!?

I have added a new test, so this change now has two tests in total: one verifies that the state is recovered correctly and the agentId is retained after an agent host reboot when recovery finishes without errors, and the other verifies that no state is recovered and only the agentId is retained if recovery fails after a reboot.


> On May 23, 2017, 9:32 p.m., Vinod Kone wrote:
> > src/slave/slave.cpp
> > Line 5956 (original), 5967 (patched)
> > <https://reviews.apache.org/r/56895/diff/4-6/?file=1693973#file1693973line5967>
> >
> >     Add a comment here saying that we do this for backwards compatibility,
> >     i.e., in Mesos <= 1.3 a rebooted agent did not recover checkpointed disk
> >     and registered as a new agent.

Fixed.


> On May 23, 2017, 9:32 p.m., Vinod Kone wrote:
> > src/tests/slave_recovery_tests.cpp
> > Line 237 (original), 237 (patched)
> > <https://reviews.apache.org/r/56895/diff/4-6/?file=1693977#file1693977line237>
> >
> >     why this change in this review? looks independent.

Actually, this was done to address Neil's comment that the variable name was too generic, which seemed quite reasonable. See his comment below.
`Can we rename _ack to something that identifies we're waiting for the agent to see the status update acknowledgment?`


- Megha


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56895/#review175852
-----------------------------------------------------------


On June 9, 2017, 4:27 a.m., Megha Sharma wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56895/
> -----------------------------------------------------------
>
> (Updated June 9, 2017, 4:27 a.m.)
>
>
> Review request for mesos, Neil Conway, Vinod Kone, and Jiang Yan Xu.
>
>
> Bugs: MESOS-6223
>     https://issues.apache.org/jira/browse/MESOS-6223
>
>
> Repository: mesos
>
>
> Description
> -------
>
> With partition awareness, the agents are now allowed to re-register
> after they have been marked Unreachable. The executors are anyway
> terminated on the agent when it reboots so there is no harm in
> letting the agent keep its SlaveID, re-register with the master
> and reconcile the lost executors. This is a pre-requisite for
> supporting persistent/restartable tasks in mesos.
>
>
> Diffs
> -----
>
>   src/slave/containerizer/composing.cpp a003e1b80dc9b4dec5b3fbbadb2daecf855c90c7
>   src/slave/containerizer/docker.cpp 9f84109d7de22a39ace6e44e0c7d8d501bcb24de
>   src/slave/containerizer/mesos/containerizer.cpp f3e6210eccd4a6b445ffd4447e69526d424ea36d
>   src/slave/slave.hpp 7ffaed14035a05259ec72c70532ee4f0affa1f5d
>   src/slave/slave.cpp 7d147ac6609933ac884bfc29032dba572a0952c6
>   src/slave/state.hpp a497ce1f58fb8dc7718ee5bb10bc62dd7479efa5
>   src/slave/state.cpp 18b790d2cc4f537cc9b0c3cca59b9cbaac0eda10
>   src/tests/reservation_tests.cpp 6e9c215382ef41700921a673669ac1a7975e9b7f
>   src/tests/slave_recovery_tests.cpp 38502584186793686f78ff5f4e03f36a3bf7ad1c
>
>
> Diff: https://reviews.apache.org/r/56895/diff/7/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Megha Sharma
>
>
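
For anyone skimming this thread, the two post-reboot outcomes that the tests cover can be modeled roughly as below. This is only a minimal sketch and not the code in src/slave/slave.cpp: `CheckpointedState`, `RecoveredState`, and `recoverAfterReboot` are hypothetical names made up for illustration, and the `recoverySucceeded` flag stands in for whatever the real recovery path reports.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for what the agent checkpointed before the reboot.
struct CheckpointedState {
  std::string agentId;
  std::vector<std::string> frameworks;  // placeholder for framework/executor state
};

// Hypothetical stand-in for what the agent ends up with after recovery.
struct RecoveredState {
  std::string agentId;
  std::vector<std::string> frameworks;
};

// Models the two cases described in this thread:
//  - recovery finishes without errors: the agent ID and the state are kept;
//  - recovery fails after the reboot: only the agent ID is kept.
RecoveredState recoverAfterReboot(
    const CheckpointedState& checkpoint,
    bool recoverySucceeded)
{
  RecoveredState result;
  result.agentId = checkpoint.agentId;  // retained in both cases

  if (recoverySucceeded) {
    result.frameworks = checkpoint.frameworks;
  }

  return result;
}
```

In both branches the agent keeps its ID, so it can re-register with the master instead of registering as a new agent.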
