> On Feb. 17, 2015, 7:13 p.m., Vinod Kone wrote: > > src/tests/master_tests.cpp, lines 1616-1624 > > <https://reviews.apache.org/r/31114/diff/2/?file=866348#file866348line1616> > > > > Hmm. I don't understand the fix here. > > > > First off, any expectations should be set *before* the corresponding > > actions (e.g., StartSlave). > > > > I think you just want to move the AWAIT_READY(slaveRegisteredMessage) > > before the second StartSlave()? IOW, > > > > ``` > > // Start the first slave and wait for it to be registered. > > Future<SlaveRegisteredMessage> slave1RegisteredMessage = > > FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), _); > > > > StartSlave(); > > AWAIT_READY(slave1RegisteredMessage); > > > > // Start the second slave and wait for it to be registered. > > Future<SlaveRegisteredMessage> slave2RegisteredMessage = > > FUTURE_PROTOBUF(SlaveRegisteredMessage(), master.get(), _); > > > > StartSlave(); > > AWAIT_READY(slave2RegisteredMessage); > > ``` > > Alexander Rojas wrote: > That doesn't work, since the future will be ready before the second state > starts by the second `SlaveRegisteredMessage` sent to the first slave. If you > want to try it, just run `./bin/mesos-tests.sh > --gtest_filter="MasterTest.SlavesEndpoint*" --gtest_repeat=1000 > --gtest_break_on_failure` on the master and you will see what I mean.
Sorry about how short the first message was. You are right in the sense that the expectations should be set before, but they break the test eventually. Assuming the code you posted, consider the following timeline: 1. Expectation on `slave1RegisteredMessage` set. 2. First slave starts, it registers to the server. 3. The delay for retry registration ([here](https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L1092)) is almost 0, slave sends a second registration request to the master. 4. `AWAIT_READY(slave1RegisteredMessage)`, message arrives to the slave, future set, first slave registered. 5. The master replies to the request of 3. 6. Expectation on `slave2RegisteredMessage` set. 7. Message from master arrives from 3, Future `slave2RegisteredMessage` set. 8. Second slave starts. 9. `AWAIT_READY(slave2RegisteredMessage)` doesn't wait, it was fulfilled in 6. Filled with data not related to the second slave. Test fails, we are not sure slave2 is running. - Alexander ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31114/#review72741 ----------------------------------------------------------- On Feb. 17, 2015, 5:36 p.m., Alexander Rojas wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/31114/ > ----------------------------------------------------------- > > (Updated Feb. 17, 2015, 5:36 p.m.) > > > Review request for mesos, Ben Mahler, Till Toenshoff, and Vinod Kone. > > > Bugs: MESOS-2355 > https://issues.apache.org/jira/browse/MESOS-2355 > > > Repository: mesos-incubating > > > Description > ------- > > Sets the "to" end to all calls to `FUTURE_PROTOBUF` within the > MasterTest.SlavesEndpointTwoSlaves. > > Failing to do that causes the test to fail on very rare ocasions, due to the > first slave registering twice to the server (this is normal behavior > according to > [slave.cpp](https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L1087) > and > [master.cpp](https://github.com/apache/mesos/blob/master/src/master/master.cpp#L2912). > > It those ocasions, where the first slave registers twice, the master's sends > a reponse to both registration requests before the second slave's turn, thus > setting the future which was expecting data for the second slave with > incorrect data. > > > Diffs > ----- > > src/tests/master_tests.cpp 18eabd4 > > Diff: https://reviews.apache.org/r/31114/diff/ > > > Testing > ------- > > /bin/mesos-tests.sh --gtest_filter="MasterTest.SlavesEndpoint*" > --gtest_repeat=1000 --gtest_break_on_failure > > > Thanks, > > Alexander Rojas > >
