> On March 21, 2014, 1:46 p.m., Adam B wrote: > > src/master/master.cpp, line 974 > > <https://reviews.apache.org/r/19542/diff/1/?file=531713#file531713line974> > > > > Since allocator->frameworkActivated() calls allocate(), shouldn't we > > make the frameworkActivated() call after we "Remove any offers [previously] > > sent to this framework"? Otherwise, the allocate() call (in > > frameworkActivated) might allocate/offer new resources and then the > > following loop could call resourcesRecovered on the newly received > > offer/resources. > > Vinod Kone wrote: > I don't think this is a problem because the offers sent by the allocator > are only processed by the master (Master::offer) after it finishes executing > this function. But I do think calling frameworkActivated() after removing > current offers make sense here and in failoverFramework(). I will do that in > a subsequent review. Added a TODO. Is that ok?
Ok. TODOs acknowledged. > On March 21, 2014, 1:46 p.m., Adam B wrote: > > src/master/master.cpp, lines 2744-2747 > > <https://reviews.apache.org/r/19542/diff/1/?file=531713#file531713line2744> > > > > Should we be setting framework->active = false; here? I know we're just > > moving it into frameworks.completed, but none of those should think that > > they're "active". > > Vinod Kone wrote: > Based on how we resurrect a completed framework > (Master::readdCompletedFramework) this seems fine. But I would like to do > this also in a different review just to avoid doing too many changes in this > review. Added a TODO. No big deal. I don't think the active flag on a completed framework would cause a difference in behavior (yet). FYI, readdCompletedFramework doesn't really "resurrect" a completed framework which I interpret as adding it back to frameworks.activated. Instead, it is used on master failover so that a (re)registering slave will provide the new master with the state of its completed frameworks. The completed framework remains completed, but is added (back) to frameworks.completed on the new master. Yet another reason (doxygen) method comments would be helpful. I'll create a doxygen JIRA so we can get moving on that discussion. - Adam ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19542/#review38179 ----------------------------------------------------------- On March 21, 2014, 2:19 p.m., Vinod Kone wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/19542/ > ----------------------------------------------------------- > > (Updated March 21, 2014, 2:19 p.m.) > > > Review request for mesos and Ben Mahler. > > > Bugs: MESOS-1135 > https://issues.apache.org/jira/browse/MESOS-1135 > > > Repository: mesos-git > > > Description > ------- > > See bug for details. > > > Diffs > ----- > > src/master/master.cpp 3b28f72be2016997e30649d2b12ed0e8f1a57dd5 > src/tests/fault_tolerance_tests.cpp > 3f4796a0f5ed236d20906ee90e2cf8d30b5e228b > src/tests/master_tests.cpp 42c5a77425209d96f8e7286102a672be50463a40 > > Diff: https://reviews.apache.org/r/19542/diff/ > > > Testing > ------- > > Updated 2 tests to verify the behavior. > > make check > > ./bin/mesos-tests.sh > --gtest_filter="*FaultToleranceTest.FrameworkReregister*:*MasterInfoOnReElection*" > --gtest_repeat=1000 --gtest_break_on_failure > > > Thanks, > > Vinod Kone > >
