> On 二月 4, 2016, 3:46 p.m., Guangya Liu wrote: > > This case may happen when master is recovering, framework recovery start > > before some agent. Can you please add a unit test to cover this code > > change? It could be register framework first, then addslave and check the > > result. > > Alexander Rukletsov wrote: > Nope, this can't happen, because in Mesos we reconstruct framework state > through reconnecting agents. > > Guangya Liu wrote: > What about in the case when framework failover? Does there are any > potential steps/operations which can cause `addFramework()` be called before > `addSlave()`? Thanks. > > Alexander Rukletsov wrote: > It's tricky. `addFramework()` _can_ be called before some `addSlave()` > calls, right, but it's not possible in production code (the master) to report > about yet unknown agents in `addFramework()`. If you look at the master code, > we reconstruct framework's allocation from agents' allocation and ensure all > agents that constitute the "already known" part of the framework allocation > are known to the allocator _before_ we call `addFramework()`. However, we can > discover additional "pieces" of framework's allocation while more agents > rejoin the cluster, but at this point the framework is already known to the > allocator. Is my explanation clear enough?
Yes, I see, the master can make sure that the `addFramework()` can always count on the agents which was already joined the mesos cluster, the problem is that why we need this fix for now? The `agents` in allocator is posted by master, does there are any unsync between master and allocator for `agents`? - Guangya ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/43105/#review117822 ----------------------------------------------------------- On 二月 2, 2016, 11:13 p.m., Alexander Rukletsov wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/43105/ > ----------------------------------------------------------- > > (Updated 二月 2, 2016, 11:13 p.m.) > > > Review request for mesos and Ben Mahler. > > > Repository: mesos > > > Description > ------- > > If `addFramework()` is called before `addSlave()` double accounting > of resources is possible. We have not observed bugs because Mesos > currently always adds a framework in the allocator after all agents > with the framework's tasks have been added. > > > Diffs > ----- > > include/mesos/master/allocator.hpp e163669c9c4e4c98572968f18987704b60722a79 > src/master/allocator/mesos/hierarchical.cpp > 1a07d69016407e5aad2209586da37fecbcddb765 > > Diff: https://reviews.apache.org/r/43105/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Alexander Rukletsov > >
