> On March 6, 2013, 7:23 p.m., Ben Mahler wrote: > > src/master/hierarchical_allocator_process.hpp, line 529 > > <https://reviews.apache.org/r/6665/diff/8/?file=264106#file264106line529> > > > > Looks to me like allocatable can just be a function on the slave struct: > > > > bool allocatable() const { return whitelisted && > > isSufficient(available); } > > > > Seems cleaner, no? That way we only have to update the whitelisted bit > > here and elsewhere. > > > > This loop would then become: > > > > foreachkey (const SlaveID& slaveId, slaves) { > > slave[slaveId].whitelisted = isWhitelisted(slaveId); > > } > > Thomas Marshall wrote: > I implemented this the way that you suggested because it seems > reasonable, but I just wanted to note that this means that the title of this > review is now meaningless and this patch only accomplishes restructuring the > allocator to have clearer code and does not do anything about speeding > allocations up, since we are back to checking whitelisting and resource > minimums on all slaves for every allocation now (again, as per your > suggestions). > > I'm assuming that your relative lack of concern about efficiency issues > means that Twitter isn't experiencing any problems with the allocator running > too slowly, so I'm perfectly fine with this.
Great! I don't quite get your comment here. We do indeed now check resource minimums but "checking whitelisting" is *only* checking the newly added whitelisted bit, rather than the hashset, right? (are you referring to line 699?) > On March 6, 2013, 7:23 p.m., Ben Mahler wrote: > > src/master/hierarchical_allocator_process.hpp, line 452 > > <https://reviews.apache.org/r/6665/diff/8/?file=264106#file264106line452> > > > > Inline? > > > > slaves[slaveId] = Slave(slaveInfo); > > > > Also, maybe you want to add whitelisted as an optional constructor > > parameter? > > Thomas Marshall wrote: > The problem with doing something like 'slaves[slaveId] = Slave(slaveInfo, > isWhitelisted(slaveId))' is that you're calling isWhitelisted before slaveId > is in slaves, meaning that the 'CHECK(slaves.contains(slaveId)' in > isWhitelisted will fail and 'slaves[slaveId].hostname()' won't return > anything. We could modify isWhitelisted to take the hostname as the parameter > instead of a slaveId, but given that we're also using isWhitelisted in > updateWhitelist, I would rather leave isWhitelisted the way it is. Got it, sounds like a bad abstraction, maybe we should fix that in a later review! - Ben ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6665/#review17482 ----------------------------------------------------------- On March 11, 2013, 9:59 p.m., Thomas Marshall wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/6665/ > ----------------------------------------------------------- > > (Updated March 11, 2013, 9:59 p.m.) > > > Review request for mesos and Benjamin Hindman. > > > Description > ------- > > Currently, every time we do an allocation we have to traverse the entire list > of active slaves an check each one to see if its whitelisted and if it has > resources to allocate. This patch keeps a set of all slaves that meet those > requirements, and updates it when slaves are added/removed and when resources > are allocated/recovered. > > Timing comparisons, using test_framework on a local cluster on my laptop > (each number is an average over 10 tests): > > 100 slaves, 100 tasks > without patch: 5.82s > with patch: 5.773s > improvement of about 1% > > 1000 slaves, 100 tasks > without patch: 8.261s > with patch: 8.07s > improvement of about 2% > > Since this was a scalability issue, you'd presumably see a bigger improvement > in larger clusters but its difficult to simulate more than 1000 slaves or so > locally. If more convincing numbers are needed I can do some testing on EC2 > (if nothing else, I'm hoping we'll have a Jenkins build with automated > performance tests set up later this semester, so I can test this in the > process of setting that up), but at the very least it seems clear that the > patch improves the runtime complexity of the allocation algorithm and doesn't > slow allocations down even in small clusters. > > > Diffs > ----- > > src/local/local.hpp 2633d25 > src/local/local.cpp 3402029 > src/master/allocator.hpp b68b67d > src/master/drf_sorter.hpp 79566cc > src/master/drf_sorter.cpp fe31a00 > src/master/hierarchical_allocator_process.hpp c1d6f54 > src/master/main.cpp ca0abec > src/master/master.hpp c9b4b3f > src/master/master.cpp 814a6e1 > src/master/sorter.hpp 73db6b1 > src/tests/allocator_tests.cpp b953cd1 > src/tests/allocator_zookeeper_tests.cpp 3f70202 > src/tests/fault_tolerance_tests.cpp 61509f1 > src/tests/gc_tests.cpp fbdd6d6 > src/tests/master_detector_tests.cpp 787ba19 > src/tests/master_tests.cpp 2ba14fc > src/tests/resource_offers_tests.cpp 44eaf0d > src/tests/sorter_tests.cpp 61e6038 > src/tests/utils.hpp d3efa58 > third_party/libprocess/third_party/stout/include/stout/stringify.hpp > 136316d > > Diff: https://reviews.apache.org/r/6665/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Thomas Marshall > >
