> On Feb. 26, 2013, midnight, Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 52
> > <https://reviews.apache.org/r/6665/diff/6/?file=261853#file261853line52>
> >
> >     "Slave" would be a nicer name. I see that the allocator is inside the 
> > master namespace and so you have to deal with the fact that Slave is 
> > present inside the master header as well.
> >     
> >     Looking at the signatures on the allocator, it seems we don't need to 
> > know about the Slave struct from the master. Looks to me like all we need 
> > from the master namespace is 'Master'. So we should either change the
> > allocator to be namespaced inside "allocator" or further hide the master's 
> > structs (inside a namespace). Do other people rely on those structs?

Given that the allocator is getting bigger and more complicated and now spans 
several files, I agree that it's worth making an allocator namespace that
includes Allocator, AllocatorProcess (and its subclass), and Sorter (and its 
subclasses). This will also let me make a Framework struct to deal with your 
comments about too many hashmaps.
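
To make that concrete, here's a rough sketch of the layout I'm picturing
(purely illustrative; the exact nesting and names are assumptions, not what's
in the patch yet):

    // Sketch only: one possible namespace layout, assuming the usual
    // mesos::internal nesting. Names here are illustrative.
    namespace mesos {
    namespace internal {
    namespace allocator {

    class Sorter;            // Base class; DRFSorter etc. subclass it.
    class AllocatorProcess;  // Base class; HierarchicalAllocatorProcess
                             // subclasses it.
    class Allocator;         // Thin wrapper that dispatches into an
                             // AllocatorProcess.

    // Allocator-local bookkeeping, so we no longer depend on the
    // master's structs.
    struct Framework;
    struct Slave;

    } // namespace allocator
    } // namespace internal
    } // namespace mesos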


> On Feb. 26, 2013, midnight, Ben Mahler wrote:
> > src/master/hierarchical_allocator_process.hpp, line 63
> > <https://reviews.apache.org/r/6665/diff/6/?file=261853#file261853line63>
> >
> >     Great, we managed to kill the allocatable map using this struct! I 
> > think we should push to reverse the trend in this file of using separate 
> > maps to track related things, which is why I'd like to see more logic 
> > pushed into this struct. Can you compare the performance to see if it's 
> > worth it to add the complexity of a separate hashset?
> >     
> >     Likewise for frameworks, I see maps from:
> >     FrameworkID -> user
> >     user -> FrameworkSorter
> >     
> >     It seems like these would be obviated by using a struct instead, right?
> >     
> >     For now, TODOs are fine if you don't want to take this on, but I don't 
> > think we'll want to add much more here before we clean this stuff up a bit.

After running some tests, the speed difference from keeping 'available' as a 
separate hashset is pretty minor, so I'll get rid of it. I also think it's 
worth including a flag in Slave for the whitelist instead of checking against 
the hashset of hostnames every time.
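
Roughly what I have in mind for the struct (sketch only; field names are
placeholders, not necessarily what will land in the diff):

    // Sketch only: fold 'available' and the whitelist check into the
    // allocator's Slave struct instead of separate hashset/hashmap lookups.
    struct Slave
    {
      Resources available;  // Resources not currently offered or in use.
      bool whitelisted;     // Updated whenever the whitelist changes, so the
                            // allocation loop doesn't re-check the hostname
                            // hashset on every pass.
      std::string hostname;
    };

    // The allocation loop can then just test something like:
    //   if (slave.whitelisted && allocatable(slave.available)) { ... }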

I'm fine with doing the same thing for frameworks now, though it means this 
patch is getting kind of big. I was able to get rid of the FrameworkID -> user 
map as well as the FrameworkID -> filter map. I'm reluctant to get rid of the 
user -> FrameworkSorter map, because removing it would mean that multiple 
Framework objects would need to share pointers to the same FrameworkSorter 
object, which gets into issues of ownership. Additionally, without an 
auxiliary data structure like this map, there's no way to find the 
FrameworkSorter for a user when a new framework comes in, or to detect that 
all of a user's frameworks have been removed (so the FrameworkSorter can be 
deleted), without traversing the entire FrameworkID -> Framework map.
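
For reference, the shape I ended up with looks roughly like this (a sketch;
the exact types and names may differ in the posted diff):

    // Sketch only: why the user -> sorter map stays around.
    hashmap<FrameworkID, Framework> frameworks; // Subsumes the old
                                                // FrameworkID -> user and
                                                // FrameworkID -> filter maps.
    hashmap<std::string, FrameworkSorter*> sorters; // Still keyed by user: one
                                                    // sorter is shared by all
                                                    // of a user's frameworks,
                                                    // and this map owns it.

    // frameworkAdded:   look up (or create) sorters[user] directly.
    // frameworkRemoved: if sorters[user] has no frameworks left, delete it.
    // Without this map we'd have to walk every Framework entry just to learn
    // whether any other framework still belongs to the same user.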


- Thomas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6665/#review17065
-----------------------------------------------------------


On Feb. 28, 2013, 1:42 a.m., Thomas Marshall wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6665/
> -----------------------------------------------------------
> 
> (Updated Feb. 28, 2013, 1:42 a.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Description
> -------
> 
> Currently, every time we do an allocation we have to traverse the entire list 
> of active slaves and check each one to see if it's whitelisted and if it has 
> resources to allocate. This patch keeps a set of all slaves that meet those 
> requirements, and updates it when slaves are added/removed and when resources 
> are allocated/recovered.
> 
> Timing comparisons, using test_framework on a local cluster on my laptop 
> (each number is an average over 10 tests):
> 
> 100 slaves, 100 tasks
> without patch: 5.82s
> with patch: 5.773s
> improvement of about 1%
> 
> 1000 slaves, 100 tasks
> without patch: 8.261s
> with patch: 8.07s
> improvement of about 2%
> 
> Since this was a scalability issue, you'd presumably see a bigger improvement 
> in larger clusters, but it's difficult to simulate more than 1000 slaves or so 
> locally. If more convincing numbers are needed I can do some testing on EC2 
> (if nothing else, I'm hoping we'll have a Jenkins build with automated 
> performance tests set up later this semester, so I can test this in the 
> process of setting that up), but at the very least it seems clear that the 
> patch improves the runtime complexity of the allocation algorithm and doesn't 
> slow allocations down even in small clusters. 
> 
> 
> Diffs
> -----
> 
>   src/local/local.hpp 2633d25 
>   src/local/local.cpp 3402029 
>   src/master/allocator.hpp b68b67d 
>   src/master/drf_sorter.hpp 79566cc 
>   src/master/drf_sorter.cpp 33a8ec8 
>   src/master/hierarchical_allocator_process.hpp 33e059c 
>   src/master/main.cpp ca0abec 
>   src/master/master.hpp c9b4b3f 
>   src/master/master.cpp 814a6e1 
>   src/master/sorter.hpp 73db6b1 
>   src/tests/allocator_tests.cpp 44d72ad 
>   src/tests/allocator_zookeeper_tests.cpp 5f83dd7 
>   src/tests/fault_tolerance_tests.cpp 44eef03 
>   src/tests/gc_tests.cpp 411bcc0 
>   src/tests/master_detector_tests.cpp 5d09bab 
>   src/tests/master_tests.cpp e140f40 
>   src/tests/resource_offers_tests.cpp 5981191 
>   src/tests/sorter_tests.cpp 61e6038 
>   src/tests/utils.hpp 4600c2f 
> 
> Diff: https://reviews.apache.org/r/6665/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Thomas Marshall
> 
>
