-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59480/#review175771
-----------------------------------------------------------


Ship it!




Master (4c0974b) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On May 23, 2017, 7:41 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59480/
> -----------------------------------------------------------
> 
> (Updated May 23, 2017, 7:41 a.m.)
> 
> 
> Review request for Aurora, Santhosh Kumar Shanmugham and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This patch enables scalable, high-performance Scheduler bin-packing using the 
> existing first-fit task assigner, and it can be controlled with a simple 
> command line argument. 
> 
> The bin-packing is only an approximation, but can lead to pretty significant 
> improvements in resource utilization per agent. For example, on a CPU-bound 
> cluster with 30k+ hosts and 135k tasks (across 1k+ jobs) - we were able to 
> reduce the number of hosts with tasks scheduled on them to just 90%, down 
> from 99.7% (as one would expect from randomization). So if you are running 
> Aurora on elastic computing and paying for machines by the minute/hour, then 
> utilizing this patch _could_ allow you to reduce your server footprint by as 
> much as 10%. 
> 
> The approximation is based on the simple idea that you have the best chance 
> of having perfect bin-packing if you put tasks in the smallest slot 
> available. So if you have a task needing 8 cores and you have an 8 core and 
> 12 core offer available - you'd always want to put the task in the 8 core 
> offer*. By sorting offers in OfferManager during iteration, then a first-fit 
> algorithm is guaranteed to match the smallest possible offer for your task 
> and achieves this.
> 
> * - The correct decision of course depends on the other pending tasks and the 
> other resources available, and more satisfactory results may also need 
> preemption, etc.
> 
> 
> Diffs
> -----
> 
>   RELEASE-NOTES.md 77376e438bd7af74c364dcd5d1b3e3f1ece2adbf 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> f2296a9d7a88be7e43124370edecfe64415df00f 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 
> 78255e6dfa31c4920afc0221ee60ec4f8c2a12c4 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferOrder.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java 
> adf7f33e4a72d87c3624f84dfe4998e20dc75fdc 
>   src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java 
> 317a2d26d8bfa27988c60a7706b9fb3aa9b4e2a2 
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 
> d7addc0effb60c196cf339081ad81de541d05385 
>   src/test/java/org/apache/aurora/scheduler/resources/ResourceTestUtil.java 
> 676d305d257585e53f0a05b359ba7eb11f1b23be 
> 
> 
> Diff: https://reviews.apache.org/r/59480/diff/1/
> 
> 
> Testing
> -------
> 
> This has been scale-tested with production-like workloads and performs well, 
> adding only a few extra seconds total in TaskAssigner when applied to 
> thousands of tasks per minute. 
> 
> There is an  overhead when scheduling tasks that have large resource 
> requirements - as the task assigner will first need to skip offer all the 
> offers with low resources. In a packed cluster, this is where the extra 
> seconds are spent. This could be reduced by just jumping over all the offers 
> we know to be too small, but that decision has to map to the OfferOrder 
> (which adds complexity). That can be addressed in a follow-up review if 
> needed.
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>

Reply via email to