-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59480/#review175996
-----------------------------------------------------------

Ship it!


Overall, this looks good to me. I am not sure whether it would be easy to add
customizability by providing an ordering at run time.


src/main/java/org/apache/aurora/scheduler/offers/OfferOrder.java
Lines 24 (patched)
<https://reviews.apache.org/r/59480/#comment249341>

    I just want to understand what the plan is in the long term. Do we need
    to add to this in case some other resource becomes tight? What happens
    when there are new resources such as GPU?


- Reza Motamedi


On May 23, 2017, 7:41 a.m., David McLaughlin wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59480/
> -----------------------------------------------------------
>
> (Updated May 23, 2017, 7:41 a.m.)
>
>
> Review request for Aurora, Santhosh Kumar Shanmugham and Stephan Erb.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This patch enables scalable, high-performance Scheduler bin-packing using the
> existing first-fit task assigner, and it can be controlled with a simple
> command line argument.
>
> The bin-packing is only an approximation, but can lead to pretty significant
> improvements in resource utilization per agent. For example, on a CPU-bound
> cluster with 30k+ hosts and 135k tasks (across 1k+ jobs) - we were able to
> reduce the number of hosts with tasks scheduled on them to just 90%, down
> from 99.7% (as one would expect from randomization). So if you are running
> Aurora on elastic computing and paying for machines by the minute/hour, then
> utilizing this patch _could_ allow you to reduce your server footprint by as
> much as 10%.
>
> The approximation is based on the simple idea that you have the best chance
> of having perfect bin-packing if you put tasks in the smallest slot
> available. So if you have a task needing 8 cores and you have an 8 core and
> 12 core offer available - you'd always want to put the task in the 8 core
> offer*. By sorting offers in OfferManager during iteration, a first-fit
> algorithm is guaranteed to match the smallest possible offer for your task,
> which achieves this.
>
> * - The correct decision of course depends on the other pending tasks and the
> other resources available, and more satisfactory results may also need
> preemption, etc.
>
>
> Diffs
> -----
>
>   RELEASE-NOTES.md 77376e438bd7af74c364dcd5d1b3e3f1ece2adbf
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java f2296a9d7a88be7e43124370edecfe64415df00f
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 78255e6dfa31c4920afc0221ee60ec4f8c2a12c4
>   src/main/java/org/apache/aurora/scheduler/offers/OfferOrder.java PRE-CREATION
>   src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java adf7f33e4a72d87c3624f84dfe4998e20dc75fdc
>   src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java 317a2d26d8bfa27988c60a7706b9fb3aa9b4e2a2
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java d7addc0effb60c196cf339081ad81de541d05385
>   src/test/java/org/apache/aurora/scheduler/resources/ResourceTestUtil.java 676d305d257585e53f0a05b359ba7eb11f1b23be
>
>
> Diff: https://reviews.apache.org/r/59480/diff/1/
>
>
> Testing
> -------
>
> This has been scale-tested with production-like workloads and performs well,
> adding only a few extra seconds total in TaskAssigner when applied to
> thousands of tasks per minute.
>
> There is an overhead when scheduling tasks that have large resource
> requirements - as the task assigner will first need to skip over all the
> offers with low resources. In a packed cluster, this is where the extra
> seconds are spent. This could be reduced by just jumping over all the offers
> we know to be too small, but that decision has to map to the OfferOrder
> (which adds complexity). That can be addressed in a follow-up review if
> needed.
>
>
> Thanks,
>
> David McLaughlin
>
>
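
For readers following the thread, the sorted first-fit idea in the description can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the patch: `SketchOffer`, `OfferOrderingSketch`, the `Key` enum and `firstFit` are hypothetical names, and the real `OfferOrder` and `OfferManager` may look quite different. The ordering is passed in as a list of keys, which is one possible way to make it configurable at run time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a resource offer; this is NOT Aurora's offer type.
final class SketchOffer {
  final String host;
  final double cpus;
  final long ramMb;

  SketchOffer(String host, double cpus, long ramMb) {
    this.host = host;
    this.cpus = cpus;
    this.ramMb = ramMb;
  }
}

final class OfferOrderingSketch {
  // Hypothetical ordering keys; the values in the real OfferOrder may differ.
  enum Key { CPU, RAM }

  // Build one comparator from an ordered list of keys (smallest offer first).
  static Comparator<SketchOffer> comparatorFor(List<Key> keys) {
    Comparator<SketchOffer> result = (a, b) -> 0;
    for (Key key : keys) {
      Comparator<SketchOffer> next;
      switch (key) {
        case CPU:
          next = Comparator.comparingDouble((SketchOffer o) -> o.cpus);
          break;
        case RAM:
          next = Comparator.comparingLong((SketchOffer o) -> o.ramMb);
          break;
        default:
          throw new IllegalArgumentException("Unknown key: " + key);
      }
      result = result.thenComparing(next);
    }
    return result;
  }

  // First-fit over offers sorted ascending: the first offer that fits is
  // also the smallest fitting offer under the chosen ordering.
  static Optional<SketchOffer> firstFit(
      List<SketchOffer> offers, List<Key> order, double cpus, long ramMb) {
    List<SketchOffer> sorted = new ArrayList<>(offers);
    sorted.sort(comparatorFor(order));
    for (SketchOffer offer : sorted) {
      if (offer.cpus >= cpus && offer.ramMb >= ramMb) {
        return Optional.of(offer);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // The example from the description: an 8 core task, 8 and 12 core offers.
    List<SketchOffer> offers = Arrays.asList(
        new SketchOffer("host-a", 12.0, 16384),
        new SketchOffer("host-b", 8.0, 16384));
    Optional<SketchOffer> pick =
        firstFit(offers, Arrays.asList(Key.CPU), 8.0, 1024);
    System.out.println(pick.map(o -> o.host).orElse("no fit"));
  }
}
```

In this sketch, a new resource such as GPU would need a new `Key` value and a matching `case`, so the question of how the ordering should grow over time applies here as well. The loop also shows where the overhead mentioned under Testing comes from: a large task has to walk past every smaller offer before reaching one that fits.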
