> On Jan. 19, 2018, 7:27 p.m., Stephan Erb wrote: > > The patch itself looks fine! However, I have difficulties assesing it > > properly as I am missing a bit of context. What is the main customization > > that you aim to implement using the new interface? > > > > I am also curious what is your take on the removed offer sorting logic. > > Does it still work for Twitter as intended originally? We are currently > > using it as well, so I would need to re-implement if it gets removed. (Main > > issue we have noticed so far was that the number of tasks per agent can > > shift drastically, sometimes even up to a point where the Thermos observer > > is completely unusable). > > Jordan Ly wrote: > Some context for the new interface: > > For performance, it is helpful to control the data structure we are > holding offers in and how we can take advantage of it to narrow down the list > of candidates to schedule on. For example, in the case of CPU ordering, if we > call `getOrdered(TaskGroupKey, ResourceRequest)` -- since we know the size of > the ResourceRequest and since we have access to the `offers` NavigableSet, we > might use `subMap` to return only candidates that we know will fit thus > reducing the number of offers we have to evaluate right away. > > Stephan Erb wrote: > Yeah that makes sense, but would have also been possible without the set > of recent scheduling refactorings (either by using the `NavigableSet` as-is > or via something like > https://github.com/wfarner/aurora/commit/f77d79a2d01c3b5b34f11b812d5dcff2789e0766 > ). On top, this could even be done by default. > > If I am reading correctly between the lines of this RB here and > https://reviews.apache.org/r/64954/ it sounds like you are moving towards a > custom, heavy-weight scheduling logic. My question essentially boils down to: > Do you go down the custom route because your usecase is so specific? Or is it > just more involved doing it in the open (lack of reviews, increases > turnaround times, ...)? > > I don't intend to block your patch, as it looks good. I just want to > understand where this is heading and why.
Twitter has a 85/15 split between production (preferred) and preemtible tier tasks. The idea we're pursuing is to try and split the cluster logically into hosts that have preferred tasks and hosts that have preemtible tasks. Why is this useful? Well, the machines with preemptible tasks are effectively "empty" with preemption enabled, in terms of available slots that other jobs can grow into. It means if we commit 15% of our budget to the preemptible tier, we can also guarantee that we can support that % of growth in the cluster. This will enable larger instances, and temporary bursts, etc. Bin-packing alone doesn't solve this, because if you end up with every host 85% prod and 15% non-prod, and you are low on capacity (forcing preemption) only slots 15% the size of a host are available. This excludes many jobs from growing, and it forces us to put unnecessary "core limits" on jobs. I think this type of scheduling optimization is very specific to Twitter. As is the scale we're working at, which meant that score-based scheduling using just OfferSelector didn't work for us. But there are some neat optimizations we can make for just the condition we want, but it requires access to the underlying data structures (basically to support reverse iteration in a performant way). So all of this motivates this patch. - David ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/65233/#review195844 ----------------------------------------------------------- On Jan. 19, 2018, 9:40 p.m., Jordan Ly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/65233/ > ----------------------------------------------------------- > > (Updated Jan. 19, 2018, 9:40 p.m.) > > > Review request for Aurora, David McLaughlin, Stephan Erb, and Bill Farner. > > > Repository: aurora > > > Description > ------- > > The goal of this patch is to provide a more reasonable abstraction for custom > scheduling. > > Currently, we have `OfferSelector`, `OfferOrder`, and recently I proposed > `FilterableOfferCollection` (https://reviews.apache.org/r/65225/). These were > all created in order to provide more customization within the scheduling > loop. However, they suffer from being leaky and too disparate. This patch > hopes to combine all of those components into a single `OfferSet` which can > be injected and used by HostOffers. This interface allows for custom > scheduling logic to be co-located with custom data structures for `offers` as > opposed to being passed around as different components. > > The following options will be removed from the last 0.19 to now: > ``` > -offer_order > -offer_order_modules > ``` > > > Diffs > ----- > > RELEASE-NOTES.md 3053e543793669601ec9132adae376bcdce64c32 > docs/reference/scheduler-configuration.md > f697b6f1f5332ac9a9feade48317ecf5818e0c92 > src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java > f44cb61f5d1805aeb0f98135453dbc82f3bc9ad4 > src/main/java/org/apache/aurora/scheduler/config/CliOptions.java > e4e535824adb86c98c6173a20f97a9a90b08483a > src/main/java/org/apache/aurora/scheduler/offers/HostOffers.java > 2ea7a01085b87c8ed6765537a8005e2349784ab0 > src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java > 8f9e33d81be9087821784a8a08079a1736d9cb63 > src/main/java/org/apache/aurora/scheduler/offers/OfferManagerModule.java > de16c14c887d4225e0629b1580eb5e740798f1f7 > src/main/java/org/apache/aurora/scheduler/offers/OfferOrder.java > 7c99c309e0e97519719905f725bb9f6a59d2a7f5 > src/main/java/org/apache/aurora/scheduler/offers/OfferOrderBuilder.java > 1260ef19506acb8e8a937d4fd7b7152361bd3c40 > src/main/java/org/apache/aurora/scheduler/offers/OfferSet.java PRE-CREATION > src/main/java/org/apache/aurora/scheduler/offers/OfferSetImpl.java > PRE-CREATION > src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java > 1e36b2c3094e99f05aa4a4f098a48df2293b4320 > > src/main/java/org/apache/aurora/scheduler/scheduling/FirstFitOfferSelector.java > 5f4f452cb133e2be73f881700a54ba4c8e60a35a > > src/main/java/org/apache/aurora/scheduler/scheduling/FirstFitOfferSelectorModule.java > 4d36487f430483444314a8cefd133e7c766075de > src/main/java/org/apache/aurora/scheduler/scheduling/OfferSelector.java > 6f5d55c5a586ac923b5f375d1935ec2049673ec0 > src/main/java/org/apache/aurora/scheduler/scheduling/TaskAssignerImpl.java > ec416cc12b2fb4a11a273c77dbe72b79f96fa7e1 > > src/main/java/org/apache/aurora/scheduler/scheduling/TaskAssignerImplModule.java > d101a497896ba3b8ca7984a63f861494cbddbb35 > src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java > c84fad3b615c49ca7527ae3f7fa116aa8f90ff0c > src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java > 28224a5c587b235ccc86a286e796eeca66929232 > > src/test/java/org/apache/aurora/scheduler/scheduling/FirstFitOfferSelectorTest.java > e6b2b7465fb4fd432cda50d1e714dfc587764689 > > src/test/java/org/apache/aurora/scheduler/scheduling/TaskAssignerImplTest.java > f84941edd96d974a6c2d7e087324bc8fbbac3b0a > > > Diff: https://reviews.apache.org/r/65233/diff/4/ > > > Testing > ------- > > `./gradlew test` > `./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh` > > Testing injection of a custom `OfferSet` onto a test cluster. > > > Thanks, > > Jordan Ly > >
