> On Jan. 19, 2018, 7:27 p.m., Stephan Erb wrote:
> > The patch itself looks fine! However, I have difficulties assesing it 
> > properly as I am missing a bit of context. What is the main customization 
> > that you aim to implement using the new interface?
> > 
> > I am also curious what is your take on the removed offer sorting logic. 
> > Does it still work for Twitter as intended originally? We are currently 
> > using it as well, so I would need to re-implement if it gets removed. (Main 
> > issue we have noticed so far was that the number of tasks per agent can 
> > shift drastically, sometimes even up to a point where the Thermos observer 
> > is completely unusable).
> 
> Jordan Ly wrote:
>     Some context for the new interface:
>     
>     For performance, it is helpful to control the data structure we are 
> holding offers in and how we can take advantage of it to narrow down the list 
> of candidates to schedule on. For example, in the case of CPU ordering, if we 
> call `getOrdered(TaskGroupKey, ResourceRequest)` -- since we know the size of 
> the ResourceRequest and since we have access to the `offers` NavigableSet, we 
> might use `subMap` to return only candidates that we know will fit thus 
> reducing the number of offers we have to evaluate right away.
> 
> Stephan Erb wrote:
>     Yeah that makes sense, but would have also been possible without the set 
> of recent scheduling refactorings (either by using the `NavigableSet` as-is 
> or via something like 
> https://github.com/wfarner/aurora/commit/f77d79a2d01c3b5b34f11b812d5dcff2789e0766
>  ). On top, this could even be done by default. 
>     
>     If I am reading correctly between the lines of this RB here and 
> https://reviews.apache.org/r/64954/ it sounds like you are moving towards a 
> custom, heavy-weight scheduling logic. My question essentially boils down to: 
> Do you go down the custom route because your usecase is so specific? Or is it 
> just more involved doing it in the open (lack of reviews, increases 
> turnaround times, ...)?
>     
>     I don't intend to block your patch, as it looks good. I just want to 
> understand where this is heading and why.
> 
> David McLaughlin wrote:
>     Twitter has a 85/15 split between production (preferred) and preemtible 
> tier tasks. The idea we're pursuing is to try and split the cluster logically 
> into hosts that have preferred tasks and hosts that have preemtible tasks. 
> Why is this useful? Well, the machines with preemptible tasks are effectively 
> "empty" with preemption enabled, in terms of available slots that other jobs 
> can grow into. It means if we commit 15% of our budget to the preemptible 
> tier, we can also guarantee that we can support that % of growth in the 
> cluster. This will enable larger instances, and temporary bursts, etc. 
>     
>     Bin-packing alone doesn't solve this, because if you end up with every 
> host 85% prod and 15% non-prod, and you are low on capacity (forcing 
> preemption) only slots 15% the size of a host are available. This excludes 
> many jobs from growing, and it forces us to put unnecessary "core limits" on 
> jobs. 
>     
>     I think this type of scheduling optimization is very specific to Twitter. 
> As is the scale we're working at, which meant that score-based scheduling 
> using just OfferSelector didn't work for us. But there are some neat 
> optimizations we can make for just the condition we want, but it requires 
> access to the underlying data structures (basically to support reverse 
> iteration in a performant way). So all of this motivates this patch.

@StephanErb I now realize that you meant you were still utilizing the 
`offer_order` option and not the custom `offer_order_modules` one.

I will re-add this in a follow-up patch since it can still co-exist with the 
default `OfferSetImpl`.


- Jordan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65233/#review195844
-----------------------------------------------------------


On Jan. 19, 2018, 9:40 p.m., Jordan Ly wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65233/
> -----------------------------------------------------------
> 
> (Updated Jan. 19, 2018, 9:40 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Stephan Erb, and Bill Farner.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> The goal of this patch is to provide a more reasonable abstraction for custom 
> scheduling.
> 
> Currently, we have `OfferSelector`, `OfferOrder`, and recently I proposed 
> `FilterableOfferCollection` (https://reviews.apache.org/r/65225/). These were 
> all created in order to provide more customization within the scheduling 
> loop. However, they suffer from being leaky and too disparate. This patch 
> hopes to combine all of those components into a single `OfferSet` which can 
> be injected and used by HostOffers. This interface allows for custom 
> scheduling logic to be co-located with custom data structures for `offers` as 
> opposed to being passed around as different components.
> 
> The following options will be removed from the last 0.19 to now:
> ```
> -offer_order
> -offer_order_modules
> ```
> 
> 
> Diffs
> -----
> 
>   RELEASE-NOTES.md 3053e543793669601ec9132adae376bcdce64c32 
>   docs/reference/scheduler-configuration.md 
> f697b6f1f5332ac9a9feade48317ecf5818e0c92 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> f44cb61f5d1805aeb0f98135453dbc82f3bc9ad4 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> e4e535824adb86c98c6173a20f97a9a90b08483a 
>   src/main/java/org/apache/aurora/scheduler/offers/HostOffers.java 
> 2ea7a01085b87c8ed6765537a8005e2349784ab0 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 
> 8f9e33d81be9087821784a8a08079a1736d9cb63 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManagerModule.java 
> de16c14c887d4225e0629b1580eb5e740798f1f7 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferOrder.java 
> 7c99c309e0e97519719905f725bb9f6a59d2a7f5 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferOrderBuilder.java 
> 1260ef19506acb8e8a937d4fd7b7152361bd3c40 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferSet.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferSetImpl.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java 
> 1e36b2c3094e99f05aa4a4f098a48df2293b4320 
>   
> src/main/java/org/apache/aurora/scheduler/scheduling/FirstFitOfferSelector.java
>  5f4f452cb133e2be73f881700a54ba4c8e60a35a 
>   
> src/main/java/org/apache/aurora/scheduler/scheduling/FirstFitOfferSelectorModule.java
>  4d36487f430483444314a8cefd133e7c766075de 
>   src/main/java/org/apache/aurora/scheduler/scheduling/OfferSelector.java 
> 6f5d55c5a586ac923b5f375d1935ec2049673ec0 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskAssignerImpl.java 
> ec416cc12b2fb4a11a273c77dbe72b79f96fa7e1 
>   
> src/main/java/org/apache/aurora/scheduler/scheduling/TaskAssignerImplModule.java
>  d101a497896ba3b8ca7984a63f861494cbddbb35 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> c84fad3b615c49ca7527ae3f7fa116aa8f90ff0c 
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 
> 28224a5c587b235ccc86a286e796eeca66929232 
>   
> src/test/java/org/apache/aurora/scheduler/scheduling/FirstFitOfferSelectorTest.java
>  e6b2b7465fb4fd432cda50d1e714dfc587764689 
>   
> src/test/java/org/apache/aurora/scheduler/scheduling/TaskAssignerImplTest.java
>  f84941edd96d974a6c2d7e087324bc8fbbac3b0a 
> 
> 
> Diff: https://reviews.apache.org/r/65233/diff/4/
> 
> 
> Testing
> -------
> 
> `./gradlew test`
> `./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh`
> 
> Testing injection of a custom `OfferSet` onto a test cluster.
> 
> 
> Thanks,
> 
> Jordan Ly
> 
>

Reply via email to