> On March 24, 2015, 6:32 p.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/async/preemptor/PendingTaskProcessor.java,
> >  line 100
> > <https://reviews.apache.org/r/32352/diff/1/?file=902121#file902121line100>
> >
> >     This regresses on some important characteristics of `TaskGroups`:
> >     
> >     - starvation avoidance: a very large job (even a low priority one) can 
> > starve a single restarting instance
> >     - backoffs: we will perform redundant and possibly hopeless work to try 
> > to schedule the same things without reducing frequency
> >     
> >     It might be easier to retrofit `TaskGroups` to allow injection of the 
> > underlying work than to reimplement those behaviors, but i'm interested in 
> > what you think.
> 
> Maxim Khutornenko wrote:
>     >starvation avoidance: a very large job (even a low priority one) can 
> starve a single restarting instance
>     
>     I think this can be a fair concern when the number of pending tasks 
> exceeds the number of theoretical slots in the cluster. I don't think the 
> `TaskGroups` fits here given its penalty system, rescheduling calculator and 
> executor driven evaluation. We can go with a (hopefully) simpler `TaskGroup` 
> "circular buffer" interator batch algorithm instead, for example:
>     
>     Create a list of `TaskGroup` instances ordered by the number of pending 
> tasks in each group (largest first).
>     Create (or maintain) a circular *buffer* from the above list.
>     For each slave:
>       While no slot found and we have not looped:
>         Evaluate *current* `TaskGroup`
>         Move *current* to next
>       
>     After all slaves are evaluated we will have a mapping of `TaskGroup`s to 
> slots that we can create victims from.
>     
>     Consider TaskGroup to pending task mapping:
>     TG1 | TG2 | TG3 | TG4
>     ----|-----|-----|----
>     1.1 | 2.1 | 3.1 | 4.1
>     ----|-----|-----|----
>     1.2 | 2.2 | 3.2 | 
>     ----|-----|-----|----
>     1.3 | 2.3 |     |
>     ----|-----|-----|----
>     1.4 |     |     |
>     ----|-----|-----|----
>     1.5 |
>     
>     and slaves A, B, C, D, E, F, G, H
>     
>     A possible slot allocation after all slaves are processed could be:
>     
>     A:1.1, B:NONE, C:3.1, D:4.1, E:1.2, F:2.1, G:3.2, H:1.3
>     
>     This will give a fair chance to smaller job while making sure larger jobs 
> are not penalized unnecessarily. 
>     
>     >backoffs: we will perform redundant and possibly hopeless work to try to 
> schedule the same things without reducing frequency
>     
>     I am not concerned about backoffs here. While we are going to run the 
> above algorithm every minute, we will only evaluate tasks that don't already 
> have slots reserved for. With slot reservation timeout currently set to 5 
> minutes, I don't see much redundancy that we would have to throttle.
>     
>     If you agree with the above proposal I will add it into AURORA-1219 to 
> address shortly after.

I have implemented this approach in https://reviews.apache.org/r/32597/.


> On March 24, 2015, 6:32 p.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/async/preemptor/PendingTaskProcessor.java,
> >  line 73
> > <https://reviews.apache.org/r/32352/diff/1/?file=902121#file902121line73>
> >
> >     I'm beginning to question this pattern.  It seems reasonable to insist 
> > that the global injector _never_ has a binding for a general type like 
> > this.  With that in mind, you should be comfortable using a private module 
> > and an unqualified binding, saving this kind of clutter.  What do you think?
> 
> Maxim Khutornenko wrote:
>     I am not keen on this idea given that the clutter will shift from 
> attributes to private modules but I willing to give it a try to possibly it 
> within the module itself.

On second thought, I don't think we should chase this with private modules as 
it will only be successful until we have to add another general type binding 
targeted for the same constructor. At that point we may switch to a settings 
object approach instead.


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32352/#review77609
-----------------------------------------------------------


On March 21, 2015, 2:19 a.m., Maxim Khutornenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32352/
> -----------------------------------------------------------
> 
> (Updated March 21, 2015, 2:19 a.m.)
> 
> 
> Review request for Aurora, Bill Farner and Zameer Manji.
> 
> 
> Bugs: AURORA-1158
>     https://issues.apache.org/jira/browse/AURORA-1158
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Summary of changes:
> - The PreemptorImpl now only hosts slot validation and task preemption logic 
> and requires a write transaction.
> - PendingTaskProcessor is called every minute to pull all available PENDING 
> tasks and search for preemption slots.
> - TaskScheduler has no connection to slot search anymore. It now completely 
> relies on periodic PreemptionService to find available slots.
> - preemption_delay is reduced from 10 to 3 minutes.
> 
> Benchmark refactoring will be addressed separately.
> 
> 
> Diffs
> -----
> 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 5309e8140fff411da30ee87c1b3b1a55d6fdaeeb 
>   src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java 
> 1b9d741dba7b9c2663ff119cd44adc8403c0d257 
>   
> src/main/java/org/apache/aurora/scheduler/async/preemptor/PendingTaskProcessor.java
>  PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/async/preemptor/PreemptionSlotCache.java
>  4ca36e5fe2cfd326f4d4f37f70dbcd0060109e73 
>   
> src/main/java/org/apache/aurora/scheduler/async/preemptor/PreemptionSlotFinder.java
>  84bcdc57d798aca229d4f184cae065ec4dcf8fc5 
>   src/main/java/org/apache/aurora/scheduler/async/preemptor/Preemptor.java 
> 84791a272f245c729706af95d70c7f1631bfe99c 
>   
> src/main/java/org/apache/aurora/scheduler/async/preemptor/PreemptorImpl.java 
> 18a2e6032ba86ff7efab4d42a4d83798a1d06b06 
>   
> src/main/java/org/apache/aurora/scheduler/async/preemptor/PreemptorMetrics.java
>  782e751f5b05391ebeee4d947570cc174dddece2 
>   
> src/main/java/org/apache/aurora/scheduler/async/preemptor/PreemptorModule.java
>  7034a07eae1f94d7a0bbccdf8146cf3ed0a5424e 
>   src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
> da7b662c3ca4040221805cba81e5ac7b64fb3df4 
>   src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 
> 29fe156da19f3c08af00a951bb3baccf2b97a6cb 
>   src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java 
> f5c2128e90eb5c849d68431225661d1cfa7da0cc 
>   
> src/test/java/org/apache/aurora/scheduler/async/preemptor/PendingTaskProcessorTest.java
>  PRE-CREATION 
>   
> src/test/java/org/apache/aurora/scheduler/async/preemptor/PreemptorImplTest.java
>  d17c4fb513afdb7d8ef6d7c2b0aef86c1f47c082 
>   
> src/test/java/org/apache/aurora/scheduler/async/preemptor/PreemptorModuleTest.java
>  0e2e958a053e5cee280b947f7349c076e2addb45 
> 
> Diff: https://reviews.apache.org/r/32352/diff/
> 
> 
> Testing
> -------
> 
> ./gradlew -Pq build
> Manual testing in vagrant.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>

Reply via email to