Re: Review Request 31739: Making task preemption asynchronous.
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31739/#review75222
---

src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java
https://reviews.apache.org/r/31739/#comment122186

I think a comment would help explain why 30 days is the desired value to use for the benchmarks.

src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java
https://reviews.apache.org/r/31739/#comment122189

Can you explain the reason for the delay? We already wait for a task to be pending for 10 minutes before we do preemption, so I don't see the reason for another delay before we look for a victim.

- Zameer Manji

On March 4, 2015, 11:30 a.m., Maxim Khutornenko wrote:
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31739/
> ---
>
> (Updated March 4, 2015, 11:30 a.m.)
>
> Review request for Aurora, Bill Farner and Zameer Manji.
>
> Bugs: AURORA-1158
> https://issues.apache.org/jira/browse/AURORA-1158
>
> Repository: aurora
>
> Description
> ---
> Reservations now happen asynchronously with a configurable delay between a failed task scheduling and a preemption attempt.
>
> Added a new `PreemptorBenchmark` to measure preemption perf, as it now happens off the main scheduling loop and is thus unreachable by earlier benchmarks.
>
> Benchmark results are unsurprisingly great. The biggest winner is the PreemptorFallbackForLargeClusterBenchmark (now ClusterFullUtilizationBenchmark). Without the preemptor fallback and thanks to static veto offer filtering it's now 99.995% faster :)
>
> The lowest gain is for the limit constraint benchmark. It's the only dynamic veto type and thus is not subjected to offer filtering. Still, a ~71% improvement is nothing to complain about.
>
> Before:
> Benchmark                                                                     Mode  Cnt         Score        Error  Units
> SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark    avgt  100    781243.004 ±   9308.450  ns/op
> SchedulingBenchmarks.LimitConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100   1205278.826 ±  19800.452  ns/op
> SchedulingBenchmarks.PreemptorFallbackForLargeClusterBenchmark.runBenchmark   avgt  100  77048458.974 ± 918593.702  ns/op
> SchedulingBenchmarks.ValueConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100    769919.326 ±  18963.264  ns/op
>
> After:
> Benchmark                                                                     Mode  Cnt         Score        Error  Units
> SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark    avgt  100     28117.603 ±    243.556  ns/op
> SchedulingBenchmarks.LimitConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100    348667.808 ±   2956.521  ns/op
> SchedulingBenchmarks.ClusterFullUtilizationBenchmark.runBenchmark             avgt  100      3978.828 ±    351.186  ns/op
> SchedulingBenchmarks.ValueConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100     26096.782 ±    412.138  ns/op
> SchedulingBenchmarks.PreemptorBenchmark.runBenchmark                          avgt  100   6054216.773 ± 105428.318  ns/op
>
> Perf gain summary:
> InsufficientResourcesSchedulingBenchmark - 96.4%
> LimitConstraintMismatchSchedulingBenchmark - 71%
> PreemptorFallbackForLargeClusterBenchmark - 99.995%
> ValueConstraintMismatchSchedulingBenchmark - 96.6%
>
> Diffs
> ---
> src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 3239eaa139e35e8c3acdacf6375f492de2b5bfee
> src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java e87dda47a355654c66f6f54fb25a4d9a7f68422d
> src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java d0fe3e133cbec2418f31160bf8ab8adaa45bb958
> src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 4ee13c8e5d46ba863f4d9871884c7d494d07758d
> src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java 87bc531d2a72f21c36ddd0c1bd3b2367826cc422
>
> Diff: https://reviews.apache.org/r/31739/diff/
>
> Testing
> ---
> ./gradlew -Pq build
> Manual testing in vagrant.
>
> Thanks,
> Maxim Khutornenko
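For readers who want to check the perf gain summary against the raw scores, here is a small worked calculation. It assumes the gain is the relative reduction in average time per operation, (before - after) / before; the class and method names (`PerfGain`, `percentGain`) are made up for this illustration and are not part of the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PerfGain {
    /** Percentage reduction in average time per operation (ns/op). */
    public static double percentGain(double beforeNsPerOp, double afterNsPerOp) {
        return (beforeNsPerOp - afterNsPerOp) / beforeNsPerOp * 100.0;
    }

    public static void main(String[] args) {
        // {before, after} scores in ns/op, copied from the benchmark tables.
        // ClusterFullUtilization was named PreemptorFallbackForLargeCluster before the change.
        Map<String, double[]> scores = new LinkedHashMap<>();
        scores.put("InsufficientResources", new double[] {781243.004, 28117.603});
        scores.put("LimitConstraintMismatch", new double[] {1205278.826, 348667.808});
        scores.put("ClusterFullUtilization", new double[] {77048458.974, 3978.828});
        scores.put("ValueConstraintMismatch", new double[] {769919.326, 26096.782});

        for (Map.Entry<String, double[]> entry : scores.entrySet()) {
            double[] s = entry.getValue();
            System.out.printf("%-24s %.3f%%%n", entry.getKey(), percentGain(s[0], s[1]));
        }
    }
}
```

Under that definition the four values round to the 96.4%, 71%, 99.995% and 96.6% quoted in the summary.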
Review Request 31739: Making task preemption asynchronous.
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31739/
---

Review request for Aurora, Bill Farner and Zameer Manji.

Bugs: AURORA-1158
https://issues.apache.org/jira/browse/AURORA-1158

Repository: aurora

Description
---
Reservations now happen asynchronously with a configurable delay between a failed task scheduling and a preemption attempt.

Added a new `PreemptorBenchmark` to measure preemption perf, as it now happens off the main scheduling loop and is thus unreachable by earlier benchmarks.

Benchmark results are unsurprisingly great. The biggest winner is the PreemptorFallbackForLargeClusterBenchmark (now ClusterFullUtilizationBenchmark). Without the preemptor fallback and thanks to static veto offer filtering it's now 99.995% faster :)

The lowest gain is for the limit constraint benchmark. It's the only dynamic veto type and thus is not subjected to offer filtering. Still, a ~82% improvement is nothing to complain about.

Before:
Benchmark                                                                     Mode  Cnt         Score        Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark    avgt  100    781243.004 ±   9308.450  ns/op
SchedulingBenchmarks.LimitConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100   1205278.826 ±  19800.452  ns/op
SchedulingBenchmarks.PreemptorFallbackForLargeClusterBenchmark.runBenchmark   avgt  100  77048458.974 ± 918593.702  ns/op
SchedulingBenchmarks.ValueConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100    769919.326 ±  18963.264  ns/op

After:
Benchmark                                                                     Mode  Cnt         Score        Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark    avgt  100     28117.603 ±    243.556  ns/op
SchedulingBenchmarks.LimitConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100    348667.808 ±   2956.521  ns/op
SchedulingBenchmarks.ClusterFullUtilizationBenchmark.runBenchmark             avgt  100      3978.828 ±    351.186  ns/op
SchedulingBenchmarks.ValueConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100     26096.782 ±    412.138  ns/op
SchedulingBenchmarks.PreemptorBenchmark.runBenchmark                          avgt  100   6054216.773 ± 105428.318  ns/op

Diffs
---
src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 3239eaa139e35e8c3acdacf6375f492de2b5bfee
src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java e87dda47a355654c66f6f54fb25a4d9a7f68422d
src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java d0fe3e133cbec2418f31160bf8ab8adaa45bb958
src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 4ee13c8e5d46ba863f4d9871884c7d494d07758d
src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java 87bc531d2a72f21c36ddd0c1bd3b2367826cc422

Diff: https://reviews.apache.org/r/31739/diff/

Testing
---
./gradlew -Pq build
Manual testing in vagrant.

Thanks,
Maxim Khutornenko
Re: Review Request 31739: Making task preemption asynchronous.
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31739/
---

(Updated March 4, 2015, 7:30 p.m.)

Review request for Aurora, Bill Farner and Zameer Manji.

Bugs: AURORA-1158
https://issues.apache.org/jira/browse/AURORA-1158

Repository: aurora

Description (updated)
---
Reservations now happen asynchronously with a configurable delay between a failed task scheduling and a preemption attempt.

Added a new `PreemptorBenchmark` to measure preemption perf as it now happens off the main scheduling loop and thus unreachable by earlier benchmarks.

Benchmark results are unsurprisingly great. The biggest winner is the PreemptorFallbackForLargeClusterBenchmark (now ClusterFullUtilizationBenchmark). Without the preemptor fallback and thanks to static veto offer filtering it's now 99.995% faster :)

The lowest gain is for the limit constraint benchmark. It's the only dynamic veto type and thus is not subjected to offer filtering. Still ~71% improvement is nothing to complain about.

Before:
Benchmark                                                                     Mode  Cnt         Score        Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark    avgt  100    781243.004 ±   9308.450  ns/op
SchedulingBenchmarks.LimitConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100   1205278.826 ±  19800.452  ns/op
SchedulingBenchmarks.PreemptorFallbackForLargeClusterBenchmark.runBenchmark   avgt  100  77048458.974 ± 918593.702  ns/op
SchedulingBenchmarks.ValueConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100    769919.326 ±  18963.264  ns/op

After:
Benchmark                                                                     Mode  Cnt         Score        Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark    avgt  100     28117.603 ±    243.556  ns/op
SchedulingBenchmarks.LimitConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100    348667.808 ±   2956.521  ns/op
SchedulingBenchmarks.ClusterFullUtilizationBenchmark.runBenchmark             avgt  100      3978.828 ±    351.186  ns/op
SchedulingBenchmarks.ValueConstraintMismatchSchedulingBenchmark.runBenchmark  avgt  100     26096.782 ±    412.138  ns/op
SchedulingBenchmarks.PreemptorBenchmark.runBenchmark                          avgt  100   6054216.773 ± 105428.318  ns/op

Perf gain summary:
InsufficientResourcesSchedulingBenchmark - 96.4%
LimitConstraintMismatchSchedulingBenchmark - 71%
PreemptorFallbackForLargeClusterBenchmark - 99.995%
ValueConstraintMismatchSchedulingBenchmark - 96.6%

Diffs
---
src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 3239eaa139e35e8c3acdacf6375f492de2b5bfee
src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java e87dda47a355654c66f6f54fb25a4d9a7f68422d
src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java d0fe3e133cbec2418f31160bf8ab8adaa45bb958
src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 4ee13c8e5d46ba863f4d9871884c7d494d07758d
src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java 87bc531d2a72f21c36ddd0c1bd3b2367826cc422

Diff: https://reviews.apache.org/r/31739/diff/

Testing
---
./gradlew -Pq build
Manual testing in vagrant.

Thanks,
Maxim Khutornenko
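The description credits "static veto offer filtering" for most of the gain and notes that the limit constraint, being a dynamic veto, does not benefit from it. One plausible reading, sketched below, is that an offer rejected for a reason that cannot change while the offer stays the same is remembered and skipped on later attempts, whereas a dynamic rejection must be re-evaluated every time. This is an illustration of that idea only, not Aurora's code: `OfferFilterSketch`, `VetoKind`, `Matcher` and `tryMatch` are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

public class OfferFilterSketch {
    /** Hypothetical classification: can the veto change while the offer stays the same? */
    public enum VetoKind { STATIC, DYNAMIC }

    /** Hypothetical stand-in for the expensive offer/task comparison; empty means "fits". */
    public interface Matcher {
        Optional<VetoKind> match(String offerId, String taskGroup);
    }

    // offerId -> task groups already statically vetoed against that offer.
    // Plain HashMap: the sketch assumes a single scheduling thread.
    private final Map<String, Set<String>> staticBans = new HashMap<>();
    private int fullEvaluations = 0;

    public boolean tryMatch(String offerId, String taskGroup, Matcher matcher) {
        Set<String> banned = staticBans.get(offerId);
        if (banned != null && banned.contains(taskGroup)) {
            return false; // Filtered out without re-running the matcher.
        }
        fullEvaluations++;
        Optional<VetoKind> veto = matcher.match(offerId, taskGroup);
        if (!veto.isPresent()) {
            return true;
        }
        if (veto.get() == VetoKind.STATIC) {
            staticBans.computeIfAbsent(offerId, id -> new HashSet<>()).add(taskGroup);
        }
        return false; // Dynamic vetoes are not cached, so they are re-evaluated next time.
    }

    public int fullEvaluations() {
        return fullEvaluations;
    }
}
```

With this shape, repeated attempts against a statically vetoed offer cost a map lookup, while a dynamically vetoed offer pays for the full match on every attempt, which would be consistent with the limit constraint benchmark showing the smallest gain.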
Re: Review Request 31739: Making task preemption asynchronous.
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31739/#review75247
---

src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java
https://reviews.apache.org/r/31739/#comment122213

By only using one task per host we don't benchmark the costly O(tasks per slave) candidate search. The latter can probably be improved significantly, so I think it is wise to extend the benchmark accordingly.

src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java
https://reviews.apache.org/r/31739/#comment122217

Is this the same single-threaded scheduler used for ordinary scheduling? (Otherwise I would expect races.)

- Stephan Erb

On March 4, 2015, 8:30 p.m., Maxim Khutornenko wrote:
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31739/
> ---
>
> (Updated March 4, 2015, 8:30 p.m.)
>
> Review request for Aurora, Bill Farner and Zameer Manji.
>
> Bugs: AURORA-1158
> https://issues.apache.org/jira/browse/AURORA-1158
>
> Repository: aurora
Re: Review Request 31739: Making task preemption asynchronous.
On March 4, 2015, 10:25 p.m., Stephan Erb wrote:
> src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java, line 269
> https://reviews.apache.org/r/31739/diff/1/?file=884469#file884469line269
>
> Is this the same single threaded scheduler used for ordinary scheduling? (Otherwise I would expect races)

Yes, it is.

On March 4, 2015, 10:25 p.m., Stephan Erb wrote:
> src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java, line 331
> https://reviews.apache.org/r/31739/diff/1/?file=884467#file884467line331
>
> By only using one task per host we don't benchmark the costly O(tasks per slave) candidate search. The latter can probably be improved significantly, I therefore think it is wise to extend the benchmark accordingly.

This line actually refers to a benchmark tester task rather than to the preemption candidates, which are populated in the base `buildClusterTasks()`. However, you are correct about the one-task-per-slave observation. We could add a multiple-tasks-per-slave preemption benchmark, but we should probably do so when we try to optimize the preemptor algorithm itself.

- Maxim

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31739/#review75247
---

On March 4, 2015, 7:30 p.m., Maxim Khutornenko wrote:
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31739/
> ---
>
> (Updated March 4, 2015, 7:30 p.m.)
>
> Review request for Aurora, Bill Farner and Zameer Manji.
>
> Bugs: AURORA-1158
> https://issues.apache.org/jira/browse/AURORA-1158
>
> Repository: aurora
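The exchange above confirms that the delayed preemption attempt runs on the same single-threaded scheduler as ordinary scheduling. A minimal sketch of why that avoids races, assuming a plain `ScheduledExecutorService` with one thread: the failed scheduling attempt and the later preemption attempt are both queued on the same thread, so they can never overlap and shared state needs no locking. The class name, event strings and `runDemo` helper are hypothetical and only illustrate the pattern; they are not taken from `TaskScheduler`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DelayedPreemptionSketch {
    /** Runs one failed scheduling attempt followed by a delayed preemption attempt. */
    public static List<String> runDemo(long delayMs) {
        // One thread serves both the scheduling work and the delayed preemption work.
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        List<String> events = new ArrayList<>(); // Only the executor thread mutates this list.
        CountDownLatch done = new CountDownLatch(1);
        try {
            executor.execute(() -> {
                // Ordinary scheduling attempt fails, so a preemption attempt is queued for later.
                events.add("schedule-failed:task-1");
                executor.schedule(() -> {
                    events.add("preempt-attempt:task-1");
                    done.countDown();
                }, delayMs, TimeUnit.MILLISECONDS);
            });
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("preemption attempt did not run in time");
            }
            return events; // Safe to read: countDown() happens-before await() returning.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo(50));
    }
}
```

If the preemption attempt were instead handed to a separate pool, the list (or, in the real scheduler, the reservation state) would need explicit synchronization.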
Re: Review Request 31739: Making task preemption asynchronous.
On March 4, 2015, 8:17 p.m., Bill Farner wrote:
> Is there a reason you did not opt to implement this behind the `Preemptor` interface? Seems like if you went with that approach, `TaskScheduler` can be oblivious to the background operations.

Maxim Khutornenko wrote:
> Trying to keep things simple. Moving it behind `Preemptor` would require sharing `Reservations` (or some equivalent feedback notification) between `TaskScheduler` and `Preemptor`.

Bill Farner wrote:
> I don't see why the data structure would need to be shared. On one call you could asynchronously kick off the work, and a subsequent call could report back the result of the previous.

Maxim Khutornenko wrote:
> Perhaps I am missing the point, but how does it correlate with "`TaskScheduler` can be oblivious to the background operations"? If there is no immediate response back from the preemptor, what is responsible for getting the reservation data, and when? Where does that reservation data live in this case?

Chatted with Bill offline and we agreed that we should start moving towards a standalone (background worker) preemptor. That would require moving async decision making into the preemptor itself. I am going to discard this RB and start working towards what benefits us longer term.

- Maxim

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31739/#review75226
---

On March 4, 2015, 7:30 p.m., Maxim Khutornenko wrote:
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31739/
> ---
>
> (Updated March 4, 2015, 7:30 p.m.)
>
> Review request for Aurora, Bill Farner and Zameer Manji.
>
> Bugs: AURORA-1158
> https://issues.apache.org/jira/browse/AURORA-1158
>
> Repository: aurora
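Bill's suggestion in this thread ("on one call you could asynchronously kick off the work, and a subsequent call could report back the result of the previous") can be sketched roughly as follows. This is only an illustration of that calling pattern under stated assumptions: the class name, the `attemptPreemptionFor` method, and the use of strings for task and slave IDs are all hypothetical, and the real `Preemptor` interface may look quite different.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Function;

public class AsyncPreemptorSketch {
    private final Executor worker;
    // Hypothetical slot search: task ID -> slave ID with a preemptible slot, if any.
    private final Function<String, Optional<String>> slotSearch;
    // Searches in flight, keyed by task ID. Assumes calls come from one scheduling thread.
    private final Map<String, CompletableFuture<Optional<String>>> inFlight = new HashMap<>();

    public AsyncPreemptorSketch(Executor worker, Function<String, Optional<String>> slotSearch) {
        this.worker = worker;
        this.slotSearch = slotSearch;
    }

    /**
     * First call for a task kicks off a background search and returns empty.
     * A later call returns the search result once it is available.
     */
    public Optional<String> attemptPreemptionFor(String taskId) {
        CompletableFuture<Optional<String>> search = inFlight.get(taskId);
        if (search == null) {
            inFlight.put(taskId, CompletableFuture.supplyAsync(() -> slotSearch.apply(taskId), worker));
            return Optional.empty();
        }
        if (!search.isDone()) {
            return Optional.empty(); // Still searching; the caller simply retries later.
        }
        inFlight.remove(taskId);
        return search.join();
    }
}
```

In this shape the caller never sees the background work or the reservation data directly; it just calls the same method again on its next scheduling pass, which is one way `TaskScheduler` could stay oblivious to the background operations.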