Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-27 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/
---

(Updated Sept. 27, 2016, 6:06 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Joshua's comments.


Repository: aurora


Description
---

This is phase 2 of scheduling perf improvement effort started in 
https://reviews.apache.org/r/51759/.

We can now take multiple (configurable) number of task IDs from a given 
`TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
and assign more than one task if possible. This approach delivers substantially 
better MTTA and still ensures fairness across multiple `TaskGroups`. We have 
observed almost linear improvement in MTTA (4x+ with 5 tasks per round), which 
suggest the `max_tasks_per_schedule_attempt` can be set even higher if the 
majority of cluster jobs have large number of instances and/or update batch 
sizes.

As far as a single round perf goes, we can consider the following 2 worst-case 
scenarios:
- master: single task scheduling fails after trying all offers in the queue
- this patch: N tasks launched with the very last N offers in the queue + `(N x 
single_task_launch_latency)`

Assuming that matching N tasks against M offers takes exactly the same time as 
1 task against M offers (as they all share the same `TaskGroup`), the only 
measurable difference comes from the additional `N x 
single_task_launch_latency` overhead. Based on real cluster observations, the 
`single_task_launch_latency` is less than 1% of a single task scheduling 
attempt, which is << than the savings from avoided additional scheduling 
rounds. 

As far as jmh results go, the new approach (batching + multiple tasks per 
round) is only slightly more demanding (~8%). Both results though are MUCH 
higher than the real cluster perf, which just confirms we are not bound by CPU 
time here:

Master:
```
Benchmark
Mode  Cnt  Score Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
thrpt   10  17126.183 ± 488.425  ops/s
```

This patch:
```
Benchmark
Mode  Cnt  Score Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
thrpt   10  15838.051 ± 187.890  ops/s
```


Diffs (updated)
-

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
6f1cbfbc4510a037cffc95fee54f62f463d2b534 
  src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
87b9e1928ab2d44668df1123f32ffdc4197c0c70 
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
664bc6cf964ede2473a4463e58bcdbcb65bc7413 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroup.java 
5d319557057e27fd5fc6d3e553e9ca9139399c50 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
d390c07522d22e43d79ce4370985f3643ef021ca 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
207d38d1ddfd373892602218a98c1daaf4a1325f 
  src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 
7f7b4358ef05c0f0d0e14daac1a5c25488467dc9 
  
src/test/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilterTest.java
 ece476b918e6f2c128039e561eea23a94d8ed396 
  src/test/java/org/apache/aurora/scheduler/filter/AttributeAggregateTest.java 
209f9298a1d55207b9b41159f2ab366f92c1eb70 
  
src/test/java/org/apache/aurora/scheduler/filter/SchedulingFilterImplTest.java 
0cf23df9f373c0d9b27e55a12adefd5f5fd81ba5 
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
c1c3eca4a6e6c88dab6b1c69fae3e2f290b58039 
  
src/test/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilterTest.java
 ee5c6528af89cc62a35fdb314358c489556d8131 
  src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorImplTest.java 
98048fabc00f233925b6cca015c2525980556e2b 
  src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorModuleTest.java 
2c3e5f32c774be07a5fa28c8bcf3b9a5d88059a1 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
88729626de5fa87b45472792c59cc0ff1ade3e93 
  
src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 
a4e87d2216401f344dca64d69b945de7bcf8159a 
  src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java 
b4d27f69ad5d4cce03da9f04424dc35d30e8af29 

Diff: https://reviews.apache.org/r/51929/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-27 Thread Maxim Khutornenko


> On Sept. 23, 2016, 7:52 p.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroup.java, line 55
> > <https://reviews.apache.org/r/51929/diff/4/?file=1506607#file1506607line55>
> >
> > javadoc might be useful here to make it clear that the collection is a 
> > list of task ids to remove? (or maybe just rename param to `taskIds`?)

Renamed the param.


> On Sept. 23, 2016, 7:52 p.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java, line 
> > 127
> > <https://reviews.apache.org/r/51929/diff/4/?file=1506608#file1506608line127>
> >
> > I know we have the `@Positive` annotation on the arg, but for the sake 
> > of safety (it's possible to initialize these settings from any source) it's 
> > probably worth ensuring `maxTasksPerScheduler` is > 0 here?
> > 
> > If we do that, it might also make sense to move the `checkArgument` on 
> > `firstScheduleDelay` from the constructor below into this constructor?

Good point. Done and done.


> On Sept. 23, 2016, 7:52 p.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java, line 
> > 143
> > <https://reviews.apache.org/r/51929/diff/4/?file=1506608#file1506608line143>
> >
> > If we're going to store settings as an instance variable, we should 
> > reference the other settings from it as well (and kill off those 
> > variables), alternately just store `maxTasksPerScheduler` rather than 
> > settings itself)?

Agree, there are a few refactoring waves colliding here and it's definitely 
time to simplify/remove duplication. Done.


> On Sept. 23, 2016, 7:52 p.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java, line 172
> > <https://reviews.apache.org/r/51929/diff/4/?file=1506610#file1506610line172>
> >
> > If we're just assuming `taskId` is present, why wrap it in an optional? 
> > (Presumably on the initial pass through the loop we know `taskIds` was not 
> > empty, and on each subsequent pass we check `remainingTasks.hasNext` before 
> > re-assigning `taskId`). Am I missing something?

Good catch. There was a legitimate use of the `Optional` in the 
original diff but it's no longer needed after we changed to return `Set<String` 
instead of `Map<String, Boolean>`.


- Maxim


-------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/#review150225
---


On Sept. 20, 2016, 10:02 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51929/
> ---
> 
> (Updated Sept. 20, 2016, 10:02 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is phase 2 of scheduling perf improvement effort started in 
> https://reviews.apache.org/r/51759/.
> 
> We can now take multiple (configurable) number of task IDs from a given 
> `TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
> and assign more than one task if possible. This approach delivers 
> substantially better MTTA and still ensures fairness across multiple 
> `TaskGroups`. We have observed almost linear improvement in MTTA (4x+ with 5 
> tasks per round), which suggest the `max_tasks_per_schedule_attempt` can be 
> set even higher if the majority of cluster jobs have large number of 
> instances and/or update batch sizes.
> 
> As far as a single round perf goes, we can consider the following 2 
> worst-case scenarios:
> - master: single task scheduling fails after trying all offers in the queue
> - this patch: N tasks launched with the very last N offers in the queue + `(N 
> x single_task_launch_latency)`
> 
> Assuming that matching N tasks against M offers takes exactly the same time 
> as 1 task against M offers (as they all share the same `TaskGroup`), the only 
> measurable difference comes from the additional `N x 
> single_task_launch_latency` overhead. Based on real cluster observations, the 
> `single_task_launch_latency` is less than 1% of a single task scheduling 
> attempt, which is << than the savings from avoided additional scheduling 
> rounds. 
> 
> As far as jmh results go, the new approach (batching + multiple tasks per 
> round) is only slightly more demanding (~8%). Both results though are MUCH 
>

Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-22 Thread Maxim Khutornenko


> On Sept. 22, 2016, 10:34 p.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java, 
> > line 114
> > <https://reviews.apache.org/r/51929/diff/4/?file=1506605#file1506605line114>
> >
> > Is there a way of doing this without making `AttributeAggregate` 
> > explicitly mutable?
> > 
> > I don't like how we have to update this from the offer out of band 
> > instead of just refreshing from storage.
> > 
> > Why can't we just have a `refresh` method which re-runs the original 
> > query after we consume the offer?
> > 
> > I'm weary of bugs where we forget to call this method after scheduling 
> > with an offer.

The whole purpose of having the `AttributeAggregate` is to avoid reconstructing 
the job state every time a view into attributes is required. The current 
approach saves an expensive (especially in the `DBTaskStore case` and a large 
job combo) fetch across task and attribute stores. The underlying data 
structure of the `AttributeAggregate` is just a Multiset of host attributes and 
their values. Throwing all that data away and reconstructing it again just to 
have a new host representation added is very wasteful, especially given that we 
already have all new attributes handy.

As for not forgetting to call it, I don't see how reconstructing would be any 
better in that regard as we would still have to remember to reconstruct it out 
of band. The `TaskAssignerImplTest.testAssignPartialNoVetoes` covers 
`AttributeAggregate` update case by the way.


> On Sept. 22, 2016, 10:34 p.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java, line 
> > 179
> > <https://reviews.apache.org/r/51929/diff/4/?file=1506608#file1506608line179>
> >
> > We don't check if this is null later, so why not just make it an empty 
> > `ImmutableSet` to start?

I don't see any benefits in that. This is the classic assign-or-throw pattern. 
The `result.get()` returns a Set and we don't pass nulls for collections.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/#review150085
---


On Sept. 20, 2016, 10:02 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51929/
> ---
> 
> (Updated Sept. 20, 2016, 10:02 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is phase 2 of scheduling perf improvement effort started in 
> https://reviews.apache.org/r/51759/.
> 
> We can now take multiple (configurable) number of task IDs from a given 
> `TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
> and assign more than one task if possible. This approach delivers 
> substantially better MTTA and still ensures fairness across multiple 
> `TaskGroups`. We have observed almost linear improvement in MTTA (4x+ with 5 
> tasks per round), which suggest the `max_tasks_per_schedule_attempt` can be 
> set even higher if the majority of cluster jobs have large number of 
> instances and/or update batch sizes.
> 
> As far as a single round perf goes, we can consider the following 2 
> worst-case scenarios:
> - master: single task scheduling fails after trying all offers in the queue
> - this patch: N tasks launched with the very last N offers in the queue + `(N 
> x single_task_launch_latency)`
> 
> Assuming that matching N tasks against M offers takes exactly the same time 
> as 1 task against M offers (as they all share the same `TaskGroup`), the only 
> measurable difference comes from the additional `N x 
> single_task_launch_latency` overhead. Based on real cluster observations, the 
> `single_task_launch_latency` is less than 1% of a single task scheduling 
> attempt, which is << than the savings from avoided additional scheduling 
> rounds. 
> 
> As far as jmh results go, the new approach (batching + multiple tasks per 
> round) is only slightly more demanding (~8%). Both results though are MUCH 
> higher than the real cluster perf, which just confirms we are not bound by 
> CPU time here:
> 
> Master:
> ```
> Benchmark
> Mode  Cnt  Score Error  Units
> SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
> thrpt   10  17

Re: Review Request 52167: Prepare release notes for release, all update committers guide with details on this step and on reverting auto-generated commits for subsequent release candidates.

2016-09-22 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52167/#review150053
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 22, 2016, 6:06 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52167/
> ---
> 
> (Updated Sept. 22, 2016, 6:06 p.m.)
> 
> 
> Review request for Aurora and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Prepare release notes for release, all update committers guide with details 
> on this step and on reverting auto-generated commits for subsequent release 
> candidates.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md 854bcbdf4e0dc40d0e18532642903e67232d2c9a 
>   docs/development/committers-guide.md 
> ebf464bf887772862bf9d420a8f65720038275bc 
> 
> Diff: https://reviews.apache.org/r/52167/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>



Re: Review Request 52141: Shutting down scheduler on unhandled BatchWorker error.

2016-09-22 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52141/
---

(Updated Sept. 22, 2016, 3:20 p.m.)


Review request for Aurora, Stephan Erb and Zameer Manji.


Changes
---

Removing redundant shutdown hook.


Bugs: AURORA-1779
https://issues.apache.org/jira/browse/AURORA-1779


Repository: aurora


Description
---

##Problem

The current `BatchWorker` error handling assumes graceful recovery if any of 
the individual batch items fail. This may not hold true if the storage is 
modified _before_ the error is thrown and the `LogStorage` transaction is not 
rolled back. Consider the following fragment from the `StateManagerImpl`:
```
storeProvider.getUnsafeTaskStore().saveTasks(scheduledTasks);

for (IScheduledTask scheduledTask : scheduledTasks) {
  updateTaskAndExternalState(
  storeProvider.getUnsafeTaskStore(),
  Tasks.id(scheduledTask),
  Optional.of(scheduledTask),
  Optional.of(PENDING),
  Optional.absent());
}
```
If a task is saved but the subsequent `updateTaskAndExternalState()` throws AND 
`BatchWorker` catches and logs that exception, the native store transaction 
will be committed and as a result the storage will be left in a logically 
inconsistent state.

##Solution

I can see at least 3 ways to solve this problem:
1. Fail hard and shutdown scheduler when any of the batch items throw;
2. Fail only the last batch (drop all its items) and let storage transaction 
roll back;
3. Implement `BatchWorker` transaction where items _before_ and _after_ the 
failed item would be retried when the storage transaction rolls back;

The #3 is the most involved and would be very hard to get right (assuming batch 
items are idempotent, which may not be the case). The #2 may result in a very 
hard to troubleshoot scenario where _some_ items would be dropped on the floor 
and never completed. 

I suggest we go with #1 as the most straightforward and transparent approach. 
It's also the only one that ensures storage state consistency, which is the 
most vital invariant to preserve (as AURORA-1779 proves). The only downside of 
this approach is that scheduler will go down hard any time an unhandled error 
is thrown but arguably this is the easiest way to improve our codebase and 
certainly better than leaving the scheduler in a crippled state.


Diffs (updated)
-

  src/main/java/org/apache/aurora/scheduler/BatchWorker.java 
e05d4b48749b0691902a505bea1b4490fdd08f29 
  src/test/java/org/apache/aurora/scheduler/BatchWorkerTest.java 
a86dc82cafa7f5436d2b8d00c6db575ff8311eea 

Diff: https://reviews.apache.org/r/52141/diff/


Testing
---

unit and e2e tests


Thanks,

Maxim Khutornenko



Re: Review Request 52141: Shutting down scheduler on unhandled BatchWorker error.

2016-09-22 Thread Maxim Khutornenko


> On Sept. 22, 2016, 2:24 p.m., Joshua Cohen wrote:
> > What's the behavior pre-BatchWorker in this case? Would we fail hard, or is 
> > there something inherent to the batch worker that makes this necessary?
> > 
> > Also, this fixes the error handling that led to corruption in the event of 
> > a failure, but should we also determine the root cause that led to the 
> > failure to begin with (as part of another review no doubt)?

The pre-BatchWorker behavior depended on where/how the error was thrown and 
could have resulted in a hard shutdown or a mysterious silent async failure 
killing one of the executor threads. With the BatchWorker, the scheduler will 
always fail on unhandled exception. This is needed to preserve the transaction 
boundaries and let the error rollback the changes.

> ...but should we also determine the root cause that led to the failure to 
> begin with  

I thought it should have been clear from the RB description but then I realized 
I explained it in Slack and not here. We purposefully induce scheduler 
failovers many times during the e2e run to verify things like job updates still 
work. If the shutdown happens during the transaction, the DB gets closed 
underneath and the subsequent queries fail any way they like. This is exactly 
what happened in this case. A task is saved, driver exits, StateManager 
attempts to save task events but fails due to "DB closed" error. That error is 
caught by the BatchWorker and the native log transaction successfully completes 
before the scheduler fully terminates.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52141/#review14
-------


On Sept. 22, 2016, 5:38 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52141/
> ---
> 
> (Updated Sept. 22, 2016, 5:38 a.m.)
> 
> 
> Review request for Aurora, Stephan Erb and Zameer Manji.
> 
> 
> Bugs: AURORA-1779
> https://issues.apache.org/jira/browse/AURORA-1779
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> ##Problem
> 
> The current `BatchWorker` error handling assumes graceful recovery if any of 
> the individual batch items fail. This may not hold true if the storage is 
> modified _before_ the error is thrown and the `LogStorage` transaction is not 
> rolled back. Consider the following fragment from the `StateManagerImpl`:
> ```
> storeProvider.getUnsafeTaskStore().saveTasks(scheduledTasks);
> 
> for (IScheduledTask scheduledTask : scheduledTasks) {
>   updateTaskAndExternalState(
>   storeProvider.getUnsafeTaskStore(),
>   Tasks.id(scheduledTask),
>   Optional.of(scheduledTask),
>   Optional.of(PENDING),
>   Optional.absent());
> }
> ```
> If a task is saved but the subsequent `updateTaskAndExternalState()` throws 
> AND `BatchWorker` catches and logs that exception, the native store 
> transaction will be committed and as a result the storage will be left in a 
> logically inconsistent state.
> 
> ##Solution
> 
> I can see at least 3 ways to solve this problem:
> 1. Fail hard and shutdown scheduler when any of the batch items throw;
> 2. Fail only the last batch (drop all its items) and let storage transaction 
> roll back;
> 3. Implement `BatchWorker` transaction where items _before_ and _after_ the 
> failed item would be retried when the storage transaction rolls back;
> 
> The #3 is the most involved and would be very hard to get right (assuming 
> batch items are idempotent, which may not be the case). The #2 may result in 
> a very hard to troubleshoot scenario where _some_ items would be dropped on 
> the floor and never completed. 
> 
> I suggest we go with #1 as the most straightforward and transparent approach. 
> It's also the only one that ensures storage state consistency, which is the 
> most vital invariant to preserve (as AURORA-1779 proves). The only downside 
> of this approach is that scheduler will go down hard any time an unhandled 
> error is thrown but arguably this is the easiest way to improve our codebase 
> and certainly better than leaving the scheduler in a crippled state.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/BatchWorker.java 
> e05d4b48749b0691902a505bea1b4490fdd08f29 
>   src/main/java/org/apache/aurora/scheduler/SchedulerModule.java 
> 2ec3967ddb1d470cf681de062a6400f647978185 
>   src/main/java/org/apac

Re: Review Request 52141: Shutting down scheduler on unhandled BatchWorker error.

2016-09-22 Thread Maxim Khutornenko


> On Sept. 22, 2016, 2:25 p.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 152
> > <https://reviews.apache.org/r/52141/diff/1/?file=1507901#file1507901line152>
> >
> > Something is wrong here.
> > 
> > The listener from `GuavaUtils.LifecycleShutdownListener` should have 
> > already been applied to this service in `SchedulerServicesModule`. We used 
> > `addSchedulerActiveServiceBinding` to add the BatchWorker services so I'm 
> > not sure why this helps at all.

Ah, that's right, that should be enough. So, the fix is even simpler, just let 
the error bubble up and the existing shutdown hook will take care of the rest. 
Verified in vagrant that's exactly what happens.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52141/#review15
-------


On Sept. 22, 2016, 5:38 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52141/
> ---
> 
> (Updated Sept. 22, 2016, 5:38 a.m.)
> 
> 
> Review request for Aurora, Stephan Erb and Zameer Manji.
> 
> 
> Bugs: AURORA-1779
> https://issues.apache.org/jira/browse/AURORA-1779
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> ##Problem
> 
> The current `BatchWorker` error handling assumes graceful recovery if any of 
> the individual batch items fail. This may not hold true if the storage is 
> modified _before_ the error is thrown and the `LogStorage` transaction is not 
> rolled back. Consider the following fragment from the `StateManagerImpl`:
> ```
> storeProvider.getUnsafeTaskStore().saveTasks(scheduledTasks);
> 
> for (IScheduledTask scheduledTask : scheduledTasks) {
>   updateTaskAndExternalState(
>   storeProvider.getUnsafeTaskStore(),
>   Tasks.id(scheduledTask),
>   Optional.of(scheduledTask),
>   Optional.of(PENDING),
>   Optional.absent());
> }
> ```
> If a task is saved but the subsequent `updateTaskAndExternalState()` throws 
> AND `BatchWorker` catches and logs that exception, the native store 
> transaction will be committed and as a result the storage will be left in a 
> logically inconsistent state.
> 
> ##Solution
> 
> I can see at least 3 ways to solve this problem:
> 1. Fail hard and shutdown scheduler when any of the batch items throw;
> 2. Fail only the last batch (drop all its items) and let storage transaction 
> roll back;
> 3. Implement `BatchWorker` transaction where items _before_ and _after_ the 
> failed item would be retried when the storage transaction rolls back;
> 
> The #3 is the most involved and would be very hard to get right (assuming 
> batch items are idempotent, which may not be the case). The #2 may result in 
> a very hard to troubleshoot scenario where _some_ items would be dropped on 
> the floor and never completed. 
> 
> I suggest we go with #1 as the most straightforward and transparent approach. 
> It's also the only one that ensures storage state consistency, which is the 
> most vital invariant to preserve (as AURORA-1779 proves). The only downside 
> of this approach is that scheduler will go down hard any time an unhandled 
> error is thrown but arguably this is the easiest way to improve our codebase 
> and certainly better than leaving the scheduler in a crippled state.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/BatchWorker.java 
> e05d4b48749b0691902a505bea1b4490fdd08f29 
>   src/main/java/org/apache/aurora/scheduler/SchedulerModule.java 
> 2ec3967ddb1d470cf681de062a6400f647978185 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
> 7c8047a7235a937c29fe96517242923ff84a080c 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
> d390c07522d22e43d79ce4370985f3643ef021ca 
>   src/test/java/org/apache/aurora/scheduler/BatchWorkerTest.java 
> a86dc82cafa7f5436d2b8d00c6db575ff8311eea 
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
> 8556253fc11f6027316871eb9dc66d8627a77fe6 
> 
> Diff: https://reviews.apache.org/r/52141/diff/
> 
> 
> Testing
> ---
> 
> unit and e2e tests
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>



Review Request 52141: Shutting down scheduler on unhandled BatchWorker error.

2016-09-21 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52141/
---

Review request for Aurora, Stephan Erb and Zameer Manji.


Bugs: AURORA-1779
https://issues.apache.org/jira/browse/AURORA-1779


Repository: aurora


Description
---

##Problem

The current `BatchWorker` error handling assumes graceful recovery if any of 
the individual batch items fail. This may not hold true if the storage is 
modified _before_ the error is thrown and the `LogStorage` transaction is not 
rolled back. Consider the following fragment from the `StateManagerImpl`:
```
storeProvider.getUnsafeTaskStore().saveTasks(scheduledTasks);

for (IScheduledTask scheduledTask : scheduledTasks) {
  updateTaskAndExternalState(
  storeProvider.getUnsafeTaskStore(),
  Tasks.id(scheduledTask),
  Optional.of(scheduledTask),
  Optional.of(PENDING),
  Optional.absent());
}
```
If a task is saved but the subsequent `updateTaskAndExternalState()` throws AND 
`BatchWorker` catches and logs that exception, the native store transaction 
will be committed and as a result the storage will be left in a logically 
inconsistent state.

##Solution

I can see at least 3 ways to solve this problem:
1. Fail hard and shutdown scheduler when any of the batch items throw;
2. Fail only the last batch (drop all its items) and let storage transaction 
roll back;
3. Implement `BatchWorker` transaction where items _before_ and _after_ the 
failed item would be retried when the storage transaction rolls back;

The #3 is the most involved and would be very hard to get right (assuming batch 
items are idempotent, which may not be the case). The #2 may result in a very 
hard to troubleshoot scenario where _some_ items would be dropped on the floor 
and never completed. 

I suggest we go with #1 as the most straightforward and transparent approach. 
It's also the only one that ensures storage state consistency, which is the 
most vital invariant to preserve (as AURORA-1779 proves). The only downside of 
this approach is that scheduler will go down hard any time an unhandled error 
is thrown but arguably this is the easiest way to improve our codebase and 
certainly better than leaving the scheduler in a crippled state.


Diffs
-

  src/main/java/org/apache/aurora/scheduler/BatchWorker.java 
e05d4b48749b0691902a505bea1b4490fdd08f29 
  src/main/java/org/apache/aurora/scheduler/SchedulerModule.java 
2ec3967ddb1d470cf681de062a6400f647978185 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
7c8047a7235a937c29fe96517242923ff84a080c 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
d390c07522d22e43d79ce4370985f3643ef021ca 
  src/test/java/org/apache/aurora/scheduler/BatchWorkerTest.java 
a86dc82cafa7f5436d2b8d00c6db575ff8311eea 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
8556253fc11f6027316871eb9dc66d8627a77fe6 

Diff: https://reviews.apache.org/r/52141/diff/


Testing
---

unit and e2e tests


Thanks,

Maxim Khutornenko



Re: Review Request 52074: switching from launchTask to acceptOffers

2016-09-21 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52074/#review149862
---


Ship it!





src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverService.java 
(line 124)
<https://reviews.apache.org/r/52074/#comment217589>

nit: add newline above



src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java (line 372)
<https://reviews.apache.org/r/52074/#comment217592>

Can be inlined below.


- Maxim Khutornenko


On Sept. 20, 2016, 7:14 p.m., Dmitriy Shirchenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52074/
> ---
> 
> (Updated Sept. 20, 2016, 7:14 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko, Mehrdad Nurolahzade, and Zameer 
> Manji.
> 
> 
> Bugs: AURORA-1776
> https://issues.apache.org/jira/browse/AURORA-1776
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Dynamic reservation requires us to use acceptOffers so this is in preparation 
> of that work. acceptOffers is the call to dynamically reserve resources.
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/fakes/FakeDriver.java 
> 1bfd8dbf3aec314aaeb89b756ea3a2049df8777a 
>   src/main/java/org/apache/aurora/scheduler/mesos/Driver.java 
> 14481468926b82a9bad3ee78c85f176e8ba893e2 
>   src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverService.java 
> 41b9aab55aa439b13eba3545703ec97b44690444 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 
> 925c025d21bf1d44e0c1d319f6653ecfa8899481 
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 
> a739bce1226d9435fa7d0b18e411064a4e78e49e 
> 
> Diff: https://reviews.apache.org/r/52074/diff/
> 
> 
> Testing
> ---
> 
> src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Dmitriy Shirchenko
> 
>



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-21 Thread Maxim Khutornenko


> On Sept. 20, 2016, 12:28 a.m., Kai Huang wrote:
> > src/main/python/apache/aurora/executor/common/status_checker.py, line 104
> > <https://reviews.apache.org/r/51876/diff/3/?file=1505371#file1505371line104>
> >
> > Use TaskState.Value('TASK_RUNNING') here instead of 
> > mesos_pb2.TASK_RUNNING, because this file also used TaskState in multiple 
> > places.

Not sure I get this. What prevents you from using `mesos_pb2.TASK_RUNNING` here 
and keep `TaskState` in other places?


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review149581
---


On Sept. 20, 2016, 12:25 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 20, 2016, 12:25 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when a successful required number of health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when a successful required number of health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message was sent if the health checker fails to reach the 
> required number of health checks within that period, or a maximum number of 
> failed health check limit is reached after the initital_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will sends 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning 
> initial_interval_secs.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/common/task_info.py 
> 4ef49e30eeb942886d02c1fb448055264f9aa874 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
>   src/test/python/apache/aurora/executor/common/test_task_info.py 
> 0d9cc5cb340d697a887d8e001f23c948f4fa70c7 
>   src/test/python/apache/aurora/executor/test_status_manager.py 
> ce4679ba1aa7b42cf0115c943d84663030182d23 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> 0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 
> 
> Diff: https://reviews.apache.org/r/51876/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51876: Modify executor state transition logic to rely on health checks (if enabled)

2016-09-21 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review149846
---




api/src/main/thrift/org/apache/aurora/gen/api.thrift (lines 1218 - 1220)
<https://reviews.apache.org/r/51876/#comment217562>

What's the motivation behind moving these constants into the thrift layer? 
FWICT, these are only used inside the `health_checker.py` and should stay there.



src/main/python/apache/aurora/executor/aurora_executor.py (line 80)
<https://reviews.apache.org/r/51876/#comment217565>

This does not have to be a global field, does it? It looks like 
`_register()` only adds callbacks but nothing ever reads from it.



src/main/python/apache/aurora/executor/aurora_executor.py (lines 184 - 185)
<https://reviews.apache.org/r/51876/#comment217566>

Are there any guarantees the status_manager will always deal with 
`ThermosTaskRunner.EXIT_STATE_MAP` values? The way I see it any unmapped state 
will end up not calling the `_shutdown()` at all. Perhaps a safer way could be 
having an explicit pair of `_running()` to be called only for RUNNING state and 
`_shutdown()` act as a default callback for anything else? Curious what others 
think here.



src/main/python/apache/aurora/executor/common/health_checker.py (line 140)
<https://reviews.apache.org/r/51876/#comment217570>

leftover?



src/main/python/apache/aurora/executor/common/health_checker.py (lines 153 - 
155)
<https://reviews.apache.org/r/51876/#comment217578>

This should be moved into the `_maybe_update_failure_count()` to have all 
decision making logic in a single place.



src/main/python/apache/aurora/executor/common/health_checker.py (line 225)
<https://reviews.apache.org/r/51876/#comment217582>

This should be `mesos_pb2.TASK_RUNNING`, right?



src/main/python/apache/aurora/executor/common/health_checker.py (lines 270 - 
274)
<https://reviews.apache.org/r/51876/#comment217563>

This logic needs to be refactored to reduce duplication and reuse what's 
now in `is_health_check_enabled` as much as possible. Ideally, we should have 
the only place we extract `health_checker` and the like.



src/main/python/apache/aurora/executor/status_manager.py (line 68)
<https://reviews.apache.org/r/51876/#comment217567>

This is exactly the problem I alluded to above. What if there is no entry 
for a given `status` here?


- Maxim Khutornenko


On Sept. 20, 2016, 12:25 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 20, 2016, 12:25 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when a successful required number of health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when a successful required number of health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message was sent if the health checker fails to reach the 
> required number of health checks within that period, or a maximum number of 
> failed health check limit is reached after the initital_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will sends 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning 
> initial_interval_secs.
> 
> 
> D

Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-20 Thread Maxim Khutornenko


> On Sept. 20, 2016, 6:49 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java, 
> > lines 93-96
> > <https://reviews.apache.org/r/51929/diff/3/?file=1500698#file1500698line93>
> >
> > Regarding your notes in the RB description: I don't see a problem if we 
> > set this to a slightly higher value such as `10` or `15`. 
> > 
> > It seems like we will maintain the basic taskgroup round robin 
> > scheduling fairness even with slightly larger batch sizes, so I am ok with 
> > bumping the value.

The higher this count is the less fair do we treat lower instance count jobs in 
case the capacity of the cluster is constrained. It's hard to come up with an 
ideal default here but I feel 10-15 would be a bit too extreme for the general 
case.


> On Sept. 20, 2016, 6:49 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java, lines 
> > 174-175
> > <https://reviews.apache.org/r/51929/diff/3/?file=1500700#file1500700line174>
> >
> > If I understand things correctly, I believe this line could have a 
> > performance bug in it:
> > 
> > `batchWorker.execute` acquires the global storage lock before calling 
> > the `taskScheduler`. For the latter, we use the following definition:
> > 
> > ```
> > this.taskScheduler = (store, taskIds) -> {
> >   settings.rateLimiter.acquire();
> >   return taskScheduler.schedule(store, taskIds);
> > };
> > ```
> > 
> > In combination, we will be throttled by the `rateLimiter` while holding 
> > the storage lock.
> > 
> > Instead, we should try to acquire the log within the `Runnable` in 
> > `startGroup` before calling `batchWorker.execute` so that the global lock 
> > is not hold longer than absolutely necessary.

This is a valid concern. It's certainly more possible than before to hit our 
default 40 tasks per second limit now. Fixed.


- Maxim


-------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/#review149681
---


On Sept. 16, 2016, 9:53 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51929/
> ---
> 
> (Updated Sept. 16, 2016, 9:53 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is phase 2 of scheduling perf improvement effort started in 
> https://reviews.apache.org/r/51759/.
> 
> We can now take multiple (configurable) number of task IDs from a given 
> `TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
> and assign more than one task if possible. This approach delivers 
> substantially better MTTA and still ensures fairness across multiple 
> `TaskGroups`. We have observed almost linear improvement in MTTA (4x+ with 5 
> tasks per round), which suggest the `max_tasks_per_schedule_attempt` can be 
> set even higher if the majority of cluster jobs have large number of 
> instances and/or update batch sizes.
> 
> As far as a single round perf goes, we can consider the following 2 
> worst-case scenarios:
> - master: single task scheduling fails after trying all offers in the queue
> - this patch: N tasks launched with the very last N offers in the queue + `(N 
> x single_task_launch_latency)`
> 
> Assuming that matching N tasks against M offers takes exactly the same time 
> as 1 task against M offers (as they all share the same `TaskGroup`), the only 
> measurable difference comes from the additional `N x 
> single_task_launch_latency` overhead. Based on real cluster observations, the 
> `single_task_launch_latency` is less than 1% of a single task scheduling 
> attempt, which is << than the savings from avoided additional scheduling 
> rounds. 
> 
> As far as jmh results go, the new approach (batching + multiple tasks per 
> round) is only slightly more demanding (~8%). Both results though are MUCH 
> higher than the real cluster perf, which just confirms we are not bound by 
> CPU time here:
> 
> Master:
> ```
> Benchmark
> Mode  Cnt  Score Error  Units
> SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
> thrpt   10  17126.183

Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-20 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/
---

(Updated Sept. 20, 2016, 10:02 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Stephan's comments.


Repository: aurora


Description
---

This is phase 2 of scheduling perf improvement effort started in 
https://reviews.apache.org/r/51759/.

We can now take multiple (configurable) number of task IDs from a given 
`TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
and assign more than one task if possible. This approach delivers substantially 
better MTTA and still ensures fairness across multiple `TaskGroups`. We have 
observed almost linear improvement in MTTA (4x+ with 5 tasks per round), which 
suggest the `max_tasks_per_schedule_attempt` can be set even higher if the 
majority of cluster jobs have large number of instances and/or update batch 
sizes.

As far as a single round perf goes, we can consider the following 2 worst-case 
scenarios:
- master: single task scheduling fails after trying all offers in the queue
- this patch: N tasks launched with the very last N offers in the queue + `(N x 
single_task_launch_latency)`

Assuming that matching N tasks against M offers takes exactly the same time as 
1 task against M offers (as they all share the same `TaskGroup`), the only 
measurable difference comes from the additional `N x 
single_task_launch_latency` overhead. Based on real cluster observations, the 
`single_task_launch_latency` is less than 1% of a single task scheduling 
attempt, which is << than the savings from avoided additional scheduling 
rounds. 

As far as jmh results go, the new approach (batching + multiple tasks per 
round) is only slightly more demanding (~8%). Both results though are MUCH 
higher than the real cluster perf, which just confirms we are not bound by CPU 
time here:

Master:
```
Benchmark
Mode  Cnt  Score Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
thrpt   10  17126.183 ± 488.425  ops/s
```

This patch:
```
Benchmark
Mode  Cnt  Score Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
thrpt   10  15838.051 ± 187.890  ops/s
```


Diffs (updated)
-

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
6f1cbfbc4510a037cffc95fee54f62f463d2b534 
  src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
87b9e1928ab2d44668df1123f32ffdc4197c0c70 
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
664bc6cf964ede2473a4463e58bcdbcb65bc7413 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroup.java 
5d319557057e27fd5fc6d3e553e9ca9139399c50 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
d390c07522d22e43d79ce4370985f3643ef021ca 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
207d38d1ddfd373892602218a98c1daaf4a1325f 
  src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 
7f7b4358ef05c0f0d0e14daac1a5c25488467dc9 
  
src/test/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilterTest.java
 ece476b918e6f2c128039e561eea23a94d8ed396 
  src/test/java/org/apache/aurora/scheduler/filter/AttributeAggregateTest.java 
209f9298a1d55207b9b41159f2ab366f92c1eb70 
  
src/test/java/org/apache/aurora/scheduler/filter/SchedulingFilterImplTest.java 
0cf23df9f373c0d9b27e55a12adefd5f5fd81ba5 
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
c1c3eca4a6e6c88dab6b1c69fae3e2f290b58039 
  
src/test/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilterTest.java
 ee5c6528af89cc62a35fdb314358c489556d8131 
  src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorImplTest.java 
98048fabc00f233925b6cca015c2525980556e2b 
  src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorModuleTest.java 
2c3e5f32c774be07a5fa28c8bcf3b9a5d88059a1 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
88729626de5fa87b45472792c59cc0ff1ade3e93 
  
src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 
a4e87d2216401f344dca64d69b945de7bcf8159a 
  src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java 
b4d27f69ad5d4cce03da9f04424dc35d30e8af29 

Diff: https://reviews.apache.org/r/51929/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-20 Thread Maxim Khutornenko


> On Sept. 16, 2016, 9:08 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java, line 
> > 197
> > <https://reviews.apache.org/r/51929/diff/1/?file=1499323#file1499323line197>
> >
> > Side show: Isn't that `if` unnecessary here and we can adjust the 
> > penality in any case? We will remove the group if `hasMore()` returns 
> > false, so any penality should be fine.
> 
> Maxim Khutornenko wrote:
> Not sure I follow. This is the place that applies penalty accrued inside 
> the `startGroup()` or removes the group if it's empty.
> 
> Stephan Erb wrote:
> Let me print some more code to make clearer what I meant.
> 
> This is the code where we compute penaltyMs depending on `group.hasMore`
> ```
> scheduledTaskPenalties.accumulate(group.getPenaltyMs());
> group.remove(scheduled);
> if (group.hasMore()) {
>   penaltyMs = firstScheduleDelay;
> }
>   }
> }
> 
> group.setPenaltyMs(penaltyMs);
> evaluateGroupLater(this, group);
> ```
> 
> 
> Later on we then drop empty groups in `evaluateGroupLater`:
> ```
>   private synchronized void evaluateGroupLater(Runnable evaluate, 
> TaskGroup group) {
> // Avoid check-then-act by holding the intrinsic lock.  If not done 
> atomically, we could
> // remove a group while a task is being added to it.
> if (group.hasMore()) {
>   executor.execute(evaluate, Amount.of(group.getPenaltyMs(), 
> Time.MILLISECONDS));
> } else {
>   groups.remove(group.getKey());
> }
>   }
> ```
> 
> What I tried to say is that we could unconditionally write `penaltyMs = 
> firstScheduleDelay;` without the `if (group.hasMore())` in the first snippet. 
> If `group.hasMore()` returns false we will remove the group anyway, so it 
> does not matter if we set a new penality or not.
> 
> This is completely unreleated to the change in this RB so feel free to 
> ignore it. It was more about checking my own understanding of the code.

I see what you meant now. The slight problem with the simplification you 
propose is that it _may_ result in a penalty where it would not happen today: 
`startGroup()` calculates the penalty even though there are no more groups left 
but the call to `evaluateGroupLater()` is delayed for some reason AND a new 
task is added into the group thus recharging it. The penalty would be applied 
the moment `evaluateGroupLater()` is finally reached. It's certainly an edge 
case but I'd prefer not changing the current behavior in the this patch.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/#review149179
---


On Sept. 16, 2016, 9:53 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51929/
> ---
> 
> (Updated Sept. 16, 2016, 9:53 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is phase 2 of scheduling perf improvement effort started in 
> https://reviews.apache.org/r/51759/.
> 
> We can now take multiple (configurable) number of task IDs from a given 
> `TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
> and assign more than one task if possible. This approach delivers 
> substantially better MTTA and still ensures fairness across multiple 
> `TaskGroups`. We have observed almost linear improvement in MTTA (4x+ with 5 
> tasks per round), which suggest the `max_tasks_per_schedule_attempt` can be 
> set even higher if the majority of cluster jobs have large number of 
> instances and/or update batch sizes.
> 
> As far as a single round perf goes, we can consider the following 2 
> worst-case scenarios:
> - master: single task scheduling fails after trying all offers in the queue
> - this patch: N tasks launched with the very last N offers in the queue + `(N 
> x single_task_launch_latency)`
> 
> Assuming that matching N tasks against M offers takes exactly the same time 
> as 1 task against M offers (as they all share the same `TaskGroup`), the only 
> measurable difference comes from the additional `N x 
> single_task_launch_latency` overhead. Based on real cluster observations, the 
&

Re: Review Request 52087: Fix host maintenance commands to properly initialize the api client.

2016-09-20 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52087/#review149690
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 20, 2016, 5:32 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52087/
> ---
> 
> (Updated Sept. 20, 2016, 5:32 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Fix host maintenance commands to properly initialize the api client.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/admin/admin.py 
> 9fc89a2842d4651ac10aa82118531c450f051712 
>   src/main/python/apache/aurora/admin/admin_util.py 
> 394deb57af9ad8832a02ceab15f33b3c1e5c902b 
>   src/main/python/apache/aurora/admin/host_maintenance.py 
> 677f870c50d1e19d34de36a3c677bb1dcd255ba0 
>   src/main/python/apache/aurora/admin/maintenance.py 
> bf446515222a1246b7fe32acd67315f8145a1212 
>   src/test/python/apache/aurora/admin/test_admin.py 
> f720742c50f774b9f5a41ba57ef251d08d79e8d1 
>   src/test/python/apache/aurora/admin/test_admin_sla.py 
> 54b5a823903b82c2082353ade132eb7b63e518db 
>   src/test/python/apache/aurora/admin/util.py 
> d0a915cd8edee6b36d671b41f43a3af4fe751ae7 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> e36726e0c0154e51bba40704e13eebec5c0f0d65 
> 
> Diff: https://reviews.apache.org/r/52087/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ran e2e tests.
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>



Re: Review Request 51980: Refactor of Webhook and no longer posting entire task state via webhook on scheduler restart

2016-09-20 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51980/#review149674
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 19, 2016, 6:28 p.m., Dmitriy Shirchenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51980/
> ---
> 
> (Updated Sept. 19, 2016, 6:28 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko, Stephan Erb, and Zameer Manji.
> 
> 
> Bugs: AURORA-1772
> https://issues.apache.org/jira/browse/AURORA-1772
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is a refactor with addition of HttpClient injected into Webhook class as 
> opposed to previous usage of lower level HtttpURLConnection objects. 
> Additionally due to peformance issues, it is unncessary to POST entire task 
> state to webhook endpoint on every scheduler restart so that is removed in 
> this patch.
> 
> 
> Diffs
> -
> 
>   build.gradle d5a3a7a3bdb8349de6fc01d4a6271b32d942e531 
>   src/main/java/org/apache/aurora/scheduler/events/Webhook.java 
> e54aa19d67278fcb5586f07c399f09062f845a18 
>   src/main/java/org/apache/aurora/scheduler/events/WebhookInfo.java 
> e4193f7518e66d210a6d5aae22a9f04e2d4984b3 
>   src/main/java/org/apache/aurora/scheduler/events/WebhookModule.java 
> eaa70533a64740c74cebf15739b7142e2d3a4d11 
>   src/main/resources/org/apache/aurora/scheduler/webhook.json 
> 00787985510d7d415b8074bef06d28b635c78b09 
>   src/test/java/org/apache/aurora/scheduler/events/WebhookTest.java 
> 488eefd14c3e67a41a75c809397c8d19f83cc08a 
> 
> Diff: https://reviews.apache.org/r/51980/diff/
> 
> 
> Testing
> ---
> 
> Part of reason for refactor is to allow easier testing so cleaner unit tests 
> were added with more code coverage.
> 
> 
> Thanks,
> 
> Dmitriy Shirchenko
> 
>



Re: Review Request 51993: Added the 'reason' to the /pendingTasks endpoint

2016-09-19 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51993/#review149487
---




src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java (line 138)
<https://reviews.apache.org/r/51993/#comment217136>

I don't think this is the right approach. `TaskGroups` has no business in 
accessing the `NearestFit` functionality. 

Please, consider extracting 
[this](https://github.com/apache/aurora/blob/795a2728c623c35bd509d582c24684a6921c74ad/src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java#L194-L208)
 and making it a public method of the `NearestFit` itself (e.g.: 
`getPendingReason(Query.Builder query)`. You can query that method and marry 
the results right within the `PendingTasks`.


- Maxim Khutornenko


On Sept. 17, 2016, 9:36 p.m., Pradyumna Kaushik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51993/
> ---
> 
> (Updated Sept. 17, 2016, 9:36 p.m.)
> 
> 
> Review request for Aurora.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Added the 'reason' to the /pendingTasks endpoint
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/http/PendingTasks.java 
> c80e0c8adf80e12082a6952ae79b7d9cc960c5b6 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroup.java 
> 5d319557057e27fd5fc6d3e553e9ca9139399c50 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
> d390c07522d22e43d79ce4370985f3643ef021ca 
>   src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
> 88729626de5fa87b45472792c59cc0ff1ade3e93 
> 
> Diff: https://reviews.apache.org/r/51993/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Pradyumna Kaushik
> 
>



Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-16 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/#review149290
---



@ReviewBot retry

- Maxim Khutornenko


On Sept. 16, 2016, 9:53 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51929/
> ---
> 
> (Updated Sept. 16, 2016, 9:53 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is phase 2 of scheduling perf improvement effort started in 
> https://reviews.apache.org/r/51759/.
> 
> We can now take multiple (configurable) number of task IDs from a given 
> `TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
> and assign more than one task if possible. This approach delivers 
> substantially better MTTA and still ensures fairness across multiple 
> `TaskGroups`. We have observed almost linear improvement in MTTA (4x+ with 5 
> tasks per round), which suggest the `max_tasks_per_schedule_attempt` can be 
> set even higher if the majority of cluster jobs have large number of 
> instances and/or update batch sizes.
> 
> As far as a single round perf goes, we can consider the following 2 
> worst-case scenarios:
> - master: single task scheduling fails after trying all offers in the queue
> - this patch: N tasks launched with the very last N offers in the queue + `(N 
> x single_task_launch_latency)`
> 
> Assuming that matching N tasks against M offers takes exactly the same time 
> as 1 task against M offers (as they all share the same `TaskGroup`), the only 
> measurable difference comes from the additional `N x 
> single_task_launch_latency` overhead. Based on real cluster observations, the 
> `single_task_launch_latency` is less than 1% of a single task scheduling 
> attempt, which is << than the savings from avoided additional scheduling 
> rounds. 
> 
> As far as jmh results go, the new approach (batching + multiple tasks per 
> round) is only slightly more demanding (~8%). Both results though are MUCH 
> higher than the real cluster perf, which just confirms we are not bound by 
> CPU time here:
> 
> Master:
> ```
> Benchmark
> Mode  Cnt  Score Error  Units
> SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
> thrpt   10  17126.183 ± 488.425  ops/s
> ```
> 
> This patch:
> ```
> Benchmark
> Mode  Cnt  Score Error  Units
> SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
> thrpt   10  15838.051 ± 187.890  ops/s
> ```
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 6f1cbfbc4510a037cffc95fee54f62f463d2b534 
>   src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
> 87b9e1928ab2d44668df1123f32ffdc4197c0c70 
>   src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
> 664bc6cf964ede2473a4463e58bcdbcb65bc7413 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroup.java 
> 5d319557057e27fd5fc6d3e553e9ca9139399c50 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
> d390c07522d22e43d79ce4370985f3643ef021ca 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
> 207d38d1ddfd373892602218a98c1daaf4a1325f 
>   src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 
> 7f7b4358ef05c0f0d0e14daac1a5c25488467dc9 
>   
> src/test/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilterTest.java
>  ece476b918e6f2c128039e561eea23a94d8ed396 
>   
> src/test/java/org/apache/aurora/scheduler/filter/AttributeAggregateTest.java 
> 209f9298a1d55207b9b41159f2ab366f92c1eb70 
>   
> src/test/java/org/apache/aurora/scheduler/filter/SchedulingFilterImplTest.java
>  0cf23df9f373c0d9b27e55a12adefd5f5fd81ba5 
>   src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
> c1c3eca4a6e6c88dab6b1c69fae3e2f290b58039 
>   
> src/test/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilterTest.java
>  ee5c6528af89cc62a35fdb314358c489556d8131 
>   src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorImplTest.java 
> 98048fabc00f233925b6cca015c2525980556e2b 
>   
> src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorModuleTest.java 
> 2c3e5f32c774

Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-16 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/
---

(Updated Sept. 16, 2016, 9:53 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Repository: aurora


Description (updated)
---

This is phase 2 of scheduling perf improvement effort started in 
https://reviews.apache.org/r/51759/.

We can now take multiple (configurable) number of task IDs from a given 
`TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
and assign more than one task if possible. This approach delivers substantially 
better MTTA and still ensures fairness across multiple `TaskGroups`. We have 
observed almost linear improvement in MTTA (4x+ with 5 tasks per round), which 
suggest the `max_tasks_per_schedule_attempt` can be set even higher if the 
majority of cluster jobs have large number of instances and/or update batch 
sizes.

As far as a single round perf goes, we can consider the following 2 worst-case 
scenarios:
- master: single task scheduling fails after trying all offers in the queue
- this patch: N tasks launched with the very last N offers in the queue + `(N x 
single_task_launch_latency)`

Assuming that matching N tasks against M offers takes exactly the same time as 
1 task against M offers (as they all share the same `TaskGroup`), the only 
measurable difference comes from the additional `N x 
single_task_launch_latency` overhead. Based on real cluster observations, the 
`single_task_launch_latency` is less than 1% of a single task scheduling 
attempt, which is << than the savings from avoided additional scheduling 
rounds. 

As far as jmh results go, the new approach (batching + multiple tasks per 
round) is only slightly more demanding (~8%). Both results though are MUCH 
higher than the real cluster perf, which just confirms we are not bound by CPU 
time here:

Master:
```
Benchmark
Mode  Cnt  Score Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
thrpt   10  17126.183 ± 488.425  ops/s
```

This patch:
```
Benchmark
Mode  Cnt  Score Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
thrpt   10  15838.051 ± 187.890  ops/s
```


Diffs
-

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
6f1cbfbc4510a037cffc95fee54f62f463d2b534 
  src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
87b9e1928ab2d44668df1123f32ffdc4197c0c70 
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
664bc6cf964ede2473a4463e58bcdbcb65bc7413 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroup.java 
5d319557057e27fd5fc6d3e553e9ca9139399c50 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
d390c07522d22e43d79ce4370985f3643ef021ca 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
207d38d1ddfd373892602218a98c1daaf4a1325f 
  src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 
7f7b4358ef05c0f0d0e14daac1a5c25488467dc9 
  
src/test/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilterTest.java
 ece476b918e6f2c128039e561eea23a94d8ed396 
  src/test/java/org/apache/aurora/scheduler/filter/AttributeAggregateTest.java 
209f9298a1d55207b9b41159f2ab366f92c1eb70 
  
src/test/java/org/apache/aurora/scheduler/filter/SchedulingFilterImplTest.java 
0cf23df9f373c0d9b27e55a12adefd5f5fd81ba5 
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
c1c3eca4a6e6c88dab6b1c69fae3e2f290b58039 
  
src/test/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilterTest.java
 ee5c6528af89cc62a35fdb314358c489556d8131 
  src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorImplTest.java 
98048fabc00f233925b6cca015c2525980556e2b 
  src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorModuleTest.java 
2c3e5f32c774be07a5fa28c8bcf3b9a5d88059a1 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
88729626de5fa87b45472792c59cc0ff1ade3e93 
  
src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 
a4e87d2216401f344dca64d69b945de7bcf8159a 
  src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java 
b4d27f69ad5d4cce03da9f04424dc35d30e8af29 

Diff: https://reviews.apache.org/r/51929/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-16 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/
---

(Updated Sept. 16, 2016, 9:53 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Rebased over master.


Repository: aurora


Description
---

This is phase 2 of scheduling perf improvement effort started in 
https://reviews.apache.org/r/51759/.

We can now take multiple (configurable) number of task IDs from a given 
`TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
and assign more than one task if possible. This approach delivers substantially 
better MTTA and still ensures fairness across multiple `TaskGroups`. We have 
observed almost linear improvement in MTTA (4x+ with 5 tasks per round), which 
suggest the `max_tasks_per_schedule_attempt` can be set even higher if the 
majority of cluster jobs have large number of instances and/or update batch 
sizes.

As far as a single round perf goes, we can consider the following 2 worst-case 
scenarios:
- master: single task scheduling fails after trying all offers in the queue
- this patch: N tasks launched with the very last N offers in the queue + `(N x 
single_task_launch_latency)`

Assuming that matching N tasks against M offers takes exactly the same time as 
1 task against M offers (as they all share the same `TaskGroup`), the only 
measurable difference comes from the additional `N x 
single_task_launch_latency` overhead. Based on real cluster observations, the 
`single_task_launch_latency` is less than 1% of a single task scheduling 
attempt, which is << than the savings from avoided additional scheduling 
rounds. 

As far as jmh results go, the new approach (batching + multiple tasks per 
round) is only slightly more demanding (~8%). Both results though are MUCH 
higher than the real cluster perf, which just confirms we are not bound by CPU 
time here:

Master:
```
Benchmark
Mode  Cnt  Score Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
thrpt   10  17126.183 ± 488.425  ops/s
```

This patch:
```
Benchmark
Mode  Cnt  Score Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
thrpt   10  15838.051 ± 187.890  ops/s
```

NOTE: this will not apply cleanly as it branched off of 
https://reviews.apache.org/r/51765, which itself depends on 
https://reviews.apache.org/r/51759/.


Diffs (updated)
-

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
6f1cbfbc4510a037cffc95fee54f62f463d2b534 
  src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
87b9e1928ab2d44668df1123f32ffdc4197c0c70 
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
664bc6cf964ede2473a4463e58bcdbcb65bc7413 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroup.java 
5d319557057e27fd5fc6d3e553e9ca9139399c50 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
d390c07522d22e43d79ce4370985f3643ef021ca 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
207d38d1ddfd373892602218a98c1daaf4a1325f 
  src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 
7f7b4358ef05c0f0d0e14daac1a5c25488467dc9 
  
src/test/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilterTest.java
 ece476b918e6f2c128039e561eea23a94d8ed396 
  src/test/java/org/apache/aurora/scheduler/filter/AttributeAggregateTest.java 
209f9298a1d55207b9b41159f2ab366f92c1eb70 
  
src/test/java/org/apache/aurora/scheduler/filter/SchedulingFilterImplTest.java 
0cf23df9f373c0d9b27e55a12adefd5f5fd81ba5 
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
c1c3eca4a6e6c88dab6b1c69fae3e2f290b58039 
  
src/test/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilterTest.java
 ee5c6528af89cc62a35fdb314358c489556d8131 
  src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorImplTest.java 
98048fabc00f233925b6cca015c2525980556e2b 
  src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorModuleTest.java 
2c3e5f32c774be07a5fa28c8bcf3b9a5d88059a1 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
88729626de5fa87b45472792c59cc0ff1ade3e93 
  
src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 
a4e87d2216401f344dca64d69b945de7bcf8159a 
  src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java 
b4d27f69ad5d4cce03da9f04424dc35d30e8af29 

Diff: https://reviews.apache.org/r/51929/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-16 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/
---

(Updated Sept. 16, 2016, 9:04 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Rebased.


Repository: aurora


Description
---

This is the second part of the `BatchWorker` conversion work that moves cron 
jobs to use non-blocking kill followups and reduces the number of trigger 
threads. See https://reviews.apache.org/r/51759 for more background on the 
`BatchWorker`.

#Problem
The current implementation of the cron scheduling relies on a large number of 
threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
killing existing tasks according to `KILL_EXISTING` collision policy. This 
creates large spikes of activities at synchronized intervals as users tend to 
schedule their cron runs around similar schedules. Moreover, the current 
implementation re-acquires write locks multiple times to deliver on 
`KILL_EXISTING` policy. 

#Remediation
Trigger level batching is still done in a blocking way but multiple cron 
triggers may be bundled together to share the same write transaction. Any 
followups, however, are performed in a non-blocking way by relying on a 
`BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. In 
order to still ensure non-concurrent execution of a given job key trigger, a 
token (job key) is saved within the trigger itself. A concurrent trigger will 
bail if a kill followup is still in progress (token is set AND no entry in 
`killFollowups` set exists yet).

#Results
The above approach allowed reducing the number of cron threads to 10 and likely 
can be reduced even further. See https://reviews.apache.org/r/51759 for the 
lock contention results.


Diffs (updated)
-

  commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
  commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
bc30990d57f444f7d64805ed85c363f1302736d0 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
c07551e94f9221b5b21c5dc9715e82caa290c2e8 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
155d702d68367b247dd066f773c662407f0e3b5b 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
5c64ff2994e200b3453603ac5470e8e152cebc55 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 

Diff: https://reviews.apache.org/r/51763/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51973: Fix for AURORA-1739, enables golang thrift bindings to create jobs

2016-09-16 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51973/#review149271
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 16, 2016, 8:49 p.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51973/
> ---
> 
> (Updated Sept. 16, 2016, 8:49 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Change in the thrift API to make thee cronSchedule string in JobConfiguration 
> an optional.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md 411872b0244698b4ca74228bc21da608dcb98ae0 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> a045a21585fe40e63e2094fa103f205e7883eb35 
> 
> Diff: https://reviews.apache.org/r/51973/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Re: Review Request 51973: Fix for AURORA-1739, enables golang thrift bindings to create jobs

2016-09-16 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51973/#review149266
---



Not that it changes much but since it's technically a schema change a release 
note would be great.

- Maxim Khutornenko


On Sept. 16, 2016, 8:36 p.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51973/
> ---
> 
> (Updated Sept. 16, 2016, 8:36 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Change in the thrift API to make thee cronSchedule string in JobConfiguration 
> an optional.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> a045a21585fe40e63e2094fa103f205e7883eb35 
> 
> Diff: https://reviews.apache.org/r/51973/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> 
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Re: Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-16 Thread Maxim Khutornenko


> On Sept. 15, 2016, 6:25 p.m., Zameer Manji wrote:
> > config/findbugs/excludeFilter.xml, line 123
> > <https://reviews.apache.org/r/51763/diff/4/?file=1498752#file1498752line123>
> >
> > If I'm reading the code correctly we need this because we have a 
> > `CompletableFuture` and to give it a value we use `null`.
> > 
> > Have you considered using `new Object()` instead so we don't have to 
> > add this exception to our rules?
> 
> Maxim Khutornenko wrote:
> I am not comfortable substituting a `Void` with a random `Object` just to 
> avoid analyzer error. This opens up a potential for much more confusing cases.
> 
> John Sirois wrote:
> Generally a `SideEffect` type with 1 instance would be handy for these 
> things.  Something to consider outside this review.
> 
> IE:
> ```java
> package org.apache.aurora.lang;
> 
> public enum SideEffect {
>   COMPLETE;
> }
> ```
> 
> I'm not sure if others would find `CompletableFuture` and the 
> corresponding `return SideEffect.COMPLETE` an improvement in these sorts of 
> situations.  It is explicit.

I don't think enum is useful here as it does not deliver any additional signal. 
How about using a `NoResult` interface instance instead? Updated the parent RB.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/#review149093
---


On Sept. 14, 2016, 11:12 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51763/
> ---
> 
> (Updated Sept. 14, 2016, 11:12 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the second part of the `BatchWorker` conversion work that moves cron 
> jobs to use non-blocking kill followups and reduces the number of trigger 
> threads. See https://reviews.apache.org/r/51759 for more background on the 
> `BatchWorker`.
> 
> #Problem
> The current implementation of the cron scheduling relies on a large number of 
> threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
> killing existing tasks according to `KILL_EXISTING` collision policy. This 
> creates large spikes of activities at synchronized intervals as users tend to 
> schedule their cron runs around similar schedules. Moreover, the current 
> implementation re-acquires write locks multiple times to deliver on 
> `KILL_EXISTING` policy. 
> 
> #Remediation
> Trigger level batching is still done in a blocking way but multiple cron 
> triggers may be bundled together to share the same write transaction. Any 
> followups, however, are performed in a non-blocking way by relying on a 
> `BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. 
> In order to still ensure non-concurrent execution of a given job key trigger, 
> a token (job key) is saved within the trigger itself. A concurrent trigger 
> will bail if a kill followup is still in progress (token is set AND no entry 
> in `killFollowups` set exists yet).
> 
> #Results
> The above approach allowed reducing the number of cron threads to 10 and 
> likely can be reduced even further. See https://reviews.apache.org/r/51759 
> for the lock contention results.
> 
> 
> Diffs
> -
> 
>   commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
> 8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
>   commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
> bc30990d57f444f7d64805ed85c363f1302736d0 
>   config/findbugs/excludeFilter.xml fe3f4ca5db1484124af14421a3349950dfec8519 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
> c07551e94f9221b5b21c5dc9715e82caa290c2e8 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
> 155d702d68367b247dd066f773c662407f0e3b5b 
>   
> src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
> 5c64ff2994e200b3453603ac5470e8e152cebc55 
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
> 1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 
> 
> Diff: https://reviews.apache.org/r/51763/diff/
> 
> 
> Testing
> ---
> 
> All types of testing including deploying to test and production clusters.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-16 Thread Maxim Khutornenko
/scheduler/BatchWorkerTest.java PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 
99c27e8012f10a67ce5f1b84d258e7a5608995c7 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskThrottlerTest.java 
7d104aa2ea4a4d99be4711f666d18beca238284e 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 94f5ca565476f62d72879837a0e7dafabcf30432 
  src/test/java/org/apache/aurora/scheduler/testing/BatchWorkerUtil.java 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
196df4754b553f05e50b66ad2f84271901bc9eba 

Diff: https://reviews.apache.org/r/51759/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-16 Thread Maxim Khutornenko


> On Sept. 15, 2016, 10:58 a.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, lines 166-167
> > <https://reviews.apache.org/r/51759/diff/7/?file=1498729#file1498729line166>
> >
> > super nitpicky: mind swapping the order of these args to keep inline 
> > with `execute(Work work)`?

This ordering is deliberate to improve reading at the call site. The verbose 
lambda is coming last in the `executeWithReplay` subjectively making it easier 
to read (https://reviews.apache.org/r/51763/). If you disagree I can reorder 
though.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review149044
---


On Sept. 16, 2016, 8:03 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 16, 2016, 8:03 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects
  the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reducing the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized on particular workload types: task even processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies got down between 2x and 5x 
> depending on the cluster load. A number of other approaches tried but 
> discarded as not performing well or even performing much worse than the 
> current master:
> - Clock-driven batch execution - every batch is dispatched on a time schedule
> - Max batch with a deadline - a batch is dispatched when max size is reached 
> OR a timeout expires
> - Various combinations of the above - some `BatchWorkers` are using 
> clock-driven execution while others are using max batch with a deadline
> - Completely non-blocking (event-based) completion notification - all call 
> sites are notified of item completion via a `BatchWorkCompleted` event
> 
> Happy to provide more details on the above if interested.
> 
> #Upcoming
> The introduction of the `BatchWorker` by itself was not enough to 
> substantially improve the MTTA. It, however, paves the way for the next phase 
> of scheduling perf improvement - taking more than 1 task from a given 
> `TaskGroup` in a single scheduling ro

Re: Review Request 51765: Batching writes - Part 3 (of 3): Converting TaskScheduler to use BatchWorker.

2016-09-16 Thread Maxim Khutornenko


> On Sept. 16, 2016, 6:28 p.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java, 
> > line 91
> > <https://reviews.apache.org/r/51765/diff/5/?file=1498765#file1498765line91>
> >
> > Curious, arbitrary or derrived?

Arbitrary derived :) There is no perfect default here though. Just trying to be 
conservative especially given the upcoming https://reviews.apache.org/r/51929/. 
After all is done, we will have 15x improvement in scheduling rate derived from 
a single `write()` transaction.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51765/#review149229
---


On Sept. 14, 2016, 11:18 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51765/
> ---
> 
> (Updated Sept. 14, 2016, 11:18 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the final part of the `BatchWorker` conversion work that converts 
> `TaskScheduler`. See https://reviews.apache.org/r/51759 for more background 
> on the `BatchWorker`.
> 
> #Problem
> See https://reviews.apache.org/r/51759
> 
> #Remediation
> Task scheduling is one of the most dominant users of the write lock. It's 
> also one of the heaviest and the most latency-sensitive. As such, the default 
> max batch size is chosen conservatively low (3) and batch items are executed 
> in a blocking way. 
> 
> BTW, attempting to make task scheduling non-blocking resulted in a much worse 
> scheduling performance. The way our `DBTaskStore` is wired, all async 
> activities, including `EventBus` are bound to use a single async `Executor`, 
> which is currently limited at 8 threads [1]. Relying on the same `EventBus` 
> to deliver scheduling completion events resulted in slower scheduling perf as 
> those events were backed up behind all other activities, including tasks 
> status events, reconciliation and etc. Increasing the executor thread pool 
> size to a larger number on the other side, also increased the lock contention 
> defeating the whole purpose of this work.
> 
> #Results
> See https://reviews.apache.org/r/51759 for the lock contention results.
> 
> https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L51-L54
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 9d0d40b82653fb923bed16d06546288a1576c21d 
>   src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
> 11e8033438ad0808e446e41bb26b3fa4c04136c7 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
> c044ebe6f72183a67462bbd8e5be983eb592c3e9 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
> d266f6a25ae2360db2977c43768a19b1f1efe8ff 
>   src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
> c2ceb4e7685a9301f8014a9183e02fbad65bca26 
>   src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
> 95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79 
>   
> src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java
>  72562e6bd9a9860c834e6a9faa094c28600a8fed 
> 
> Diff: https://reviews.apache.org/r/51765/diff/
> 
> 
> Testing
> ---
> 
> All types of testing including deploying to test and production clusters.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>



Re: Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-16 Thread Maxim Khutornenko


> On Sept. 15, 2016, 6:25 p.m., Zameer Manji wrote:
> > config/findbugs/excludeFilter.xml, line 123
> > <https://reviews.apache.org/r/51763/diff/4/?file=1498752#file1498752line123>
> >
> > If I'm reading the code correctly we need this because we have a 
> > `CompletableFuture` and to give it a value we use `null`.
> > 
> > Have you considered using `new Object()` instead so we don't have to 
> > add this exception to our rules?

I am not comfortable substituting a `Void` with a random `Object` just to avoid 
analyzer error. This opens up a potential for much more confusing cases.


> On Sept. 15, 2016, 6:25 p.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java, 
> > line 78
> > <https://reviews.apache.org/r/51763/diff/4/?file=1498753#file1498753line78>
> >
> > Just to be clear, this annotation is needed to ensure the data in 
> > `JobDataMap` is persisted properly?

Correct.


> On Sept. 15, 2016, 6:25 p.m., Zameer Manji wrote:
> > src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java, line 100
> > <https://reviews.apache.org/r/51763/diff/4/?file=1498756#file1498756line100>
> >
> > For integration tests I was under the impression we used 
> > `org.apache.aurora.scheduler.testing.FakeStatsProvider`

The `FakeStatsProvider` is only useful when we want to get a reading on updated 
metrics. Since everything else that can post stats is mocked, there is no 
reason to use a real object here.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/#review149093
---


On Sept. 14, 2016, 11:12 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51763/
> ---
> 
> (Updated Sept. 14, 2016, 11:12 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the second part of the `BatchWorker` conversion work that moves cron 
> jobs to use non-blocking kill followups and reduces the number of trigger 
> threads. See https://reviews.apache.org/r/51759 for more background on the 
> `BatchWorker`.
> 
> #Problem
> The current implementation of the cron scheduling relies on a large number of 
> threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
> killing existing tasks according to `KILL_EXISTING` collision policy. This 
> creates large spikes of activities at synchronized intervals as users tend to 
> schedule their cron runs around similar schedules. Moreover, the current 
> implementation re-acquires write locks multiple times to deliver on 
> `KILL_EXISTING` policy. 
> 
> #Remediation
> Trigger level batching is still done in a blocking way but multiple cron 
> triggers may be bundled together to share the same write transaction. Any 
> followups, however, are performed in a non-blocking way by relying on a 
> `BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. 
> In order to still ensure non-concurrent execution of a given job key trigger, 
> a token (job key) is saved within the trigger itself. A concurrent trigger 
> will bail if a kill followup is still in progress (token is set AND no entry 
> in `killFollowups` set exists yet).
> 
> #Results
> The above approach allowed reducing the number of cron threads to 10 and 
> likely can be reduced even further. See https://reviews.apache.org/r/51759 
> for the lock contention results.
> 
> 
> Diffs
> -
> 
>   commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
> 8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
>   commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
> bc30990d57f444f7d64805ed85c363f1302736d0 
>   config/findbugs/excludeFilter.xml fe3f4ca5db1484124af14421a3349950dfec8519 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
> c07551e94f9221b5b21c5dc9715e82caa290c2e8 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
> 155d702d68367b247dd066f773c662407f0e3b5b 
>   
> src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
> 5c64ff2994e200b3453603ac5470e8e152cebc55 
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
> 1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 
> 
> Diff: https://reviews.apache.org/r/51763/diff/
> 
> 
> Testing
> ---
> 
> All types of testing including deploying to test and production clusters.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-16 Thread Maxim Khutornenko


> On Sept. 16, 2016, 6:18 p.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 213
> > <https://reviews.apache.org/r/51759/diff/7/?file=1498729#file1498729line213>
> >
> > I think this is a misuse of `CompletalbleFuture` I discovered this as I 
> > was reviewing https://reviews.apache.org/r/51765/.
> > 
> > I think here we need to use 
> > https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html#completeExceptionally-java.lang.Throwable-
> >  here and set the exception into the future.
> > 
> > Then the exception will propagate to the caller (like 
> > `TaskGroupBatchWorker`).
> > 
> > Then the user of the batch worker can handle the exception.

Good catch. This is a leftover after refactoring `AwaitableResult` into 
`CompletableFuture` and it makes perfect sense to channel that error upstream.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review149227
-------


On Sept. 14, 2016, 10:41 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 10:41 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects
  the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reducing the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized on particular workload types: task even processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies got down between 2x and 5x 
> depending on the cluster load. A number of other approaches tried but 
> discarded as not performing well or even performing much worse than the 
> current master:
> - Clock-driven batch execution - every batch is dispatched on a time schedule
> - Max batch with a deadline - a batch is dispatched when max size is reached 
> OR a timeout expires
> - Various combinations of the above - some `BatchWorkers` are using 
> clock-driven execution while others are using max batch with a deadline
> - Completely non-blocking (event-based) completion notification - all call 
> sites are notified of item completion via a `BatchWor

Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-16 Thread Maxim Khutornenko
esting
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-16 Thread Maxim Khutornenko


> On Sept. 16, 2016, 9:08 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java, line 
> > 197
> > <https://reviews.apache.org/r/51929/diff/1/?file=1499323#file1499323line197>
> >
> > Side show: Isn't that `if` unnecessary here and we can adjust the 
> > penality in any case? We will remove the group if `hasMore()` returns 
> > false, so any penality should be fine.

Not sure I follow. This is the place that applies penalty accrued inside the 
`startGroup()` or removes the group if it's empty.


> On Sept. 16, 2016, 9:08 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java, 
> > lines 208-213
> > <https://reviews.apache.org/r/51929/diff/1/?file=1499324#file1499324line208>
> >
> > I would have expected that we only request preemption if we failed to 
> > schedule all tasks in the current group.  If I remember correctly, 
> > preemption slot search only happens on a per-group basis anyway.
> > 
> > You have probably thought about this, so I would like to understand 
> > your reasoning.

We still want to attempt a preemption if _some_ but not _all_ tasks are 
scheduled within a given round, right? Otherwise, preemption becomes an 
all-or-nothing feature and we have to wait for another scheduling cycle to 
request a reservation.


> On Sept. 16, 2016, 9:08 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java, line 174
> > <https://reviews.apache.org/r/51929/diff/1/?file=1499325#file1499325line174>
> >
> > Shouldn't that only happen if `launchTask` has succeeded?

I've debated that as well and decided it's more logical to finish accessing 
offer details before it's being launched with and removed from the 
`OfferManager`.

BTW, I just realized I was missing a `break` statement to bail out from the 
scheduling round if a `LaunchException` is thrown. While theoretically we 
_could_ continue matching even after the `LaunchException`, it's hard to reason 
about the state of the storage and the last assigned item and as such it's 
safer to terminate than continue. Fixed and added a test case. This should now 
be resolved. Thanks for asking :)


> On Sept. 16, 2016, 9:08 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java, lines 
> > 214-220
> > <https://reviews.apache.org/r/51929/diff/1/?file=1499325#file1499325line214>
> >
> > There is additional complexity here and in other places because 
> > `schedule`, `scheduleTask` and `maybeAssign` return a `Map<String, 
> > Boolean>` encoding if scheduling was successful for all given tasks.
> > 
> > I'd propose to change the signature so that all methods mentioned above 
> > return `Set` containing only the successfully launched tasks. 
> > 
> > If a caller is really interested in tasks failed to schedule, he can 
> > compute the set difference between passed and returned task ids.

I was fighting with myself over this initially and decided to keep the original 
contract. Now that the aproach fully shaped up, I can see how it can be 
simplified by not returning the exhaustive map of things. Refactored to return 
only what's needed. Thanks for bringing this up!


- Maxim


-------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/#review149179
---


On Sept. 16, 2016, 12:51 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51929/
> ---
> 
> (Updated Sept. 16, 2016, 12:51 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is phase 2 of scheduling perf improvement effort started in 
> https://reviews.apache.org/r/51759/.
> 
> We can now take multiple (configurable) number of task IDs from a given 
> `TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
> and assign more than one task if possible. This approach delivers 
> substantially better MTTA and still ensures fairness across multiple 
> `TaskGroups`. We have observed almost linear improvement in MTTA (4x+ with 5 
> tasks per round), which suggest the `max_tasks_per_schedule_attempt` can be 
> set even higher if the majority of cluster jobs have large number of 
> instances and/or update batch sizes.
> 
>

Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'Aurora'

2016-09-16 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51874/#review149216
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 15, 2016, 7:02 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51874/
> ---
> 
> (Updated Sept. 15, 2016, 7:02 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1688
> https://issues.apache.org/jira/browse/AURORA-1688
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Change framework_name default value from 'TwitterScheduler' to 'Aurora'
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
>   
> src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
>  8a386bd208956eb0c8c2f48874b0c6fb3af58872 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 97677f24a50963178a123b420d7ac136e4fde3fe 
> 
> Diff: https://reviews.apache.org/r/51874/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> Testing to make sure backward compatibility:
> 
> # HEAD of master:
> 
> # Case 1: Rolling forward does not impact running tasks:
> Renaming framework from 'TwitterScheduler' to 'Aurora':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running task remain unaffected.
> 
> Master log:
> I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket with 
> fd 28: Transport endpoint is not connected
> I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'Aurora' at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora 
> with checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, 
> GPU_RESOURCES ]
> I0914 16:48:43.75  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
> I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> 
> Scheduler log:
> I0914 16:48:44.157 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
> value: "071c44a1-b4d4-4339-a727-03a79f725851-"
> , master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
> ip: 119646400
> port: 5050
> pid: "master@192.168.33.7:5050"
> hostname: "aurora.local"
> version: "1.0.0"
> address {
>   hostname: "aurora.local"
>   ip: "192.168.33.7"
>   port: 5050
> }
> 
> # Case 2: Rolling backward does not impact running tasks:
> Rolling back framework name from 'Aurora' to 'TwitterScheduler':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running task remain unaffected.
> 
> Master log:
> I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:51:33.203526  9812 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614074  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'TwitterScheduler' at 
> scheduler-6fa8b819-aed9-42e1-9c6c-3e4be2f62500@192.168.33.7:8083
> I0914 16:51:49.614215  9813 master.cpp:2500] Subscribing framework 
> TwitterScheduler with checkpointing enabled and capabilities [ 
> REVOCABLE_RESOURCES, GPU_RESOURCES ]
> I0914 16:51:49.614312  

Re: Review Request 51929: Scheduling multiple tasks per round.

2016-09-15 Thread Maxim Khutornenko


> On Sept. 16, 2016, 1:20 a.m., Aurora ReviewBot wrote:
> > Master (783baae) is red with this patch.
> >   ./build-support/jenkins/build.sh
> > 
> >  # Create file stdout for capturing output. 
> > We can't use StringIO mock
> >  # because TestProcess is running fork.
> >  with open(os.path.join(td, 'sys_stdout'), 
> > 'w+') as stdout:
> >    with open(os.path.join(td, 
> > 'sys_stderr'), 'w+') as stderr:
> >  with mutable_sys():
> >    sys.stdout, sys.stderr = stdout, 
> > stderr
> >  
> >    p = TestProcess('process', 'echo 
> > hello world; echo >&2 hello stderr', 0,
> >    taskpath, sandbox, 
> > logger_destination=LoggerDestination.BOTH)
> >    p.start()
> >    rc = 
> > wait_for_rc(taskpath.getpath('process_checkpoint'))
> >  
> >    assert rc == 0
> >    # Check log files were created in 
> > std path with correct content
> >  > assert_log_content(taskpath, 
> > 'stdout', 'hello world\n')
> >  
> >  
> > src/test/python/apache/thermos/core/test_process.py:487: 
> >  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> >  
> >  taskpath =  > at 0x7fdd3cd73b10>
> >  log_name = 'stdout'
> >  expected_content = 'hello world\n'
> >  
> >  def assert_log_content(taskpath, log_name, 
> > expected_content):
> >    log = 
> > taskpath.with_filename(log_name).getpath('process_logdir')
> >    assert os.path.exists(log)
> >    with open(log, 'r') as fp:
> >  >   assert fp.read() == expected_content
> >  E   assert '' == 'hello world\n'
> >  E + hello world
> >  
> >  
> > src/test/python/apache/thermos/core/test_process.py:313: AssertionError
> >   generated xml file: 
> > /home/jenkins/jenkins-slave/workspace/AuroraBot/dist/test-results/415337499eb72578eab327a6487c1f5c9452b3d6.xml
> >  
> >   1 failed, 710 passed, 6 skipped, 1 warnings 
> > in 226.09 seconds 
> >  
> > FAILURE
> > 
> > 
> > 01:19:57 04:18   [complete]
> >FAILURE
> > 
> > 
> > I will refresh this build result if you post a review containing 
> > "@ReviewBot retry"

@ReviewBot retry


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/#review149162
---


On Sept. 16, 2016, 12:51 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51929/
> ---
> 
> (Updated Sept. 16, 2016, 12:51 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is phase 2 of scheduling perf improvement effort started in 
> https://reviews.apache.org/r/51759/.
> 
> We can now take multiple (configurable) number of task IDs from a given 
> `TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
> and assign more than one task if possible. This approach delivers 
> substantially better MTTA and still ensures fairness across multiple 
> `TaskGroups`. We have observed almost linear improvement in MTTA (4x+ with 5 
> tasks per round), which suggest the `max_tasks_per_schedule_attempt` can be 
> set even higher if the majority of cluster jobs have large number of 
> instances and/or update batch sizes.
> 
>

Review Request 51929: Scheduling multiple tasks per round.

2016-09-15 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51929/
---

Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Repository: aurora


Description
---

This is phase 2 of scheduling perf improvement effort started in 
https://reviews.apache.org/r/51759/.

We can now take multiple (configurable) number of task IDs from a given 
`TaskGroup` per scheduling. The idea is to go deeper through the offer queue 
and assign more than one task if possible. This approach delivers substantially 
better MTTA and still ensures fairness across multiple `TaskGroups`. We have 
observed almost linear improvement in MTTA (4x+ with 5 tasks per round), which 
suggest the `max_tasks_per_schedule_attempt` can be set even higher if the 
majority of cluster jobs have large number of instances and/or update batch 
sizes.

As far as a single round perf goes, we can consider the following 2 worst-case 
scenarios:
- master: single task scheduling fails after trying all offers in the queue
- this patch: N tasks launched with the very last N offers in the queue + `(N x 
single_task_launch_latency)`

Assuming that matching N tasks against M offers takes exactly the same time as 
1 task against M offers (as they all share the same `TaskGroup`), the only 
measurable difference comes from the additional `N x 
single_task_launch_latency` overhead. Based on real cluster observations, the 
`single_task_launch_latency` is less than 1% of a single task scheduling 
attempt, which is << than the savings from avoided additional scheduling 
rounds. 

As far as jmh results go, the new approach (batching + multiple tasks per 
round) is only slightly more demanding (~8%). Both results though are MUCH 
higher than the real cluster perf, which just confirms we are not bound by CPU 
time here:

Master:
```
Benchmark
Mode  Cnt  Score Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
thrpt   10  17126.183 ± 488.425  ops/s
```

This patch:
```
Benchmark
Mode  Cnt  Score Error  Units
SchedulingBenchmarks.InsufficientResourcesSchedulingBenchmark.runBenchmark  
thrpt   10  15838.051 ± 187.890  ops/s
```

NOTE: this will not apply cleanly as it branched off of 
https://reviews.apache.org/r/51765, which itself depends on 
https://reviews.apache.org/r/51759/.


Diffs
-

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
9d0d40b82653fb923bed16d06546288a1576c21d 
  src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
87b9e1928ab2d44668df1123f32ffdc4197c0c70 
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
11e8033438ad0808e446e41bb26b3fa4c04136c7 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroup.java 
5d319557057e27fd5fc6d3e553e9ca9139399c50 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
c044ebe6f72183a67462bbd8e5be983eb592c3e9 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
d266f6a25ae2360db2977c43768a19b1f1efe8ff 
  src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 
7f7b4358ef05c0f0d0e14daac1a5c25488467dc9 
  
src/test/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilterTest.java
 ece476b918e6f2c128039e561eea23a94d8ed396 
  src/test/java/org/apache/aurora/scheduler/filter/AttributeAggregateTest.java 
209f9298a1d55207b9b41159f2ab366f92c1eb70 
  
src/test/java/org/apache/aurora/scheduler/filter/SchedulingFilterImplTest.java 
0cf23df9f373c0d9b27e55a12adefd5f5fd81ba5 
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
c2ceb4e7685a9301f8014a9183e02fbad65bca26 
  
src/test/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilterTest.java
 ee5c6528af89cc62a35fdb314358c489556d8131 
  src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorImplTest.java 
98048fabc00f233925b6cca015c2525980556e2b 
  src/test/java/org/apache/aurora/scheduler/preemptor/PreemptorModuleTest.java 
2c3e5f32c774be07a5fa28c8bcf3b9a5d88059a1 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79 
  
src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 
72562e6bd9a9860c834e6a9faa094c28600a8fed 
  src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java 
b4d27f69ad5d4cce03da9f04424dc35d30e8af29 

Diff: https://reviews.apache.org/r/51929/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51924: Remove --release-threshold option from aurora job restart.

2016-09-15 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51924/#review149097
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 15, 2016, 6:13 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51924/
> ---
> 
> (Updated Sept. 15, 2016, 6:13 p.m.)
> 
> 
> Review request for Aurora and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1681
> https://issues.apache.org/jira/browse/AURORA-1681
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Remove --release-threshold option from aurora job restart.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
>   src/main/python/apache/aurora/client/cli/jobs.py 
> 7b4c2692334acfddb53a52a602a5f07e94b4bd86 
> 
> Diff: https://reviews.apache.org/r/51924/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>



Re: Review Request 51765: Batching writes - Part 3 (of 3): Converting TaskScheduler to use BatchWorker.

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51765/
---

(Updated Sept. 14, 2016, 11:17 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Rebased.


Repository: aurora


Description
---

This is the final part of the `BatchWorker` conversion work that converts 
`TaskScheduler`. See https://reviews.apache.org/r/51759 for more background on 
the `BatchWorker`.

#Problem
See https://reviews.apache.org/r/51759

#Remediation
Task scheduling is one of the most dominant users of the write lock. It's also 
one of the heaviest and the most latency-sensitive. As such, the default max 
batch size is chosen conservatively low (3) and batch items are executed in a 
blocking way. 

BTW, attempting to make task scheduling non-blocking resulted in a much worse 
scheduling performance. The way our `DBTaskStore` is wired, all async 
activities, including `EventBus` are bound to use a single async `Executor`, 
which is currently limited at 8 threads [1]. Relying on the same `EventBus` to 
deliver scheduling completion events resulted in slower scheduling perf as 
those events were backed up behind all other activities, including tasks status 
events, reconciliation and etc. Increasing the executor thread pool size to a 
larger number on the other side, also increased the lock contention defeating 
the whole purpose of this work.

#Results
See https://reviews.apache.org/r/51759 for the lock contention results.

https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L51-L54


Diffs (updated)
-

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
9d0d40b82653fb923bed16d06546288a1576c21d 
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
11e8033438ad0808e446e41bb26b3fa4c04136c7 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
c044ebe6f72183a67462bbd8e5be983eb592c3e9 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
d266f6a25ae2360db2977c43768a19b1f1efe8ff 
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
c2ceb4e7685a9301f8014a9183e02fbad65bca26 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79 
  
src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 
72562e6bd9a9860c834e6a9faa094c28600a8fed 

Diff: https://reviews.apache.org/r/51765/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/
---

(Updated Sept. 14, 2016, 11:12 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Rebasing and comments.


Repository: aurora


Description
---

This is the second part of the `BatchWorker` conversion work that moves cron 
jobs to use non-blocking kill followups and reduces the number of trigger 
threads. See https://reviews.apache.org/r/51759 for more background on the 
`BatchWorker`.

#Problem
The current implementation of the cron scheduling relies on a large number of 
threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
killing existing tasks according to `KILL_EXISTING` collision policy. This 
creates large spikes of activities at synchronized intervals as users tend to 
schedule their cron runs around similar schedules. Moreover, the current 
implementation re-acquires write locks multiple times to deliver on 
`KILL_EXISTING` policy. 

#Remediation
Trigger level batching is still done in a blocking way but multiple cron 
triggers may be bundled together to share the same write transaction. Any 
followups, however, are performed in a non-blocking way by relying on a 
`BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. In 
order to still ensure non-concurrent execution of a given job key trigger, a 
token (job key) is saved within the trigger itself. A concurrent trigger will 
bail if a kill followup is still in progress (token is set AND no entry in 
`killFollowups` set exists yet).

#Results
The above approach allowed reducing the number of cron threads to 10 and likely 
can be reduced even further. See https://reviews.apache.org/r/51759 for the 
lock contention results.


Diffs (updated)
-

  commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
  commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
bc30990d57f444f7d64805ed85c363f1302736d0 
  config/findbugs/excludeFilter.xml fe3f4ca5db1484124af14421a3349950dfec8519 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
c07551e94f9221b5b21c5dc9715e82caa290c2e8 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
155d702d68367b247dd066f773c662407f0e3b5b 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
5c64ff2994e200b3453603ac5470e8e152cebc55 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 

Diff: https://reviews.apache.org/r/51763/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-14 Thread Maxim Khutornenko


> On Sept. 14, 2016, 5:40 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java, 
> > lines 164-172
> > <https://reviews.apache.org/r/51763/diff/3/?file=1498135#file1498135line164>
> >
> > I probably don't completely understand the logic, so stupid question 
> > ahead:
> > 
> > Couldn't we do that cleanup of JobDataMap in the handler where you 
> > currently do the `killFollowups.add(key)`? Having this spread out over the 
> > large `doExecute` function doesn't help understandability very much.

Unfortunately, we can't. That was my first thought but the context and its 
detail map are reconstructed for every trigger and are gone the moment a 
trigger is released.


> On Sept. 14, 2016, 5:40 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java, 
> > line 174
> > <https://reviews.apache.org/r/51763/diff/3/?file=1498135#file1498135line174>
> >
> > I will probably find this message confusing if I see it in the logs, in 
> > particular if I have configured the job with KILL_EXISTING.

Any suggestions? There are lots of moving parts addressed in this message, so I 
am not sure how to better explain it without describing how quartz works :)


> On Sept. 14, 2016, 5:40 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java, 
> > line 274
> > <https://reviews.apache.org/r/51763/diff/3/?file=1498135#file1498135line274>
> >
> > The term "followup" is rather generic and difficuelt to understand if 
> > seen in isolation in the log. You should at least mention it is about 
> > killing a cron job.

Good point. Rephrased.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/#review148933
---


On Sept. 14, 2016, 12:23 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51763/
> ---
> 
> (Updated Sept. 14, 2016, 12:23 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the second part of the `BatchWorker` conversion work that moves cron 
> jobs to use non-blocking kill followups and reduces the number of trigger 
> threads. See https://reviews.apache.org/r/51759 for more background on the 
> `BatchWorker`.
> 
> #Problem
> The current implementation of the cron scheduling relies on a large number of 
> threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
> killing existing tasks according to `KILL_EXISTING` collision policy. This 
> creates large spikes of activities at synchronized intervals as users tend to 
> schedule their cron runs around similar schedules. Moreover, the current 
> implementation re-acquires write locks multiple times to deliver on 
> `KILL_EXISTING` policy. 
> 
> #Remediation
> Trigger level batching is still done in a blocking way but multiple cron 
> triggers may be bundled together to share the same write transaction. Any 
> followups, however, are performed in a non-blocking way by relying on a 
> `BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. 
> In order to still ensure non-concurrent execution of a given job key trigger, 
> a token (job key) is saved within the trigger itself. A concurrent trigger 
> will bail if a kill followup is still in progress (token is set AND no entry 
> in `killFollowups` set exists yet).
> 
> #Results
> The above approach allowed reducing the number of cron threads to 10 and 
> likely can be reduced even further. See https://reviews.apache.org/r/51759 
> for the lock contention results.
> 
> 
> Diffs
> -
> 
>   commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
> 8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
>   commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
> bc30990d57f444f7d64805ed85c363f1302736d0 
>   config/findbugs/excludeFilter.xml fe3f4ca5db1484124af14421a3349950dfec8519 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
> c07551e94f9221b5b21c5dc9715e82caa290c2e8 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
> 155d702d68367b247dd066f773c662407f0e3b5b 
>   
> src/test/java/org/apache/aurora/scheduler/cron/

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko


> On Sept. 14, 2016, 6:36 p.m., Zameer Manji wrote:
> > This patch LGTM. Just a single concern here and some questions. I will be 
> > moving on to the subsequent patches shortly.
> > Please do not commit this until all parts are shipped. Also, please don't 
> > forget to rebase/update subsequent patches if changes are made here.
> > 
> > Also, does this cover all state change consumers that interact on storage 
> > when the event is fired or is this a subset? Just wondering if this is 
> > supposed to apply to all of them or just a small set that you think need 
> > this change.
> 
> Maxim Khutornenko wrote:
> > Also, does this cover all state change consumers that interact on 
> storage when the event is fired or is this a subset
> 
> This covers all known high frequency consumers that require a write 
> transaction. The list is derived by running a stack-dumping version of the 
> `CallOrderEnforcingStorage.write()` in a live cluster for awhile and 
> analysing top offenders. It may be not fully comprehensive though, so if you 
> find any other candidates feel free to report them.
> 
> Zameer Manji wrote:
> I think some comments at the consumers to say that batch worker is 
> required for high throughput would be greatly appreciated for future 
> contributers.

I'd rather avoid comment copypasta everywhere we use `BatchWorker`. I thought 
it's enough to trace the execution from the callsite to the `BatchWorker` 
itself to see how/why it's used, no?


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148950
---


On Sept. 14, 2016, 10:41 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 10:41 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects
  the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reducing the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized on particular workload types: task even processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach has been heavily tested in production and delivered the 
> best results. The lock contention latencies got down between 2x and 5x 
> depending on the cluster load. A number of other app

Re: Review Request 51874: Change framework_name default value from 'TwitterScheduler' to 'Aurora'

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51874/#review148988
---




src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
 (line 82)
<https://reviews.apache.org/r/51874/#comment216516>

Did you try to rollback to pre 0.15 scheduler while changing the framework 
name? Trying to see if we can drop this 'backwards incompatible' statement now.


- Maxim Khutornenko


On Sept. 14, 2016, 8:58 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51874/
> ---
> 
> (Updated Sept. 14, 2016, 8:58 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1688
> https://issues.apache.org/jira/browse/AURORA-1688
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Change framework_name default value from 'TwitterScheduler' to 'Aurora'
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md ad2c68a6defe07c94480d7dee5b1496b50dc34e5 
>   
> src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
>  8a386bd208956eb0c8c2f48874b0c6fb3af58872 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 97677f24a50963178a123b420d7ac136e4fde3fe 
> 
> Diff: https://reviews.apache.org/r/51874/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> Testing to make sure backward compatibility:
> 
> Case 1: Rolling forward does not impact running tasks:
> Renaming framework from 'TwitterScheduler' to 'Aurora':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running task remain unaffected.
> 
> Master log:
> I0914 16:48:28.408182  9815 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (TwitterScheduler) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:48:28.408226  9815 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> E0914 16:48:28.408617  9819 process.cpp:2105] Failed to shutdown socket with 
> fd 28: Transport endpoint is not connected
> I0914 16:48:43.722126  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'Aurora' at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> I0914 16:48:43.722190  9813 master.cpp:2500] Subscribing framework Aurora 
> with checkpointing enabled and capabilities [ REVOCABLE_RESOURCES, 
> GPU_RESOURCES ]
> I0914 16:48:43.75  9813 master.cpp:2564] Updating info for framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722256  9813 master.cpp:2577] Framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-75517c8f-5913-49e9-8cc4-342a78c9bbcb@192.168.33.7:8083 failed over
> I0914 16:48:43.722429  9813 hierarchical.cpp:348] Activated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:48:43.722595  9813 master.cpp:5709] Sending 1 offers to framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083
> 
> Scheduler log:
> I0914 16:48:44.157 [Thread-10, MesosSchedulerImpl:151] Registered with ID 
> value: "071c44a1-b4d4-4339-a727-03a79f725851-"
> , master: id: "461b98b8-63e1-40e3-96fd-cb62420945ae"
> ip: 119646400
> port: 5050
> pid: "master@192.168.33.7:5050"
> hostname: "aurora.local"
> version: "1.0.0"
> address {
>   hostname: "aurora.local"
>   ip: "192.168.33.7"
>   port: 5050
> }
> 
> Case 2: Rolling backward does not impact running tasks:
> Rolling back framework name from 'Aurora' to 'TwitterScheduler':
> 
> The framework re-registers after restart (treated by master as failover) and 
> gets the same framework-id. Running task remain unaffected.
> 
> Master log:
> I0914 16:51:33.203495  9812 master.cpp:1297] Giving framework 
> 071c44a1-b4d4-4339-a727-03a79f725851- (Aurora) at 
> scheduler-dfad8309-de4b-47d8-a8f8-82828ea40a12@192.168.33.7:8083 3weeks to 
> failover
> I0914 16:51:33.203526  9812 hierarchical.cpp:382] Deactivated framework 
> 071c44a1-b4d4-4339-a727-03a79f725851-
> I0914 16:51:49.614074  9813 master.cpp:2424] Received SUBSCRIBE call for 
> framework 'TwitterScheduler' at 
> scheduler-6fa

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko
er/updater/JobUpdateControllerImpl.java 
594bb6219294dcc77d48dcad14e2a6f9caa0c534 
  src/test/java/org/apache/aurora/scheduler/BatchWorkerTest.java PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 
99c27e8012f10a67ce5f1b84d258e7a5608995c7 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskThrottlerTest.java 
7d104aa2ea4a4d99be4711f666d18beca238284e 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 94f5ca565476f62d72879837a0e7dafabcf30432 
  src/test/java/org/apache/aurora/scheduler/testing/BatchWorkerUtil.java 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
196df4754b553f05e50b66ad2f84271901bc9eba 

Diff: https://reviews.apache.org/r/51759/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko


> On Sept. 14, 2016, 6:36 p.m., Zameer Manji wrote:
> > This patch LGTM. Just a single concern here and some questions. I will be 
> > moving on to the subsequent patches shortly.
> > Please do not commit this until all parts are shipped. Also, please don't 
> > forget to rebase/update subsequent patches if changes are made here.
> > 
> > Also, does this cover all state change consumers that interact on storage 
> > when the event is fired or is this a subset? Just wondering if this is 
> > supposed to apply to all of them or just a small set that you think need 
> > this change.

> Also, does this cover all state change consumers that interact on storage 
> when the event is fired or is this a subset

This covers all known high frequency consumers that require a write 
transaction. The list is derived by running a stack-dumping version of the 
`CallOrderEnforcingStorage.write()` in a live cluster for awhile and analysing 
top offenders. It may be not fully comprehensive though, so if you find any 
other candidates feel free to report them.


> On Sept. 14, 2016, 6:36 p.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 233
> > <https://reviews.apache.org/r/51759/diff/6/?file=1498583#file1498583line233>
> >
> > Instead of having an `Optional` and then later 
> > checking it's present when we compute the backoff, we could simplify the 
> > internal implementation by having non repeatable work items default to some 
> > sort of `NoopBackoffStrategy` instead.

I don't really see how that would be a better solution in this particular case. 
We do need a `BackoffStrategy` in the repeatable work case it better be 
non-fake (and hard fail if not provided). Having a noop strategy softens that 
constraint.


> On Sept. 14, 2016, 6:36 p.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/SchedulerModule.java, line 69
> > <https://reviews.apache.org/r/51759/diff/6/?file=1498584#file1498584line69>
> >
> > Just curious, is this default arbitrary or something that was derrived 
> > from your deployment in production?

It's semi-arbitrary but proven in production :).


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148950
---


On Sept. 14, 2016, 5:25 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 5:25 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects
  the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reducing the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thu

Re: Review Request 51876: @ReviewBot retry Modify executor state transition logic to rely on health checks (if enabled)

2016-09-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review148937
---




src/main/python/apache/aurora/executor/aurora_executor.py (line 125)
<https://reviews.apache.org/r/51876/#comment216462>

I don't think we should include `Task is healthy.` message for 
non-health-check-enabled tasks as we have no visibility to the health of such 
tasks.



src/main/python/apache/aurora/executor/aurora_executor.py (lines 133 - 141)
<https://reviews.apache.org/r/51876/#comment216461>

This code is largely a repetition of what alrady exists in 
health_checker.py. I'd be great to reuse it (say from task_info.py).



src/main/python/apache/aurora/executor/common/health_checker.py (lines 40 - 43)
<https://reviews.apache.org/r/51876/#comment216503>

Do we really need this enum? A task that is in the middle of the 
`initial_interval_secs` timeout technically is still healthy, right? I feel 
having a boolean `isHealthy` should be enough to cover all possible scenarios 
as `StatusManager` checks for `isHealthy` AND the task status itself to invoke 
a callback.



src/main/python/apache/aurora/executor/common/health_checker.py (line 69)
<https://reviews.apache.org/r/51876/#comment216477>

This comment needs update.



src/main/python/apache/aurora/executor/common/health_checker.py (line 103)
<https://reviews.apache.org/r/51876/#comment216474>

`...if is_healthy else...`



src/main/python/apache/aurora/executor/common/health_checker.py (lines 136 - 
150)
<https://reviews.apache.org/r/51876/#comment216506>

If you get rid of the `HealthCheckStatus` enum I think all we need is this:
```
if self.initial_in_progress:
  if self.current_consecutive_health_checks >= 
self.min_consecutive_health_checks:
  self.healthy = True
else:
  self.healthy = self.current_consecutive_failures > 
self.max_consecutive_failures
  self.reason = reason
  
  
@property
def initial_in_progress(self):
  if not self._expired:
if self.elapsed_time <= self.initial_interval:
  return true
else:  
  self._expired = True
return !self._expired
```



src/main/python/apache/aurora/executor/common/health_checker.py (line 149)
<https://reviews.apache.org/r/51876/#comment216482>

This should be log.info instead.



src/main/python/apache/aurora/executor/common/status_checker.py (line 104)
<https://reviews.apache.org/r/51876/#comment216500>

Not obvious to me: what's the purpose of this condition here?



src/main/python/apache/aurora/executor/status_manager.py (line 60)
<https://reviews.apache.org/r/51876/#comment216473>

How about extending the `StatusManager` to take two callbacks instead? The 
old callback would remain a default one and the new would be an optional one to 
exclusively process 'TASK_RUNNING'. Since it's now status aware, it would make 
sense to consolidate status management here and avoid the if-else switch 
upstream.

Also, I'd use `mesos_pb2.TASK_RUNNING` instead of relying on string const.


- Maxim Khutornenko


On Sept. 14, 2016, 4:43 a.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> ---
> 
> (Updated Sept. 14, 2016, 4:43 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
> https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when a successful required number of health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when a successful required number of health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message was sent if the health checker fails to reach the 
> required number of health checks wit

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko
rTest.java PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 
99c27e8012f10a67ce5f1b84d258e7a5608995c7 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskThrottlerTest.java 
7d104aa2ea4a4d99be4711f666d18beca238284e 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 94f5ca565476f62d72879837a0e7dafabcf30432 
  src/test/java/org/apache/aurora/scheduler/testing/BatchWorkerUtil.java 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
196df4754b553f05e50b66ad2f84271901bc9eba 

Diff: https://reviews.apache.org/r/51759/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko


> On Sept. 14, 2016, 4:56 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskThrottler.java, 
> > lines 76-90
> > <https://reviews.apache.org/r/51759/diff/4/?file=1498120#file1498120line76>
> >
> > This code is exposing a slight design smell in my eye: We are using the 
> > `DelayedExecutor` to trigger the `batchWorker` when a backoff timer has 
> > passed. However, the latter already provides an internal backoff mechanism.
> > 
> > I'd rather see if we have just one defined way how to handle those 
> > time-based backoffs:
> > 
> > * either the BatchWorker does not perform any backoff handling at all. 
> > This seems to be in line with what Zameer has proposed in his last comment,
> > * or we make the backoff mechanism a first-class feature of the 
> > batchWorker so that I can say I can inject `backoffStrategy` and 
> > `lastBackoffMsec` to be used for the `WorkItem` created in 
> > `batchWorker.execute`.
> 
> Stephan Erb wrote:
> The same applies to the `startGroup(final TaskGroup group)` in Part 3: A 
> backoff mechanism is used that calls the batchWorker that could provide a 
> backoff on its own.

The `BatchWorker` is currently supporting only 2 use cases:
1. Simple no-repeat work item execution (`BatchWorker.execute()`)
2. Repeatable work item execution (`BatchWorker.executeWithReplay()`)

I don't see why `BatchWorker` has to use `BackoffStrategy` everywhere or not 
use it at all. There is a clear use case for it in #2 and absolutely no purpose 
in #1.

What you are suggesting adds yet another case of something like 
`BatchWorker.scheduleExecution()`, which goes beyond of what's immediately 
necessary to address the above cited goals. I don't mind considering further 
improvements to the `BatchWorker` outside of this patch scope but I'd like to 
keep it concise to avoid feature creep.


> On Sept. 14, 2016, 4:56 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 228
> > <https://reviews.apache.org/r/51759/diff/4/?file=1498117#file1498117line228>
> >
> > Do you have any particular usecase for this metric? 
> > 
> > We already have `batchesProcessed` and `itemsProcessed` and can compute 
> > the rate of processed items and batches. This seems more robust to me than 
> > just looking at `lastBatchSize`.

It's more of a convenience metric, similar to how `TimedInterceptor` exposes 
both `total_nanos` and `total_events` in addition to `events_per_sec`. If you 
feel strong about removing redundancy here I am ok dropping it. Just thought 
having a single metric to track the effective batch size may be helpful.


> On Sept. 14, 2016, 4:56 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 197
> > <https://reviews.apache.org/r/51759/diff/4/?file=1498117#file1498117line197>
> >
> > `readyItems` is still a left over from the previous patch version. 
> > Could be renamed to just `batch`.

Thanks. Fixed.


- Maxim


-----------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148918
---


On Sept. 14, 2016, 4:47 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 4:47 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task sched

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko
/scheduler/BatchWorkerTest.java PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 
99c27e8012f10a67ce5f1b84d258e7a5608995c7 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskThrottlerTest.java 
7d104aa2ea4a4d99be4711f666d18beca238284e 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 94f5ca565476f62d72879837a0e7dafabcf30432 
  src/test/java/org/apache/aurora/scheduler/testing/BatchWorkerUtil.java 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
196df4754b553f05e50b66ad2f84271901bc9eba 

Diff: https://reviews.apache.org/r/51759/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-14 Thread Maxim Khutornenko


> On Sept. 14, 2016, 1:29 a.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 110
> > <https://reviews.apache.org/r/51759/diff/4/?file=1498117#file1498117line110>
> >
> > The documentation implies we are returning a boolean but we are 
> > returning a `Result` object. I think you need to update that. I think what 
> > you are saying is that the `Result` object signals through `isCompleted` 
> > that the work should be repeated right?
> > 
> > (As an aside, maybe we should name this to `shouldReschedule` or 
> > similar).
> > 
> > 
> > If work needs to be repeated once the result has been computed, why 
> > can't the caller just access the `BatchWorker` in the `Work` it submits 
> > and do that itself?
> > 
> > That is why can't we extend `apply` to also accept a `BatchWorker` and 
> > the caller can submit another task within the `Work` if that's required?
> > 
> > Alternatively now that `execute` returns a `CompletableFuture`, the 
> > caller could use one of the many methods to execute code when the future is 
> > complete and have that extra code add more work.

Fixed the javadoc and converted `NoResult` into a static field to simplify 
callsite consumption.

As for the second suggestion, I don't like disseminating `BatchWorkers` or work 
item state maintenance outside the `BatchWorker` itself. That would put the 
burden of maintaining and passing around the state to the callsite and may 
create hard to reason/troubleshoot use cases. Consider the 
https://reviews.apache.org/r/51763. We would have to use an explicit wrapper 
object to hold on to local data to be able to re-queue it again as the trigger 
context is gone the moment `BatchWorker` call is placed. Also, queueing 
recurrent homogenuous events via `CompletableStage` chaining does not strike me 
as the right use of the functional style.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148837
---


On Sept. 14, 2016, 12:16 a.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 14, 2016, 12:16 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects
  the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reducing the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized on particular workload types:

Re: Review Request 51765: Batching writes - Part 3 (of 3): Converting TaskScheduler to use BatchWorker.

2016-09-13 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51765/
---

(Updated Sept. 14, 2016, 12:29 a.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Rebased over 51759.


Repository: aurora


Description
---

This is the final part of the `BatchWorker` conversion work that converts 
`TaskScheduler`. See https://reviews.apache.org/r/51759 for more background on 
the `BatchWorker`.

#Problem
See https://reviews.apache.org/r/51759

#Remediation
Task scheduling is one of the most dominant users of the write lock. It's also 
one of the heaviest and the most latency-sensitive. As such, the default max 
batch size is chosen conservatively low (3) and batch items are executed in a 
blocking way. 

BTW, attempting to make task scheduling non-blocking resulted in a much worse 
scheduling performance. The way our `DBTaskStore` is wired, all async 
activities, including `EventBus` are bound to use a single async `Executor`, 
which is currently limited at 8 threads [1]. Relying on the same `EventBus` to 
deliver scheduling completion events resulted in slower scheduling perf as 
those events were backed up behind all other activities, including tasks status 
events, reconciliation and etc. Increasing the executor thread pool size to a 
larger number on the other side, also increased the lock contention defeating 
the whole purpose of this work.

#Results
See https://reviews.apache.org/r/51759 for the lock contention results.

https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L51-L54


Diffs (updated)
-

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
9d0d40b82653fb923bed16d06546288a1576c21d 
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
11e8033438ad0808e446e41bb26b3fa4c04136c7 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
c044ebe6f72183a67462bbd8e5be983eb592c3e9 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
d266f6a25ae2360db2977c43768a19b1f1efe8ff 
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
c2ceb4e7685a9301f8014a9183e02fbad65bca26 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79 
  
src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 
72562e6bd9a9860c834e6a9faa094c28600a8fed 

Diff: https://reviews.apache.org/r/51765/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-13 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/
---

(Updated Sept. 14, 2016, 12:23 a.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Rebasing over 51759.


Repository: aurora


Description
---

This is the second part of the `BatchWorker` conversion work that moves cron 
jobs to use non-blocking kill followups and reduces the number of trigger 
threads. See https://reviews.apache.org/r/51759 for more background on the 
`BatchWorker`.

#Problem
The current implementation of the cron scheduling relies on a large number of 
threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
killing existing tasks according to `KILL_EXISTING` collision policy. This 
creates large spikes of activities at synchronized intervals as users tend to 
schedule their cron runs around similar schedules. Moreover, the current 
implementation re-acquires write locks multiple times to deliver on 
`KILL_EXISTING` policy. 

#Remediation
Trigger level batching is still done in a blocking way but multiple cron 
triggers may be bundled together to share the same write transaction. Any 
followups, however, are performed in a non-blocking way by relying on a 
`BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. In 
order to still ensure non-concurrent execution of a given job key trigger, a 
token (job key) is saved within the trigger itself. A concurrent trigger will 
bail if a kill followup is still in progress (token is set AND no entry in 
`killFollowups` set exists yet).

#Results
The above approach allowed reducing the number of cron threads to 10 and likely 
can be reduced even further. See https://reviews.apache.org/r/51759 for the 
lock contention results.


Diffs (updated)
-

  commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
  commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
bc30990d57f444f7d64805ed85c363f1302736d0 
  config/findbugs/excludeFilter.xml fe3f4ca5db1484124af14421a3349950dfec8519 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
c07551e94f9221b5b21c5dc9715e82caa290c2e8 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
155d702d68367b247dd066f773c662407f0e3b5b 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
5c64ff2994e200b3453603ac5470e8e152cebc55 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 

Diff: https://reviews.apache.org/r/51763/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-13 Thread Maxim Khutornenko
/scheduler/BatchWorkerTest.java PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 
99c27e8012f10a67ce5f1b84d258e7a5608995c7 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskThrottlerTest.java 
7d104aa2ea4a4d99be4711f666d18beca238284e 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 94f5ca565476f62d72879837a0e7dafabcf30432 
  src/test/java/org/apache/aurora/scheduler/testing/BatchWorkerUtil.java 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
196df4754b553f05e50b66ad2f84271901bc9eba 

Diff: https://reviews.apache.org/r/51759/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-13 Thread Maxim Khutornenko


> On Sept. 11, 2016, 11:02 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, lines 257-259
> > <https://reviews.apache.org/r/51759/diff/1/?file=1495031#file1495031line257>
> >
> > 1) Doesn't that effectively result in a busy loop where we are queuing 
> > the same item over and over until it is can finally be executed? This 
> > sounds rather problematic.
> > 
> > 2) This changes the order of work items. Are you sure that no storage 
> > client requires writes to be in order?
> 
> Maxim Khutornenko wrote:
> > 1) Doesn't that effectively result in a busy loop where we are queuing 
> the same item over and over until it is can finally be executed? This sounds 
> rather problematic.
> 
> There is no busy loop in a pure sense as every `RepeatableWork` item must 
> have a backoff strategy associated with it. Any item scheduled for 
> re-processing is delayed according to the supplied backoff.
> 
> > 2) This changes the order of work items. Are you sure that no storage 
> client requires writes to be in order?
> 
> Order has never been implied or expected as far as cross-transaction 
> boundaries go. That said, the order is preserved for a given work type. We 
> re-queue incomplete items in the same order they were received.
> 
> Stephan Erb wrote:
> Regarding question 1)
> 
> It seems to me that the `run()` loop will drain a batch of WorkItems from 
> the `workQueue`. We then call `getReadyItems` and re-queue everything that is 
> currently in backoff. This repeats over and over again until the backoff time 
> has passed. 
> 
> Have you considered using a `@AsyncExecutor DelayExecutor` just like in 
> `TaskThrottler` for everything that is in backoff and cannot be executed 
> immediately?

In a real cluster, there is no shortage of items pending evaluation. As such, 
the FOLLOWUP item "tax" is almost never a problem as BatchWorker will always be 
in a "busy loop" processing new and old items. I can see how this can become 
too wasteful in a really small cluster with not too many repeatable work items 
though.

I just realized how I can improve on the above and simplify the work queue 
processing. Instead of relying on a queue to drive backoff, we can use backoff 
as a delay on re-queueing incomplete items. This will let us get rid of the 
`getReadyItems()` entirely. Thanks for bringing this up!


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148433
---


On Sept. 13, 2016, 11:14 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 13, 2016, 11:14 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects
  the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reducing the lock contention. This p

Re: Review Request 51765: Batching writes - Part 3 (of 3): Converting TaskScheduler to use BatchWorker.

2016-09-13 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51765/
---

(Updated Sept. 13, 2016, 11:14 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Rebased over 51759.


Repository: aurora


Description
---

This is the final part of the `BatchWorker` conversion work that converts 
`TaskScheduler`. See https://reviews.apache.org/r/51759 for more background on 
the `BatchWorker`.

#Problem
See https://reviews.apache.org/r/51759

#Remediation
Task scheduling is one of the most dominant users of the write lock. It's also 
one of the heaviest and the most latency-sensitive. As such, the default max 
batch size is chosen conservatively low (3) and batch items are executed in a 
blocking way. 

BTW, attempting to make task scheduling non-blocking resulted in a much worse 
scheduling performance. The way our `DBTaskStore` is wired, all async 
activities, including `EventBus` are bound to use a single async `Executor`, 
which is currently limited at 8 threads [1]. Relying on the same `EventBus` to 
deliver scheduling completion events resulted in slower scheduling perf as 
those events were backed up behind all other activities, including tasks status 
events, reconciliation and etc. Increasing the executor thread pool size to a 
larger number on the other side, also increased the lock contention defeating 
the whole purpose of this work.

#Results
See https://reviews.apache.org/r/51759 for the lock contention results.

https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L51-L54


Diffs (updated)
-

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
9d0d40b82653fb923bed16d06546288a1576c21d 
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
11e8033438ad0808e446e41bb26b3fa4c04136c7 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
c044ebe6f72183a67462bbd8e5be983eb592c3e9 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
d266f6a25ae2360db2977c43768a19b1f1efe8ff 
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
c2ceb4e7685a9301f8014a9183e02fbad65bca26 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79 
  
src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 
72562e6bd9a9860c834e6a9faa094c28600a8fed 

Diff: https://reviews.apache.org/r/51765/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-13 Thread Maxim Khutornenko
t/java/org/apache/aurora/scheduler/BatchWorkerTest.java PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 
99c27e8012f10a67ce5f1b84d258e7a5608995c7 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskThrottlerTest.java 
7d104aa2ea4a4d99be4711f666d18beca238284e 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 94f5ca565476f62d72879837a0e7dafabcf30432 
  src/test/java/org/apache/aurora/scheduler/testing/BatchWorkerUtil.java 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
196df4754b553f05e50b66ad2f84271901bc9eba 

Diff: https://reviews.apache.org/r/51759/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-13 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/
---

(Updated Sept. 13, 2016, 11:14 p.m.)


Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Changes
---

Rebased over 51759.


Repository: aurora


Description
---

This is the second part of the `BatchWorker` conversion work that moves cron 
jobs to use non-blocking kill followups and reduces the number of trigger 
threads. See https://reviews.apache.org/r/51759 for more background on the 
`BatchWorker`.

#Problem
The current implementation of the cron scheduling relies on a large number of 
threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
killing existing tasks according to `KILL_EXISTING` collision policy. This 
creates large spikes of activities at synchronized intervals as users tend to 
schedule their cron runs around similar schedules. Moreover, the current 
implementation re-acquires write locks multiple times to deliver on 
`KILL_EXISTING` policy. 

#Remediation
Trigger level batching is still done in a blocking way but multiple cron 
triggers may be bundled together to share the same write transaction. Any 
followups, however, are performed in a non-blocking way by relying on a 
`BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. In 
order to still ensure non-concurrent execution of a given job key trigger, a 
token (job key) is saved within the trigger itself. A concurrent trigger will 
bail if a kill followup is still in progress (token is set AND no entry in 
`killFollowups` set exists yet).

#Results
The above approach allowed reducing the number of cron threads to 10 and likely 
can be reduced even further. See https://reviews.apache.org/r/51759 for the 
lock contention results.


Diffs (updated)
-

  commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
  commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
bc30990d57f444f7d64805ed85c363f1302736d0 
  config/findbugs/excludeFilter.xml fe3f4ca5db1484124af14421a3349950dfec8519 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
c07551e94f9221b5b21c5dc9715e82caa290c2e8 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
155d702d68367b247dd066f773c662407f0e3b5b 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
5c64ff2994e200b3453603ac5470e8e152cebc55 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 

Diff: https://reviews.apache.org/r/51763/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51712: Extend getJobUpdateDetails to accept JobUpdateQuery

2016-09-13 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51712/#review148762
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 13, 2016, 6:33 p.m., Zameer Manji wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51712/
> ---
> 
> (Updated Sept. 13, 2016, 6:33 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1764
> https://issues.apache.org/jira/browse/AURORA-1764
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This extends getJobUpdateDetails to return a list of details instead of being 
> scoped to a single update.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md 4476d52c9e2363342d0c22c7d136e0e481633686 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> 4202f0eec59ef16668fca6b6ebb925335ad5dea1 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   src/main/python/apache/aurora/client/cli/update.py 
> d23243dcf93dd82f66b4c8cc4342bd2c8d2adc79 
>   src/main/resources/scheduler/assets/js/services.js 
> 315fc17894c2261333715fbb9c45fc02505f2942 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
>   src/test/python/apache/aurora/client/api/test_api.py 
> 7a0797d5402e931006d4f215e2d9c0dbbd113257 
>   src/test/python/apache/aurora/client/cli/test_supdate.py 
> 92fb039fb7d3e368d7c0dfa5ebdb465c7f112416 
> 
> Diff: https://reviews.apache.org/r/51712/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>



Re: Review Request 51712: Extend getJobUpdateDetails to accept JobUpdateQuery

2016-09-12 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51712/#review148643
---




src/main/python/apache/aurora/client/api/__init__.py (line 266)
<https://reviews.apache.org/r/51712/#comment216139>

This should send both key and query for both v-1 and v schedulers to work 
correctly.



src/main/python/apache/aurora/client/cli/update.py (lines 529 - 533)
<https://reviews.apache.org/r/51712/#comment216138>

Are you sure the `detailsList` is going to be set when a response is coming 
from the v-1 scheduler? Even if that was the case, it would be completely 
empty. I'd expect something like this instead:
```
if detailsList is not None:
  # process detailsList
else
  # process details
```


- Maxim Khutornenko


On Sept. 12, 2016, 10:26 p.m., Zameer Manji wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51712/
> ---
> 
> (Updated Sept. 12, 2016, 10:26 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1764
> https://issues.apache.org/jira/browse/AURORA-1764
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This extends getJobUpdateDetails to return a list of details instead of being 
> scoped to a single update.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md bbf7198dc7ce53bb02a1bc59f75a661e23472913 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> 4202f0eec59ef16668fca6b6ebb925335ad5dea1 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   src/main/python/apache/aurora/client/cli/update.py 
> d23243dcf93dd82f66b4c8cc4342bd2c8d2adc79 
>   src/main/resources/scheduler/assets/js/services.js 
> 315fc17894c2261333715fbb9c45fc02505f2942 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
>   src/test/python/apache/aurora/client/api/test_api.py 
> 7a0797d5402e931006d4f215e2d9c0dbbd113257 
>   src/test/python/apache/aurora/client/cli/test_supdate.py 
> 92fb039fb7d3e368d7c0dfa5ebdb465c7f112416 
> 
> Diff: https://reviews.apache.org/r/51712/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-12 Thread Maxim Khutornenko


> On Sept. 13, 2016, 1:14 a.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 188
> > <https://reviews.apache.org/r/51759/diff/2/?file=1497462#file1497462line188>
> >
> > Our current metrics use 'nanos' to indicate nanoseconds instead of 
> > 'ns'. Mind using that here for consistency?

We actually have both "nanos" and "ns". I agree that "nanos" looks more 
prevalent as it's what `TimedInterceptor` is using.


> On Sept. 13, 2016, 1:14 a.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, line 105
> > <https://reviews.apache.org/r/51759/diff/2/?file=1497462#file1497462line105>
> >
> > I agree we cannot use `java.util.concurrent.FutureTask` here. However, 
> > this seems to be pretty close to 
> > `java.util.concurrent.CompletableFuture` (new in Java 8).
> > 
> > It's not clear to me here if you have investigated it or not, but if 
> > you have I would like a brief explanation as to why it is not suitable. If 
> > you haven't please take a look at it as it seems to provide all of the 
> > funtionality you need.
> > 
> > Further if it is possible to use it, then we do not need pubsub for 
> > notification for when work completes, if item is it's own future, we can 
> > use the `CompletionStage` functionality to execute code after a future has 
> > been fulfilled. 
> > 
> > It has a `get()` method that can replace `waitForResult()` and a 
> > `complete()` method that allows you to set the value.

I did evaluate the `CompletableFuture` for the initial version of the 
`BatchWorker` and did not find it up to the task. Looking at it again though, I 
don't see any reasons why it would not satisfy the current version! Thanks for 
the nudge, the diff is incoming.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148633
---


On Sept. 12, 2016, 10:51 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 12, 2016, 10:51 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects
  the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reducing the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized on particular workload types: task even processing, 
> cron scheduling and task schedu

Re: Review Request 51765: Batching writes - Part 3 (of 3): Converting TaskScheduler to use BatchWorker.

2016-09-12 Thread Maxim Khutornenko


> On Sept. 11, 2016, 11:18 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java, 
> > lines 115-128
> > <https://reviews.apache.org/r/51765/diff/1/?file=1495174#file1495174line115>
> >
> > The `scheduleTask` method is doing some staggering amount of work, such 
> > as iterating over offers and  matching constraints. Only a small part of 
> > this actually requires an open write transaction.
> > 
> > It seems to me as if moving part of this computation out of the write 
> > transaction could significantly improve write throughput. You have probably 
> > considered that, so what am I missing here? :-)

It's true that only a small part requires writing into the store. 
Unfortunately, we do use write transaction as a global system lock to freeze 
the state for the duration of the logical change. It's very hard to reason 
about that logic if things change underneath. The assumptions about being 
inside of a "locked" state begin with computing the `AttributeAggregate` in the 
`TaskScheduler` and follow along until we are ready to launch task or consider 
a preemption. There are MANY things that can go wrong if we move the 
transaction boundary. Starting from trivial nullref and concurrent collection 
modification exceptions to very hard to troubleshoot logical errors (offers 
reused, limit constraints are not enforced and etc.). While it's theoretically 
possible to implement an optimistic task/offer matching it would require 
substantial rewrite of what we currently have and is definitely outside the 
scope of this change.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51765/#review148434
-------


On Sept. 9, 2016, 6:52 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51765/
> ---
> 
> (Updated Sept. 9, 2016, 6:52 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the final part of the `BatchWorker` conversion work that converts 
> `TaskScheduler`. See https://reviews.apache.org/r/51759 for more background 
> on the `BatchWorker`.
> 
> #Problem
> See https://reviews.apache.org/r/51759
> 
> #Remediation
> Task scheduling is one of the most dominant users of the write lock. It's 
> also one of the heaviest and the most latency-sensitive. As such, the default 
> max batch size is chosen conservatively low (3) and batch items are executed 
> in a blocking way. 
> 
> BTW, attempting to make task scheduling non-blocking resulted in a much worse 
> scheduling performance. The way our `DBTaskStore` is wired, all async 
> activities, including `EventBus` are bound to use a single async `Executor`, 
> which is currently limited at 8 threads [1]. Relying on the same `EventBus` 
> to deliver scheduling completion events resulted in slower scheduling perf as 
> those events were backed up behind all other activities, including tasks 
> status events, reconciliation and etc. Increasing the executor thread pool 
> size to a larger number on the other side, also increased the lock contention 
> defeating the whole purpose of this work.
> 
> #Results
> See https://reviews.apache.org/r/51759 for the lock contention results.
> 
> https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L51-L54
> 
> 
> Diffs
> -
> 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 9d0d40b82653fb923bed16d06546288a1576c21d 
>   src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
> 11e8033438ad0808e446e41bb26b3fa4c04136c7 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
> c044ebe6f72183a67462bbd8e5be983eb592c3e9 
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
> d266f6a25ae2360db2977c43768a19b1f1efe8ff 
>   src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
> c2ceb4e7685a9301f8014a9183e02fbad65bca26 
>   src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
> 95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79 
>   
> src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java
>  72562e6bd9a9860c834e6a9faa094c28600a8fed 
> 
> Diff: https://reviews.apache.org/r/51765/diff/
> 
> 
> Testing
> ---
> 
> All types of testing including deploying to test and production clusters.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-12 Thread Maxim Khutornenko
rImpl.java 
594bb6219294dcc77d48dcad14e2a6f9caa0c534 
  src/test/java/org/apache/aurora/scheduler/BatchWorkerTest.java PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 
99c27e8012f10a67ce5f1b84d258e7a5608995c7 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskThrottlerTest.java 
7d104aa2ea4a4d99be4711f666d18beca238284e 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 94f5ca565476f62d72879837a0e7dafabcf30432 
  src/test/java/org/apache/aurora/scheduler/testing/BatchWorkerUtil.java 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
196df4754b553f05e50b66ad2f84271901bc9eba 

Diff: https://reviews.apache.org/r/51759/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-12 Thread Maxim Khutornenko


> On Sept. 12, 2016, 6:26 p.m., Joshua Cohen wrote:
> > Along the lines of the question Stephan raised, what happens in the event 
> > of a failover mid-batch, especially w.r.t. repeatable work?

Same as today: the transaction either happens or does not. In case of a cron 
job (the only user of `RepeatableWork` at the moment), the cron job will not be 
triggered (again, same as today).


> On Sept. 12, 2016, 6:26 p.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, lines 106-107
> > <https://reviews.apache.org/r/51759/diff/1/?file=1495031#file1495031line106>
> >
> > I think this should be "does not initialize"?

Good catch. Fixed.


> On Sept. 12, 2016, 6:26 p.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, lines 257-259
> > <https://reviews.apache.org/r/51759/diff/1/?file=1495031#file1495031line257>
> >
> > Even though there's only a single callsite where we initialize a 
> > `WorkItem` in `FOLLOWUP` state, there's no guarantee that `laskBackoffMsec` 
> > will be present just because the state is `FOLLOWUP`.
> > 
> > We should add some state checks to the `WorkItem` constructor so that 
> > if state is `FOLLOWUP` we'll throw if `lastBackoffMsec` is not present to 
> > protect against future mistakes? (Or if that's a valid scenario we should 
> > perform an `isPresent` check or use `or` here...).

I assert both `backoffStrategy` and `lastBackoffMsec` inside the `backoffFor()` 
method on line 315. I think it should be sufficient given that `WorkItem` is a 
private class?


> On Sept. 12, 2016, 6:26 p.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, lines 273-275
> > <https://reviews.apache.org/r/51759/diff/1/?file=1495031#file1495031line273>
> >
> > Should we do this after the batch is processed rather than before?

It's largely irrelevant but I guess it's more logical to expect it at the end 
while going through the code. Moved.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148501
---


On Sept. 9, 2016, 5:29 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 9, 2016, 5:29 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects
  the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reducing the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the

Re: Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-12 Thread Maxim Khutornenko
e1495031line240>
> >
> > A queue could hold vital information that we should not loose during 
> > scheduler failover (in particular, as we do this once per day).
> > 
> > Would it be possible to drain the queue completely before exiting here?

This is no different than terminating the main process while threads are 
waiting on a write lock. Transaction boundaries and reconciliation on restart 
is what ensures data consistency.


> On Sept. 11, 2016, 11:02 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/BatchWorker.java, lines 257-259
> > <https://reviews.apache.org/r/51759/diff/1/?file=1495031#file1495031line257>
> >
> > 1) Doesn't that effectively result in a busy loop where we are queuing 
> > the same item over and over until it is can finally be executed? This 
> > sounds rather problematic.
> > 
> > 2) This changes the order of work items. Are you sure that no storage 
> > client requires writes to be in order?

> 1) Doesn't that effectively result in a busy loop where we are queuing the 
> same item over and over until it is can finally be executed? This sounds 
> rather problematic.

There is no busy loop in a pure sense as every `RepeatableWork` item must have 
a backoff strategy associated with it. Any item scheduled for re-processing is 
delayed according to the supplied backoff.

> 2) This changes the order of work items. Are you sure that no storage client 
> requires writes to be in order?

Order has never been implied or expected as far as cross-transaction boundaries 
go. That said, the order is preserved for a given work type. We re-queue 
incomplete items in the same order they were received.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51759/#review148433
---


On Sept. 9, 2016, 5:29 p.m., Maxim Khutornenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51759/
> ---
> 
> (Updated Sept. 9, 2016, 5:29 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This is the first (out of 3) patches intending to reduce storage write lock 
> contention and as such improve overall system write throughput. It introduces 
> the `BatchWorker` and migrates the majority of storage writes due to task 
> status change events to use `TaskEventBatchWorker`.
> 
> #Problem
> Our current storage system writes effectively behave as `SERIALIZABLE` 
> transaction isolation level in SQL terms. This means all writes require 
> exclusive access to the storage and no two transactions can happen in 
> parallel [1]. While it certainly simplifies our implementation, it creates a 
> single hotspot where multiple threads are competing for the storage write 
> access. This type of contention only worsens as the cluster size grows, more 
> tasks are scheduled, more status updates are processed, more subscribers are 
> listening to status updates and etc. Eventually, the scheduler throughput 
> (and especially task scheduling) becomes degraded to the extent that certain 
> operations wait much longer (4x and more) for the lock acquisition than it 
> takes to process their payload when inside the transaction. Some ops (like 
> event processing) are generally tolerant of these types of delays. Others - 
> not as much. The task scheduling suffers the most as backing up the 
> scheduling queue directly affects
  the Median Time To Assigned (MTTA).
> 
> #Remediation
> Given the above, it's natural to assume that reducing the number of write 
> transactions should help reducing the lock contention. This patch introduces 
> a generic `BatchWorker` service that delivers a "best effort" batching 
> approach by redirecting multiple individual write requests into a single FIFO 
> queue served non-stop by a single dedicated thread. Every batch shares a 
> single write transaction thus reducing the number of potential write lock 
> requests. To minimize wait-in-queue time, items are dispatched immediately 
> and the max number of items is bounded. There are a few `BatchWorker` 
> instances specialized on particular workload types: task even processing, 
> cron scheduling and task scheduling. Every instance can be tuned 
> independently (max batch size) and provides specialized metrics helping to 
> monitor each workload type perf.
> 
> #Results
> The proposed approach h

Re: Review Request 51807: Introduce a flag to treat RAM as a revocable resources

2016-09-12 Thread Maxim Khutornenko


> On Sept. 12, 2016, 7:31 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/resources/ResourceSettings.java, 
> > lines 22-33
> > <https://reviews.apache.org/r/51807/diff/1/?file=1496737#file1496737line22>
> >
> > This is somewhat non-standard way to define cmd flags. We tend to 
> > declare them in modules and use 'setting' classes to act as pure wrappers. 
> > Perhaps it's time to introduce a `ResourceModule` class (if only to bind 
> > the `ResourceSettings` instance for now)?
> 
> Stephan Erb wrote:
> I believe there would be no way to inject the settings into the 
> `ResourceType` enum at compile time with guice. That's why I've adopted this 
> approach here.  Happy to give it a try if you have some pointers for me how 
> to make it work.

Oh, that's right. I forgot java enums are extremely picky when it comes to 
field injections. Ignore this comment. Perhaps having an extra note in this 
class javadoc will help avoiding this type of questions in the future.


> On Sept. 12, 2016, 7:31 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/resources/ResourceType.java, line 
> > 75
> > <https://reviews.apache.org/r/51807/diff/1/?file=1496738#file1496738line75>
> >
> > Would be great to have a e2e test when this functionality is available 
> > in Mesos. Perhaps drop a TODO here?
> 
> Stephan Erb wrote:
> Acutally, it would work in Mesos out of the box even today. All we have 
> to do is adapt our Mesos config to also pass along memory resources 
> https://github.com/apache/aurora/blob/master/examples/vagrant/mesos_config/etc_mesos-slave/modules#L8.
> 
> I didn't do it because I felt like it wouldn't add any additional test 
> coverage, now that the entire resource handling within Aurora is generic.
> 
> Stephan Erb wrote:
> Additional info: We already have an e2e test using revocable CPU 
> resources.

It's true that things should now "just work". Reality proves that's not always 
the case, so any test coverage helping to validate we can actually run tasks 
against offers with more than one revocable resource would be welcome. Does not 
have to be in this RB though.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51807/#review148563
---


On Sept. 12, 2016, 1:38 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51807/
> ---
> 
> (Updated Sept. 12, 2016, 1:38 p.m.)
> 
> 
> Review request for Aurora and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> We plan to open source a very simple Mesos ResourceEstimator and 
> QosController that supports RAM and CPU oversubscription (ETA ~2 weeks). We 
> have been using it internally with a patched Aurora version where the 
> hardcoded `isMesosRevocable` flag of RAM has been set to `true`. This patch 
> makes this behaviour configurable.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md bbf7198dc7ce53bb02a1bc59f75a661e23472913 
>   docs/features/resource-isolation.md 
> 01c5b407a4964e89741d0f6c72d04ab5dc4c2f87 
>   docs/operations/configuration.md 90dde574ce517355d2d6045a5ab036c1a3838882 
>   docs/reference/scheduler-configuration.md 
> 87d2cded0780ffa34884b52cc049c6a0ad808f0a 
>   src/main/java/org/apache/aurora/scheduler/resources/ResourceSettings.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/resources/ResourceType.java 
> 4c102a3d4c4bdd27a1d0536b689acd6257e77788 
> 
> Diff: https://reviews.apache.org/r/51807/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew -Pq build
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 51807: Introduce a flag to treat RAM as a revocable resources

2016-09-12 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51807/#review148600
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 12, 2016, 1:38 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51807/
> ---
> 
> (Updated Sept. 12, 2016, 1:38 p.m.)
> 
> 
> Review request for Aurora and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> We plan to open source a very simple Mesos ResourceEstimator and 
> QosController that supports RAM and CPU oversubscription (ETA ~2 weeks). We 
> have been using it internally with a patched Aurora version where the 
> hardcoded `isMesosRevocable` flag of RAM has been set to `true`. This patch 
> makes this behaviour configurable.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md bbf7198dc7ce53bb02a1bc59f75a661e23472913 
>   docs/features/resource-isolation.md 
> 01c5b407a4964e89741d0f6c72d04ab5dc4c2f87 
>   docs/operations/configuration.md 90dde574ce517355d2d6045a5ab036c1a3838882 
>   docs/reference/scheduler-configuration.md 
> 87d2cded0780ffa34884b52cc049c6a0ad808f0a 
>   src/main/java/org/apache/aurora/scheduler/resources/ResourceSettings.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/resources/ResourceType.java 
> 4c102a3d4c4bdd27a1d0536b689acd6257e77788 
> 
> Diff: https://reviews.apache.org/r/51807/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew -Pq build
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 51712: Extend getJobUpdateDetails to accept JobUpdateQuery

2016-09-12 Thread Maxim Khutornenko


> On Sept. 12, 2016, 7:48 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/api/__init__.py, line 264
> > <https://reviews.apache.org/r/51712/diff/4/?file=1497099#file1497099line264>
> >
> > Why not going all the way to the client command and initializing both 
> > values in this changelist? Otherwise, there is a risk of cutting a release 
> > in between and delaying removal by extra release (until both client and 
> > scheduler are dual writing/reading).
> 
> Zameer Manji wrote:
> We can't send both values (it doesn't make sense). I will upgrade the 
> client to send the query object instead of the key object. I was not aware 
> when we made an API change we would also change the clients in the same 
> commit.
> 
> Zameer Manji wrote:
> Actually never mind, if I make this change (updating the client) it will 
> force us to deploy the client and the scheduler together. I think we need to 
> keep it as is for now right?

> We can't send both values (it doesn't make sense).

This will work just fine. The newer scheduler will prefer the query (and 
collection in the response), while the old one will continue using the key 
completely unaware of the new value (thrift drops unknown fields). This way no 
functional changes will be needed on the client side, just removal of the 
deprecated fields.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51712/#review148565
---


On Sept. 12, 2016, 7:39 p.m., Zameer Manji wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51712/
> -------
> 
> (Updated Sept. 12, 2016, 7:39 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1764
> https://issues.apache.org/jira/browse/AURORA-1764
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This extends getJobUpdateDetails to return a list of details instead of being 
> scoped to a single update.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md bbf7198dc7ce53bb02a1bc59f75a661e23472913 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> 4202f0eec59ef16668fca6b6ebb925335ad5dea1 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   src/main/resources/scheduler/assets/js/services.js 
> 315fc17894c2261333715fbb9c45fc02505f2942 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
>   src/test/python/apache/aurora/client/api/test_api.py 
> 7a0797d5402e931006d4f215e2d9c0dbbd113257 
> 
> Diff: https://reviews.apache.org/r/51712/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>



Re: Review Request 51712: Extend getJobUpdateDetails to accept JobUpdateQuery

2016-09-12 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51712/#review148565
---




src/main/python/apache/aurora/client/api/__init__.py (line 264)
<https://reviews.apache.org/r/51712/#comment216022>

Why not going all the way to the client command and initializing both 
values in this changelist? Otherwise, there is a risk of cutting a release in 
between and delaying removal by extra release (until both client and scheduler 
are dual writing/reading).


- Maxim Khutornenko


On Sept. 12, 2016, 7:39 p.m., Zameer Manji wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51712/
> ---
> 
> (Updated Sept. 12, 2016, 7:39 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1764
> https://issues.apache.org/jira/browse/AURORA-1764
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This extends getJobUpdateDetails to return a list of details instead of being 
> scoped to a single update.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md bbf7198dc7ce53bb02a1bc59f75a661e23472913 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> 4202f0eec59ef16668fca6b6ebb925335ad5dea1 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   src/main/resources/scheduler/assets/js/services.js 
> 315fc17894c2261333715fbb9c45fc02505f2942 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
>   src/test/python/apache/aurora/client/api/test_api.py 
> 7a0797d5402e931006d4f215e2d9c0dbbd113257 
> 
> Diff: https://reviews.apache.org/r/51712/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-12 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148564
---


Fix it, then Ship it!





RELEASE-NOTES.md (line 37)
<https://reviews.apache.org/r/51662/#comment216018>

s/reconcile/reconcile_tasks



src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
(lines 651 - 657)
<https://reviews.apache.org/r/51662/#comment216019>

No need to have an extra method here. Can be inlined.


- Maxim Khutornenko


On Sept. 12, 2016, 5:49 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 12, 2016, 5:49 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md bbf7198dc7ce53bb02a1bc59f75a661e23472913 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51807: Introduce a flag to treat RAM as a revocable resources

2016-09-12 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51807/#review148563
---




src/main/java/org/apache/aurora/scheduler/resources/ResourceSettings.java 
(lines 22 - 33)
<https://reviews.apache.org/r/51807/#comment216016>

This is somewhat non-standard way to define cmd flags. We tend to declare 
them in modules and use 'setting' classes to act as pure wrappers. Perhaps it's 
time to introduce a `ResourceModule` class (if only to bind the 
`ResourceSettings` instance for now)?



src/main/java/org/apache/aurora/scheduler/resources/ResourceType.java (line 75)
<https://reviews.apache.org/r/51807/#comment216017>

Would be great to have a e2e test when this functionality is available in 
Mesos. Perhaps drop a TODO here?


- Maxim Khutornenko


On Sept. 12, 2016, 1:38 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51807/
> ---
> 
> (Updated Sept. 12, 2016, 1:38 p.m.)
> 
> 
> Review request for Aurora and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> We plan to open source a very simple Mesos ResourceEstimator and 
> QosController that supports RAM and CPU oversubscription (ETA ~2 weeks). We 
> have been using it internally with a patched Aurora version where the 
> hardcoded `isMesosRevocable` flag of RAM has been set to `true`. This patch 
> makes this behaviour configurable.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md bbf7198dc7ce53bb02a1bc59f75a661e23472913 
>   docs/features/resource-isolation.md 
> 01c5b407a4964e89741d0f6c72d04ab5dc4c2f87 
>   docs/operations/configuration.md 90dde574ce517355d2d6045a5ab036c1a3838882 
>   docs/reference/scheduler-configuration.md 
> 87d2cded0780ffa34884b52cc049c6a0ad808f0a 
>   src/main/java/org/apache/aurora/scheduler/resources/ResourceSettings.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/resources/ResourceType.java 
> 4c102a3d4c4bdd27a1d0536b689acd6257e77788 
> 
> Diff: https://reviews.apache.org/r/51807/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew -Pq build
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-09 Thread Maxim Khutornenko


> On Sept. 9, 2016, 12:26 a.m., Zameer Manji wrote:
> > src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java,
> >  line 125
> > <https://reviews.apache.org/r/51662/diff/8/?file=1494851#file1494851line125>
> >
> > This smells to me and seems awkward. I guess this stemms from the fact 
> > that the `I*` wrappers don't have Optional fields for integers. Instead of 
> > this here, why not have the signature be 
> > `triggerExplicitReconciliation(Optional batchSize)`?
> > 
> > You can then do `int trueBatchSize = 
> > batchSize.orElse(settings.explicitBatchSize)`.
> > 
> > You can then push validation of the value to the RPC/API layer.

+1. Validating at RPC layer and sending down an `Optional` seems a 
better solution here.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148292
---


On Sept. 9, 2016, 6:43 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> -------
> 
> (Updated Sept. 9, 2016, 6:43 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md bbf7198dc7ce53bb02a1bc59f75a661e23472913 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Review Request 51765: Batching writes - Part 3 (of 3): Converting TaskScheduler to use BatchWorker.

2016-09-09 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51765/
---

Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Repository: aurora


Description
---

This is the final part of the `BatchWorker` conversion work that converts 
`TaskScheduler`. See https://reviews.apache.org/r/51759 for more background on 
the `BatchWorker`.

#Problem
See https://reviews.apache.org/r/51759

#Remediation
Task scheduling is one of the most dominant users of the write lock. It's also 
one of the heaviest and the most latency-sensitive. As such, the default max 
batch size is chosen conservatively low (3) and batch items are executed in a 
blocking way. 

BTW, attempting to make task scheduling non-blocking resulted in a much worse 
scheduling performance. The way our `DBTaskStore` is wired, all async 
activities, including `EventBus` are bound to use a single async `Executor`, 
which is currently limited at 8 threads [1]. Relying on the same `EventBus` to 
deliver scheduling completion events resulted in slower scheduling perf as 
those events were backed up behind all other activities, including tasks status 
events, reconciliation and etc. Increasing the executor thread pool size to a 
larger number on the other side, also increased the lock contention defeating 
the whole purpose of this work.

#Results
See https://reviews.apache.org/r/51759 for the lock contention results.

https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L51-L54


Diffs
-

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
9d0d40b82653fb923bed16d06546288a1576c21d 
  src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java 
11e8033438ad0808e446e41bb26b3fa4c04136c7 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 
c044ebe6f72183a67462bbd8e5be983eb592c3e9 
  src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java 
d266f6a25ae2360db2977c43768a19b1f1efe8ff 
  src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 
c2ceb4e7685a9301f8014a9183e02fbad65bca26 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java 
95cf25eda0a5bfc0cc4c46d1439ebe9d5359ce79 
  
src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java 
72562e6bd9a9860c834e6a9faa094c28600a8fed 

Diff: https://reviews.apache.org/r/51765/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Review Request 51763: Batching writes - Part 2 (of 3): Converting cron jobs to use BatchWorker.

2016-09-09 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/
---

Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.


Repository: aurora


Description
---

This is the second part of the `BatchWorker` conversion work that moves cron 
jobs to use non-blocking kill followups and reduces the number of trigger 
threads. See https://reviews.apache.org/r/51759 for more background on the 
`BatchWorker`.

#Problem
The current implementation of the cron scheduling relies on a large number of 
threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
killing existing tasks according to `KILL_EXISTING` collision policy. This 
creates large spikes of activities at synchronized intervals as users tend to 
schedule their cron runs around similar schedules. Moreover, the current 
implementation re-acquires write locks multiple times to deliver on 
`KILL_EXISTING` policy. 

#Remediation
Trigger level batching is still done in a blocking way but multiple cron 
triggers may be bundled together to share the same write transaction. Any 
followups, however, are performed in a non-blocking way by relying on a 
`BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. In 
order to still ensure non-concurrent execution of a given job key trigger, a 
token (job key) is saved within the trigger itself. A concurrent trigger will 
bail if a kill followup is still in progress (token is set AND no entry in 
`killFollowups` set exists yet).

#Results
The above approach allowed reducing the number of cron threads to 10 and likely 
can be reduced even further. See https://reviews.apache.org/r/51759 for the 
lock contention results.


Diffs
-

  commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
  commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
bc30990d57f444f7d64805ed85c363f1302736d0 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
c07551e94f9221b5b21c5dc9715e82caa290c2e8 
  src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
155d702d68367b247dd066f773c662407f0e3b5b 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
5c64ff2994e200b3453603ac5470e8e152cebc55 
  src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 

Diff: https://reviews.apache.org/r/51763/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Review Request 51759: Batching writes - Part 1 (of 3): Introducing BatchWorker and task event batching.

2016-09-09 Thread Maxim Khutornenko
/scheduler/BatchWorkerTest.java PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 
99c27e8012f10a67ce5f1b84d258e7a5608995c7 
  src/test/java/org/apache/aurora/scheduler/scheduling/TaskThrottlerTest.java 
7d104aa2ea4a4d99be4711f666d18beca238284e 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 94f5ca565476f62d72879837a0e7dafabcf30432 
  src/test/java/org/apache/aurora/scheduler/testing/BatchWorkerUtil.java 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
196df4754b553f05e50b66ad2f84271901bc9eba 

Diff: https://reviews.apache.org/r/51759/diff/


Testing
---

All types of testing including deploying to test and production clusters.


Thanks,

Maxim Khutornenko



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-08 Thread Maxim Khutornenko


> On Sept. 8, 2016, 6:18 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/admin/admin.py, line 351
> > <https://reviews.apache.org/r/51662/diff/7/?file=1494553#file1494553line351>
> >
> > Needs clarifcation that `reconciliation_explicit_batch_size` is a 
> > scheduler setting.
> 
> Karthik Anantha Padmanabhan wrote:
> Good question. It does not show up in the docs. However it is a command 
> line argument in the `ReconciliationModule`. So should this be a scheduler 
> setting ?

By 'setting' I meant the command line argument. Without any explanation it's 
hard to grok what that means.


> On Sept. 8, 2016, 6:18 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java,
> >  line 123
> > <https://reviews.apache.org/r/51662/diff/7/?file=1494551#file1494551line123>
> >
> > Prefer using immutable `IExplicit...` wrapper generated for you by the 
> > build.
> 
> Karthik Anantha Padmanabhan wrote:
> With the `IExplicit...` version I do not have the ability to check if the 
> field is set or not using `isSet...`. I use that in order to provide the 
> default values for the batch size.

What if it's set to 0 or negative value instead? Don't you want to avoid that 
scenario?


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148225
---


On Sept. 8, 2016, 5:27 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> -------
> 
> (Updated Sept. 8, 2016, 5:27 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md bbf7198dc7ce53bb02a1bc59f75a661e23472913 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-08 Thread Maxim Khutornenko


> On Sept. 8, 2016, 6:43 p.m., Santhosh Kumar Shanmugham wrote:
> > src/test/python/apache/aurora/admin/test_admin.py, line 285
> > <https://reviews.apache.org/r/51662/diff/7/?file=1494557#file1494557line285>
> >
> > Should this also be assert_called_once_with(None) ?
> 
> Karthik Anantha Padmanabhan wrote:
> No - I set the option as 500 in the test and I want to test that we call 
> with the provided option

Don't use `assert_called_with`. Use `mock_calls` instead as a generally more 
robust way to assert the exact number of calls. There are plenty of examples 
around.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148230
---


On Sept. 8, 2016, 5:27 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 8, 2016, 5:27 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md bbf7198dc7ce53bb02a1bc59f75a661e23472913 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-08 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148225
---




api/src/main/thrift/org/apache/aurora/gen/api.thrift (line 1215)
<https://reviews.apache.org/r/51662/#comment215688>

Typos in `reconciliation`. Here and in other places.

Also, please use `task` in the RPC name, e.g.: 
`triggerExplicitTaskReconciliation`.



src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
(line 123)
<https://reviews.apache.org/r/51662/#comment215694>

Prefer using immutable `IExplicit...` wrapper generated for you by the 
build.



src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
(lines 124 - 125)
<https://reviews.apache.org/r/51662/#comment215696>

Should be formatted as:
```
int trueBatchSize = reconcilationSettings.isSetBatchSize()
? reconcilationSettings.getBatchSize()
: settings.explicitBatchSize;
```



src/main/python/apache/aurora/admin/admin.py (line 345)
<https://reviews.apache.org/r/51662/#comment215687>

Please be explicit in the name, e.g.: `reconcile_tasks`



src/main/python/apache/aurora/admin/admin.py (line 351)
<https://reviews.apache.org/r/51662/#comment215699>

Needs clarifcation that `reconciliation_explicit_batch_size` is a scheduler 
setting.


- Maxim Khutornenko


On Sept. 8, 2016, 5:27 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 8, 2016, 5:27 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md bbf7198dc7ce53bb02a1bc59f75a661e23472913 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51712: Extend getJobUpdateDetails to accept JobUpdateQuery

2016-09-08 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51712/#review148208
---



This will most certainly break the UI. The client needs to be updated as well 
to pass both (old and new) arguments. Please, test in vagrant with different 
combinations of scheduler and client versions.

- Maxim Khutornenko


On Sept. 8, 2016, 12:03 a.m., Zameer Manji wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51712/
> ---
> 
> (Updated Sept. 8, 2016, 12:03 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1764
> https://issues.apache.org/jira/browse/AURORA-1764
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This extends getJobUpdateDetails to return a list of details instead of being 
> scoped to a single update.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> 4202f0eec59ef16668fca6b6ebb925335ad5dea1 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
> 
> Diff: https://reviews.apache.org/r/51712/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Maxim Khutornenko


> On Sept. 7, 2016, 9:45 p.m., Santhosh Kumar Shanmugham wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift, lines 1209-1214
> > <https://reviews.apache.org/r/51662/diff/4/?file=1493667#file1493667line1209>
> >
> > Did we consider combining the "explicit" and "implicit" reconciliation 
> > RPCs into a single "reconcile" RPC with an optional batchSize argument, so 
> > that it maps to the mesos master API.
> > 
> > http://mesos.apache.org/documentation/latest/reconciliation/
> 
> Karthik Anantha Padmanabhan wrote:
> We have more than one option on the scheduler for the explicit reconcile. 
> If at all we want to expose those to the admin CLI in the future it may be 
> better to have them as two separate API's.
> 
> Santhosh Kumar Shanmugham wrote:
> "We have more than one option on the scheduler for the explicit 
> reconcile." - can you given an example of what you mean by this.
> 
> Karthik Anantha Padmanabhan wrote:
> There is another releavant parameter called `explicitBatchDelaySeconds` 
> which dictates how long we wait in between batches before we call reconcile 
> on mesos. 
> 
> But it is possible that we may add another option in the future ? In that 
> case wouldn't it be better to have two separate API's ?
> 
> Karthik Anantha Padmanabhan wrote:
> Hmm actually if we are passing in a struct of `ExplicitReconcileSettings` 
> , then adding options will be easier. So I can check `isNull` on the settings 
> object on the scheduler to identify an implicit reconcile. 
> 
> I can go either way. I just don't know if there is a strong reason to 
> make it a single API as opposed to two.

Please, don't save a few lines by merging the RPCs or forking off of the 
presence of the `batchSize`. Explicit and implicit reconciliations have very 
different scopes and cover different failure scenarios. I'd advocate in favor 
of Karthik's original proposal with separate call paths (RPCs and client 
commands) that clearly state their purpose in the admin API.


> On Sept. 7, 2016, 9:45 p.m., Santhosh Kumar Shanmugham wrote:
> > src/main/python/apache/aurora/admin/admin.py, line 342
> > <https://reviews.apache.org/r/51662/diff/4/?file=1493670#file1493670line342>
> >
> > 'default=0' here and the usage doc reads default as 1000.
> 
> Karthik Anantha Padmanabhan wrote:
> We want the default value to match the one from 
> settings.explicitBatchSize on the scheduler. So on the scheduler we use the 
> default value if the value here is 0. Although I agree that the 0 here is a 
> bit misleading. 
> 
> I like the optional batchSize argument for the explicit reconcile RPC. 
> 
> I am thinking there can be a 
> 
> struct ExplicitReconcileSettings {
>  1. optional i32 batchSize;
> }
> 
> This can let us add more explicit reconcile settings in the future. 
> 
> Also does thrift lets us declare optional parameters in a function 
> (unless it is wrapped in a struct) ?
> 
> Santhosh Kumar Shanmugham wrote:
> AFAIK, Thrift does not support optional function parameters, due to the 
> limitations of the languages that it gets translated to. You are on the 
> correct path, in making this a 'struct'.

Don't really see much value in overcomplicating this scenario. You can leave 
the `int` on the API side by have your command_option as `default=None` and 
traslate `None` to `0` in the client/api call. Your call.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148089
---


On Sept. 7, 2016, 9:08 p.m., Karthik Anantha Padmanabhan wrote:
> 
> -------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 7, 2016, 9:08 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>

Re: Review Request 51536: Allow watch_secs to be set to 0

2016-09-07 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review148107
---




RELEASE-NOTES.md (line 37)
<https://reviews.apache.org/r/51536/#comment215533>

IMO, this statement is useless to our users and could add to confusion. 
Client still does not allow `watch_secs` for health check enabled updates and 
scheduler always allowed zeros in the `SchedulerThriftInterface.startJobUpdate` 
for this value. It's only this assertion that was not in sync.

Suggest dropping this line and adding it into the client RB that will 
actually allow zero `watch_secs`.


- Maxim Khutornenko


On Sept. 7, 2016, 10:36 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Sept. 7, 2016, 10:36 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Allow watch_secs to be set to 0.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
> ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148046
---




api/src/main/thrift/org/apache/aurora/gen/api.thrift (lines 944 - 950)
<https://reviews.apache.org/r/51662/#comment215425>

I don't think there is much value in having this struct and the related 
`reconcileStatus` RPC. Cluster operators are normally relying on scheduler 
stats to monitor activities like this.



api/src/main/thrift/org/apache/aurora/gen/api.thrift (line 1223)
<https://reviews.apache.org/r/51662/#comment215428>

How about `triggerExplicitTaskReconciliation` and 
`triggerImplicitTaskReconciliation`? This would be self-explanatory to any API 
consumer.

nit: please end all comments with '.'



src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
(line 135)
<https://reviews.apache.org/r/51662/#comment215429>

Suggest defaulting to the configured `settings.explicitBatchSize` in case 
the provided one is 0. This will allow the correspondent admin command to make 
batch size completely optional.



src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
(lines 135 - 143)
<https://reviews.apache.org/r/51662/#comment215435>

This only prevents overlapping explicit or implicit runs but does not 
prevent explicit and implicit overlapping runs. Also, this does not cancel any 
calls already waiting for a lock.

Given that we deal with admin-triggered events (who are expected to know 
what they do), I suggest we simplify the logic here by simply scheduling an out 
of band run and not try to synchronize anything around reconciliation runs. 
This would let you cut the bulk of the changes below.



src/main/python/apache/aurora/admin/admin.py (line 355)
<https://reviews.apache.org/r/51662/#comment215437>

The default here would be 0 if you accept my suggestion above.


- Maxim Khutornenko


On Sept. 6, 2016, 11:34 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 6, 2016, 11:34 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/admin/admin_util.py 
> 394deb57af9ad8832a02ceab15f33b3c1e5c902b 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51384: Introduce UpdateMetadata fields in JobUpdateRequest.

2016-09-07 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51384/#review148035
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 7, 2016, 3:29 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51384/
> ---
> 
> (Updated Sept. 7, 2016, 3:29 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Maxim 
> Khutornenko.
> 
> 
> Bugs: AURORA-1711
> https://issues.apache.org/jira/browse/AURORA-1711
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Allow custom metadata to be stored with the job updates. This feature
> is to help case the reconciliation on job update request on the clients.
> 
> Today when a client's requests are proxied via a middle-man,
> if the middle-man times out before the scheduler responds with success,
> the client is left the only option of retry. In the case of updates,
> since multiple updates are not allowed in parallel, the retries fail,
> with INVALID_REQUEST. Although the original update is already accepted
> by the scheduler and is in-progress.
> 
> With this feature, clients can reconcile the job updates,
> - by marking updates them with a unique id in the metadata field
> - scheduler returns the in-progress job update in its response
> - client can match the client-generated ids to reconcile its state.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/jmh/java/org/apache/aurora/benchmark/JobUpdates.java 
> f4f8d0037751c9c2096747264c19f6292461b308 
>   src/jmh/java/org/apache/aurora/benchmark/UpdateStoreBenchmarks.java 
> e5228ae9baadb8cad1e5ce143df09426d6107583 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java 
> d2673e6b328cb1e249fbe91d18e0d9e935636eaa 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.java
>  a3b04949f1726f110638e4f93ef45947cdb9e7f8 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/migration/V008_CreateUpdateMetadataTable.java
>  PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  594bb6219294dcc77d48dcad14e2a6f9caa0c534 
>   
> src/main/java/org/apache/aurora/scheduler/updater/UpdateInProgressException.java
>  PRE-CREATION 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   src/main/python/apache/aurora/client/cli/update.py 
> d23243dcf93dd82f66b4c8cc4342bd2c8d2adc79 
>   src/main/python/apache/aurora/client/hooks/hooked_api.py 
> 542f165add0f1d01a782fe6d6089bff3e736fb82 
>   
> src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml
>  8563630a179cb6d1e3b0e22c730ccbfe1c9291e2 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql 
> a40830fd668aa650c22a48cbc757b45aef27e961 
>   
> src/test/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStoreTest.java
>  08530397ff75081bde6f07f9d53317b5486e0da4 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
>   src/test/python/apache/aurora/client/cli/test_supdate.py 
> 92fb039fb7d3e368d7c0dfa5ebdb465c7f112416 
>   src/test/python/apache/aurora/client/cli/util.py 
> aac9f9c802e4ad1f06cee8cf3f56749760b33af5 
> 
> Diff: https://reviews.apache.org/r/51384/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> ./pants test src/test/python/apache/aurora/client/cli:cli
> 
> 
> Thanks,
> 
> Santhosh Kumar Shanmugham
> 
>



Re: Review Request 51384: Introduce UpdateMetadata fields in JobUpdateRequest.

2016-09-07 Thread Maxim Khutornenko


> On Sept. 2, 2016, 9:37 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java,
> >  line 913
> > <https://reviews.apache.org/r/51384/diff/7/?file=1489641#file1489641line913>
> >
> > This will be a null ref if scheduler is called from a v-1 client that 
> > does not know anything about update metadata. You need to do something like 
> > this:
> > ```
> > .setMetadata(request.isSetMetadata() 
> > ? toBuildersSet(request.getMetadata()) 
> > : ImmutableSet.of())
> > ```
> 
> Santhosh Kumar Shanmugham wrote:
> Thrift does not expose isSetMetadata() method for optional fileds of 
> primitive type in the IJobUpdateRequest. It is available however in 
> JobUpdateRequest.java and the earlier code that converts from 
> JobUpdateRequest to IJobUpdateRequest does the check, via 
> IJobUpdateRequest.build().
> 
> // SchedulerThrift.java
>   request = IJobUpdateRequest.build(new 
> JobUpdateRequest(mutableRequest).setTaskConfig(
>   configurationManager.validateAndPopulate(
>   
> ITaskConfig.build(mutableRequest.getTaskConfig())).newBuilder()));
>   
> // IJobUpdateRequest.java
>   private IJobUpdateRequest(JobUpdateRequest wrapped) {
> ...
> this.metadata = wrapped.isSetMetadata()
> ? FluentIterable.from(wrapped.getMetadata())
>   .transform(IMetadata::build)
>   .toSet()
> : ImmutableSet.of();
>   }
>   
>   public static IJobUpdateRequest build(JobUpdateRequest wrapped) {
> return new IJobUpdateRequest(wrapped);
>   }

That's right, I was mistakenly assuming that was a `JobUpdateRequest` but that 
is tracked under `mutableRequest`.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51384/#review147694
---


On Sept. 7, 2016, 3:29 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51384/
> ---
> 
> (Updated Sept. 7, 2016, 3:29 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Maxim 
> Khutornenko.
> 
> 
> Bugs: AURORA-1711
> https://issues.apache.org/jira/browse/AURORA-1711
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Allow custom metadata to be stored with the job updates. This feature
> is to help case the reconciliation on job update request on the clients.
> 
> Today when a client's requests are proxied via a middle-man,
> if the middle-man times out before the scheduler responds with success,
> the client is left the only option of retry. In the case of updates,
> since multiple updates are not allowed in parallel, the retries fail,
> with INVALID_REQUEST. Although the original update is already accepted
> by the scheduler and is in-progress.
> 
> With this feature, clients can reconcile the job updates,
> - by marking updates them with a unique id in the metadata field
> - scheduler returns the in-progress job update in its response
> - client can match the client-generated ids to reconcile its state.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/jmh/java/org/apache/aurora/benchmark/JobUpdates.java 
> f4f8d0037751c9c2096747264c19f6292461b308 
>   src/jmh/java/org/apache/aurora/benchmark/UpdateStoreBenchmarks.java 
> e5228ae9baadb8cad1e5ce143df09426d6107583 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java 
> d2673e6b328cb1e249fbe91d18e0d9e935636eaa 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.java
>  a3b04949f1726f110638e4f93ef45947cdb9e7f8 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/migration/V008_CreateUpdateMetadataTable.java
>  PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  594bb6219294dcc77d48dcad14e2a6f9caa0c534 
>   
> src/main/java/org/apache/aurora/scheduler/updater/UpdateInProgressException.java
>  PRE-CREATION 
>   src/main/python/apache/aurora

Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled

2016-09-06 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review147922
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 6, 2016, 6:46 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Sept. 6, 2016, 6:46 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Scheduler updater will not use watch_sec if health check is enabled.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> After discussion on Aurora dev list, we decided to keep the watch_secs 
> infrastructure intact on scheduler side. Our final conclusion is that we 
> adopt the following implementation:
> 
> 1. If the users want purely health checking driven updates they can set 
> watch_secs to 0 and enable health checks.
> 
> 2. If they want to have both health checking and time driven updates they can 
> set watch_secs to the time that they care about, and doing health checks at 
> STARTING state as well.
> 
> 3. If they just want time driven updates they can disable health checking and 
> set watch_secs to the time that they care about.
> 
> In this review, there will be only one scheduler change: 
> Currently scheduler does not accept zero value for watch_secs, we need to 
> relax this constraint.
> 
> Executor change to do (in a separate review):
> The executor starts health check at STARTING, if a successful health check is 
> performed before initial_interval_sec expires, the executor will sends a 
> status message for RUNNING.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
> ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled

2016-09-06 Thread Maxim Khutornenko


> On Sept. 6, 2016, 8:56 p.m., Stephan Erb wrote:
> > The current default of `watch_secs` is 45 seconds. Should we drop that to 0 
> > and also adapt the docs accordingly? That would optimize the default 
> > settings for the health check driven updates, which I think would be a good 
> > thing.

If we drop it to 0 now it will break many clients currently relying on it. I 
think we'll have to wait until the next release if we want to set a new default.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review147917
---


On Sept. 6, 2016, 6:46 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Sept. 6, 2016, 6:46 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Scheduler updater will not use watch_sec if health check is enabled.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> After discussion on Aurora dev list, we decided to keep the watch_secs 
> infrastructure intact on scheduler side. Our final conclusion is that we 
> adopt the following implementation:
> 
> 1. If the users want purely health checking driven updates they can set 
> watch_secs to 0 and enable health checks.
> 
> 2. If they want to have both health checking and time driven updates they can 
> set watch_secs to the time that they care about, and doing health checks at 
> STARTING state as well.
> 
> 3. If they just want time driven updates they can disable health checking and 
> set watch_secs to the time that they care about.
> 
> In this review, there will be only one scheduler change: 
> Currently scheduler does not accept zero value for watch_secs, we need to 
> relax this constraint.
> 
> Executor change to do (in a separate review):
> The executor starts health check at STARTING, if a successful health check is 
> performed before initial_interval_sec expires, the executor will sends a 
> status message for RUNNING.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
> ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51580: Add MEDIAN_TIME_TO_STARTING as a new metric

2016-09-06 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51580/#review147853
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 5, 2016, 6:56 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51580/
> ---
> 
> (Updated Sept. 5, 2016, 6:56 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> A new MTTS (Median Time To Starting) metric is added to the sla module in 
> addition to MTTA and MTTR.
> 
> This review request is related to my previous review request: 
> https://reviews.apache.org/r/51536
> 
> In the new implementation, the executor starts health check at STARTING, if a 
> successful health check is performed before initial_interval_sec expires, it 
> transitions into RUNNING state. Therefore, MTTS gives us an idea of how long 
> it takes for a task to become active, whereas the difference between MTTR and 
> MTTS represents the warm-up period for a task.
> 
> See the following issues for more backgrounds:
> 
> https://issues.apache.org/jira/browse/AURORA-1221
> 
> https://issues.apache.org/jira/browse/AURORA-1222
> 
> The new metrics represents the median time spent waiting for a set of tasks 
> to reach STARTING status within a time frame(including the tasks turning into 
> RUNNING state within the time frame).
> 
> Here I regard STARTING as an active state. However, STARTING state is account 
> for platform and job uptime calculations.
> 
> 
> Diffs
> -
> 
>   docs/features/sla-metrics.md 932b5dceb7e356175c7e55c75c5546ecde7ad2c4 
>   src/main/java/org/apache/aurora/scheduler/sla/MetricCalculator.java 
> 3ddac8b9c0adb0e2e7d02b1a741e9ff6976b3c9e 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 
> 4f243aab5a2c2f86ec795025e86302a09f864e2d 
>   src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 
> 90ea3a169dadc72e7d7493544ab865ec59d4d425 
> 
> Diff: https://reviews.apache.org/r/51580/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> ./gradlew :test
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51595: Add Job Store and Dynamic Reservations design docs

2016-09-02 Thread Maxim Khutornenko


> On Sept. 2, 2016, 7:13 p.m., Zameer Manji wrote:
> > AFAIK, the Job Store design doc is on hold right?
> > 
> > Maybe we shouldn't link to it until someone works on it.

+1. That design doc needs some love before we can start linking it.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51595/#review147687
---


On Sept. 2, 2016, 9:14 a.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51595/
> ---
> 
> (Updated Sept. 2, 2016, 9:14 a.m.)
> 
> 
> Review request for Aurora and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Add Job Store and Dynamic Reservations design docs
> 
> 
> Diffs
> -
> 
>   docs/development/design-documents.md 
> 50aeafbc5f96a8c443c90269730cd8b4c4bfd6d7 
> 
> Diff: https://reviews.apache.org/r/51595/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 51602: Extend the resource isolation and oversubscription documentation

2016-09-02 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51602/#review147701
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 2, 2016, 2:55 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51602/
> ---
> 
> (Updated Sept. 2, 2016, 2:55 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> I had to answer a couple of questions regarding these over the recent weeks 
> and thought it might make sense to update the docs accordingly.
> 
> 
> Diffs
> -
> 
>   docs/features/multitenancy.md cb45beb213eb74296194f8cad9e1aefbde07af02 
>   docs/features/resource-isolation.md 
> 59da8231a3e25e64b4be8c30a44c1043188aff2c 
>   docs/operations/configuration.md 350ea77e30ce445ba9e37ea95a9e4f593dbbe402 
> 
> Diff: https://reviews.apache.org/r/51602/diff/
> 
> 
> Testing
> ---
> 
> Rendered version available at 
> https://github.com/StephanErb/aurora/tree/isolation-docs
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 51384: Introduce UpdateMetadata fields in JobUpdateRequest.

2016-09-02 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51384/#review147694
---



Looks great! Only a few comments left.


api/src/main/thrift/org/apache/aurora/gen/api.thrift (line 774)
<https://reviews.apache.org/r/51384/#comment214905>

nit: all comments should be ended with '.'. Here and everywhere.



src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java 
(line 102)
<https://reviews.apache.org/r/51384/#comment214907>

`checkNotBlank` is redundant here as you enter this block only if it's not 
empty.



src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
(line 913)
<https://reviews.apache.org/r/51384/#comment214908>

This will be a null ref if scheduler is called from a v-1 client that does 
not know anything about update metadata. You need to do something like this:
```
.setMetadata(request.isSetMetadata() 
? toBuildersSet(request.getMetadata()) 
: ImmutableSet.of())
```



src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
(lines 932 - 934)
<https://reviews.apache.org/r/51384/#comment214909>

tabbing is off here



src/main/python/apache/aurora/client/cli/context.py (line 79)
<https://reviews.apache.org/r/51384/#comment214911>

This is only used in update.py, please move it there.



src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml
 (lines 357 - 358)
<https://reviews.apache.org/r/51384/#comment214915>

Instead of sub-selecting from a left join, have you considered a collection 
with a subselect instead? See `details.updateEvents` for example. 

It should be more compact and most importantly, perform better on large 
metadata counts. You can validate the last assumption via `./gradlew jmh 
-Pbenchmarks='UpdateStoreBenchmarks.JobDetailsBenchmark'` or create a new 
benchmark to test fetching job summaries only.


- Maxim Khutornenko


On Sept. 1, 2016, 5:58 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51384/
> ---
> 
> (Updated Sept. 1, 2016, 5:58 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Maxim 
> Khutornenko.
> 
> 
> Bugs: AURORA-1711
> https://issues.apache.org/jira/browse/AURORA-1711
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Allow custom metadata to be stored with the job updates. This feature
> is to help case the reconciliation on job update request on the clients.
> 
> Today when a client's requests are proxied via a middle-man,
> if the middle-man times out before the scheduler responds with success,
> the client is left the only option of retry. In the case of updates,
> since multiple updates are not allowed in parallel, the retries fail,
> with INVALID_REQUEST. Although the original update is already accepted
> by the scheduler and is in-progress.
> 
> With this feature, clients can reconcile the job updates,
> - by marking updates them with a unique id in the metadata field
> - scheduler returns the in-progress job update in its response
> - client can match the client-generated ids to reconcile its state.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java 
> d2673e6b328cb1e249fbe91d18e0d9e935636eaa 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.java
>  a3b04949f1726f110638e4f93ef45947cdb9e7f8 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/migration/V008_CreateUpdateMetadataTable.java
>  PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  594bb6219294dcc77d48dcad14e2a6f9caa0c534 
>   
> src/main/java/org/apache/aurora/scheduler/updater/UpdateInProgressException.java
>  PRE-CREATION 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   src/main/python/apache/aurora/client/cli/context.py 
> 9cf58396dc266e97f18f761d110e564ac02bb15d 
>   src/main/python/apache/aurora/client/cli/update.py 
> d23243dcf93dd82f66b4c8cc4342bd2c8d2adc79 
>   src/main/python/apache/aurora/client/hooks/hoo

Re: Review Request 51536: @ReviewBot retry Scheduler updater will not use watch_sec if health check is enabled

2016-08-31 Thread Maxim Khutornenko


> On Aug. 30, 2016, 9:25 p.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java, 
> > line 143
> > <https://reviews.apache.org/r/51536/diff/1/?file=1488791#file1488791line143>
> >
> > We should add a constant to api.thrift for `health` and use it here as 
> > well as in the executor.
> 
> Stephan Erb wrote:
> We need to look for a more general solution here. The presence of a 
> health port does not account for the potential existence of a 
> `.healthchecksnooze` file disabling health checks.
> 
> Kai Huang wrote:
> Thanks, that's a good point. We could combine this logic into "executor 
> capabilities" as Joshua mentioned.

I disagree that we should account for the `.healthchecksnooze` for this 
feature. This approach has always been considered extremely risky and "proceed 
at your own risk" way to debug end user issues. If someone drops a 
`.healthchecksnooze` _after_ task entring `STARTING` and _before_ it reaches 
`RUNNING` I believe the following should be enough to address such case (from 
design doc):

> 2. Instance fails to respond OK and the initial_interval_secs expires ? task 
> moves into FAILED.


> On Aug. 30, 2016, 9:25 p.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java, 
> > line 130
> > <https://reviews.apache.org/r/51536/diff/1/?file=1488791#file1488791line130>
> >
> > Doing this based simply on the presence of a string feels brittle to 
> > me. Would it make sense to define some basic form of "executor 
> > capabilities" passing? Even if it's something simple like a comma delimited 
> > list of constants that we define in api.thrift (so that both the scheduler 
> > and the executor can reuse the values)?
> 
> Kai Huang wrote:
> Do you mean adding executor capabilities as one attribute of TaskEvent in 
> api.thrift? 
> 
> I think on the executor side , we can put (isHealthCheckEnabled() && 
> isExecutorUpdated()) into one attribute called "watch_sec_skippable", and 
> send it to the scheduler.
> 
> Zameer Manji wrote:
> "I think on the executor side , we can put (isHealthCheckEnabled() && 
> isExecutorUpdated()) into one attribute called "watch_sec_skippable", and 
> send it to the scheduler."
> 
> We currently have no communication via the executor and scheduler outside 
> of status updates. If you want to do this, please discuss this idea on the 
> mailing list. The design doc makes no reference to executor capabilities or 
> any communication between the executor and scheduler.

Joshua, how did you see the "executor capabilities" working here? As Zameer 
pointed out, there is currently no way for us to communicate real-time executor 
capabilities to the scheduler. I don't think adding this type of info to every 
status update justifies the amount of work and additional storage space that 
would require. Besides, we'd have to deprecate this capability and remove from 
the schema in the next release or two. 

I agree relying on the presence of the message may appear brittle at first but 
as far as I can see it's the most straightforward approach that will not 
require any special handling or deprecation followups. We are in full control 
of the executor and it's guaranteed through code the message will remain absent 
for any V-n versions running out there.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review147358
---


On Aug. 30, 2016, 8:52 p.m., Kai Huang wrote:
> 
> -------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Aug. 30, 2016, 8:52 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Scheduler updater will not use watch_sec if health check is enabled
> - Add unit tests for successful updates.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details a

Re: Review Request 51535: Fix a Python unittest that is not asserting anything

2016-08-30 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51535/#review147346
---


Ship it!




Good catch!

- Maxim Khutornenko


On Aug. 30, 2016, 7:15 p.m., Stephan Erb wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51535/
> ---
> 
> (Updated Aug. 30, 2016, 7:15 p.m.)
> 
> 
> Review request for Aurora and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> I discovered this one during a failed attempt to update to a new mock 
> version. I have only aimed for a minimal fix:
> 
> * incorrect usage of PropertyMock: 
> https://docs.python.org/dev/library/unittest.mock.html#unittest.mock.PropertyMock
>   
> * assert_called_once() does not exist: 
> https://engineeringblog.yelp.com/2015/02/assert_called_once-threat-or-menace.html
> 
> 
> Diffs
> -
> 
>   src/test/python/apache/aurora/admin/test_admin.py 
> fd3b91938b5b67ea59d3955dd94a046b3181be2d 
> 
> Diff: https://reviews.apache.org/r/51535/diff/
> 
> 
> Testing
> ---
> 
> Made sure it fails if the production method is tempered with.
> ./pants test.pytest src/test/python/apache/aurora/admin::
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>



Re: Review Request 51506: Enable -zk_use_curator by default and deprecate.

2016-08-30 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51506/#review147326
---




src/main/java/org/apache/aurora/scheduler/discovery/FlaggedZooKeeperConfig.java 
(line 78)
<https://reviews.apache.org/r/51506/#comment214449>

I suggest against placing a version number into the warning message. We've 
had a few past examples when we did not follow up on deprecation warnings as 
promised.


- Maxim Khutornenko


On Aug. 29, 2016, 11:37 p.m., John Sirois wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51506/
> ---
> 
> (Updated Aug. 29, 2016, 11:37 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Bugs: AURORA-1669
> https://issues.apache.org/jira/browse/AURORA-1669
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> The flag is noted as deprecated for removal in 0.17.0.
> 
>  RELEASE-NOTES.md 
>|  2 ++
>  docs/reference/scheduler-configuration.md
>|  4 ++--
>  examples/vagrant/upstart/aurora-scheduler.conf   
>|  1 -
>  
> src/main/java/org/apache/aurora/scheduler/discovery/FlaggedZooKeeperConfig.java
>  | 11 +--
>  4 files changed, 13 insertions(+), 5 deletions(-)
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md 1819eaa20cf5014228643a1e120316d646cc2824 
>   docs/reference/scheduler-configuration.md 
> c0c39442725c970e5f9811cd7b4ab3104364c671 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> dd60981564286e5b25546737fdd0ce08b0168ed4 
>   
> src/main/java/org/apache/aurora/scheduler/discovery/FlaggedZooKeeperConfig.java
>  36ad18c49c5693031136440ab163070f9ffa9405 
> 
> Diff: https://reviews.apache.org/r/51506/diff/
> 
> 
> Testing
> ---
> 
> Locally green:
> ```
> ./gradlew -Pq build
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> ```
> 
> 
> Thanks,
> 
> John Sirois
> 
>



Re: Review Request 51307: Catch IOError.

2016-08-29 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51307/#review147179
---


Ship it!




Ship It!

- Maxim Khutornenko


On Aug. 23, 2016, 9:12 p.m., David Robinson wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51307/
> ---
> 
> (Updated Aug. 23, 2016, 9:12 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1752
> https://issues.apache.org/jira/browse/AURORA-1752
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Catch IOError.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/thermos/monitoring/process_collector_psutil.py 
> bb7c90206791309772c4bb8e2ccf6e62a3991403 
>   src/test/python/apache/thermos/monitoring/test_process_collector_psutil.py 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/51307/diff/
> 
> 
> Testing
> ---
> 
> $ ./pants test src/test/python/apache/thermos/monitoring
> 
> 13:59:36 00:00 [main]
>(To run a reporting server: ./pants server)
> 13:59:36 00:00   [setup]
> 13:59:36 00:00 [parse]
>Executing tasks in goals: test
> 13:59:36 00:00   [test]
> 13:59:36 00:00 [test-prep-command]
> 13:59:36 00:00 [test]
> 13:59:36 00:00 [pytest]
> 13:59:36 00:00   [run]
>  == test session starts ===
>  platform linux2 -- Python 2.7.11 -- py-1.4.31 -- 
> pytest-2.6.4
>  plugins: cov, timeout
>  collected 12 items 
>  
>  src/test/python/apache/thermos/monitoring/test_disk.py .
>  
> src/test/python/apache/thermos/monitoring/test_detector.py .
>  
> src/test/python/apache/thermos/monitoring/test_process_collector_psutil.py .
>  
> src/test/python/apache/thermos/monitoring/test_resource.py .
>  
>  === 12 passed in 0.17 seconds 
>  
> 13:59:37 00:01   [complete]
>SUCCESS
> 
> 
> Thanks,
> 
> David Robinson
> 
>



Re: Review Request 51469: Remove static Stats method `exportSize`.

2016-08-29 Thread Maxim Khutornenko


> On Aug. 29, 2016, 4:33 p.m., Maxim Khutornenko wrote:
> > Not against this change but rather curious what your thoughts are wrt the 
> > larger picture here. Are you intending to get rid of all static helps in 
> > `Stats` in favor of the `StatsProvider`?
> 
> Zameer Manji wrote:
> Yes, I think the static dependency on stats is tech debt that we need to 
> fix. There is also lots of unused code in Stats that I will be removing as 
> well.

Not sure I agree about eliminating all of `Stats` static helpers. Having 
methods like `exportLong` have proven to help us reduce friction around adding 
new stats and generally help quick prototyping and perf investigations.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51469/#review147138
---


On Aug. 26, 2016, 10:11 p.m., Zameer Manji wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51469/
> ---
> 
> (Updated Aug. 26, 2016, 10:11 p.m.)
> 
> 
> Review request for Aurora and Joshua Cohen.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Remove static Stats method `exportSize`.
> 
> 
> Diffs
> -
> 
>   commons/src/main/java/org/apache/aurora/common/stats/Stats.java 
> 538e8077ba762ec9a8f14f759740bb82144a5dc4 
>   commons/src/main/java/org/apache/aurora/common/stats/StatsProvider.java 
> 6b1fa4b35c54ab3e37a80c91664c6e8cfd4696c2 
>   src/main/java/org/apache/aurora/scheduler/TaskStatusHandlerImpl.java 
> a83f18307fd8e1d266f3d3ae8682333575eda884 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 
> a1ae7c7231d15d2f9715a7de30dfcde0a2f4d730 
>   src/test/java/org/apache/aurora/scheduler/TaskStatusHandlerImplTest.java 
> 6d4934b6b6de5273687041174315c4e3d151ac11 
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 
> e7534c4325afa29b580d890cffdc26bdec7ef938 
> 
> Diff: https://reviews.apache.org/r/51469/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>



Re: Review Request 51384: Introduce UpdateMetadata fields in JobUpdateRequest.

2016-08-29 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51384/#review147146
---




api/src/main/thrift/org/apache/aurora/gen/api.thrift (lines 795 - 796)
<https://reviews.apache.org/r/51384/#comment214251>

How about moving this into `JobUpdateSummary` instead? As it stands now, 
the access to update metadata requires exposing a potentially much heavier 
`JobUpdate` struct through our thrift interface. 

Bonus point: if you move it into the `JobUpdateSummary` the metadata will 
be automatically exposed via the existing `getJobUpdateSummaries` RPC.


- Maxim Khutornenko


On Aug. 28, 2016, 3:58 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51384/
> ---
> 
> (Updated Aug. 28, 2016, 3:58 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Maxim 
> Khutornenko.
> 
> 
> Bugs: AURORA-1711
> https://issues.apache.org/jira/browse/AURORA-1711
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Allow custom metadata to be stored with the job updates. This feature
> is to help case the reconciliation on job update request on the clients.
> 
> Today when a client's requests are proxied via a middle-man,
> if the middle-man times out before the scheduler responds with success,
> the client is left the only option of retry. In the case of updates,
> since multiple updates are not allowed in parallel, the retries fail,
> with INVALID_REQUEST. Although the original update is already accepted
> by the scheduler and is in-progress.
> 
> With this feature, clients can reconcile the job updates,
> - by marking updates them with a unique id in the metadata field
> - scheduler returns the in-progress job update in its response
> - client can match the client-generated ids to reconcile its state.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java 
> d2673e6b328cb1e249fbe91d18e0d9e935636eaa 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.java
>  a3b04949f1726f110638e4f93ef45947cdb9e7f8 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/migration/V008_CreateUpdateMetadataTable.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbJobUpdate.java 
> 78703e92c932cd5e93ff0b70f2a0b6928f6b4003 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  594bb6219294dcc77d48dcad14e2a6f9caa0c534 
>   
> src/main/java/org/apache/aurora/scheduler/updater/UpdateInProgressException.java
>  PRE-CREATION 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   src/main/python/apache/aurora/client/cli/context.py 
> f1a256a8d09d23d8d4d4ee7d264be0fe376398c4 
>   src/main/python/apache/aurora/client/cli/options.py 
> 1245ff15a69a4b4347672f7b556985521e813a00 
>   src/main/python/apache/aurora/client/cli/update.py 
> 23aaa2c1b67599420408633733e4581553f7151b 
>   src/main/python/apache/aurora/client/hooks/hooked_api.py 
> 542f165add0f1d01a782fe6d6089bff3e736fb82 
>   
> src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml
>  8563630a179cb6d1e3b0e22c730ccbfe1c9291e2 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql 
> a40830fd668aa650c22a48cbc757b45aef27e961 
>   
> src/test/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStoreTest.java
>  08530397ff75081bde6f07f9d53317b5486e0da4 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
>   src/test/python/apache/aurora/client/cli/test_supdate.py 
> 92fb039fb7d3e368d7c0dfa5ebdb465c7f112416 
>   src/test/python/apache/aurora/client/cli/util.py 
> aac9f9c802e4ad1f06cee8cf3f56749760b33af5 
> 
> Diff: https://reviews.apache.org/r/51384/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> ./pants test src/test/python/apache/aurora/client/cli:cli
> 
> 
> Thanks,
> 
> Santhosh Kumar Shanmugham
> 
>



Re: Review Request 51469: Remove static Stats method `exportSize`.

2016-08-29 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51469/#review147138
---



Not against this change but rather curious what your thoughts are wrt the 
larger picture here. Are you intending to get rid of all static helps in 
`Stats` in favor of the `StatsProvider`?

- Maxim Khutornenko


On Aug. 26, 2016, 10:11 p.m., Zameer Manji wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51469/
> ---
> 
> (Updated Aug. 26, 2016, 10:11 p.m.)
> 
> 
> Review request for Aurora and Joshua Cohen.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Remove static Stats method `exportSize`.
> 
> 
> Diffs
> -
> 
>   commons/src/main/java/org/apache/aurora/common/stats/Stats.java 
> 538e8077ba762ec9a8f14f759740bb82144a5dc4 
>   commons/src/main/java/org/apache/aurora/common/stats/StatsProvider.java 
> 6b1fa4b35c54ab3e37a80c91664c6e8cfd4696c2 
>   src/main/java/org/apache/aurora/scheduler/TaskStatusHandlerImpl.java 
> a83f18307fd8e1d266f3d3ae8682333575eda884 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 
> a1ae7c7231d15d2f9715a7de30dfcde0a2f4d730 
>   src/test/java/org/apache/aurora/scheduler/TaskStatusHandlerImplTest.java 
> 6d4934b6b6de5273687041174315c4e3d151ac11 
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 
> e7534c4325afa29b580d890cffdc26bdec7ef938 
> 
> Diff: https://reviews.apache.org/r/51469/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>



Re: Review Request 51474: Uber

2016-08-29 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51474/#review147136
---


Ship it!




Ship It!

- Maxim Khutornenko


On Aug. 27, 2016, 5:53 a.m., Tarun Gogineni wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51474/
> ---
> 
> (Updated Aug. 27, 2016, 5:53 a.m.)
> 
> 
> Review request for Aurora and Joshua Cohen.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Added Uber to the relevant user list
> 
> 
> Diffs
> -
> 
>   README.md 77fb77063eb96da9574794f989641c7746d2c35c 
> 
> Diff: https://reviews.apache.org/r/51474/diff/
> 
> 
> Testing
> ---
> 
> Their latest blog post regarding their engineering stack mentioned their use 
> of aurora for long running jobs.
> 
> 
> Thanks,
> 
> Tarun Gogineni
> 
>



Re: Review Request 51298: A few executor fixes for filesystem isolation:

2016-08-22 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51298/#review146441
---


Ship it!




Ship It!

- Maxim Khutornenko


On Aug. 22, 2016, 8:25 p.m., Joshua Cohen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51298/
> ---
> 
> (Updated Aug. 22, 2016, 8:25 p.m.)
> 
> 
> Review request for Aurora and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Add an option to skip the groupadd/useradd calls into the task's filesystem.
> - Mount any configured volumes into the task's filesystem.
> - Clean up http server script used by appc e2e tests.
> 
> 
> Diffs
> -
> 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> dde19a6914c7c7b2178220707f242f61f11f38bd 
>   src/main/python/apache/aurora/executor/bin/thermos_executor_main.py 
> 65a495d5c50d91b38c4328bab3bfec667f6a7ba9 
>   src/main/python/apache/aurora/executor/common/sandbox.py 
> 5f091af7636bd94f028f15d63437e305b02f741c 
>   src/test/python/apache/aurora/executor/common/test_sandbox.py 
> ce989b1ccda0f1bc7ba9e15dfe4be20116db3491 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> 06601df3bc355a690ff1789b2e2e34484fadefe9 
>   src/test/sh/org/apache/aurora/e2e/run-server.sh 
> 76939888bed2e8138671d97f7bc56fd5641008e4 
> 
> Diff: https://reviews.apache.org/r/51298/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> e2e tests
> 
> 
> Thanks,
> 
> Joshua Cohen
> 
>



Re: Review Request 51107: Thrift API change to support instance-specific task config in JobConfiguration

2016-08-15 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51107/#review145805
---



-1 to this change. Please, see here for reasons behind it: 
http://markmail.org/message/d4j6lkncqkua5a23

- Maxim Khutornenko


On Aug. 15, 2016, 9:11 p.m., Min Cai wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51107/
> ---
> 
> (Updated Aug. 15, 2016, 9:11 p.m.)
> 
> 
> Review request for Aurora.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Implement instance-specific task config in JobConfiguration
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> b799cce665c99ee20fba84a9ddcb5a895ff5685e 
>   
> src/main/java/org/apache/aurora/scheduler/configuration/SanitizedConfiguration.java
>  264cb7cb38809a3a37b13b76e93f423ec03fbacc 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
> c07551e94f9221b5b21c5dc9715e82caa290c2e8 
>   src/main/java/org/apache/aurora/scheduler/state/StateManager.java 
> d395104af84477b03649ddd65276fbd1dc1e9210 
>   src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
> ffa9481fa4d2f5e1cf74f5819b398e3cb74f1010 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  b534abf95bab6e1657e3ef993cf34c0d6ec460be 
>   
> src/main/java/org/apache/aurora/scheduler/updater/InstanceActionHandler.java 
> 11ffa663e4e0fa5fda0ebb343d11d2485a83c7c6 
> 
> Diff: https://reviews.apache.org/r/51107/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Min Cai
> 
>



Re: Review Request 50168: Add rollback functionality to the scheduler

2016-08-10 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50168/#review145455
---


Ship it!




Ship It!

- Maxim Khutornenko


On Aug. 11, 2016, 1:32 a.m., Igor Morozov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50168/
> ---
> 
> (Updated Aug. 11, 2016, 1:32 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, Bill Farner, and 
> Zameer Manji.
> 
> 
> Bugs: AURORA-1721
> https://issues.apache.org/jira/browse/AURORA-1721
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Add rollback functionality to the scheduler
> 
> 
> Diffs
> -
> 
>   CHANGELOG fc6a46d77ebf889c8f60402c95e8a7472980ba1c 
>   RELEASE-NOTES.md 50f9b8372b364773d4ee6b29f031deb13c9f6444 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> b799cce665c99ee20fba84a9ddcb5a895ff5685e 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  b534abf95bab6e1657e3ef993cf34c0d6ec460be 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/aop/AnnotatedAuroraAdmin.java
>  9243c92b11040b68ed6014b3991db69fc08bcddf 
>   src/main/java/org/apache/aurora/scheduler/updater/JobUpdateController.java 
> f8357c46df1b025bf4e38a7ce1cb1c13a50c39f9 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  364c5c753f884a2d89e27802d7bbf3b2b6d3a08e 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateStateMachine.java 
> 7ab739a4dfa292895ad6ba8849e65f5c45ce9770 
>   src/main/python/apache/aurora/client/api/__init__.py 
> ec2c786d143347a7953f9fe431c6ddc59e1b9854 
>   src/main/python/apache/aurora/client/cli/update.py 
> bb526f7bf94d7bfe02fe2786493c85be1bfeb86f 
>   
> src/test/java/org/apache/aurora/scheduler/updater/JobUpdateStateMachineTest.java
>  8d78bae2cbf365a5359d88c0e996b3972a21e50e 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> e157c0dfde5efc418448e138aa008ade742fe816 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afbd385b7eda64cb1f7d118b695e65e4045eac6c 
>   src/test/python/apache/aurora/client/cli/test_supdate.py 
> 317b17547e8b33106787fbe43b26cec2da830ba1 
> 
> Diff: https://reviews.apache.org/r/50168/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Igor Morozov
> 
>



Re: Review Request 50902: AURORA-1656 Fix broken links in tier documentation

2016-08-10 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50902/#review145393
---


Ship it!




Ship It!

- Maxim Khutornenko


On Aug. 8, 2016, 5:13 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50902/
> ---
> 
> (Updated Aug. 8, 2016, 5:13 p.m.)
> 
> 
> Review request for Aurora and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1656
> https://issues.apache.org/jira/browse/AURORA-1656
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Just realized that tier documentation (that I recently touched for 
> AURORA-1656) has broken links that I had not notice.
> 
> 
> Diffs
> -
> 
>   docs/features/multitenancy.md 417c433665bf9e32416bde426078883931783c1c 
>   docs/reference/configuration.md 16b31be5bf3021eb2f646e7a4361d5b614bd1164 
> 
> Diff: https://reviews.apache.org/r/50902/diff/
> 
> 
> Testing
> ---
> 
> Manual testing
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Re: Review Request 50902: AURORA-1656 Fix broken links in tier documentation

2016-08-10 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50902/#review145364
---



The easiest way to validate doc changes like this is to push your branch to 
your github remote and post a link here for review. Care to give it a try?

- Maxim Khutornenko


On Aug. 8, 2016, 5:13 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50902/
> ---
> 
> (Updated Aug. 8, 2016, 5:13 p.m.)
> 
> 
> Review request for Aurora and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1656
> https://issues.apache.org/jira/browse/AURORA-1656
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Just realized that tier documentation (that I recently touched for 
> AURORA-1656) has broken links that I had not notice.
> 
> 
> Diffs
> -
> 
>   docs/features/multitenancy.md 417c433665bf9e32416bde426078883931783c1c 
>   docs/reference/configuration.md 16b31be5bf3021eb2f646e7a4361d5b614bd1164 
> 
> Diff: https://reviews.apache.org/r/50902/diff/
> 
> 
> Testing
> ---
> 
> Manual testing
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Re: Review Request 50168: Add rollback functionality to the scheduler

2016-08-10 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50168/#review145356
---




src/main/java/org/apache/aurora/scheduler/updater/JobUpdateController.java 
(lines 131 - 133)
<https://reviews.apache.org/r/50168/#comment211555>

This needs to be updated to include ROLL_FORWARD_AWAITING_PULSE and 
ROLL_FORWARD_PAUSED.



src/main/java/org/apache/aurora/scheduler/updater/JobUpdateStateMachine.java 
(line 73)
<https://reviews.apache.org/r/50168/#comment211556>

If we want to make it truly generic I think we also need to support 
ROLL_FORWARD_PAUSED -> ROLLING_BACK transition. That will cover *all* active 
states.


- Maxim Khutornenko


On Aug. 9, 2016, 10:41 p.m., Igor Morozov wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50168/
> ---
> 
> (Updated Aug. 9, 2016, 10:41 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, Bill Farner, and 
> Zameer Manji.
> 
> 
> Bugs: AURORA-1721
> https://issues.apache.org/jira/browse/AURORA-1721
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Add rollback functionality to the scheduler
> 
> 
> Diffs
> -
> 
>   CHANGELOG fc6a46d77ebf889c8f60402c95e8a7472980ba1c 
>   RELEASE-NOTES.md 50f9b8372b364773d4ee6b29f031deb13c9f6444 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> b799cce665c99ee20fba84a9ddcb5a895ff5685e 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  b534abf95bab6e1657e3ef993cf34c0d6ec460be 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/aop/AnnotatedAuroraAdmin.java
>  9243c92b11040b68ed6014b3991db69fc08bcddf 
>   src/main/java/org/apache/aurora/scheduler/updater/JobUpdateController.java 
> f8357c46df1b025bf4e38a7ce1cb1c13a50c39f9 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  364c5c753f884a2d89e27802d7bbf3b2b6d3a08e 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateStateMachine.java 
> 7ab739a4dfa292895ad6ba8849e65f5c45ce9770 
>   src/main/python/apache/aurora/client/api/__init__.py 
> ec2c786d143347a7953f9fe431c6ddc59e1b9854 
>   src/main/python/apache/aurora/client/cli/update.py 
> bb526f7bf94d7bfe02fe2786493c85be1bfeb86f 
>   
> src/test/java/org/apache/aurora/scheduler/updater/JobUpdateStateMachineTest.java
>  8d78bae2cbf365a5359d88c0e996b3972a21e50e 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> e157c0dfde5efc418448e138aa008ade742fe816 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afbd385b7eda64cb1f7d118b695e65e4045eac6c 
>   src/test/python/apache/aurora/client/cli/test_supdate.py 
> 317b17547e8b33106787fbe43b26cec2da830ba1 
> 
> Diff: https://reviews.apache.org/r/50168/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Igor Morozov
> 
>



  1   2   3   4   5   6   7   8   >