Re: Review Request 55105: AURORA-1870 Add finer grained timings to the Snapshot process

2017-01-04 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55105/#review160560
---


Ship it!




Master (d4ebb56) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Jan. 5, 2017, 4:36 a.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55105/
> ---
> 
> (Updated Jan. 5, 2017, 4:36 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Joshua Cohen.
> 
> 
> Bugs: AURORA-1870
> https://issues.apache.org/jira/browse/AURORA-1870
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1870 Add finer grained timings to the Snapshot process
> 
> I gave up on `@Timed` interceptor approach because major refactoring is 
> required in order to have snapshot fields instantiated by Guice through 
> `Provider` or `@Provides` interfaces. The abstract class approach is much 
> easier/cleaner.
> 
> 
> Diffs
> -
> 
>   commons/src/main/java/org/apache/aurora/common/stats/SlidingStats.java 
> f7a5ae41e307627fc55157758e9b7cdd861c3268 
>   commons/src/test/java/org/apache/aurora/common/stats/SlidingStatsTest.java 
> PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
> 7aa111ec14696ae40f518c42f3c7f45d8ab0e94c 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java
>  f56a1624c7188da175ad3e6de323c3442802f2ea 
> 
> Diff: https://reviews.apache.org/r/55105/diff/
> 
> 
> Testing
> ---
> 
> ```
> $ curl localhost:8081/vars | grep snapshot_restore
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
> snapshot_restore_crons_events 1
> snapshot_restore_crons_events_per_sec 0.0
> snapshot_restore_crons_nanos_per_event 0.0
> snapshot_restore_crons_nanos_total 73648
> snapshot_restore_crons_nanos_total_per_sec 0.0
> snapshot_restore_dbscript_events 1
> snapshot_restore_dbscript_events_per_sec 0.0
> snapshot_restore_dbscript_nanos_per_event 0.0
> snapshot_restore_dbscript_nanos_total 1148842021
> snapshot_restore_dbscript_nanos_total_per_sec 0.0
> snapshot_restore_hosts_events 1
> snapshot_restore_hosts_events_per_sec 0.0
> snapshot_restore_hosts_nanos_per_event 0.0
> snapshot_restore_hosts_nanos_total 76166
> snapshot_restore_hosts_nanos_total_per_sec 0.0
> snapshot_restore_job_updates_events 1
> snapshot_restore_job_updates_events_per_sec 0.0
> snapshot_restore_job_updates_nanos_per_event 0.0
> snapshot_restore_job_updates_nanos_total 49482
> snapshot_restore_job_updates_nanos_total_per_sec 0.0
> snapshot_restore_locks_events 1
> snapshot_restore_locks_events_per_sec 0.0
> snapshot_restore_locks_nanos_per_event 0.0
> snapshot_restore_locks_nanos_total 125084
> snapshot_restore_locks_nanos_total_per_sec 0.0
> snapshot_restore_quota_events 1
> snapshot_restore_quota_events_per_sec 0.0
> snapshot_restore_quota_nanos_per_event 0.0
> snapshot_restore_quota_nanos_total 52305
> snapshot_restore_quota_nanos_total_per_sec 0.0
> snapshot_restore_scheduler_metadata_events 1
> snapshot_restore_scheduler_metadata_events_per_sec 0.0
> snapshot_restore_scheduler_metadata_nanos_per_event 0.0
> snapshot_restore_scheduler_metadata_nanos_total 70816
> snapshot_restore_scheduler_metadata_nanos_total_per_sec 0.0
> snapshot_restore_tasks_events 1
> snapshot_restore_tasks_events_per_sec 0.0
> snapshot_restore_tasks_nanos_per_event 0.0
> snapshot_restore_tasks_nanos_total 91377
> snapshot_restore_tasks_nanos_total_per_sec 0.0
> ```
> 
> ```
> $ aurora_admin scheduler_snapshot devcluster
>  INFO] Response from scheduler: OK (message: )
>  
> $ curl localhost:8081/vars | grep snapshot_save
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
> 100 482260 482260 0  3266k  0 --:--:-- --:--:-- --:--:-- 3363k
> snapshot_save_crons_events 1
> snapshot_save_crons_events_per_sec 0.0
> snapshot_save_crons_nanos_per_event 0.0
> snapshot_save_crons_nanos_total 466181
> snapshot_save_crons_nanos_total_per_sec 0.0
> snapshot_save_dbscript_events 1
> snapshot_save_dbscript_events_per_sec 0.0
> snapshot_save_dbscript_nanos_per_event 0.0
> snapshot_save_dbscript_nanos_total 18201542
> snapshot_save_dbscript_nanos_total_per_sec 0.0
> snapshot_save_hosts_events 1
> snapshot_save_hosts_events_per_sec 0.0
> snapshot_save_hosts_nanos_per_event 0.0
> snapshot_save_hosts_nanos_total 1286180
> 

Re: Review Request 55105: AURORA-1870 Add finer grained timings to the Snapshot process

2017-01-04 Thread Mehrdad Nurolahzade


> On Jan. 4, 2017, 10:48 a.m., Joshua Cohen wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java,
> >  lines 520-532
> > 
> >
> > Can these be default methods on the interface rather than converting to 
> > an abstract class?
> 
> Mehrdad Nurolahzade wrote:
> You are right, will refactor.

Actually, converting it back to an interface causes `SnapshotStoreImplIT` fail 
after `Stats` is flushed; a static field is only initialized once.
```
private interface SnapshotField {

LoadingCache STATS = CacheBuilder.newBuilder().build(
new CacheLoader() {
  @Override
  public SlidingStats load(String name) throws Exception {
return new SlidingStats(name, "nanos");
  }
});

String getName();

void saveToSnapshot(MutableStoreProvider storeProvider, Snapshot snapshot);

void restoreFromSnapshot(MutableStoreProvider storeProvider, Snapshot 
snapshot);

default void save(MutableStoreProvider storeProvider, Snapshot snapshot) {
  STATS.getUnchecked(SNAPSHOT_SAVE + getName())
  .time((Timeable.NoResult.Quiet) () -> saveToSnapshot(storeProvider, 
snapshot));
}

default void restore(MutableStoreProvider storeProvider, Snapshot snapshot) 
{
  STATS.getUnchecked(SNAPSHOT_RESTORE + getName())
  .time((Timeable.NoResult.Quiet) () -> 
restoreFromSnapshot(storeProvider, snapshot));
}
  }
```
Seems like it is safer to keep it as abstract class?


- Mehrdad


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55105/#review160521
---


On Jan. 4, 2017, 8:36 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55105/
> ---
> 
> (Updated Jan. 4, 2017, 8:36 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Joshua Cohen.
> 
> 
> Bugs: AURORA-1870
> https://issues.apache.org/jira/browse/AURORA-1870
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1870 Add finer grained timings to the Snapshot process
> 
> I gave up on `@Timed` interceptor approach because major refactoring is 
> required in order to have snapshot fields instantiated by Guice through 
> `Provider` or `@Provides` interfaces. The abstract class approach is much 
> easier/cleaner.
> 
> 
> Diffs
> -
> 
>   commons/src/main/java/org/apache/aurora/common/stats/SlidingStats.java 
> f7a5ae41e307627fc55157758e9b7cdd861c3268 
>   commons/src/test/java/org/apache/aurora/common/stats/SlidingStatsTest.java 
> PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
> 7aa111ec14696ae40f518c42f3c7f45d8ab0e94c 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java
>  f56a1624c7188da175ad3e6de323c3442802f2ea 
> 
> Diff: https://reviews.apache.org/r/55105/diff/
> 
> 
> Testing
> ---
> 
> ```
> $ curl localhost:8081/vars | grep snapshot_restore
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
> snapshot_restore_crons_events 1
> snapshot_restore_crons_events_per_sec 0.0
> snapshot_restore_crons_nanos_per_event 0.0
> snapshot_restore_crons_nanos_total 73648
> snapshot_restore_crons_nanos_total_per_sec 0.0
> snapshot_restore_dbscript_events 1
> snapshot_restore_dbscript_events_per_sec 0.0
> snapshot_restore_dbscript_nanos_per_event 0.0
> snapshot_restore_dbscript_nanos_total 1148842021
> snapshot_restore_dbscript_nanos_total_per_sec 0.0
> snapshot_restore_hosts_events 1
> snapshot_restore_hosts_events_per_sec 0.0
> snapshot_restore_hosts_nanos_per_event 0.0
> snapshot_restore_hosts_nanos_total 76166
> snapshot_restore_hosts_nanos_total_per_sec 0.0
> snapshot_restore_job_updates_events 1
> snapshot_restore_job_updates_events_per_sec 0.0
> snapshot_restore_job_updates_nanos_per_event 0.0
> snapshot_restore_job_updates_nanos_total 49482
> snapshot_restore_job_updates_nanos_total_per_sec 0.0
> snapshot_restore_locks_events 1
> snapshot_restore_locks_events_per_sec 0.0
> snapshot_restore_locks_nanos_per_event 0.0
> snapshot_restore_locks_nanos_total 125084
> snapshot_restore_locks_nanos_total_per_sec 0.0
> snapshot_restore_quota_events 1
> snapshot_restore_quota_events_per_sec 0.0
> snapshot_restore_quota_nanos_per_event 0.0
> snapshot_restore_quota_nanos_total 52305
> snapshot_restore_quota_nanos_total_per_sec 0.0
> snapshot_restore_scheduler_metadata_events 1
> 

Re: Review Request 55105: AURORA-1870 Add finer grained timings to the Snapshot process

2017-01-04 Thread Mehrdad Nurolahzade

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55105/
---

(Updated Jan. 4, 2017, 8:36 p.m.)


Review request for Aurora, David McLaughlin and Joshua Cohen.


Changes
---

Moved timing logic to `SlidingStats`


Bugs: AURORA-1870
https://issues.apache.org/jira/browse/AURORA-1870


Repository: aurora


Description
---

AURORA-1870 Add finer grained timings to the Snapshot process

I gave up on `@Timed` interceptor approach because major refactoring is 
required in order to have snapshot fields instantiated by Guice through 
`Provider` or `@Provides` interfaces. The abstract class approach is much 
easier/cleaner.


Diffs (updated)
-

  commons/src/main/java/org/apache/aurora/common/stats/SlidingStats.java 
f7a5ae41e307627fc55157758e9b7cdd861c3268 
  commons/src/test/java/org/apache/aurora/common/stats/SlidingStatsTest.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
7aa111ec14696ae40f518c42f3c7f45d8ab0e94c 
  
src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java 
f56a1624c7188da175ad3e6de323c3442802f2ea 

Diff: https://reviews.apache.org/r/55105/diff/


Testing
---

```
$ curl localhost:8081/vars | grep snapshot_restore
  % Total% Received % Xferd  Average Speed   TimeTime Time  Current
 Dload  Upload   Total   SpentLeft  Speed
  0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
snapshot_restore_crons_events 1
snapshot_restore_crons_events_per_sec 0.0
snapshot_restore_crons_nanos_per_event 0.0
snapshot_restore_crons_nanos_total 73648
snapshot_restore_crons_nanos_total_per_sec 0.0
snapshot_restore_dbscript_events 1
snapshot_restore_dbscript_events_per_sec 0.0
snapshot_restore_dbscript_nanos_per_event 0.0
snapshot_restore_dbscript_nanos_total 1148842021
snapshot_restore_dbscript_nanos_total_per_sec 0.0
snapshot_restore_hosts_events 1
snapshot_restore_hosts_events_per_sec 0.0
snapshot_restore_hosts_nanos_per_event 0.0
snapshot_restore_hosts_nanos_total 76166
snapshot_restore_hosts_nanos_total_per_sec 0.0
snapshot_restore_job_updates_events 1
snapshot_restore_job_updates_events_per_sec 0.0
snapshot_restore_job_updates_nanos_per_event 0.0
snapshot_restore_job_updates_nanos_total 49482
snapshot_restore_job_updates_nanos_total_per_sec 0.0
snapshot_restore_locks_events 1
snapshot_restore_locks_events_per_sec 0.0
snapshot_restore_locks_nanos_per_event 0.0
snapshot_restore_locks_nanos_total 125084
snapshot_restore_locks_nanos_total_per_sec 0.0
snapshot_restore_quota_events 1
snapshot_restore_quota_events_per_sec 0.0
snapshot_restore_quota_nanos_per_event 0.0
snapshot_restore_quota_nanos_total 52305
snapshot_restore_quota_nanos_total_per_sec 0.0
snapshot_restore_scheduler_metadata_events 1
snapshot_restore_scheduler_metadata_events_per_sec 0.0
snapshot_restore_scheduler_metadata_nanos_per_event 0.0
snapshot_restore_scheduler_metadata_nanos_total 70816
snapshot_restore_scheduler_metadata_nanos_total_per_sec 0.0
snapshot_restore_tasks_events 1
snapshot_restore_tasks_events_per_sec 0.0
snapshot_restore_tasks_nanos_per_event 0.0
snapshot_restore_tasks_nanos_total 91377
snapshot_restore_tasks_nanos_total_per_sec 0.0
```

```
$ aurora_admin scheduler_snapshot devcluster
 INFO] Response from scheduler: OK (message: )
 
$ curl localhost:8081/vars | grep snapshot_save
  % Total% Received % Xferd  Average Speed   TimeTime Time  Current
 Dload  Upload   Total   SpentLeft  Speed
100 482260 482260 0  3266k  0 --:--:-- --:--:-- --:--:-- 3363k
snapshot_save_crons_events 1
snapshot_save_crons_events_per_sec 0.0
snapshot_save_crons_nanos_per_event 0.0
snapshot_save_crons_nanos_total 466181
snapshot_save_crons_nanos_total_per_sec 0.0
snapshot_save_dbscript_events 1
snapshot_save_dbscript_events_per_sec 0.0
snapshot_save_dbscript_nanos_per_event 0.0
snapshot_save_dbscript_nanos_total 18201542
snapshot_save_dbscript_nanos_total_per_sec 0.0
snapshot_save_hosts_events 1
snapshot_save_hosts_events_per_sec 0.0
snapshot_save_hosts_nanos_per_event 0.0
snapshot_save_hosts_nanos_total 1286180
snapshot_save_hosts_nanos_total_per_sec 0.0
snapshot_save_job_updates_events 1
snapshot_save_job_updates_events_per_sec 0.0
snapshot_save_job_updates_nanos_per_event 0.0
snapshot_save_job_updates_nanos_total 123760632
snapshot_save_job_updates_nanos_total_per_sec 0.0
snapshot_save_locks_events 1
snapshot_save_locks_events_per_sec 0.0
snapshot_save_locks_nanos_per_event 0.0
snapshot_save_locks_nanos_total 1523926
snapshot_save_locks_nanos_total_per_sec 0.0
snapshot_save_quota_events 1
snapshot_save_quota_events_per_sec 0.0
snapshot_save_quota_nanos_per_event 0.0
snapshot_save_quota_nanos_total 5725489
snapshot_save_quota_nanos_total_per_sec 0.0

Re: Review Request 55179: AURORA-1820 Reduce storage write lock contention by adopting Double-Checked Locking pattern in TimedOutTaskHandler

2017-01-04 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55179/#review160540
---


Ship it!




Master (21ad18e) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Jan. 4, 2017, 9:23 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55179/
> ---
> 
> (Updated Jan. 4, 2017, 9:23 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Stephan Erb.
> 
> 
> Bugs: AURORA-1820
> https://issues.apache.org/jira/browse/AURORA-1820
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `TimedOutTaskHandler` acquires storage write lock for every task every time 
> they transition to a transient state. It then verifies after a default 
> time-out period of 5 minutes if the task has transitioned out of the 
> transient state.
> 
> The verification step takes place while holding the storage write lock. In 
> over 99% of cases the logic short-circuits and returns from 
> `StateManagerImpl.updateTaskAndExternalState()` once it learns task has 
> transitioned out of the transient state.
> 
> This patch reduces storage write lock contention by adopting Double-Checked 
> Locking pattern in `TimedOutTaskHandler.run()`.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/reconciliation/TaskTimeout.java 
> 2dc9bc2c6916595270187f0f29d5bd8c5ba7e9ad 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskTimeoutTest.java 
> 1006ddb6caea015c2d4e014bd044f2933541c84f 
> 
> Diff: https://reviews.apache.org/r/55179/diff/
> 
> 
> Testing
> ---
> 
> ```
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> ...
> 
> *** OK (All tests passed) ***
> 
> mesos-master start/running, process 22759
> + RETCODE=0
> + restore_netrc
> + mv /home/vagrant/.netrc.bak /home/vagrant/.netrc
> + true
> Connection to 127.0.0.1 closed.
> 
> real  25m36.144s
> user  0m1.358s
> sys   0m0.595s
> ```
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Re: Review Request 55179: AURORA-1820 Reduce storage write lock contention by adopting Double-Checked Locking pattern in TimedOutTaskHandler

2017-01-04 Thread Mehrdad Nurolahzade

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55179/
---

(Updated Jan. 4, 2017, 1:23 p.m.)


Review request for Aurora, Joshua Cohen and Stephan Erb.


Changes
---

Added expectations for storage calls


Bugs: AURORA-1820
https://issues.apache.org/jira/browse/AURORA-1820


Repository: aurora


Description
---

`TimedOutTaskHandler` acquires storage write lock for every task every time 
they transition to a transient state. It then verifies after a default time-out 
period of 5 minutes if the task has transitioned out of the transient state.

The verification step takes place while holding the storage write lock. In over 
99% of cases the logic short-circuits and returns from 
`StateManagerImpl.updateTaskAndExternalState()` once it learns task has 
transitioned out of the transient state.

This patch reduces storage write lock contention by adopting Double-Checked 
Locking pattern in `TimedOutTaskHandler.run()`.


Diffs (updated)
-

  src/main/java/org/apache/aurora/scheduler/reconciliation/TaskTimeout.java 
2dc9bc2c6916595270187f0f29d5bd8c5ba7e9ad 
  src/test/java/org/apache/aurora/scheduler/reconciliation/TaskTimeoutTest.java 
1006ddb6caea015c2d4e014bd044f2933541c84f 

Diff: https://reviews.apache.org/r/55179/diff/


Testing
---

```
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

...

*** OK (All tests passed) ***

mesos-master start/running, process 22759
+ RETCODE=0
+ restore_netrc
+ mv /home/vagrant/.netrc.bak /home/vagrant/.netrc
+ true
Connection to 127.0.0.1 closed.

real25m36.144s
user0m1.358s
sys 0m0.595s
```


Thanks,

Mehrdad Nurolahzade



Re: Review Request 55105: AURORA-1870 Add finer grained timings to the Snapshot process

2017-01-04 Thread Mehrdad Nurolahzade


> On Jan. 4, 2017, 9:59 a.m., Kai Huang wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java,
> >  line 521
> > 
> >
> > If we need this elsewhere, can we make the timing part more reusable 
> > like:
> > 
> > Stats.time("statsName", () -> {#function to time})
> 
> Mehrdad Nurolahzade wrote:
> Currently, in Java 8, only `Supplier`, `Function`, and `BiFunction` 
> functional interfaces are supported (e.g., 0, 1, and 2 argument functions). 
> There is no way to represent a multi-argument function with an unknown 
> argument list. That's why we are using AOP to intercept method calls and 
> perform timing on them.

Actually, I retract that comment, this can be implemented using either 
`Supplier` or `Callable` interfaces. For example like:
```
  static  T time(SlidingStats stat, Callable callable) throws Exception {
long start = System.nanoTime();
try {
  return callable.call();
} finally {
  stat.accumulate(System.nanoTime() - start);
}
  }
```

then it can be used like:
```
SlidingStats stat = new SlidingStats("someName", "nanos");
Stats.time(stat, () -> {
  Thread.sleep(100);
  return "HelloWorld";
});
```

Thanks for pointing it out. 
cc @shirchen


- Mehrdad


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55105/#review160517
---


On Dec. 30, 2016, 1:18 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55105/
> ---
> 
> (Updated Dec. 30, 2016, 1:18 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Joshua Cohen.
> 
> 
> Bugs: AURORA-1870
> https://issues.apache.org/jira/browse/AURORA-1870
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1870 Add finer grained timings to the Snapshot process
> 
> I gave up on `@Timed` interceptor approach because major refactoring is 
> required in order to have snapshot fields instantiated by Guice through 
> `Provider` or `@Provides` interfaces. The abstract class approach is much 
> easier/cleaner.
> 
> 
> Diffs
> -
> 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
> 7aa111ec14696ae40f518c42f3c7f45d8ab0e94c 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java
>  f56a1624c7188da175ad3e6de323c3442802f2ea 
> 
> Diff: https://reviews.apache.org/r/55105/diff/
> 
> 
> Testing
> ---
> 
> ```
> $ curl localhost:8081/vars | grep snapshot_restore
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
> snapshot_restore_crons_events 1
> snapshot_restore_crons_events_per_sec 0.0
> snapshot_restore_crons_nanos_per_event 0.0
> snapshot_restore_crons_nanos_total 73648
> snapshot_restore_crons_nanos_total_per_sec 0.0
> snapshot_restore_dbscript_events 1
> snapshot_restore_dbscript_events_per_sec 0.0
> snapshot_restore_dbscript_nanos_per_event 0.0
> snapshot_restore_dbscript_nanos_total 1148842021
> snapshot_restore_dbscript_nanos_total_per_sec 0.0
> snapshot_restore_hosts_events 1
> snapshot_restore_hosts_events_per_sec 0.0
> snapshot_restore_hosts_nanos_per_event 0.0
> snapshot_restore_hosts_nanos_total 76166
> snapshot_restore_hosts_nanos_total_per_sec 0.0
> snapshot_restore_job_updates_events 1
> snapshot_restore_job_updates_events_per_sec 0.0
> snapshot_restore_job_updates_nanos_per_event 0.0
> snapshot_restore_job_updates_nanos_total 49482
> snapshot_restore_job_updates_nanos_total_per_sec 0.0
> snapshot_restore_locks_events 1
> snapshot_restore_locks_events_per_sec 0.0
> snapshot_restore_locks_nanos_per_event 0.0
> snapshot_restore_locks_nanos_total 125084
> snapshot_restore_locks_nanos_total_per_sec 0.0
> snapshot_restore_quota_events 1
> snapshot_restore_quota_events_per_sec 0.0
> snapshot_restore_quota_nanos_per_event 0.0
> snapshot_restore_quota_nanos_total 52305
> snapshot_restore_quota_nanos_total_per_sec 0.0
> snapshot_restore_scheduler_metadata_events 1
> snapshot_restore_scheduler_metadata_events_per_sec 0.0
> snapshot_restore_scheduler_metadata_nanos_per_event 0.0
> snapshot_restore_scheduler_metadata_nanos_total 70816
> snapshot_restore_scheduler_metadata_nanos_total_per_sec 0.0
> snapshot_restore_tasks_events 1
> snapshot_restore_tasks_events_per_sec 0.0
> snapshot_restore_tasks_nanos_per_event 0.0
> snapshot_restore_tasks_nanos_total 91377
> snapshot_restore_tasks_nanos_total_per_sec 0.0
> ```
> 
> ```
> $ aurora_admin scheduler_snapshot 

Re: Review Request 52437: Adding support for Ubuntu Xenial packages

2017-01-04 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52437/#review160527
---


Ship it!




LGTM


specs/ubuntu-xenial/changelog (line 1)


This should be `xenial`.

I also think we should only include the changelog for the packages that do 
actually exist for xenial. So this would essential only be the upcoming 0.17. 
You can add this one as a dummy for now.



test/deb/ubuntu-xenial/README.md (line 26)


`ubuntu-xenial`


- Stephan Erb


On Jan. 4, 2017, 2:43 p.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52437/
> ---
> 
> (Updated Jan. 4, 2017, 2:43 p.m.)
> 
> 
> Review request for Aurora, Dmitriy Shirchenko and Zameer Manji.
> 
> 
> Bugs: AURORA-1872
> https://issues.apache.org/jira/browse/AURORA-1872
> 
> 
> Repository: aurora-packaging
> 
> 
> Description
> ---
> 
> Added builder and test environment for Xenial as well as updated instructions 
> on how to test it.
> 
> Added distributtion to release-candidate script.
> 
> 
> Diffs
> -
> 
>   build-support/release/release-candidate 
> 98df82412b6218ed7c1f650a7b1cc725ac297c37 
>   builder/deb/ubuntu-xenial/Dockerfile PRE-CREATION 
>   builder/deb/ubuntu-xenial/build.sh PRE-CREATION 
>   specs/debian/aurora-pants.ini e4f2c54df45f746ffb2a4e3f786fc4105323bc0b 
>   specs/debian/aurora-scheduler.startup.sh PRE-CREATION 
>   specs/ubuntu-xenial/aurora-doc.docs PRE-CREATION 
>   specs/ubuntu-xenial/aurora-doc.examples PRE-CREATION 
>   specs/ubuntu-xenial/aurora-pants.ini PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.default PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.install PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.links PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.postinst PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.service PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.startup.sh PRE-CREATION 
>   specs/ubuntu-xenial/aurora-tools.install PRE-CREATION 
>   specs/ubuntu-xenial/aurora-tools.links PRE-CREATION 
>   specs/ubuntu-xenial/changelog PRE-CREATION 
>   specs/ubuntu-xenial/clusters.json PRE-CREATION 
>   specs/ubuntu-xenial/compat PRE-CREATION 
>   specs/ubuntu-xenial/control PRE-CREATION 
>   specs/ubuntu-xenial/copyright PRE-CREATION 
>   specs/ubuntu-xenial/rules PRE-CREATION 
>   specs/ubuntu-xenial/source/format PRE-CREATION 
>   specs/ubuntu-xenial/thermos.default PRE-CREATION 
>   specs/ubuntu-xenial/thermos.dirs PRE-CREATION 
>   specs/ubuntu-xenial/thermos.install PRE-CREATION 
>   specs/ubuntu-xenial/thermos.links PRE-CREATION 
>   specs/ubuntu-xenial/thermos.service PRE-CREATION 
>   test/deb/ubuntu-xenial/README.md PRE-CREATION 
>   test/deb/ubuntu-xenial/Vagrantfile PRE-CREATION 
>   test/deb/ubuntu-xenial/provision.sh PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/52437/diff/
> 
> 
> Testing
> ---
> 
> Created artifacts using the build-artifacts script.
> 
> Brought a vagrant image up, installed all deb files created by the artifacts 
> script, started both aurora-scheduler and thermos services, and launched a 
> sample job.
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Re: Review Request 55105: AURORA-1870 Add finer grained timings to the Snapshot process

2017-01-04 Thread Joshua Cohen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55105/#review160521
---




src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
(lines 520 - 532)


Can these be default methods on the interface rather than converting to an 
abstract class?


- Joshua Cohen


On Dec. 30, 2016, 9:18 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55105/
> ---
> 
> (Updated Dec. 30, 2016, 9:18 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Joshua Cohen.
> 
> 
> Bugs: AURORA-1870
> https://issues.apache.org/jira/browse/AURORA-1870
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1870 Add finer grained timings to the Snapshot process
> 
> I gave up on `@Timed` interceptor approach because major refactoring is 
> required in order to have snapshot fields instantiated by Guice through 
> `Provider` or `@Provides` interfaces. The abstract class approach is much 
> easier/cleaner.
> 
> 
> Diffs
> -
> 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
> 7aa111ec14696ae40f518c42f3c7f45d8ab0e94c 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java
>  f56a1624c7188da175ad3e6de323c3442802f2ea 
> 
> Diff: https://reviews.apache.org/r/55105/diff/
> 
> 
> Testing
> ---
> 
> ```
> $ curl localhost:8081/vars | grep snapshot_restore
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
> snapshot_restore_crons_events 1
> snapshot_restore_crons_events_per_sec 0.0
> snapshot_restore_crons_nanos_per_event 0.0
> snapshot_restore_crons_nanos_total 73648
> snapshot_restore_crons_nanos_total_per_sec 0.0
> snapshot_restore_dbscript_events 1
> snapshot_restore_dbscript_events_per_sec 0.0
> snapshot_restore_dbscript_nanos_per_event 0.0
> snapshot_restore_dbscript_nanos_total 1148842021
> snapshot_restore_dbscript_nanos_total_per_sec 0.0
> snapshot_restore_hosts_events 1
> snapshot_restore_hosts_events_per_sec 0.0
> snapshot_restore_hosts_nanos_per_event 0.0
> snapshot_restore_hosts_nanos_total 76166
> snapshot_restore_hosts_nanos_total_per_sec 0.0
> snapshot_restore_job_updates_events 1
> snapshot_restore_job_updates_events_per_sec 0.0
> snapshot_restore_job_updates_nanos_per_event 0.0
> snapshot_restore_job_updates_nanos_total 49482
> snapshot_restore_job_updates_nanos_total_per_sec 0.0
> snapshot_restore_locks_events 1
> snapshot_restore_locks_events_per_sec 0.0
> snapshot_restore_locks_nanos_per_event 0.0
> snapshot_restore_locks_nanos_total 125084
> snapshot_restore_locks_nanos_total_per_sec 0.0
> snapshot_restore_quota_events 1
> snapshot_restore_quota_events_per_sec 0.0
> snapshot_restore_quota_nanos_per_event 0.0
> snapshot_restore_quota_nanos_total 52305
> snapshot_restore_quota_nanos_total_per_sec 0.0
> snapshot_restore_scheduler_metadata_events 1
> snapshot_restore_scheduler_metadata_events_per_sec 0.0
> snapshot_restore_scheduler_metadata_nanos_per_event 0.0
> snapshot_restore_scheduler_metadata_nanos_total 70816
> snapshot_restore_scheduler_metadata_nanos_total_per_sec 0.0
> snapshot_restore_tasks_events 1
> snapshot_restore_tasks_events_per_sec 0.0
> snapshot_restore_tasks_nanos_per_event 0.0
> snapshot_restore_tasks_nanos_total 91377
> snapshot_restore_tasks_nanos_total_per_sec 0.0
> ```
> 
> ```
> $ aurora_admin scheduler_snapshot devcluster
>  INFO] Response from scheduler: OK (message: )
>  
> $ curl localhost:8081/vars | grep snapshot_save
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
> 100 482260 482260 0  3266k  0 --:--:-- --:--:-- --:--:-- 3363k
> snapshot_save_crons_events 1
> snapshot_save_crons_events_per_sec 0.0
> snapshot_save_crons_nanos_per_event 0.0
> snapshot_save_crons_nanos_total 466181
> snapshot_save_crons_nanos_total_per_sec 0.0
> snapshot_save_dbscript_events 1
> snapshot_save_dbscript_events_per_sec 0.0
> snapshot_save_dbscript_nanos_per_event 0.0
> snapshot_save_dbscript_nanos_total 18201542
> snapshot_save_dbscript_nanos_total_per_sec 0.0
> snapshot_save_hosts_events 1
> snapshot_save_hosts_events_per_sec 0.0
> snapshot_save_hosts_nanos_per_event 0.0
> snapshot_save_hosts_nanos_total 1286180
> snapshot_save_hosts_nanos_total_per_sec 0.0
> snapshot_save_job_updates_events 1
> snapshot_save_job_updates_events_per_sec 0.0
> 

Re: Review Request 55105: AURORA-1870 Add finer grained timings to the Snapshot process

2017-01-04 Thread Mehrdad Nurolahzade


> On Jan. 4, 2017, 9:59 a.m., Kai Huang wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java,
> >  line 521
> > 
> >
> > If we need this elsewhere, can we make the timing part more reusable 
> > like:
> > 
> > Stats.time("statsName", () -> {#function to time})

Currently, in Java 8, only `Supplier`, `Function`, and `BiFunction` functional 
interfaces are supported (e.g., 0, 1, and 2 argument functions). There is no 
way to represent a multi-argument function with an unknown argument list. 
That's why we are using AOP to intercept method calls and perform timing on 
them.


- Mehrdad


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55105/#review160517
---


On Dec. 30, 2016, 1:18 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55105/
> ---
> 
> (Updated Dec. 30, 2016, 1:18 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Joshua Cohen.
> 
> 
> Bugs: AURORA-1870
> https://issues.apache.org/jira/browse/AURORA-1870
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1870 Add finer grained timings to the Snapshot process
> 
> I gave up on `@Timed` interceptor approach because major refactoring is 
> required in order to have snapshot fields instantiated by Guice through 
> `Provider` or `@Provides` interfaces. The abstract class approach is much 
> easier/cleaner.
> 
> 
> Diffs
> -
> 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
> 7aa111ec14696ae40f518c42f3c7f45d8ab0e94c 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java
>  f56a1624c7188da175ad3e6de323c3442802f2ea 
> 
> Diff: https://reviews.apache.org/r/55105/diff/
> 
> 
> Testing
> ---
> 
> ```
> $ curl localhost:8081/vars | grep snapshot_restore
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
> snapshot_restore_crons_events 1
> snapshot_restore_crons_events_per_sec 0.0
> snapshot_restore_crons_nanos_per_event 0.0
> snapshot_restore_crons_nanos_total 73648
> snapshot_restore_crons_nanos_total_per_sec 0.0
> snapshot_restore_dbscript_events 1
> snapshot_restore_dbscript_events_per_sec 0.0
> snapshot_restore_dbscript_nanos_per_event 0.0
> snapshot_restore_dbscript_nanos_total 1148842021
> snapshot_restore_dbscript_nanos_total_per_sec 0.0
> snapshot_restore_hosts_events 1
> snapshot_restore_hosts_events_per_sec 0.0
> snapshot_restore_hosts_nanos_per_event 0.0
> snapshot_restore_hosts_nanos_total 76166
> snapshot_restore_hosts_nanos_total_per_sec 0.0
> snapshot_restore_job_updates_events 1
> snapshot_restore_job_updates_events_per_sec 0.0
> snapshot_restore_job_updates_nanos_per_event 0.0
> snapshot_restore_job_updates_nanos_total 49482
> snapshot_restore_job_updates_nanos_total_per_sec 0.0
> snapshot_restore_locks_events 1
> snapshot_restore_locks_events_per_sec 0.0
> snapshot_restore_locks_nanos_per_event 0.0
> snapshot_restore_locks_nanos_total 125084
> snapshot_restore_locks_nanos_total_per_sec 0.0
> snapshot_restore_quota_events 1
> snapshot_restore_quota_events_per_sec 0.0
> snapshot_restore_quota_nanos_per_event 0.0
> snapshot_restore_quota_nanos_total 52305
> snapshot_restore_quota_nanos_total_per_sec 0.0
> snapshot_restore_scheduler_metadata_events 1
> snapshot_restore_scheduler_metadata_events_per_sec 0.0
> snapshot_restore_scheduler_metadata_nanos_per_event 0.0
> snapshot_restore_scheduler_metadata_nanos_total 70816
> snapshot_restore_scheduler_metadata_nanos_total_per_sec 0.0
> snapshot_restore_tasks_events 1
> snapshot_restore_tasks_events_per_sec 0.0
> snapshot_restore_tasks_nanos_per_event 0.0
> snapshot_restore_tasks_nanos_total 91377
> snapshot_restore_tasks_nanos_total_per_sec 0.0
> ```
> 
> ```
> $ aurora_admin scheduler_snapshot devcluster
>  INFO] Response from scheduler: OK (message: )
>  
> $ curl localhost:8081/vars | grep snapshot_save
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
> 100 482260 482260 0  3266k  0 --:--:-- --:--:-- --:--:-- 3363k
> snapshot_save_crons_events 1
> snapshot_save_crons_events_per_sec 0.0
> snapshot_save_crons_nanos_per_event 0.0
> snapshot_save_crons_nanos_total 466181
> snapshot_save_crons_nanos_total_per_sec 0.0
> snapshot_save_dbscript_events 1
> snapshot_save_dbscript_events_per_sec 0.0
> 

Re: Review Request 55105: AURORA-1870 Add finer grained timings to the Snapshot process

2017-01-04 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55105/#review160517
---




src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
(line 521)


If we need this elsewhere, can we make the timing part more reusable like:

Stats.time("statsName", () -> {#function to time})


- Kai Huang


On Dec. 30, 2016, 9:18 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55105/
> ---
> 
> (Updated Dec. 30, 2016, 9:18 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Joshua Cohen.
> 
> 
> Bugs: AURORA-1870
> https://issues.apache.org/jira/browse/AURORA-1870
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1870 Add finer grained timings to the Snapshot process
> 
> I gave up on `@Timed` interceptor approach because major refactoring is 
> required in order to have snapshot fields instantiated by Guice through 
> `Provider` or `@Provides` interfaces. The abstract class approach is much 
> easier/cleaner.
> 
> 
> Diffs
> -
> 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
> 7aa111ec14696ae40f518c42f3c7f45d8ab0e94c 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java
>  f56a1624c7188da175ad3e6de323c3442802f2ea 
> 
> Diff: https://reviews.apache.org/r/55105/diff/
> 
> 
> Testing
> ---
> 
> ```
> $ curl localhost:8081/vars | grep snapshot_restore
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
> snapshot_restore_crons_events 1
> snapshot_restore_crons_events_per_sec 0.0
> snapshot_restore_crons_nanos_per_event 0.0
> snapshot_restore_crons_nanos_total 73648
> snapshot_restore_crons_nanos_total_per_sec 0.0
> snapshot_restore_dbscript_events 1
> snapshot_restore_dbscript_events_per_sec 0.0
> snapshot_restore_dbscript_nanos_per_event 0.0
> snapshot_restore_dbscript_nanos_total 1148842021
> snapshot_restore_dbscript_nanos_total_per_sec 0.0
> snapshot_restore_hosts_events 1
> snapshot_restore_hosts_events_per_sec 0.0
> snapshot_restore_hosts_nanos_per_event 0.0
> snapshot_restore_hosts_nanos_total 76166
> snapshot_restore_hosts_nanos_total_per_sec 0.0
> snapshot_restore_job_updates_events 1
> snapshot_restore_job_updates_events_per_sec 0.0
> snapshot_restore_job_updates_nanos_per_event 0.0
> snapshot_restore_job_updates_nanos_total 49482
> snapshot_restore_job_updates_nanos_total_per_sec 0.0
> snapshot_restore_locks_events 1
> snapshot_restore_locks_events_per_sec 0.0
> snapshot_restore_locks_nanos_per_event 0.0
> snapshot_restore_locks_nanos_total 125084
> snapshot_restore_locks_nanos_total_per_sec 0.0
> snapshot_restore_quota_events 1
> snapshot_restore_quota_events_per_sec 0.0
> snapshot_restore_quota_nanos_per_event 0.0
> snapshot_restore_quota_nanos_total 52305
> snapshot_restore_quota_nanos_total_per_sec 0.0
> snapshot_restore_scheduler_metadata_events 1
> snapshot_restore_scheduler_metadata_events_per_sec 0.0
> snapshot_restore_scheduler_metadata_nanos_per_event 0.0
> snapshot_restore_scheduler_metadata_nanos_total 70816
> snapshot_restore_scheduler_metadata_nanos_total_per_sec 0.0
> snapshot_restore_tasks_events 1
> snapshot_restore_tasks_events_per_sec 0.0
> snapshot_restore_tasks_nanos_per_event 0.0
> snapshot_restore_tasks_nanos_total 91377
> snapshot_restore_tasks_nanos_total_per_sec 0.0
> ```
> 
> ```
> $ aurora_admin scheduler_snapshot devcluster
>  INFO] Response from scheduler: OK (message: )
>  
> $ curl localhost:8081/vars | grep snapshot_save
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
> 100 482260 482260 0  3266k  0 --:--:-- --:--:-- --:--:-- 3363k
> snapshot_save_crons_events 1
> snapshot_save_crons_events_per_sec 0.0
> snapshot_save_crons_nanos_per_event 0.0
> snapshot_save_crons_nanos_total 466181
> snapshot_save_crons_nanos_total_per_sec 0.0
> snapshot_save_dbscript_events 1
> snapshot_save_dbscript_events_per_sec 0.0
> snapshot_save_dbscript_nanos_per_event 0.0
> snapshot_save_dbscript_nanos_total 18201542
> snapshot_save_dbscript_nanos_total_per_sec 0.0
> snapshot_save_hosts_events 1
> snapshot_save_hosts_events_per_sec 0.0
> snapshot_save_hosts_nanos_per_event 0.0
> snapshot_save_hosts_nanos_total 1286180
> snapshot_save_hosts_nanos_total_per_sec 0.0
> snapshot_save_job_updates_events 1
> snapshot_save_job_updates_events_per_sec 

Re: Review Request 55179: AURORA-1820 Reduce storage write lock contention by adopting Double-Checked Locking pattern in TimedOutTaskHandler

2017-01-04 Thread Joshua Cohen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55179/#review160515
---


Ship it!




LGTM pending additional testing per Stephan's comments.

- Joshua Cohen


On Jan. 4, 2017, 5:16 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55179/
> ---
> 
> (Updated Jan. 4, 2017, 5:16 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Stephan Erb.
> 
> 
> Bugs: AURORA-1820
> https://issues.apache.org/jira/browse/AURORA-1820
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `TimedOutTaskHandler` acquires storage write lock for every task every time 
> they transition to a transient state. It then verifies after a default 
> time-out period of 5 minutes if the task has transitioned out of the 
> transient state.
> 
> The verification step takes place while holding the storage write lock. In 
> over 99% of cases the logic short-circuits and returns from 
> `StateManagerImpl.updateTaskAndExternalState()` once it learns task has 
> transitioned out of the transient state.
> 
> This patch reduces storage write lock contention by adopting Double-Checked 
> Locking pattern in `TimedOutTaskHandler.run()`.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/reconciliation/TaskTimeout.java 
> 2dc9bc2c6916595270187f0f29d5bd8c5ba7e9ad 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskTimeoutTest.java 
> 1006ddb6caea015c2d4e014bd044f2933541c84f 
> 
> Diff: https://reviews.apache.org/r/55179/diff/
> 
> 
> Testing
> ---
> 
> ```
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> ...
> 
> *** OK (All tests passed) ***
> 
> mesos-master start/running, process 22759
> + RETCODE=0
> + restore_netrc
> + mv /home/vagrant/.netrc.bak /home/vagrant/.netrc
> + true
> Connection to 127.0.0.1 closed.
> 
> real  25m36.144s
> user  0m1.358s
> sys   0m0.595s
> ```
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Re: Review Request 55179: AURORA-1820 Reduce storage write lock contention by adopting Double-Checked Locking pattern in TimedOutTaskHandler

2017-01-04 Thread Joshua Cohen


> On Jan. 4, 2017, 5:39 p.m., Stephan Erb wrote:
> > src/test/java/org/apache/aurora/scheduler/reconciliation/TaskTimeoutTest.java,
> >  line 130
> > 
> >
> > Given the motivation of the patch, we should probably check that we did 
> > not try to acquire the storage lock.

+1


- Joshua


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55179/#review160510
---


On Jan. 4, 2017, 5:16 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55179/
> ---
> 
> (Updated Jan. 4, 2017, 5:16 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Stephan Erb.
> 
> 
> Bugs: AURORA-1820
> https://issues.apache.org/jira/browse/AURORA-1820
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `TimedOutTaskHandler` acquires storage write lock for every task every time 
> they transition to a transient state. It then verifies after a default 
> time-out period of 5 minutes if the task has transitioned out of the 
> transient state.
> 
> The verification step takes place while holding the storage write lock. In 
> over 99% of cases the logic short-circuits and returns from 
> `StateManagerImpl.updateTaskAndExternalState()` once it learns task has 
> transitioned out of the transient state.
> 
> This patch reduces storage write lock contention by adopting Double-Checked 
> Locking pattern in `TimedOutTaskHandler.run()`.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/reconciliation/TaskTimeout.java 
> 2dc9bc2c6916595270187f0f29d5bd8c5ba7e9ad 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskTimeoutTest.java 
> 1006ddb6caea015c2d4e014bd044f2933541c84f 
> 
> Diff: https://reviews.apache.org/r/55179/diff/
> 
> 
> Testing
> ---
> 
> ```
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> ...
> 
> *** OK (All tests passed) ***
> 
> mesos-master start/running, process 22759
> + RETCODE=0
> + restore_netrc
> + mv /home/vagrant/.netrc.bak /home/vagrant/.netrc
> + true
> Connection to 127.0.0.1 closed.
> 
> real  25m36.144s
> user  0m1.358s
> sys   0m0.595s
> ```
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Re: Review Request 55179: AURORA-1820 Reduce storage write lock contention by adopting Double-Checked Locking pattern in TimedOutTaskHandler

2017-01-04 Thread Stephan Erb

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55179/#review160510
---


Fix it, then Ship it!




LGTM.


src/test/java/org/apache/aurora/scheduler/reconciliation/TaskTimeoutTest.java 
(line 130)


Given the motivation of the patch, we should probably check that we did not 
try to acquire the storage lock.


- Stephan Erb


On Jan. 4, 2017, 6:16 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55179/
> ---
> 
> (Updated Jan. 4, 2017, 6:16 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Stephan Erb.
> 
> 
> Bugs: AURORA-1820
> https://issues.apache.org/jira/browse/AURORA-1820
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `TimedOutTaskHandler` acquires storage write lock for every task every time 
> they transition to a transient state. It then verifies after a default 
> time-out period of 5 minutes if the task has transitioned out of the 
> transient state.
> 
> The verification step takes place while holding the storage write lock. In 
> over 99% of cases the logic short-circuits and returns from 
> `StateManagerImpl.updateTaskAndExternalState()` once it learns task has 
> transitioned out of the transient state.
> 
> This patch reduces storage write lock contention by adopting Double-Checked 
> Locking pattern in `TimedOutTaskHandler.run()`.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/reconciliation/TaskTimeout.java 
> 2dc9bc2c6916595270187f0f29d5bd8c5ba7e9ad 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskTimeoutTest.java 
> 1006ddb6caea015c2d4e014bd044f2933541c84f 
> 
> Diff: https://reviews.apache.org/r/55179/diff/
> 
> 
> Testing
> ---
> 
> ```
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> ...
> 
> *** OK (All tests passed) ***
> 
> mesos-master start/running, process 22759
> + RETCODE=0
> + restore_netrc
> + mv /home/vagrant/.netrc.bak /home/vagrant/.netrc
> + true
> Connection to 127.0.0.1 closed.
> 
> real  25m36.144s
> user  0m1.358s
> sys   0m0.595s
> ```
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Re: Review Request 55179: AURORA-1820 Reduce storage write lock contention by adopting Double-Checked Locking pattern in TimedOutTaskHandler

2017-01-04 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55179/#review160512
---


Ship it!




Master (21ad18e) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Jan. 4, 2017, 5:16 p.m., Mehrdad Nurolahzade wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55179/
> ---
> 
> (Updated Jan. 4, 2017, 5:16 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Stephan Erb.
> 
> 
> Bugs: AURORA-1820
> https://issues.apache.org/jira/browse/AURORA-1820
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `TimedOutTaskHandler` acquires storage write lock for every task every time 
> they transition to a transient state. It then verifies after a default 
> time-out period of 5 minutes if the task has transitioned out of the 
> transient state.
> 
> The verification step takes place while holding the storage write lock. In 
> over 99% of cases the logic short-circuits and returns from 
> `StateManagerImpl.updateTaskAndExternalState()` once it learns task has 
> transitioned out of the transient state.
> 
> This patch reduces storage write lock contention by adopting Double-Checked 
> Locking pattern in `TimedOutTaskHandler.run()`.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/reconciliation/TaskTimeout.java 
> 2dc9bc2c6916595270187f0f29d5bd8c5ba7e9ad 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskTimeoutTest.java 
> 1006ddb6caea015c2d4e014bd044f2933541c84f 
> 
> Diff: https://reviews.apache.org/r/55179/diff/
> 
> 
> Testing
> ---
> 
> ```
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> ...
> 
> *** OK (All tests passed) ***
> 
> mesos-master start/running, process 22759
> + RETCODE=0
> + restore_netrc
> + mv /home/vagrant/.netrc.bak /home/vagrant/.netrc
> + true
> Connection to 127.0.0.1 closed.
> 
> real  25m36.144s
> user  0m1.358s
> sys   0m0.595s
> ```
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
>



Review Request 55179: AURORA-1820 Reduce storage write lock contention by adopting Double-Checked Locking pattern in TimedOutTaskHandler

2017-01-04 Thread Mehrdad Nurolahzade

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55179/
---

Review request for Aurora, Joshua Cohen and Stephan Erb.


Bugs: AURORA-1820
https://issues.apache.org/jira/browse/AURORA-1820


Repository: aurora


Description
---

`TimedOutTaskHandler` acquires storage write lock for every task every time 
they transition to a transient state. It then verifies after a default time-out 
period of 5 minutes if the task has transitioned out of the transient state.

The verification step takes place while holding the storage write lock. In over 
99% of cases the logic short-circuits and returns from 
`StateManagerImpl.updateTaskAndExternalState()` once it learns task has 
transitioned out of the transient state.

This patch reduces storage write lock contention by adopting Double-Checked 
Locking pattern in `TimedOutTaskHandler.run()`.


Diffs
-

  src/main/java/org/apache/aurora/scheduler/reconciliation/TaskTimeout.java 
2dc9bc2c6916595270187f0f29d5bd8c5ba7e9ad 
  src/test/java/org/apache/aurora/scheduler/reconciliation/TaskTimeoutTest.java 
1006ddb6caea015c2d4e014bd044f2933541c84f 

Diff: https://reviews.apache.org/r/55179/diff/


Testing
---

```
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

...

*** OK (All tests passed) ***

mesos-master start/running, process 22759
+ RETCODE=0
+ restore_netrc
+ mv /home/vagrant/.netrc.bak /home/vagrant/.netrc
+ true
Connection to 127.0.0.1 closed.

real25m36.144s
user0m1.358s
sys 0m0.595s
```


Thanks,

Mehrdad Nurolahzade



Re: Review Request 52437: Adding support for Ubuntu Xenial packages

2017-01-04 Thread Renan DelValle


> On Jan. 4, 2017, 6:01 a.m., Stephan Erb wrote:
> > We used to have shared specs for all Debian-based distributions. This patch 
> > is now duplicating `specs/debian` for `ubuntu-xenial`. Is there anything 
> > particular that makes this necessary?

The reason why I have it a separate the specs  is two fold. First ubuntu xenial 
doesn’t play well if there’s _any_ init script installed, so the separate spec 
doesn't have them (we can achieve the same effect by only copying necessary 
files out of a common folder). Second, the existing SystemD unit scripts for 
debian were not all that native. I wrote new ones to be closer to their RPM 
counterparts. However, as the debian ones are already being used, I didn't want 
to force my changes upon anyone. We can decide as a community the way we want 
to handle this though.


- Renan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52437/#review160490
---


On Jan. 4, 2017, 5:43 a.m., Renan DelValle wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52437/
> ---
> 
> (Updated Jan. 4, 2017, 5:43 a.m.)
> 
> 
> Review request for Aurora, Dmitriy Shirchenko and Zameer Manji.
> 
> 
> Bugs: AURORA-1872
> https://issues.apache.org/jira/browse/AURORA-1872
> 
> 
> Repository: aurora-packaging
> 
> 
> Description
> ---
> 
> Added builder and test environment for Xenial as well as updated instructions 
> on how to test it.
> 
> Added distributtion to release-candidate script.
> 
> 
> Diffs
> -
> 
>   build-support/release/release-candidate 
> 98df82412b6218ed7c1f650a7b1cc725ac297c37 
>   builder/deb/ubuntu-xenial/Dockerfile PRE-CREATION 
>   builder/deb/ubuntu-xenial/build.sh PRE-CREATION 
>   specs/debian/aurora-pants.ini e4f2c54df45f746ffb2a4e3f786fc4105323bc0b 
>   specs/debian/aurora-scheduler.startup.sh PRE-CREATION 
>   specs/ubuntu-xenial/aurora-doc.docs PRE-CREATION 
>   specs/ubuntu-xenial/aurora-doc.examples PRE-CREATION 
>   specs/ubuntu-xenial/aurora-pants.ini PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.default PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.install PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.links PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.postinst PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.service PRE-CREATION 
>   specs/ubuntu-xenial/aurora-scheduler.startup.sh PRE-CREATION 
>   specs/ubuntu-xenial/aurora-tools.install PRE-CREATION 
>   specs/ubuntu-xenial/aurora-tools.links PRE-CREATION 
>   specs/ubuntu-xenial/changelog PRE-CREATION 
>   specs/ubuntu-xenial/clusters.json PRE-CREATION 
>   specs/ubuntu-xenial/compat PRE-CREATION 
>   specs/ubuntu-xenial/control PRE-CREATION 
>   specs/ubuntu-xenial/copyright PRE-CREATION 
>   specs/ubuntu-xenial/rules PRE-CREATION 
>   specs/ubuntu-xenial/source/format PRE-CREATION 
>   specs/ubuntu-xenial/thermos.default PRE-CREATION 
>   specs/ubuntu-xenial/thermos.dirs PRE-CREATION 
>   specs/ubuntu-xenial/thermos.install PRE-CREATION 
>   specs/ubuntu-xenial/thermos.links PRE-CREATION 
>   specs/ubuntu-xenial/thermos.service PRE-CREATION 
>   test/deb/ubuntu-xenial/README.md PRE-CREATION 
>   test/deb/ubuntu-xenial/Vagrantfile PRE-CREATION 
>   test/deb/ubuntu-xenial/provision.sh PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/52437/diff/
> 
> 
> Testing
> ---
> 
> Created artifacts using the build-artifacts script.
> 
> Brought a vagrant image up, installed all deb files created by the artifacts 
> script, started both aurora-scheduler and thermos services, and launched a 
> sample job.
> 
> 
> Thanks,
> 
> Renan DelValle
> 
>



Re: Review Request 52437: Adding support for Ubuntu Xenial packages

2017-01-04 Thread Renan DelValle

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52437/
---

(Updated Jan. 4, 2017, 2:43 p.m.)


Review request for Aurora, Dmitriy Shirchenko and Zameer Manji.


Bugs: AURORA-1872
https://issues.apache.org/jira/browse/AURORA-1872


Repository: aurora-packaging


Description
---

Added builder and test environment for Xenial as well as updated instructions 
on how to test it.

Added distributtion to release-candidate script.


Diffs
-

  build-support/release/release-candidate 
98df82412b6218ed7c1f650a7b1cc725ac297c37 
  builder/deb/ubuntu-xenial/Dockerfile PRE-CREATION 
  builder/deb/ubuntu-xenial/build.sh PRE-CREATION 
  specs/debian/aurora-pants.ini e4f2c54df45f746ffb2a4e3f786fc4105323bc0b 
  specs/debian/aurora-scheduler.startup.sh PRE-CREATION 
  specs/ubuntu-xenial/aurora-doc.docs PRE-CREATION 
  specs/ubuntu-xenial/aurora-doc.examples PRE-CREATION 
  specs/ubuntu-xenial/aurora-pants.ini PRE-CREATION 
  specs/ubuntu-xenial/aurora-scheduler.default PRE-CREATION 
  specs/ubuntu-xenial/aurora-scheduler.install PRE-CREATION 
  specs/ubuntu-xenial/aurora-scheduler.links PRE-CREATION 
  specs/ubuntu-xenial/aurora-scheduler.postinst PRE-CREATION 
  specs/ubuntu-xenial/aurora-scheduler.service PRE-CREATION 
  specs/ubuntu-xenial/aurora-scheduler.startup.sh PRE-CREATION 
  specs/ubuntu-xenial/aurora-tools.install PRE-CREATION 
  specs/ubuntu-xenial/aurora-tools.links PRE-CREATION 
  specs/ubuntu-xenial/changelog PRE-CREATION 
  specs/ubuntu-xenial/clusters.json PRE-CREATION 
  specs/ubuntu-xenial/compat PRE-CREATION 
  specs/ubuntu-xenial/control PRE-CREATION 
  specs/ubuntu-xenial/copyright PRE-CREATION 
  specs/ubuntu-xenial/rules PRE-CREATION 
  specs/ubuntu-xenial/source/format PRE-CREATION 
  specs/ubuntu-xenial/thermos.default PRE-CREATION 
  specs/ubuntu-xenial/thermos.dirs PRE-CREATION 
  specs/ubuntu-xenial/thermos.install PRE-CREATION 
  specs/ubuntu-xenial/thermos.links PRE-CREATION 
  specs/ubuntu-xenial/thermos.service PRE-CREATION 
  test/deb/ubuntu-xenial/README.md PRE-CREATION 
  test/deb/ubuntu-xenial/Vagrantfile PRE-CREATION 
  test/deb/ubuntu-xenial/provision.sh PRE-CREATION 

Diff: https://reviews.apache.org/r/52437/diff/


Testing
---

Created artifacts using the build-artifacts script.

Brought a vagrant image up, installed all deb files created by the artifacts 
script, started both aurora-scheduler and thermos services, and launched a 
sample job.


Thanks,

Renan DelValle