-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55179/
-----------------------------------------------------------
Review request for Aurora, Joshua Cohen and Stephan Erb.
Bugs: AURORA-1820
https://issues.apache.org/jira/browse/AURORA-1820
Repository: aurora
Description
-------
`TimedOutTaskHandler` acquires the storage write lock for every task each time
the task transitions to a transient state. After a default timeout period of
5 minutes, it verifies whether the task has transitioned out of the transient
state. The verification step takes place while holding the storage write lock.
In over 99% of cases the logic short-circuits and returns from
`StateManagerImpl.updateTaskAndExternalState()` once it learns that the task
has transitioned out of the transient state.
This patch reduces storage write lock contention by adopting the double-checked
locking pattern in `TimedOutTaskHandler.run()`.
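For reviewers less familiar with the pattern, here is a minimal, self-contained
sketch of the idea. It is not the code in this patch: `TimeoutCheckSketch`,
the `Status` enum, the in-memory map and the `ReentrantReadWriteLock` are
hypothetical stand-ins for the scheduler's storage, and moving a timed-out task
to `LOST` is assumed here for illustration only. The handler first checks the
task state without taking the write lock, and takes the write lock (then
re-checks) only if the task still looks stuck.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the timeout handler; not Aurora's actual code. */
final class TimeoutCheckSketch {
  /** Illustrative subset of task states; ASSIGNED plays the transient state. */
  enum Status { ASSIGNED, RUNNING, LOST }

  // Assumption: a read/write lock stands in for the storage lock.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, Status> tasks = new ConcurrentHashMap<>();

  // Counts write lock acquisitions so the effect of the first check is visible.
  final AtomicInteger writeLockAcquisitions = new AtomicInteger();

  void put(String taskId, Status status) {
    tasks.put(taskId, status);
  }

  Status get(String taskId) {
    return tasks.get(taskId);
  }

  private static boolean isTransient(Status status) {
    return status == Status.ASSIGNED;
  }

  /**
   * Invoked when the timeout for a task expires.
   * Returns true only if the task was still transient and was moved to LOST.
   */
  boolean onTimeout(String taskId) {
    // First check: read-only, does not contend for the write lock.
    lock.readLock().lock();
    try {
      if (!isTransient(tasks.get(taskId))) {
        return false; // Common case: the task already moved on.
      }
    } finally {
      lock.readLock().unlock();
    }

    // Second check: the state may have changed between the two checks,
    // so it is verified again while holding the write lock.
    lock.writeLock().lock();
    try {
      writeLockAcquisitions.incrementAndGet();
      if (!isTransient(tasks.get(taskId))) {
        return false;
      }
      tasks.put(taskId, Status.LOST);
      return true;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

In this sketch, a task that has already left the transient state never reaches
the write lock, which corresponds to the common short-circuit case described
above. The second check is still required for correctness, since the state can
change between the unlocked read and the lock acquisition.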
Diffs
-----
src/main/java/org/apache/aurora/scheduler/reconciliation/TaskTimeout.java
2dc9bc2c6916595270187f0f29d5bd8c5ba7e9ad
src/test/java/org/apache/aurora/scheduler/reconciliation/TaskTimeoutTest.java
1006ddb6caea015c2d4e014bd044f2933541c84f
Diff: https://reviews.apache.org/r/55179/diff/
Testing
-------
```
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
...
*** OK (All tests passed) ***
mesos-master start/running, process 22759
+ RETCODE=0
+ restore_netrc
+ mv /home/vagrant/.netrc.bak /home/vagrant/.netrc
+ true
Connection to 127.0.0.1 closed.
real 25m36.144s
user 0m1.358s
sys 0m0.595s
```
Thanks,
Mehrdad Nurolahzade