-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55179/
-----------------------------------------------------------

Review request for Aurora, Joshua Cohen and Stephan Erb.


Bugs: AURORA-1820
    https://issues.apache.org/jira/browse/AURORA-1820


Repository: aurora


Description
-------

`TimedOutTaskHandler` acquires the storage write lock for every task each time 
the task transitions to a transient state. After a default time-out period of 
5 minutes, it verifies whether the task has transitioned out of the transient 
state.

The verification step takes place while holding the storage write lock. In over 
99% of cases the logic short-circuits and returns from 
`StateManagerImpl.updateTaskAndExternalState()` once it learns that the task 
has already transitioned out of the transient state.

This patch reduces storage write lock contention by adopting the 
double-checked locking pattern in `TimedOutTaskHandler.run()`.
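
For readers unfamiliar with the pattern, here is a minimal, self-contained 
sketch of the idea. It is not the code in this patch: the class, the `Status` 
enum, the in-memory map, and the `onTimeout()` method are hypothetical 
stand-ins for Aurora's storage and state manager. The first check runs without 
the write lock; the lock is taken only if the task still looks stuck, and the 
check is then repeated under the lock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; none of these names exist in Aurora.
final class DoubleCheckedTimeoutSketch {
  enum Status { ASSIGNED, RUNNING, KILLING, LOST }

  // Stand-in for task storage; reads do not need the write lock.
  private final Map<String, Status> tasks = new ConcurrentHashMap<>();
  // Stand-in for the storage write lock.
  private final ReentrantLock writeLock = new ReentrantLock();
  // Counts how often the write lock was actually taken (updated under the lock).
  private int writeLockAcquisitions = 0;

  void setStatus(String taskId, Status status) {
    tasks.put(taskId, status);
  }

  Status getStatus(String taskId) {
    return tasks.get(taskId);
  }

  int getWriteLockAcquisitions() {
    return writeLockAcquisitions;
  }

  // Called when the time-out for a task in transient state `expected` expires.
  // Returns true only if the task was still stuck and has been marked LOST.
  boolean onTimeout(String taskId, Status expected) {
    // First check, without the write lock: the common case exits here.
    if (tasks.get(taskId) != expected) {
      return false;
    }
    writeLock.lock();
    try {
      writeLockAcquisitions++;
      // Second check, under the lock: the task may have moved on in between.
      if (tasks.get(taskId) != expected) {
        return false;
      }
      tasks.put(taskId, Status.LOST);
      return true;
    } finally {
      writeLock.unlock();
    }
  }

  public static void main(String[] args) {
    DoubleCheckedTimeoutSketch sketch = new DoubleCheckedTimeoutSketch();

    // Task left ASSIGNED before the time-out fired: no write lock needed.
    sketch.setStatus("task-1", Status.RUNNING);
    System.out.println(sketch.onTimeout("task-1", Status.ASSIGNED)
        + " " + sketch.getWriteLockAcquisitions());

    // Task is still KILLING at time-out: lock is taken and the task is marked LOST.
    sketch.setStatus("task-2", Status.KILLING);
    System.out.println(sketch.onTimeout("task-2", Status.KILLING)
        + " " + sketch.getWriteLockAcquisitions());
  }
}
```

The second check is what keeps the pattern safe: a task can leave the 
transient state between the unlocked read and the lock acquisition, so the 
state change is applied only if the condition still holds under the lock.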


Diffs
-----

  src/main/java/org/apache/aurora/scheduler/reconciliation/TaskTimeout.java 
2dc9bc2c6916595270187f0f29d5bd8c5ba7e9ad 
  src/test/java/org/apache/aurora/scheduler/reconciliation/TaskTimeoutTest.java 
1006ddb6caea015c2d4e014bd044f2933541c84f 

Diff: https://reviews.apache.org/r/55179/diff/


Testing
-------

```
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh

...

*** OK (All tests passed) ***

mesos-master start/running, process 22759
+ RETCODE=0
+ restore_netrc
+ mv /home/vagrant/.netrc.bak /home/vagrant/.netrc
+ true
Connection to 127.0.0.1 closed.

real    25m36.144s
user    0m1.358s
sys     0m0.595s
```


Thanks,

Mehrdad Nurolahzade
