----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30225/#review71063 -----------------------------------------------------------
I suggest you skip to the big comment before paying attention to the smaller ones. If you agree with the concern, i suggest an initial focused diff that implements receiving heartbeats, and asynchronously reacting to the lack of their presence. api/src/main/thrift/org/apache/aurora/gen/api.thrift <https://reviews.apache.org/r/30225/#comment116606> Maybe s/BLOCKED/AWAITING_PULSE/? That would at least self-document and avoid additional terminology. src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java <https://reviews.apache.org/r/30225/#comment116607> s/query/fetch/ to be consistent with the method above. src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java <https://reviews.apache.org/r/30225/#comment116619> Remove Optional here, do Optional.of() at the call site. src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java <https://reviews.apache.org/r/30225/#comment116620> TODO + ticket, this map needs to be exposed via a debug endpoint src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java <https://reviews.apache.org/r/30225/#comment116626> I think avoid acquisition of a write lock here is a good goal to aim for. If, for example, we are already holding the write lock for a large snapshot, we could cause it to appear as though we have not received a pulse for an extended duration. Also, given that there is likely to be an automated system on the other end, we should be somewhat defensive to a high call frequency. This suggests to me that potentially-expensive read operations should be avoided if possible. For these reasons, i think it would be wise to decouple recording of pulses from reacting to them. It's also apparent that there is currently no asynchronous transition to the 'blocked' states, they're triggered either by a tardy pulse or another external action. I think this would be surprising behavior, and the scheduler should automatically transition to these states without any external input. src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml <https://reviews.apache.org/r/30225/#comment116643> is it possible to reduce the diff a bit by moving this block to its former location, or is it needed - Bill Farner On Feb. 4, 2015, 5:24 p.m., Maxim Khutornenko wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/30225/ > ----------------------------------------------------------- > > (Updated Feb. 4, 2015, 5:24 p.m.) > > > Review request for Aurora, David McLaughlin, Joshua Cohen, and Bill Farner. > > > Bugs: AURORA-1010 > https://issues.apache.org/jira/browse/AURORA-1010 > > > Repository: aurora > > > Description > ------- > > Added pulsing support into the JobUpdateController. The qualified coordinated > updates get blocked until a pulse arrives. An update then becomes active and > proceeds until `blockIfNoPulsesAfterMs` expires or the update reaches a > terminal state (whichever comes first). > > Not particularly happy with plumbing through OneWayJobUpdater but the > alternative is a state machine change, which is much hairier and will require > more changes in the JobUpdaterController. Going with the minimal diff here. > > > Diffs > ----- > > api/src/main/thrift/org/apache/aurora/gen/api.thrift > 4f603f9e7ed004e53937a45ea8edf7241e15f5cf > src/main/java/org/apache/aurora/scheduler/storage/ForwardingStore.java > 7f2ec71d744d5fafac84291cc79feba3398d0e1e > src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java > b7d8d524e15f101416889c00efc3ecf2c8d9c1a1 > src/main/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStore.java > d479d20259f284528b2291e2e486b36d8e47fe5e > > src/main/java/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.java > 60f535988a20638fb16515d5027bfa65f1afb73c > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateController.java > d3b30d48b76d8d7c64cda006a34f7ed3296526f2 > > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java > ec9d1e07abca1bf30262e1c0f741a34741e96f5d > > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateStateMachine.java > 76460f95e058181b711fb6869f5a34c1d5bdfe31 > src/main/java/org/apache/aurora/scheduler/updater/OneWayJobUpdater.java > 27a5b9026f5ac3b3bdeb32813b10435bc3dab173 > > src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml > f9c9ceddc559b43b4a5c45c745d54ff47484edde > > src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java > ca7c0c2675477cc727ca006697665f997972dfde > > src/test/java/org/apache/aurora/scheduler/updater/JobUpdateStateMachineTest.java > 89765ac3d34a827d3748fb96a275d78e9d1b8b72 > src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java > 41d422939209d0808235128e4242c11e8ef25969 > src/test/java/org/apache/aurora/scheduler/updater/OneWayJobUpdaterTest.java > 7d0a7438b4a517e5e0d44f4e99aceb1a6d19f987 > > Diff: https://reviews.apache.org/r/30225/diff/ > > > Testing > ------- > > ./gradlew -Pq build > > > Thanks, > > Maxim Khutornenko > >
