> On July 28, 2016, 8:44 p.m., David McLaughlin wrote:
> > For me ROLLED_BACK has a clear meaning - a job that was rolled back due to 
> > a failed health check in the update process. Overloading like this is going 
> > to lead to ambiguity and I wouldn't be comfortable exposing this 
> > functionality to users. 
> > 
> > Further to that, having to reinsert the code lock is basically a code smell 
> > that this approach is not what the current code was designed for. In fact 
> > this approach should have failed the unit tests because ROLLED_FORWARD (or 
> > any other terminal state) to ROLLING_BACK isn't an allowed state transition 
> > into the JobUpdateStateMachine: 
> > (https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/updater/JobUpdateStateMachine.java#L56).
> > 
> > It looks like a bug that only one of the changeUpdateStatus methods 
> > contains the transition check:
> > 
> > https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java#L404
> > 
> > I'm concerned that moving a previously terminal state into ROLLING_BACK 
> > creates a bunch of conceptual issues and complexity that will make the 
> > codebase (even) harder to understand. For example, throughout the code 
> > (also the UI code) we designate some states as terminal 
> > (https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java#L416)
> >  - but in the future we'll have to keep in mind that those states are 
> > TERMINAL_BUT_NOT_REALLY. ROLLED_BACK becomes the only true terminal state 
> > in this model. 
> > 
> > I'm not against reusing the rollback functionality built into the scheduler 
> > in the way you're doing, but maybe we should introduce a new STATE to 
> > reflect that it was user-initiated and add it to ALLOWED_TRANSITIONS? I'm 
> > not really familiar with this code, so maybe Bill or Maxim could speak to 
> > an approach (or dismiss my concerns ;)).
> 
> Igor Morozov wrote:
>     Would not creating a new state imply the same level of conceptual 
> complexity? You would need to keep in mind not only to properly handle 
> terminal states but also a new rollbackable one? May be terminal states are 
> not being useful as a conocept as they are being used only here 
> https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java#L451
>  where update job locks being cleand up and it sounds like it should be 
> refactored away (see Maxim's comment). So what I'm trying to say it would not 
> be too much of a stratch to extend the semantic of rollbacks and terminal 
> states for update jobs.

They are only used there in the Scheduler, yes. But for things like hooks and 
other processes around update lifecycle, the idea that an update can "come back 
to life" breaks a lot of assumptions. The use of pulse in your diff is a great 
example. How is the service that was sending pulses for the original update 
supposed to know the update was revived? The service we use to send pulses 
needs some signal for when it can stop otherwise it would need to spam forever 
- it relies on the terminal states to do that. I see similar problems with the 
hooks in the CLI too. 

Again, maybe this is too specific to our tooling and others might disagree with 
my concerns. We can always just remove this feature from our CLI.


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50168/#review144001
-----------------------------------------------------------


On July 28, 2016, 10:54 p.m., Igor Morozov wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50168/
> -----------------------------------------------------------
> 
> (Updated July 28, 2016, 10:54 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, Bill Farner, and 
> Zameer Manji.
> 
> 
> Bugs: AURORA-1721
>     https://issues.apache.org/jira/browse/AURORA-1721
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Add rollback functionality to the scheduler
> 
> Thic change also includes an addition to storage replicated log. I needed a 
> way to re-insert lock token for an existing job update.
> 
> 
> Diffs
> -----
> 
>   CHANGELOG fc6a46d77ebf889c8f60402c95e8a7472980ba1c 
>   RELEASE-NOTES.md 29d224d5596eb060356f1343cf347db79b1ad687 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> 1d66208490aff6ea8af4c737845fa2cf13617529 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> 9e4213f13255a182df938bea44ca87fa03a25318 
>   src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java 
> 52c3c6618a3cf1009435ca8a9cece36365913e55 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java 
> d2673e6b328cb1e249fbe91d18e0d9e935636eaa 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 
> 39924c62108f6a68f545f90d77ceab4265d41fad 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorage.java 
> d0de063fd78e6c4f62fae4a598d1d22f9775772d 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  b534abf95bab6e1657e3ef993cf34c0d6ec460be 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/aop/AnnotatedAuroraAdmin.java
>  9243c92b11040b68ed6014b3991db69fc08bcddf 
>   src/main/java/org/apache/aurora/scheduler/updater/JobUpdateController.java 
> f8357c46df1b025bf4e38a7ce1cb1c13a50c39f9 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  364c5c753f884a2d89e27802d7bbf3b2b6d3a08e 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 68baf8fdb90cd26100159401c46c9963c24332b3 
>   src/main/python/apache/aurora/client/cli/update.py 
> bb526f7bf94d7bfe02fe2786493c85be1bfeb86f 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> e157c0dfde5efc418448e138aa008ade742fe816 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afbd385b7eda64cb1f7d118b695e65e4045eac6c 
> 
> Diff: https://reviews.apache.org/r/50168/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Igor Morozov
> 
>

Reply via email to