> On Sept. 1, 2016, 7:51 p.m., Zameer Manji wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift, line 467
> > <https://reviews.apache.org/r/51536/diff/3/?file=1489406#file1489406line467>
> >
> >     This is not sufficent to determine if healthchecking occurs.
> >     
> >     We now support shell healthchecking, so a job may not have any port 
> > named health but it will still have it's health checked by thermos.
> >     
> >     Why can't enabling of this feature be a property of the Job or Update?
> 
> Kai Huang wrote:
>     We can infer if the health check is enabled from the Job Config, so I 
> think rather than modifying the Job, we should modify the Update to direct 
> the Instance Updater to skip watch_secs(e.g. creating a boolean named 
> "isWatchSecsSkippable" in InstanceUpdater).
> 
> Kai Huang wrote:
>     A second thought which seems to be a better solution: We just modify the 
> executor to send a message in its healthChecker thread, and on scheduler side 
> we check the presence of the task status update message string to determine 
> if watch_secs should be skipped.
>     
>     There is no need to check if health check is enabled at scheduler side 
> nor modifying the Update.
>     
>     The watch_secs is skipped by the vCurrent scheduler if and only if the 
> vCurrent executor is doing a health check, and sends an non-empty message 
> string during the TASK_RUNNING state transition.
> 
> Maxim Khutornenko wrote:
>     I think Kai's latest proposal should be sufficient to determine what 
> policy to apply. Iff RUNNING gets back with our "special" message added, the 
> updater will skip watch_secs. In all other cases, it will honor watch_secs.

ping for response.


- Kai


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review147597
-----------------------------------------------------------


On Sept. 2, 2016, 3:55 p.m., Kai Huang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> -----------------------------------------------------------
> 
> (Updated Sept. 2, 2016, 3:55 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
>     https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> - Scheduler updater will not use watch_sec if health check is enabled
> - Add unit tests for successful updates.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> 
> Assumptions on executor change:
> The executor starts health check at STARTING, if a successful health check is 
> performed before initial_interval_sec expires, the executor will sends a 
> status message for RUNNING.
> 
> What I try to implement:
> vCurrent updater will infer the executor version from the presence of a 
> status message sent by the executor during RUNNING status update. 
> 
> For vPrev executor:
>     The vCurrent updater will use watch_sec for instance update, this is 
> independent from the health check.
> For vCurrent executor:
>     If health check is disabled on vCurrent executor, the vCurrent updater 
> will use watch_sec for instance update.
>     If health check is enabled on vCurrent executor, the vCurrent updater 
> will not use watch_sec for instance update.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java 
> c129896d8cd54abd2634e2a339c27921042b0162 
>   src/test/java/org/apache/aurora/scheduler/updater/InstanceUpdaterTest.java 
> c78c7fbd7d600586136863c99ce3d7387895efee 
> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> -------
> 
> ./gradlew build
> 
> ./gradlew :test --tests 
> "org.apache.aurora.scheduler.updater.InstanceUpdaterTest"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>

Reply via email to