> On Sept. 1, 2016, 7:51 p.m., Zameer Manji wrote: > > api/src/main/thrift/org/apache/aurora/gen/api.thrift, line 467 > > <https://reviews.apache.org/r/51536/diff/3/?file=1489406#file1489406line467> > > > > This is not sufficent to determine if healthchecking occurs. > > > > We now support shell healthchecking, so a job may not have any port > > named health but it will still have it's health checked by thermos. > > > > Why can't enabling of this feature be a property of the Job or Update? > > Kai Huang wrote: > We can infer if the health check is enabled from the Job Config, so I > think rather than modifying the Job, we should modify the Update to direct > the Instance Updater to skip watch_secs(e.g. creating a boolean named > "isWatchSecsSkippable" in InstanceUpdater). > > Kai Huang wrote: > A second thought which seems to be a better solution: We just modify the > executor to send a message in its healthChecker thread, and on scheduler side > we check the presence of the task status update message string to determine > if watch_secs should be skipped. > > There is no need to check if health check is enabled at scheduler side > nor modifying the Update. > > The watch_secs is skipped by the vCurrent scheduler if and only if the > vCurrent executor is doing a health check, and sends an non-empty message > string during the TASK_RUNNING state transition. > > Maxim Khutornenko wrote: > I think Kai's latest proposal should be sufficient to determine what > policy to apply. Iff RUNNING gets back with our "special" message added, the > updater will skip watch_secs. In all other cases, it will honor watch_secs.
ping for response. - Kai ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/#review147597 ----------------------------------------------------------- On Sept. 2, 2016, 3:55 p.m., Kai Huang wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51536/ > ----------------------------------------------------------- > > (Updated Sept. 2, 2016, 3:55 p.m.) > > > Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. > > > Bugs: AURORA-894 > https://issues.apache.org/jira/browse/AURORA-894 > > > Repository: aurora > > > Description > ------- > > - Scheduler updater will not use watch_sec if health check is enabled > - Add unit tests for successful updates. > > This feature intends to improve reliability and performance of the Aurora > scheduler job updater by relying on health check status rather than > watch_secs timeout when deciding an individual instance update state. > > See this epic: https://issues.apache.org/jira/browse/AURORA-894 > and the design doc: > https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# > for more details and background. > > > Assumptions on executor change: > The executor starts health check at STARTING, if a successful health check is > performed before initial_interval_sec expires, the executor will sends a > status message for RUNNING. > > What I try to implement: > vCurrent updater will infer the executor version from the presence of a > status message sent by the executor during RUNNING status update. > > For vPrev executor: > The vCurrent updater will use watch_sec for instance update, this is > independent from the health check. > For vCurrent executor: > If health check is disabled on vCurrent executor, the vCurrent updater > will use watch_sec for instance update. > If health check is enabled on vCurrent executor, the vCurrent updater > will not use watch_sec for instance update. > > > Diffs > ----- > > api/src/main/thrift/org/apache/aurora/gen/api.thrift > c5765b70501c101f0535b4eed94e9948c36808f9 > src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java > c129896d8cd54abd2634e2a339c27921042b0162 > src/test/java/org/apache/aurora/scheduler/updater/InstanceUpdaterTest.java > c78c7fbd7d600586136863c99ce3d7387895efee > > Diff: https://reviews.apache.org/r/51536/diff/ > > > Testing > ------- > > ./gradlew build > > ./gradlew :test --tests > "org.apache.aurora.scheduler.updater.InstanceUpdaterTest" > > ./build-support/jenkins/build.sh > > > Thanks, > > Kai Huang > >