Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/ --- (Updated Sept. 7, 2016, 10 p.m.) Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. Changes --- @ReviewBot retry update the release notes Bugs: AURORA-894 https://issues.apache.org/jira/browse/AURORA-894 Repository: aurora Description --- - Scheduler updater will not use watch_sec if health check is enabled. This feature intends to improve reliability and performance of the Aurora scheduler job updater by relying on health check status rather than watch_secs timeout when deciding an individual instance update state. See this epic: https://issues.apache.org/jira/browse/AURORA-894 and the design doc: https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# for more details and background. After discussion on Aurora dev list, we decided to keep the watch_secs infrastructure intact on scheduler side. Our final conclusion is that we adopt the following implementation: 1. If the users want purely health checking driven updates they can set watch_secs to 0 and enable health checks. 2. If they want to have both health checking and time driven updates they can set watch_secs to the time that they care about, and doing health checks at STARTING state as well. 3. If they just want time driven updates they can disable health checking and set watch_secs to the time that they care about. In this review, there will be only one scheduler change: Currently scheduler does not accept zero value for watch_secs, we need to relax this constraint. Executor change to do (in a separate review): The executor starts health check at STARTING, if a successful health check is performed before initial_interval_sec expires, the executor will sends a status message for RUNNING. Diffs (updated) - RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 04551f17999d742c53dfb1b36286b119b448550e Diff: https://reviews.apache.org/r/51536/diff/ Testing --- ./gradlew build ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT" ./build-support/jenkins/build.sh Thanks, Kai Huang
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/#review148095 --- Master (19866b5) is red with this patch. ./build-support/jenkins/build.sh [1m# Create file stdout for capturing output. We can't use StringIO mock[0m [1m# because TestProcess is running fork.[0m [1mwith open(os.path.join(td, 'sys_stdout'), 'w+') as stdout:[0m [1m with open(os.path.join(td, 'sys_stderr'), 'w+') as stderr:[0m [1mwith mutable_sys():[0m [1m sys.stdout, sys.stderr = stdout, stderr[0m [1m[0m [1m p = TestProcess('process', 'echo hello world; echo >&2 hello stderr', 0,[0m [1m taskpath, sandbox, logger_destination=LoggerDestination.BOTH)[0m [1m p.start()[0m [1m rc = wait_for_rc(taskpath.getpath('process_checkpoint'))[0m [1m[0m [1m assert rc == 0[0m [1m # Check log files were created in std path with correct content[0m [1m> assert_log_content(taskpath, 'stdout', 'hello world\n')[0m src/test/python/apache/thermos/core/test_process.py:487: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ taskpath = log_name = 'stdout' expected_content = 'hello world\n' [1mdef assert_log_content(taskpath, log_name, expected_content):[0m [1m log = taskpath.with_filename(log_name).getpath('process_logdir')[0m [1m assert os.path.exists(log)[0m [1m with open(log, 'r') as fp:[0m [1m> assert fp.read() == expected_content[0m [1m[31mE assert '' == 'hello world\n'[0m [1m[31mE + hello world[0m src/test/python/apache/thermos/core/test_process.py:313: AssertionError generated xml file: /home/jenkins/jenkins-slave/workspace/AuroraBot/dist/test-results/415337499eb72578eab327a6487c1f5c9452b3d6.xml [1m[31m 1 failed, 700 passed, 6 skipped, 1 warnings in 396.57 seconds [0m FAILURE 21:44:37 07:22 [complete][31m FAILURE[0m I will refresh this build result if you post a review containing "@ReviewBot retry" - Aurora ReviewBot On Sept. 7, 2016, 9:28 p.m., Kai Huang wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51536/ > --- > > (Updated Sept. 7, 2016, 9:28 p.m.) > > > Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. > > > Bugs: AURORA-894 > https://issues.apache.org/jira/browse/AURORA-894 > > > Repository: aurora > > > Description > --- > > - Scheduler updater will not use watch_sec if health check is enabled. > > This feature intends to improve reliability and performance of the Aurora > scheduler job updater by relying on health check status rather than > watch_secs timeout when deciding an individual instance update state. > > See this epic: https://issues.apache.org/jira/browse/AURORA-894 > and the design doc: > https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# > for more details and background. > > After discussion on Aurora dev list, we decided to keep the watch_secs > infrastructure intact on scheduler side. Our final conclusion is that we > adopt the following implementation: > > 1. If the users want purely health checking driven updates they can set > watch_secs to 0 and enable health checks. > > 2. If they want to have both health checking and time driven updates they can > set watch_secs to the time that they care about, and doing health checks at > STARTING state as well. > > 3. If they just want time driven updates they can disable health checking and > set watch_secs to the time that they care about. > > In this review, there will be only one scheduler change: > Currently scheduler does not accept zero value for watch_secs, we need to > relax this constraint. > > Executor change to do (in a separate review): > The executor starts health check at STARTING, if a successful health check is > perfo
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/#review148092 --- Ship it! Ship It! - Joshua Cohen On Sept. 7, 2016, 9:28 p.m., Kai Huang wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51536/ > --- > > (Updated Sept. 7, 2016, 9:28 p.m.) > > > Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. > > > Bugs: AURORA-894 > https://issues.apache.org/jira/browse/AURORA-894 > > > Repository: aurora > > > Description > --- > > - Scheduler updater will not use watch_sec if health check is enabled. > > This feature intends to improve reliability and performance of the Aurora > scheduler job updater by relying on health check status rather than > watch_secs timeout when deciding an individual instance update state. > > See this epic: https://issues.apache.org/jira/browse/AURORA-894 > and the design doc: > https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# > for more details and background. > > After discussion on Aurora dev list, we decided to keep the watch_secs > infrastructure intact on scheduler side. Our final conclusion is that we > adopt the following implementation: > > 1. If the users want purely health checking driven updates they can set > watch_secs to 0 and enable health checks. > > 2. If they want to have both health checking and time driven updates they can > set watch_secs to the time that they care about, and doing health checks at > STARTING state as well. > > 3. If they just want time driven updates they can disable health checking and > set watch_secs to the time that they care about. > > In this review, there will be only one scheduler change: > Currently scheduler does not accept zero value for watch_secs, we need to > relax this constraint. > > Executor change to do (in a separate review): > The executor starts health check at STARTING, if a successful health check is > performed before initial_interval_sec expires, the executor will sends a > status message for RUNNING. > > > Diffs > - > > RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 > src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java > ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa > src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java > 04551f17999d742c53dfb1b36286b119b448550e > > Diff: https://reviews.apache.org/r/51536/diff/ > > > Testing > --- > > ./gradlew build > > ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT" > > ./build-support/jenkins/build.sh > > > Thanks, > > Kai Huang > >
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/ --- (Updated Sept. 7, 2016, 9:28 p.m.) Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. Changes --- Update release notes. Bugs: AURORA-894 https://issues.apache.org/jira/browse/AURORA-894 Repository: aurora Description --- - Scheduler updater will not use watch_sec if health check is enabled. This feature intends to improve reliability and performance of the Aurora scheduler job updater by relying on health check status rather than watch_secs timeout when deciding an individual instance update state. See this epic: https://issues.apache.org/jira/browse/AURORA-894 and the design doc: https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# for more details and background. After discussion on Aurora dev list, we decided to keep the watch_secs infrastructure intact on scheduler side. Our final conclusion is that we adopt the following implementation: 1. If the users want purely health checking driven updates they can set watch_secs to 0 and enable health checks. 2. If they want to have both health checking and time driven updates they can set watch_secs to the time that they care about, and doing health checks at STARTING state as well. 3. If they just want time driven updates they can disable health checking and set watch_secs to the time that they care about. In this review, there will be only one scheduler change: Currently scheduler does not accept zero value for watch_secs, we need to relax this constraint. Executor change to do (in a separate review): The executor starts health check at STARTING, if a successful health check is performed before initial_interval_sec expires, the executor will sends a status message for RUNNING. Diffs (updated) - RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 04551f17999d742c53dfb1b36286b119b448550e Diff: https://reviews.apache.org/r/51536/diff/ Testing --- ./gradlew build ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT" ./build-support/jenkins/build.sh Thanks, Kai Huang
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
> On Sept. 7, 2016, 8:18 p.m., Joshua Cohen wrote: > > This change should probably be called out in RELEASE_NOTES.md? This feature will not be available to users until we relax the client-side constraint. We can add it to RELEASE_NOTES when the executor/client feature is ready. - Kai --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/#review148075 --- On Sept. 6, 2016, 6:46 p.m., Kai Huang wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51536/ > --- > > (Updated Sept. 6, 2016, 6:46 p.m.) > > > Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. > > > Bugs: AURORA-894 > https://issues.apache.org/jira/browse/AURORA-894 > > > Repository: aurora > > > Description > --- > > - Scheduler updater will not use watch_sec if health check is enabled. > > This feature intends to improve reliability and performance of the Aurora > scheduler job updater by relying on health check status rather than > watch_secs timeout when deciding an individual instance update state. > > See this epic: https://issues.apache.org/jira/browse/AURORA-894 > and the design doc: > https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# > for more details and background. > > After discussion on Aurora dev list, we decided to keep the watch_secs > infrastructure intact on scheduler side. Our final conclusion is that we > adopt the following implementation: > > 1. If the users want purely health checking driven updates they can set > watch_secs to 0 and enable health checks. > > 2. If they want to have both health checking and time driven updates they can > set watch_secs to the time that they care about, and doing health checks at > STARTING state as well. > > 3. If they just want time driven updates they can disable health checking and > set watch_secs to the time that they care about. > > In this review, there will be only one scheduler change: > Currently scheduler does not accept zero value for watch_secs, we need to > relax this constraint. > > Executor change to do (in a separate review): > The executor starts health check at STARTING, if a successful health check is > performed before initial_interval_sec expires, the executor will sends a > status message for RUNNING. > > > Diffs > - > > src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java > ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa > src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java > 04551f17999d742c53dfb1b36286b119b448550e > > Diff: https://reviews.apache.org/r/51536/diff/ > > > Testing > --- > > ./gradlew build > > ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT" > > ./build-support/jenkins/build.sh > > > Thanks, > > Kai Huang > >
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/#review148075 --- This change should probably be called out in RELEASE_NOTES.md? - Joshua Cohen On Sept. 6, 2016, 6:46 p.m., Kai Huang wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51536/ > --- > > (Updated Sept. 6, 2016, 6:46 p.m.) > > > Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. > > > Bugs: AURORA-894 > https://issues.apache.org/jira/browse/AURORA-894 > > > Repository: aurora > > > Description > --- > > - Scheduler updater will not use watch_sec if health check is enabled. > > This feature intends to improve reliability and performance of the Aurora > scheduler job updater by relying on health check status rather than > watch_secs timeout when deciding an individual instance update state. > > See this epic: https://issues.apache.org/jira/browse/AURORA-894 > and the design doc: > https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# > for more details and background. > > After discussion on Aurora dev list, we decided to keep the watch_secs > infrastructure intact on scheduler side. Our final conclusion is that we > adopt the following implementation: > > 1. If the users want purely health checking driven updates they can set > watch_secs to 0 and enable health checks. > > 2. If they want to have both health checking and time driven updates they can > set watch_secs to the time that they care about, and doing health checks at > STARTING state as well. > > 3. If they just want time driven updates they can disable health checking and > set watch_secs to the time that they care about. > > In this review, there will be only one scheduler change: > Currently scheduler does not accept zero value for watch_secs, we need to > relax this constraint. > > Executor change to do (in a separate review): > The executor starts health check at STARTING, if a successful health check is > performed before initial_interval_sec expires, the executor will sends a > status message for RUNNING. > > > Diffs > - > > src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java > ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa > src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java > 04551f17999d742c53dfb1b36286b119b448550e > > Diff: https://reviews.apache.org/r/51536/diff/ > > > Testing > --- > > ./gradlew build > > ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT" > > ./build-support/jenkins/build.sh > > > Thanks, > > Kai Huang > >
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
> On Sept. 6, 2016, 8:56 p.m., Stephan Erb wrote: > > The current default of `watch_secs` is 45 seconds. Should we drop that to 0 > > and also adapt the docs accordingly? That would optimize the default > > settings for the health check driven updates, which I think would be a good > > thing. > > Maxim Khutornenko wrote: > If we drop it to 0 now it will break many clients currently relying on > it. I think we'll have to wait until the next release if we want to set a new > default. Thanks Stephan, we will release in the order of: scheduler -> executor -> client. - Kai --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/#review147917 --- On Sept. 6, 2016, 6:46 p.m., Kai Huang wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51536/ > --- > > (Updated Sept. 6, 2016, 6:46 p.m.) > > > Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. > > > Bugs: AURORA-894 > https://issues.apache.org/jira/browse/AURORA-894 > > > Repository: aurora > > > Description > --- > > - Scheduler updater will not use watch_sec if health check is enabled. > > This feature intends to improve reliability and performance of the Aurora > scheduler job updater by relying on health check status rather than > watch_secs timeout when deciding an individual instance update state. > > See this epic: https://issues.apache.org/jira/browse/AURORA-894 > and the design doc: > https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# > for more details and background. > > After discussion on Aurora dev list, we decided to keep the watch_secs > infrastructure intact on scheduler side. Our final conclusion is that we > adopt the following implementation: > > 1. If the users want purely health checking driven updates they can set > watch_secs to 0 and enable health checks. > > 2. If they want to have both health checking and time driven updates they can > set watch_secs to the time that they care about, and doing health checks at > STARTING state as well. > > 3. If they just want time driven updates they can disable health checking and > set watch_secs to the time that they care about. > > In this review, there will be only one scheduler change: > Currently scheduler does not accept zero value for watch_secs, we need to > relax this constraint. > > Executor change to do (in a separate review): > The executor starts health check at STARTING, if a successful health check is > performed before initial_interval_sec expires, the executor will sends a > status message for RUNNING. > > > Diffs > - > > src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java > ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa > src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java > 04551f17999d742c53dfb1b36286b119b448550e > > Diff: https://reviews.apache.org/r/51536/diff/ > > > Testing > --- > > ./gradlew build > > ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT" > > ./build-support/jenkins/build.sh > > > Thanks, > > Kai Huang > >
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/#review147922 --- Ship it! Ship It! - Maxim Khutornenko On Sept. 6, 2016, 6:46 p.m., Kai Huang wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51536/ > --- > > (Updated Sept. 6, 2016, 6:46 p.m.) > > > Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. > > > Bugs: AURORA-894 > https://issues.apache.org/jira/browse/AURORA-894 > > > Repository: aurora > > > Description > --- > > - Scheduler updater will not use watch_sec if health check is enabled. > > This feature intends to improve reliability and performance of the Aurora > scheduler job updater by relying on health check status rather than > watch_secs timeout when deciding an individual instance update state. > > See this epic: https://issues.apache.org/jira/browse/AURORA-894 > and the design doc: > https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# > for more details and background. > > After discussion on Aurora dev list, we decided to keep the watch_secs > infrastructure intact on scheduler side. Our final conclusion is that we > adopt the following implementation: > > 1. If the users want purely health checking driven updates they can set > watch_secs to 0 and enable health checks. > > 2. If they want to have both health checking and time driven updates they can > set watch_secs to the time that they care about, and doing health checks at > STARTING state as well. > > 3. If they just want time driven updates they can disable health checking and > set watch_secs to the time that they care about. > > In this review, there will be only one scheduler change: > Currently scheduler does not accept zero value for watch_secs, we need to > relax this constraint. > > Executor change to do (in a separate review): > The executor starts health check at STARTING, if a successful health check is > performed before initial_interval_sec expires, the executor will sends a > status message for RUNNING. > > > Diffs > - > > src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java > ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa > src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java > 04551f17999d742c53dfb1b36286b119b448550e > > Diff: https://reviews.apache.org/r/51536/diff/ > > > Testing > --- > > ./gradlew build > > ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT" > > ./build-support/jenkins/build.sh > > > Thanks, > > Kai Huang > >
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
> On Sept. 6, 2016, 8:56 p.m., Stephan Erb wrote: > > The current default of `watch_secs` is 45 seconds. Should we drop that to 0 > > and also adapt the docs accordingly? That would optimize the default > > settings for the health check driven updates, which I think would be a good > > thing. If we drop it to 0 now it will break many clients currently relying on it. I think we'll have to wait until the next release if we want to set a new default. - Maxim --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/#review147917 --- On Sept. 6, 2016, 6:46 p.m., Kai Huang wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51536/ > --- > > (Updated Sept. 6, 2016, 6:46 p.m.) > > > Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. > > > Bugs: AURORA-894 > https://issues.apache.org/jira/browse/AURORA-894 > > > Repository: aurora > > > Description > --- > > - Scheduler updater will not use watch_sec if health check is enabled. > > This feature intends to improve reliability and performance of the Aurora > scheduler job updater by relying on health check status rather than > watch_secs timeout when deciding an individual instance update state. > > See this epic: https://issues.apache.org/jira/browse/AURORA-894 > and the design doc: > https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# > for more details and background. > > After discussion on Aurora dev list, we decided to keep the watch_secs > infrastructure intact on scheduler side. Our final conclusion is that we > adopt the following implementation: > > 1. If the users want purely health checking driven updates they can set > watch_secs to 0 and enable health checks. > > 2. If they want to have both health checking and time driven updates they can > set watch_secs to the time that they care about, and doing health checks at > STARTING state as well. > > 3. If they just want time driven updates they can disable health checking and > set watch_secs to the time that they care about. > > In this review, there will be only one scheduler change: > Currently scheduler does not accept zero value for watch_secs, we need to > relax this constraint. > > Executor change to do (in a separate review): > The executor starts health check at STARTING, if a successful health check is > performed before initial_interval_sec expires, the executor will sends a > status message for RUNNING. > > > Diffs > - > > src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java > ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa > src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java > 04551f17999d742c53dfb1b36286b119b448550e > > Diff: https://reviews.apache.org/r/51536/diff/ > > > Testing > --- > > ./gradlew build > > ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT" > > ./build-support/jenkins/build.sh > > > Thanks, > > Kai Huang > >
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/#review147917 --- The current default of `watch_secs` is 45 seconds. Should we drop that to 0 and also adapt the docs accordingly? That would optimize the default settings for the health check driven updates, which I think would be a good thing. - Stephan Erb On Sept. 6, 2016, 8:46 p.m., Kai Huang wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51536/ > --- > > (Updated Sept. 6, 2016, 8:46 p.m.) > > > Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. > > > Bugs: AURORA-894 > https://issues.apache.org/jira/browse/AURORA-894 > > > Repository: aurora > > > Description > --- > > - Scheduler updater will not use watch_sec if health check is enabled. > > This feature intends to improve reliability and performance of the Aurora > scheduler job updater by relying on health check status rather than > watch_secs timeout when deciding an individual instance update state. > > See this epic: https://issues.apache.org/jira/browse/AURORA-894 > and the design doc: > https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# > for more details and background. > > After discussion on Aurora dev list, we decided to keep the watch_secs > infrastructure intact on scheduler side. Our final conclusion is that we > adopt the following implementation: > > 1. If the users want purely health checking driven updates they can set > watch_secs to 0 and enable health checks. > > 2. If they want to have both health checking and time driven updates they can > set watch_secs to the time that they care about, and doing health checks at > STARTING state as well. > > 3. If they just want time driven updates they can disable health checking and > set watch_secs to the time that they care about. > > In this review, there will be only one scheduler change: > Currently scheduler does not accept zero value for watch_secs, we need to > relax this constraint. > > Executor change to do (in a separate review): > The executor starts health check at STARTING, if a successful health check is > performed before initial_interval_sec expires, the executor will sends a > status message for RUNNING. > > > Diffs > - > > src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java > ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa > src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java > 04551f17999d742c53dfb1b36286b119b448550e > > Diff: https://reviews.apache.org/r/51536/diff/ > > > Testing > --- > > ./gradlew build > > ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT" > > ./build-support/jenkins/build.sh > > > Thanks, > > Kai Huang > >
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/#review147893 --- Ship it! Master (5d3f945) is green with this patch. ./build-support/jenkins/build.sh I will refresh this build result if you post a review containing "@ReviewBot retry" - Aurora ReviewBot On Sept. 6, 2016, 6:46 p.m., Kai Huang wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51536/ > --- > > (Updated Sept. 6, 2016, 6:46 p.m.) > > > Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. > > > Bugs: AURORA-894 > https://issues.apache.org/jira/browse/AURORA-894 > > > Repository: aurora > > > Description > --- > > - Scheduler updater will not use watch_sec if health check is enabled. > > This feature intends to improve reliability and performance of the Aurora > scheduler job updater by relying on health check status rather than > watch_secs timeout when deciding an individual instance update state. > > See this epic: https://issues.apache.org/jira/browse/AURORA-894 > and the design doc: > https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# > for more details and background. > > After discussion on Aurora dev list, we decided to keep the watch_secs > infrastructure intact on scheduler side. Our final conclusion is that we > adopt the following implementation: > > 1. If the users want purely health checking driven updates they can set > watch_secs to 0 and enable health checks. > > 2. If they want to have both health checking and time driven updates they can > set watch_secs to the time that they care about, and doing health checks at > STARTING state as well. > > 3. If they just want time driven updates they can disable health checking and > set watch_secs to the time that they care about. > > In this review, there will be only one scheduler change: > Currently scheduler does not accept zero value for watch_secs, we need to > relax this constraint. > > Executor change to do (in a separate review): > The executor starts health check at STARTING, if a successful health check is > performed before initial_interval_sec expires, the executor will sends a > status message for RUNNING. > > > Diffs > - > > src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java > ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa > src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java > 04551f17999d742c53dfb1b36286b119b448550e > > Diff: https://reviews.apache.org/r/51536/diff/ > > > Testing > --- > > ./gradlew build > > ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT" > > ./build-support/jenkins/build.sh > > > Thanks, > > Kai Huang > >
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/ --- (Updated Sept. 6, 2016, 6:46 p.m.) Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji. Changes --- Relax the watch_secs PreCondition check constraint on scheduler. Summary (updated) - Scheduler updater will not use watch_sec if health check is enabled Bugs: AURORA-894 https://issues.apache.org/jira/browse/AURORA-894 Repository: aurora Description (updated) --- - Scheduler updater will not use watch_sec if health check is enabled. This feature intends to improve reliability and performance of the Aurora scheduler job updater by relying on health check status rather than watch_secs timeout when deciding an individual instance update state. See this epic: https://issues.apache.org/jira/browse/AURORA-894 and the design doc: https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit# for more details and background. After discussion on Aurora dev list, we decided to keep the watch_secs infrastructure intact on scheduler side. Our final conclusion is that we adopt the following implementation: 1. If the users want purely health checking driven updates they can set watch_secs to 0 and enable health checks. 2. If they want to have both health checking and time driven updates they can set watch_secs to the time that they care about, and doing health checks at STARTING state as well. 3. If they just want time driven updates they can disable health checking and set watch_secs to the time that they care about. In this review, there will be only one scheduler change: Currently scheduler does not accept zero value for watch_secs, we need to relax this constraint. Executor change to do (in a separate review): The executor starts health check at STARTING, if a successful health check is performed before initial_interval_sec expires, the executor will sends a status message for RUNNING. Diffs (updated) - src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 04551f17999d742c53dfb1b36286b119b448550e Diff: https://reviews.apache.org/r/51536/diff/ Testing (updated) --- ./gradlew build ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT" ./build-support/jenkins/build.sh Thanks, Kai Huang
Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/#review147349 --- Master (c99f2fb) is red with this patch. ./build-support/jenkins/build.sh :commons:generateThriftResources :commons:processResources :commons:classes :commons:jar :compileJava/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorage.java:74: Note: Wrote forwarder org.apache.aurora.scheduler.storage.log.WriteAheadStorageForwarder @Forward({ ^ Note: Writing file:/home/jenkins/jenkins-slave/workspace/AuroraBot/dist/classes/main/org/apache/aurora/common/args/apt/cmdline.arg.info.txt.2 Note: Writing file:/home/jenkins/jenkins-slave/workspace/AuroraBot/dist/classes/main/META-INF/compiler/resource-mappings/org.apache.aurora.common.args.apt.CmdLineProcessor :generateBuildProperties :processResources :classes :jar :startScripts :distTar :distZip :assemble :compileJmhJavaNote: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/jmh/java/org/apache/aurora/benchmark/fakes/FakeSchedulerDriver.java uses or overrides a deprecated API. Note: Recompile with -Xlint:deprecation for details. :processJmhResources UP-TO-DATE :jmhClasses :checkstyleJmh :jsHint :checkstyleMain[ant:checkstyle] /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java:19:8: error: Unused import - com.google.common.collect.ImmutableSet. FAILED FAILURE: Build failed with an exception. * What went wrong: Execution failed for task ':checkstyleMain'. > Checkstyle rule violations were found. See the report at: > file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/checkstyle/main.html * Try: Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. BUILD FAILED Total time: 2 mins 27.087 secs I will refresh this build result if you post a review containing "@ReviewBot retry" - Aurora ReviewBot On Aug. 30, 2016, 7:24 p.m., Kai Huang wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51536/ > --- > > (Updated Aug. 30, 2016, 7:24 p.m.) > > > Review request for Aurora, Joshua Cohen and Maxim Khutornenko. > > > Repository: aurora > > > Description > --- > > - Scheduler updater will not use watch_sec if health check is enabled > - Add unit tests for successful updates. > > Assumptions on executor change: > The executor starts health check at STARTING, if a successful health check is > performed before initial_interval_sec expires, the executor will sends a > status message for RUNNING. > > What I try to implement: > vCurrent updater will infer the executor version from the presence of a > status message sent by the executor during RUNNING status update. > > For vPrev executor: > The vCurrent updater will use watch_sec for instance update, this is > independent from the health check. > For vCurrent executor: > If health check is disabled on vCurrent executor, the vCurrent updater > will use watch_sec for instance update. > If health check is enabled on vCurrent executor, the vCurrent updater > will not use watch_sec for instance update. > > > Diffs > - > > src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java > c129896d8cd54abd2634e2a339c27921042b0162 > src/test/java/org/apache/aurora/scheduler/updater/InstanceUpdaterTest.java > c78c7fbd7d600586136863c99ce3d7387895efee > > Diff: https://reviews.apache.org/r/51536/diff/ > > > Testing > --- > > ./gradlew build > > ./gradlew :test --tests > "org.apache.aurora.scheduler.updater.InstanceUpdaterTest" > > > Thanks, > > Kai Huang > >
Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51536/ --- Review request for Aurora, Joshua Cohen and Maxim Khutornenko. Repository: aurora Description --- - Scheduler updater will not use watch_sec if health check is enabled - Add unit tests for successful updates. Assumptions on executor change: The executor starts health check at STARTING, if a successful health check is performed before initial_interval_sec expires, the executor will sends a status message for RUNNING. What I try to implement: vCurrent updater will infer the executor version from the presence of a status message sent by the executor during RUNNING status update. For vPrev executor: The vCurrent updater will use watch_sec for instance update, this is independent from the health check. For vCurrent executor: If health check is disabled on vCurrent executor, the vCurrent updater will use watch_sec for instance update. If health check is enabled on vCurrent executor, the vCurrent updater will not use watch_sec for instance update. Diffs - src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java c129896d8cd54abd2634e2a339c27921042b0162 src/test/java/org/apache/aurora/scheduler/updater/InstanceUpdaterTest.java c78c7fbd7d600586136863c99ce3d7387895efee Diff: https://reviews.apache.org/r/51536/diff/ Testing --- ./gradlew build ./gradlew :test --tests "org.apache.aurora.scheduler.updater.InstanceUpdaterTest" Thanks, Kai Huang