----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/53590/#review155443 -----------------------------------------------------------
LGTM overall. It is a *little* concerning that with this change, since the default value for min_consecutive_successes is 1 and we've removed the sleep(initial_interval_secs) before the first health check, that now by default almost everyone will opt into this feature and have vastly different behavior than before. If someone was just using initial_interval_secs in lieu of service warm-up, they might be in for a shock. I don't see a way around this without having to deal with other backwards compatibility issues, but let's be vigilant in release notes, etc. documenting the behavior. Might even be worth logging a message from the client if they are using a non-default value for initial_interval_secs that the behavior has changed? That might be overkill. src/main/python/apache/aurora/executor/common/health_checker.py (line 64) <https://reviews.apache.org/r/53590/#comment225420> Description is no longer true. src/main/python/apache/aurora/executor/common/health_checker.py (lines 91 - 94) <https://reviews.apache.org/r/53590/#comment225419> Suggest renaming this to grace_period_secs internally. src/main/python/apache/aurora/executor/common/health_checker.py (line 97) <https://reviews.apache.org/r/53590/#comment225406> birth_time is more accurate. src/main/python/apache/aurora/executor/common/health_checker.py (line 128) <https://reviews.apache.org/r/53590/#comment225409> It looks like a bug that this isn't part of the above conditional. Would consider removing the conditional or moving this inside it for clarity. src/main/python/apache/aurora/executor/common/health_checker.py (line 167) <https://reviews.apache.org/r/53590/#comment225415> *successful health checks src/main/python/apache/aurora/executor/common/health_checker.py (line 262) <https://reviews.apache.org/r/53590/#comment225413> Can you provide the reasoning for this class (from the design-doc) as a comment for future readers? src/test/python/apache/aurora/executor/common/test_health_checker.py (lines 465 - 466) <https://reviews.apache.org/r/53590/#comment225416> I feel like we need more coverage with regards to these input values. At the minimum, we should have coverage for min_consecutive_successes set to 1, since it's the default value that will affect existing users and it will help us track what exactly is changing. - David McLaughlin On Nov. 9, 2016, 8:29 a.m., Santhosh Kumar Shanmugham wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/53590/ > ----------------------------------------------------------- > > (Updated Nov. 9, 2016, 8:29 a.m.) > > > Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji. > > > Bugs: AURORA-1225 > https://issues.apache.org/jira/browse/AURORA-1225 > > > Repository: aurora > > > Description > ------- > > Make RUNNING a first class state to indicate that the task is running > and is healthy. It is achieved by introducing a new configuration > parameter `min_consecutive_successes`, which will dictate when to move > a task into RUNNING state. > > With this change, it is possible to set the `watch_secs` to 0, so that > updates are purely based on the task's health, rather than relying on > watching the task to in RUNNING state for a pre-determined timeout. > > > Diffs > ----- > > RELEASE-NOTES.md 392496284b7f046a5dd29a5a3f175305c28a5afe > docs/development/design-documents.md > 6bfc679c841ef1b0861758ccc12a5150f1bdb5e3 > docs/features/job-updates.md 792f2ae5fd14b1ea9af8be000629ce5a7fc2fe8f > docs/reference/configuration.md f2a0b1873f31e91f3bf0cac6f8448e8130fae688 > src/main/python/apache/aurora/client/api/updater_util.py > c649316edb876565c92cc90c9f030e153c008924 > src/main/python/apache/aurora/client/config.py > 0186af52f0d7d7e3981ec59bf6a01aafee2bcfb1 > src/main/python/apache/aurora/config/schema/base.py > 845163043b0b7b2f9e7aca14677ca9f094658551 > src/main/python/apache/aurora/executor/aurora_executor.py > aee5e56ae66ec7a6acbe97d94d6187fd8646ec9a > src/main/python/apache/aurora/executor/common/health_checker.py > 3c7c09d173c4adbd89eddaee22cffcdd8d268378 > src/test/python/apache/aurora/client/test_config.py > 5cf68a5145ddf9478baa30453c0bcb73136fa7eb > src/test/python/apache/aurora/executor/common/test_health_checker.py > da0c56ca084b65427a6122f22f251af8637772d1 > src/test/python/apache/aurora/executor/test_thermos_executor.py > 3f82165333665d127e1ace765a7dada7348a4b91 > > Diff: https://reviews.apache.org/r/53590/diff/ > > > Testing > ------- > > buils-support/jenkins/build.sh > sh ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh > > > Thanks, > > Santhosh Kumar Shanmugham > >
