Zameer Manji created AURORA-1791:
------------------------------------
Summary: Commit ca683 is not backwards compatible.
Key: AURORA-1791
URL: https://issues.apache.org/jira/browse/AURORA-1791
Project: Aurora
Issue Type: Bug
Reporter: Zameer Manji
Assignee: Kai Huang
Priority: Blocker
The commit [ca683cb9e27bae76424a687bc6c3af5a73c501b9 |
https://github.com/apache/aurora/commit/ca683cb9e27bae76424a687bc6c3af5a73c501b9]
is not backwards compatible. The last item in the commit message
{quote}
4. Modified the Health Checker and redefined the meaning initial_interval_secs.
{quote}
has serious, unintended consequences.
Consider the following health check config:
{noformat}
initial_interval_secs: 10
interval_secs: 5
max_consecutive_failures: 1
{noformat}
On the 0.16.0 executor, no health checking occurs for the first 10 seconds,
so the earliest a task can fail a health check is at the 10th second.
On master, health checking starts right away, which means the task can fail
within the first second because {{max_consecutive_failures}} is set to 1.
This is not backwards compatible and needs to be fixed.
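To make the difference concrete, here is a minimal sketch of the two behaviours described above. It is not the executor's code. The function name, the {{failures_needed}} parameter and the {{delay_first_check}} flag are hypothetical, and the model follows this ticket's reading in which a single failed check is enough to fail the task:

```python
def earliest_failure_sec(initial_interval_secs, interval_secs,
                         failures_needed, delay_first_check):
    """Seconds after task start at which a task that fails every health
    check would be marked failed, in this simplified model."""
    # Behaviour described for 0.16.0: no checks until the initial interval
    # has elapsed. Behaviour described for master: first check right away.
    first_check = initial_interval_secs if delay_first_check else 0
    # Every additional consecutive failure costs one more check interval.
    return first_check + (failures_needed - 1) * interval_secs


# Values taken from the config in this ticket.
config = {"initial_interval_secs": 10, "interval_secs": 5}

old = earliest_failure_sec(**config, failures_needed=1, delay_first_check=True)
new = earliest_failure_sec(**config, failures_needed=1, delay_first_check=False)
print(old, new)
```

Under these assumptions the same config gives the task a 10 second head start in one case and none in the other, which is the incompatibility.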
I think a good solution would be to revert the change in meaning of
{{initial_interval_secs}} and have the task transition into RUNNING when
{{max_consecutive_successes}} is met.
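One possible reading of this proposal, as a rough sketch only. It assumes checks still run during the initial interval but failures there are not counted, which the proposal above does not spell out. The function name, the {{successes_needed}} and {{failures_needed}} parameters and the STARTING/FAILED labels are hypothetical:

```python
def first_transition(results, initial_interval_secs,
                     successes_needed, failures_needed):
    """results: (seconds_since_start, passed) tuples in time order.

    Returns ("RUNNING", t) or ("FAILED", t) for the first transition,
    or ("STARTING", None) if neither threshold is reached.
    """
    successes = failures = 0
    for t, passed in results:
        if passed:
            successes += 1
            failures = 0
            if successes >= successes_needed:
                return ("RUNNING", t)
        else:
            successes = 0
            if t < initial_interval_secs:
                # Assumed grace period: early failures are not counted.
                continue
            failures += 1
            if failures >= failures_needed:
                return ("FAILED", t)
    return ("STARTING", None)


# A slow starter: fails twice inside the initial interval, then passes.
slow_start = [(1, False), (6, False), (11, True)]
print(first_transition(slow_start, 10, 1, 1))
```

With this reading, a task that needs the full initial interval to warm up is not failed early, and a task that is healthy sooner does not have to wait out the interval before it is considered RUNNING.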