[
https://issues.apache.org/jira/browse/AURORA-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172381#comment-16172381
]
Santhosh Kumar Shanmugham commented on AURORA-1946:
---------------------------------------------------
Just realized that although {{STARTING}} could be treated as a Transient
state, its timeout depends on the {{HealthCheckConfig}}, which dictates how long
the {{Task}} can stay in {{STARTING}}. Further, {{HealthCheckConfig}} is an
{{Executor}} concept that the Scheduler does not care about. So it does not
make sense to convert {{STARTING}} into a Transient state that degrades
into {{LOST}} based on a common timeout value.
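
To illustrate the mismatch, here is a rough sketch in Python. It is not Aurora
code: the field names are loosely modeled on health check settings, and the
worst-case formula and the 300 second common timeout are assumptions made up
for this example.

```python
from dataclasses import dataclass

# Hypothetical scheduler-wide timeout for transient states (illustrative value).
COMMON_TRANSIENT_TIMEOUT_SECS = 300


@dataclass(frozen=True)
class HealthCheckSketch:
    """Illustrative stand-in for per-task health check settings (assumed fields)."""
    initial_interval_secs: int
    interval_secs: int
    timeout_secs: int
    max_consecutive_failures: int


def worst_case_starting_secs(hc: HealthCheckSketch) -> int:
    """Assumed model: an initial grace period, then checks repeat until the
    number of consecutive failures exceeds the configured limit."""
    attempts = hc.max_consecutive_failures + 1
    return hc.initial_interval_secs + attempts * (hc.interval_secs + hc.timeout_secs)


def common_timeout_is_safe(hc: HealthCheckSketch,
                           common_timeout_secs: int = COMMON_TRANSIENT_TIMEOUT_SECS) -> bool:
    """True if a task with this config always leaves STARTING before the common timeout."""
    return worst_case_starting_secs(hc) <= common_timeout_secs


# Two tasks with different (made-up) health check settings.
quick_service = HealthCheckSketch(15, 10, 1, 0)
slow_warmup_service = HealthCheckSketch(600, 30, 5, 3)

for name, hc in [("quick", quick_service), ("slow warm-up", slow_warmup_service)]:
    print(name, worst_case_starting_secs(hc), common_timeout_is_safe(hc))
```

Under these assumptions, any single common timeout is either too short for the
slow warm-up task (it would be marked {{LOST}} while still legitimately
{{STARTING}}) or needlessly long for the quick one.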
> Make STARTING a transient state
> -------------------------------
>
> Key: AURORA-1946
> URL: https://issues.apache.org/jira/browse/AURORA-1946
> Project: Aurora
> Issue Type: Task
> Reporter: Santhosh Kumar Shanmugham
> Assignee: Santhosh Kumar Shanmugham
>
> We saw a case where an update was stuck in the {{IN_PROGRESS}} state after a
> task's status update from {{STARTING}} to {{FAILED}} was lost. Ideally, the
> {{Task}} would have been transitioned into {{LOST}} once its transient state
> timed out. But {{STARTING}} is not a transient state.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)