[
https://issues.apache.org/jira/browse/AURORA-1240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481463#comment-14481463
]
Bill Farner commented on AURORA-1240:
-------------------------------------
Scheduler change: https://reviews.apache.org/r/32840/
> Deprecate UpdateConfig "restart_threshold" setting
> --------------------------------------------------
>
> Key: AURORA-1240
> URL: https://issues.apache.org/jira/browse/AURORA-1240
> Project: Aurora
> Issue Type: Task
> Components: Client, Scheduler
> Reporter: Maxim Khutornenko
> Assignee: Bill Farner
>
> The UpdateConfig {{restart_threshold}} [1] setting does not appear to
> deliver much user value, as it is highly sensitive to scheduling performance
> and may result in aborted or rolled-back job updates when set too low.
> Some background: this timeout controls the task transition from {{PENDING}} to
> {{RUNNING}} during a job update. When the cluster is short on capacity,
> assigning a task to a host may take considerably longer, so the timeout
> expires and, depending on the failure settings, causes an unnecessary job
> update abort or rollback. The setting was meant to give users some protection
> against unsatisfiable resource/constraint requirements. In practice, it has
> proved to be more of an annoyance to users, whose updates are interrupted by
> unexpected delays in task assignment.
> Consider deprecating and subsequently removing this setting.
> [1] -
> https://github.com/apache/aurora/blob/master/docs/configuration-reference.md#updateconfig-objects
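
For illustration only, the failure mode described above can be sketched as a small simulation. This is not Aurora code: the function `update_outcome` is hypothetical, and its parameter names are borrowed loosely from UpdateConfig settings as an assumption. It models a task as failed when it stays in PENDING longer than the threshold, and ends the update once failures exceed the tolerated count.

```python
def update_outcome(pending_secs_per_task, restart_threshold_secs,
                   max_total_failures=0, rollback_on_failure=True):
    """Hypothetical model of how a PENDING -> RUNNING timeout affects an update.

    pending_secs_per_task: seconds each updated task spent in PENDING.
    All names and the decision logic are simplifying assumptions.
    """
    # A task counts as failed if scheduling took longer than the threshold,
    # even though it might have started fine a moment later.
    timed_out = sum(1 for secs in pending_secs_per_task
                    if secs > restart_threshold_secs)

    if timed_out <= max_total_failures:
        return "continue"
    # The failure settings decide whether the update rolls back or just stops.
    return "rollback" if rollback_on_failure else "abort"


# Healthy cluster: tasks are assigned quickly, the update proceeds.
print(update_outcome([10, 20], restart_threshold_secs=60))
# Capacity shortage: one slow assignment is enough to interrupt the update.
print(update_outcome([10, 90], restart_threshold_secs=60))
```

In this sketch the second update is interrupted purely because of scheduling delay, which is the behavior the issue argues against.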