The watch_secs window only starts once a task enters RUNNING, so it does not cover time spent in PENDING. For the rolling update not to fail early, restart_threshold [1] needs to be bumped up to account for the preemption delay.
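
To put rough numbers on the restart_threshold adjustment, here is a minimal sketch of the arithmetic. The helper name, the startup_secs estimate and the slack_secs margin are my own assumptions for illustration, not anything Aurora provides; plug in values that match your cluster.

```python
# Hypothetical helper, not part of Aurora: it only adds up the delays a task
# may see before reaching RUNNING during a rolling update.
def min_restart_threshold(preemption_delay_secs, startup_secs, slack_secs=30):
    """Return a restart_threshold (in seconds) that covers the full
    preemption delay plus an assumed startup time and safety margin."""
    if min(preemption_delay_secs, startup_secs, slack_secs) < 0:
        raise ValueError("durations must be non-negative")
    return preemption_delay_secs + startup_secs + slack_secs


# 10 minutes is the default --preemption_delay; 2 minutes is the value
# proposed in this thread. startup_secs=60 is an assumed figure.
for delay_mins in (10, 2):
    threshold = min_restart_threshold(delay_mins * 60, startup_secs=60)
    print(f"preemption_delay={delay_mins}mins -> restart_threshold >= {threshold}s")
```

The resulting value would then go into the job's UpdateConfig as restart_threshold, with watch_secs left to cover only the post-RUNNING health window.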

As for the default preemption delay, it was implemented to avoid unnecessary churn in the cluster. Larger or constraint-diverse tasks take longer to bin-place, so there can be occasional scheduling delays when resources are tight. Hence the grace buffer. You can definitely dial it in given the specifics of your cluster.

Thanks,
Maxim

[1] - https://github.com/apache/incubator-aurora/blob/master/docs/configuration-reference.md#updateconfig-objects

On Tue, Feb 17, 2015 at 12:51 AM, Erb, Stephan <stephan....@blue-yonder.com> wrote:

> If I remember correctly, you also have to make sure that your UpdateConfig
> watch_secs is larger than your preemption_delay. Otherwise a rolling update
> of a production job might not be able to get the resources it needs.
>
> Best Regards,
> Stephan
> ________________________________________
> From: Bhuvan Arumugam <bhu...@apache.org>
> Sent: Monday, February 16, 2015 7:14 AM
> To: dev@aurora.incubator.apache.org
> Subject: reasonable preemption delay to use
>
> Hello,
> Recently, in one of our clusters we noticed production jobs go to
> PENDING state, due to insufficient CPU. The non production jobs are
> not preempted, as we haven't used --preemption_delay flag for
> scheduler. The default value for this flag is 10mins. Why is it too
> high? Is there any reasoning behind using 10mins as a default value?
>
> We are thinking to use 2mins for this flag. We wouldn't want to
> wait beyond 2mins to run a prod job during resource constraint. Does
> it sound reasonable? What's the typical preemption delay used by SREs?
>
> --
> Regards,
> Bhuvan Arumugam
> www.livecipher.com