If I remember correctly, you also have to make sure that your UpdateConfig 
watch_secs is larger than your preemption_delay. Otherwise a rolling update of 
a production job might not be able to get the resources it needs.
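
A rough sketch of that check in plain Python, assuming both values are expressed in seconds. The helper name and the sample watch_secs value are made up for illustration and are not part of Aurora; only the 10-minute default and the 2-minute proposal come from this thread.

```python
# Hypothetical helper, not an Aurora API: compares the two settings.
def watch_secs_covers_preemption(watch_secs, preemption_delay_secs):
    """Return True if the update watch window is longer than the preemption delay."""
    return watch_secs > preemption_delay_secs


# Scheduler default quoted in this thread: 10 minutes.
DEFAULT_PREEMPTION_DELAY_SECS = 10 * 60
# Value being considered in this thread: 2 minutes.
PROPOSED_PREEMPTION_DELAY_SECS = 2 * 60

# Assumed watch_secs for illustration only; use the value from your own UpdateConfig.
example_watch_secs = 150

covers_default = watch_secs_covers_preemption(example_watch_secs, DEFAULT_PREEMPTION_DELAY_SECS)
covers_proposed = watch_secs_covers_preemption(example_watch_secs, PROPOSED_PREEMPTION_DELAY_SECS)
```

With these sample numbers the watch window would be shorter than the 10-minute default but longer than the proposed 2 minutes.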

Best Regards,
Stephan
________________________________________
From: Bhuvan Arumugam <bhu...@apache.org>
Sent: Monday, February 16, 2015 7:14 AM
To: dev@aurora.incubator.apache.org
Subject: reasonable preemption delay to use

Hello,
Recently, in one of our clusters, we noticed production jobs going to
PENDING state due to insufficient CPU. The non-production jobs are
not preempted, as we haven't set the --preemption_delay flag for the
scheduler. The default value for this flag is 10 mins. Why is it so
high? Is there any reasoning behind using 10 mins as the default value?

We are thinking of using 2 mins for this flag. We wouldn't want to
wait beyond 2 mins to run a prod job when resources are constrained.
Does that sound reasonable? What's the typical preemption delay used by SREs?

--
Regards,
Bhuvan Arumugam
www.livecipher.com
