aurora git commit: Bump initial_task_kill_retry_interval to 15s.

serb Tue, 25 Apr 2017 14:19:47 -0700

Repository: aurora
Updated Branches:
  refs/heads/master 2026ca040 -> 6cb2d4f69



Bump initial_task_kill_retry_interval to 15s.

Kills are rarely dropped by Mesos, so Aurora seldom has to retry them.
It therefore makes sense to raise the initial retry interval from 5s to 15s
so that we don't retry needlessly while Thermos is still busy executing
the lifecycle methods.

By default, Thermos uses the following kill escalation sequence:

  * /quitquitquit
  * wait 5s
  * /abortabortabort
  * wait 5s
  * SIGTERM
  * wait up to 1 minute
  * SIGKILL
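
To make the timing concrete, here is a minimal sketch that lists when kill retries would fire during the escalation window above. It is not Aurora code: the class name, the method `retryTimes`, the doubling backoff, and the 300s cap are assumptions for illustration only. The 70s horizon is the sum of the waits in the sequence (5s + 5s + up to 60s).

```java
import java.util.ArrayList;
import java.util.List;

public class KillRetrySketch {
    // Latest point (seconds after the kill arrives) at which SIGKILL is sent,
    // derived from the sequence above: 5s + 5s + up to 60s.
    static final long SIGKILL_AT_LATEST_SECS = 70;

    // Returns the times (in seconds) at which retries would fire up to horizonSecs.
    // ASSUMPTION: the interval doubles after every retry and is capped at maxIntervalSecs.
    static List<Long> retryTimes(long initialSecs, long maxIntervalSecs, long horizonSecs) {
        List<Long> times = new ArrayList<>();
        long interval = initialSecs;
        long t = interval;
        while (t <= horizonSecs) {
            times.add(t);
            interval = Math.min(interval * 2, maxIntervalSecs);
            t += interval;
        }
        return times;
    }

    public static void main(String[] args) {
        long capSecs = 300; // hypothetical stand-in for transient_task_state_timeout
        System.out.println("initial 5s:  " + retryTimes(5, capSecs, SIGKILL_AT_LATEST_SECS));
        System.out.println("initial 15s: " + retryTimes(15, capSecs, SIGKILL_AT_LATEST_SECS));
    }
}
```

Under these assumptions, a 5s initial interval yields retries at 5s, 15s and 35s, the first of which lands just as Thermos moves on to /abortabortabort. A 15s initial interval yields retries at 15s and 45s, both after SIGTERM has been sent.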

Reviewed at https://reviews.apache.org/r/58611/


Project: http://git-wip-us.apache.org/repos/asf/aurora/repo
Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/6cb2d4f6
Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/6cb2d4f6
Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/6cb2d4f6

Branch: refs/heads/master
Commit: 6cb2d4f698a75edf15d3688b4b39e2e6e7467fdd
Parents: 2026ca0
Author: Stephan Erb <[email protected]>
Authored: Tue Apr 25 23:18:30 2017 +0200
Committer: Stephan Erb <[email protected]>
Committed: Tue Apr 25 23:18:30 2017 +0200

----------------------------------------------------------------------
 .../aurora/scheduler/reconciliation/ReconciliationModule.java      | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/aurora/blob/6cb2d4f6/src/main/java/org/apache/aurora/scheduler/reconciliation/ReconciliationModule.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/aurora/scheduler/reconciliation/ReconciliationModule.java b/src/main/java/org/apache/aurora/scheduler/reconciliation/ReconciliationModule.java
index e076e80..80fc616 100644
--- a/src/main/java/org/apache/aurora/scheduler/reconciliation/ReconciliationModule.java
+++ b/src/main/java/org/apache/aurora/scheduler/reconciliation/ReconciliationModule.java
@@ -59,7 +59,7 @@ public class ReconciliationModule extends AbstractModule {
       help = "When killing a task, retry after this delay if mesos has not responded,"
           + " backing off up to transient_task_state_timeout")
   private static final Arg<Amount<Long, Time>> INITIAL_TASK_KILL_RETRY_INTERVAL =
-      Arg.create(Amount.of(5L, Time.SECONDS));
+      Arg.create(Amount.of(15L, Time.SECONDS));
 
   // Reconciliation may create a big surge of status updates in a large cluster. Setting the default
   // initial delay to 1 minute to ease up storage contention during scheduler start up.
