Repository: mesos
Updated Branches:
  refs/heads/master 2ec2e48d1 -> 498a000ac


Updated configuration.md for --executor_reregistration_retry_interval.


Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/498a000a
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/498a000a
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/498a000a

Branch: refs/heads/master
Commit: 498a000ac1bb8f51dc871f22aea265424a407a17
Parents: 2ec2e48
Author: Adam B <a...@mesosphere.io>
Authored: Wed Aug 2 01:24:10 2017 -0700
Committer: Adam B <a...@mesosphere.io>
Committed: Wed Aug 2 01:27:39 2017 -0700

----------------------------------------------------------------------
 docs/configuration.md | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mesos/blob/498a000a/docs/configuration.md
----------------------------------------------------------------------
diff --git a/docs/configuration.md b/docs/configuration.md
index 5449b92..058e366 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1540,7 +1540,32 @@ master until this timeout has elapsed (see MESOS-7539). (default: 2secs)
 </tr>
 <tr>
   <td>
-    --max_completed_executors_per_framework
+    --executor_reregistration_retry_interval=VALUE
+  </td>
+  <td>
+For PID-based executors, how long the agent waits before retrying
+the reconnect message sent to the executor during recovery.
+NOTE: Do not use this unless you understand the following
+(see MESOS-5332): PID-based executors using Mesos libraries &gt;= 1.1.2
+always re-link with the agent upon receiving the reconnect message.
+This avoids the executor replying on a half-open TCP connection to
+the old agent (possible if netfilter is dropping packets,
+see: MESOS-7057). However, PID-based executors using Mesos
+libraries &lt; 1.1.2 do not re-link and are therefore prone to
+replying on a half-open connection after the agent restarts. If we
+only send a single reconnect message, these "old" executors will
+reply on their half-open connection and receive a RST; without any
+retries, they will fail to reconnect and be killed by the agent once
+the executor re-registration timeout elapses. To ensure these "old"
+executors can reconnect in the presence of netfilter dropping
+packets, we introduced optional retries of the reconnect message.
+This results in "old" executors correctly establishing a link
+when processing the second reconnect message. (default: no retries)
+  </td>
+</tr>
+<tr>
+  <td>
+    --max_completed_executors_per_framework=VALUE
   </td>
   <td>
 Maximum number of completed executors per framework to store
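
The retry behaviour described in the new flag text can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Mesos code: the `Executor` class, the `recover` function, and their parameters are hypothetical names invented for illustration, and the model assumes that an "old" executor's failed reply (the RST) leaves it able to establish a fresh link on the next reconnect message, as the flag description suggests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Executor:
    """Hypothetical model of a PID-based executor after an agent restart."""
    relinks_on_reconnect: bool  # True models Mesos libraries >= 1.1.2
    linked: bool = False        # False models a half-open connection

    def handle_reconnect(self) -> bool:
        """Return True if the executor's reply reaches the restarted agent."""
        if self.relinks_on_reconnect:
            # Newer executors always re-link before replying.
            self.linked = True
        if self.linked:
            return True
        # Older executors reply on the half-open connection and get a RST.
        # Assumption: the broken socket is then discarded, so the next
        # reply goes out over a freshly established link.
        self.linked = True
        return False


def recover(executor: Executor,
            timeout_secs: float,
            retry_interval_secs: Optional[float] = None) -> Optional[float]:
    """Return the time at which the executor re-registers, or None if the
    re-registration timeout elapses first (the agent would then kill it)."""
    elapsed = 0.0
    while elapsed < timeout_secs:
        if executor.handle_reconnect():
            return elapsed
        if retry_interval_secs is None:
            # Default behaviour: a single reconnect message, no retries.
            return None
        elapsed += retry_interval_secs
    return None
```

In this model, an "old" executor only survives recovery when the retry interval is shorter than the re-registration timeout, which is why the two values need to be chosen together.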
