Repository: mesos-site Updated Branches: refs/heads/asf-site 7b8abbd9b -> b953594b0
http://git-wip-us.apache.org/repos/asf/mesos-site/blob/b953594b/content/api/latest/java/org/apache/mesos/Protos.OperationState.html ---------------------------------------------------------------------- diff --git a/content/api/latest/java/org/apache/mesos/Protos.OperationState.html b/content/api/latest/java/org/apache/mesos/Protos.OperationState.html index 585f9ae..33834c3 100644 --- a/content/api/latest/java/org/apache/mesos/Protos.OperationState.html +++ b/content/api/latest/java/org/apache/mesos/Protos.OperationState.html @@ -169,11 +169,39 @@ extends java.lang.Enum<<a href="../../../org/apache/mesos/Protos.OperationSta </td> </tr> <tr class="altColor"> +<td class="colOne"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_GONE_BY_OPERATOR">OPERATION_GONE_BY_OPERATOR</a></span></code> +<div class="block"> + The operation affected an agent that the master cannot contact; + the operator has asserted that the agent has been shutdown, but this has + not been directly confirmed by the master.</div> +</td> +</tr> +<tr class="rowColor"> <td class="colOne"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_PENDING">OPERATION_PENDING</a></span></code> <div class="block"> Initial state.</div> </td> </tr> +<tr class="altColor"> +<td class="colOne"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_RECOVERING">OPERATION_RECOVERING</a></span></code> +<div class="block"> + The operation affects an agent that the master recovered from its + state, but that agent has not yet re-registered.</div> +</td> +</tr> +<tr class="rowColor"> +<td class="colOne"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_UNKNOWN">OPERATION_UNKNOWN</a></span></code> +<div class="block"> + The master has no knowledge of the operation.</div> +</td> +</tr> +<tr class="altColor"> 
+<td class="colOne"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_UNREACHABLE">OPERATION_UNREACHABLE</a></span></code> +<div class="block"> + The operation affects an agent that has lost contact with the master, + typically due to a network failure or partition.</div> +</td> +</tr> <tr class="rowColor"> <td class="colOne"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_UNSUPPORTED">OPERATION_UNSUPPORTED</a></span></code> <div class="block"> @@ -225,11 +253,43 @@ extends java.lang.Enum<<a href="../../../org/apache/mesos/Protos.OperationSta </tr> <tr class="altColor"> <td class="colFirst"><code>static int</code></td> +<td class="colLast"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_GONE_BY_OPERATOR_VALUE">OPERATION_GONE_BY_OPERATOR_VALUE</a></span></code> +<div class="block"> + The operation affected an agent that the master cannot contact; + the operator has asserted that the agent has been shutdown, but this has + not been directly confirmed by the master.</div> +</td> +</tr> +<tr class="rowColor"> +<td class="colFirst"><code>static int</code></td> <td class="colLast"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_PENDING_VALUE">OPERATION_PENDING_VALUE</a></span></code> <div class="block"> Initial state.</div> </td> </tr> +<tr class="altColor"> +<td class="colFirst"><code>static int</code></td> +<td class="colLast"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_RECOVERING_VALUE">OPERATION_RECOVERING_VALUE</a></span></code> +<div class="block"> + The operation affects an agent that the master recovered from its + state, but that agent has not yet re-registered.</div> +</td> +</tr> +<tr class="rowColor"> +<td class="colFirst"><code>static int</code></td> +<td 
class="colLast"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_UNKNOWN_VALUE">OPERATION_UNKNOWN_VALUE</a></span></code> +<div class="block"> + The master has no knowledge of the operation.</div> +</td> +</tr> +<tr class="altColor"> +<td class="colFirst"><code>static int</code></td> +<td class="colLast"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_UNREACHABLE_VALUE">OPERATION_UNREACHABLE_VALUE</a></span></code> +<div class="block"> + The operation affects an agent that has lost contact with the master, + typically due to a network failure or partition.</div> +</td> +</tr> <tr class="rowColor"> <td class="colFirst"><code>static int</code></td> <td class="colLast"><code><span class="memberNameLink"><a href="../../../org/apache/mesos/Protos.OperationState.html#OPERATION_UNSUPPORTED_VALUE">OPERATION_UNSUPPORTED_VALUE</a></span></code> @@ -403,7 +463,7 @@ the order they are declared.</div> <a name="OPERATION_DROPPED"> <!-- --> </a> -<ul class="blockListLast"> +<ul class="blockList"> <li class="blockList"> <h4>OPERATION_DROPPED</h4> <pre>public static final <a href="../../../org/apache/mesos/Protos.OperationState.html" title="enum in org.apache.mesos">Protos.OperationState</a> OPERATION_DROPPED</pre> @@ -414,6 +474,77 @@ the order they are declared.</div> <code>OPERATION_DROPPED = 5;</code></div> </li> </ul> +<a name="OPERATION_UNREACHABLE"> +<!-- --> +</a> +<ul class="blockList"> +<li class="blockList"> +<h4>OPERATION_UNREACHABLE</h4> +<pre>public static final <a href="../../../org/apache/mesos/Protos.OperationState.html" title="enum in org.apache.mesos">Protos.OperationState</a> OPERATION_UNREACHABLE</pre> +<div class="block"><pre> + The operation affects an agent that has lost contact with the master, + typically due to a network failure or partition. The operation may or may + not still be pending. 
+ </pre> + + <code>OPERATION_UNREACHABLE = 6;</code></div> +</li> +</ul> +<a name="OPERATION_GONE_BY_OPERATOR"> +<!-- --> +</a> +<ul class="blockList"> +<li class="blockList"> +<h4>OPERATION_GONE_BY_OPERATOR</h4> +<pre>public static final <a href="../../../org/apache/mesos/Protos.OperationState.html" title="enum in org.apache.mesos">Protos.OperationState</a> OPERATION_GONE_BY_OPERATOR</pre> +<div class="block"><pre> + The operation affected an agent that the master cannot contact; + the operator has asserted that the agent has been shutdown, but this has + not been directly confirmed by the master. + If the operator is correct, the operation is not pending and this is a + terminal state; if the operator is mistaken, the operation may still be + pending and might return to a different state in the future. + </pre> + + <code>OPERATION_GONE_BY_OPERATOR = 7;</code></div> +</li> +</ul> +<a name="OPERATION_RECOVERING"> +<!-- --> +</a> +<ul class="blockList"> +<li class="blockList"> +<h4>OPERATION_RECOVERING</h4> +<pre>public static final <a href="../../../org/apache/mesos/Protos.OperationState.html" title="enum in org.apache.mesos">Protos.OperationState</a> OPERATION_RECOVERING</pre> +<div class="block"><pre> + The operation affects an agent that the master recovered from its + state, but that agent has not yet re-registered. + The operation can transition to `OPERATION_UNREACHABLE` if the + corresponding agent is marked as unreachable, and will transition to + another status if the agent re-registers. + </pre> + + <code>OPERATION_RECOVERING = 8;</code></div> +</li> +</ul> +<a name="OPERATION_UNKNOWN"> +<!-- --> +</a> +<ul class="blockListLast"> +<li class="blockList"> +<h4>OPERATION_UNKNOWN</h4> +<pre>public static final <a href="../../../org/apache/mesos/Protos.OperationState.html" title="enum in org.apache.mesos">Protos.OperationState</a> OPERATION_UNKNOWN</pre> +<div class="block"><pre> + The master has no knowledge of the operation. 
This is typically + because either (a) the master never had knowledge of the operation, or + (b) the master forgot about the operation because it garbage collected + its metadata about the operation. The operation may or may not still be + pending. + </pre> + + <code>OPERATION_UNKNOWN = 9;</code></div> +</li> +</ul> </li> </ul> <!-- ============ FIELD DETAIL =========== --> @@ -515,7 +646,7 @@ the order they are declared.</div> <a name="OPERATION_DROPPED_VALUE"> <!-- --> </a> -<ul class="blockListLast"> +<ul class="blockList"> <li class="blockList"> <h4>OPERATION_DROPPED_VALUE</h4> <pre>public static final int OPERATION_DROPPED_VALUE</pre> @@ -530,6 +661,93 @@ the order they are declared.</div> </dl> </li> </ul> +<a name="OPERATION_UNREACHABLE_VALUE"> +<!-- --> +</a> +<ul class="blockList"> +<li class="blockList"> +<h4>OPERATION_UNREACHABLE_VALUE</h4> +<pre>public static final int OPERATION_UNREACHABLE_VALUE</pre> +<div class="block"><pre> + The operation affects an agent that has lost contact with the master, + typically due to a network failure or partition. The operation may or may + not still be pending. + </pre> + + <code>OPERATION_UNREACHABLE = 6;</code></div> +<dl> +<dt><span class="seeLabel">See Also:</span></dt> +<dd><a href="../../../constant-values.html#org.apache.mesos.Protos.OperationState.OPERATION_UNREACHABLE_VALUE">Constant Field Values</a></dd> +</dl> +</li> +</ul> +<a name="OPERATION_GONE_BY_OPERATOR_VALUE"> +<!-- --> +</a> +<ul class="blockList"> +<li class="blockList"> +<h4>OPERATION_GONE_BY_OPERATOR_VALUE</h4> +<pre>public static final int OPERATION_GONE_BY_OPERATOR_VALUE</pre> +<div class="block"><pre> + The operation affected an agent that the master cannot contact; + the operator has asserted that the agent has been shutdown, but this has + not been directly confirmed by the master. 
+ If the operator is correct, the operation is not pending and this is a + terminal state; if the operator is mistaken, the operation may still be + pending and might return to a different state in the future. + </pre> + + <code>OPERATION_GONE_BY_OPERATOR = 7;</code></div> +<dl> +<dt><span class="seeLabel">See Also:</span></dt> +<dd><a href="../../../constant-values.html#org.apache.mesos.Protos.OperationState.OPERATION_GONE_BY_OPERATOR_VALUE">Constant Field Values</a></dd> +</dl> +</li> +</ul> +<a name="OPERATION_RECOVERING_VALUE"> +<!-- --> +</a> +<ul class="blockList"> +<li class="blockList"> +<h4>OPERATION_RECOVERING_VALUE</h4> +<pre>public static final int OPERATION_RECOVERING_VALUE</pre> +<div class="block"><pre> + The operation affects an agent that the master recovered from its + state, but that agent has not yet re-registered. + The operation can transition to `OPERATION_UNREACHABLE` if the + corresponding agent is marked as unreachable, and will transition to + another status if the agent re-registers. + </pre> + + <code>OPERATION_RECOVERING = 8;</code></div> +<dl> +<dt><span class="seeLabel">See Also:</span></dt> +<dd><a href="../../../constant-values.html#org.apache.mesos.Protos.OperationState.OPERATION_RECOVERING_VALUE">Constant Field Values</a></dd> +</dl> +</li> +</ul> +<a name="OPERATION_UNKNOWN_VALUE"> +<!-- --> +</a> +<ul class="blockListLast"> +<li class="blockList"> +<h4>OPERATION_UNKNOWN_VALUE</h4> +<pre>public static final int OPERATION_UNKNOWN_VALUE</pre> +<div class="block"><pre> + The master has no knowledge of the operation. This is typically + because either (a) the master never had knowledge of the operation, or + (b) the master forgot about the operation because it garbage collected + its metadata about the operation. The operation may or may not still be + pending. 
+ </pre> + + <code>OPERATION_UNKNOWN = 9;</code></div> +<dl> +<dt><span class="seeLabel">See Also:</span></dt> +<dd><a href="../../../constant-values.html#org.apache.mesos.Protos.OperationState.OPERATION_UNKNOWN_VALUE">Constant Field Values</a></dd> +</dl> +</li> +</ul> </li> </ul> <!-- ============ METHOD DETAIL ========== --> http://git-wip-us.apache.org/repos/asf/mesos-site/blob/b953594b/content/blog/feed.xml ---------------------------------------------------------------------- diff --git a/content/blog/feed.xml b/content/blog/feed.xml index bfdca09..860c347 100644 --- a/content/blog/feed.xml +++ b/content/blog/feed.xml @@ -292,7 +292,7 @@ To learn more about CSI work in Mesos, you can dig into the design document < </ul> -<p>If you are a user and would like to suggest some areas for performance improvement, please let us know by emailing <a href="&#x6d;&#97;&#x69;&#108;&#x74;&#111;&#58;&#100;&#x65;&#x76;&#x40;&#97;&#x70;&#x61;&#99;&#104;&#101;&#x2e;&#x6d;&#x65;&#115;&#111;&#x73;&#46;&#111;&#x72;&#x67;">&#x64;&#101;&#x76;&#x40;&#x61;&#x70;&#x61;&#x63;&#x68;&#x65;&#x2e;&#x6d;&#101;&#115;&#111;&#115;&#46;&#x6f;&#114;&#103;</a>.</p> +<p>If you are a user and would like to suggest some areas for performance improvement, please let us know by emailing <a href="&#x6d;&#97;&#105;&#x6c;&#116;&#x6f;&#58;&#100;&#x65;&#118;&#64;&#x61;&#112;&#x61;&#x63;&#104;&#x65;&#x2e;&#109;&#101;&#115;&#x6f;&#115;&#46;&#111;&#x72;&#x67;">&#x64;&#101;&#118;&#x40;&#97;&#112;&#97;&#x63;&#104;&#x65;&#x2e;&#109;&#101;&#115;&#x6f;&#115;&#46;&#111;&#x72;&#103;</a>.</p> </content> </entry> http://git-wip-us.apache.org/repos/asf/mesos-site/blob/b953594b/content/blog/performance-working-group-progress-report/index.html ---------------------------------------------------------------------- diff --git a/content/blog/performance-working-group-progress-report/index.html b/content/blog/performance-working-group-progress-report/index.html index 491fa5f..d3fe5f0 100644 --- 
a/content/blog/performance-working-group-progress-report/index.html +++ b/content/blog/performance-working-group-progress-report/index.html @@ -238,7 +238,7 @@ </ul> -<p>If you are a user and would like to suggest some areas for performance improvement, please let us know by emailing <a href="mailto:dev@apache.mesos.org">dev@apache.mesos.org</a>.</p> +<p>If you are a user and would like to suggest some areas for performance improvement, please let us know by emailing <a href="mailto:dev@apache.mesos.org">dev@apache.mesos.org</a>.</p> </div> </div> http://git-wip-us.apache.org/repos/asf/mesos-site/blob/b953594b/content/documentation/csi/index.html ---------------------------------------------------------------------- diff --git a/content/documentation/csi/index.html b/content/documentation/csi/index.html index db9e084..22e0dc6 100644 --- a/content/documentation/csi/index.html +++ b/content/documentation/csi/index.html @@ -448,30 +448,11 @@ the source disk resource will be in <code>BLOCK</code> type.</p> operations so that they know if the dynamic disk provisioning is successful or not.</p> -<p>Unfortunately, the current scheduler API does not provide a way to give explicit -offer operation feedback. Frameworks have to infer the result of the operation -by looking at various sources of information that are available to them. Here are -the tips to get offer operation results:</p> - -<ul> -<li>Leverage <a href="/documentation/latest/./reservation/#reservation-labels">reservation labels</a>. Reservation -labels can be used to uniquely identify a resource. By looking at the -reservation labels of an offered resource, the framework can infer if an -operation is successful or not.</li> -<li>Use <a href="/documentation/latest/./operator-http-api/">operator API</a> to get the current set of resources.</li> -</ul> - - -<h5>Explicit Operation Feedback</h5> - -<p>Even if there are tips to infer offer operation results, it is far from ideal. 
-The biggest issue is that it is impossible to get the failure reason if an offer -operation fails. For instance, a CSI plugin might return a failure when creating -a volume, and it is important for the framework to know about that and surface -that information to the end user.</p> - -<p>As a result, we need a way to get explicit operation feedback just like task -status updates. This feature is <a href="https://issues.apache.org/jira/browse/MESOS-8054">coming soon</a>.</p> +<p>Starting with Mesos 1.6.0 it is possible to opt-in to receive status updates +related to operations that affect resources managed by a resource provider. In +order to do so, the framework has to set the <code>id</code> field in the operation. +Support for operations affecting the agent default resources is <a href="https://issues.apache.org/jira/browse/MESOS-8194">coming +soon</a>.</p> <h2>Profiles</h2> http://git-wip-us.apache.org/repos/asf/mesos-site/blob/b953594b/content/documentation/latest/csi/index.html ---------------------------------------------------------------------- diff --git a/content/documentation/latest/csi/index.html b/content/documentation/latest/csi/index.html index 84c6509..ed45a72 100644 --- a/content/documentation/latest/csi/index.html +++ b/content/documentation/latest/csi/index.html @@ -448,30 +448,11 @@ the source disk resource will be in <code>BLOCK</code> type.</p> operations so that they know if the dynamic disk provisioning is successful or not.</p> -<p>Unfortunately, the current scheduler API does not provide a way to give explicit -offer operation feedback. Frameworks have to infer the result of the operation -by looking at various sources of information that are available to them. Here are -the tips to get offer operation results:</p> - -<ul> -<li>Leverage <a href="/documentation/latest/./reservation/#reservation-labels">reservation labels</a>. Reservation -labels can be used to uniquely identify a resource. 
By looking at the -reservation labels of an offered resource, the framework can infer if an -operation is successful or not.</li> -<li>Use <a href="/documentation/latest/./operator-http-api/">operator API</a> to get the current set of resources.</li> -</ul> - - -<h5>Explicit Operation Feedback</h5> - -<p>Even if there are tips to infer offer operation results, it is far from ideal. -The biggest issue is that it is impossible to get the failure reason if an offer -operation fails. For instance, a CSI plugin might return a failure when creating -a volume, and it is important for the framework to know about that and surface -that information to the end user.</p> - -<p>As a result, we need a way to get explicit operation feedback just like task -status updates. This feature is <a href="https://issues.apache.org/jira/browse/MESOS-8054">coming soon</a>.</p> +<p>Starting with Mesos 1.6.0 it is possible to opt-in to receive status updates +related to operations that affect resources managed by a resource provider. In +order to do so, the framework has to set the <code>id</code> field in the operation. 
+Support for operations affecting the agent default resources is <a href="https://issues.apache.org/jira/browse/MESOS-8194">coming +soon</a>.</p> <h2>Profiles</h2> http://git-wip-us.apache.org/repos/asf/mesos-site/blob/b953594b/content/documentation/latest/monitoring/index.html ---------------------------------------------------------------------- diff --git a/content/documentation/latest/monitoring/index.html b/content/documentation/latest/monitoring/index.html index 24218b7..65a2af7 100644 --- a/content/documentation/latest/monitoring/index.html +++ b/content/documentation/latest/monitoring/index.html @@ -771,6 +771,13 @@ messages may indicate that there is a problem with the network.</p> </tr> <tr> <td> + <code>master/messages_reconcile_operations</code> + </td> + <td>Number of reconcile operations messages</td> + <td>Counter</td> +</tr> +<tr> + <td> <code>master/messages_reconcile_tasks</code> </td> <td>Number of reconcile task messages</td> http://git-wip-us.apache.org/repos/asf/mesos-site/blob/b953594b/content/documentation/latest/operator-http-api/index.html ---------------------------------------------------------------------- diff --git a/content/documentation/latest/operator-http-api/index.html b/content/documentation/latest/operator-http-api/index.html index f734143..0d530a2 100644 --- a/content/documentation/latest/operator-http-api/index.html +++ b/content/documentation/latest/operator-http-api/index.html @@ -774,6 +774,10 @@ Content-Type: application/json "value": 0.0 }, { + "name": "master/messages_reconcile_operations", + "value": 0.0 + }, + { "name": "master/messages_reconcile_tasks", "value": 0.0 }, http://git-wip-us.apache.org/repos/asf/mesos-site/blob/b953594b/content/documentation/latest/scheduler-http-api/index.html ---------------------------------------------------------------------- diff --git a/content/documentation/latest/scheduler-http-api/index.html b/content/documentation/latest/scheduler-http-api/index.html index 3ab68da..c6830b7 
100644 --- a/content/documentation/latest/scheduler-http-api/index.html +++ b/content/documentation/latest/scheduler-http-api/index.html @@ -139,7 +139,11 @@ considered stable and is the recommended way to develop new Mesos schedulers.</p <p><strong>Schedulers are expected to keep the subscription connection open as long as possible (barring errors in network, software, hardware, etc.) and incrementally process the response.</strong> HTTP client libraries that can only parse the response after the connection is closed cannot be used. For the encoding used, please refer to <strong>Events</strong> section below.</p> -<p>All subsequent (non-<code>SUBSCRIBE</code>) requests to the “/scheduler” endpoint (see details below in <strong>Calls</strong> section) must be sent using a different connection than the one used for subscription. The master responds to these HTTP POST requests with “202 Accepted” status codes (or, for unsuccessful requests, with 4xx or 5xx status codes; details in later sections). The “202 Accepted” response means that a request has been accepted for processing, not that the processing of the request has been completed. The request might or might not be acted upon by Mesos (e.g., master fails during the processing of the request). Any asynchronous responses from these requests will be streamed on the long-lived subscription connection. Schedulers can submit requests using more than one different HTTP connection.</p> +<p>All subsequent (non-<code>SUBSCRIBE</code>) requests to the “/scheduler” endpoint (see details below in <strong>Calls</strong> section) must be sent using a different connection than the one used for subscription. Schedulers can submit requests using more than one different HTTP connection.</p> + +<p>The master responds to HTTP POST requests that require asynchronous processing with status <strong>202 Accepted</strong> (or, for unsuccessful requests, with 4xx or 5xx status codes; details in later sections). 
The <strong>202 Accepted</strong> response means that a request has been accepted for processing, not that the processing of the request has been completed. The request might or might not be acted upon by Mesos (e.g., master fails during the processing of the request). Any asynchronous responses from these requests will be streamed on the long-lived subscription connection.</p> + +<p>The master responds to HTTP POST requests that can be answered synchronously and immediately with status <strong>200 OK</strong> (or, for unsuccessful requests, with 4xx or 5xx status codes; details in later sections), possibly including a response body encoded in JSON or Protobuf. The encoding depends on the <strong>Accept</strong> header present in the request (the default encoding is JSON).</p> <h2>Calls</h2> @@ -272,6 +276,10 @@ HTTP/1.1 202 Accepted <p>NOTE: Mesos will cap <code>Filters.refuse_seconds</code> at 31536000 seconds (365 days).</p> +<p>The master will send task status updates in response to <code>LAUNCH</code> and <code>LAUNCH_GROUP</code> operations. For other types of operations, if an operation ID is specified, the master will send operation status updates in response.</p> + +<p>NOTE: For the time being, an operation ID can only be set if the operation affects resources provided by a <a href="/documentation/latest/./csi/#resource-providers">resource provider</a>. See <a href="https://issues.apache.org/jira/browse/MESOS-8371">MESOS-8194</a> for more details.</p> + <pre><code>ACCEPT Request (JSON): POST /api/v1/scheduler HTTP/1.1 @@ -452,6 +460,32 @@ ACKNOWLEDGE Response: HTTP/1.1 202 Accepted </code></pre> +<h3>ACKNOWLEDGE_OPERATION_STATUS</h3> + +<p>Sent by the scheduler to acknowledge an operation status update. Schedulers are responsible for explicitly acknowledging the receipt of status updates that have <code>status.uuid</code> set. These status updates are retried until they are acknowledged by the scheduler. 
The scheduler must not acknowledge status updates that do not have <code>status.uuid</code> set, as they are not retried. The <code>uuid</code> field contains raw bytes encoded in Base64.</p> + +<pre><code>ACKNOWLEDGE_OPERATION_STATUS Request (JSON): +POST /api/v1/scheduler HTTP/1.1 + +Host: masterhost:5050 +Content-Type: application/json +Mesos-Stream-Id: 130ae4e3-6b13-4ef4-baa9-9f2e85c3e9af + +{ + "framework_id": { "value": "12220-3440-12532-2345" }, + "type": "ACKNOWLEDGE_OPERATION_STATUS", + "acknowledge_operation_status": { + "agent_id": { "value": "12220-3440-12532-S1233" }, + "resource_provider_id": { "value": "12220-3440-12532-rp" }, + "uuid": "jhadf73jhakdlfha723adf", + "operation_id": "73jhakdlfha723adf" + } +} + +ACKNOWLEDGE_OPERATION_STATUS Response: +HTTP/1.1 202 Accepted +</code></pre> + <h3>RECONCILE</h3> <p>Sent by the scheduler to query the status of non-terminal tasks. This causes the master to send back <code>UPDATE</code> events for each task in the list. Tasks that are no longer known to Mesos will result in <code>TASK_LOST</code> updates. If the list of tasks is empty, master will send <code>UPDATE</code> events for all currently known tasks of the framework.</p> @@ -479,6 +513,50 @@ RECONCILE Response: HTTP/1.1 202 Accepted </code></pre> +<h3>RECONCILE_OPERATIONS</h3> + +<p>Sent by the scheduler to query the status of non-terminal operations. The master will respond with a <code>RECONCILE_OPERATIONS</code> response containing the status of each operation in the list. 
If the list of operations is empty, the master will include in the response all currently known operations of the framework.</p> + +<pre><code>RECONCILE_OPERATIONS Request (JSON): +POST /api/v1/scheduler HTTP/1.1 + +Host: masterhost:5050 +Content-Type: application/json +Accept: application/json +Mesos-Stream-Id: 130ae4e3-6b13-4ef4-baa9-9f2e85c3e9af + +{ + "framework_id": { "value": "12220-3440-12532-2345" }, + "type": "RECONCILE_OPERATIONS", + "reconcile_operations": { + "operations": [ + { + "operation_id": { "value": "312325" }, + "agent_id": { "value": "123535" } + } + ] + } +} + +RECONCILE_OPERATIONS Response: +HTTP/1.1 200 OK + +Content-Type: application/json + +{ + "type": "RECONCILE_OPERATIONS", + "reconcile_operations": { + "operation_statuses": [ + { + "operation_id": { "value": "312325" }, + "state": "OPERATION_PENDING", + "uuid": "adfadfadbhgvjayd23r2uahj" + } + ] + } +} +</code></pre> + <h3>MESSAGE</h3> <p>Sent by the scheduler to send arbitrary binary data to the executor. Mesos neither interprets this data nor makes any guarantees about the delivery of this message to the executor. 
<code>data</code> is raw bytes encoded in Base64.</p> http://git-wip-us.apache.org/repos/asf/mesos-site/blob/b953594b/content/documentation/monitoring/index.html ---------------------------------------------------------------------- diff --git a/content/documentation/monitoring/index.html b/content/documentation/monitoring/index.html index 293ca08..9f74fcc 100644 --- a/content/documentation/monitoring/index.html +++ b/content/documentation/monitoring/index.html @@ -771,6 +771,13 @@ messages may indicate that there is a problem with the network.</p> </tr> <tr> <td> + <code>master/messages_reconcile_operations</code> + </td> + <td>Number of reconcile operations messages</td> + <td>Counter</td> +</tr> +<tr> + <td> <code>master/messages_reconcile_tasks</code> </td> <td>Number of reconcile task messages</td> http://git-wip-us.apache.org/repos/asf/mesos-site/blob/b953594b/content/documentation/operator-http-api/index.html ---------------------------------------------------------------------- diff --git a/content/documentation/operator-http-api/index.html b/content/documentation/operator-http-api/index.html index 4bcb9a4..c710e54 100644 --- a/content/documentation/operator-http-api/index.html +++ b/content/documentation/operator-http-api/index.html @@ -774,6 +774,10 @@ Content-Type: application/json "value": 0.0 }, { + "name": "master/messages_reconcile_operations", + "value": 0.0 + }, + { "name": "master/messages_reconcile_tasks", "value": 0.0 }, http://git-wip-us.apache.org/repos/asf/mesos-site/blob/b953594b/content/documentation/scheduler-http-api/index.html ---------------------------------------------------------------------- diff --git a/content/documentation/scheduler-http-api/index.html b/content/documentation/scheduler-http-api/index.html index c254010..d6b7550 100644 --- a/content/documentation/scheduler-http-api/index.html +++ b/content/documentation/scheduler-http-api/index.html @@ -139,7 +139,11 @@ considered stable and is the recommended way to develop new 
Mesos schedulers.</p <p><strong>Schedulers are expected to keep the subscription connection open as long as possible (barring errors in network, software, hardware, etc.) and incrementally process the response.</strong> HTTP client libraries that can only parse the response after the connection is closed cannot be used. For the encoding used, please refer to <strong>Events</strong> section below.</p> -<p>All subsequent (non-<code>SUBSCRIBE</code>) requests to the “/scheduler” endpoint (see details below in <strong>Calls</strong> section) must be sent using a different connection than the one used for subscription. The master responds to these HTTP POST requests with “202 Accepted” status codes (or, for unsuccessful requests, with 4xx or 5xx status codes; details in later sections). The “202 Accepted” response means that a request has been accepted for processing, not that the processing of the request has been completed. The request might or might not be acted upon by Mesos (e.g., master fails during the processing of the request). Any asynchronous responses from these requests will be streamed on the long-lived subscription connection. Schedulers can submit requests using more than one different HTTP connection.</p> +<p>All subsequent (non-<code>SUBSCRIBE</code>) requests to the “/scheduler” endpoint (see details below in <strong>Calls</strong> section) must be sent using a different connection than the one used for subscription. Schedulers can submit requests using more than one different HTTP connection.</p> + +<p>The master responds to HTTP POST requests that require asynchronous processing with status <strong>202 Accepted</strong> (or, for unsuccessful requests, with 4xx or 5xx status codes; details in later sections). The <strong>202 Accepted</strong> response means that a request has been accepted for processing, not that the processing of the request has been completed. 
The request might or might not be acted upon by Mesos (e.g., master fails during the processing of the request). Any asynchronous responses from these requests will be streamed on the long-lived subscription connection.</p> + +<p>The master responds to HTTP POST requests that can be answered synchronously and immediately with status <strong>200 OK</strong> (or, for unsuccessful requests, with 4xx or 5xx status codes; details in later sections), possibly including a response body encoded in JSON or Protobuf. The encoding depends on the <strong>Accept</strong> header present in the request (the default encoding is JSON).</p> <h2>Calls</h2> @@ -272,6 +276,10 @@ HTTP/1.1 202 Accepted <p>NOTE: Mesos will cap <code>Filters.refuse_seconds</code> at 31536000 seconds (365 days).</p> +<p>The master will send task status updates in response to <code>LAUNCH</code> and <code>LAUNCH_GROUP</code> operations. For other types of operations, if an operation ID is specified, the master will send operation status updates in response.</p> + +<p>NOTE: For the time being, an operation ID can only be set if the operation affects resources provided by a <a href="/documentation/latest/./csi/#resource-providers">resource provider</a>. See <a href="https://issues.apache.org/jira/browse/MESOS-8371">MESOS-8194</a> for more details.</p> + <pre><code>ACCEPT Request (JSON): POST /api/v1/scheduler HTTP/1.1 @@ -452,6 +460,32 @@ ACKNOWLEDGE Response: HTTP/1.1 202 Accepted </code></pre> +<h3>ACKNOWLEDGE_OPERATION_STATUS</h3> + +<p>Sent by the scheduler to acknowledge an operation status update. Schedulers are responsible for explicitly acknowledging the receipt of status updates that have <code>status.uuid</code> set. These status updates are retried until they are acknowledged by the scheduler. The scheduler must not acknowledge status updates that do not have <code>status.uuid</code> set, as they are not retried. 
The <code>uuid</code> field contains raw bytes encoded in Base64.</p> + +<pre><code>ACKNOWLEDGE_OPERATION_STATUS Request (JSON): +POST /api/v1/scheduler HTTP/1.1 + +Host: masterhost:5050 +Content-Type: application/json +Mesos-Stream-Id: 130ae4e3-6b13-4ef4-baa9-9f2e85c3e9af + +{ + "framework_id": { "value": "12220-3440-12532-2345" }, + "type": "ACKNOWLEDGE_OPERATION_STATUS", + "acknowledge_operation_status": { + "agent_id": { "value": "12220-3440-12532-S1233" }, + "resource_provider_id": { "value": "12220-3440-12532-rp" }, + "uuid": "jhadf73jhakdlfha723adf", + "operation_id": "73jhakdlfha723adf" + } +} + +ACKNOWLEDGE_OPERATION_STATUS Response: +HTTP/1.1 202 Accepted +</code></pre> + <h3>RECONCILE</h3> <p>Sent by the scheduler to query the status of non-terminal tasks. This causes the master to send back <code>UPDATE</code> events for each task in the list. Tasks that are no longer known to Mesos will result in <code>TASK_LOST</code> updates. If the list of tasks is empty, master will send <code>UPDATE</code> events for all currently known tasks of the framework.</p> @@ -479,6 +513,50 @@ RECONCILE Response: HTTP/1.1 202 Accepted </code></pre> +<h3>RECONCILE_OPERATIONS</h3> + +<p>Sent by the scheduler to query the status of non-terminal operations. The master will respond with a <code>RECONCILE_OPERATIONS</code> response containing the status of each operation in the list. 
If the list of operations is empty, the master will include in the response all currently known operations of the framework.</p> + +<pre><code>RECONCILE_OPERATIONS Request (JSON): +POST /api/v1/scheduler HTTP/1.1 + +Host: masterhost:5050 +Content-Type: application/json +Accept: application/json +Mesos-Stream-Id: 130ae4e3-6b13-4ef4-baa9-9f2e85c3e9af + +{ + "framework_id": { "value": "12220-3440-12532-2345" }, + "type": "RECONCILE_OPERATIONS", + "reconcile_operations": { + "operations": [ + { + "operation_id": { "value": "312325" }, + "agent_id": { "value": "123535" } + } + ] + } +} + +RECONCILE_OPERATIONS Response: +HTTP/1.1 200 OK + +Content-Type: application/json + +{ + "type": "RECONCILE_OPERATIONS", + "reconcile_operations": { + "operation_statuses": [ + { + "operation_id": { "value": "312325" }, + "state": "OPERATION_PENDING", + "uuid": "adfadfadbhgvjayd23r2uahj" + } + ] + } +} +</code></pre> + <h3>MESSAGE</h3> <p>Sent by the scheduler to send arbitrary binary data to the executor. Mesos neither interprets this data nor makes any guarantees about the delivery of this message to the executor. <code>data</code> is raw bytes encoded in Base64.</p>
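
Reviewer note on the two calls added in this diff: the sketch below shows one way a scheduler might assemble the RECONCILE_OPERATIONS and ACKNOWLEDGE_OPERATION_STATUS request bodies. It is an illustration under stated assumptions, not code from Mesos. The function names are hypothetical, and the field layout is copied from the example requests in the diff (including the plain-string operation_id in the acknowledgement), so it should be checked against the scheduler protobuf definitions before use. The only behaviour taken from the documentation text is that an empty operation list asks for all known operations and that a status without a uuid must not be acknowledged.

```python
import json


def build_reconcile_operations(framework_id, operations):
    """Build a RECONCILE_OPERATIONS call body.

    `operations` is an iterable of (operation_id, agent_id) pairs. Per the
    documentation in this diff, an empty list asks the master for all
    operations it currently knows for the framework.
    """
    return {
        "framework_id": {"value": framework_id},
        "type": "RECONCILE_OPERATIONS",
        "reconcile_operations": {
            "operations": [
                {"operation_id": {"value": op_id}, "agent_id": {"value": agent_id}}
                for op_id, agent_id in operations
            ]
        },
    }


def build_acknowledgement(framework_id, agent_id, resource_provider_id, status):
    """Build an ACKNOWLEDGE_OPERATION_STATUS call body, or return None.

    `status` is an operation status dict shaped like the entries in the
    example response. Statuses without a uuid are not retried and must not
    be acknowledged, so None is returned for them.
    """
    uuid = status.get("uuid")
    if not uuid:
        return None
    return {
        "framework_id": {"value": framework_id},
        "type": "ACKNOWLEDGE_OPERATION_STATUS",
        "acknowledge_operation_status": {
            "agent_id": {"value": agent_id},
            "resource_provider_id": {"value": resource_provider_id},
            # Already Base64 text in the JSON encoding; passed through as-is.
            "uuid": uuid,
            # Assumption: plain string, mirroring the example request in the
            # diff; the protobuf schema may expect {"value": ...} instead.
            "operation_id": status["operation_id"]["value"],
        },
    }


if __name__ == "__main__":
    call = build_reconcile_operations("12220-3440-12532-2345", [("312325", "123535")])
    print(json.dumps(call, indent=2))
```

Either body would be sent as an HTTP POST to /api/v1/scheduler on a connection other than the subscription connection, with the Mesos-Stream-Id header set as in the examples.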
