+1

On Mon, Oct 9, 2017 at 10:56 AM, James Peach <jor...@gmail.com> wrote:

> Hi all,
>
> In https://reviews.apache.org/r/62644/, I am proposing to add an optional
> Resources field to the TaskStatus message named `limited_resources`.
>
> In the case that a task is killed because it violated a resource
> constraint (i.e., the reason field is REASON_CONTAINER_LIMITATION,
> REASON_CONTAINER_LIMITATION_DISK or REASON_CONTAINER_LIMITATION_MEMORY),
> this field may be populated with the resource that triggered the
> limitation. This is intended to give better information to schedulers about
> task resource failures, in the expectation that it will help them bubble
> useful information up to the user or a monitoring system.
>
> diff --git a/include/mesos/v1/mesos.proto b/include/mesos/v1/mesos.proto
> index d742adbbf..559d09e37 100644
> --- a/include/mesos/v1/mesos.proto
> +++ b/include/mesos/v1/mesos.proto
> @@ -2252,6 +2252,13 @@ message TaskStatus {
>    // status updates for tasks running on agents that are unreachable
>    // (e.g., partitioned away from the master).
>    optional TimeInfo unreachable_time = 14;
> +
> +  // If the reason field indicates a container resource limitation,
> +  // this field contains the resource whose limits were violated.
> +  //
> +  // NOTE: 'Resources' is used here because the resource may span
> +  // multiple roles (e.g. `"mem(*):1;mem(role):2"`).
> +  repeated Resource limited_resources = 16;
>  }
>
>
>
> cheers,
> James
>
>
>

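For what it's worth, here is a rough sketch of how a scheduler might consume the proposed field to bubble a useful message up to the user. It is only an illustration: the `Resource` and `TaskStatus` classes below are hypothetical stand-ins for the protobuf messages (only the fields needed here are modeled), `describe_limitation` is a made-up helper, and summing scalar values across roles is an assumption about how one might want to present the data, not something the patch specifies.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Reasons named in the proposal as indicating a container resource limitation.
LIMITATION_REASONS = {
    "REASON_CONTAINER_LIMITATION",
    "REASON_CONTAINER_LIMITATION_DISK",
    "REASON_CONTAINER_LIMITATION_MEMORY",
}


@dataclass
class Resource:
    # Hypothetical, simplified stand-in for the Mesos Resource message.
    name: str
    role: str
    value: float


@dataclass
class TaskStatus:
    # Hypothetical, simplified stand-in for the Mesos TaskStatus message.
    task_id: str
    reason: str
    limited_resources: List[Resource] = field(default_factory=list)


def describe_limitation(status: TaskStatus) -> Optional[str]:
    """Return a user-facing message, or None if the task was not
    terminated for a container resource limitation."""
    if status.reason not in LIMITATION_REASONS:
        return None

    base = f"task {status.task_id} killed: {status.reason}"

    # The field is optional in spirit: it "may be populated", so fall
    # back to the bare reason when it is empty.
    if not status.limited_resources:
        return base

    # The same resource may span multiple roles, e.g. mem(*):1;mem(role):2,
    # so aggregate by resource name before reporting (an assumption).
    totals = {}
    for resource in status.limited_resources:
        totals[resource.name] = totals.get(resource.name, 0.0) + resource.value

    parts = ", ".join(f"{name}={value:g}" for name, value in sorted(totals.items()))
    return f"{base} (limited: {parts})"
```

With the `mem(*):1;mem(role):2` example from the patch comment, this would report a single aggregated `mem` entry, and a status without `limited_resources` still produces a message based on the reason alone.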