[
https://issues.apache.org/jira/browse/AURORA-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654067#comment-14654067
]
Maxim Khutornenko commented on AURORA-1415:
-------------------------------------------
h3. Problem
As it stands today, scheduler resource management is unable to support custom
resource attributes/traits defined in Mesos. The scheduler treats host resources as
homogeneous scalar vectors (CPU, RAM, DISK) or ranges (PORTS) and rolls them up
without considering their unique attributes (e.g.
[role|https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L494]
or
[revocable|https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L560]).
This makes it impossible to support certain Mesos features that rely on
resource customization ([framework role|AURORA-1109], [revocable
offers|AURORA-1343], [dynamic
reservation|https://docs.google.com/document/d/1e3j69pfBgtc8xM00DhcuiMl6ImkEB5na0TzOMyzrg8A/edit]).
Scheduler resource management needs to recognize and expose Mesos resource
attributes. This will require de-anonymizing resources and preventing their
automatic rollup based on vector type.
h3. Current Implementation
The bulk of resource management in scheduler is represented by the following
classes:
- {{Resources}} - main resource container and handler. Converts between Mesos
protobuf resources and the internal representation (scalar/range values for
cpu, ram, disk and ports).
- {{ResourceSlot}} - wrapper over {{Resources}} providing additional logic for
executor overhead and utility functions.
- {{ResourceAggregate}} - internal thrift object intended to communicate
aggregated cpu/ram/disk values with client/UI.
- {{ResourceAggregates}} - static library helpers for dealing with
{{ResourceAggregate}} instances.
h3. Proposal
Mesos {{Resource}} protobuf
[struct|https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L488]
is already too rich with attributes to mirror on the scheduler side with
a corresponding POJO. A more resilient long-term approach may be to hold on to
the original Mesos protobuf structs within our internal {{Resources}} object
and convert them into a generalized aggregated representation only when one is
needed. Below is a proposed set of changes intended to de-generalize resource
management and enable filtering/querying by arbitrary resource attributes:
- {{Resources}} will store a list of original resource vectors unchanged. E.g.:
{{Iterable<Resource> mesosResources;}}.
- {{Resources}} will support a new {{Resources filter(Predicate<Resource>
predicate);}} to filter resources by a given attribute.
- {{ResourceSlot}} will be repurposed to serve as an anonymous (aggregated)
resource representation, returned by the new {{ResourceSlot
aggregate();}} method in {{Resources}} any time an aggregation or
transformation is needed.
- Static transformation methods (like {{subtract}}, {{divide}}, etc.) will
become {{ResourceSlot}} instance methods.
- {{ResourceAggregates}} will be merged with {{ResourceSlot}} to convert
to/from {{ResourceAggregate}} when needed.
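A minimal, self-contained sketch of how the proposed {{filter}} and {{aggregate}} methods could fit together. The {{Resource}} class here is a hypothetical stand-in for the Mesos protobuf (scalar resources only), and the field names as well as the {{cpus}}/{{mem}}/{{disk}} resource names are assumptions made for illustration, not the actual scheduler code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for the Mesos Resource protobuf (scalar values only).
final class Resource {
  final String name;
  final double value;
  final String role;
  final boolean revocable;

  Resource(String name, double value, String role, boolean revocable) {
    this.name = name;
    this.value = value;
    this.role = role;
    this.revocable = revocable;
  }
}

// Anonymous aggregated view: attributes are dropped, values are summed by name.
final class ResourceSlot {
  final double cpus;
  final double ramMb;
  final double diskMb;

  ResourceSlot(double cpus, double ramMb, double diskMb) {
    this.cpus = cpus;
    this.ramMb = ramMb;
    this.diskMb = diskMb;
  }
}

// Keeps the original resource vectors and only rolls them up on request.
final class Resources {
  private final List<Resource> mesosResources;

  Resources(List<Resource> mesosResources) {
    // Defensive copy so later changes to the caller's list are not visible here.
    this.mesosResources = new ArrayList<>(mesosResources);
  }

  // Returns a new container holding only the vectors matching the predicate.
  Resources filter(Predicate<Resource> predicate) {
    List<Resource> kept = new ArrayList<>();
    for (Resource r : mesosResources) {
      if (predicate.test(r)) {
        kept.add(r);
      }
    }
    return new Resources(kept);
  }

  // Rolls the remaining vectors up by name, ignoring role and revocable flags.
  ResourceSlot aggregate() {
    double cpus = 0;
    double ram = 0;
    double disk = 0;
    for (Resource r : mesosResources) {
      switch (r.name) {
        case "cpus": cpus += r.value; break;
        case "mem": ram += r.value; break;
        case "disk": disk += r.value; break;
        default: break; // Names outside this sketch (e.g. ports) are skipped.
      }
    }
    return new ResourceSlot(cpus, ram, disk);
  }
}
```

Because {{filter}} returns another {{Resources}} instance, several filters can be chained before the single {{aggregate()}} call that discards the attributes.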
h4. Usage Examples
h6. Getting available revocable resources from an offer
{noformat}
public static final Predicate<Resource> CPU = e -> e.getName().equals("cpus");
// Non-CPU resources always pass; CPU passes only when marked revocable.
public static final Predicate<Resource> REVOCABLE = e -> !CPU.test(e) ||
e.hasRevocable();
...
ResourceSlot unused = Resources.from(offer).filter(REVOCABLE).aggregate();
{noformat}
h6. Getting available revocable resources from an offer for a given framework
role
{noformat}
public static final Predicate<Resource> ROLE = e ->
e.getRole().equals("Aurora");
...
ResourceSlot roleRevocable =
Resources.from(offer).filter(ROLE.and(REVOCABLE)).aggregate();
{noformat}
h6. Creating Mesos resources for TaskInfo with executor overhead (and epsilon,
see {{MesosTaskFactory}} for real life example)
{noformat}
List<Resource> resources =
Resources.from(ResourceSlot.from(task).add(executorOverhead).subtract(RESOURCES_EPSILON)).toResourceList();
{noformat}
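The chained {{add}}/{{subtract}} calls above rely on the transformation methods becoming {{ResourceSlot}} instance methods. A standalone sketch of that shape follows, assuming an immutable slot with hypothetical {{numCpus}}, {{ramMb}} and {{diskMb}} fields; the real class may differ.

```java
// Hypothetical immutable ResourceSlot with instance-level arithmetic.
final class ResourceSlot {
  final double numCpus;
  final long ramMb;
  final long diskMb;

  ResourceSlot(double numCpus, long ramMb, long diskMb) {
    this.numCpus = numCpus;
    this.ramMb = ramMb;
    this.diskMb = diskMb;
  }

  // Each operation returns a new slot, so calls can be chained safely.
  ResourceSlot add(ResourceSlot other) {
    return new ResourceSlot(
        numCpus + other.numCpus, ramMb + other.ramMb, diskMb + other.diskMb);
  }

  ResourceSlot subtract(ResourceSlot other) {
    return new ResourceSlot(
        numCpus - other.numCpus, ramMb - other.ramMb, diskMb - other.diskMb);
  }
}
```

Returning a fresh instance from every operation keeps intermediate values such as the executor overhead reusable across tasks.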
> De-generalize resource handling in Scheduler
> --------------------------------------------
>
> Key: AURORA-1415
> URL: https://issues.apache.org/jira/browse/AURORA-1415
> Project: Aurora
> Issue Type: Task
> Components: Scheduler
> Reporter: Maxim Khutornenko
> Assignee: Maxim Khutornenko
>
> From design doc:
> {quote}
> To handle revocable resources correctly Aurora needs to de-generalize and
> simplify its internal resource representation. The new resource vector should
> be capable of aggregating resources by revocable flag. This will require
> refactoring existing resource handling (including AURORA-105), which will
> also help to support Mesos framework role in future.
> {quote}