[ 
https://issues.apache.org/jira/browse/MESOS-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-5280:
--------------------------
    Description: 
There exist a few different error handling styles in the sorter.

h2. Hard checks
e.g., 
[DRFSorter::update|https://github.com/apache/mesos/blob/c530deb3050d862fd894a9c4ed0a8ddca8714a63/src/master/allocator/sorter/drf/sorter.cpp#L62]
{code}
CHECK(weights.contains(name));
{code}

h2. No-op if it results in an error condition.
e.g., 
[DRFSorter::allocated|https://github.com/apache/mesos/blob/c530deb3050d862fd894a9c4ed0a8ddca8714a63/src/master/allocator/sorter/drf/sorter.cpp#L116]:
{code}
set<Client, DRFComparator>::iterator it = find(name);

if (it != clients.end()) { // TODO(benh): This should really be a CHECK.
...
}
{code}

The problem:

- Silence no-ops is not ideal. (Implicitness makes it hard to debug things and 
we have run into one instance of this).
- Hard CHECKs on invalid arguments is often too harsh.
- Not checking preconditions can lead to subtle bugs.
- We should check errors consistently.

  was:
There exist a few different error handling styles in the sorter.

h2. Hard checks
e.g., 
[DRFSorter::update|https://github.com/apache/mesos/blob/c530deb3050d862fd894a9c4ed0a8ddca8714a63/src/master/allocator/sorter/drf/sorter.cpp#L62]
{code}
CHECK(weights.contains(name));
{code}

h2. No-op if it results in an error condition.
e.g., 
[DRFSorter::allocated|https://github.com/apache/mesos/blob/c530deb3050d862fd894a9c4ed0a8ddca8714a63/src/master/allocator/sorter/drf/sorter.cpp#L116]:
{code}
set<Client, DRFComparator>::iterator it = find(name);

if (it != clients.end()) { // TODO(benh): This should really be a CHECK.
...
}
{code}

IMO there should never be silent no-ops. Short of CHECK, we should return an 
error if it's indeed an error. If either path of the branch is valid and one is 
a  noop, we should log the noop branch or return a 'bool' so the caller can 
distinguish the two.

Implicitness makes it hard to debug things and we have run into one instance of 
this.

My proposal is to use CHECKs consistently in sorter.


> Inconsistent error checking in DRF sorter.
> ------------------------------------------
>
>                 Key: MESOS-5280
>                 URL: https://issues.apache.org/jira/browse/MESOS-5280
>             Project: Mesos
>          Issue Type: Bug
>          Components: allocation
>            Reporter: Yan Xu
>            Assignee: Yan Xu
>
> There exist a few different error handling styles in the sorter.
> h2. Hard checks
> e.g., 
> [DRFSorter::update|https://github.com/apache/mesos/blob/c530deb3050d862fd894a9c4ed0a8ddca8714a63/src/master/allocator/sorter/drf/sorter.cpp#L62]
> {code}
> CHECK(weights.contains(name));
> {code}
> h2. No-op if it results in an error condition.
> e.g., 
> [DRFSorter::allocated|https://github.com/apache/mesos/blob/c530deb3050d862fd894a9c4ed0a8ddca8714a63/src/master/allocator/sorter/drf/sorter.cpp#L116]:
> {code}
> set<Client, DRFComparator>::iterator it = find(name);
> if (it != clients.end()) { // TODO(benh): This should really be a CHECK.
> ...
> }
> {code}
> The problem:
> - Silence no-ops is not ideal. (Implicitness makes it hard to debug things 
> and we have run into one instance of this).
> - Hard CHECKs on invalid arguments is often too harsh.
> - Not checking preconditions can lead to subtle bugs.
> - We should check errors consistently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to