-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45961/#review141305
-----------------------------------------------------------



Partial review. Let's discuss first.


src/master/master.cpp (lines 3495 - 3498)
<https://reviews.apache.org/r/45961/#comment206863>

    `task.has_executor()` doesn't always lead to additional resources being 
used/charged against the framework. See `Master::addTask()`. It will suck a lot 
if we have to duplicate that logic here though so we might have to consider 
doing the pending task refactor first...



src/master/master.cpp (lines 3597 - 3598)
<https://reviews.apache.org/r/45961/#comment206853>

    



src/master/master.cpp (lines 3907 - 3910)
<https://reviews.apache.org/r/45961/#comment206887>

    Need to check whether this executor actually consumes resources.



src/master/allocator/mesos/hierarchical.cpp (line 1271)
<https://reviews.apache.org/r/45961/#comment206973>

    A comment on what it is used for?
    
    ```
    Due to the two stages in the allocation algorithm and the nature of shared 
resources being re-offerable even if already allocated, the same shared 
resources can appear in two distinct offers in one allocation cycle. This would 
be bad because the allocator API contract shouldn't depend on its 
implementation details. For now we make sure a shared resource is only 
allocated once in one offer cycle. To achieve this we use 
`offeredSharedResources` to keep track of shared resources already allocated in 
the current cycle.
    ```



src/master/allocator/mesos/hierarchical.cpp (lines 1331 - 1340)
<https://reviews.apache.org/r/45961/#comment206974>

    Given this is the 1st stage of the allocate cycle there's no chance a 
shared resource could be in `offeredSharedResources` already?



src/master/allocator/mesos/hierarchical.cpp (lines 1386 - 1400)
<https://reviews.apache.org/r/45961/#comment206992>

    IIUC the comment still means "this is done to make sure shared resources 
are counted at most once in the framework's allocation".
    
    The last revision had:
    
    ```
    slaves[slaveId].allocated -= resources.shared();
    slaves[slaveId].allocated += resources;
    ```
    
    How is this longer version different?



src/master/allocator/mesos/hierarchical.cpp (lines 1499 - 1513)
<https://reviews.apache.org/r/45961/#comment206995>

    Is this equivalent?
    
    ```
    Resources available = slaves[slaveId].total.nonShared() - 
slaves[slaveId].allocated).nonShared();
    available += slaves[slaveId].total.shared() - 
offeredSharedResources[slaveId];
    ```



src/master/allocator/mesos/hierarchical.cpp (lines 1570 - 1571)
<https://reviews.apache.org/r/45961/#comment206997>

    Explain the `nonShared()` here.
    
    `remainingClusterResources` doesn't exclude shared resources.



src/master/allocator/mesos/hierarchical.cpp (lines 1595 - 1605)
<https://reviews.apache.org/r/45961/#comment206999>

    Ditto to my comment on the same code above.



src/master/allocator/sorter/drf/sorter.cpp (lines 163 - 170)
<https://reviews.apache.org/r/45961/#comment207005>

    So `DRFSorter::allocated` makes sure there's no duplicate shared resources 
in `allocations` so the caller can call it without pruning out the redundant 
copies. However the `unallocated` counterpart doesn't do the same? Ideally 
usage of the pair of API should be symmetric.



src/master/allocator/sorter/drf/sorter.cpp (line 194)
<https://reviews.apache.org/r/45961/#comment207000>

    No longer relevant?



src/master/master.hpp (line 274)
<https://reviews.apache.org/r/45961/#comment206784>

    Add some comment to help explain why we need this method:
    
    ```
      // Return a subset of `resources` offered to `frameworkId` that can 
      // be recovered. Right now we filter out shared resources that are
      // still used by `frameworkId`.
      // NOTE: A framework can reuse a shared resource to launch multiple
      // tasks and the allocator only recovers a shared resource allocated
      // to the framework if it's not used by any task of the framework.
    ```



src/master/master.hpp (line 278)
<https://reviews.apache.org/r/45961/#comment207068>

    s/updated/result/.



src/master/master.hpp (lines 280 - 286)
<https://reviews.apache.org/r/45961/#comment207067>

    If `updated` has multple copies of a shared resource, the `-=` here can't 
remove it.
    
    We have to add it back. The result would have a single copy of the shared 
resource that are unused (by this framework; I think this is what we want).
    
    ```
    Resources recoverable(const Resources& resources)
    {
      Resources recoverable = resources.nonShared();
      foreach (const Resource& resource, resources.shared()) {
        if (!usedResources[frameworkId].contains(resource)) {
          recoverable += resource;
        }
      }
    
      return recoverable;
    }
    ```



src/master/master.hpp (line 343)
<https://reviews.apache.org/r/45961/#comment206780>

    Comment.
    
    ```
    // Resources consumed by pending tasks.
    ```



src/master/master.cpp (line 3409)
<https://reviews.apache.org/r/45961/#comment206787>

    Shouldn't this go through `recoverable()` as well?
    
    It'll be helpful if add some comments on the `recoverResources` to spell 
out what exactly is expected.



src/master/master.cpp (line 3488)
<https://reviews.apache.org/r/45961/#comment207073>

    Add to this comment that if the task isn't going to be launched, we don't 
need to account for its resources in `pendingResources` either.



src/master/master.cpp (line 3495)
<https://reviews.apache.org/r/45961/#comment207075>

    Checking `task.has_executor()` is insufficient as a condition to add the 
executor's resources because they may not consume additional resources.



src/master/master.cpp (line 3497)
<https://reviews.apache.org/r/45961/#comment206794>

    Two space indentation.



src/master/master.cpp (lines 3597 - 3598)
<https://reviews.apache.org/r/45961/#comment207076>

    At least we need a comment on why it's safe to recover all 
`offeredResources` if `slave == nullptr`: if slave is nullptr, no 
`usedResources` is on it so all resources should be recoverable.
    
    But honestly this looks odd and we need to invoke this conditional operator 
in multiple places. Adding the same comment to all places this is called is 
also very redundant.
    
    Perhaps `recoverable` can be moved to Master and hide this detail about 
`slave == nullptr`? Perhaps we should just wrap around 
`allocator->recoverResources` in Master to insert this logic.



src/master/master.cpp (lines 3639 - 3640)
<https://reviews.apache.org/r/45961/#comment207079>

    Ditto.



src/master/master.cpp (line 3650)
<https://reviews.apache.org/r/45961/#comment207081>

    No need for the underscore and need comments on the variable.



src/master/master.cpp (line 3907)
<https://reviews.apache.org/r/45961/#comment207082>

    Ditto to my comment above, we don't know if the task's executor actually 
consumes resources by just checking `task.has_executor()`.



src/master/master.cpp (line 3966)
<https://reviews.apache.org/r/45961/#comment207083>

    Not "original" if it goes through offer operations. How about just "We add 
back offered shared resource even if they are consumed by other tasks in the 
same accept call so multiple tasks using the same shared resource can be 
launched with the same offer".
    
    We still need to comment at where the variable is first introduced.



src/master/master.cpp (lines 4042 - 4044)
<https://reviews.apache.org/r/45961/#comment207084>

    Instead of commenting here we should comment on the method because it's 
called from multiple callsites.



src/master/master.cpp (lines 4143 - 4144)
<https://reviews.apache.org/r/45961/#comment207085>

    Ditto to comments above about the same code.



src/master/master.cpp (lines 4281 - 4285)
<https://reviews.apache.org/r/45961/#comment207092>

    So `taskId` in `pendingTasks` actually doesn't guarantee `slave != nullptr` 
because `removeSlave` doesn't clean it up.



src/master/master.cpp (lines 4287 - 4297)
<https://reviews.apache.org/r/45961/#comment207093>

    Ditto about `task.has_executor()`.



src/master/validation.hpp (lines 157 - 158)
<https://reviews.apache.org/r/45961/#comment207090>

    Don't we always have the two additional arugments, even thought they could 
be empty (but not None)?


- Jiang Yan Xu


On July 7, 2016, 8:44 a.m., Anindya Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45961/
> -----------------------------------------------------------
> 
> (Updated July 7, 2016, 8:44 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Joris Van Remoortere, Michael 
> Park, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-4431
>     https://issues.apache.org/jira/browse/MESOS-4431
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> o Each shared resource is accouted via its share count. This count is
>   updated based on the resource operations (such as add and subtract)
>   in various scenarios such as task launch and terminate at multiple
>   modules such as master, allocator, sorter, etc.
> o Only allow DESTROY if there are no running or pending tasks using
>   the volume. However, if the volume is in a pending offer to one or
>   more frameworks, resind that offer before processing the DESTROY.
> 
> 
> Diffs
> -----
> 
>   src/common/resources.cpp f6ff92b591c15bc8e93fd85e1896349c3a7bb968 
>   src/common/resources_utils.cpp 8e881a09500b0966b577e514077460f724868a8d 
>   src/master/allocator/mesos/hierarchical.cpp 
> c1e00039606164599e25ff5f76245e4d35ec3112 
>   src/master/allocator/sorter/drf/sorter.hpp 
> e29feebd70277c79f7c3f6fb233e7a36501cf220 
>   src/master/allocator/sorter/drf/sorter.cpp 
> 7df4dd641b21ea0705368861bf4679fed1ef078d 
>   src/master/http.cpp bff1fd53462bfec19e4a025c49a00e2855faf4f3 
>   src/master/master.hpp 60efd22280c62801b080365986fe9773737ca563 
>   src/master/master.cpp 79e3d78ba45060bc2f2532fdc3d105d1cc888d0f 
>   src/master/validation.hpp 43d876b00511d4ef652c2685d4950c994b5eef31 
>   src/master/validation.cpp 50ba372d1301d16d1738a3e4f4882b51f9ce06cd 
>   src/tests/master_validation_tests.cpp 
> 9eb82a569f7c48caa96d4d54a93b199ccac74385 
>   src/tests/sorter_tests.cpp 0a921347586808863e615ca3dcc23fae92b629f5 
>   src/v1/resources.cpp 8c3f2d1c1529915a59d47fe37bb3fc7a3267079a 
> 
> Diff: https://reviews.apache.org/r/45961/diff/
> 
> 
> Testing
> -------
> 
> Tests successful.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>

Reply via email to