> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 3499-3502
> > <https://reviews.apache.org/r/45961/diff/4-7/?file=1416585#file1416585line3499>
> >
> >     `task.has_executor()` doesn't always lead to additional resources being 
> > used/charged against the framework. See `Master::addTask()`. It will suck a 
> > lot if we have to duplicate that logic here though so we might have to 
> > consider doing the pending task refactor first...

Refactored `Master::addTask()` and moved calculation of resources for the task 
into a separate function `Master::totalTaskResources()`, and used this function 
to determine the resources for pending tasks (in `Slave::pendingResources`).


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 3913-3916
> > <https://reviews.apache.org/r/45961/diff/4-7/?file=1416585#file1416585line3913>
> >
> >     Need to check whether this executor actually consumes resources.

Please refer to earlier comment regarding `Master::totalTaskResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 4291-4301
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438752#file1438752line4291>
> >
> >     Ditto about `task.has_executor()`.

Please refer to earlier comment regarding `Master::totalTaskResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, line 3497
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438752#file1438752line3497>
> >
> >     Checking `task.has_executor()` is insufficient as a condition to add 
> > the executor's resources because they may not consume additional resources.

Please refer to earlier comment regarding `Master::totalTaskResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, line 3911
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438752#file1438752line3911>
> >
> >     Ditto to my comment above, we don't know if the task's executor 
> > actually consumes resources by just checking `task.has_executor()`.

Please refer to earlier comment regarding `Master::totalTaskResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 3601-3602
> > <https://reviews.apache.org/r/45961/diff/4-7/?file=1416585#file1416585line3601>
> >
> >

Wrapped `Allocator::recoverResources()` into a new function 
`Master::recoverResources()` with the same arguments. Inside 
`Master::recoverResources()`, we check for `slave == nullptr` to determine what 
resources are recoverable, and call `Allocator::recoverResources()` thereafter.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.hpp, line 274
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438751#file1438751line274>
> >
> >     Add some comment to help explain why we need this method:
> >     
> >     ```
> >       // Return a subset of `resources` offered to `frameworkId` that can 
> >       // be recovered. Right now we filter out shared resources that are
> >       // still used by `frameworkId`.
> >       // NOTE: A framework can reuse a shared resource to launch multiple
> >       // tasks and the allocator only recovers a shared resource allocated
> >       // to the framework if it's not used by any task of the framework.
> >     ```

Refactored this within `Master::recoverResources()` and added comments within 
that function.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, line 3411
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438752#file1438752line3411>
> >
> >     Shouldn't this go through `recoverable()` as well?
> >     
> >     It'll be helpful if add some comments on the `recoverResources` to 
> > spell out what exactly is expected.

Wrapped `Allocator::recoverResources()` into `Master::recoverResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 3599-3600
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438752#file1438752line3599>
> >
> >     At least we need a comment on why it's safe to recover all 
> > `offeredResources` if `slave == nullptr`: if slave is nullptr, no 
> > `usedResources` is on it so all resources should be recoverable.
> >     
> >     But honestly this looks odd and we need to invoke this conditional 
> > operator in multiple places. Adding the same comment to all places this is 
> > called is also very redundant.
> >     
> >     Perhaps `recoverable` can be moved to Master and hide this detail about 
> > `slave == nullptr`? Perhaps we should just wrap around 
> > `allocator->recoverResources` in Master to insert this logic.

Wrapped `Allocator::recoverResources()` into `Master::recoverResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 3643-3644
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438752#file1438752line3643>
> >
> >     Ditto.

Wrapped `Allocator::recoverResources()` into `Master::recoverResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 4147-4148
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438752#file1438752line4147>
> >
> >     Ditto to comments above about the same code.

Wrapped `Allocator::recoverResources()` into `Master::recoverResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/validation.hpp, lines 157-158
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438753#file1438753line157>
> >
> >     Don't we always have the two additional arugments, even thought they 
> > could be empty (but not None)?

The reason is to satisfy a couple of (6 in total) test cases which tests the 
validation without involvement of framework's view of used and pending 
resources. On second thought, I updated the test cases and made these 2 args 
not have option. So, now we have:
```
Option<Error> validate(
    const Offer::Operation::Destroy& destroy,
    const Resources& checkpointedResources,
    const hashmap<FrameworkID, Resources>& usedResources,
    const hashmap<FrameworkID, Resources>& pendingResources);
```


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 1386-1400
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438747#file1438747line1386>
> >
> >     IIUC the comment still means "this is done to make sure shared 
> > resources are counted at most once in the framework's allocation".
> >     
> >     The last revision had:
> >     
> >     ```
> >     slaves[slaveId].allocated -= resources.shared();
> >     slaves[slaveId].allocated += resources;
> >     ```
> >     
> >     How is this longer version different?

The earlier code forces `slaves[slaveId].allocated` to have atmost a single 
copy of a shared resource. This version ensures that 
`slaves[slaveId].allocated` contains shared resources that are counted at most 
once for each framework that resource is allocated to.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 1596-1606
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438747#file1438747line1596>
> >
> >     Ditto to my comment on the same code above.

Same comment as in the prior comment.
The earlier code forces `slaves[slaveId].allocated` to have atmost a single 
copy of a shared resource. This version ensures that 
`slaves[slaveId].allocated` contains shared resources that are counted at most 
once for each framework that resource is allocated to.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 1331-1340
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438747#file1438747line1331>
> >
> >     Given this is the 1st stage of the allocate cycle there's no chance a 
> > shared resource could be in `offeredSharedResources` already?

As discussed offline, we would need this snippet to address scenarios where we 
have multiple frameworks of the same role.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/sorter/drf/sorter.cpp, lines 163-170
> > <https://reviews.apache.org/r/45961/diff/7/?file=1438749#file1438749line163>
> >
> >     So `DRFSorter::allocated` makes sure there's no duplicate shared 
> > resources in `allocations` so the caller can call it without pruning out 
> > the redundant copies. However the `unallocated` counterpart doesn't do the 
> > same? Ideally usage of the pair of API should be symmetric.

Moved responsibility of pruning of already allocated shared resources out of 
the sorter and into the allocator.


- Anindya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45961/#review141305
-----------------------------------------------------------


On July 17, 2016, 6:28 p.m., Anindya Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45961/
> -----------------------------------------------------------
> 
> (Updated July 17, 2016, 6:28 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Joris Van Remoortere, Michael 
> Park, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-4431
>     https://issues.apache.org/jira/browse/MESOS-4431
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> o Each shared resource is accouted via its share count. This count is
>   updated based on the resource operations (such as add and subtract)
>   in various scenarios such as task launch and terminate at multiple
>   modules such as master, allocator, sorter, etc.
> o Only allow DESTROY if there are no running or pending tasks using
>   the volume. However, if the volume is in a pending offer to one or
>   more frameworks, resind that offer before processing the DESTROY.
> 
> 
> Diffs
> -----
> 
>   src/common/resources.cpp f6ff92b591c15bc8e93fd85e1896349c3a7bb968 
>   src/common/resources_utils.cpp 8e881a09500b0966b577e514077460f724868a8d 
>   src/master/allocator/mesos/hierarchical.hpp 
> b72ba16277a3210e4d309b616d185a10e2029a66 
>   src/master/allocator/mesos/hierarchical.cpp 
> 7d4064535a20b93950f5a95eef1ad3f0d37d305b 
>   src/master/allocator/sorter/drf/sorter.hpp 
> bc6bfb2d5d3b32d55be055a0514861b4e7d889bb 
>   src/master/allocator/sorter/drf/sorter.cpp 
> ac85b327fc33d34246788e6a8c8bf5a486c61434 
>   src/master/http.cpp d1fe22bd2d9eb6bd82e32cf01720bbe3d8be26d5 
>   src/master/master.hpp ac998b1f5b305a9bff9d9e5cd205a6c3481f9b38 
>   src/master/master.cpp 61eaa4a92741a2d1e3f6624c555a40f6b9240d90 
>   src/master/quota_handler.cpp bf6a613a7bb3c62fd77d1ffde3170749d6c21fa2 
>   src/master/validation.hpp 43d876b00511d4ef652c2685d4950c994b5eef31 
>   src/master/validation.cpp 50ba372d1301d16d1738a3e4f4882b51f9ce06cd 
>   src/master/weights_handler.cpp 9a901e71ba2fbf2ca1c02f658a72d44cfaa5ec62 
>   src/tests/master_validation_tests.cpp 
> 9eb82a569f7c48caa96d4d54a93b199ccac74385 
>   src/tests/sorter_tests.cpp b0e5ef8a55bbcca3553a221bd5691a9c801a04f7 
>   src/v1/resources.cpp 8c3f2d1c1529915a59d47fe37bb3fc7a3267079a 
> 
> Diff: https://reviews.apache.org/r/45961/diff/
> 
> 
> Testing
> -------
> 
> Tests successful.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>

Reply via email to