> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/common/resources.cpp, line 1773
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390645#file1390645line1773>
> >
> >     Make sure we add this to the tests.
> >     
> >     So now the shared persistent volume could look like `disk(alice, 
> > hdfs-p, {foo: bar, foo})[hadoop:/hdfs:/data:rw]<SHARED>:1<6>`
> >     
> >     Would `<SHARED>` be a little redundant given that we have `<6>`?

I think having the `<SHARED>` makes it clearer, so I would like to keep it 
while logging the `Resource` object. Test case 
`ResourcesTest.PrintingSharedResources` has been added in 
https://reviews.apache.org/r/45964/.


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/sorter/drf/sorter.hpp, line 171
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390648#file1390648line171>
> >
> >     Is `nonSharedScalarQuantities` a better name?

I would prefer not to rename this existing member variable, since that would 
require changes in a whole bunch of places in the sorter. I think a comment 
should suffice.


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/sorter/drf/sorter.hpp, line 181
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390648#file1390648line181>
> >
> >     I feel that we can find a better place/name for this field. 
> > `shareResources` doesn't say if it's the allocation or the total pool.
> >     
> >     I feel like it should be in `total_` as it reflects a state of the 
> > total pool. If we leave it here then perhaps give it a clearer name? What 
> > do you think?

I would rename it to `allocatedSharedResources`. I think putting it in `total_` 
would give an inaccurate meaning, since we are storing the allocated shared 
resources here (with n copies).


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 898-901
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390647#file1390647line898>
> >
> >     Can't quite understand this comment as to providing an answer for the 
> > use of `.nonShared()` on `resources`?

We always keep one copy of a specific shared resource in 
`slaves[slaveId].allocated`. Assume two offers go out to two different 
frameworks with the same shared resource. Say Framework 1 returns it, so we 
subtract the shared resource from `slaves[slaveId].allocated`, which removes it 
entirely (since its shared count is always 1 if present). When Framework 2 also 
returns it, the `CHECK(slaves[slaveId].allocated.contains(resources))` would 
fail. That is why we only check against non-shared resources.

However, I think the following would be a better approach. We can move to a 
model of keeping n copies of a shared resource in `slaves[slaveId].allocated` 
(where n is the number of frameworks that have been offered this shared 
resource). As a result, we can update the slave's allocated list without 
worrying about the shared count dropping to 0 for shared resources.
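A minimal, self-contained sketch of that bookkeeping (the `SharedAllocation` 
type and function names are illustrative, not the actual allocator code):

```
#include <map>
#include <string>

// Stand-in for the slave's allocation of shared resources:
// resource name -> number of frameworks currently holding a copy.
using SharedAllocation = std::map<std::string, int>;

// Called when a shared resource is included in an offer to one framework.
void offerShared(SharedAllocation& allocated, const std::string& name)
{
  ++allocated[name];
}

// Called when one framework returns its copy of the shared resource.
void recoverShared(SharedAllocation& allocated, const std::string& name)
{
  // The resource disappears from the slave's allocation only when no
  // framework holds a copy anymore, so a second return cannot underflow.
  if (--allocated[name] == 0) {
    allocated.erase(name);
  }
}
```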


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/mesos/hierarchical.cpp, line 1321
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390647#file1390647line1321>
> >
> >     If a framework explicitly turns down the shared resources we should not 
> > send them back again right?

If we check for shared resources in `isFiltered()`, we should be able to filter 
resources correctly when the shared resource being offered in the current offer 
cycle was returned by the framework in the previous offer cycle (i.e. the 
`OfferFilter` created in `recoverResources()` includes the shared resources).

However, suppose we offered the shared resource in a previous offer cycle and 
the framework launched a task using it in the `ACCEPT` call (or, alternatively, 
the framework returned the shared resource but already has one or more tasks 
from a previous accept cycle using it). In that case, the `OfferFilter` would 
not contain the shared resource. In the next offer cycle, the shared resource 
will be included in the call to `isFiltered()` (since the shared resource is 
always added), which makes `isFiltered()` return `false`, so the offer is sent 
to this framework every second (instead of after `refuse_seconds`). In other 
words, because the shared resource is always added to offers to applicable 
frameworks, a framework running a task on that shared resource would receive 
offers too frequently, which does not seem correct.
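To make the concern concrete, here is a simplified model of the filter check 
(hypothetical stand-in types, not the actual allocator code): an offer is 
suppressed only when everything being offered was declined, so a shared 
resource that is always re-added but was never part of the declined set 
defeats the filter.

```
#include <set>
#include <string>

// Stand-in for an OfferFilter built from the resources a framework declined.
struct DeclineFilter
{
  std::set<std::string> declined;

  // Returns true if the candidate offer should be withheld from the framework.
  bool isFiltered(const std::set<std::string>& offered) const
  {
    for (const std::string& name : offered) {
      if (declined.count(name) == 0) {
        return false;  // Something not declined is present: offer it again.
      }
    }
    return true;
  }
};
```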

On further discussion: since we send a shared resource only once per offer 
cycle, I think we are fine filtering on all resources, because only one 
framework gets the offer in each allocation cycle and the remaining ones do 
not.


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 1426-1428
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390647#file1390647line1426>
> >
> >     Thinking about this, we should discuss what the best behavior is for 
> > now:
> >     
> >     In the current design two offers from the same agent can already be 
> > created due to the two stage allocation algorithm. However the same 
> > resource is never sent in two offers.
> >     
> >     Here we have two options for shared resources for stage 2:
> >     
> >     1. Replenish 'available' with a (potential) 2nd copy of the same shared 
> > resources, or
> >     2. Don't add the 2nd copy if it's already in an offer in this cycle but 
> > DO add another copy if it's not in an offer.
> >     
> >     Eventually we may be sending the same shared resources to all eligible 
> > frameworks anyways but for now it feels that 2) is more in line with the 
> > current behavior: one resource in one offer in one offer cycle.
> >     
> >     Thoughts?

To be consistent with the fact that the same resource is not offered more than 
once in the same offer cycle, we will do the same for shared resources: if a 
shared resource has already been offered in this offer cycle, we will not offer 
it again to any framework in the same cycle.
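A minimal sketch of that per-cycle rule (names are illustrative): track the 
shared resources already handed out in the current allocation cycle and skip 
them for the remaining frameworks.

```
#include <set>
#include <string>

// Shared resources already included in an offer during this allocation cycle.
// Cleared at the start of every cycle.
std::set<std::string> offeredThisCycle;

// Returns true if the shared resource may still be offered in this cycle.
bool shouldOfferShared(const std::string& name)
{
  // insert() returns {iterator, false} when the name was already present.
  return offeredThisCycle.insert(name).second;
}
```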


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/sorter/drf/sorter.cpp, lines 448-449
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390649#file1390649line448>
> >
> >     Here the contrast between `total_.scalarQuantities` (includes shared 
> > resources) and `allocations[name].scalarQuantities` makes a good argument 
> > for renaming `allocations[name].scalarQuantities` as 
> > `allocations[name].nonSharedScalarQuantities` so it's clear that it's 
> > different from the way `total_.scalarQuantities` is calculated.

As in the previous comment, I would prefer not to rename this existing member 
variable, since that would require changes in a whole bunch of places in the 
sorter. Also, even if we do want to rename an existing member variable, it 
should be done in a separate commit (and not in the same commit as the shared 
resources functionality).


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, line 3917
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390652#file1390652line3917>
> >
> >     The interesting scenario: if the framework receives one of shared 
> > resource and launches 3 tasks each using it, here `_offeredResources` has 
> > one copy of it and `-= addTask` is going to take it out after the first 
> > task is launched.
> >     
> >     The next task in this sequence then fails the validation due to the 
> > resource in taskinfo not in `_offeredResources`.
> >     
> >     Should the next task fail?

I think we should allow multiple tasks to be launched in the same accept cycle. 
For validation, we will add back the view of shared resources from the original 
offer to ensure that validation of subsequent tasks in the same accept cycle 
succeeds.
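A hedged sketch of that validation idea (simplified stand-in types, not the 
actual master code): shared resources from the original offer stay visible for 
every task in the ACCEPT, while non-shared resources are consumed task by task.

```
#include <set>
#include <string>

// Simplified view of what remains offered during a single ACCEPT.
struct OfferedView
{
  std::set<std::string> nonShared;  // Consumed as tasks are launched.
  std::set<std::string> shared;     // Kept visible for every task.
};

// A task validates if each of its resources is still offered; shared
// resources are always checked against the original offer's shared set.
bool validateTask(const OfferedView& view,
                  const std::set<std::string>& taskResources)
{
  for (const std::string& name : taskResources) {
    if (view.nonShared.count(name) == 0 && view.shared.count(name) == 0) {
      return false;
    }
  }
  return true;
}
```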


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/master.hpp, line 295
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390651#file1390651line295>
> >
> >     Here `const Resource&` is used and we have lost the count which means 
> > updated -= resource may not remove the shared resource.

Shared resources extracted via `resources.shared()` always have a count of 1 
(since the offer always has at most one copy of the shared resource); 
otherwise the shared resource cannot be in `resources`.
So,
`updated -= resource`
essentially translates to:
`updated -= Resource_(resource)`

Since `resource.has_shared()` is `true`, the `Resource_(resource)` morphs into 
a `Resource_` object with a `sharedCount` of 1. So we should be OK here.
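A greatly simplified illustration of that point (not the actual `Resource_` 
class): wrapping a single shared resource yields a share count of 1, so one 
subtraction removes exactly one copy.

```
#include <cassert>
#include <string>

// Stand-in for a shared resource entry that carries a share count.
struct SharedEntry
{
  std::string name;
  int sharedCount;
};

int main()
{
  // A single shared Resource always wraps to a share count of 1.
  SharedEntry updated{"hdfs-volume", 1};

  // Conceptually `updated -= resource`: subtract one copy.
  updated.sharedCount -= 1;

  assert(updated.sharedCount == 0);  // The shared resource is now removed.
  return 0;
}
```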


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, line 3960
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390652#file1390652line3960>
> >
> >     Not just `used` and not just task launches in previous cycles right?
> >     
> >     If the Accept has three copies of the shared resources and two tasks 
> > use them. The remaining one cannot be recovered due to tasks launched now.

We now send only one copy of a specific shared resource across multiple offers 
in a single offer cycle, so the Accept can have at most a single copy of any 
shared resource and this condition would not arise.


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/master.hpp, line 289
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390651#file1390651line289>
> >
> >     Multiple frameworks can use the same shared resource so passing in one 
> > frameworkId is not sufficient?

I think it is sufficient (and not all frameworks are needed) since we want to 
update the allocations at the slave level and framework level based on this 
framework. To address this issue, we now maintain n copies of a shared resource 
in the slave's allocated list (but only one copy in the role's/framework's 
allocated list). So when a framework returns a shared resource in an offer 
cycle and that shared resource is not used in any task launched by that 
specific framework, we subtract those shared resources from the slave's 
allocated list as well as from the framework's allocated list.
Since the slave's allocated list contains n copies (if the same shared resource 
is being offered to n frameworks), it now holds n-1 copies; and since the 
framework always has one copy, that shared resource is removed from the 
framework's list of allocated resources.


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/master.hpp, line 288
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390651#file1390651line288>
> >
> >     Besides used resources, what about offered resources and resources in 
> > pending tasks? Besides, the method name sounded like it's filtering things 
> > in but from the context in `_accept()` it's filtering things out...

Modified the name of the method to `recoverable()`.


> On June 14, 2016, 10:50 p.m., Jiang Yan Xu wrote:
> > src/master/master.hpp, lines 292-302
> > <https://reviews.apache.org/r/45961/diff/3/?file=1390651#file1390651line292>
> >
> >     How about:
> >     
> >     ```
> >     // Return the subset of the specified resources on this agent that can
> >     // be recovered back to the allocator.
> >     // The criteria for recoverable resources:
> >     // - Non-shared resources.
> >     // - Shared resources that do not also exist in 'totalUsedResources'
> >     //   or 'offeredResources'.
> >     Resources recoverable(const Resources& resources)
> >     {
> >       Resources recoverable = resources.nonShared();
> >       foreach (const Resource& resource, resources.shared()) {
> >         if (!totalUsedResources.contains(resource) &&
> >             !offeredResources.contains(resource)) {
> >           recoverable += resource;
> >         }
> >       }
> >       
> >       return recoverable;
> >     }
> >     ```
> >     
> >     It looks like we'll have to maintain a `totalUsedResources` in `Slave` 
> > though. 
> >     
> >     Another issue is that: when the task is being authorized, its resources 
> > are not in either `usedResources` or `offeredResources`. We have to address 
> > this too...

This method should act on this specific framework only. For example, if this 
framework returned the shared resource but some other framework has launched a 
task using it (or has been offered it), we should not remove this shared 
resource from the resources passed to `recoverResources()`; otherwise the 
framework's allocated list would continue to hold this shared resource, which 
is incorrect since its DRF share would then be inaccurate.

As in the previous comment, we would maintain n copies in the slave's allocated 
list, and `recoverable()` acts on this specific framework only, checking 
whether any of its launched tasks use this shared resource. Based on that, we 
recover this resource from the slave's allocated list (which goes down to n-1 
copies) and from the framework's allocated list (where it is removed).
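For illustration, a hedged variation of the sketch quoted above, scoped to a 
single framework as described (member names such as `usedResources` are 
assumptions about the surrounding `Slave` struct, not necessarily the actual 
fields):

```
// Return the subset of 'resources' that can be recovered from this framework:
// all non-shared resources, plus any shared resource that this framework has
// no running task using.
Resources recoverable(
    const FrameworkID& frameworkId,
    const Resources& resources)
{
  Resources result = resources.nonShared();

  foreach (const Resource& resource, resources.shared()) {
    if (!usedResources[frameworkId].contains(resource)) {
      result += resource;
    }
  }

  return result;
}
```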


- Anindya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45961/#review136970
-----------------------------------------------------------


On June 23, 2016, 1:27 a.m., Anindya Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45961/
> -----------------------------------------------------------
> 
> (Updated June 23, 2016, 1:27 a.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-4431
>     https://issues.apache.org/jira/browse/MESOS-4431
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Each shared resource is accounted for via its share count. This count is
> updated based on the resource operations (such as add and subtract)
> in various scenarios such as task launch and terminate at multiple
> modules such as master, allocator, sorter, etc.
> Only allow DESTROY if no task is using the volume, or it has not been
> offered to any framework.
> Since shared resources are available to multiple frameworks of the
> same role simultaneously, we normalize it with a weight equivalent
> to the number of frameworks to which this shared resource has been
> allocated in the sorter.
> 
> 
> Diffs
> -----
> 
>   src/common/resources.cpp f6ff92b591c15bc8e93fd85e1896349c3a7bb968 
>   src/common/resources_utils.cpp 8e881a09500b0966b577e514077460f724868a8d 
>   src/master/allocator/mesos/hierarchical.cpp 
> 5b2331b3f99252fd16f2514123006dd4ba7f1c0d 
>   src/master/allocator/sorter/drf/sorter.hpp 
> 35273b5dcf39938125a112c5e56b5a8a542157db 
>   src/master/allocator/sorter/drf/sorter.cpp 
> 27d56f274c41bbabe6f2abbbcebd2c4eff52693e 
>   src/master/http.cpp 7daaf12a4086635bbc5aba5e3375c95e8899ac6e 
>   src/master/master.hpp fe57878dc59637459d5c5cdae0be2aa159133fa4 
>   src/master/master.cpp 8def7156f4a05b39590321ce7743f7385a68bed0 
>   src/master/validation.hpp 43d876b00511d4ef652c2685d4950c994b5eef31 
>   src/master/validation.cpp 9120b71fc7725bdf7094aac6619d8aadcc352df5 
>   src/tests/master_validation_tests.cpp 
> 9eb82a569f7c48caa96d4d54a93b199ccac74385 
>   src/tests/sorter_tests.cpp 6fc58829892dc0223140f1b47593a3e5853cace5 
>   src/v1/resources.cpp 8c3f2d1c1529915a59d47fe37bb3fc7a3267079a 
> 
> Diff: https://reviews.apache.org/r/45961/diff/
> 
> 
> Testing
> -------
> 
> Tests successful.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>
