Re: Review Request 58486: Update the allocator on a per offer operation.

2017-05-17 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58486/#review175297
---



Patch looks great!

Reviews applied: [59194, 59195, 58486]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On May 17, 2017, 4:50 p.m., Anindya Sinha wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58486/
> ---
> 
> (Updated May 17, 2017, 4:50 p.m.)
> 
> 
> Review request for mesos, James Peach and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7308
> https://issues.apache.org/jira/browse/MESOS-7308
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The allocator is updated for each offer operation in the order specified
> in the `ACCEPT` call. Once the allocator is updated successfully, we
> handle subsequent operations as follows:
> 
> 1) Launch or Launch Group: We launch the tasks or task group on the
>agent.
> 2) (Un)reservation or Create/Destroy of Persistent Volumes: We send
>the `CheckPointResourcesMessage` to the agent.
> 
> If allocation for any of the operations fail, those resources are not
> recovered from the allocator.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 89d0790fd5fea59e74276f462581fe0073594732 
>   src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 
> 
> 
> Diff: https://reviews.apache.org/r/58486/diff/8/
> 
> 
> Testing
> ---
> 
> Tests passed.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>



Re: Review Request 58486: Update the allocator on a per offer operation.

2017-05-17 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58486/
---

(Updated May 17, 2017, 4:50 p.m.)


Review request for mesos, James Peach and Jiang Yan Xu.


Changes
---

Do not recalculate resources to be recovered if there is no error in 
`updateAllocation()`.


Bugs: MESOS-7308
https://issues.apache.org/jira/browse/MESOS-7308


Repository: mesos


Description
---

The allocator is updated for each offer operation in the order specified
in the `ACCEPT` call. Once the allocator is updated successfully, we
handle subsequent operations as follows:

1) Launch or Launch Group: We launch the tasks or task group on the
   agent.
2) (Un)reservation or Create/Destroy of Persistent Volumes: We send
   the `CheckPointResourcesMessage` to the agent.

If allocation for any of the operations fail, those resources are not
recovered from the allocator.


Diffs (updated)
-

  src/master/master.hpp 89d0790fd5fea59e74276f462581fe0073594732 
  src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 


Diff: https://reviews.apache.org/r/58486/diff/8/

Changes: https://reviews.apache.org/r/58486/diff/7-8/


Testing
---

Tests passed.


Thanks,

Anindya Sinha



Re: Review Request 58486: Update the allocator on a per offer operation.

2017-05-17 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58486/#review175227
---



Patch looks great!

Reviews applied: [59194, 59195, 58486]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On May 17, 2017, 2 a.m., Anindya Sinha wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58486/
> ---
> 
> (Updated May 17, 2017, 2 a.m.)
> 
> 
> Review request for mesos, James Peach and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7308
> https://issues.apache.org/jira/browse/MESOS-7308
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The allocator is updated for each offer operation in the order specified
> in the `ACCEPT` call. Once the allocator is updated successfully, we
> handle subsequent operations as follows:
> 
> 1) Launch or Launch Group: We launch the tasks or task group on the
>agent.
> 2) (Un)reservation or Create/Destroy of Persistent Volumes: We send
>the `CheckPointResourcesMessage` to the agent.
> 
> If allocation for any of the operations fail, those resources are not
> recovered from the allocator.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 89d0790fd5fea59e74276f462581fe0073594732 
>   src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 
> 
> 
> Diff: https://reviews.apache.org/r/58486/diff/7/
> 
> 
> Testing
> ---
> 
> Tests passed.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>



Re: Review Request 58486: Update the allocator on a per offer operation.

2017-05-16 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58486/
---

(Updated May 17, 2017, 2 a.m.)


Review request for mesos, James Peach and Jiang Yan Xu.


Changes
---

Validate that the valid operations can be processed in `Master::__accept()`.


Bugs: MESOS-7308
https://issues.apache.org/jira/browse/MESOS-7308


Repository: mesos


Description
---

The allocator is updated for each offer operation in the order specified
in the `ACCEPT` call. Once the allocator is updated successfully, we
handle subsequent operations as follows:

1) Launch or Launch Group: We launch the tasks or task group on the
   agent.
2) (Un)reservation or Create/Destroy of Persistent Volumes: We send
   the `CheckPointResourcesMessage` to the agent.

If allocation for any of the operations fail, those resources are not
recovered from the allocator.


Diffs (updated)
-

  src/master/master.hpp 89d0790fd5fea59e74276f462581fe0073594732 
  src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 


Diff: https://reviews.apache.org/r/58486/diff/7/

Changes: https://reviews.apache.org/r/58486/diff/6-7/


Testing
---

Tests passed.


Thanks,

Anindya Sinha



Re: Review Request 58486: Update the allocator on a per offer operation.

2017-05-15 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58486/#review175055
---



Patch looks great!

Reviews applied: [59194, 59195, 58486]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On May 15, 2017, 6:01 p.m., Anindya Sinha wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58486/
> ---
> 
> (Updated May 15, 2017, 6:01 p.m.)
> 
> 
> Review request for mesos, James Peach and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7308
> https://issues.apache.org/jira/browse/MESOS-7308
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The allocator is updated for each offer operation in the order specified
> in the `ACCEPT` call. Once the allocator is updated successfully, we
> handle subsequent operations as follows:
> 
> 1) Launch or Launch Group: We launch the tasks or task group on the
>agent.
> 2) (Un)reservation or Create/Destroy of Persistent Volumes: We send
>the `CheckPointResourcesMessage` to the agent.
> 
> If allocation for any of the operations fail, those resources are not
> recovered from the allocator.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 89d0790fd5fea59e74276f462581fe0073594732 
>   src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 
> 
> 
> Diff: https://reviews.apache.org/r/58486/diff/6/
> 
> 
> Testing
> ---
> 
> Tests passed.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>



Re: Review Request 58486: Update the allocator on a per offer operation.

2017-05-15 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58486/
---

(Updated May 15, 2017, 6:01 p.m.)


Review request for mesos, James Peach and Jiang Yan Xu.


Changes
---

Rebased.


Bugs: MESOS-7308
https://issues.apache.org/jira/browse/MESOS-7308


Repository: mesos


Description
---

The allocator is updated for each offer operation in the order specified
in the `ACCEPT` call. Once the allocator is updated successfully, we
handle subsequent operations as follows:

1) Launch or Launch Group: We launch the tasks or task group on the
   agent.
2) (Un)reservation or Create/Destroy of Persistent Volumes: We send
   the `CheckPointResourcesMessage` to the agent.

If allocation for any of the operations fail, those resources are not
recovered from the allocator.


Diffs (updated)
-

  src/master/master.hpp 89d0790fd5fea59e74276f462581fe0073594732 
  src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 


Diff: https://reviews.apache.org/r/58486/diff/6/

Changes: https://reviews.apache.org/r/58486/diff/5-6/


Testing
---

Tests passed.


Thanks,

Anindya Sinha



Re: Review Request 58486: Update the allocator on a per offer operation.

2017-05-13 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58486/#review174892
---



Patch looks great!

Reviews applied: [59194, 59195, 58486]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On May 13, 2017, 4:51 a.m., Anindya Sinha wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58486/
> ---
> 
> (Updated May 13, 2017, 4:51 a.m.)
> 
> 
> Review request for mesos, James Peach and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7308
> https://issues.apache.org/jira/browse/MESOS-7308
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The allocator is updated for each offer operation in the order specified
> in the `ACCEPT` call. Once the allocator is updated successfully, we
> handle subsequent operations as follows:
> 
> 1) Launch or Launch Group: We launch the tasks or task group on the
>agent.
> 2) (Un)reservation or Create/Destroy of Persistent Volumes: We send
>the `CheckPointResourcesMessage` to the agent.
> 
> If allocation for any of the operations fail, those resources are not
> recovered from the allocator.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 89d0790fd5fea59e74276f462581fe0073594732 
>   src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 
> 
> 
> Diff: https://reviews.apache.org/r/58486/diff/5/
> 
> 
> Testing
> ---
> 
> Tests passed.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>



Re: Review Request 58486: Update the allocator on a per offer operation.

2017-05-12 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58486/
---

(Updated May 13, 2017, 4:51 a.m.)


Review request for mesos, James Peach and Jiang Yan Xu.


Changes
---

Rebased.


Bugs: MESOS-7308
https://issues.apache.org/jira/browse/MESOS-7308


Repository: mesos


Description
---

The allocator is updated for each offer operation in the order specified
in the `ACCEPT` call. Once the allocator is updated successfully, we
handle subsequent operations as follows:

1) Launch or Launch Group: We launch the tasks or task group on the
   agent.
2) (Un)reservation or Create/Destroy of Persistent Volumes: We send
   the `CheckPointResourcesMessage` to the agent.

If allocation for any of the operations fail, those resources are not
recovered from the allocator.


Diffs (updated)
-

  src/master/master.hpp 89d0790fd5fea59e74276f462581fe0073594732 
  src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 


Diff: https://reviews.apache.org/r/58486/diff/5/

Changes: https://reviews.apache.org/r/58486/diff/4-5/


Testing
---

Tests passed.


Thanks,

Anindya Sinha



Re: Review Request 58486: Update the allocator on a per offer operation.

2017-05-12 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58486/#review174780
---



Patch looks great!

Reviews applied: [59194, 59195, 58486]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On May 11, 2017, 8:21 p.m., Anindya Sinha wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58486/
> ---
> 
> (Updated May 11, 2017, 8:21 p.m.)
> 
> 
> Review request for mesos, James Peach and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7308
> https://issues.apache.org/jira/browse/MESOS-7308
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The allocator is updated for each offer operation in the order specified
> in the `ACCEPT` call. Once the allocator is updated successfully, we
> handle subsequent operations as follows:
> 
> 1) Launch or Launch Group: We launch the tasks or task group on the
>agent.
> 2) (Un)reservation or Create/Destroy of Persistent Volumes: We send
>the `CheckPointResourcesMessage` to the agent.
> 
> If allocation for any of the operations fail, those resources are not
> recovered from the allocator.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 89d0790fd5fea59e74276f462581fe0073594732 
>   src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 
> 
> 
> Diff: https://reviews.apache.org/r/58486/diff/4/
> 
> 
> Testing
> ---
> 
> Tests passed.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>



Re: Review Request 58486: Update the allocator on a per offer operation.

2017-05-11 Thread Anindya Sinha


> On April 20, 2017, 6:17 p.m., Jiang Yan Xu wrote:
> > I feel the following may be more clear and general at high level.
> > 
> > - In `_accept()` all operations are validated, for resource operations we 
> > call `updateAllocation` for each individual operation [like we used to 
> > do](https://github.com/apache/mesos/blob/e7fcc4fbc09122480c8f2549a8af099769e888c0/src/master/master.cpp#L3770).
> >  
> > - We fail `updateAllocate` if the operation cannot be applied but leave the 
> > allocator in a good state.
> > - Individual resource operations are sent to the agent only when 
> > `updateAllocate` succeeds, otherwise the operation is dropped with no bad 
> > side effects.
> > 
> > It'll be also great if we can come up with a test?

The overall approach is as follows (based on your comments and F2F discussions):
1) `Resources::apply()` fails if a `DESTROY` is applied on a `Resources` object 
with shared count != 1 (https://reviews.apache.org/r/59194/, 
https://issues.apache.org/jira/browse/MESOS-7403).
2) `updateAllocation()` returns a failed future if any operation cannot be 
applied to `slave.allocated` (https://reviews.apache.org/r/59195/).
3) All valid operations received as a part of `ACCEPT` call 
`updateAllocation()` individually. Since `updateAllocation()` now returns a 
`Future`, we proceed to the next step in processing of `ACCEPT` once 
we get back the futures from `updateAllocation()` for each of the valid 
operations.
3a) For each of the non-failed futures (from step 3):
i) For `LAUNCH` and `LAUNCH_GROUP`: We send the `RunTaskMessage` or 
`RunTaskGroupMessage` to the agent.
ii) For all other operations, we send the `CheckpointedResourcesMessage` to 
the agent.
3b) For each of failed futures, we drop that operation.
4) Recover resources as applicable based on the successful allocation updates 
for the valid operations.

I will add tests in subsequent review in this chain.


> On April 20, 2017, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/mesos/hierarchical.cpp
> > Lines 746-749 (original), 746-749 (patched)
> > 
> >
> > Below here we can use the same mechanism as done in `updateAvailable`:
> > 
> > ```
> > // In addition to asserting the operation can be applied to the 
> > offered
> > // resources above, here we also check if it can be applied to the 
> > agent's
> > // current allocated resources. We do this because certain 
> > operations
> > // require that they are applicable to the agent's allocated 
> > resources
> > // as a whole. One such example is that a shared persistent volume 
> > can
> > // only be destroyed by a framework when it is not allocated to 
> > other
> > // frameworks (see `Resources::apply` for details). The master 
> > already does
> > // best-effort validation but this check may still fail due to 
> > additional   
> > // `allocate` runs. In a race condition an `allocate` could have 
> > been
> > // enqueued by the allocator itself just before master's request to
> > // enqueue `updateAllocation` arrives to the allocator. This is 
> > similar
> > // to the siutation in `updateAvailable`, see the diagram there.
> > Try updatedAllocated = slave.allocated.apply(operation);
> > if (updatedAllocated.isError()) {
> >   return Failure(updatedAllocated.error());
> > }
> > ```
> > 
> > We need to comment clearly on why we need this. Above I tried to state 
> > the general idea but also gave an concrete example but this is just a first 
> > draft.
> > 
> > 
> > For this to work we need to make sure `Resources::apply` fails in such 
> > a case: https://issues.apache.org/jira/browse/MESOS-7403
> > 
> > But now the failure would occur in the right place in this method (not 
> > applicable to the current allocated resources) and the error message 
> > returned by `Resources::apply` would be able to explain what exactly 
> > happened.

Fixed in https://reviews.apache.org/r/59195/.


> On April 20, 2017, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/allocator/mesos/hierarchical.cpp
> > Lines 848-851 (original)
> > 
> >
> > This being at the bottom of the method was intended to validate the 
> > post-condition. If this doesn't hold, ideally we shouldn't capture it here 
> > because we don't know how it has failed. We should capture the failure at 
> > the individual step and construct an error message/log that's explicit 
> > about it.

Updated in https://reviews.apache.org/r/59195/.


> On April 20, 2017, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp
> > Lines 4774-4775 (patched)
> > 
> >
> > So if the failure has occurred, the allocation in the a

Re: Review Request 58486: Update the allocator on a per offer operation.

2017-05-11 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58486/
---

(Updated May 11, 2017, 8:21 p.m.)


Review request for mesos, James Peach and Jiang Yan Xu.


Summary (updated)
-

Update the allocator on a per offer operation.


Bugs: MESOS-7308
https://issues.apache.org/jira/browse/MESOS-7308


Repository: mesos


Description (updated)
---

The allocator is updated for each offer operation in the order specified
in the `ACCEPT` call. Once the allocator is updated successfully, we
handle subsequent operations as follows:

1) Launch or Launch Group: We launch the tasks or task group on the
   agent.
2) (Un)reservation or Create/Destroy of Persistent Volumes: We send
   the `CheckPointResourcesMessage` to the agent.

If allocation for any of the operations fail, those resources are not
recovered from the allocator.


Diffs (updated)
-

  src/master/master.hpp 89d0790fd5fea59e74276f462581fe0073594732 
  src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 


Diff: https://reviews.apache.org/r/58486/diff/4/

Changes: https://reviews.apache.org/r/58486/diff/3-4/


Testing
---

Tests passed.


Thanks,

Anindya Sinha