Review Request 50117: Added more elapse time for allocator benchmark test.

2016-07-17 Thread Guangya Liu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50117/
---

Review request for mesos, Benjamin Mahler, Klaus Ma, and Jiang Yan Xu.


Repository: mesos


Description
---

This patch is adding elapse time for addFramework, addSlave
and setQuota in allocator benchmark test, adding those elapse
time can help evaluate the performnace of those APIs.


Diffs
-

  src/tests/hierarchical_allocator_tests.cpp 
ce5da6be490b6fce311286eb4018c91eef55067e 

Diff: https://reviews.apache.org/r/50117/diff/


Testing
---

make
make check

DeclineOffers benchmark
```
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1
Using 1000 agents and 50 frameworks
Added 50 frameworks in 14779us
Added 1000 agents in 2.7833secs
round 0 allocate() took 2.325974secs to make 1000 offers
round 1 allocate() took 2.349536secs to make 1000 offers
round 2 allocate() took 2.368871secs to make 1000 offers
round 3 allocate() took 2.584638secs to make 1000 offers
```

ResourceLabels benchmark
```
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/1
Using 1000 agents and 50 frameworks
Added 50 frameworks in 12574us
Added 1000 agents in 3.37232secs
round 0 allocate() took 2.459538secs to make 1000 offers
round 1 allocate() took 2.474587secs to make 1000 offers
```

Mtrics benchmark
```
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.Metrics/1
Set quota for 50 roles in 64060us
Added 50 frameworks in 97191us
Added 1000 agents in 9.80secs
/metrics/snapshot took 78252us for 1000 agents and 50 frameworks
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.Metrics/1 (10055 ms)
[--] 1 test from 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (10055 ms total)

[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (10079 ms total)
[  PASSED  ] 1 test.
```


Thanks,

Guangya Liu



Re: Review Request 50115: Used `Clock::settle()` to check if the operations are processed.

2016-07-17 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50115/#review142533
---



Patch looks great!

Reviews applied: [49943, 49781, 50060, 50061, 50062, 50115]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On July 18, 2016, 3:32 a.m., Guangya Liu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50115/
> ---
> 
> (Updated July 18, 2016, 3:32 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Jie Yu, Klaus Ma, and Jiang Yan Xu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently, in HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave,
> we are using `sleep` to check if the operations are processed, this
> is not accurate, we should use `Clock::settle()` instead.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_tests.cpp 
> ce5da6be490b6fce311286eb4018c91eef55067e 
> 
> Diff: https://reviews.apache.org/r/50115/diff/
> 
> 
> Testing
> ---
> 
> make
> make check
> 
> ```
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1
> Using 1000 agents and 50 frameworks
> Added 50 frameworks in 14624us
> Added 1000 agents in 2.253886secs
> Updated 1000 agents in 1.940139secs
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1
>  (4493 ms)
> [--] 1 test from 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (4495 ms total)
> 
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (4520 ms total)
> [  PASSED  ] 1 test.
> ```
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>



Review Request 50115: Used `Clock::settle()` to check if the operations are processed.

2016-07-17 Thread Guangya Liu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50115/
---

Review request for mesos, Benjamin Mahler, Jie Yu, Klaus Ma, and Jiang Yan Xu.


Repository: mesos


Description
---

Currently, in HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave,
we are using `sleep` to check if the operations are processed, this
is not accurate, we should use `Clock::settle()` instead.


Diffs
-

  src/tests/hierarchical_allocator_tests.cpp 
ce5da6be490b6fce311286eb4018c91eef55067e 

Diff: https://reviews.apache.org/r/50115/diff/


Testing
---

make
make check

```
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1
Using 1000 agents and 50 frameworks
Added 50 frameworks in 14624us
Added 1000 agents in 2.253886secs
Updated 1000 agents in 1.940139secs
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1 
(4493 ms)
[--] 1 test from 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (4495 ms total)

[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (4520 ms total)
[  PASSED  ] 1 test.
```


Thanks,

Guangya Liu



Re: Review Request 49819: Implemented `CgroupsIsolatorProcess::prepare`.

2016-07-17 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49819/#review142528
---




src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 307)


I think here you need to call `onAny()` rather than `then()`. You can take 
a look at the following code as a reference:

https://github.com/apache/mesos/blob/1.0.0-rc2/src/linux/cgroups.cpp#L1648:L1649


- Qian Zhang


On July 15, 2016, 11:32 a.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49819/
> ---
> 
> (Updated July 15, 2016, 11:32 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5041
> https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented `CgroupsIsolatorProcess::prepare`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
> c57baec88437f68886702a40ec8a6a6458546119 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 4a9f55bf3b217405bf90943f27a976422877a99e 
> 
> Diff: https://reviews.apache.org/r/49819/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 49827: Implemented `CgroupsIsolatorProcess::cleanup`.

2016-07-17 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49827/#review142519
---




src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp (lines 134 - 138)


Why are they virtual methods?



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (lines 581 - 585)


I think you need to call `onAny()` rather than `then()`, so these codes 
should be changed to:
```
  return collect(cleanups)
.onAny(defer(
PID(this),
::_cleanup,
containerId,
lambda::_1));
```
You can take a look at the following code as a reference:

https://github.com/apache/mesos/blob/1.0.0-rc2/src/linux/cgroups.cpp#L1648:L1649



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 593)


I think you need to check `!isReady()` rather than `isFailed()`.



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (lines 615 - 617)


Can we merge these 3 lines into the code below?
```
.then([]() { return Nothing(); });
```



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (lines 632 - 635)


I think these code should be formated to:
```
return Failure(
"Failed to clean up container '" + stringify(containerId) +
"': " + (future.isFailed() ? future.failure() : "discarded"));
```


- Qian Zhang


On July 15, 2016, 11:33 a.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49827/
> ---
> 
> (Updated July 15, 2016, 11:33 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5041
> https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented `CgroupsIsolatorProcess::cleanup`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
> c57baec88437f68886702a40ec8a6a6458546119 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 4a9f55bf3b217405bf90943f27a976422877a99e 
> 
> Diff: https://reviews.apache.org/r/49827/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 50114: Added missing tables attributes in docs.

2016-07-17 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50114/#review142526
---



Patch looks great!

Reviews applied: [50114]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On July 17, 2016, 9:26 p.m., Tomasz Janiszewski wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50114/
> ---
> 
> (Updated July 17, 2016, 9:26 p.m.)
> 
> 
> Review request for mesos, Alexander Rojas, Kapil Arya, and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Most of the tables presented in docs uses
> `class="table table-striped"`. To make tables
> looks consitent I added this attribute where
> it was missing on modules and authorization.
> 
> 
> Diffs
> -
> 
>   docs/authorization.md fc027ac643906451513e50d166b51c147b4a9d99 
>   docs/modules.md 60191566cf3580792046ba21a6b9a66a21b18e21 
> 
> Diff: https://reviews.apache.org/r/50114/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Tomasz Janiszewski
> 
>



Re: Review Request 49844: Fixed a subscriber FD leak when running tests.

2016-07-17 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49844/#review142525
---



Patch looks great!

Reviews applied: [49837, 49844]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On July 17, 2016, 9:15 p.m., Anand Mazumdar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49844/
> ---
> 
> (Updated July 17, 2016, 9:15 p.m.)
> 
> 
> Review request for mesos, Vinod Kone and Zhitao Li.
> 
> 
> Bugs: MESOS-5812
> https://issues.apache.org/jira/browse/MESOS-5812
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This FD leak would only surface when running tests. We hold on to
> a reference of the `Connection` object in the client so that it is
> not destroyed while the connection is active. When running tests,
> the IP:Port of libprocess remain the same which means the objects
> keep on accumulating. In a real world cluster, we remove the
> subscriber upon noticing a _disconnection_ i.e. this means the
> socket has already been already closed upstream by Libprocess on
> the server side.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp ac998b1f5b305a9bff9d9e5cd205a6c3481f9b38 
>   src/master/master.cpp 79e3d78ba45060bc2f2532fdc3d105d1cc888d0f 
> 
> Diff: https://reviews.apache.org/r/49844/diff/
> 
> 
> Testing
> ---
> 
> make check (gtest_repeat=1000) no FD leaks
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>



Review Request 50114: Added missing tables attributes in docs.

2016-07-17 Thread Tomasz Janiszewski

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50114/
---

Review request for mesos, Alexander Rojas, Kapil Arya, and Vinod Kone.


Repository: mesos


Description
---

Most of the tables presented in docs uses
`class="table table-striped"`. To make tables
looks consitent I added this attribute where
it was missing on modules and authorization.


Diffs
-

  docs/authorization.md fc027ac643906451513e50d166b51c147b4a9d99 
  docs/modules.md 60191566cf3580792046ba21a6b9a66a21b18e21 

Diff: https://reviews.apache.org/r/50114/diff/


Testing
---


Thanks,

Tomasz Janiszewski



Re: Review Request 49844: Fixed a subscriber FD leak when running tests.

2016-07-17 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49844/
---

(Updated July 17, 2016, 9:15 p.m.)


Review request for mesos, Vinod Kone and Zhitao Li.


Changes
---

Added a comment around making `HttpConnection` a RAII class. NNFR


Bugs: MESOS-5812
https://issues.apache.org/jira/browse/MESOS-5812


Repository: mesos


Description (updated)
---

This FD leak would only surface when running tests. We hold on to
a reference of the `Connection` object in the client so that it is
not destroyed while the connection is active. When running tests,
the IP:Port of libprocess remain the same which means the objects
keep on accumulating. In a real world cluster, we remove the
subscriber upon noticing a _disconnection_ i.e. this means the
socket has already been already closed upstream by Libprocess on
the server side.


Diffs (updated)
-

  src/master/master.hpp ac998b1f5b305a9bff9d9e5cd205a6c3481f9b38 
  src/master/master.cpp 79e3d78ba45060bc2f2532fdc3d105d1cc888d0f 

Diff: https://reviews.apache.org/r/49844/diff/


Testing
---

make check (gtest_repeat=1000) no FD leaks


Thanks,

Anand Mazumdar



Re: Review Request 45959: Support arithmetic operations for shared resources with consumer counts.

2016-07-17 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45959/
---

(Updated July 17, 2016, 8:44 p.m.)


Review request for mesos, Benjamin Mahler, Joris Van Remoortere, and Jiang Yan 
Xu.


Bugs: MESOS-4892
https://issues.apache.org/jira/browse/MESOS-4892


Repository: mesos


Description
---

A new class Resource_ is added that allows 'Resources' to group
identical shared resource objects together into a single 'Resource_'
object and tracked by its shared count. Non-shared resource objects
are not grouped.

For resource addition and subtraction, the shared count is adjusted for
shared resources as follows:
a) Addition: If shared resource is absent from original, then the
   resource is added initialized with a consumer count of 1. Otherwise,
   the share count for the shared resource is incremented.
b) Subtraction: If shared resource's share count is already 1, then
   the shared resource is removed from the original. Otherwise, its
   consumer count is decremented.

Note that v1 changes for shared resources are in the next commit.


Diffs
-

  include/mesos/resources.hpp a557e97c65194d4aad879fb88d8edefd1c95b8d8 
  src/common/resources.cpp f6ff92b591c15bc8e93fd85e1896349c3a7bb968 
  src/master/validation.cpp 50ba372d1301d16d1738a3e4f4882b51f9ce06cd 
  src/tests/mesos.hpp e4eccfc3810bed3649a3ab80e252849470de4c72 
  src/tests/resources_tests.cpp dc12bd8f1e2da6972bc8aed598811c55d664036e 

Diff: https://reviews.apache.org/r/45959/diff/


Testing
---

New tests added to demonstrate arithmetic operations for shared resources with 
consumer counts.
Tests successful.


Thanks,

Anindya Sinha



Re: Review Request 49571: Added a benchmark test for allocations and shared resources.

2016-07-17 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49571/#review142523
---



Bad patch!

Reviews applied: [49571, 45967, 45966, 45964, 45963, 45962, 45961, 45960, 
48616, 45959, 45958]

Failed command: ./support/apply-review.sh -n -r 45959

Error:
2016-07-17 19:20:09 URL:https://reviews.apache.org/r/45959/diff/raw/ 
[34218/34218] -> "45959.patch" [1]
error: patch failed: src/tests/resources_tests.cpp:2405
error: src/tests/resources_tests.cpp: patch does not apply

Full log: https://builds.apache.org/job/mesos-reviewbot/14361/console

- Mesos ReviewBot


On July 17, 2016, 6:44 p.m., Anindya Sinha wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49571/
> ---
> 
> (Updated July 17, 2016, 6:44 p.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-5771
> https://issues.apache.org/jira/browse/MESOS-5771
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Allocations test has the following configurations:
> (1) REGULAR: Offers from every slave have regular resources.
> (2) SHARED: Offers from every slave include a shared resource.
> (3) REGULAR: Offers from every alternate slave contain only regular
> resources; and offers from every other alternate slave contains
> a shared resource.
> 
> Added a case for testing arithmetic operations on shared resources.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_tests.cpp 
> ce5da6be490b6fce311286eb4018c91eef55067e 
>   src/tests/resources_tests.cpp 40d290ac540d26373c5fb7c2a93d27d1aa61d722 
> 
> Diff: https://reviews.apache.org/r/49571/diff/
> 
> 
> Testing
> ---
> 
> All tests passed.
> 
> Resource arithmetic benchmark test results
> ==
> Minimal impact seen in Resources arithmetic with the Resources refactor 
> changes to incorporate shared resources.
> 
> With shared resources patch (note that 4th test below is for shared resources 
> for scalars)
> --
> 
> [--] 4 tests from ResourcesOperators/Resources_BENCHMARK_Test
> [ RUN  ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/0
> [   OK ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/0 (980 ms)
> [ RUN  ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/1
> [   OK ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/1 (17128 
> ms)
> [ RUN  ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2
> [   OK ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2 (9217 
> ms)
> [ RUN  ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/3
> [   OK ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/3 (1100 
> ms)
> [--] 4 tests from ResourcesOperators/Resources_BENCHMARK_Test (28425 
> ms total)
> 
> HEAD
> 
> 
> [--] 3 tests from ResourcesOperators/Resources_BENCHMARK_Test
> [ RUN  ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/0
> [   OK ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/0 (866 ms)
> [ RUN  ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/1
> [   OK ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/1 (17563 
> ms)
> [ RUN  ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2
> [   OK ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2 (9218 
> ms)
> [--] 3 tests from ResourcesOperators/Resources_BENCHMARK_Test (27647 
> ms total)
> 
> Allocations benchmark test results
> ==
> Support of shared resources has a small impact on runtime performance in 
> allocations. Also, there is no visible impact in performance when shared 
> resources are added in the tests.
> 
> With the patch (and no shared resources)
> 
> round 0 allocate took 3.19704secs to make 200 offers
> round 50 allocate took 3.240605secs to make 200 offers
> round 100 allocate took 3.227024secs to make 200 offers
> round 150 allocate took 3.225281secs to make 200 offers
> round 199 allocate took 3.26036secs to make 200 offers
> 
> With the patch (and shared resources on all agents)
> ---
> round 0 allocate took 3.279115secs to make 200 offers
> round 50 allocate took 3.273396secs to make 200 offers
> round 100 allocate took 3.278509secs to make 200 offers
> round 150 allocate took 3.275959secs to make 200 offers
> round 199 allocate took 3.278151secs to make 200 offers
> 
> With the patch (and shared resources on alternate agents)
> -
> round 0 allocate took 3.251739secs to make 200 offers

Re: Review Request 45967: Added documentation for shareable resources.

2016-07-17 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45967/
---

(Updated July 17, 2016, 6:31 p.m.)


Review request for mesos and Jiang Yan Xu.


Changes
---

Minor changes needed to accommodate SHAREABLE_RESOURCES->SHARED_RESOURCES in 
framework capabilities.


Bugs: MESOS-4325
https://issues.apache.org/jira/browse/MESOS-4325


Repository: mesos


Description
---

Added documentation for shareable resources.


Diffs (updated)
-

  docs/home.md 5e6225ce9fd5a3fab817ec333fc5d45fb4488956 
  docs/shared-resources.md PRE-CREATION 

Diff: https://reviews.apache.org/r/45967/diff/


Testing
---

Tests successful.


Thanks,

Anindya Sinha



Re: Review Request 45966: Offer shared resources to frameworks only if opted in.

2016-07-17 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45966/
---

(Updated July 17, 2016, 6:31 p.m.)


Review request for mesos and Jiang Yan Xu.


Changes
---

Minor changes needed to accommodate SHAREABLE_RESOURCES->SHARED_RESOURCES in 
framework capabilities.


Summary (updated)
-

Offer shared resources to frameworks only if opted in.


Bugs: MESOS-4325
https://issues.apache.org/jira/browse/MESOS-4325


Repository: mesos


Description (updated)
---

Added a new capability SHARED_RESOURCES that frameworks need to opt
in if they are interested in receiving shared resources in their
offers.


Diffs (updated)
-

  include/mesos/mesos.proto 7796c72694808618a1b0af2e7d00111f9674785c 
  include/mesos/v1/mesos.proto fd03c540a819ea43ccb2965600d063f9b736a8ba 
  src/examples/persistent_shared_volume_framework.cpp PRE-CREATION 
  src/master/allocator/mesos/hierarchical.hpp 
b72ba16277a3210e4d309b616d185a10e2029a66 
  src/master/allocator/mesos/hierarchical.cpp 
7d4064535a20b93950f5a95eef1ad3f0d37d305b 
  src/tests/hierarchical_allocator_tests.cpp 
ce5da6be490b6fce311286eb4018c91eef55067e 

Diff: https://reviews.apache.org/r/45966/diff/


Testing
---

Tests updated with new capability.
Tests successful.


Thanks,

Anindya Sinha



Re: Review Request 45964: Add unit tests for sharing of resources.

2016-07-17 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45964/
---

(Updated July 17, 2016, 6:30 p.m.)


Review request for mesos and Jiang Yan Xu.


Changes
---

Moved the sorter test for shared resources completely to 
https://reviews.apache.org/r/45961.


Bugs: MESOS-4324
https://issues.apache.org/jira/browse/MESOS-4324


Repository: mesos


Description
---

Add unit tests for sharing of resources.


Diffs (updated)
-

  src/tests/hierarchical_allocator_tests.cpp 
ce5da6be490b6fce311286eb4018c91eef55067e 
  src/tests/master_validation_tests.cpp 
9eb82a569f7c48caa96d4d54a93b199ccac74385 
  src/tests/mesos.hpp e4eccfc3810bed3649a3ab80e252849470de4c72 
  src/tests/persistent_volume_tests.cpp 
a6f97c4bb5fb29d610c01255036095e2b30c44c5 
  src/tests/resources_tests.cpp 40d290ac540d26373c5fb7c2a93d27d1aa61d722 

Diff: https://reviews.apache.org/r/45964/diff/


Testing
---

Tests for shared resources added for allocator, resources and sorter.
Tests successful.


Thanks,

Anindya Sinha



Re: Review Request 45963: Allow tasks to set persistent volume as readonly or readwrite resource.

2016-07-17 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45963/
---

(Updated July 17, 2016, 6:29 p.m.)


Review request for mesos and Jiang Yan Xu.


Changes
---

Minor updates needed to shorten duration of the test.


Bugs: MESOS-4324
https://issues.apache.org/jira/browse/MESOS-4324


Repository: mesos


Description
---

Allow the task to specify the persistent volume access to be read-only
or read-write. Note that the persistent volume is always created as
read-write.
If the task is the first consumer of the shared persistent volume, then
set the ownership of the persistent volume to match that of the task.
Otherwise, allow the task to be executed only if the ownership of the
persistent volume matches that of the task.
Added an option to run the test in mixed (default) mode or shared-only
mode. In mixed mode, multiple shards alternate between shared and
unshared persistent volumes for the tasks. In shared-only mode, all
shards use shared persistent volumes for the tasks.


Diffs (updated)
-

  src/examples/persistent_shared_volume_framework.cpp PRE-CREATION 
  src/master/validation.hpp 43d876b00511d4ef652c2685d4950c994b5eef31 
  src/master/validation.cpp 50ba372d1301d16d1738a3e4f4882b51f9ce06cd 
  src/slave/containerizer/mesos/isolators/filesystem/linux.hpp 
0afe9272900cfa4b39887eb259070a9a2df2ab93 
  src/slave/containerizer/mesos/isolators/filesystem/linux.cpp 
db3ed8f69de8b52633194b252b0e5aba38ec69c0 
  src/slave/containerizer/mesos/isolators/filesystem/posix.hpp 
794b6e5990db5f8eb21a6535872f284ca02e0553 
  src/slave/containerizer/mesos/isolators/filesystem/posix.cpp 
0809e8ec35232fbdafee171a7a960cbdec272134 

Diff: https://reviews.apache.org/r/45963/diff/


Testing
---

Tests successful.


Thanks,

Anindya Sinha



Re: Review Request 45961: Support sharing of resources through reference counting of resources.

2016-07-17 Thread Anindya Sinha


- Anindya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45961/#review141305
---


On July 17, 2016, 6:28 p.m., Anindya Sinha wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45961/
> ---
> 
> (Updated July 17, 2016, 6:28 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Joris Van Remoortere, Michael 
> Park, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-4431
> https://issues.apache.org/jira/browse/MESOS-4431
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> o Each shared resource is accouted via its share count. This count is
>   updated based on the resource operations (such as add and subtract)
>   in various scenarios such as task launch and terminate at multiple
>   modules such as master, allocator, sorter, etc.
> o Only allow DESTROY if there are no running or pending tasks using
>   the volume. However, if the volume is in a pending offer to one or
>   more frameworks, resind that offer before processing the DESTROY.
> 
> 
> Diffs
> -
> 
>   src/common/resources.cpp f6ff92b591c15bc8e93fd85e1896349c3a7bb968 
>   src/common/resources_utils.cpp 8e881a09500b0966b577e514077460f724868a8d 
>   src/master/allocator/mesos/hierarchical.hpp 
> b72ba16277a3210e4d309b616d185a10e2029a66 
>   src/master/allocator/mesos/hierarchical.cpp 
> 7d4064535a20b93950f5a95eef1ad3f0d37d305b 
>   src/master/allocator/sorter/drf/sorter.hpp 
> bc6bfb2d5d3b32d55be055a0514861b4e7d889bb 
>   src/master/allocator/sorter/drf/sorter.cpp 
> ac85b327fc33d34246788e6a8c8bf5a486c61434 
>   src/master/http.cpp d1fe22bd2d9eb6bd82e32cf01720bbe3d8be26d5 
>   src/master/master.hpp ac998b1f5b305a9bff9d9e5cd205a6c3481f9b38 
>   src/master/master.cpp 61eaa4a92741a2d1e3f6624c555a40f6b9240d90 
>   src/master/quota_handler.cpp bf6a613a7bb3c62fd77d1ffde3170749d6c21fa2 
>   src/master/validation.hpp 43d876b00511d4ef652c2685d4950c994b5eef31 
>   src/master/validation.cpp 50ba372d1301d16d1738a3e4f4882b51f9ce06cd 
>   src/master/weights_handler.cpp 9a901e71ba2fbf2ca1c02f658a72d44cfaa5ec62 
>   src/tests/master_validation_tests.cpp 
> 9eb82a569f7c48caa96d4d54a93b199ccac74385 
>   src/tests/sorter_tests.cpp b0e5ef8a55bbcca3553a221bd5691a9c801a04f7 
>   src/v1/resources.cpp 8c3f2d1c1529915a59d47fe37bb3fc7a3267079a 
> 
> Diff: https://reviews.apache.org/r/45961/diff/
> 
> 
> Testing
> ---
> 
> Tests successful.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>



Re: Review Request 45962: Added a persistent volume test framework for shared volumes.

2016-07-17 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45962/
---

(Updated July 17, 2016, 6:29 p.m.)


Review request for mesos and Jiang Yan Xu.


Changes
---

Minor updates needed on rebase. Also shortened the duration of the test.


Bugs: MESOS-4431
https://issues.apache.org/jira/browse/MESOS-4431


Repository: mesos


Description
---

Added a persistent volume test framework for shared volumes.


Diffs (updated)
-

  src/Makefile.am cc83525a4455bbb0e654d346921d66ed2436411d 
  src/examples/persistent_shared_volume_framework.cpp PRE-CREATION 
  src/tests/examples_tests.cpp ac513ce9aa3c8f366fe81ba937e3dc0d51a26940 
  src/tests/persistent_shared_volume_framework_test.sh PRE-CREATION 

Diff: https://reviews.apache.org/r/45962/diff/


Testing
---

New test framework for shared resources added.
Tests successful.


Thanks,

Anindya Sinha



Re: Review Request 45961: Support sharing of resources through reference counting of resources.

2016-07-17 Thread Anindya Sinha


> On July 12, 2016, 7:26 a.m., Jiang Yan Xu wrote:
> > Adding to the previous partial review after discussion.
> > 
> > ## Invariant checking and documentation
> > 
> > An overall comment is that I think since we keep track of shared resource 
> > across Master, allocator and sorter, each for a different aspect (allocated 
> > vs. used vs. available, etc.) and with different semantics and expactations 
> > on uniquesness/number of copies etc., we should very explicitly document 
> > these variables and method arguments and check their invariants.
> > 
> > Specifically, the following diagram helped me understand the relationship 
> > between these variables.
> > 
> > Sorter
> > |
> > |- allocations: hashmap
> >  |
> >  |- resources: hashmap > Resources>
> >  |- scalarQuantities 
> > 
> > `Allocation.resources` expects 1 copy of each shared resource because it's 
> > tracking whether a client has been allocated this resource and doesn't care 
> > about how many times it's been allocated or used by the client. Therefore 
> > distinct shared resources is an invariant that we should check in 
> > `allocated()`.
> > 
> > Allocator
> > |
> > |- slaves: hashmap
> > |
> > |- allocated: Resources
> > 
> > `Slave.allocated` expects 1 copy of each shared resource **per client** and 
> > it's grouped into one Resources object.
> > 
> > Master
> > |
> > |- Slave
> > |  |
> > |  |- usedResources: hashmap
> > |
> > |- Framework
> >|
> >|- usedResources: hashmap
> >
> > `Slave.usedResources` and `Framework.usedResources` expect 1 copy of each 
> > shared resource **per task** as they are tracking "usage".

Yes, I will update the comments to reflect scope of shared count (ie. copies) 
in the context of the `Resources` objects used in master, allocator and sorter. 
So, now with the current version:

1) Sorter=> allocations (resources, scalarQuantities): 1 copy of each shared 
resource per allocation (that has not been recovered) by a client.
2) Allocator=> slaves[].allocated: 1 copy of each shared resource per 
allocation (that has not been recovered) by a framework.
3) Master=> usedResources (in Slave and Framework): 1 copy of each shared 
resource per task as they are tracking "usage".


> On July 12, 2016, 7:26 a.m., Jiang Yan Xu wrote:
> > src/master/master.hpp, line 274
> > 
> >
> > Chatted offline. We should refactor this into a 
> > `Master::recoverResources()` call to encapsulate this logic and call 
> > `allocator->recoverResources()` inside with adjusted arguments.

Wrapped `Allocator::recoverResources()` within `Master::recoverResources()`.


> On July 12, 2016, 7:26 a.m., Jiang Yan Xu wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 450-459
> > 
> >
> > Seems like in `slaves[slaveId].allocated` we should be maintaining one 
> > copy of each shared resource **per framework** but this if condition here 
> > assumes one copy of each shared resource across framework.

Yes, this was done when we maintained atmost a single copy of a shared 
resource. Now with 1 copy of shared resource per framework, we need to just sum 
up all resources.


> On July 12, 2016, 7:26 a.m., Jiang Yan Xu wrote:
> > src/tests/master_validation_tests.cpp, lines 630-632
> > 
> >
> > Here we are testing that can't destroy a  nonshared persistent volume 
> > `disk2` because it's in use?
> > 
> > Perhaps it's still reasonable to test it (even thought it can't happen 
> > right now) but it's not relevant to this test?

Ok. Since it is not a use case that happens in the `DESTROY` callflow, I think 
we can remove it.


> On July 12, 2016, 7:26 a.m., Jiang Yan Xu wrote:
> > src/tests/sorter_tests.cpp, lines 520-521
> > 
> >
> > Even if a and b both use 1/10 of the persistent volume, shouldn't they 
> > each get 0.1 share?

Yes, the shares is true based on when we were evenly distributing the shares 
across frameworks to which the shared resources were allocated to. I did not 
update the test in this commit so you do not see it. I will fix this to match 
the current behavior.


> On July 12, 2016, 7:26 a.m., Jiang Yan Xu wrote:
> > src/tests/sorter_tests.cpp, lines 499-500
> > 
> >
> > This is not a realistic example. "id1" is already a volume, you 
> > shouldn't claim 1/10 of a volume. (I don't see us validating this though)

Yes, a typo 

Re: Review Request 45961: Support sharing of resources through reference counting of resources.

2016-07-17 Thread Anindya Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45961/
---

(Updated July 17, 2016, 6:28 p.m.)


Review request for mesos, Benjamin Mahler, Joris Van Remoortere, Michael Park, 
and Jiang Yan Xu.


Changes
---

Updates based on review comments. Also, fixed filtering based on shared 
resources by maintining n copies of shared resources (allocated but not 
recovered) in the allocator (for slaves) as well as the various sorters (for 
respective clients).


Bugs: MESOS-4431
https://issues.apache.org/jira/browse/MESOS-4431


Repository: mesos


Description
---

o Each shared resource is accouted via its share count. This count is
  updated based on the resource operations (such as add and subtract)
  in various scenarios such as task launch and terminate at multiple
  modules such as master, allocator, sorter, etc.
o Only allow DESTROY if there are no running or pending tasks using
  the volume. However, if the volume is in a pending offer to one or
  more frameworks, resind that offer before processing the DESTROY.


Diffs (updated)
-

  src/common/resources.cpp f6ff92b591c15bc8e93fd85e1896349c3a7bb968 
  src/common/resources_utils.cpp 8e881a09500b0966b577e514077460f724868a8d 
  src/master/allocator/mesos/hierarchical.hpp 
b72ba16277a3210e4d309b616d185a10e2029a66 
  src/master/allocator/mesos/hierarchical.cpp 
7d4064535a20b93950f5a95eef1ad3f0d37d305b 
  src/master/allocator/sorter/drf/sorter.hpp 
bc6bfb2d5d3b32d55be055a0514861b4e7d889bb 
  src/master/allocator/sorter/drf/sorter.cpp 
ac85b327fc33d34246788e6a8c8bf5a486c61434 
  src/master/http.cpp d1fe22bd2d9eb6bd82e32cf01720bbe3d8be26d5 
  src/master/master.hpp ac998b1f5b305a9bff9d9e5cd205a6c3481f9b38 
  src/master/master.cpp 61eaa4a92741a2d1e3f6624c555a40f6b9240d90 
  src/master/quota_handler.cpp bf6a613a7bb3c62fd77d1ffde3170749d6c21fa2 
  src/master/validation.hpp 43d876b00511d4ef652c2685d4950c994b5eef31 
  src/master/validation.cpp 50ba372d1301d16d1738a3e4f4882b51f9ce06cd 
  src/master/weights_handler.cpp 9a901e71ba2fbf2ca1c02f658a72d44cfaa5ec62 
  src/tests/master_validation_tests.cpp 
9eb82a569f7c48caa96d4d54a93b199ccac74385 
  src/tests/sorter_tests.cpp b0e5ef8a55bbcca3553a221bd5691a9c801a04f7 
  src/v1/resources.cpp 8c3f2d1c1529915a59d47fe37bb3fc7a3267079a 

Diff: https://reviews.apache.org/r/45961/diff/


Testing
---

Tests successful.


Thanks,

Anindya Sinha



Re: Review Request 45961: Support sharing of resources through reference counting of resources.

2016-07-17 Thread Anindya Sinha


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 3499-3502
> > 
> >
> > `task.has_executor()` doesn't always lead to additional resources being 
> > used/charged against the framework. See `Master::addTask()`. It will suck a 
> > lot if we have to duplicate that logic here though so we might have to 
> > consider doing the pending task refactor first...

Refactored `Master::addTask()` and moved calculation of resources for the task 
into a separate function `Master::totalTaskResources()`, and used this function 
to determine the resources for pending tasks (in `Slave::pendingResources`).


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 3913-3916
> > 
> >
> > Need to check whether this executor actually consumes resources.

Please refer to earlier comment regarding `Master::totalTaskResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 4291-4301
> > 
> >
> > Ditto about `task.has_executor()`.

Please refer to earlier comment regarding `Master::totalTaskResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, line 3497
> > 
> >
> > Checking `task.has_executor()` is insufficient as a condition to add 
> > the executor's resources because they may not consume additional resources.

Please refer to earlier comment regarding `Master::totalTaskResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, line 3911
> > 
> >
> > Ditto to my comment above, we don't know if the task's executor 
> > actually consumes resources by just checking `task.has_executor()`.

Please refer to earlier comment regarding `Master::totalTaskResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 3601-3602
> > 
> >
> >

Wrapped `Allocator::recoverResources()` into a new function 
`Master::recoverResources()` with the same arguments. Inside 
`Master::recoverResources()`, we check for `slave == nullptr` to determine what 
resources are recoverable, and call `Allocator::recoverResources()` thereafter.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.hpp, line 274
> > 
> >
> > Add some comment to help explain why we need this method:
> > 
> > ```
> >   // Return a subset of `resources` offered to `frameworkId` that can 
> >   // be recovered. Right now we filter out shared resources that are
> >   // still used by `frameworkId`.
> >   // NOTE: A framework can reuse a shared resource to launch multiple
> >   // tasks and the allocator only recovers a shared resource allocated
> >   // to the framework if it's not used by any task of the framework.
> > ```

Refactored this within `Master::recoverResources()` and added comments within 
that function.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, line 3411
> > 
> >
> > Shouldn't this go through `recoverable()` as well?
> > 
> > It'll be helpful if add some comments on the `recoverResources` to 
> > spell out what exactly is expected.

Wrapped `Allocator::recoverResources()` into `Master::recoverResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 3599-3600
> > 
> >
> > At least we need a comment on why it's safe to recover all 
> > `offeredResources` if `slave == nullptr`: if slave is nullptr, no 
> > `usedResources` is on it so all resources should be recoverable.
> > 
> > But honestly this looks odd and we need to invoke this conditional 
> > operator in multiple places. Adding the same comment to all places this is 
> > called is also very redundant.
> > 
> > Perhaps `recoverable` can be moved to Master and hide this detail about 
> > `slave == nullptr`? Perhaps we should just wrap around 
> > `allocator->recoverResources` in Master to insert this logic.

Wrapped `Allocator::recoverResources()` into `Master::recoverResources()`.


> On July 11, 2016, 6:17 p.m., Jiang Yan Xu wrote:
> > src/master/master.cpp, lines 3643-3644
> > 
> >
> > Ditto.

Wrapped `Allocator::recoverResources()` into 

Re: Review Request 50017: WIP: Validated the resources when parsing it.

2016-07-17 Thread Guangya Liu


> On 七月 14, 2016, 5:56 p.m., Benjamin Mahler wrote:
> > include/mesos/resources.hpp, lines 135-147
> > 
> >
> > This suggests a change of semantics to `Resources`.
> > 
> > Currently: `Resources` always refers to a valid collection resources. 
> > This function would never return Some(Error) unless we change the semantics.
> > 
> > If we want to remove the unnecessary validation, we can do this by 
> > making '`operator += (const Resources& that)`' avoid validating '`that`' 
> > since it's already valid. Currently it just calls '`operator += (const 
> > Resource& r)`' which is unaware of '`r`' already being valid since it comes 
> > from a '`Resources`'.
> > 
> > However, we have to keep the validation in '`operator += (const 
> > Resource& r)`', since '`r`' is just a protobuf coming from an arbitrary 
> > caller, it may be invalid and we need to validate.
> > 
> > Make sense?
> 
> Guangya Liu wrote:
> Hi Ben, does there are any example in mesos code for `Resource` object 
> come from protobuf? I found that we are always using `Resource::parse` to get 
> a `Resource` or `Resources` object and I have added some validation logic in 
> `Resource::parse` in when getting `Resource` or `Resources` object.
> 
> Guangya Liu wrote:
> Ben, got your point. We can set `Resource` object as following:
> 
> ```javascript 
> Resource scalar;
> scalar.set_name("cpus");
> scalar.set_type(Value::SCALAR);
> scalar.mutable_scalar()->set_value(1.234);
> ```
> 
> So if we igore the validation for `'operator += (const Resource& r)'`, 
> there might be some erros if the resource was not constructed correctly. This 
> is not a good way to create `Resource` object but we cannot guard the usage 
> for this now.
> 
> I found that the API `'operator += (const Resource& r)'` was not called 
> directly too frequently, but we are using this API `'operator += (const 
> Resources& that)'` for the most of the time, so seems it is good to keep the 
> validation for `'operator += (const Resource& r)'` as it was not called 
> directly too much. 
> 
> The problem is that `'operator += (const Resources& that)'` is calling 
> `'operator += (const Resource& r)'`, I did not found a good way to keep 
> validation for `'operator += (const Resource& r)'` but ignore the validation 
> for `'operator += (const Resources& that)'`, still checking how we can make 
> this work, any comments/suggestions for this?
> 
> Guangya Liu wrote:
> Did more test by removing validation, the performance of resources 
> operations with `port` resources increased more than 5x.
> 
> With validation
> ```
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from ResourcesOperators/Resources_BENCHMARK_Test
> [ RUN  ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2
> Took 11.212199secs to perform 1000 'total += r' operations on 
> ports(*):[1-2, 4-5, 7-8, 10-11, 13-14, 16-17, 1...
> Took 8.145903secs to perform 1000 'total -= r' operations on 
> ports(*):[1-2, 4-5, 7-8, 10-11, 13-14, 16-17, 1...
> Took 11.154993secs to perform 1000 'total = total + r' operations on 
> ports(*):[1-2, 4-5, 7-8, 10-11, 13-14, 16-17, 1...
> Took 7.974004secs to perform 1000 'total = total - r' operations on 
> ports(*):[1-2, 4-5, 7-8, 10-11, 13-14, 16-17, 1...
> [   OK ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2 
> (38490 ms)
> [--] 1 test from ResourcesOperators/Resources_BENCHMARK_Test 
> (38490 ms total)
> 
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (38512 ms total)
> [  PASSED  ] 1 test.
> ```
> 
> Without validation
> ```
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from ResourcesOperators/Resources_BENCHMARK_Test
> [ RUN  ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2
> Took 2.90743secs to perform 1000 'total += r' operations on 
> ports(*):[1-2, 4-5, 7-8, 10-11, 13-14, 16-17, 1...
> Took 9166us to perform 1000 'total -= r' operations on ports(*):[1-2, 
> 4-5, 7-8, 10-11, 13-14, 16-17, 1...
> Took 3.038083secs to perform 1000 'total = total + r' operations on 
> ports(*):[1-2, 4-5, 7-8, 10-11, 13-14, 16-17, 1...
> Took 9253us to perform 1000 'total = total - r' operations on 
> ports(*):[1-2, 4-5, 7-8, 10-11, 13-14, 16-17, 1...
> [   OK ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2 
> (5966 ms)
> [--] 1 test from ResourcesOperators/Resources_BENCHMARK_Test 
> (5966 ms total)
> 
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (5988 ms total)
> [  PASSED  ] 1 test.
> ```

Re: Review Request 49825: Implemented `CgroupsIsolatorProcess::status`.

2016-07-17 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49825/#review142518
---



For this patch, I have exactly same comments with 
https://reviews.apache.org/r/49824/ :-)

- Qian Zhang


On July 15, 2016, 11:33 a.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49825/
> ---
> 
> (Updated July 15, 2016, 11:33 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5041
> https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented `CgroupsIsolatorProcess::status`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
> c57baec88437f68886702a40ec8a6a6458546119 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 4a9f55bf3b217405bf90943f27a976422877a99e 
> 
> Diff: https://reviews.apache.org/r/49825/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 49824: Implemented `CgroupsIsolatorProcess::usage`.

2016-07-17 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49824/#review142517
---




src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp (line 126)


Why is this a virtual method?



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (lines 484 - 485)


I would suggest to make this consistent with the warning message below in 
`_usage`, i.e.:
```
"Failed to get the usage information of container '" + 
stringify(containerId) + "': Unknown container"
```



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (lines 494 - 498)


I think we should use `await()` rather than `collect()` so that we can 
return partial usage statistics. And I think we do not need a `defer()` here, 
instead a `lambda::bind()` should be enough. Please take a look at the 
following code as a reference:

https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/containerizer.cpp#L1532:L1535



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 506)


I think you should check `!isReady()` rather than `isFailed()`.


- Qian Zhang


On July 15, 2016, 11:33 a.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49824/
> ---
> 
> (Updated July 15, 2016, 11:33 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5041
> https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented `CgroupsIsolatorProcess::usage`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
> c57baec88437f68886702a40ec8a6a6458546119 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 4a9f55bf3b217405bf90943f27a976422877a99e 
> 
> Diff: https://reviews.apache.org/r/49824/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 49823: Implemented `CgroupsIsolatorProcess::update`.

2016-07-17 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49823/#review142516
---




src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp (line 122)


Why is this a virtual method?



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 448)


I do not think you need the period in the end.



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 459)


I do not think we need to call `_update()`, instead the following should be 
enough:
```
return collect(updates).then([]() { return Nothing(); });
```
You can take a look at the following code as a reference:

https://github.com/apache/mesos/blob/1.0.0-rc2/src/slave/containerizer/mesos/isolators/gpu/isolator.cpp#L241



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 465)


I do not think we need this method.


- Qian Zhang


On July 15, 2016, 11:33 a.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49823/
> ---
> 
> (Updated July 15, 2016, 11:33 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5041
> https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented `CgroupsIsolatorProcess::update`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
> c57baec88437f68886702a40ec8a6a6458546119 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 4a9f55bf3b217405bf90943f27a976422877a99e 
> 
> Diff: https://reviews.apache.org/r/49823/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 49821: Implemented `CgroupsIsolatorProcess::watch`.

2016-07-17 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49821/#review142515
---




src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 434)


I do not think you need the period in the end.


- Qian Zhang


On July 15, 2016, 11:33 a.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49821/
> ---
> 
> (Updated July 15, 2016, 11:33 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5041
> https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented `CgroupsIsolatorProcess::watch`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 4a9f55bf3b217405bf90943f27a976422877a99e 
> 
> Diff: https://reviews.apache.org/r/49821/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 49820: Implemented `CgroupsIsolatorProcess::isolate`.

2016-07-17 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49820/#review142514
---




src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 377)


I do not think you need the period in the end.


- Qian Zhang


On July 15, 2016, 11:32 a.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49820/
> ---
> 
> (Updated July 15, 2016, 11:32 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5041
> https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented `CgroupsIsolatorProcess::isolate`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
> c57baec88437f68886702a40ec8a6a6458546119 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 4a9f55bf3b217405bf90943f27a976422877a99e 
> 
> Diff: https://reviews.apache.org/r/49820/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 49820: Implemented `CgroupsIsolatorProcess::isolate`.

2016-07-17 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49820/#review142513
---




src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp (line 118)


Why is this a virtual method?


- Qian Zhang


On July 15, 2016, 11:32 a.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49820/
> ---
> 
> (Updated July 15, 2016, 11:32 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5041
> https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented `CgroupsIsolatorProcess::isolate`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
> c57baec88437f68886702a40ec8a6a6458546119 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 4a9f55bf3b217405bf90943f27a976422877a99e 
> 
> Diff: https://reviews.apache.org/r/49820/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 49819: Implemented `CgroupsIsolatorProcess::prepare`.

2016-07-17 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49819/#review142510
---




src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp (line 113)


Why is this a virtual method?



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 286)


s/Create/Creating/



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 325)


s/cgroup/Cgroup/



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 357)


I think you should check `!prepares.isReady()` rather than 
`prepares.isFailed()`, please take a look at the following code as a reference:

https://github.com/apache/mesos/blob/1.0.0-rc2/src/slave/containerizer/mesos/isolators/cgroups/cpushare.cpp#L555:L559


- Qian Zhang


On July 15, 2016, 11:32 a.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49819/
> ---
> 
> (Updated July 15, 2016, 11:32 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5041
> https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented `CgroupsIsolatorProcess::prepare`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
> c57baec88437f68886702a40ec8a6a6458546119 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 4a9f55bf3b217405bf90943f27a976422877a99e 
> 
> Diff: https://reviews.apache.org/r/49819/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 50017: WIP: Validated the resources when parsing it.

2016-07-17 Thread Guangya Liu


> On 七月 14, 2016, 5:56 p.m., Benjamin Mahler wrote:
> > include/mesos/resources.hpp, lines 135-147
> > 
> >
> > This suggests a change of semantics to `Resources`.
> > 
> > Currently: `Resources` always refers to a valid collection resources. 
> > This function would never return Some(Error) unless we change the semantics.
> > 
> > If we want to remove the unnecessary validation, we can do this by 
> > making '`operator += (const Resources& that)`' avoid validating '`that`' 
> > since it's already valid. Currently it just calls '`operator += (const 
> > Resource& r)`' which is unaware of '`r`' already being valid since it comes 
> > from a '`Resources`'.
> > 
> > However, we have to keep the validation in '`operator += (const 
> > Resource& r)`', since '`r`' is just a protobuf coming from an arbitrary 
> > caller, it may be invalid and we need to validate.
> > 
> > Make sense?
> 
> Guangya Liu wrote:
> Hi Ben, does there are any example in mesos code for `Resource` object 
> come from protobuf? I found that we are always using `Resource::parse` to get 
> a `Resource` or `Resources` object and I have added some validation logic in 
> `Resource::parse` in when getting `Resource` or `Resources` object.
> 
> Guangya Liu wrote:
> Ben, got your point. We can set `Resource` object as following:
> 
> ```javascript 
> Resource scalar;
> scalar.set_name("cpus");
> scalar.set_type(Value::SCALAR);
> scalar.mutable_scalar()->set_value(1.234);
> ```
> 
> So if we igore the validation for `'operator += (const Resource& r)'`, 
> there might be some erros if the resource was not constructed correctly. This 
> is not a good way to create `Resource` object but we cannot guard the usage 
> for this now.
> 
> I found that the API `'operator += (const Resource& r)'` was not called 
> directly too frequently, but we are using this API `'operator += (const 
> Resources& that)'` for the most of the time, so seems it is good to keep the 
> validation for `'operator += (const Resource& r)'` as it was not called 
> directly too much. 
> 
> The problem is that `'operator += (const Resources& that)'` is calling 
> `'operator += (const Resource& r)'`, I did not found a good way to keep 
> validation for `'operator += (const Resource& r)'` but ignore the validation 
> for `'operator += (const Resources& that)'`, still checking how we can make 
> this work, any comments/suggestions for this?

Did more test by removing validation, the performance of resources operations 
with `port` resources increased more than 5x.

With validation
```
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from ResourcesOperators/Resources_BENCHMARK_Test
[ RUN  ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2
Took 11.212199secs to perform 1000 'total += r' operations on ports(*):[1-2, 
4-5, 7-8, 10-11, 13-14, 16-17, 1...
Took 8.145903secs to perform 1000 'total -= r' operations on ports(*):[1-2, 
4-5, 7-8, 10-11, 13-14, 16-17, 1...
Took 11.154993secs to perform 1000 'total = total + r' operations on 
ports(*):[1-2, 4-5, 7-8, 10-11, 13-14, 16-17, 1...
Took 7.974004secs to perform 1000 'total = total - r' operations on 
ports(*):[1-2, 4-5, 7-8, 10-11, 13-14, 16-17, 1...
[   OK ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2 (38490 ms)
[--] 1 test from ResourcesOperators/Resources_BENCHMARK_Test (38490 ms 
total)

[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (38512 ms total)
[  PASSED  ] 1 test.
```

Without validation
```
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from ResourcesOperators/Resources_BENCHMARK_Test
[ RUN  ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2
Took 2.90743secs to perform 1000 'total += r' operations on ports(*):[1-2, 4-5, 
7-8, 10-11, 13-14, 16-17, 1...
Took 9166us to perform 1000 'total -= r' operations on ports(*):[1-2, 4-5, 7-8, 
10-11, 13-14, 16-17, 1...
Took 3.038083secs to perform 1000 'total = total + r' operations on 
ports(*):[1-2, 4-5, 7-8, 10-11, 13-14, 16-17, 1...
Took 9253us to perform 1000 'total = total - r' operations on ports(*):[1-2, 
4-5, 7-8, 10-11, 13-14, 16-17, 1...
[   OK ] ResourcesOperators/Resources_BENCHMARK_Test.Arithmetic/2 (5966 ms)
[--] 1 test from ResourcesOperators/Resources_BENCHMARK_Test (5966 ms 
total)

[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (5988 ms total)
[  PASSED  ] 1 test.
```

Also after more checking, I think that we can remove the validation from 
resource operations, as we already have resource validation when launching 
tasks, creating resurces etc. Even though application developer may create 
resources using protobuf directly, but if the 

Re: Review Request 49814: Implemented `CgroupsIsolatorProcess::create`.

2016-07-17 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49814/#review142508
---


Fix it, then Ship it!





src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp (lines 83 - 85)


I would suggest to make the naming of these 3 parameters consistent, i.e., 
all starting with underscore. You can take the method below it as an example :-)



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (lines 108 - 110)


Ditto



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 113)


s/used/use/



src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp (line 114)


s/by/be/


- Qian Zhang


On July 15, 2016, 11:31 a.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49814/
> ---
> 
> (Updated July 15, 2016, 11:31 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
> 
> 
> Bugs: MESOS-5041
> https://issues.apache.org/jira/browse/MESOS-5041
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Implemented `CgroupsIsolatorProcess::create`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
> c57baec88437f68886702a40ec8a6a6458546119 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 4a9f55bf3b217405bf90943f27a976422877a99e 
> 
> Diff: https://reviews.apache.org/r/49814/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> haosdent huang
> 
>