Re: Review Request 51402: Added nested container check in provisioner destroy.

2016-09-03 Thread Guangya Liu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51402/#review147737
---




src/slave/containerizer/mesos/provisioner/provisioner.cpp (line 432)


s/both/all



src/slave/containerizer/mesos/provisioner/provisioner.cpp (lines 433 - 434)


+1 to Joseph's comments, since here you already mentioned that all of the 
sub-containers were already destroyed, so in the followig check, it shoud use 
check-fail instead?


- Guangya Liu


On 八月 29, 2016, 9:20 p.m., Gilbert Song wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51402/
> ---
> 
> (Updated 八月 29, 2016, 9:20 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Artem Harutyunyan, Jie Yu, Joseph 
> Wu, and Kevin Klues.
> 
> 
> Bugs: MESOS-6067
> https://issues.apache.org/jira/browse/MESOS-6067
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added nested container check in provisioner destroy.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/provisioner/provisioner.cpp 
> 8e35ff49ec99a242e764095dcfbb541c5e41ec71 
> 
> Diff: https://reviews.apache.org/r/51402/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Gilbert Song
> 
>



Re: Review Request 51393: Added unit test for provisioner recursive listContainers().

2016-09-03 Thread Guangya Liu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51393/#review147736
---


Fix it, then Ship it!




Ship It!


src/tests/containerizer/provisioner_paths_tests.cpp (line 86)


move this right before #105?



src/tests/containerizer/provisioner_paths_tests.cpp (lines 88 - 89)


How about make it less jagged?

```
// TODO(gilbert): Refactor duplicate code 
// to the unit test class.
```


- Guangya Liu


On 八月 29, 2016, 9:20 p.m., Gilbert Song wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51393/
> ---
> 
> (Updated 八月 29, 2016, 9:20 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Artem Harutyunyan, Jie Yu, Joseph 
> Wu, and Kevin Klues.
> 
> 
> Bugs: MESOS-6067
> https://issues.apache.org/jira/browse/MESOS-6067
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added unit test for provisioner recursive listContainers().
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/provisioner_paths_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/51393/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Gilbert Song
> 
>



Re: Review Request 51620: Removed two std::move in MountInfoTable::read.

2016-09-03 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51620/#review147735
---


Ship it!




Ship It!

- Qian Zhang


On Sept. 3, 2016, 8:25 a.m., Jie Yu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51620/
> ---
> 
> (Updated Sept. 3, 2016, 8:25 a.m.)
> 
> 
> Review request for mesos and Kevin Klues.
> 
> 
> Bugs: MESOS-6118
> https://issues.apache.org/jira/browse/MESOS-6118
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Since 'entry' is a const reference, the std::move won't reduce the
> extra copying. This patch removed this optimization.
> 
> 
> Diffs
> -
> 
>   src/linux/fs.cpp 14ae5a9089916549f691363bc2269e13c5260a14 
> 
> Diff: https://reviews.apache.org/r/51620/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jie Yu
> 
>



Re: Review Request 51097: Added a `PortMapper` class.

2016-09-03 Thread Avinash sridharan


> On Sept. 3, 2016, 9:36 p.m., Jie Yu wrote:
> > I made a bunch of edit while committing since I probably don't have time to 
> > do review back and forth. Please do take a look at the final patch.
> > 
> > Please follow up with patches if you don't agree on some of the changes.
> > 
> > Also, please rebase the subsequent patches accordingly.

Fine with most of the changes. I wanted to introduce a `PluginError` class 
derived from `Error`, which can be returned by the `PortMapper` public 
functions. Will simplify the code. Will submit a separate patch for this.


- Avinash


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51097/#review147619
---


On Sept. 2, 2016, 10:33 p.m., Avinash sridharan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51097/
> ---
> 
> (Updated Sept. 2, 2016, 10:33 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Qian Zhang.
> 
> 
> Bugs: MESOS-6023
> https://issues.apache.org/jira/browse/MESOS-6023
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This class will embody the logic for implementing the CNI port-mapper
> plugin.
> 
> Also added helper functions in `cni::spec` to be able to log CNI
> plugin error as a JSON formatted string of `spec::error`.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 120a715eeeb7fb222d1169500950b5c7643df554 
>   
> src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/main.cpp
>  PRE-CREATION 
>   
> src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/port_mapper.hpp
>  PRE-CREATION 
>   
> src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/port_mapper.cpp
>  PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
> e6967415a689184f13d65a986f13b0af54364fb2 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
> 938a9f3bff2a95c8591450f0edafaddb01833e59 
> 
> Diff: https://reviews.apache.org/r/51097/diff/
> 
> 
> Testing
> ---
> 
> Tested the port-mapper with the following CNI config:
> {
> "name": "mynet",
> "type": "port-mapper",
> "chain": "MESOS",
> "delegate": {
>   "type" : "bridge",
>   "bridge": "cni0",
>   "isGateway": true,
>   "ipMasq": true,
>   "ipam": {
>   "type": "host-local",
>   "subnet": "10.22.0.0/16",
>   "routes": [
> { "dst": "0.0.0.0/0" }
>   ]
>   }
> },
> "args" : {
>   "org.apache.mesos" : {
> "network_info" : {
>   "port_mappings": {
> "host_port" : 80,
> "container_port" : 80
>   }
> }
>   }
> }
> }
> 
> and the following environment variables:
> export CNI_COMMAND="ADD"
> export CNI_CONTAINERID="00001110"
> export CNI_PATH="$MESOS_INSTALL:/home/vagrant/dev/go/cni/bin"
> export CNI_IFNAME="eth0"
> export CNI_NETNS="/etc/netns"
> 
> If we remove fields from the above CNI config, or remove certain environment 
> variables the creation of the `PortMapper` correctly fails. However, if 
> config and environment variables are passed as is it will create the 
> `PortMapper` correctly.
> 
> 
> Thanks,
> 
> Avinash sridharan
> 
>



Re: Review Request 51097: Added a `PortMapper` class.

2016-09-03 Thread Avinash sridharan


> On Sept. 3, 2016, 9:36 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/spec.hpp, line 35
> > 
> >
> > This is not in the CNI spec. Why we have it here?

Most CNI plugins return an error code of 100. The CNI spec specifies error 
codes of 100+ to be unreserved, so using this as the return code for generic 
errors for the plugins.


> On Sept. 3, 2016, 9:36 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/isolators/network/cni/spec.hpp, lines 37-40
> > 
> >
> > This is a non-POD global variable. We try to avoid that. Also, why we 
> > need this?

Currently, the CNI spec has specified only two errors VERSION_ERROR and 
FIELD_ERROR. Though we don't use it in the plugins right now, specifying it for 
future purposes.


- Avinash


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51097/#review147619
---


On Sept. 2, 2016, 10:33 p.m., Avinash sridharan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51097/
> ---
> 
> (Updated Sept. 2, 2016, 10:33 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Qian Zhang.
> 
> 
> Bugs: MESOS-6023
> https://issues.apache.org/jira/browse/MESOS-6023
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This class will embody the logic for implementing the CNI port-mapper
> plugin.
> 
> Also added helper functions in `cni::spec` to be able to log CNI
> plugin error as a JSON formatted string of `spec::error`.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 120a715eeeb7fb222d1169500950b5c7643df554 
>   
> src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/main.cpp
>  PRE-CREATION 
>   
> src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/port_mapper.hpp
>  PRE-CREATION 
>   
> src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/port_mapper.cpp
>  PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
> e6967415a689184f13d65a986f13b0af54364fb2 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
> 938a9f3bff2a95c8591450f0edafaddb01833e59 
> 
> Diff: https://reviews.apache.org/r/51097/diff/
> 
> 
> Testing
> ---
> 
> Tested the port-mapper with the following CNI config:
> {
> "name": "mynet",
> "type": "port-mapper",
> "chain": "MESOS",
> "delegate": {
>   "type" : "bridge",
>   "bridge": "cni0",
>   "isGateway": true,
>   "ipMasq": true,
>   "ipam": {
>   "type": "host-local",
>   "subnet": "10.22.0.0/16",
>   "routes": [
> { "dst": "0.0.0.0/0" }
>   ]
>   }
> },
> "args" : {
>   "org.apache.mesos" : {
> "network_info" : {
>   "port_mappings": {
> "host_port" : 80,
> "container_port" : 80
>   }
> }
>   }
> }
> }
> 
> and the following environment variables:
> export CNI_COMMAND="ADD"
> export CNI_CONTAINERID="00001110"
> export CNI_PATH="$MESOS_INSTALL:/home/vagrant/dev/go/cni/bin"
> export CNI_IFNAME="eth0"
> export CNI_NETNS="/etc/netns"
> 
> If we remove fields from the above CNI config, or remove certain environment 
> variables the creation of the `PortMapper` correctly fails. However, if 
> config and environment variables are passed as is it will create the 
> `PortMapper` correctly.
> 
> 
> Thanks,
> 
> Avinash sridharan
> 
>



Re: Review Request 51623: Updated mesos-execute to support task groups.

2016-09-03 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51623/#review147731
---



Patch looks great!

Reviews applied: [51622, 51623]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Sept. 3, 2016, 8:36 p.m., Abhishek Dasgupta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51623/
> ---
> 
> (Updated Sept. 3, 2016, 8:36 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Vinod Kone.
> 
> 
> Bugs: MESOS-6096
> https://issues.apache.org/jira/browse/MESOS-6096
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> sThis patch updates mesos-execute to use LAUNCH_GROUP
> for launching task groups. I made some assumption
> in this patch, like newly introduced `--commands`
> option in mesos-execute takes semicolon dimilited values
> which will certainly not be the case in final solution. I
> am open to suggestion what can we use as delimitter for
> `--commands`. In this patch, I kept old `--command` option
> as well and it is working as expected. Though new
> `--commands` should be able to launch single as well
> as multiple tasks and I hope in future we will stick
> to one `commands` option.
> 
> Suggestions are very welcome for further improvement.
> 
> 
> Diffs
> -
> 
>   src/cli/execute.cpp b752d057a3d86482ef1a4baaf31052469e38dc76 
> 
> Diff: https://reviews.apache.org/r/51623/diff/
> 
> 
> Testing
> ---
> 
> On Ubuntu 16.04:
> sudo make -j4
> 
> and manually ran these commands:
> **Successful**
> mesos-execute --master=127.0.0.1:5050 --name="LetsCitar" 
> --containerizer=docker --docker_image="ubuntu" --command="echo hello"
> 
> **Stuck after submitting task to agent as agent code for LAUNCH_GROUP is not 
> yet completed**
> mesos-execute --master=127.0.0.1:5050 --name="LetsCitar" 
> --containerizer=docker --docker_image="ubuntu" --commands="echo hello"
> 
> 
> Thanks,
> 
> Abhishek Dasgupta
> 
>



Re: Review Request 51097: Added a `PortMapper` class.

2016-09-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51097/#review147619
---


Fix it, then Ship it!




I made a bunch of edit while committing since I probably don't have time to do 
review back and forth. Please do take a look at the final patch.

Please follow up with patches if you don't agree on some of the changes.

Also, please rebase the subsequent patches accordingly.


src/slave/containerizer/mesos/isolators/network/cni/spec.hpp (lines 20 - 21)


swap these two.



src/Makefile.am (lines 1046 - 1047)


Can you align the tailing back slashes for `MESOS_LINUX_FILES` (ditto 
below).



src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/main.cpp
 (line 58)


I'd declare an error code constant in PortMapper and use it here (instead 
of relying on a general default value).



src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/port_mapper.hpp
 (line 60)


no need to use leading underscore for the parameter.



src/slave/containerizer/mesos/isolators/network/cni/spec.hpp (line 35)


This is not in the CNI spec. Why we have it here?



src/slave/containerizer/mesos/isolators/network/cni/spec.hpp (lines 37 - 40)


This is a non-POD global variable. We try to avoid that. Also, why we need 
this?


- Jie Yu


On Sept. 2, 2016, 10:33 p.m., Avinash sridharan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51097/
> ---
> 
> (Updated Sept. 2, 2016, 10:33 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Qian Zhang.
> 
> 
> Bugs: MESOS-6023
> https://issues.apache.org/jira/browse/MESOS-6023
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This class will embody the logic for implementing the CNI port-mapper
> plugin.
> 
> Also added helper functions in `cni::spec` to be able to log CNI
> plugin error as a JSON formatted string of `spec::error`.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 120a715eeeb7fb222d1169500950b5c7643df554 
>   
> src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/main.cpp
>  PRE-CREATION 
>   
> src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/port_mapper.hpp
>  PRE-CREATION 
>   
> src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/port_mapper.cpp
>  PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.hpp 
> e6967415a689184f13d65a986f13b0af54364fb2 
>   src/slave/containerizer/mesos/isolators/network/cni/spec.cpp 
> 938a9f3bff2a95c8591450f0edafaddb01833e59 
> 
> Diff: https://reviews.apache.org/r/51097/diff/
> 
> 
> Testing
> ---
> 
> Tested the port-mapper with the following CNI config:
> {
> "name": "mynet",
> "type": "port-mapper",
> "chain": "MESOS",
> "delegate": {
>   "type" : "bridge",
>   "bridge": "cni0",
>   "isGateway": true,
>   "ipMasq": true,
>   "ipam": {
>   "type": "host-local",
>   "subnet": "10.22.0.0/16",
>   "routes": [
> { "dst": "0.0.0.0/0" }
>   ]
>   }
> },
> "args" : {
>   "org.apache.mesos" : {
> "network_info" : {
>   "port_mappings": {
> "host_port" : 80,
> "container_port" : 80
>   }
> }
>   }
> }
> }
> 
> and the following environment variables:
> export CNI_COMMAND="ADD"
> export CNI_CONTAINERID="00001110"
> export CNI_PATH="$MESOS_INSTALL:/home/vagrant/dev/go/cni/bin"
> export CNI_IFNAME="eth0"
> export CNI_NETNS="/etc/netns"
> 
> If we remove fields from the above CNI config, or remove certain environment 
> variables the creation of the `PortMapper` correctly fails. However, if 
> config and environment variables are passed as is it will create the 
> `PortMapper` correctly.
> 
> 
> Thanks,
> 
> Avinash sridharan
> 
>



Review Request 51622: Added parse function for semicolon separated commands.

2016-09-03 Thread Abhishek Dasgupta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51622/
---

Review request for mesos, Anand Mazumdar and Vinod Kone.


Bugs: MESOS-6096
https://issues.apache.org/jira/browse/MESOS-6096


Repository: mesos


Description
---

Required to parse the newly introduced `--commands` option in
mesos-execute.


Diffs
-

  src/common/parse.hpp 51582a4f8943296799d71d0a8d7812787fd1b2c2 

Diff: https://reviews.apache.org/r/51622/diff/


Testing
---


Thanks,

Abhishek Dasgupta



Review Request 51623: Updated mesos-execute to support task groups.

2016-09-03 Thread Abhishek Dasgupta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51623/
---

Review request for mesos, Anand Mazumdar and Vinod Kone.


Bugs: MESOS-6096
https://issues.apache.org/jira/browse/MESOS-6096


Repository: mesos


Description
---

sThis patch updates mesos-execute to use LAUNCH_GROUP
for launching task groups. I made some assumption
in this patch, like newly introduced `--commands`
option in mesos-execute takes semicolon dimilited values
which will certainly not be the case in final solution. I
am open to suggestion what can we use as delimitter for
`--commands`. In this patch, I kept old `--command` option
as well and it is working as expected. Though new
`--commands` should be able to launch single as well
as multiple tasks and I hope in future we will stick
to one `commands` option.

Suggestions are very welcome for further improvement.


Diffs
-

  src/cli/execute.cpp b752d057a3d86482ef1a4baaf31052469e38dc76 

Diff: https://reviews.apache.org/r/51623/diff/


Testing
---

On Ubuntu 16.04:
sudo make -j4

and manually ran these commands:
**Successful**
mesos-execute --master=127.0.0.1:5050 --name="LetsCitar" --containerizer=docker 
--docker_image="ubuntu" --command="echo hello"

**Stuck after submitting task to agent as agent code for LAUNCH_GROUP is not 
yet completed**
mesos-execute --master=127.0.0.1:5050 --name="LetsCitar" --containerizer=docker 
--docker_image="ubuntu" --commands="echo hello"


Thanks,

Abhishek Dasgupta



Review Request 51621: Made recovered resource allocated as soon as possible.

2016-09-03 Thread Guangya Liu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51621/
---

Review request for mesos, Benjamin Mahler, Jacob Janco, and Jiang Yan Xu.


Bugs: MESOS-3078
https://issues.apache.org/jira/browse/MESOS-3078


Repository: mesos


Description
---

Enable the `recoverResources` logic to call `ensureAllocation`,
this can make sure the `allocationCandidates` can be updated
as soon as possible and allocate the resources as soon as possible
if there are no request in the message queue.


Diffs
-

  src/master/allocator/mesos/hierarchical.cpp 
9e5db2196c6a541dc1208ba8b9f13ef9a518bcc4 

Diff: https://reviews.apache.org/r/51621/diff/


Testing
---

make
make check


Thanks,

Guangya Liu



Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-03 Thread Guangya Liu


> On 九月 3, 2016, 9:21 a.m., Guangya Liu wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 484-485
> > 
> >
> > I did some test with benchmark test of 
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0`
> >  and it failed as following:
> > 
> > ```
> > ../../src/tests/hierarchical_allocator_tests.cpp:3459: Failure
> > Value of: offerCallbacks.load()
> >   Actual: 5
> > Expected: slaveCount
> > Which is: 1000
> > [  FAILED  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0,
> >  where GetParam() = (1000, 1) (497 ms)
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (498 ms total)
> > 
> > [--] Global test environment tear-down
> > [==] 1 test from 1 test case ran. (512 ms total)
> > [  PASSED  ] 0 tests.
> > [  FAILED  ] 1 test, listed below:
> > [  FAILED  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0,
> >  where GetParam() = (1000, 1)
> > 
> >  1 FAILED TEST
> > ```
> > 
> > The reason is that we cannot make sure all of the `_allocate()` 
> > finished after `addSlave` finished.
> > 
> > Shall we do a `while` loop in the benchmark to wait till all 
> > allocations are got?

Perhaps we can simply remove the `ASSERT_EQ` for offer and agent count in 
`HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave` as this patch will not 
trigger `allocate()` synchronously, so we cannot make sure the agent count is 
always same as offer count.


- Guangya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review147722
---


On 九月 3, 2016, 6:05 a.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> ---
> 
> (Updated 九月 3, 2016, 6:05 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus 
> Ma, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
> - Batched allocations are handled synchronously.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> dd07ed221d2c1755d2478369641ffdc46ecc4471 
>   src/master/allocator/mesos/hierarchical.cpp 
> 9e5db2196c6a541dc1208ba8b9f13ef9a518bcc4 
> 
> Diff: https://reviews.apache.org/r/51027/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> note: check without filters depends on https://reviews.apache.org/r/51028
> 
> With new benchmark https://reviews.apache.org/r/49617: 
> Sample output without 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 1 agents in 3.2134535333mins
> allocator settled after  1.6123603833mins
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (290578 ms)
> 
> Sample output with 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 39817us
> Added 1 agents in 3.2286054167mins
> allocator settled after  25.525654secs
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (220137 ms)
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-03 Thread Guangya Liu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review147722
---




src/master/allocator/mesos/hierarchical.cpp (lines 483 - 484)


I did some test with benchmark test of 
`SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0`
 and it failed as following:

```
../../src/tests/hierarchical_allocator_tests.cpp:3459: Failure
Value of: offerCallbacks.load()
  Actual: 5
Expected: slaveCount
Which is: 1000
[  FAILED  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0,
 where GetParam() = (1000, 1) (497 ms)
[--] 1 test from 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test (498 ms total)

[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (512 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0,
 where GetParam() = (1000, 1)

 1 FAILED TEST
```

The reason is that we cannot make sure all of the `_allocate()` finished 
after `addSlave` finished.

Shall we do a `while` loop in the benchmark to wait till all allocations 
are got?


- Guangya Liu


On 九月 3, 2016, 6:05 a.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> ---
> 
> (Updated 九月 3, 2016, 6:05 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus 
> Ma, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
> - Batched allocations are handled synchronously.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> dd07ed221d2c1755d2478369641ffdc46ecc4471 
>   src/master/allocator/mesos/hierarchical.cpp 
> 9e5db2196c6a541dc1208ba8b9f13ef9a518bcc4 
> 
> Diff: https://reviews.apache.org/r/51027/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> note: check without filters depends on https://reviews.apache.org/r/51028
> 
> With new benchmark https://reviews.apache.org/r/49617: 
> Sample output without 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 1 agents in 3.2134535333mins
> allocator settled after  1.6123603833mins
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (290578 ms)
> 
> Sample output with 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 39817us
> Added 1 agents in 3.2286054167mins
> allocator settled after  25.525654secs
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (220137 ms)
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 50868: Using `OfferedResources` for benchmark test of `SuppressOffers`.

2016-09-03 Thread Guangya Liu


> On 八月 25, 2016, 10:56 p.m., Jiang Yan Xu wrote:
> > I don't know. Making such distinction feels like splitting hairs to me. 
> > Yeah there are subtle differences between the two concepts but to me:
> > 
> > - `offerCallback` may not be the best name what we could have come up with 
> > even though in `Master::offer()` we do construct offers.
> > - When offer objects are constructed, we call the resources in them 
> > "offered resources". 
> > - When we parse resources from offer objects, we call them "offered 
> > resources".
> > - In this test, namely, the "Allocator" tests, we are indeed only 
> > allocating resources. (I don't see "Offer" objects being created).
> > 
> > Later if the allocator actually hands out offer objects, then we probably 
> > need to change the tests to use offers directly but right now I think 1) 
> > "allocation" is most correct, 2) it's not a big deal if people treat them 
> > as synonymous terms. 
> > 
> > That said, I think reusing the "Allocation" defined at the top of the test 
> > makes sense.
> 
> Guangya Liu wrote:
> Thanks Jiang Yan, there are acutally some discussion here 
> https://reviews.apache.org/r/50677/ , the reason that I want to update this 
> is because:
>  
> 1) Make the code consistent, in the allocator test cases, some benchmark 
> test using `OfferedResources` while some using `Allocation`, when people want 
> to add new benchmark test, they will be confused: Which one shall I use, 
> `OfferedResources` or `Allocation`, so here we need to make the code 
> consistent with each other for better reading, easy to maintain and 
> consistent.
> 
> 2) Seems here using `OfferedResources` maybe better, as here we are 
> actually constructing the `offers` on each agent; But the `Allocation` is an 
> allocator concept and it means the resources for a specified framework. Also 
> all of the log messages are highlighting `offer` in benchmark test such as 
> here 
> https://github.com/apache/mesos/blob/master/src/tests/hierarchical_allocator_tests.cpp#L3885
> 
> 3) The benchmark test is focusing on `offers` even though the allocator 
> know nothing about `offer`, but here we are just simulating `offer 
> operations` in benchmark test. For other unit test cases other than benchmark 
> test, they are focusing on `allocations` for each framework, so it is using 
> `Allocation`. 
> 
> What do you think?
> 
> Jiang Yan Xu wrote:
> Thanks for the reply. I've listed a few critia that I think could explain 
> their subtle differences but I try to see if making such a distinction leads 
> to simplity and clarity of the code. i.e., would everyone agree with the way 
> we make such distinction and would it help readers understand the code 
> better. If it's controversial, then perhaps we should aim for consistently. :)
> 
> In terms of code in this file, yes I agree there's inconsistency and we 
> should address it. To that end I think:
> 
> To 1): The question to me is to choose between *OfferedResources -> 
> Allocation* or *Allocation -> OfferedResources*. I am leaning towards 
> *OfferedResources -> Allocation* and FWIW `Allocation` is a more established 
> usage than `OfferedResources` in this file.
> 
> To 2) and 3): sorry I failed to see the fundamental difference between 
> the benchmarks and the unit tests here in terms of `offers` vs `allocations`. 
> The unit tests, espcially later-added ones, follow basically the same pattern 
> as the benchmarks. The benchmarks are just different in the use of 
> expectations vs. measurements and the number of iterations. It's not 
> convincing to me that in some cases we need to call them OfferedResources and 
> in others we should call them Allocations. 
> 
> So overall I think it's better to change to use the Allocation struct 
> defined at the top to achieve consistency. It looks to me that all 
> occurrences of `offers` in the log messages can be changed to `allocations` 
> as well.
> 
> Let me know what you think.
> 
> Guangya Liu wrote:
> Hi Jiang Yan, are you mentioning the `Allocation` definition here 
> https://github.com/apache/mesos/blob/9c6097f063405279efc07eec22457c2059653f07/src/tests/hierarchical_allocator_tests.cpp#L79-L83
>  ? They are a bit different:
> 
> ```
> struct Allocation
> {
>   FrameworkID frameworkId;
>   hashmap resources;
> };
> ```
> 
> ```
> struct Allocation
> {
>   FrameworkID   frameworkId;
>   SlaveID   slaveId;
>   Resources resources;
> };
> ```
> 
> The above two is a bit difference: The first `Allocation` is an 
> allocation for a framework and the second `Allocation` is actullay an `offer` 
> as one framework can have multiple such `Allocation`.
> 
> So it may not accurate to use `allocations` in the benchmark test as they 
> are actually `offers` for each framework, comments? Thanks.
> 
> 

Re: Review Request 49617: Add benchmark for failover of many frameworks.

2016-09-03 Thread Guangya Liu


> On 八月 24, 2016, 9:12 a.m., Guangya Liu wrote:
> > src/tests/hierarchical_allocator_tests.cpp, line 4030
> > 
> >
> > How about adding `ports` resources to the allocation here.
> > 
> > ```
> > // Each agent has a portion of it's resources allocated to a single
> > // framework. We round-robin through the frameworks when allocating.
> > Resources allocation = 
> > Resources::parse("cpus:16;mem:1024;disk:1024").get();
> > 
> > Try<::mesos::Value::Ranges> ranges = fragment(createRange(31000, 
> > 32000), 16);
> > ASSERT_SOME(ranges);
> > ASSERT_EQ(16, ranges->range_size());
> > 
> > allocation += createPorts(ranges.get());
> > ```
> 
> Jacob Janco wrote:
> I took this out to simplify the test.

I think that we should simulate a full allocate scernario in the benchmark test 
to include `ports` as well just like other benchmark test, as we actually have 
some special logic to handle non-scalar resources in sorter when allocate 
resources, comments?


- Guangya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49617/#review146625
---


On 八月 17, 2016, 2:26 a.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49617/
> ---
> 
> (Updated 八月 17, 2016, 2:26 a.m.)
> 
> 
> Review request for mesos, Joris Van Remoortere and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-5780
> https://issues.apache.org/jira/browse/MESOS-5780
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - This benchmark measures latency to stability of
>   the allocator following disconnection and
>   reconnection of all frameworks.
> - In this scenario, frameworks are offered resources
>   and suppressed in batches.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_tests.cpp 
> cbed333f497016fe2811f755028796012b41db77 
> 
> Diff: https://reviews.apache.org/r/49617/diff/
> 
> 
> Testing
> ---
> 
> MESOS_BENCHMARK=1 GTEST_FILTER="*BENCHMARK_Test.FrameworkFailover*" make check
> 
> Sample Output:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/23
> Using 1 agents and 6000 frameworks
> Added 6000 frameworks in 113410us
> Added 1 agents in 6.8398066333mins
> allocator settled after  3.286837mins
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/23
>  (609255 ms)[ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/24
> Using 2 agents and 1 frameworks
> Added 1 frameworks in 190us
> Added 2 agents in 4.752954secs
> allocator settled after  7us
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/24
>  (6332 ms)
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-03 Thread Guangya Liu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review147718
---




src/master/allocator/mesos/hierarchical.cpp (lines 1212 - 1215)


Just nit, I think that we should separate the `_allocate()` and 
`_deallocate()` to use different time elapse. But as the original logic is also 
putting the `_deallocate()` logic in `_allocate()`, so I think that it is OK to 
keep such logic but may need an update if we want to make the time more 
accurate for diffrent actions: `_allocate()` and `_deallocate()`.

```
stopwatch.start();

metrics.allocation_run.start();

_allocate();

metrics.allocation_run.stop();

stopwatch.stop();

VLOG(1) << "Performed allocation for " << allocationCandidates.size()
<< " agents in " << stopwatch.elapsed();

stopwatch.start();

_deallocate();

stopwatch.stop();

VLOG(1) << "Performed deallocation for " << allocationCandidates.size()
<< " agents in " << stopwatch.elapsed();
```


- Guangya Liu


On 九月 3, 2016, 6:05 a.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> ---
> 
> (Updated 九月 3, 2016, 6:05 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus 
> Ma, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
> - Batched allocations are handled synchronously.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> dd07ed221d2c1755d2478369641ffdc46ecc4471 
>   src/master/allocator/mesos/hierarchical.cpp 
> 9e5db2196c6a541dc1208ba8b9f13ef9a518bcc4 
> 
> Diff: https://reviews.apache.org/r/51027/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> note: check without filters depends on https://reviews.apache.org/r/51028
> 
> With new benchmark https://reviews.apache.org/r/49617: 
> Sample output without 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 1 agents in 3.2134535333mins
> allocator settled after  1.6123603833mins
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (290578 ms)
> 
> Sample output with 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 39817us
> Added 1 agents in 3.2286054167mins
> allocator settled after  25.525654secs
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (220137 ms)
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 49617: Add benchmark for failover of many frameworks.

2016-09-03 Thread Jacob Janco


> On Aug. 24, 2016, 9:12 a.m., Guangya Liu wrote:
> > src/tests/hierarchical_allocator_tests.cpp, line 4030
> > 
> >
> > How about adding `ports` resources to the allocation here.
> > 
> > ```
> > // Each agent has a portion of it's resources allocated to a single
> > // framework. We round-robin through the frameworks when allocating.
> > Resources allocation = 
> > Resources::parse("cpus:16;mem:1024;disk:1024").get();
> > 
> > Try<::mesos::Value::Ranges> ranges = fragment(createRange(31000, 
> > 32000), 16);
> > ASSERT_SOME(ranges);
> > ASSERT_EQ(16, ranges->range_size());
> > 
> > allocation += createPorts(ranges.get());
> > ```

I took this out to simplify the test.


- Jacob


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49617/#review146625
---


On Aug. 17, 2016, 2:26 a.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49617/
> ---
> 
> (Updated Aug. 17, 2016, 2:26 a.m.)
> 
> 
> Review request for mesos, Joris Van Remoortere and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-5780
> https://issues.apache.org/jira/browse/MESOS-5780
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - This benchmark measures latency to stability of
>   the allocator following disconnection and
>   reconnection of all frameworks.
> - In this scenario, frameworks are offered resources
>   and suppressed in batches.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_tests.cpp 
> cbed333f497016fe2811f755028796012b41db77 
> 
> Diff: https://reviews.apache.org/r/49617/diff/
> 
> 
> Testing
> ---
> 
> MESOS_BENCHMARK=1 GTEST_FILTER="*BENCHMARK_Test.FrameworkFailover*" make check
> 
> Sample Output:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/23
> Using 1 agents and 6000 frameworks
> Added 6000 frameworks in 113410us
> Added 1 agents in 6.8398066333mins
> allocator settled after  3.286837mins
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/23
>  (609255 ms)[ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/24
> Using 2 agents and 1 frameworks
> Added 1 frameworks in 190us
> Added 2 agents in 4.752954secs
> allocator settled after  7us
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/24
>  (6332 ms)
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-03 Thread Jacob Janco

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/
---

(Updated Sept. 3, 2016, 6:05 a.m.)


Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus Ma, 
and Jiang Yan Xu.


Bugs: MESOS-3157
https://issues.apache.org/jira/browse/MESOS-3157


Repository: mesos


Description (updated)
---

- Triggered allocations dispatch allocate() only
  if there is no pending allocation in the queue.
- Allocation candidates are accumulated and only
  cleared when enqueued allocations are processed.
- Batched allocations are handled synchronously.


Diffs (updated)
-

  src/master/allocator/mesos/hierarchical.hpp 
dd07ed221d2c1755d2478369641ffdc46ecc4471 
  src/master/allocator/mesos/hierarchical.cpp 
9e5db2196c6a541dc1208ba8b9f13ef9a518bcc4 

Diff: https://reviews.apache.org/r/51027/diff/


Testing
---

make check

note: check without filters depends on https://reviews.apache.org/r/51028

With new benchmark https://reviews.apache.org/r/49617: 
Sample output without 51027:
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
Using 1 agents and 3000 frameworks
Added 3000 frameworks in 57251us
Added 1 agents in 3.2134535333mins
allocator settled after  1.6123603833mins
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
 (290578 ms)

Sample output with 51027:
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
Using 1 agents and 3000 frameworks
Added 3000 frameworks in 39817us
Added 1 agents in 3.2286054167mins
allocator settled after  25.525654secs
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
 (220137 ms)


Thanks,

Jacob Janco