Re: Review Request 68785: Added jsonschema.py for managing json-based configs.

2018-12-05 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68785/#review211087
---



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['68785']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2679/mesos-review-68785

Relevant logs:

- 
[mesos-tests.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2679/mesos-review-68785/logs/mesos-tests.log):

```
I1206 07:11:53.926291 12628 master.cpp:11060] Removing task 
b906611e-a0d8-42d3-8b17-f6a1eebc28f1 with resources cpus(allocated: *):4; 
mem(allocated: *):2048; disk(allocated: *):1024; ports(allocated: 
*):[31000-32000] of framework d492d358-74c4-41bf-b7de-2aa984a21adc- on 
agent d492d358-74c4-41bf-b7de-2aa984a21adc-S0 at slave(461)@192.10.1.6:57295 
(windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I1206 07:11:53.929306 12628 master.cpp:1275] Agent 
d492d358-74c4-41bf-b7de-2aa984a21adc-S0 at slave(461)@192.10.1.6:57295 
(windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net) disconnected
I1206 07:11:53.929306 12628 master.cpp:3278] Disconnecting agent 
d492d358-74c4-41bf-b7de-2aa984a21adc-S0 at slave(461)@192.10.1.6:57295 
(windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I1206 07:11:53.929306 12628 master.cpp:3297] Deactivating agent 
d492d358-74c4-41bf-b7de-2aa984a21adc-S0 at slave(461)@192.10.1.6:57295 
(windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I1206 07:11:53.930291 14116 hierarchical.cpp:357] Removed framework 
d492d358-74c4-41bf-b7de-2aa984a21adc-
I1206 07:11:53.930291 14116 hierarchical.cpp:801] Agent 
d492d358-74c4-41bf-b7de-2aa984a21adc-S0 deactivated
I1206 07:11:53.930291 16004 containerizer.cpp:2463] Destroying container 
a038b849-a5d3-4c60-a9c9-6c021fc8e1d1 in RUNNING state
I1206 07:11:53.930291 16004 containerizer.cpp:3130] Transitioning the state of 
container a038b849-a5d3-4c60-a9c9-6c021fc8e1d1 from RUNNING to DESTROYING
I1206 07:11:53.931286 16004 launcher.cpp:161] Asked to destroy container 
a038b849-a5d3-4c60-a9c9-6c021fc8e1d1
W1206 07:11:53.932286 13540 process.cpp:1423] Failed to recv on socket 
WindowsFD::Type::SOCKET=2436 to peer '192.10.1.6:59096': IO failed with error 
code: The specified network name is no longer available.

W1206 07:11:53.933288 13540 process.cpp:838] Failed to recv on socket Window[   
OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (685 ms)
[--] 1 test from IsolationFlag/MemoryIsolatorTest (704 ms total)

[--] Global test environment tear-down
[==] 1056 tests from 103 test cases ran. (502177 ms total)
[  PASSED  ] 1055 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] DockerFetcherPluginTest.INTERNET_CURL_FetchImage

 1 FAILED TEST
  YOU HAVE 231 DISABLED TESTS

sFD::Type::SOCKET=2764 to peer '192.10.1.6:59097': IO failed with error code: 
The specified network name is no longer available.

I1206 07:11:53.971282 14116 containerizer.cpp:2969] Container 
a038b849-a5d3-4c60-a9c9-6c021fc8e1d1 has exited
I1206 07:11:54.000337 14528 master.cpp:1117] Master terminating
I1206 07:11:54.001298 12184 hierarchical.cpp:643] Removed agent 
d492d358-74c4-41bf-b7de-2aa984a21adc-S0
I1206 07:11:54.312300 13540 process.cpp:927] Stopped the socket accept loop
```

- Mesos Reviewbot Windows


On Dec. 6, 2018, 5:56 a.m., Eric Chung wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68785/
> ---
> 
> (Updated Dec. 6, 2018, 5:56 a.m.)
> 
> 
> Review request for mesos, Armand Grillet, Jason Lai, and Kevin Klues.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added jsonschema.py for managing json-based configs.
> 
> 
> Diffs
> -
> 
>   src/python/cli_new/lib/cli/jsonschema.py PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68785/diff/2/
> 
> 
> Testing
> ---
> 
> Ref MESOS-7278
> 
> 
> Thanks,
> 
> Eric Chung
> 
>



Re: Review Request 66649: Added pb2gen.sh for generating python protobuf bindings.

2018-12-05 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66649/#review211085
---



PASS: Mesos patch 66649 was successfully built and tested.

Reviews applied: `['66649']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2678/mesos-review-66649

- Mesos Reviewbot Windows


On Dec. 6, 2018, 4:26 a.m., Eric Chung wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66649/
> ---
> 
> (Updated Dec. 6, 2018, 4:26 a.m.)
> 
> 
> Review request for mesos, Armand Grillet, Jason Lai, Kevin Klues, and Zhitao 
> Li.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The current state of support for python in protoc has two serious
> issues:
> 1. The `__init__.py` files necessary to mark a directory as a python
> package aren't generated.
> 2. The import paths in each of the generated .py files do not reflect
> the `--python_path` option passed to `protoc`.
> 
> This results in incomplete code generated, preventing it from being used
> out of the box. To address this issue, we're adding a `pb2gen.sh` script
> to do end-to-end code generation for python protobuf bindings: the
> script generates the bindings based on what's in the `include` dir, then
> postprocesses the generated code to add proper import paths and the
> `__init__.py` files.
> 
> 
> Diffs
> -
> 
>   src/python/.gitignore fee529dccf57fd24036733f4b470b7b9865a918c 
>   src/python/lib/pb2gen.sh PRE-CREATION 
>   src/python/lib/requirements.in 86ee2ad40c0c114b0082f2d4f52f75758d41cbd9 
>   support/pylint.config af25dd90cb2d467c688ea4b060dc4640040a068b 
> 
> 
> Diff: https://reviews.apache.org/r/66649/diff/7/
> 
> 
> Testing
> ---
> 
> 1. under `src/python/lib`, run `pb2gen.sh`
> 2. initialize and activate virtualenv: `virtualenv env && . env/bin/activate`
> 3. install reqs: `pip install -r requirements.in`
> 4. try to import modules from generated python code: `python -c 'from 
> mesos.pb2.mesos.v1.master import master_pb2'`
> 
> 
> Thanks,
> 
> Eric Chung
> 
>



Re: Review Request 68785: Added jsonschema.py for managing json-based configs.

2018-12-05 Thread Eric Chung

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68785/
---

(Updated Dec. 6, 2018, 5:56 a.m.)


Review request for mesos, Armand Grillet, Jason Lai, and Kevin Klues.


Repository: mesos


Description (updated)
---

Added jsonschema.py for managing json-based configs.


Diffs (updated)
-

  src/python/cli_new/lib/cli/jsonschema.py PRE-CREATION 


Diff: https://reviews.apache.org/r/68785/diff/2/

Changes: https://reviews.apache.org/r/68785/diff/1-2/


Testing
---

Ref MESOS-7278


Thanks,

Eric Chung



Re: Review Request 69514: Added gauge metric for operator event stream subscribers.

2018-12-05 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69514/#review211084
---



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['69307', '69514']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2677/mesos-review-69514

Relevant logs:

- 
[mesos-tests.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2677/mesos-review-69514/logs/mesos-tests.log):

```
W1206 05:27:39.058878 14156 slave.cpp:3922] Ignoring shutdown framework 
d9bd20aa-ffd0-4bca-aa58-5667f9568e74- because it is terminating
I1206 05:27:39.061875 16248 master.cpp:1275] Agent 
d9bd20aa-ffd0-4bca-aa58-5667f9568e74-S0 at slave(461)@192.10.1.6:55204 
(windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net) disconnected
I1206 05:27:39.061875 16248 master.cpp:3278] Disconnecting agent 
d9bd20aa-ffd0-4bca-aa58-5667f9568e74-S0 at slave(461)@192.10.1.6:55204 
(windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I1206 05:27:39.061875 16248 master.cpp:3297] Deactivating agent 
d9bd20aa-ffd0-4bca-aa58-5667f9568e74-S0 at slave(461)@192.10.1.6:55204 
(windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I1206 05:27:39.061875 12100 hierarchical.cpp:357] Removed framework 
d9bd20aa-ffd0-4bca-aa58-5667f9568e74-
I1206 05:27:39.062891 12100 hierarchical.cpp:801] Agent 
d9bd20aa-ffd0-4bca-aa58-5667f9568e74-S0 deactivated
I1206 05:27:39.063894 12100 containerizer.cpp:2463] Destroying container 
d89d4df8-6c34-4767-91e0-07a2c072e3f1 in RUNNING state
I1206 05:27:39.063894 12100 containerizer.cpp:3130] Transitioning the state of 
container d89d4df8-6c34-4767-91e0-07a2c072e3f1 from RUNNING to DESTROYING
I1206 05:27:39.064885 12100 launcher.cpp:161] Asked to destroy container 
d89d4df8-6c34-4767-91e0-07a2c072e3f1
W1206 05:27:39.065882 10340 process.cpp:838] Failed to recv on socket 
WindowsFD::Type::SOCKET=3240 to peer '192.10.1.6:57026': IO failed with error 
code: The specified network name is no longer available[   OK ] 
IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (683 ms)
[--] 1 test from IsolationFlag/MemoryIsolatorTest (704 ms total)

[--] Global test environment tear-down
[==] 1058 tests from 103 test cases ran. (500613 ms total)
[  PASSED  ] 1057 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] DockerFetcherPluginTest.INTERNET_CURL_FetchImage

 1 FAILED TEST
  YOU HAVE 231 DISABLED TESTS

.

W1206 05:27:39.065882 10340 process.cpp:1423] Failed to recv on socket 
WindowsFD::Type::SOCKET=3720 to peer '192.10.1.6:57025': IO failed with error 
code: The specified network name is no longer available.

I1206 05:27:39.142838 16248 containerizer.cpp:2969] Container 
d89d4df8-6c34-4767-91e0-07a2c072e3f1 has exited
I1206 05:27:39.171838  6032 master.cpp:1117] Master terminating
I1206 05:27:39.172838 11132 hierarchical.cpp:643] Removed agent 
d9bd20aa-ffd0-4bca-aa58-5667f9568e74-S0
I1206 05:27:39.523833 10340 process.cpp:927] Stopped the socket accept loop
```

- Mesos Reviewbot Windows


On Dec. 6, 2018, 2:21 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69514/
> ---
> 
> (Updated Dec. 6, 2018, 2:21 a.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Gastón Kleiman, and 
> Greg Mann.
> 
> 
> Bugs: MESOS-9258
> https://issues.apache.org/jira/browse/MESOS-9258
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This metric "master/active_operator_event_stream_subscribers" returns
> the total number of subscribers to the master's operator event stream.
> 
> 
> Diffs
> -
> 
>   docs/monitoring.md 00c6ea94bcb73746aef740236632ede123f5b534 
>   docs/operator-http-api.md 6edf361047ee5915514da892c6d885b74a510a72 
>   src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
>   src/master/metrics.hpp f6bb89da9394bfe469bc690ff2de66824e994098 
>   src/master/metrics.cpp f69ed52c1c89912e7a5d26cb9409f5959b09111a 
>   src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 
>   src/tests/master_tests.cpp ef2c00101fc3d30c564a9ca34884dece2cdd2651 
> 
> 
> Diff: https://reviews.apache.org/r/69514/diff/2/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> src/mesos-tests --gtest_filter="*MaxEventStreamSubscribers*" 
> --gtest_repeat=-1 --gtest_break_on_failure
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 69474: Added tests for agent/executor heartbeating.

2018-12-05 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69474/#review211083
---



PASS: Mesos patch 69474 was successfully built and tested.

Reviews applied: `['69463', '69472', '69473', '69474']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2676/mesos-review-69474

- Mesos Reviewbot Windows


On Dec. 6, 2018, 1:26 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69474/
> ---
> 
> (Updated Dec. 6, 2018, 1:26 a.m.)
> 
> 
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
> 
> 
> Bugs: MESOS-7564
> https://issues.apache.org/jira/browse/MESOS-7564
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This adds two separate tests which check if the Agent sends heartbeats
> to HTTP executors, and if the HTTP executor driver sends heartbeats
> to the agent.
> 
> 
> Diffs
> -
> 
>   src/tests/executor_http_api_tests.cpp 
> a635e1a3c228b52f47ba81a67753ea49ce092143 
>   src/tests/mesos.hpp c08e7e6c1dbc3dd9eb980868e43368c0a423c3ee 
> 
> 
> Diff: https://reviews.apache.org/r/69474/diff/4/
> 
> 
> Testing
> ---
> 
> ```
> cmake --build . --target check
> src/mesos-tests --gtest_filter="*Heartbeat*" --verbose --gtest_repeat=-1 
> --gtest_break_on_failure
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 66649: Added pb2gen.sh for generating python protobuf bindings.

2018-12-05 Thread Eric Chung

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66649/
---

(Updated Dec. 6, 2018, 4:26 a.m.)


Review request for mesos, Armand Grillet, Jason Lai, Kevin Klues, and Zhitao Li.


Repository: mesos


Description
---

The current state of support for python in protoc has two serious
issues:
1. The `__init__.py` files necessary to mark a directory as a python
package aren't generated.
2. The import paths in each of the generated .py files do not reflect
the `--python_path` option passed to `protoc`.

This results in incomplete code generated, preventing it from being used
out of the box. To address this issue, we're adding a `pb2gen.sh` script
to do end-to-end code generation for python protobuf bindings: the
script generates the bindings based on what's in the `include` dir, then
postprocesses the generated code to add proper import paths and the
`__init__.py` files.


Diffs (updated)
-

  src/python/.gitignore fee529dccf57fd24036733f4b470b7b9865a918c 
  src/python/lib/pb2gen.sh PRE-CREATION 
  src/python/lib/requirements.in 86ee2ad40c0c114b0082f2d4f52f75758d41cbd9 
  support/pylint.config af25dd90cb2d467c688ea4b060dc4640040a068b 


Diff: https://reviews.apache.org/r/66649/diff/7/

Changes: https://reviews.apache.org/r/66649/diff/6-7/


Testing
---

1. under `src/python/lib`, run `pb2gen.sh`
2. initialize and activate virtualenv: `virtualenv env && . env/bin/activate`
3. install reqs: `pip install -r requirements.in`
4. try to import modules from generated python code: `python -c 'from 
mesos.pb2.mesos.v1.master import master_pb2'`


Thanks,

Eric Chung



Re: Review Request 68654: Enabled isort for src/python/lib.

2018-12-05 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68654/#review211082
---



PASS: Mesos patch 68654 was successfully built and tested.

Reviews applied: `['68654']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2675/mesos-review-68654

- Mesos Reviewbot Windows


On Dec. 6, 2018, 1:26 a.m., Eric Chung wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68654/
> ---
> 
> (Updated Dec. 6, 2018, 1:26 a.m.)
> 
> 
> Review request for mesos, Armand Grillet, Jason Lai, and Kevin Klues.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Enabled isort for src/python/lib.
> 
> 
> Diffs
> -
> 
>   src/python/cli_new/.isort.cfg PRE-CREATION 
>   src/python/cli_new/bin/main.py 430add47d726b3de9cbd921d76b8fb17bebd4fdd 
>   src/python/cli_new/bin/settings.py 3de3387a4d4781aa784b1030f621fa2a41755a50 
>   src/python/cli_new/lib/cli/config.py 
> 7f41736357182550711dc05d83ecd6045b5559d6 
>   src/python/cli_new/lib/cli/docopt.py 
> c624175845a719c06e5c1b9840fc9eafad26907d 
>   src/python/cli_new/lib/cli/http.py 1d8fc5fcbc2b3a027db7fa49bb03160a64d395d6 
>   src/python/cli_new/lib/cli/plugins/base.py 
> e0fcbbf6250eac97f9e989a7c0852734d107dc34 
>   src/python/cli_new/lib/cli/plugins/config/main.py 
> 41bdb60e6abcb05104ffb78e7e57c4959433ab1a 
>   src/python/cli_new/lib/cli/tests/agent.py 
> 31e3e3f17fa81bdbea2e3fd41588d2a65a06e07e 
>   src/python/cli_new/lib/cli/tests/base.py 
> e3104fe1b1e76cbcf0ce6994a9bfea703b54d1d1 
>   src/python/cli_new/lib/cli/tests/tests.py 
> 60aa4e8bdcf15bf9b9174e21a2470d15b23d81dc 
>   src/python/cli_new/tests/main.py ef02fdd6ae30ef4eaaf7fb534baed594b28349e6 
>   src/python/cli_new/tox.ini 236adc7425a57d3e1fae35432527c83ccb46a4c7 
>   src/python/lib/.isort.cfg PRE-CREATION 
>   src/python/lib/mesos/http.py 5a8016da50b2e1e5a81710f10a7ef65d6f4198bd 
>   src/python/lib/tests/test_exceptions.py 
> 096eab82311e02b0fa829101b6ad1b23e42ce088 
>   src/python/lib/tests/test_http.py 41a52f511318cb69c8e9976f09e2f142327544ab 
>   src/python/lib/tox.ini 3ee77681a9b802cd5b4a7910779b8d50aac4cf69 
> 
> 
> Diff: https://reviews.apache.org/r/68654/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Eric Chung
> 
>



Re: Review Request 67185: Added request_protobuf to mesos.http.

2018-12-05 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67185/#review211081
---



PASS: Mesos patch 67185 was successfully built and tested.

Reviews applied: `['67185']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2674/mesos-review-67185

- Mesos Reviewbot Windows


On Dec. 5, 2018, 5:28 p.m., Eric Chung wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67185/
> ---
> 
> (Updated Dec. 5, 2018, 5:28 p.m.)
> 
> 
> Review request for mesos, Armand Grillet, Jason Lai, Kevin Klues, and Zhitao 
> Li.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This change adds the method `Resource.request_protobuf()` which assumes
> that both the request and response data are encoded in protobuf. This
> will be used later in the RPC client, which will also include a decoding
> step.
> 
> Testing Done:
> Ran unit tests with `tox` under `src/python/lib`
> 
> 
> Diffs
> -
> 
>   src/python/lib/mesos/constants.py PRE-CREATION 
>   src/python/lib/mesos/http.py 5a8016da50b2e1e5a81710f10a7ef65d6f4198bd 
>   src/python/lib/requirements.in 86ee2ad40c0c114b0082f2d4f52f75758d41cbd9 
>   src/python/lib/tests/test_http.py 41a52f511318cb69c8e9976f09e2f142327544ab 
> 
> 
> Diff: https://reviews.apache.org/r/67185/diff/5/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Eric Chung
> 
>



Re: Review Request 69514: Added gauge metric for operator event stream subscribers.

2018-12-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69514/
---

(Updated Dec. 5, 2018, 6:21 p.m.)


Review request for mesos, Benno Evers, Benjamin Mahler, Gastón Kleiman, and 
Greg Mann.


Changes
---

Changed metric to a `PushGauge`.  I chose to use the `operator=` instead of 
`operator++ / operator--` because the size is bounded and there would be 
additional logic if I used the increment operators.


Bugs: MESOS-9258
https://issues.apache.org/jira/browse/MESOS-9258


Repository: mesos


Description
---

This metric "master/active_operator_event_stream_subscribers" returns
the total number of subscribers to the master's operator event stream.


Diffs (updated)
-

  docs/monitoring.md 00c6ea94bcb73746aef740236632ede123f5b534 
  docs/operator-http-api.md 6edf361047ee5915514da892c6d885b74a510a72 
  src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
  src/master/metrics.hpp f6bb89da9394bfe469bc690ff2de66824e994098 
  src/master/metrics.cpp f69ed52c1c89912e7a5d26cb9409f5959b09111a 
  src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 
  src/tests/master_tests.cpp ef2c00101fc3d30c564a9ca34884dece2cdd2651 


Diff: https://reviews.apache.org/r/69514/diff/2/

Changes: https://reviews.apache.org/r/69514/diff/1-2/


Testing
---

```
make check

src/mesos-tests --gtest_filter="*MaxEventStreamSubscribers*" --gtest_repeat=-1 
--gtest_break_on_failure
```


Thanks,

Joseph Wu



Re: Review Request 69514: Added gauge metric for operator event stream subscribers.

2018-12-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69514/#review211080
---




src/master/metrics.hpp
Lines 63 (patched)


+1 to Ben's suggestion, let's make this a push gauge.


- Greg Mann


On Dec. 5, 2018, 10:03 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69514/
> ---
> 
> (Updated Dec. 5, 2018, 10:03 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Gastón Kleiman, and 
> Greg Mann.
> 
> 
> Bugs: MESOS-9258
> https://issues.apache.org/jira/browse/MESOS-9258
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This metric "master/active_operator_event_stream_subscribers" returns
> the total number of subscribers to the master's operator event stream.
> 
> 
> Diffs
> -
> 
>   docs/monitoring.md 00c6ea94bcb73746aef740236632ede123f5b534 
>   docs/operator-http-api.md 6edf361047ee5915514da892c6d885b74a510a72 
>   src/master/master.hpp c7becfa615964674dcf1ebd9424aa5818a0fdb85 
>   src/master/metrics.hpp f6bb89da9394bfe469bc690ff2de66824e994098 
>   src/master/metrics.cpp f69ed52c1c89912e7a5d26cb9409f5959b09111a 
>   src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 
>   src/tests/master_tests.cpp ef2c00101fc3d30c564a9ca34884dece2cdd2651 
> 
> 
> Diff: https://reviews.apache.org/r/69514/diff/1/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> src/mesos-tests --gtest_filter="*MaxEventStreamSubscribers*" 
> --gtest_repeat=-1 --gtest_break_on_failure
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 69307: Changed master to hold subscribers in a circular buffer.

2018-12-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69307/#review211079
---


Ship it!




Ship It!

- Greg Mann


On Dec. 5, 2018, 10:01 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69307/
> ---
> 
> (Updated Dec. 5, 2018, 10:01 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Gastón Kleiman, and 
> Greg Mann.
> 
> 
> Bugs: MESOS-9258
> https://issues.apache.org/jira/browse/MESOS-9258
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This adds a flag (--max_operator_event_stream_subscribers) to the
> master which controls how many active subscribers on the Master's
> event stream will be allowed at any time.
> 
> The default is 1000 subscribers, which is purposefully higher
> than we expect is needed.  Operators aware that their  network
> has clients/proxies whom do not close connections have the
> option of lowering this flag.
> 
> 
> Diffs
> -
> 
>   docs/configuration/master.md 575476728079d884fe86b1a3a56614647d1b572e 
>   src/master/constants.hpp 76ad0c3b1494cb97888af5545acc998b8fb15bf6 
>   src/master/flags.hpp ed2d76af2831d1b9718724936fd3e4850e99b332 
>   src/master/flags.cpp 2677738d19caa7d600e591aeb4fb37f0c2318d45 
>   src/master/master.hpp c7becfa615964674dcf1ebd9424aa5818a0fdb85 
>   src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
>   src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 
> 
> 
> Diff: https://reviews.apache.org/r/69307/diff/3/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> src/mesos-tests --gtest_filter="*MaxEventStreamSubscribers*" 
> --gtest_repeat=-1 --gtest_break_on_failure
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 69473: Added heartbeaters for agent and HTTP executors.

2018-12-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69473/#review211078
---


Ship it!




Ship It!

- Greg Mann


On Dec. 6, 2018, 1:24 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69473/
> ---
> 
> (Updated Dec. 6, 2018, 1:24 a.m.)
> 
> 
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
> 
> 
> Bugs: MESOS-7564
> https://issues.apache.org/jira/browse/MESOS-7564
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This implements two separate heartbeaters for Executor Events (agent
> to executor) and Executor Calls (executor to agent).  Both are set to
> non-configurable intervals of 30 minutes, which should be sufficient
> to keep the connections alive while not flooding logs with warnings
> if the executor/agent does not have this patch.
> 
> 
> Diffs
> -
> 
>   src/common/validation.cpp 5f8f2889f79f9486f8bb731b39f706908dcd84ea 
>   src/executor/executor.cpp d00ea32f02c1d92d64525b8b5ce5ca18ac840c1b 
>   src/launcher/default_executor.cpp b89b0363d64b8dd8936a62b57803f0eff50bfc71 
>   src/launcher/executor.cpp 6b1204dc5a67d4fc3e8ceae378262d53f3d28421 
>   src/slave/constants.hpp fdc21a35bcc0028dc0a2d7ebd06171fad2a20ba1 
>   src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
>   src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
> 
> 
> Diff: https://reviews.apache.org/r/69473/diff/3/
> 
> 
> Testing
> ---
> 
> make
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 69472: Refactored master and agent streaming connections.

2018-12-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69472/#review211076
---


Fix it, then Ship it!




Looks great, thanks Joseph!


src/slave/http.cpp
Line 834 (original), 834-835 (patched)


Nit: not indented far enough after open paren.


- Greg Mann


On Dec. 6, 2018, 1:23 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69472/
> ---
> 
> (Updated Dec. 6, 2018, 1:23 a.m.)
> 
> 
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
> 
> 
> Bugs: MESOS-7564
> https://issues.apache.org/jira/browse/MESOS-7564
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This moves the very similar `HttpConnection` classes inside the
> master and agent into a common header.  The refactored
> `StreamingHttpConnection` is more explicitly named to avoid
> potentially clashing with the libprocess HTTP helpers.
> 
> This also moves the master's heartbeater helper into a new header
> and transforms it into an RAII libprocess actor wrapper.  The 
> heartbeater depends on this `StreamingHttpConnection` and is currently
> used by the master for heartbeating the operator event stream
> and HTTP framework connection.  A later patch will use this heartbeater
> for agent->executor heartbeats.
> 
> 
> Diffs
> -
> 
>   src/common/heartbeater.hpp PRE-CREATION 
>   src/common/http.hpp ac9ed5e8b1c6e8e100eba401136eb95c9dd621d4 
>   src/master/framework.cpp 36eda9f287bef28608024e0036db1f955a1b393c 
>   src/master/http.cpp 68ee2a6dcffbc772afec6e797b1af8da48f61937 
>   src/master/master.hpp c7becfa615964674dcf1ebd9424aa5818a0fdb85 
>   src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
>   src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
>   src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
> 
> 
> Diff: https://reviews.apache.org/r/69472/diff/3/
> 
> 
> Testing
> ---
> 
> make
> 
> The original plan was to use the same helper for the executor too, but of the 
> Master, Framework, and Agent heartbeats, only the executor lacks a streaming 
> connection.
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 67185: Added request_protobuf to mesos.http.

2018-12-05 Thread Eric Chung

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67185/
---

(Updated Dec. 6, 2018, 1:28 a.m.)


Review request for mesos, Armand Grillet, Jason Lai, Kevin Klues, and Zhitao Li.


Repository: mesos


Description
---

This change adds the method `Resource.request_protobuf()` which assumes
that both the request and response data are encoded in protobuf. This
will be used later in the RPC client, which will also include a decoding
step.

Testing Done:
Ran unit tests with `tox` under `src/python/lib`


Diffs (updated)
-

  src/python/lib/mesos/constants.py PRE-CREATION 
  src/python/lib/mesos/http.py 5a8016da50b2e1e5a81710f10a7ef65d6f4198bd 
  src/python/lib/requirements.in 86ee2ad40c0c114b0082f2d4f52f75758d41cbd9 
  src/python/lib/tests/test_http.py 41a52f511318cb69c8e9976f09e2f142327544ab 


Diff: https://reviews.apache.org/r/67185/diff/5/

Changes: https://reviews.apache.org/r/67185/diff/4-5/


Testing
---


Thanks,

Eric Chung



Re: Review Request 69474: Added tests for agent/executor heartbeating.

2018-12-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69474/
---

(Updated Dec. 5, 2018, 5:26 p.m.)


Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Changes
---

Address comments, deal with MSVC compilation error, and fix warning in the 
MockHttpExecutor switch statement.


Bugs: MESOS-7564
https://issues.apache.org/jira/browse/MESOS-7564


Repository: mesos


Description
---

This adds two separate tests which check if the Agent sends heartbeats
to HTTP executors, and if the HTTP executor driver sends heartbeats
to the agent.


Diffs (updated)
-

  src/tests/executor_http_api_tests.cpp 
a635e1a3c228b52f47ba81a67753ea49ce092143 
  src/tests/mesos.hpp c08e7e6c1dbc3dd9eb980868e43368c0a423c3ee 


Diff: https://reviews.apache.org/r/69474/diff/4/

Changes: https://reviews.apache.org/r/69474/diff/3-4/


Testing
---

```
cmake --build . --target check
src/mesos-tests --gtest_filter="*Heartbeat*" --verbose --gtest_repeat=-1 
--gtest_break_on_failure
```


Thanks,

Joseph Wu



Re: Review Request 68654: Enabled isort for src/python/lib.

2018-12-05 Thread Eric Chung

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68654/
---

(Updated Dec. 6, 2018, 1:26 a.m.)


Review request for mesos, Armand Grillet, Jason Lai, and Kevin Klues.


Repository: mesos


Description
---

Enabled isort for src/python/lib.


Diffs (updated)
-

  src/python/cli_new/.isort.cfg PRE-CREATION 
  src/python/cli_new/bin/main.py 430add47d726b3de9cbd921d76b8fb17bebd4fdd 
  src/python/cli_new/bin/settings.py 3de3387a4d4781aa784b1030f621fa2a41755a50 
  src/python/cli_new/lib/cli/config.py 7f41736357182550711dc05d83ecd6045b5559d6 
  src/python/cli_new/lib/cli/docopt.py c624175845a719c06e5c1b9840fc9eafad26907d 
  src/python/cli_new/lib/cli/http.py 1d8fc5fcbc2b3a027db7fa49bb03160a64d395d6 
  src/python/cli_new/lib/cli/plugins/base.py 
e0fcbbf6250eac97f9e989a7c0852734d107dc34 
  src/python/cli_new/lib/cli/plugins/config/main.py 
41bdb60e6abcb05104ffb78e7e57c4959433ab1a 
  src/python/cli_new/lib/cli/tests/agent.py 
31e3e3f17fa81bdbea2e3fd41588d2a65a06e07e 
  src/python/cli_new/lib/cli/tests/base.py 
e3104fe1b1e76cbcf0ce6994a9bfea703b54d1d1 
  src/python/cli_new/lib/cli/tests/tests.py 
60aa4e8bdcf15bf9b9174e21a2470d15b23d81dc 
  src/python/cli_new/tests/main.py ef02fdd6ae30ef4eaaf7fb534baed594b28349e6 
  src/python/cli_new/tox.ini 236adc7425a57d3e1fae35432527c83ccb46a4c7 
  src/python/lib/.isort.cfg PRE-CREATION 
  src/python/lib/mesos/http.py 5a8016da50b2e1e5a81710f10a7ef65d6f4198bd 
  src/python/lib/tests/test_exceptions.py 
096eab82311e02b0fa829101b6ad1b23e42ce088 
  src/python/lib/tests/test_http.py 41a52f511318cb69c8e9976f09e2f142327544ab 
  src/python/lib/tox.ini 3ee77681a9b802cd5b4a7910779b8d50aac4cf69 


Diff: https://reviews.apache.org/r/68654/diff/4/

Changes: https://reviews.apache.org/r/68654/diff/3-4/


Testing
---


Thanks,

Eric Chung



Re: Review Request 69473: Added heartbeaters for agent and HTTP executors.

2018-12-05 Thread Joseph Wu


> On Dec. 5, 2018, 4:49 p.m., Greg Mann wrote:
> > src/executor/executor.cpp
> > Lines 797 (patched)
> > 
> >
> > Can we break `Minutes(30)` out into a common header somewhere so we can 
> > use it in the test as well? If we don't do this now, let's leave a TODO for 
> > that as well.

I've opted for the TODO, since we don't have a good header for executor 
constants right now.  There are a few other constants and flags for executors 
which we could potentially put into a header somewhere.  But we can discuss/do 
that in future.


- Joseph


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69473/#review211070
---


On Dec. 5, 2018, 5:24 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69473/
> ---
> 
> (Updated Dec. 5, 2018, 5:24 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
> 
> 
> Bugs: MESOS-7564
> https://issues.apache.org/jira/browse/MESOS-7564
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This implements two separate heartbeaters for Executor Events (agent
> to executor) and Executor Calls (executor to agent).  Both are set to
> non-configurable intervals of 30 minutes, which should be sufficient
> to keep the connections alive while not flooding logs with warnings
> if the executor/agent does not have this patch.
> 
> 
> Diffs
> -
> 
>   src/common/validation.cpp 5f8f2889f79f9486f8bb731b39f706908dcd84ea 
>   src/executor/executor.cpp d00ea32f02c1d92d64525b8b5ce5ca18ac840c1b 
>   src/launcher/default_executor.cpp b89b0363d64b8dd8936a62b57803f0eff50bfc71 
>   src/launcher/executor.cpp 6b1204dc5a67d4fc3e8ceae378262d53f3d28421 
>   src/slave/constants.hpp fdc21a35bcc0028dc0a2d7ebd06171fad2a20ba1 
>   src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
>   src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
> 
> 
> Diff: https://reviews.apache.org/r/69473/diff/3/
> 
> 
> Testing
> ---
> 
> make
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 69473: Added heartbeaters for agent and HTTP executors.

2018-12-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69473/
---

(Updated Dec. 5, 2018, 5:24 p.m.)


Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Changes
---

Extended a TODO and fixed an incorrect copy of the heartbeater.


Bugs: MESOS-7564
https://issues.apache.org/jira/browse/MESOS-7564


Repository: mesos


Description
---

This implements two separate heartbeaters for Executor Events (agent
to executor) and Executor Calls (executor to agent).  Both are set to
non-configurable intervals of 30 minutes, which should be sufficient
to keep the connections alive while not flooding logs with warnings
if the executor/agent does not have this patch.


Diffs (updated)
-

  src/common/validation.cpp 5f8f2889f79f9486f8bb731b39f706908dcd84ea 
  src/executor/executor.cpp d00ea32f02c1d92d64525b8b5ce5ca18ac840c1b 
  src/launcher/default_executor.cpp b89b0363d64b8dd8936a62b57803f0eff50bfc71 
  src/launcher/executor.cpp 6b1204dc5a67d4fc3e8ceae378262d53f3d28421 
  src/slave/constants.hpp fdc21a35bcc0028dc0a2d7ebd06171fad2a20ba1 
  src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
  src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
  src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 


Diff: https://reviews.apache.org/r/69473/diff/3/

Changes: https://reviews.apache.org/r/69473/diff/2-3/


Testing
---

make


Thanks,

Joseph Wu



Re: Review Request 69472: Refactored master and agent streaming connections.

2018-12-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69472/
---

(Updated Dec. 5, 2018, 5:23 p.m.)


Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Changes
---

Was accidentally copying the `ResponseHeartbeater` class, causing the actor to 
be terminated immediately.  Disallowed copying of that object and fixed the 
copy.


Bugs: MESOS-7564
https://issues.apache.org/jira/browse/MESOS-7564


Repository: mesos


Description
---

This moves the very similar `HttpConnection` classes inside the
master and agent into a common header.  The refactored
`StreamingHttpConnection` is more explicitly named to avoid
potentially clashing with the libprocess HTTP helpers.

This also moves the master's heartbeater helper into a new header
and transforms it into an RAII libprocess actor wrapper.  The 
heartbeater depends on this `StreamingHttpConnection` and is currently
used by the master for heartbeating the operator event stream
and HTTP framework connection.  A later patch will use this heartbeater
for agent->executor heartbeats.


Diffs (updated)
-

  src/common/heartbeater.hpp PRE-CREATION 
  src/common/http.hpp ac9ed5e8b1c6e8e100eba401136eb95c9dd621d4 
  src/master/framework.cpp 36eda9f287bef28608024e0036db1f955a1b393c 
  src/master/http.cpp 68ee2a6dcffbc772afec6e797b1af8da48f61937 
  src/master/master.hpp c7becfa615964674dcf1ebd9424aa5818a0fdb85 
  src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
  src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
  src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
  src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 


Diff: https://reviews.apache.org/r/69472/diff/3/

Changes: https://reviews.apache.org/r/69472/diff/2-3/


Testing
---

make

The original plan was to use the same helper for the executor too, but of the 
Master, Framework, and Agent heartbeats, only the executor lacks a streaming 
connection.


Thanks,

Joseph Wu



Re: Review Request 69098: Added a benchmark to compare quota and nonquota allocation performance.

2018-12-05 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69098/#review211073
---



PASS: Mesos patch 69098 was successfully built and tested.

Reviews applied: `['69092', '69093', '69094', '69096', '69097', '69098']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2673/mesos-review-69098

- Mesos Reviewbot Windows


On Oct. 22, 2018, 3:39 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69098/
> ---
> 
> (Updated Oct. 22, 2018, 3:39 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Gastón Kleiman.
> 
> 
> Bugs: MESOS-6630
> https://issues.apache.org/jira/browse/MESOS-6630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This benchmark evaluates the performance difference between nonquota
> and quota settings. In both settings, the same allocations are made
> for fair comparison. In particular, since the agent will always be
> allocated as a whole in nonquota settings, we should also avoid
> agent chopping in quota setting as well. Thus in this benchmark,
> quotas are only set to be multiples of whole agent resources.
> This is also why we have this dedicated benchmark for comparison
> rather than extending the existing quota benchmarks (which involves
> agent chopping).
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_benchmarks.cpp 
> 6c6330e8cdbc705be322a7e2445b295c35ee6952 
> 
> 
> Diff: https://reviews.apache.org/r/69098/diff/2/
> 
> 
> Testing
> ---
> 
> Result from optimized build on a multicore 2.2GHz machine:
> 
> Benchmark setup: 20 agents, 10 roles, 20 frameworks
> Made 20 allocations in 3.508676ms for nonquota roles
> Made 20 allocations in 7.901269ms for quota roles
> 
> Benchmark setup: 200 agents, 100 roles, 200 frameworks
> Made 200 allocations in 63.407391ms for nonquota roles
> Made 200 allocations in 279.002319ms for quota roles
> 
> Benchmark setup: 2000 agents, 1000 roles, 2000 frameworks
> Made 2000 allocations in 4.003802373secs for nonquota roles
> Made 2000 allocations in 20.503188535secs for quota roles
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 69097: Added an allocator benchmark for quota performance.

2018-12-05 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69097/
---

(Updated Dec. 5, 2018, 5:16 p.m.)


Review request for mesos, Benjamin Mahler and Gastón Kleiman.


Changes
---

Introduced more chopping in the benchmark and updated the test result.


Bugs: MESOS-6630
https://issues.apache.org/jira/browse/MESOS-6630


Repository: mesos


Description
---

This benchmark evaluates the allocator performance in
the presence of roles with both small quota (which can
be satisfied by half an agent) as well as large quota
(which need resources from two agents). We setup the cluster,
trigger one allocation cycle and measure the elapsed time.


Diffs (updated)
-

  src/tests/hierarchical_allocator_benchmarks.cpp 
6c6330e8cdbc705be322a7e2445b295c35ee6952 


Diff: https://reviews.apache.org/r/69097/diff/2/

Changes: https://reviews.apache.org/r/69097/diff/1-2/


Testing (updated)
---

Result from optimized build on a multicore 2.2GHz machine:

Added 30 agents in 1.612314ms
Added 30 frameworks in 8.353056ms
Benchmark setup: 30 agents, 30 roles, 30 frameworks
Start allocation
Made 36 allocations in 13.844557ms
Made 0 allocation in 8.541409ms

Added 300 agents in 12.506766ms
Added 300 frameworks in 540.262526ms
Benchmark setup: 300 agents, 300 roles, 300 frameworks
Start allocation
Made 350 allocations in 952.103434ms
Made 0 allocation in 770.261096ms

Added 3000 agents in 119.947513ms
Added 3000 frameworks in 55.630070532secs
Benchmark setup: 3000 agents, 3000 roles, 3000 frameworks
Start allocation
Made 3500 allocations in 1.974348067mins
Made 0 allocation in 1.83595374028333mins


Thanks,

Meng Zhu



Re: Review Request 69097: Added an allocator benchmark for quota performance.

2018-12-05 Thread Meng Zhu


> On Nov. 2, 2018, 3:22 p.m., Benjamin Mahler wrote:
> > src/tests/hierarchical_allocator_benchmarks.cpp
> > Lines 498-499 (patched)
> > 
> >
> > Why not make it even smaller (e.g. 1/4 or 1/10th) to be more 
> > pathological?

Updated it to 1/5 for small roles and 5 for large roles.


> On Nov. 2, 2018, 3:22 p.m., Benjamin Mahler wrote:
> > src/tests/hierarchical_allocator_benchmarks.cpp
> > Lines 601-609 (patched)
> > 
> >
> > No need for the comments, seems self explanatory?

Done.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69097/#review210291
---


On Oct. 22, 2018, 3:32 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69097/
> ---
> 
> (Updated Oct. 22, 2018, 3:32 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Gastón Kleiman.
> 
> 
> Bugs: MESOS-6630
> https://issues.apache.org/jira/browse/MESOS-6630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This benchmark evaluates the allocator performance in
> the presence of roles with both small quota (which can
> be satisfied by half an agent) as well as large quota
> (which need resources from two agents). We setup the cluster,
> trigger one allocation cycle and measure the elapsed time.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_benchmarks.cpp 
> bf9167b63747f7b8a402d950947028436307082a 
> 
> 
> Diff: https://reviews.apache.org/r/69097/diff/1/
> 
> 
> Testing
> ---
> 
> Result from optimized build on a multicore 2.2GHz machine:
> 
> Benchmark setup: 20 agents, 25 roles, 25 frameworks
> Made 30 allocations in 11.01225ms
> Made 0 allocation in 5.758192ms
> 
> Benchmark setup: 200 agents, 250 roles, 250 frameworks
> Made 300 allocations in 337.974387ms
> Made 0 allocation in 313.298636ms
> 
> Benchmark setup: 2000 agents, 2500 roles, 2500 frameworks
> Made 3000 allocations in 27.881684393secs
> Made 0 allocation in 25.013398926secs
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 69514: Added gauge metric for operator event stream subscribers.

2018-12-05 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69514/#review211071
---



PASS: Mesos patch 69514 was successfully built and tested.

Reviews applied: `['69307', '69514']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2672/mesos-review-69514

- Mesos Reviewbot Windows


On Dec. 5, 2018, 10:03 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69514/
> ---
> 
> (Updated Dec. 5, 2018, 10:03 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Gastón Kleiman, and 
> Greg Mann.
> 
> 
> Bugs: MESOS-9258
> https://issues.apache.org/jira/browse/MESOS-9258
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This metric "master/active_operator_event_stream_subscribers" returns
> the total number of subscribers to the master's operator event stream.
> 
> 
> Diffs
> -
> 
>   docs/monitoring.md 00c6ea94bcb73746aef740236632ede123f5b534 
>   docs/operator-http-api.md 6edf361047ee5915514da892c6d885b74a510a72 
>   src/master/master.hpp c7becfa615964674dcf1ebd9424aa5818a0fdb85 
>   src/master/metrics.hpp f6bb89da9394bfe469bc690ff2de66824e994098 
>   src/master/metrics.cpp f69ed52c1c89912e7a5d26cb9409f5959b09111a 
>   src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 
>   src/tests/master_tests.cpp ef2c00101fc3d30c564a9ca34884dece2cdd2651 
> 
> 
> Diff: https://reviews.apache.org/r/69514/diff/1/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> src/mesos-tests --gtest_filter="*MaxEventStreamSubscribers*" 
> --gtest_repeat=-1 --gtest_break_on_failure
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 69474: Added tests for agent/executor heartbeating.

2018-12-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69474/#review211056
---


Fix it, then Ship it!





src/tests/executor_http_api_tests.cpp
Lines 31-32 (patched)


These can be removed.



src/tests/executor_http_api_tests.cpp
Lines 973-974 (patched)


Newline between these lines?



src/tests/executor_http_api_tests.cpp
Lines 1121 (patched)


s/heartbeats are/heartbeats are sent/



src/tests/executor_http_api_tests.cpp
Lines 1171 (patched)


s/Calls/calls/ for consistency with "events".



src/tests/executor_http_api_tests.cpp
Lines 1226 (patched)


If we don't break this value out into a header now, let's leave a TODO here 
to update later.


- Greg Mann


On Nov. 30, 2018, 12:03 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69474/
> ---
> 
> (Updated Nov. 30, 2018, 12:03 a.m.)
> 
> 
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
> 
> 
> Bugs: MESOS-7564
> https://issues.apache.org/jira/browse/MESOS-7564
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This adds two separate tests which check if the Agent sends heartbeats
> to HTTP executors, and if the HTTP executor driver sends heartbeats
> to the agent.
> 
> 
> Diffs
> -
> 
>   src/tests/executor_http_api_tests.cpp 
> a635e1a3c228b52f47ba81a67753ea49ce092143 
> 
> 
> Diff: https://reviews.apache.org/r/69474/diff/3/
> 
> 
> Testing
> ---
> 
> ```
> cmake --build . --target check
> src/mesos-tests --gtest_filter="*Heartbeat*" --verbose --gtest_repeat=-1 
> --gtest_break_on_failure
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 69473: Added heartbeaters for agent and HTTP executors.

2018-12-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69473/#review211070
---




src/executor/executor.cpp
Lines 797 (patched)


Can we break `Minutes(30)` out into a common header somewhere so we can use 
it in the test as well? If we don't do this now, let's leave a TODO for that as 
well.


- Greg Mann


On Dec. 6, 2018, 12:24 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69473/
> ---
> 
> (Updated Dec. 6, 2018, 12:24 a.m.)
> 
> 
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
> 
> 
> Bugs: MESOS-7564
> https://issues.apache.org/jira/browse/MESOS-7564
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This implements two separate heartbeaters for Executor Events (agent
> to executor) and Executor Calls (executor to agent).  Both are set to
> non-configurable intervals of 30 minutes, which should be sufficient
> to keep the connections alive while not flooding logs with warnings
> if the executor/agent does not have this patch.
> 
> 
> Diffs
> -
> 
>   src/common/validation.cpp 5f8f2889f79f9486f8bb731b39f706908dcd84ea 
>   src/executor/executor.cpp d00ea32f02c1d92d64525b8b5ce5ca18ac840c1b 
>   src/launcher/default_executor.cpp b89b0363d64b8dd8936a62b57803f0eff50bfc71 
>   src/launcher/executor.cpp 6b1204dc5a67d4fc3e8ceae378262d53f3d28421 
>   src/slave/constants.hpp fdc21a35bcc0028dc0a2d7ebd06171fad2a20ba1 
>   src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
>   src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
> 
> 
> Diff: https://reviews.apache.org/r/69473/diff/2/
> 
> 
> Testing
> ---
> 
> make
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 69505: Added an operation status update manager to the agent.

2018-12-05 Thread Gastón Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69505/#review211067
---


Fix it, then Ship it!





src/master/master.cpp
Lines 8539-8541 (original)


Should we replace this with something like the following?

```
CHECK(!resourceProviderId.isError())
  << "Could not determine resource provider of operation " << *operation
  << ": " << resourceProviderId.error();
```



src/tests/slave_tests.cpp
Lines 6505-6507 (patched)


These checks seem redundant and unrelated to this test?


- Gastón Kleiman


On Dec. 4, 2018, 2:22 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69505/
> ---
> 
> (Updated Dec. 4, 2018, 2:22 p.m.)
> 
> 
> Review request for mesos, Benno Evers and Gastón Kleiman.
> 
> 
> Bugs: MESOS-9278
> https://issues.apache.org/jira/browse/MESOS-9278
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds an operation status update manager to the agent
> in order to handle updates for operations on agent default
> resources. A new test is also added which verifies that such
> updates are retried.
> 
> 
> Diffs
> -
> 
>   src/master/master.cpp b4b02d8b4d7d6d1aabda1f97b9bf824419f76a9e 
>   src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
>   src/tests/slave_tests.cpp 1e918712d73d0a03c4c15b41babff1f649ba9c07 
> 
> 
> Diff: https://reviews.apache.org/r/69505/diff/1/
> 
> 
> Testing
> ---
> 
> `bin/mesos-tests.sh --gtest_filter="*UpdateOperationStatusRetry*" 
> --gtest_repeat=-1 --gtest_break_on_failure`
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 69473: Added heartbeaters for agent and HTTP executors.

2018-12-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69473/
---

(Updated Dec. 5, 2018, 4:24 p.m.)


Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Changes
---

Updated heartbeater construction to match prior review changes.


Bugs: MESOS-7564
https://issues.apache.org/jira/browse/MESOS-7564


Repository: mesos


Description
---

This implements two separate heartbeaters for Executor Events (agent
to executor) and Executor Calls (executor to agent).  Both are set to
non-configurable intervals of 30 minutes, which should be sufficient
to keep the connections alive while not flooding logs with warnings
if the executor/agent does not have this patch.


Diffs (updated)
-

  src/common/validation.cpp 5f8f2889f79f9486f8bb731b39f706908dcd84ea 
  src/executor/executor.cpp d00ea32f02c1d92d64525b8b5ce5ca18ac840c1b 
  src/launcher/default_executor.cpp b89b0363d64b8dd8936a62b57803f0eff50bfc71 
  src/launcher/executor.cpp 6b1204dc5a67d4fc3e8ceae378262d53f3d28421 
  src/slave/constants.hpp fdc21a35bcc0028dc0a2d7ebd06171fad2a20ba1 
  src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
  src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
  src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 


Diff: https://reviews.apache.org/r/69473/diff/2/

Changes: https://reviews.apache.org/r/69473/diff/1-2/


Testing
---

make


Thanks,

Joseph Wu



Re: Review Request 69472: Refactored master and agent streaming connections.

2018-12-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69472/
---

(Updated Dec. 5, 2018, 4:23 p.m.)


Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Changes
---

Significant change from previous revision(s):

* The master **and** agent's `HttpConnection` structs have been pulled into a 
common header due to their similarities.  This common header is renamed and 
tweaked, so it appears much more explicitly named in the code.
* The Heartbeater helper has arguments more directly copied from the master 
now.  These arguments were previously changed to remove the templating.  Having 
a templatized class may allow us to modify the heartbeat message before sending 
it, i.e. adding a timestamp.
* The Heartbeater class now has a wrapper class to handle the actor lifecycle.


Summary (updated)
-

Refactored master and agent streaming connections.


Bugs: MESOS-7564
https://issues.apache.org/jira/browse/MESOS-7564


Repository: mesos


Description (updated)
---

This moves the very similar `HttpConnection` classes inside the
master and agent into a common header.  The refactored
`StreamingHttpConnection` is more explicitly named to avoid
potentially clashing with the libprocess HTTP helpers.

This also moves the master's heartbeater helper into a new header
and transforms it into an RAII libprocess actor wrapper.  The 
heartbeater depends on this `StreamingHttpConnection` and is currently
used by the master for heartbeating the operator event stream
and HTTP framework connection.  A later patch will use this heartbeater
for agent->executor heartbeats.


Diffs (updated)
-

  src/common/heartbeater.hpp PRE-CREATION 
  src/common/http.hpp ac9ed5e8b1c6e8e100eba401136eb95c9dd621d4 
  src/master/framework.cpp 36eda9f287bef28608024e0036db1f955a1b393c 
  src/master/http.cpp 68ee2a6dcffbc772afec6e797b1af8da48f61937 
  src/master/master.hpp c7becfa615964674dcf1ebd9424aa5818a0fdb85 
  src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
  src/slave/http.cpp ba389263ef7ab311682d542424eb56360841d24c 
  src/slave/slave.hpp edf7269d4057ec8c95bb54c855210ad00d002a50 
  src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 


Diff: https://reviews.apache.org/r/69472/diff/2/

Changes: https://reviews.apache.org/r/69472/diff/1-2/


Testing
---

make

The original plan was to use the same helper for the executor too, but of the 
Master, Framework, and Agent heartbeats, only the executor lacks a streaming 
connection.


Thanks,

Joseph Wu



Re: Review Request 69463: Added HEARTBEAT events and calls for the executor HTTP API.

2018-12-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69463/
---

(Updated Dec. 5, 2018, 4:15 p.m.)


Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Changes
---

Fix spelling!!! :D


Bugs: MESOS-7564
https://issues.apache.org/jira/browse/MESOS-7564


Repository: mesos


Description
---

These new messages are meant to be backwards compatible, in that
they won't cause crashes when new executors send heartbeats to old
agents, or new agents send heartbeats to old executors.  All recipients
of these heartbeats are currently expected to ignore them, as their
only purpose is to keep certain connections from being marked "stale"
by network intermediaries.


Diffs (updated)
-

  include/mesos/executor/executor.proto 
1b5fa5dab6944a8649fb98447eeec7105495b879 
  include/mesos/v1/executor/executor.proto 
b2ef325cf6a72137854355d541889c7c6ae7c230 


Diff: https://reviews.apache.org/r/69463/diff/2/

Changes: https://reviews.apache.org/r/69463/diff/1-2/


Testing
---

cmake --build . --target mesos-protobufs


Thanks,

Joseph Wu



Re: Review Request 69098: Added a benchmark to compare quota and nonquota allocation performance.

2018-12-05 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69098/#review211065
---


Fix it, then Ship it!




Thanks for updating the test comment and commit description! Much clearer now 
and I think I would understand the need for a separate benchmark it if I were 
coming in fresh.

Did you happen to post perf stacks anywhere?


src/tests/hierarchical_allocator_benchmarks.cpp
Lines 664-666 (patched)


No need to state the name of the variable in its comment:

// Determines the the number of agents in the cluster...

But if this all that it is, no need for a comment at all, it's clear from 
the name what it represents.

I can't quite understand the second sentence, perhaps you can clarify it? 
Intuitively it's not clear why the number of agents would influence how many 
are needed to satisfy a quota.



src/tests/hierarchical_allocator_benchmarks.cpp
Lines 667 (patched)


agentsPerRole?



src/tests/hierarchical_allocator_benchmarks.cpp
Lines 669 (patched)


frameworksPerRole?



src/tests/hierarchical_allocator_benchmarks.cpp
Lines 673-683 (patched)


Why do we need to store it as a member? Are we doing this anywhere else? I 
seem to recall we just access GetParam from the test itself?



src/tests/hierarchical_allocator_benchmarks.cpp
Lines 690 (patched)


This would be clearer?

```
// 10 roles, 10*2 = 20 agents, 10*2 = 20 frameworks.
```

etc



src/tests/hierarchical_allocator_benchmarks.cpp
Lines 690-694 (patched)


periods?



src/tests/hierarchical_allocator_benchmarks.cpp
Lines 766 (patched)


Can you pre-increment as a habit? I'm not sure if the compiler can optimize 
away the post-increment copying in the case of iterators, Metric objects, etc. 
So as a habit pre-incrementing seems like the better approach to adhere to



src/tests/hierarchical_allocator_benchmarks.cpp
Lines 779-780 (patched)


It's certainly not pristine from the perspective of the drf sorter's 
allocation count and metrics for example. So we probably don't want to say 
pristine here.

While it's a bit brittle of an assumption, it seems ok to me for now with a 
note about this potentially starting from a dirty state if we change things.

Did you consider using the param for this?



src/tests/hierarchical_allocator_benchmarks.cpp
Lines 793 (patched)


Do we need these two start outputs? It seems like they clutter the output 
to me.



src/tests/hierarchical_allocator_benchmarks.cpp
Lines 804-807 (patched)


Maybe state why we don't bother recovering the resources here?


- Benjamin Mahler


On Oct. 22, 2018, 10:39 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69098/
> ---
> 
> (Updated Oct. 22, 2018, 10:39 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Gastón Kleiman.
> 
> 
> Bugs: MESOS-6630
> https://issues.apache.org/jira/browse/MESOS-6630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This benchmark evaluates the performance difference between nonquota
> and quota settings. In both settings, the same allocations are made
> for fair comparison. In particular, since the agent will always be
> allocated as a whole in nonquota settings, we should also avoid
> agent chopping in quota setting as well. Thus in this benchmark,
> quotas are only set to be multiples of whole agent resources.
> This is also why we have this dedicated benchmark for comparison
> rather than extending the existing quota benchmarks (which involves
> agent chopping).
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_benchmarks.cpp 
> 6c6330e8cdbc705be322a7e2445b295c35ee6952 
> 
> 
> Diff: https://reviews.apache.org/r/69098/diff/2/
> 
> 
> Testing
> ---
> 
> Result from optimized build on a multicore 2.2GHz machine:
> 
> Benchmark setup: 20 agents, 10 roles, 20 frameworks
> Made 20 allocations in 3.508676ms for nonquota roles
> Made 20 allocations in 7.901269ms for quota roles
> 
> Benchmark setup: 200 agents, 100 roles, 200 frameworks
> Made 200 allocations in 63.407391ms for nonquota roles
> Made 200 allocations in 279.002319ms for quota roles
> 
> Benchmark setup: 2000 

Re: Review Request 69514: Added gauge metric for operator event stream subscribers.

2018-12-05 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69514/#review211063
---



There is an effort to migrate all pull gauges in the master and allocator (and 
more) to push gauges. Can you update this to use a push gauge?

- Benjamin Mahler


On Dec. 5, 2018, 10:03 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69514/
> ---
> 
> (Updated Dec. 5, 2018, 10:03 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Gastón Kleiman, and 
> Greg Mann.
> 
> 
> Bugs: MESOS-9258
> https://issues.apache.org/jira/browse/MESOS-9258
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This metric "master/active_operator_event_stream_subscribers" returns
> the total number of subscribers to the master's operator event stream.
> 
> 
> Diffs
> -
> 
>   docs/monitoring.md 00c6ea94bcb73746aef740236632ede123f5b534 
>   docs/operator-http-api.md 6edf361047ee5915514da892c6d885b74a510a72 
>   src/master/master.hpp c7becfa615964674dcf1ebd9424aa5818a0fdb85 
>   src/master/metrics.hpp f6bb89da9394bfe469bc690ff2de66824e994098 
>   src/master/metrics.cpp f69ed52c1c89912e7a5d26cb9409f5959b09111a 
>   src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 
>   src/tests/master_tests.cpp ef2c00101fc3d30c564a9ca34884dece2cdd2651 
> 
> 
> Diff: https://reviews.apache.org/r/69514/diff/1/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> src/mesos-tests --gtest_filter="*MaxEventStreamSubscribers*" 
> --gtest_repeat=-1 --gtest_break_on_failure
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 68984: Changed a benign warning log message in slave.cpp to info.

2018-12-05 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68984/#review211062
---


Ship it!




Ship It!

- Chun-Hung Hsiao


On Oct. 11, 2018, 6:12 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68984/
> ---
> 
> (Updated Oct. 11, 2018, 6:12 p.m.)
> 
> 
> Review request for mesos and Gastón Kleiman.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently, `UpdateFrameworkMessage` is broadcasted by the master
> to all agents regardless of whether the framework actually exists
> on the agent (see: https://bit.ly/2OiPB4F). So ignoring info
> update for framework due to missing framework on the agent is not
> unexpected. A warning message would false alarm the user. This
> patch changes the log to info to reduce noises.
> 
> 
> Diffs
> -
> 
>   src/slave/slave.cpp 9d84dcb52e5b7e507fc375f184c5f77be08d70a2 
> 
> 
> Diff: https://reviews.apache.org/r/68984/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Review Request 69514: Added gauge metric for operator event stream subscribers.

2018-12-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69514/
---

Review request for mesos, Benno Evers, Benjamin Mahler, Gastón Kleiman, and 
Greg Mann.


Bugs: MESOS-9258
https://issues.apache.org/jira/browse/MESOS-9258


Repository: mesos


Description
---

This metric "master/active_operator_event_stream_subscribers" returns
the total number of subscribers to the master's operator event stream.


Diffs
-

  docs/monitoring.md 00c6ea94bcb73746aef740236632ede123f5b534 
  docs/operator-http-api.md 6edf361047ee5915514da892c6d885b74a510a72 
  src/master/master.hpp c7becfa615964674dcf1ebd9424aa5818a0fdb85 
  src/master/metrics.hpp f6bb89da9394bfe469bc690ff2de66824e994098 
  src/master/metrics.cpp f69ed52c1c89912e7a5d26cb9409f5959b09111a 
  src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 
  src/tests/master_tests.cpp ef2c00101fc3d30c564a9ca34884dece2cdd2651 


Diff: https://reviews.apache.org/r/69514/diff/1/


Testing
---

```
make check

src/mesos-tests --gtest_filter="*MaxEventStreamSubscribers*" --gtest_repeat=-1 
--gtest_break_on_failure
```


Thanks,

Joseph Wu



Re: Review Request 69098: Added a benchmark to compare quota and nonquota allocation performance.

2018-12-05 Thread Meng Zhu


> On Nov. 2, 2018, 3:23 p.m., Benjamin Mahler wrote:
> > Hm.. why is this one not just extending the one you added in the previous 
> > review?
> 
> Meng Zhu wrote:
> This benchmark compares quota vs. non-quota. For non-quota setting, there 
> will be no chopping. So for fair comparison, quota setting should have no 
> chopping either. The previous test focuses quota only and chopping is an 
> important part of it.
> 
> Benjamin Mahler wrote:
> Ok, can you spell it out a bit more in the comments and description, it 
> just seems to say the same allocations are made, but would be good to explain 
> how that was done and why other benchmarks might fail to do it?

Done.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69098/#review210294
---


On Oct. 22, 2018, 3:39 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69098/
> ---
> 
> (Updated Oct. 22, 2018, 3:39 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Gastón Kleiman.
> 
> 
> Bugs: MESOS-6630
> https://issues.apache.org/jira/browse/MESOS-6630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This benchmark evaluates the performance difference between nonquota
> and quota settings. In both settings, the same allocations are made
> for fair comparison.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_benchmarks.cpp 
> bf9167b63747f7b8a402d950947028436307082a 
> 
> 
> Diff: https://reviews.apache.org/r/69098/diff/1/
> 
> 
> Testing
> ---
> 
> Result from optimized build on a multicore 2.2GHz machine:
> 
> Benchmark setup: 20 agents, 10 roles, 20 frameworks
> Made 20 allocations in 3.508676ms for nonquota roles
> Made 20 allocations in 7.901269ms for quota roles
> 
> Benchmark setup: 200 agents, 100 roles, 200 frameworks
> Made 200 allocations in 63.407391ms for nonquota roles
> Made 200 allocations in 279.002319ms for quota roles
> 
> Benchmark setup: 2000 agents, 1000 roles, 2000 frameworks
> Made 2000 allocations in 4.003802373secs for nonquota roles
> Made 2000 allocations in 20.503188535secs for quota roles
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 69307: Changed master to hold subscribers in a circular buffer.

2018-12-05 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69307/
---

(Updated Dec. 5, 2018, 2:01 p.m.)


Review request for mesos, Benno Evers, Benjamin Mahler, Gastón Kleiman, and 
Greg Mann.


Changes
---

Replaced one log message with a comment instead.
Upgraded one log message from VLOG(1) to LOG(INFO).
Modified test to remove the UPID hack.


Bugs: MESOS-9258
https://issues.apache.org/jira/browse/MESOS-9258


Repository: mesos


Description
---

This adds a flag (--max_operator_event_stream_subscribers) to the
master which controls how many active subscribers on the Master's
event stream will be allowed at any time.

The default is 1000 subscribers, which is purposefully higher
than we expect is needed.  Operators aware that their  network
has clients/proxies whom do not close connections have the
option of lowering this flag.


Diffs (updated)
-

  docs/configuration/master.md 575476728079d884fe86b1a3a56614647d1b572e 
  src/master/constants.hpp 76ad0c3b1494cb97888af5545acc998b8fb15bf6 
  src/master/flags.hpp ed2d76af2831d1b9718724936fd3e4850e99b332 
  src/master/flags.cpp 2677738d19caa7d600e591aeb4fb37f0c2318d45 
  src/master/master.hpp c7becfa615964674dcf1ebd9424aa5818a0fdb85 
  src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
  src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 


Diff: https://reviews.apache.org/r/69307/diff/3/

Changes: https://reviews.apache.org/r/69307/diff/2-3/


Testing
---

```
make check

src/mesos-tests --gtest_filter="*MaxEventStreamSubscribers*" --gtest_repeat=-1 
--gtest_break_on_failure
```


Thanks,

Joseph Wu



Re: Review Request 69094: Renamed one allocator benchmark to be more descriptive.

2018-12-05 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69094/
---

(Updated Dec. 5, 2018, 1:55 p.m.)


Review request for mesos, Benjamin Mahler and Gastón Kleiman.


Changes
---

Rebased.


Repository: mesos


Description
---

Renamed `Allocations` to `MultiFrameworkAllocations`.
Also removed `_TestBase` from the fixture name.


Diffs (updated)
-

  src/tests/hierarchical_allocator_benchmarks.cpp 
6c6330e8cdbc705be322a7e2445b295c35ee6952 


Diff: https://reviews.apache.org/r/69094/diff/2/

Changes: https://reviews.apache.org/r/69094/diff/1-2/


Testing
---

make check


Thanks,

Meng Zhu



Re: Review Request 69163: Set agent and/or resource provider ID in operation status updates.

2018-12-05 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69163/#review211060
---




src/resource_provider/storage/provider.cpp
Lines 3093 (patched)


s/info/operation/


- Chun-Hung Hsiao


On Dec. 3, 2018, 1:20 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69163/
> ---
> 
> (Updated Dec. 3, 2018, 1:20 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch sets agent and/or resource provider ID operation status
> update messages. This is not always possible, e.g., some operations
> might fail validation so that no corresponding IDs can be extracted.
> 
> Since operations failing validation are currently directly rejected by
> the master without going through a status update manager, they are not
> retried either. If a master status update manager for operations is
> introduced at a later point it should be possible to forward
> acknowledgements for updates to the master's update manager.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 1662125ed3e47b179ee32d08c1d3af75553066ba 
>   src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
>   src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
>   src/resource_provider/manager.cpp 6c81c430e9e1205d71982a7fa2bcd9aa15fc01b2 
>   src/resource_provider/storage/provider.cpp 
> a22c82c442304979fbdec0fcb74543077751a135 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
>   src/tests/master_tests.cpp ef2c00101fc3d30c564a9ca34884dece2cdd2651 
>   src/tests/mesos.hpp c08e7e6c1dbc3dd9eb980868e43368c0a423c3ee 
> 
> 
> Diff: https://reviews.apache.org/r/69163/diff/6/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69513: Manually copy test reports to host fs.

2018-12-05 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69513/#review211059
---



PASS: Mesos patch 69513 was successfully built and tested.

Reviews applied: `['69513']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2671/mesos-review-69513

- Mesos Reviewbot Windows


On Dec. 5, 2018, 7:52 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69513/
> ---
> 
> (Updated Dec. 5, 2018, 7:52 p.m.)
> 
> 
> Review request for mesos and James Peach.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Manually copy test reports to host fs.
> 
> 
> Diffs
> -
> 
>   support/mesos-build/entrypoint.sh ec98cc8b1fdcd0a32ed32cfeb69dfb976f82b81d 
> 
> 
> Diff: https://reviews.apache.org/r/69513/diff/1/
> 
> 
> Testing
> ---
> 
> Have been testing this over week in a custom Jenkins job.
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 69307: Changed master to hold subscribers in a circular buffer.

2018-12-05 Thread Joseph Wu


> On Dec. 4, 2018, 3:52 p.m., Greg Mann wrote:
> > src/master/master.cpp
> > Lines 12109 (patched)
> > 
> >
> > This seems worthy of LOG(INFO) to me, WDYT?

Sure, I can promote it.  I don't have much of a preference, except that this 
log line should _not_ be a warning, since it is fairly easy to hit with a 
smaller limit set.


> On Dec. 4, 2018, 3:52 p.m., Greg Mann wrote:
> > src/master/master.cpp
> > Lines 12112 (patched)
> > 
> >
> > Is this ID printed anywhere else? I'm wondering if it's at all useful.

Besides the line written upon subscribing,
```
  LOG(INFO) << "Added subscriber " << http.streamId
<< " to the list of active subscribers";
```
and exiting, 
```
  LOG(INFO) << "Removed subscriber " << id
<< " from the list of active subscribers";
```
the ID is not printed anywhere else.  These log lines were how we spotted the 
subscriber leak in the first place.


- Joseph


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69307/#review211032
---


On Nov. 30, 2018, 6:49 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69307/
> ---
> 
> (Updated Nov. 30, 2018, 6:49 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Gastón Kleiman, and 
> Greg Mann.
> 
> 
> Bugs: MESOS-9258
> https://issues.apache.org/jira/browse/MESOS-9258
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This adds a flag (--max_operator_event_stream_subscribers) to the
> master which controls how many active subscribers on the Master's
> event stream will be allowed at any time.
> 
> The default is 1000 subscribers, which is purposefully higher
> than we expect is needed.  Operators aware that their  network
> has clients/proxies whom do not close connections have the
> option of lowering this flag.
> 
> 
> Diffs
> -
> 
>   docs/configuration/master.md 575476728079d884fe86b1a3a56614647d1b572e 
>   src/master/constants.hpp 76ad0c3b1494cb97888af5545acc998b8fb15bf6 
>   src/master/flags.hpp ed2d76af2831d1b9718724936fd3e4850e99b332 
>   src/master/flags.cpp 2677738d19caa7d600e591aeb4fb37f0c2318d45 
>   src/master/master.hpp c7becfa615964674dcf1ebd9424aa5818a0fdb85 
>   src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
>   src/tests/api_tests.cpp edde48a4b4b46c9f47bc44de389c97b77322b8e8 
> 
> 
> Diff: https://reviews.apache.org/r/69307/diff/2/
> 
> 
> Testing
> ---
> 
> ```
> make check
> 
> src/mesos-tests --gtest_filter="*MaxEventStreamSubscribers*" 
> --gtest_repeat=-1 --gtest_break_on_failure
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 69513: Manually copy test reports to host fs.

2018-12-05 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69513/#review211054
---


Ship it!




Ship It!

- James Peach


On Dec. 5, 2018, 7:52 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69513/
> ---
> 
> (Updated Dec. 5, 2018, 7:52 p.m.)
> 
> 
> Review request for mesos and James Peach.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Manually copy test reports to host fs.
> 
> 
> Diffs
> -
> 
>   support/mesos-build/entrypoint.sh ec98cc8b1fdcd0a32ed32cfeb69dfb976f82b81d 
> 
> 
> Diff: https://reviews.apache.org/r/69513/diff/1/
> 
> 
> Testing
> ---
> 
> Have been testing this over week in a custom Jenkins job.
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Review Request 69513: Manually copy test reports to host fs.

2018-12-05 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69513/
---

Review request for mesos and James Peach.


Repository: mesos


Description
---

Manually copy test reports to host fs.


Diffs
-

  support/mesos-build/entrypoint.sh ec98cc8b1fdcd0a32ed32cfeb69dfb976f82b81d 


Diff: https://reviews.apache.org/r/69513/diff/1/


Testing
---

Have been testing this over week in a custom Jenkins job.


Thanks,

Vinod Kone



Re: Review Request 69162: Added agent and resource provider IDs to operation status messages.

2018-12-05 Thread James DeFelice


> On Dec. 4, 2018, 11:20 p.m., James DeFelice wrote:
> > include/mesos/v1/mesos.proto
> > Lines 2434 (patched)
> > 
> >
> > I'd like this part to be a bit more fleshed out:
> > 
> > When `uuid` is set then it MUST be possible to acknowledge the status 
> > update by using the specified `agent_id` and `resource_provider_id` (for 
> > local providers); and `resource_provider_id` (for external providers).
> 
> Benjamin Bannier wrote:
> I'd argue that this is already implied by the documentation of `uuid`,
> 
> // Statuses that are delivered reliably to the scheduler will
> // include a `uuid`. The status is considered delivered once
> // it is acknowledged by the scheduler.
> optional UUID uuid = 5;
> 
> 
> Dropping for now.
> 
> James DeFelice wrote:
> I see what you mean by it being implied. However, I'd really like for it 
> to be more explicitly documented so that it's not accidentally overlooked 
> later.
> 
> Greg Mann wrote:
> I'm not sure that I understand the suggestion. Does the statement "When 
> uuid is set then it MUST be possible to acknowledge the status update by 
> using the specified agent_id and resource_provider_id" imply that when `uuid` 
> is set, `agent_id` and `resource_provider_id` must best set (or just RP ID in 
> the case of ext. providers)? That is not the case, as we will support both 
> operations on agent default resources (no RP ID) and operations by operators 
> (no agent ID) in the future.
> 
> James, is the following an accurate representation of your intended 
> meaning? -
> 
> "When `uuid` is set then it MUST be possible to acknowledge the status 
> update by using the specified `agent_id` and/or `resource_provider_id`, when 
> present."
> 
> 
> 
> I'm also not sure why this comment would mention acknowledgement - isn't 
> it true that the desired constraint here is that it should be possible for 
> any update which contains a `uuid` to be _reconciled_ by the framework using 
> the supplied agent/RP ID, when present?
> 
> James DeFelice wrote:
> The goal is to document that Mesos will never provide a framework a UUID 
> that cannot be acknowledged due to informtion that's missing in the status 
> update proto.
> 
> Greg Mann wrote:
> I'm not sure that a comment here is necessary to accomplish that. I might 
> update the comments in the `AcknowledgeOperationStatus` message in the 
> scheduler API to say:
> 
> ```
>   message AcknowledgeOperationStatus {
> // If either `agent_id` or `resource_provider_id` are set in the
> // operation status received by the scheduler, they should be set
> // here as well.
> optional AgentID agent_id = 1;
> optional ResourceProviderID resource_provider_id = 2;
> 
> required bytes uuid = 3;
> required OperationID operation_id = 4;
>   }
> ```
> 
> This would provide explicit instructions to framework devs to include 
> agent and/or RP IDs in the acknowledgement any time they're set in the status.
> 
> WDYT?

I like adding a comment there too. I guess I was looking to explicitly document 
(via API) a guarantee that Mesos will always provide the ID fields necessary in 
order for a framework to properly ACK when uuid is set. As per a PM w/ 
Benjamin, unit tests could help codify guarantee as well (and prevent 
accidental regression). He's filed a follow-up JIRA for such tests.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69162/#review211030
---


On Dec. 5, 2018, noon, Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69162/
> ---
> 
> (Updated Dec. 5, 2018, noon)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds agent and resource provider IDs to
> `UpdateOperationStatus` and `UpdateOperationStatusMessage`. With that
> frameworks are able to reconcile enough information after failover to
> construct operation acknowledgements.
> 
> We will add code to populate these fields in a follow-up patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto c822cc747cf153435b6c3ae1004168d5a289c97b 
>   include/mesos/v1/mesos.proto 51c1bfdbb360b3554eba2229ed386d6271e0315b 
>   src/common/type_utils.cpp ef13eae47b88efc15f1b2d00852b6387c2fffcbc 
>   src/internal/devolve.cpp 491ed2aa131a92e958bfa71cccfc5f257cd4b3f9 
>   

Re: Review Request 69162: Added agent and resource provider IDs to operation status messages.

2018-12-05 Thread Greg Mann


> On Dec. 4, 2018, 11:20 p.m., James DeFelice wrote:
> > include/mesos/v1/mesos.proto
> > Lines 2434 (patched)
> > 
> >
> > I'd like this part to be a bit more fleshed out:
> > 
> > When `uuid` is set then it MUST be possible to acknowledge the status 
> > update by using the specified `agent_id` and `resource_provider_id` (for 
> > local providers); and `resource_provider_id` (for external providers).
> 
> Benjamin Bannier wrote:
> I'd argue that this is already implied by the documentation of `uuid`,
> 
> // Statuses that are delivered reliably to the scheduler will
> // include a `uuid`. The status is considered delivered once
> // it is acknowledged by the scheduler.
> optional UUID uuid = 5;
> 
> 
> Dropping for now.
> 
> James DeFelice wrote:
> I see what you mean by it being implied. However, I'd really like for it 
> to be more explicitly documented so that it's not accidentally overlooked 
> later.
> 
> Greg Mann wrote:
> I'm not sure that I understand the suggestion. Does the statement "When 
> uuid is set then it MUST be possible to acknowledge the status update by 
> using the specified agent_id and resource_provider_id" imply that when `uuid` 
> is set, `agent_id` and `resource_provider_id` must best set (or just RP ID in 
> the case of ext. providers)? That is not the case, as we will support both 
> operations on agent default resources (no RP ID) and operations by operators 
> (no agent ID) in the future.
> 
> James, is the following an accurate representation of your intended 
> meaning? -
> 
> "When `uuid` is set then it MUST be possible to acknowledge the status 
> update by using the specified `agent_id` and/or `resource_provider_id`, when 
> present."
> 
> 
> 
> I'm also not sure why this comment would mention acknowledgement - isn't 
> it true that the desired constraint here is that it should be possible for 
> any update which contains a `uuid` to be _reconciled_ by the framework using 
> the supplied agent/RP ID, when present?
> 
> James DeFelice wrote:
> The goal is to document that Mesos will never provide a framework a UUID 
> that cannot be acknowledged due to informtion that's missing in the status 
> update proto.

I'm not sure that a comment here is necessary to accomplish that. I might 
update the comments in the `AcknowledgeOperationStatus` message in the 
scheduler API to say:

```
  message AcknowledgeOperationStatus {
// If either `agent_id` or `resource_provider_id` are set in the
// operation status received by the scheduler, they should be set
// here as well.
optional AgentID agent_id = 1;
optional ResourceProviderID resource_provider_id = 2;

required bytes uuid = 3;
required OperationID operation_id = 4;
  }
```

This would provide explicit instructions to framework devs to include agent 
and/or RP IDs in the acknowledgement any time they're set in the status.

WDYT?


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69162/#review211030
---


On Dec. 5, 2018, noon, Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69162/
> ---
> 
> (Updated Dec. 5, 2018, noon)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds agent and resource provider IDs to
> `UpdateOperationStatus` and `UpdateOperationStatusMessage`. With that
> frameworks are able to reconcile enough information after failover to
> construct operation acknowledgements.
> 
> We will add code to populate these fields in a follow-up patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto c822cc747cf153435b6c3ae1004168d5a289c97b 
>   include/mesos/v1/mesos.proto 51c1bfdbb360b3554eba2229ed386d6271e0315b 
>   src/common/type_utils.cpp ef13eae47b88efc15f1b2d00852b6387c2fffcbc 
>   src/internal/devolve.cpp 491ed2aa131a92e958bfa71cccfc5f257cd4b3f9 
>   src/internal/evolve.cpp aa60efed5a28ed4d847a4a27aa8e994cda82175d 
> 
> 
> Diff: https://reviews.apache.org/r/69162/diff/8/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69344: Added volume gid manager to Mesos agent.

2018-12-05 Thread Andrei Budnik

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69344/#review211051
---




src/slave/volume_gid_manager/volume_gid_manager.cpp
Lines 58 (patched)


`return ErrnoError("Failed to open '" + path + "'");`



src/slave/volume_gid_manager/volume_gid_manager.cpp
Lines 62 (patched)


`fts_read` is not thread safe (`MT-Unsafe`). However, we have couple of 
places in `stout` where we call `fts_read` as well.



src/slave/volume_gid_manager/volume_gid_manager.cpp
Lines 77 (patched)


It might be unsafe to access `node`'s entries, since most likely memory for 
`node` was freed by `fts_close(tree)`.



src/slave/volume_gid_manager/volume_gid_manager.cpp
Lines 122 (patched)


`return ErrnoError("Failed to stop traversing file system");`



src/slave/volume_gid_manager/volume_gid_manager.cpp
Lines 154 (patched)


Is it possible to add range values to the error message?



src/slave/volume_gid_manager/volume_gid_manager.cpp
Lines 275 (patched)


Should we add `parse.error()` to the error message?


- Andrei Budnik


On Dec. 4, 2018, 2:42 a.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69344/
> ---
> 
> (Updated Dec. 4, 2018, 2:42 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik, Gilbert Song, Greg Mann, Ilya 
> Pronin, and Jie Yu.
> 
> 
> Bugs: MESOS-8810
> https://issues.apache.org/jira/browse/MESOS-8810
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added volume gid manager to Mesos agent.
> 
> 
> Diffs
> -
> 
>   src/CMakeLists.txt bde070445b644e15d46c390d1c983caabaa1fec8 
>   src/Makefile.am 8da1a05b618f17542fec9b5057484a9bae57658a 
>   src/slave/volume_gid_manager/volume_gid_manager.hpp PRE-CREATION 
>   src/slave/volume_gid_manager/volume_gid_manager.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69344/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Re: Review Request 69098: Added a benchmark to compare quota and nonquota allocation performance.

2018-12-05 Thread Benjamin Mahler


> On Nov. 2, 2018, 10:23 p.m., Benjamin Mahler wrote:
> > Hm.. why is this one not just extending the one you added in the previous 
> > review?
> 
> Meng Zhu wrote:
> This benchmark compares quota vs. non-quota. For non-quota setting, there 
> will be no chopping. So for fair comparison, quota setting should have no 
> chopping either. The previous test focuses quota only and chopping is an 
> important part of it.

Ok, can you spell it out a bit more in the comments and description, it just 
seems to say the same allocations are made, but would be good to explain how 
that was done and why other benchmarks might fail to do it?


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69098/#review210294
---


On Oct. 22, 2018, 10:39 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69098/
> ---
> 
> (Updated Oct. 22, 2018, 10:39 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Gastón Kleiman.
> 
> 
> Bugs: MESOS-6630
> https://issues.apache.org/jira/browse/MESOS-6630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This benchmark evaluates the performance difference between nonquota
> and quota settings. In both settings, the same allocations are made
> for fair comparison.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_benchmarks.cpp 
> bf9167b63747f7b8a402d950947028436307082a 
> 
> 
> Diff: https://reviews.apache.org/r/69098/diff/1/
> 
> 
> Testing
> ---
> 
> Result from optimized build on a multicore 2.2GHz machine:
> 
> Benchmark setup: 20 agents, 10 roles, 20 frameworks
> Made 20 allocations in 3.508676ms for nonquota roles
> Made 20 allocations in 7.901269ms for quota roles
> 
> Benchmark setup: 200 agents, 100 roles, 200 frameworks
> Made 200 allocations in 63.407391ms for nonquota roles
> Made 200 allocations in 279.002319ms for quota roles
> 
> Benchmark setup: 2000 agents, 1000 roles, 2000 frameworks
> Made 2000 allocations in 4.003802373secs for nonquota roles
> Made 2000 allocations in 20.503188535secs for quota roles
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 69493: Documented the `linux/seccomp` isolator.

2018-12-05 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69493/#review211048
---



Bad patch!

Reviews applied: [69493, 69420, 69409, 68022, 68021, 68020, 68019, 68018, 
68017, 68016, 67844]

Failed command: /usr/bin/python3 support/apply-reviews.py -n -r 67844

Error:
2018-12-05 18:22:11 URL:https://reviews.apache.org/r/67844/diff/raw/ [260/260] 
-> "67844.patch" [1]
error: missing binary patch data for '3rdparty/libseccomp-2.3.3.tar.gz'
error: binary patch does not apply to '3rdparty/libseccomp-2.3.3.tar.gz'
error: 3rdparty/libseccomp-2.3.3.tar.gz: patch does not apply

Full log: https://builds.apache.org/job/Mesos-Reviewbot/23592/console

- Mesos Reviewbot


On Nov. 30, 2018, 4:33 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69493/
> ---
> 
> (Updated Nov. 30, 2018, 4:33 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9036
> https://issues.apache.org/jira/browse/MESOS-9036
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   docs/isolators/linux-seccomp.md PRE-CREATION 
>   docs/mesos-containerizer.md d15e82583fa207ba78e9fc1e83da0cf1f469ec4e 
> 
> 
> Diff: https://reviews.apache.org/r/69493/diff/2/
> 
> 
> Testing
> ---
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 69098: Added a benchmark to compare quota and nonquota allocation performance.

2018-12-05 Thread Meng Zhu


> On Nov. 2, 2018, 3:23 p.m., Benjamin Mahler wrote:
> > Hm.. why is this one not just extending the one you added in the previous 
> > review?

This benchmark compares quota vs. non-quota. For non-quota setting, there will 
be no chopping. So for fair comparison, quota setting should have no chopping 
either. The previous test focuses quota only and chopping is an important part 
of it.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69098/#review210294
---


On Oct. 22, 2018, 3:39 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69098/
> ---
> 
> (Updated Oct. 22, 2018, 3:39 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Gastón Kleiman.
> 
> 
> Bugs: MESOS-6630
> https://issues.apache.org/jira/browse/MESOS-6630
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This benchmark evaluates the performance difference between nonquota
> and quota settings. In both settings, the same allocations are made
> for fair comparison.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_benchmarks.cpp 
> bf9167b63747f7b8a402d950947028436307082a 
> 
> 
> Diff: https://reviews.apache.org/r/69098/diff/1/
> 
> 
> Testing
> ---
> 
> Result from optimized build on a multicore 2.2GHz machine:
> 
> Benchmark setup: 20 agents, 10 roles, 20 frameworks
> Made 20 allocations in 3.508676ms for nonquota roles
> Made 20 allocations in 7.901269ms for quota roles
> 
> Benchmark setup: 200 agents, 100 roles, 200 frameworks
> Made 200 allocations in 63.407391ms for nonquota roles
> Made 200 allocations in 279.002319ms for quota roles
> 
> Benchmark setup: 2000 agents, 1000 roles, 2000 frameworks
> Made 2000 allocations in 4.003802373secs for nonquota roles
> Made 2000 allocations in 20.503188535secs for quota roles
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 69096: Moved a few allocator test helpers to `tests/allocator.hpp`.

2018-12-05 Thread Meng Zhu


> On Oct. 24, 2018, 5:22 p.m., Benjamin Mahler wrote:
> > src/tests/allocator.hpp
> > Lines 40-45 (patched)
> > 
> >
> > I think we some other create helpers lying around, e.g. createTask. Is 
> > this where these belong?
> > 
> > Do you think we need them?
> 
> Meng Zhu wrote:
> I do not have use cases of other helpers in mind atm. Thus I prefer to 
> pull in only these two.
> 
> Benjamin Mahler wrote:
> Do we even need these helpers? They seem pretty trivial, and I wonder how 
> much code it's actually simplifying. I'm fine with the patch, but it feels 
> heavy to have a new header and compliation unit just for these, especially 
> since they're only used in two files and are trivial.

Yeah, we may do these two inline or just copy them around. But I expect that 
the allocator tests and benchmarks would share more utilities in the future 
(after all, they are all allocator tests). So might as well set up the room for 
it. If it makes the code structure cleaner, I think it is worth it.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69096/#review210010
---


On Oct. 19, 2018, 6:28 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69096/
> ---
> 
> (Updated Oct. 19, 2018, 6:28 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Gastón Kleiman.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This helps to share the helpers between
> `hierarchical_allocator_tests.cpp` and
> `hierarchical_allocator_benchmarks.cpp`.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 5795c7097a9ed1f659e169ad81a9f2c09481aa81 
>   src/tests/CMakeLists.txt 00dbdee1c06e7571fa799850cdaf7f2ccadc8bea 
>   src/tests/allocator.hpp 15a5396986ea6939f25b8d23b87a91c27338fc7b 
>   src/tests/allocator.cpp PRE-CREATION 
>   src/tests/hierarchical_allocator_tests.cpp 
> 27fbd9cf0c4442e7675362a806d35bad141ffb6d 
> 
> 
> Diff: https://reviews.apache.org/r/69096/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 69162: Added agent and resource provider IDs to operation status messages.

2018-12-05 Thread James DeFelice


> On Dec. 4, 2018, 11:20 p.m., James DeFelice wrote:
> > include/mesos/v1/mesos.proto
> > Lines 2434 (patched)
> > 
> >
> > I'd like this part to be a bit more fleshed out:
> > 
> > When `uuid` is set then it MUST be possible to acknowledge the status 
> > update by using the specified `agent_id` and `resource_provider_id` (for 
> > local providers); and `resource_provider_id` (for external providers).
> 
> Benjamin Bannier wrote:
> I'd argue that this is already implied by the documentation of `uuid`,
> 
> // Statuses that are delivered reliably to the scheduler will
> // include a `uuid`. The status is considered delivered once
> // it is acknowledged by the scheduler.
> optional UUID uuid = 5;
> 
> 
> Dropping for now.
> 
> James DeFelice wrote:
> I see what you mean by it being implied. However, I'd really like for it 
> to be more explicitly documented so that it's not accidentally overlooked 
> later.
> 
> Greg Mann wrote:
> I'm not sure that I understand the suggestion. Does the statement "When 
> uuid is set then it MUST be possible to acknowledge the status update by 
> using the specified agent_id and resource_provider_id" imply that when `uuid` 
> is set, `agent_id` and `resource_provider_id` must best set (or just RP ID in 
> the case of ext. providers)? That is not the case, as we will support both 
> operations on agent default resources (no RP ID) and operations by operators 
> (no agent ID) in the future.
> 
> James, is the following an accurate representation of your intended 
> meaning? -
> 
> "When `uuid` is set then it MUST be possible to acknowledge the status 
> update by using the specified `agent_id` and/or `resource_provider_id`, when 
> present."
> 
> 
> 
> I'm also not sure why this comment would mention acknowledgement - isn't 
> it true that the desired constraint here is that it should be possible for 
> any update which contains a `uuid` to be _reconciled_ by the framework using 
> the supplied agent/RP ID, when present?

The goal is to document that Mesos will never provide a framework a UUID that 
cannot be acknowledged due to informtion that's missing in the status update 
proto.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69162/#review211030
---


On Dec. 5, 2018, noon, Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69162/
> ---
> 
> (Updated Dec. 5, 2018, noon)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds agent and resource provider IDs to
> `UpdateOperationStatus` and `UpdateOperationStatusMessage`. With that
> frameworks are able to reconcile enough information after failover to
> construct operation acknowledgements.
> 
> We will add code to populate these fields in a follow-up patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto c822cc747cf153435b6c3ae1004168d5a289c97b 
>   include/mesos/v1/mesos.proto 51c1bfdbb360b3554eba2229ed386d6271e0315b 
>   src/common/type_utils.cpp ef13eae47b88efc15f1b2d00852b6387c2fffcbc 
>   src/internal/devolve.cpp 491ed2aa131a92e958bfa71cccfc5f257cd4b3f9 
>   src/internal/evolve.cpp aa60efed5a28ed4d847a4a27aa8e994cda82175d 
> 
> 
> Diff: https://reviews.apache.org/r/69162/diff/8/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 68138: Added tests to ensure correct quota accounting.

2018-12-05 Thread Meng Zhu


> On Dec. 4, 2018, 12:51 p.m., Benjamin Mahler wrote:
> > src/tests/hierarchical_allocator_tests.cpp
> > Lines 1702 (patched)
> > 
> >
> > Maybe: QuotaAccountingReserveAllocatedResources?

Sounds good.


> On Dec. 4, 2018, 12:51 p.m., Benjamin Mahler wrote:
> > src/tests/hierarchical_allocator_tests.cpp
> > Lines 1751-1761 (patched)
> > 
> >
> > The way this test is written only catches the case where making the 
> > reservation could double-charge the quota, but it doesn't catch a bug where 
> > the reservation would lead to no-charge of the quota. One way to resolve 
> > this easily is to make the second agent larger, in order to make sure that 
> > the allocation is getting chopped to match the unsatisfied quota exactly.

We already have tests to ensure reservations are charged towards the quota. But 
yeah, does not hurt to double verify it here.


> On Dec. 4, 2018, 12:51 p.m., Benjamin Mahler wrote:
> > src/tests/hierarchical_allocator_tests.cpp
> > Lines 1767 (patched)
> > 
> >
> > Maybe: QuotaAccountingUnreserveAllocatedResources?

Sounds good.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68138/#review211024
---


On Dec. 5, 2018, 9:03 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68138/
> ---
> 
> (Updated Dec. 5, 2018, 9:03 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Gastón Kleiman.
> 
> 
> Bugs: MESOS-9099
> https://issues.apache.org/jira/browse/MESOS-9099
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added two allocator tests to ensure reserving and
> unreserving allocated resources do not affect
> quota accounting.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_tests.cpp 
> 3034d460dd48b5134cfd4a24c54775222a348e32 
> 
> 
> Diff: https://reviews.apache.org/r/68138/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 68138: Added tests to ensure correct quota accounting.

2018-12-05 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68138/
---

(Updated Dec. 5, 2018, 9:03 a.m.)


Review request for mesos, Benjamin Mahler and Gastón Kleiman.


Changes
---

Addressed Ben's comment.


Bugs: MESOS-9099
https://issues.apache.org/jira/browse/MESOS-9099


Repository: mesos


Description
---

Added two allocator tests to ensure reserving and
unreserving allocated resources do not affect
quota accounting.


Diffs (updated)
-

  src/tests/hierarchical_allocator_tests.cpp 
3034d460dd48b5134cfd4a24c54775222a348e32 


Diff: https://reviews.apache.org/r/68138/diff/3/

Changes: https://reviews.apache.org/r/68138/diff/2-3/


Testing
---

make check


Thanks,

Meng Zhu



Re: Review Request 69162: Added agent and resource provider IDs to operation status messages.

2018-12-05 Thread Greg Mann


> On Dec. 4, 2018, 11:20 p.m., James DeFelice wrote:
> > include/mesos/v1/mesos.proto
> > Lines 2434 (patched)
> > 
> >
> > I'd like this part to be a bit more fleshed out:
> > 
> > When `uuid` is set then it MUST be possible to acknowledge the status 
> > update by using the specified `agent_id` and `resource_provider_id` (for 
> > local providers); and `resource_provider_id` (for external providers).
> 
> Benjamin Bannier wrote:
> I'd argue that this is already implied by the documentation of `uuid`,
> 
> // Statuses that are delivered reliably to the scheduler will
> // include a `uuid`. The status is considered delivered once
> // it is acknowledged by the scheduler.
> optional UUID uuid = 5;
> 
> 
> Dropping for now.
> 
> James DeFelice wrote:
> I see what you mean by it being implied. However, I'd really like for it 
> to be more explicitly documented so that it's not accidentally overlooked 
> later.

I'm not sure that I understand the suggestion. Does the statement "When uuid is 
set then it MUST be possible to acknowledge the status update by using the 
specified agent_id and resource_provider_id" imply that when `uuid` is set, 
`agent_id` and `resource_provider_id` must best set (or just RP ID in the case 
of ext. providers)? That is not the case, as we will support both operations on 
agent default resources (no RP ID) and operations by operators (no agent ID) in 
the future.

James, is the following an accurate representation of your intended meaning? -

"When `uuid` is set then it MUST be possible to acknowledge the status update 
by using the specified `agent_id` and/or `resource_provider_id`, when present."


I'm also not sure why this comment would mention acknowledgement - isn't it 
true that the desired constraint here is that it should be possible for any 
update which contains a `uuid` to be _reconciled_ by the framework using the 
supplied agent/RP ID, when present?


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69162/#review211030
---


On Dec. 5, 2018, noon, Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69162/
> ---
> 
> (Updated Dec. 5, 2018, noon)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds agent and resource provider IDs to
> `UpdateOperationStatus` and `UpdateOperationStatusMessage`. With that
> frameworks are able to reconcile enough information after failover to
> construct operation acknowledgements.
> 
> We will add code to populate these fields in a follow-up patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto c822cc747cf153435b6c3ae1004168d5a289c97b 
>   include/mesos/v1/mesos.proto 51c1bfdbb360b3554eba2229ed386d6271e0315b 
>   src/common/type_utils.cpp ef13eae47b88efc15f1b2d00852b6387c2fffcbc 
>   src/internal/devolve.cpp 491ed2aa131a92e958bfa71cccfc5f257cd4b3f9 
>   src/internal/evolve.cpp aa60efed5a28ed4d847a4a27aa8e994cda82175d 
> 
> 
> Diff: https://reviews.apache.org/r/69162/diff/8/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69162: Added agent and resource provider IDs to operation status messages.

2018-12-05 Thread James DeFelice


> On Dec. 4, 2018, 11:20 p.m., James DeFelice wrote:
> > include/mesos/v1/mesos.proto
> > Lines 2434 (patched)
> > 
> >
> > I'd like this part to be a bit more fleshed out:
> > 
> > When `uuid` is set then it MUST be possible to acknowledge the status 
> > update by using the specified `agent_id` and `resource_provider_id` (for 
> > local providers); and `resource_provider_id` (for external providers).
> 
> Benjamin Bannier wrote:
> I'd argue that this is already implied by the documentation of `uuid`,
> 
> // Statuses that are delivered reliably to the scheduler will
> // include a `uuid`. The status is considered delivered once
> // it is acknowledged by the scheduler.
> optional UUID uuid = 5;
> 
> 
> Dropping for now.

I see what you mean by it being implied. However, I'd really like for it to be 
more explicitly documented so that it's not accidentally overlooked later.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69162/#review211030
---


On Dec. 5, 2018, noon, Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69162/
> ---
> 
> (Updated Dec. 5, 2018, noon)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds agent and resource provider IDs to
> `UpdateOperationStatus` and `UpdateOperationStatusMessage`. With that
> frameworks are able to reconcile enough information after failover to
> construct operation acknowledgements.
> 
> We will add code to populate these fields in a follow-up patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto c822cc747cf153435b6c3ae1004168d5a289c97b 
>   include/mesos/v1/mesos.proto 51c1bfdbb360b3554eba2229ed386d6271e0315b 
>   src/common/type_utils.cpp ef13eae47b88efc15f1b2d00852b6387c2fffcbc 
>   src/internal/devolve.cpp 491ed2aa131a92e958bfa71cccfc5f257cd4b3f9 
>   src/internal/evolve.cpp aa60efed5a28ed4d847a4a27aa8e994cda82175d 
> 
> 
> Diff: https://reviews.apache.org/r/69162/diff/8/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69493: Documented the `linux/seccomp` isolator.

2018-12-05 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69493/#review211040
---



FAIL: Failed to apply the dependent review: 67844.

Failed command: `python.exe .\support\apply-reviews.py -n -r 67844`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2670/mesos-review-69493

Relevant logs:

- 
[apply-review-67844.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2670/mesos-review-69493/logs/apply-review-67844.log):

```
error: missing binary patch data for '3rdparty/libseccomp-2.3.3.tar.gz'
error: binary patch does not apply to '3rdparty/libseccomp-2.3.3.tar.gz'
error: 3rdparty/libseccomp-2.3.3.tar.gz: patch does not apply
```

- Mesos Reviewbot Windows


On Nov. 30, 2018, 4:33 p.m., Andrei Budnik wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69493/
> ---
> 
> (Updated Nov. 30, 2018, 4:33 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, James Peach, and Qian Zhang.
> 
> 
> Bugs: MESOS-9036
> https://issues.apache.org/jira/browse/MESOS-9036
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   docs/isolators/linux-seccomp.md PRE-CREATION 
>   docs/mesos-containerizer.md d15e82583fa207ba78e9fc1e83da0cf1f469ec4e 
> 
> 
> Diff: https://reviews.apache.org/r/69493/diff/2/
> 
> 
> Testing
> ---
> 
> None: not a functional change.
> 
> 
> Thanks,
> 
> Andrei Budnik
> 
>



Re: Review Request 69162: Added agent and resource provider IDs to operation status messages.

2018-12-05 Thread Benjamin Bannier


> On Dec. 5, 2018, 12:20 a.m., James DeFelice wrote:
> > include/mesos/v1/mesos.proto
> > Lines 2434 (patched)
> > 
> >
> > I'd like this part to be a bit more fleshed out:
> > 
> > When `uuid` is set then it MUST be possible to acknowledge the status 
> > update by using the specified `agent_id` and `resource_provider_id` (for 
> > local providers); and `resource_provider_id` (for external providers).

I'd argue that this is already implied by the documentation of `uuid`,

// Statuses that are delivered reliably to the scheduler will
// include a `uuid`. The status is considered delivered once
// it is acknowledged by the scheduler.
optional UUID uuid = 5;


Dropping for now.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69162/#review211030
---


On Dec. 5, 2018, 1 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69162/
> ---
> 
> (Updated Dec. 5, 2018, 1 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds agent and resource provider IDs to
> `UpdateOperationStatus` and `UpdateOperationStatusMessage`. With that
> frameworks are able to reconcile enough information after failover to
> construct operation acknowledgements.
> 
> We will add code to populate these fields in a follow-up patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto c822cc747cf153435b6c3ae1004168d5a289c97b 
>   include/mesos/v1/mesos.proto 51c1bfdbb360b3554eba2229ed386d6271e0315b 
>   src/common/type_utils.cpp ef13eae47b88efc15f1b2d00852b6387c2fffcbc 
>   src/internal/devolve.cpp 491ed2aa131a92e958bfa71cccfc5f257cd4b3f9 
>   src/internal/evolve.cpp aa60efed5a28ed4d847a4a27aa8e994cda82175d 
> 
> 
> Diff: https://reviews.apache.org/r/69162/diff/8/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69162: Added agent and resource provider IDs to operation status messages.

2018-12-05 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69162/
---

(Updated Dec. 5, 2018, 1 p.m.)


Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.


Bugs: MESOS-9293
https://issues.apache.org/jira/browse/MESOS-9293


Repository: mesos


Description
---

This patch adds agent and resource provider IDs to
`UpdateOperationStatus` and `UpdateOperationStatusMessage`. With that
frameworks are able to reconcile enough information after failover to
construct operation acknowledgements.

We will add code to populate these fields in a follow-up patch.


Diffs (updated)
-

  include/mesos/mesos.proto c822cc747cf153435b6c3ae1004168d5a289c97b 
  include/mesos/v1/mesos.proto 51c1bfdbb360b3554eba2229ed386d6271e0315b 
  src/common/type_utils.cpp ef13eae47b88efc15f1b2d00852b6387c2fffcbc 
  src/internal/devolve.cpp 491ed2aa131a92e958bfa71cccfc5f257cd4b3f9 
  src/internal/evolve.cpp aa60efed5a28ed4d847a4a27aa8e994cda82175d 


Diff: https://reviews.apache.org/r/69162/diff/8/

Changes: https://reviews.apache.org/r/69162/diff/7-8/


Testing
---

`make check`


Thanks,

Benjamin Bannier