Re: Review Request 70498: Simplified Sorter::add(client) implementations.

2019-04-22 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70498/#review214802
---


Fix it, then Ship it!




Much cleaner, thanks!


src/master/allocator/sorter/drf/sorter.cpp
Line 74 (original), 74 (patched)


Not sure if this is a behavior change. The current interface does not say 
anything regarding adding existing clients.

We should either allow this (maybe a no-op) or make it clear in the 
interface that clients should not be added more than once.

Also, since we have the exact same check below (last modified line), this 
one seems unnecessary.

ditto below in the random sorter



src/master/allocator/sorter/drf/sorter.cpp
Lines 79-82 (patched)


I was a little puzzled here at first (e.g. e/f/g should be phase 1(b) + phase 2, 
while e/f is phase 1(b) only). 

Maybe it would be better to just list the examples along with the cases, as below:

  // Phase 1: Walk down the tree until:
  //   (a) we run out of tokens -> add "." node
  //       e.g. add a
  //   (b) or, we reach a leaf -> transform the leaf into internal + "."
  //       e.g. add e/f, e/f/g, e/f/g/
  //   (c) or, we're at an internal node but can't find the next child
  //       e.g. add w/x, w/x/y, w/x/y/

ditto below in the random sorter
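
To make the three phase 1 cases concrete, here is a minimal, self-contained sketch of a two-phase add. It is not the code under review: `Node`, `tokenize`, `addChild` and `add` are hypothetical names, and returning `false` for a client that already exists is just one way to resolve the question above about adding existing clients.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified node type for illustration only; this is not
// the Node type used by the Mesos sorters.
struct Node
{
  enum class Kind { INTERNAL, LEAF };

  Kind kind = Kind::INTERNAL;
  std::map<std::string, std::unique_ptr<Node>> children;
};

// Splits "e/f/g/" into {"e", "f", "g"}, skipping empty tokens.
std::vector<std::string> tokenize(const std::string& path)
{
  std::vector<std::string> tokens;
  std::stringstream in(path);
  std::string token;
  while (std::getline(in, token, '/')) {
    if (!token.empty()) {
      tokens.push_back(token);
    }
  }
  return tokens;
}

Node* addChild(Node* parent, const std::string& name, Node::Kind kind)
{
  assert(parent != nullptr);
  std::unique_ptr<Node> child(new Node());
  child->kind = kind;
  Node* raw = child.get();
  parent->children[name] = std::move(child);
  return raw;
}

// Returns false if the path is empty or the client already exists
// (an assumption made for this sketch).
bool add(Node* root, const std::string& clientPath)
{
  const std::vector<std::string> tokens = tokenize(clientPath);
  if (tokens.empty()) {
    return false;
  }

  Node* current = root;
  std::size_t depth = 0;

  // Phase 1: walk down the tree.
  for (; depth < tokens.size(); ++depth) {
    auto it = current->children.find(tokens[depth]);
    if (it == current->children.end()) {
      break; // (c) internal node without the next child.
    }

    Node* child = it->second.get();
    if (child->kind == Node::Kind::LEAF) {
      if (depth + 1 == tokens.size()) {
        return false; // The client is already present as a leaf.
      }
      // (b) transform the leaf into internal + ".".
      child->kind = Node::Kind::INTERNAL;
      addChild(child, ".", Node::Kind::LEAF);
    }
    current = child;
  }

  // (a) we ran out of tokens at an internal node -> add "." node.
  if (depth == tokens.size()) {
    if (current->children.count(".") > 0) {
      return false; // The client is already present as a "." leaf.
    }
    addChild(current, ".", Node::Kind::LEAF);
    return true;
  }

  // Phase 2: for the remaining tokens (if any), create the missing nodes;
  // only the last one is a leaf.
  for (; depth < tokens.size(); ++depth) {
    const bool last = (depth + 1 == tokens.size());
    current = addChild(
        current, tokens[depth], last ? Node::Kind::LEAF : Node::Kind::INTERNAL);
  }
  return true;
}
```

In this sketch, adding `a/b` and then `a` exercises case (a), while adding `a/b` and then `a/b/c` exercises case (b) followed by phase 2.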



src/master/allocator/sorter/drf/sorter.cpp
Lines 89 (patched)


For the remaining tokens (if any)



src/master/allocator/sorter/drf/sorter.cpp
Lines 94 (patched)


`tokenIndex` or `depth`?

ditto below in the random sorter


- Meng Zhu


On April 19, 2019, 12:05 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70498/
> ---
> 
> (Updated April 19, 2019, 12:05 p.m.)
> 
> 
> Review request for mesos and Meng Zhu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The existing logic is rather hard to understand and there is no
> guiding explanation of the algorithm.
> 
> This re-writes the logic into an easier to understand approach,
> along with a comment at the top that explains the two phase
> algorithm.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/sorter/drf/sorter.cpp 
> 554ac84ee585d1d07048a58cf7d7d1e6586252ee 
>   src/master/allocator/sorter/random/sorter.cpp 
> bbe130dbf3b158ea14f9572bc5d14200fcd85127 
> 
> 
> Diff: https://reviews.apache.org/r/70498/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 70519: Transitioned tasks when an unreachable agent is marked as gone.

2019-04-22 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70519/#review214805
---



Patch looks great!

Reviews applied: [70518, 70519]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On April 22, 2019, 11:57 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70519/
> ---
> 
> (Updated April 22, 2019, 11:57 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Gastón Kleiman, Joseph Wu, and Vinod 
> Kone.
> 
> 
> Bugs: MESOS-9545
> https://issues.apache.org/jira/browse/MESOS-9545
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch updates the master code responsible for marking
> agents as gone to properly transition tasks on agents which
> were previously marked as unreachable.
> 
> 
> Diffs
> -
> 
>   src/master/framework.cpp 05f5514c589b2dba08afe77281e5fbc4e29f232b 
>   src/master/http.cpp e7a92d0f554ba4cafaee5a75f09b46eb1bf4a310 
>   src/master/master.hpp 94891af9deeaddbfc9d6eabb243aed97f7b7 
>   src/master/master.cpp ad54ae217863a08f4e6d743b39c176b171353084 
>   src/tests/api_tests.cpp e76417a9098281265b3411c18767bfcc2f624b6f 
> 
> 
> Diff: https://reviews.apache.org/r/70519/diff/1/
> 
> 
> Testing
> ---
> 
> `make check`
> `bin/mesos-tests.sh --gtest_filter="*UnreachableAgentMarkedGone*" 
> --gtest_repeat=-1 --gtest_break_on_failure`
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Review Request 70521: Renamed variables in `Master::_accept` to improve readability.

2019-04-22 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70521/
---

Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Meng Zhu.


Repository: mesos


Description
---

Renamed variables in `Master::_accept` to improve readability.


Diffs
-

  src/master/master.cpp ad54ae217863a08f4e6d743b39c176b171353084 


Diff: https://reviews.apache.org/r/70521/diff/1/


Testing
---

make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 70132: Do not implicitly decline speculatively converted resources.

2019-04-22 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70132/
---

(Updated April 23, 2019, 1:15 a.m.)


Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Meng Zhu.


Changes
---

Updated variable names and comments to make it easier to understand.


Summary (updated)
-

Do not implicitly decline speculatively converted resources.


Bugs: MESOS-9616
https://issues.apache.org/jira/browse/MESOS-9616


Repository: mesos


Description (updated)
---

Currently if a framework accepts an offer to perform pipelined
operations, e.g., reserving resources, without a final consumer, the
converted resources will be implicitly declined. This is an undesired
behavior as the framework might want to reserve one resource first but
launch a task later in the next allocation cycle. This patch fixes this
behavior.

However, if the framework accepts an offer with multiple operations that
cancel out each other, the resources consumed by these operations are
still considered unused and will be declined.


Diffs (updated)
-

  docs/scheduler-http-api.md a5327c229142267836f327f9c382ef50b7e334db 
  src/master/master.cpp ad54ae217863a08f4e6d743b39c176b171353084 
  src/tests/slave_tests.cpp b1c3a01031b917fb9773c8c890a8f88838870559 


Diff: https://reviews.apache.org/r/70132/diff/5/

Changes: https://reviews.apache.org/r/70132/diff/4-5/


Testing
---

make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 70509: Added tests for overlapping ranges and sets in task/executor resources.

2019-04-22 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70509/#review214803
---



FAIL: Failed to apply the dependent review: 70507.

Failed command: `python.exe .\support\apply-reviews.py -n -r 70507`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/3234/mesos-review-70509

Relevant logs:

- 
[apply-review-70507.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/3234/mesos-review-70509/logs/apply-review-70507.log):

```
error: patch failed: src/common/resource_quantities.hpp:79
error: src/common/resource_quantities.hpp: patch does not apply
```

- Mesos Reviewbot Windows


On April 22, 2019, 6:14 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70509/
> ---
> 
> (Updated April 22, 2019, 6:14 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, and Gastón Kleiman.
> 
> 
> Bugs: MESOS-9619
> https://issues.apache.org/jira/browse/MESOS-9619
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added tests for overlapping ranges and sets in task/executor resources.
> 
> 
> Diffs
> -
> 
>   src/tests/master_tests.cpp 964d935771a99efaee63187affe46b551146f310 
> 
> 
> Diff: https://reviews.apache.org/r/70509/diff/3/
> 
> 
> Testing
> ---
> 
> `make check`
> `bin/mesos-tests.sh --gtest_filter="*OverlappingSetsAndRanges*" 
> --gtest_repeat=-1 --gtest_break_on_failure`
> `bin/mesos-tests.sh --gtest_filter="*LaunchOverlappingSetAndRangeResources*" 
> --gtest_repeat=-1 --gtest_break_on_failure`
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 70519: Transitioned tasks when an unreachable agent is marked as gone.

2019-04-22 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70519/
---

(Updated April 22, 2019, 11:57 p.m.)


Review request for mesos, Benno Evers, Gastón Kleiman, Joseph Wu, and Vinod 
Kone.


Summary (updated)
-

Transitioned tasks when an unreachable agent is marked as gone.


Bugs: MESOS-9545
https://issues.apache.org/jira/browse/MESOS-9545


Repository: mesos


Description
---

This patch updates the master code responsible for marking
agents as gone to properly transition tasks on agents which
were previously marked as unreachable.


Diffs
-

  src/master/framework.cpp 05f5514c589b2dba08afe77281e5fbc4e29f232b 
  src/master/http.cpp e7a92d0f554ba4cafaee5a75f09b46eb1bf4a310 
  src/master/master.hpp 94891af9deeaddbfc9d6eabb243aed97f7b7 
  src/master/master.cpp ad54ae217863a08f4e6d743b39c176b171353084 
  src/tests/api_tests.cpp e76417a9098281265b3411c18767bfcc2f624b6f 


Diff: https://reviews.apache.org/r/70519/diff/1/


Testing
---

`make check`
`bin/mesos-tests.sh --gtest_filter="*UnreachableAgentMarkedGone*" 
--gtest_repeat=-1 --gtest_break_on_failure`


Thanks,

Greg Mann



Review Request 70519: Sent task status updates when unreachable agent is marked as gone.

2019-04-22 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70519/
---

Review request for mesos, Benno Evers, Gastón Kleiman, Joseph Wu, and Vinod 
Kone.


Bugs: MESOS-9545
https://issues.apache.org/jira/browse/MESOS-9545


Repository: mesos


Description
---

This patch updates the master code responsible for marking
agents as gone to properly transition tasks on agents which
were previously marked as unreachable.


Diffs
-

  src/master/framework.cpp 05f5514c589b2dba08afe77281e5fbc4e29f232b 
  src/master/http.cpp e7a92d0f554ba4cafaee5a75f09b46eb1bf4a310 
  src/master/master.hpp 94891af9deeaddbfc9d6eabb243aed97f7b7 
  src/master/master.cpp ad54ae217863a08f4e6d743b39c176b171353084 
  src/tests/api_tests.cpp e76417a9098281265b3411c18767bfcc2f624b6f 


Diff: https://reviews.apache.org/r/70519/diff/1/


Testing
---

`make check`
`bin/mesos-tests.sh --gtest_filter="*UnreachableAgentMarkedGone*" 
--gtest_repeat=-1 --gtest_break_on_failure`


Thanks,

Greg Mann



Review Request 70518: Fixed a memory leak in the master's 'removeTask()' helper.

2019-04-22 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70518/
---

Review request for mesos, Benno Evers, Gastón Kleiman, Joseph Wu, and Vinod 
Kone.


Bugs: MESOS-9545
https://issues.apache.org/jira/browse/MESOS-9545


Repository: mesos


Description
---

Previously, all removed tasks were added to the
`slaves.unreachableTasks` map. This patch adds a conditional
so that removed tasks are only added to that structure when
they are being marked unreachable.
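
As a rough illustration of the conditional described above (not the actual patch), the sketch below uses hypothetical stand-in types; the real `slaves.unreachableTasks` structure and the `removeTask()` signature in the master may differ.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the master's bookkeeping; the shape of the
// map is an assumption made for this sketch.
struct Slaves
{
  // Framework ID -> IDs of tasks recorded as unreachable.
  std::map<std::string, std::vector<std::string>> unreachableTasks;
};

// Hypothetical helper: `unreachable` says whether the task is being
// removed because its agent is being marked unreachable.
void removeTask(
    Slaves& slaves,
    const std::string& frameworkId,
    const std::string& taskId,
    bool unreachable)
{
  assert(!taskId.empty());

  // Only record the task when it is being marked unreachable. Recording
  // every removed task would let the map grow without bound.
  if (unreachable) {
    slaves.unreachableTasks[frameworkId].push_back(taskId);
  }

  // Removal of the task from other structures is elided here.
}
```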


Diffs
-

  src/master/master.cpp ad54ae217863a08f4e6d743b39c176b171353084 


Diff: https://reviews.apache.org/r/70518/diff/1/


Testing
---

`make check`


Thanks,

Greg Mann



Re: Review Request 70508: Fixed the flaky ExamplesTest.DynamicReservationFramework.

2019-04-22 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70508/#review214800
---


Ship it!




Ship It!

- Meng Zhu


On April 22, 2019, 9:47 a.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70508/
> ---
> 
> (Updated April 22, 2019, 9:47 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Meng Zhu.
> 
> 
> Bugs: MESOS-5804
> https://issues.apache.org/jira/browse/MESOS-5804
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test failed in MESOS-5804 due to the following race:
> 
>   1. Framework launches task T, moves from RESERVED to
>      TASK_RUNNING state.
>   2. Allocation cycle triggers and will send the unreserved
>      resources to the framework.
>   3. Before the offer gets to the framework, task T finishes and
>      framework moves from TASK_RUNNING to RESERVED.
>   4. In the RESERVED state, the framework expects the reservation
>      in the offer. But, it's coming in a later offer, and the one
>      that arrives is for the unreserved resources since it was
>      generated while the task was still running.
> 
> The fix applied here for this specific race is to use a 2 week
> filter rather than a 0 second filter. That would ensure that the
> unreserved resources do not get re-offered to the framework on
> their own. However, this fix does not work until MESOS-9616 is
> resolved.
> 
> 
> Diffs
> -
> 
>   src/examples/dynamic_reservation_framework.cpp 
> f9c7dfe46a1e8dd1bc8eae45ed1b65b7a6d60dfc 
> 
> 
> Diff: https://reviews.apache.org/r/70508/diff/1/
> 
> 
> Testing
> ---
> 
> Test passes with https://reviews.apache.org/r/70132/ applied.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 70517: Added unit test for a master validation helper function.

2019-04-22 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70517/#review214801
---



Landing this patch and preceding ones in the chain; will merge the subsequent 
integration test patch soon when we have a test scheduler utility to make it 
more concise.

- Greg Mann


On April 22, 2019, 6:13 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70517/
> ---
> 
> (Updated April 22, 2019, 6:13 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, and Gastón Kleiman.
> 
> 
> Bugs: MESOS-9619
> https://issues.apache.org/jira/browse/MESOS-9619
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added unit test for a master validation helper function.
> 
> 
> Diffs
> -
> 
>   src/tests/master_validation_tests.cpp 
> 400ad686291e08f578f27cfb9341263972e36684 
> 
> 
> Diff: https://reviews.apache.org/r/70517/diff/1/
> 
> 
> Testing
> ---
> 
> Testing details at the end of this chain.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 70508: Fixed the flaky ExamplesTest.DynamicReservationFramework.

2019-04-22 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70508/#review214799
---



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['70132', '70508']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/3232/mesos-review-70508

Relevant logs:

- 
[mesos-tests.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/3232/mesos-review-70508/logs/mesos-tests.log):

```
I0422 21:11:47.285095 70432 master.cpp:] Disconnecting agent 
c544fe49-2ca1-4396-8fa1-a71bb44930b2-S0 at slave(501)@192.10.1.4:57491 
(windows-01.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0422 21:11:47.285095 70432 master.cpp:3352] Deactivating agent 
c544fe49-2ca1-4396-8fa1-a71bb44930b2-S0 at slave(501)@192.10.1.4:57491 
(windows-01.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0422 21:11:47.285095 67920 hierarchical.cpp:392] Removed framework 
c544fe49-2ca1-4396-8fa1-a71bb44930b2-
I0422 21:11:47.286084 67920 hierarchical.cpp:829] Agent 
c544fe49-2ca1-4396-8fa1-a71bb44930b2-S0 deactivated
I0422 21:11:47.287097 72232 containerizer.cpp:2576] Destroying container 
f9ff457a-e2a7-454a-a4a0-80a08c5edb61 in RUNNING state
I0422 21:11:47.287097 72232 containerizer.cpp:3278] Transitioning the state of 
container f9ff457a-e2a7-454a-a4a0-80a08c5edb61 from RUNNING to DESTROYING
[   OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (688 ms)
[--] 1 test from IsolationFlag/MemoryIsolatorTest (705 ms total)

[--] Global test environment tear-down
[==] 1161 tests from 109 test cases ran. (581230 ms total)
[  PASSED  ] 1158 tests.
[  FAILED  ] 3 tests, listed below:
[  FAILED  ] DockerFetcherPluginTest.INTERNET_CURL_FetchManifest
[  FAILED  ] DockerFetcherPluginTest.INTERNET_CURL_FetchImage
[  FAILED  ] DockerFetcherPluginTest.INTERNET_CURL_InvokeFetchByName

 3 FAILED TESTS
  YOU HAVE 233 DISABLED TESTS

I0422 21:11:47.287097 72232 launcher.cpp:161] Asked to destroy container 
f9ff457a-e2a7-454a-a4a0-80a08c5edb61
W0422 21:11:47.289079 71860 process.cpp:1423] Failed to recv on socket 
WindowsFD::Type::SOCKET=2492 to peer '192.10.1.4:59882': IO failed with error 
code: The specified network name is no longer available.

W0422 21:11:47.289079 71860 process.cpp:838] Failed to recv on socket 
WindowsFD::Type::SOCKET=2208 to peer '192.10.1.4:59883': IO failed with error 
code: The specified network name is no longer available.

I0422 21:11:47.302096 70252 containerizer.cpp:3117] Container 
f9ff457a-e2a7-454a-a4a0-80a08c5edb61 has exited
I0422 21:11:47.332104 68540 master.cpp:1135] Master terminating
I0422 21:11:47.333106 67732 hierarchical.cpp:680] Removed agent 
c544fe49-2ca1-4396-8fa1-a71bb44930b2-S0
I0422 21:11:48.433110 71860 process.cpp:927] Stopped the socket accept loop
```

- Mesos Reviewbot Windows


On April 22, 2019, 9:47 a.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70508/
> ---
> 
> (Updated April 22, 2019, 9:47 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Meng Zhu.
> 
> 
> Bugs: MESOS-5804
> https://issues.apache.org/jira/browse/MESOS-5804
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test failed in MESOS-5804 due to the following race:
> 
>   1. Framework launches task T, moves from RESERVED to
>      TASK_RUNNING state.
>   2. Allocation cycle triggers and will send the unreserved
>      resources to the framework.
>   3. Before the offer gets to the framework, task T finishes and
>      framework moves from TASK_RUNNING to RESERVED.
>   4. In the RESERVED state, the framework expects the reservation
>      in the offer. But, it's coming in a later offer, and the one
>      that arrives is for the unreserved resources since it was
>      generated while the task was still running.
> 
> The fix applied here for this specific race is to use a 2 week
> filter rather than a 0 second filter. That would ensure that the
> unreserved resources do not get re-offered to the framework on
> their own. However, this fix does not work until MESOS-9616 is
> resolved.
> 
> 
> Diffs
> -
> 
>   src/examples/dynamic_reservation_framework.cpp 
> f9c7dfe46a1e8dd1bc8eae45ed1b65b7a6d60dfc 
> 
> 
> Diff: https://reviews.apache.org/r/70508/diff/1/
> 
> 
> Testing
> ---
> 
> Test passes with https://reviews.apache.org/r/70132/ applied.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 70509: Added tests for overlapping ranges and sets in task/executor resources.

2019-04-22 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70509/#review214798
---



Patch looks great!

Reviews applied: [70507, 70472, 70517, 70509]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On April 22, 2019, 6:14 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70509/
> ---
> 
> (Updated April 22, 2019, 6:14 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, and Gastón Kleiman.
> 
> 
> Bugs: MESOS-9619
> https://issues.apache.org/jira/browse/MESOS-9619
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added tests for overlapping ranges and sets in task/executor resources.
> 
> 
> Diffs
> -
> 
>   src/tests/master_tests.cpp 964d935771a99efaee63187affe46b551146f310 
> 
> 
> Diff: https://reviews.apache.org/r/70509/diff/3/
> 
> 
> Testing
> ---
> 
> `make check`
> `bin/mesos-tests.sh --gtest_filter="*OverlappingSetsAndRanges*" 
> --gtest_repeat=-1 --gtest_break_on_failure`
> `bin/mesos-tests.sh --gtest_filter="*LaunchOverlappingSetAndRangeResources*" 
> --gtest_repeat=-1 --gtest_break_on_failure`
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 70517: Added unit test for a master validation helper function.

2019-04-22 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70517/#review214797
---


Ship it!




Ship It!

- Benjamin Mahler


On April 22, 2019, 6:13 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70517/
> ---
> 
> (Updated April 22, 2019, 6:13 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, and Gastón Kleiman.
> 
> 
> Bugs: MESOS-9619
> https://issues.apache.org/jira/browse/MESOS-9619
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added unit test for a master validation helper function.
> 
> 
> Diffs
> -
> 
>   src/tests/master_validation_tests.cpp 
> 400ad686291e08f578f27cfb9341263972e36684 
> 
> 
> Diff: https://reviews.apache.org/r/70517/diff/1/
> 
> 
> Testing
> ---
> 
> Testing details at the end of this chain.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 70508: Fixed the flaky ExamplesTest.DynamicReservationFramework.

2019-04-22 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70508/#review214795
---



Patch looks great!

Reviews applied: [70132, 70508]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On April 22, 2019, 4:47 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70508/
> ---
> 
> (Updated April 22, 2019, 4:47 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Meng Zhu.
> 
> 
> Bugs: MESOS-5804
> https://issues.apache.org/jira/browse/MESOS-5804
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test failed in MESOS-5804 due to the following race:
> 
>   1. Framework launches task T, moves from RESERVED to
>      TASK_RUNNING state.
>   2. Allocation cycle triggers and will send the unreserved
>      resources to the framework.
>   3. Before the offer gets to the framework, task T finishes and
>      framework moves from TASK_RUNNING to RESERVED.
>   4. In the RESERVED state, the framework expects the reservation
>      in the offer. But, it's coming in a later offer, and the one
>      that arrives is for the unreserved resources since it was
>      generated while the task was still running.
> 
> The fix applied here for this specific race is to use a 2 week
> filter rather than a 0 second filter. That would ensure that the
> unreserved resources do not get re-offered to the framework on
> their own. However, this fix does not work until MESOS-9616 is
> resolved.
> 
> 
> Diffs
> -
> 
>   src/examples/dynamic_reservation_framework.cpp 
> f9c7dfe46a1e8dd1bc8eae45ed1b65b7a6d60dfc 
> 
> 
> Diff: https://reviews.apache.org/r/70508/diff/1/
> 
> 
> Testing
> ---
> 
> Test passes with https://reviews.apache.org/r/70132/ applied.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 70509: Added tests for overlapping ranges and sets in task/executor resources.

2019-04-22 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70509/
---

(Updated April 22, 2019, 6:14 p.m.)


Review request for mesos, Benno Evers, Benjamin Mahler, and Gastón Kleiman.


Bugs: MESOS-9619
https://issues.apache.org/jira/browse/MESOS-9619


Repository: mesos


Description
---

Added tests for overlapping ranges and sets in task/executor resources.


Diffs (updated)
-

  src/tests/master_tests.cpp 964d935771a99efaee63187affe46b551146f310 


Diff: https://reviews.apache.org/r/70509/diff/3/

Changes: https://reviews.apache.org/r/70509/diff/2-3/


Testing
---

`make check`
`bin/mesos-tests.sh --gtest_filter="*OverlappingSetsAndRanges*" 
--gtest_repeat=-1 --gtest_break_on_failure`
`bin/mesos-tests.sh --gtest_filter="*LaunchOverlappingSetAndRangeResources*" 
--gtest_repeat=-1 --gtest_break_on_failure`


Thanks,

Greg Mann



Review Request 70517: Added unit test for a master validation helper function.

2019-04-22 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70517/
---

Review request for mesos, Benno Evers, Benjamin Mahler, and Gastón Kleiman.


Bugs: MESOS-9619
https://issues.apache.org/jira/browse/MESOS-9619


Repository: mesos


Description
---

Added unit test for a master validation helper function.


Diffs
-

  src/tests/master_validation_tests.cpp 
400ad686291e08f578f27cfb9341263972e36684 


Diff: https://reviews.apache.org/r/70517/diff/1/


Testing
---

Testing details at the end of this chain.


Thanks,

Greg Mann



Re: Review Request 70508: Fixed the flaky ExamplesTest.DynamicReservationFramework.

2019-04-22 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70508/
---

(Updated April 22, 2019, 4:47 p.m.)


Review request for mesos, Chun-Hung Hsiao and Meng Zhu.


Changes
---

Added dependency on r/70132.


Bugs: MESOS-5804
https://issues.apache.org/jira/browse/MESOS-5804


Repository: mesos


Description
---

The test failed in MESOS-5804 due to the following race:

  1. Framework launches task T, moves from RESERVED to
     TASK_RUNNING state.
  2. Allocation cycle triggers and will send the unreserved
     resources to the framework.
  3. Before the offer gets to the framework, task T finishes and
     framework moves from TASK_RUNNING to RESERVED.
  4. In the RESERVED state, the framework expects the reservation
     in the offer. But, it's coming in a later offer, and the one
     that arrives is for the unreserved resources since it was
     generated while the task was still running.

The fix applied here for this specific race is to use a 2 week
filter rather than a 0 second filter. That would ensure that the
unreserved resources do not get re-offered to the framework on
their own. However, this fix does not work until MESOS-9616 is
resolved.
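
The effect of the filter duration can be sketched as below. This is only a toy model under stated assumptions: `isFiltered` and `TWO_WEEKS` are hypothetical names, and the real allocator's filter handling is more involved than a single comparison.

```cpp
#include <cassert>
#include <chrono>

// Two weeks expressed in seconds.
const std::chrono::seconds TWO_WEEKS = std::chrono::hours(24 * 14);

// Hypothetical helper: returns true while a filter with refusal duration
// `refuse` is still in effect `elapsed` seconds after it was installed.
bool isFiltered(std::chrono::seconds refuse, std::chrono::seconds elapsed)
{
  assert(refuse.count() >= 0);
  return elapsed < refuse;
}
```

With a 0 second filter, `isFiltered` is false at every point in time, so in this model the unreserved resources can be re-offered on their own right away; with `TWO_WEEKS` they stay filtered for far longer than the test runs.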


Diffs
-

  src/examples/dynamic_reservation_framework.cpp 
f9c7dfe46a1e8dd1bc8eae45ed1b65b7a6d60dfc 


Diff: https://reviews.apache.org/r/70508/diff/1/


Testing (updated)
---

Test passes with https://reviews.apache.org/r/70132/ applied.


Thanks,

Benjamin Mahler



Re: Review Request 70472: Ensured that task groups do not specify overlapping ranges or sets.

2019-04-22 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70472/#review214793
---


Ship it!




Ship It!

- Benjamin Mahler


On April 22, 2019, 3:32 a.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70472/
> ---
> 
> (Updated April 22, 2019, 3:32 a.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Gastón Kleiman, and 
> Meng Zhu.
> 
> 
> Bugs: MESOS-9619
> https://issues.apache.org/jira/browse/MESOS-9619
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds validation to the master to ensure that task
> groups do not include resources with overlapping set- or
> range-valued resources, as this can crash the allocator.
> 
> 
> Diffs
> -
> 
>   src/master/validation.hpp 71748c121aa3518d68811ea1e60707d195b58657 
>   src/master/validation.cpp f032a781608857d0c9cfa220dd8d70f74d60f1ec 
> 
> 
> Diff: https://reviews.apache.org/r/70472/diff/5/
> 
> 
> Testing
> ---
> 
> Testing details at the end of this chain.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 70509: Added tests for overlapping ranges and sets in task/executor resources.

2019-04-22 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70509/#review214790
---



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['70507', '70472', '70509']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/3230/mesos-review-70509

Relevant logs:

- 
[mesos-tests.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/3230/mesos-review-70509/logs/mesos-tests.log):

```
I0422 16:09:20.563588 69556 master.cpp:] Disconnecting agent 
8941e201-5ddb-4380-9fec-91080ed96810-S0 at slave(501)@192.10.1.4:54650 
(windows-01.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0422 16:09:20.563588 69556 master.cpp:3352] Deactivating agent 
8941e201-5ddb-4380-9fec-91080ed96810-S0 at slave(501)@192.10.1.4:54650 
(windows-01.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0422 16:09:20.564586 71208 hierarchical.cpp:392] Removed framework 
8941e201-5ddb-4380-9fec-91080ed96810-
I0422 16:09:20.564586 71208 hierarchical.cpp:829] Agent 
8941e201-5ddb-4380-9fec-91080ed96810-S0 deactivated
I0422 16:09:20.564586 73620 containerizer.cpp:2576] Destroying container 
21833486-b5bd-444f-8ac0-2befe93c0726 in RUNNING state
I0422 16:09:20.564586 73620 containerizer.cpp:3278] Transitioning the state of 
container 21833486-b5bd-444f-8ac0-2befe93c0726 from RUNNING to DESTROYING
I0422 16:09:20.565583 73620 launcher.cpp:161] Asked to destroy container 
21833486-b5bd-444f-8ac0-2befe93c0726
W0422 16:09:20.566566 73216 process.cpp:838] Failed to recv on socket 
WindowsFD::Type::SOCKET=12072 to peer '192.10.1.4:57026': IO failed with error 
code: The specified network name is no longer available.
[   OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (786 ms)
[--] 1 test from IsolationFlag/MemoryIsolatorTest (803 ms total)

[--] Global test environment tear-down
[==] 1162 tests from 109 test cases ran. (583041 ms total)
[  PASSED  ] 1159 tests.
[  FAILED  ] 3 tests, listed below:
[  FAILED  ] DockerFetcherPluginTest.INTERNET_CURL_FetchManifest
[  FAILED  ] DockerFetcherPluginTest.INTERNET_CURL_FetchImage
[  FAILED  ] DockerFetcherPluginTest.INTERNET_CURL_InvokeFetchByName

 3 FAILED TESTS
  YOU HAVE 233 DISABLED TESTS


W0422 16:09:20.567572 73216 process.cpp:1423] Failed to recv on socket 
WindowsFD::Type::SOCKET=12120 to peer '192.10.1.4:57025': IO failed with error 
code: The specified network name is no longer available.

I0422 16:09:20.637586 70944 containerizer.cpp:3117] Container 
21833486-b5bd-444f-8ac0-2befe93c0726 has exited
I0422 16:09:20.667577 71688 master.cpp:1135] Master terminating
I0422 16:09:20.669574 71208 hierarchical.cpp:680] Removed agent 
8941e201-5ddb-4380-9fec-91080ed96810-S0
I0422 16:09:21.650332 73216 process.cpp:927] Stopped the socket accept loop
```

- Mesos Reviewbot Windows


On April 22, 2019, 3:34 a.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70509/
> ---
> 
> (Updated April 22, 2019, 3:34 a.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, and Gastón Kleiman.
> 
> 
> Bugs: MESOS-9619
> https://issues.apache.org/jira/browse/MESOS-9619
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added tests for overlapping ranges and sets in task/executor resources.
> 
> 
> Diffs
> -
> 
>   src/tests/master_tests.cpp 964d935771a99efaee63187affe46b551146f310 
>   src/tests/master_validation_tests.cpp 
> 400ad686291e08f578f27cfb9341263972e36684 
> 
> 
> Diff: https://reviews.apache.org/r/70509/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> `bin/mesos-tests.sh --gtest_filter="*OverlappingSetsAndRanges*" 
> --gtest_repeat=-1 --gtest_break_on_failure`
> `bin/mesos-tests.sh --gtest_filter="*LaunchOverlappingSetAndRangeResources*" 
> --gtest_repeat=-1 --gtest_break_on_failure`
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 70515: Added a test to verify non-root nested container can access its sandbox.

2019-04-22 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70515/#review214788
---



Patch looks great!

Reviews applied: [70514, 70515]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On April 22, 2019, 1:27 p.m., Qian Zhang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70515/
> ---
> 
> (Updated April 22, 2019, 1:27 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik, Gilbert Song, and James Peach.
> 
> 
> Bugs: MESOS-9536
> https://issues.apache.org/jira/browse/MESOS-9536
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a test to verify non-root nested container can access its sandbox.
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/nested_mesos_containerizer_tests.cpp 
> bbf83fa24966a7c9f585b9912fa77bf3460db26f 
> 
> 
> Diff: https://reviews.apache.org/r/70515/diff/1/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> This test will fail without the previous patch 
> (https://reviews.apache.org/r/70514/ ).
> 
> 
> Thanks,
> 
> Qian Zhang
> 
>



Review Request 70515: Added a test to verify non-root nested container can access its sandbox.

2019-04-22 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70515/
---

Review request for mesos, Andrei Budnik, Gilbert Song, and James Peach.


Bugs: MESOS-9536
https://issues.apache.org/jira/browse/MESOS-9536


Repository: mesos


Description
---

Added a test to verify non-root nested container can access its sandbox.


Diffs
-

  src/tests/containerizer/nested_mesos_containerizer_tests.cpp 
bbf83fa24966a7c9f585b9912fa77bf3460db26f 


Diff: https://reviews.apache.org/r/70515/diff/1/


Testing
---

sudo make check

This test will fail without the previous patch 
(https://reviews.apache.org/r/70514/ ).


Thanks,

Qian Zhang



Review Request 70514: Made nested container able to access its sandbox via `MESOS_SANDBOX`.

2019-04-22 Thread Qian Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70514/
---

Review request for mesos, Andrei Budnik, Gilbert Song, and James Peach.


Bugs: MESOS-9536
https://issues.apache.org/jira/browse/MESOS-9536


Repository: mesos


Description
---

Previously, in MESOS-8332, we narrowed task sandbox permissions from 0755
to 0750, which can leave a nested container without permission to
access its sandbox via the environment variable `MESOS_SANDBOX`. With
this patch, for a nested container that does not have its own rootfs, we
bind mount its sandbox to the directory specified via the agent flag
`--sandbox_directory` and set `MESOS_SANDBOX` to that directory as
well, so that such a nested container has permission to access its
sandbox via `MESOS_SANDBOX`.
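
For readers less familiar with the mode bits, the sketch below shows what changes between 0755 and 0750 for a user who is neither the owner nor in the group of the directory. That such a user is the case affected here is an assumption for illustration, and `otherCanAccess` is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>

// Hypothetical helper: returns true if users in the "other" class can both
// read and traverse a directory with the given permission bits.
bool otherCanAccess(unsigned mode)
{
  assert(mode <= 07777); // Only permission bits are expected.

  const unsigned OTHER_READ = 0004;
  const unsigned OTHER_EXECUTE = 0001;

  return (mode & OTHER_READ) != 0 && (mode & OTHER_EXECUTE) != 0;
}
```

Under 0755 the "other" class keeps read and execute access, while under 0750 it has none.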


Diffs
-

  src/slave/containerizer/mesos/containerizer.cpp 
043244841a73fa3f5f7119bc38f6d3a04be8990b 
  src/slave/containerizer/mesos/isolators/filesystem/linux.cpp 
725754f26855ea54ccf8cbcb288ee3b29e8ed4e7 


Diff: https://reviews.apache.org/r/70514/diff/1/


Testing
---


Thanks,

Qian Zhang