Re: Review Request 51028: Fix tests with rapidly triggered allocations.

2017-01-18 Thread Jacob Janco

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51028/
---

(Updated Jan. 18, 2017, 11:29 p.m.)


Review request for mesos and Jiang Yan Xu.


Changes
---

Change JIRA.


Bugs: MESOS-6904
https://issues.apache.org/jira/browse/MESOS-6904


Repository: mesos


Description
---

- Per MESOS-3157, if cluster events with the possibility
  of triggering an allocation occur rapidly and test
  assertions depend on gleaning information from assumed
  order, it is likely they will fail due to lack of parity
  between event and actual allocation. Ensure that expected
  allocations occur.


Diffs
-

  src/tests/hierarchical_allocator_tests.cpp 
1edd0ecc8a93cd41532e1cf3641f67c780ab23a5 

Diff: https://reviews.apache.org/r/51028/diff/


Testing
---

make check


Thanks,

Jacob Janco



Re: Review Request 55691: Fix XSS vulnerability in pailer invocation.

2017-01-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55691/#review162250
---



Patch looks great!

Reviews applied: [55691]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Jan. 18, 2017, 11:40 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55691/
> ---
> 
> (Updated Jan. 18, 2017, 11:40 p.m.)
> 
> 
> Review request for mesos, haosdent huang and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6947
> https://issues.apache.org/jira/browse/MESOS-6947
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fix XSS vulnerability in pailer invocation.
> 
> 
> Diffs
> -
> 
>   src/webui/master/static/js/controllers.js 
> 388ca2447716cbc7141da6a20daf2340621a16e8 
>   src/webui/master/static/pailer.html 
> 19e0981143bd7e8372b49f4f036867e9dd05727a 
> 
> Diff: https://reviews.apache.org/r/55691/diff/
> 
> 
> Testing
> ---
> 
> make -j8 + test framework + checking pailer representation of files in sandbox
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 55689: Updated version in comment about TASK_LOST behavior.

2017-01-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55689/#review162243
---



Patch looks great!

Reviews applied: [55689]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Jan. 18, 2017, 11:21 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55689/
> ---
> 
> (Updated Jan. 18, 2017, 11:21 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Updated version in comment about TASK_LOST behavior.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 8f1d6957a97eff1e0a94817d38e7a7de6d69 
>   include/mesos/v1/mesos.proto 74e7851b147ab821dceeab6e838d34b092f101c3 
> 
> Diff: https://reviews.apache.org/r/55689/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Neil Conway
> 
>



Re: Review Request 55686: Added custom target for 'make distcheck'.

2017-01-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55686/#review162238
---



Patch looks great!

Reviews applied: [55657, 55686]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Jan. 18, 2017, 10:25 p.m., Srinivas Brahmaroutu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55686/
> ---
> 
> (Updated Jan. 18, 2017, 10:25 p.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-5433
> https://issues.apache.org/jira/browse/MESOS-5433
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added custom target for 'make distcheck'.
> 
> 
> Diffs
> -
> 
>   CMakeLists.txt c6b1eb6a7c7e6ed3650847acbec0ddd3101bb96f 
> 
> Diff: https://reviews.apache.org/r/55686/diff/
> 
> 
> Testing
> ---
> 
> make dist
> 
> 
> Thanks,
> 
> Srinivas Brahmaroutu
> 
>



Re: Review Request 51028: Fix tests with rapidly triggered allocations.

2017-01-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51028/#review162237
---




src/tests/hierarchical_allocator_tests.cpp (lines 2933 - 2936)


Seems clearer if we increment allocations after the settle?



src/tests/hierarchical_allocator_tests.cpp (lines 2994 - 2998)


This test is already paused, all of these should be run with a paused 
clock, are any of them not?


- Benjamin Mahler


On Dec. 21, 2016, 5:58 a.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51028/
> ---
> 
> (Updated Dec. 21, 2016, 5:58 a.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Per MESOS-3157, if cluster events with the possibility
>   of triggering an allocation occur rapidly and test
>   assertions depend on gleaning information from assumed
>   order, it is likely they will fail due to lack of parity
>   between event and actual allocation. Ensure that expected
>   allocations occur.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_tests.cpp 
> 1edd0ecc8a93cd41532e1cf3641f67c780ab23a5 
> 
> Diff: https://reviews.apache.org/r/51028/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 52534: Dispatch filter expiration twice.

2017-01-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52534/#review162235
---


Fix it, then Ship it!




This adjusts the code to fix the problem, so looks good to me.

Longer term it would be great to have something that's less brittle for dealing 
with the timer expirations after an allocation on the agent has occurred. The 
current approach relies on non-local delay and dispatch knowledge, vs. an 
approach that was explicitly marking filters or waiting for a future (e.g. 
`process::collect(delay, allocationHasOccurred)`).


src/master/allocator/mesos/hierarchical.cpp (lines 1078 - 1087)


The naming here seems backwards? `_expire()` then `expire()`?



src/master/allocator/mesos/hierarchical.cpp (lines 1078 - 1081)


Hm.. seems like this should be incorporated to the comment above that is 
explaining when to expire the filter.

Also, maybe since the explanation is here we can just use a lambda instead 
of the additional top level function?


- Benjamin Mahler


On Jan. 12, 2017, 6:56 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52534/
> ---
> 
> (Updated Jan. 12, 2017, 6:56 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Benjamin Mahler, and Jiang Yan 
> Xu.
> 
> 
> Bugs: MESOS-6904
> https://issues.apache.org/jira/browse/MESOS-6904
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - With an asynchronous `batch()` allocation,
>   this ensures that filters do not expire
>   before the next allocation.
> - This patch should be reverted when allocation
>   occurs on resource recovery.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> a6424d624864155e1c87a28a63b784512c5c8722 
>   src/master/allocator/mesos/hierarchical.cpp 
> 91b1ec43940a788459f045ca4a4b82d4e8373bca 
> 
> Diff: https://reviews.apache.org/r/52534/diff/
> 
> 
> Testing
> ---
> 
> With https://reviews.apache.org/r/51027/: 
> 
> GTEST_FILTER="-*SmallOfferFilter*" make check -j8
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 55701: Fixed unsafe usage of process pointer in async.hpp.

2017-01-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55701/#review162234
---


Ship it!




Ship It!

- Benjamin Mahler


On Jan. 19, 2017, 2:41 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55701/
> ---
> 
> (Updated Jan. 19, 2017, 2:41 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Hindman, Benjamin 
> Mahler, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The `async(...)` helper spawns a libprocess actor to execute some
> lambda (asynchronously, of course).  This actor is owned by the
> libprocess GC actor, but the `AsyncExecutor` stores a copy of that
> pointer and dereferences it in several possible locations.
> 
> Instead, the `AsyncExecutor` should store the `PID` of the actor,
> which can be safely used, even if the actor has already terminated;
> such as turning libprocess finalization.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/async.hpp 
> 8565a52c6ba4008edb852e898b8f0b6d598a194f 
> 
> Diff: https://reviews.apache.org/r/55701/diff/
> 
> 
> Testing
> ---
> 
> Applied the following change:
> ```
> diff --git a/src/launcher/executor.cpp b/src/launcher/executor.cpp
> index e035a4e..4c60ef9 100644
> --- a/src/launcher/executor.cpp
> +++ b/src/launcher/executor.cpp
> @@ -936,6 +930,8 @@ int main(int argc, char** argv)
>  
>process::spawn(executor.get());
>process::wait(executor.get());
> +  executor.reset();
>  
> +  process::finalize(true);
>return EXIT_SUCCESS;
>  }
> ```
> 
> And then ran:
> ```
> make check GTEST_FILTER="HTTPCommandExecutorTest.TerminateWithACK"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 51027: Track allocation candidates to bound allocator.

2017-01-18 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review162229
---


Fix it, then Ship it!




The logic looks good to me, just a few minor comments.


src/master/allocator/mesos/hierarchical.hpp (lines 221 - 229)


I think it's sufficient here to just say:

```
// Allocate resources from the specified agents.
```

The rest of the comment seems to be a re-iteration of the logic inside the 
implementations. Do we need to say that the other two call into this? Or how 
exactly we do the batching? IMHO it's likely to go stale over time and the 
reader can read the implementation to see exactly how it works.

We could add something like:

```
  // Allocate any allocatable resources from all known agents,
  // the allocation is deferred and batched with other
  // allocation requests.
  process::Future allocate();
```

But it seems unnecessary given the caller just needs to understand that it 
is asynchronous (i.e. returns a Future).



src/master/allocator/mesos/hierarchical.hpp (lines 232 - 236)


Run seems a bit generic. How about `_allocate()` and `__allocate()` to make 
it clear it's the first `_allocate()` continuation?



src/master/allocator/mesos/hierarchical.cpp (lines 1267 - 1274)


This change introduces an additional trip through the allocator's queue 
after the allocation completes and before the delay is called.

This would avoid it:

```
auto pid = self();

// TODO: Use process::loop for the allocation loop.
allocate()
  .onAny([pid]() {
delay(allocationInterval, self(), ::batch);
  }
```

Not sure if this was intentional or not.



src/master/allocator/mesos/hierarchical.cpp (lines 1301 - 1305)


This seems pretty self explanatory, no need for this comment IMHO. The 
example is likely to go stale.



src/master/allocator/mesos/hierarchical.cpp (lines 1308 - 1309)


Why the newline?


- Benjamin Mahler


On Jan. 12, 2017, 6:55 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> ---
> 
> (Updated Jan. 12, 2017, 6:55 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus 
> Ma, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6904
> https://issues.apache.org/jira/browse/MESOS-6904
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> a6424d624864155e1c87a28a63b784512c5c8722 
>   src/master/allocator/mesos/hierarchical.cpp 
> 91b1ec43940a788459f045ca4a4b82d4e8373bca 
> 
> Diff: https://reviews.apache.org/r/51027/diff/
> 
> 
> Testing
> ---
> 
> make check with the filters below
> 
> Broken tests: 
> - TEST_F(HierarchicalAllocatorTest, SuppressAndReviveOffers), fix in 51028
> - TEST_F(HierarchicalAllocatorTest, AllocationRunsMetric), fix in 51028
> - TEST_F(HierarchicalAllocatorTest, AllocationRunTimerMetrics), fix in 51028
> - TEST_F(HierarchicalAllocatorTest, UpdateWeight), fix in 51028
> - TEST_P(HierarchicalAllocator_BENCHMARK_Test, AddAndUpdateSlave), fix in 
> 51028
> - TEST_F(HierarchicalAllocatorTest, SmallOfferFilterTimeout), fix in 52534
> - TEST_F(OversubscriptionTest, RescindRevocableOfferWithIncreasedRevocable), 
> fix in 51621
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Review Request 55701: Fixed unsafe usage of process pointer in async.hpp.

2017-01-18 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55701/
---

Review request for mesos, Benjamin Bannier, Benjamin Hindman, Benjamin Mahler, 
and Michael Park.


Repository: mesos


Description
---

The `async(...)` helper spawns a libprocess actor to execute some
lambda (asynchronously, of course).  This actor is owned by the
libprocess GC actor, but the `AsyncExecutor` stores a copy of that
pointer and dereferences it in several possible locations.

Instead, the `AsyncExecutor` should store the `PID` of the actor,
which can be safely used, even if the actor has already terminated;
such as turning libprocess finalization.


Diffs
-

  3rdparty/libprocess/include/process/async.hpp 
8565a52c6ba4008edb852e898b8f0b6d598a194f 

Diff: https://reviews.apache.org/r/55701/diff/


Testing
---

Applied the following change:
```
diff --git a/src/launcher/executor.cpp b/src/launcher/executor.cpp
index e035a4e..4c60ef9 100644
--- a/src/launcher/executor.cpp
+++ b/src/launcher/executor.cpp
@@ -936,6 +930,8 @@ int main(int argc, char** argv)
 
   process::spawn(executor.get());
   process::wait(executor.get());
+  executor.reset();
 
+  process::finalize(true);
   return EXIT_SUCCESS;
 }
```

And then ran:
```
make check GTEST_FILTER="HTTPCommandExecutorTest.TerminateWithACK"
```


Thanks,

Joseph Wu



Re: Review Request 55037: Added default PATH value in `launch.cpp` for Windows case.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55037/
---

(Updated Jan. 19, 2017, 2:13 a.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Bugs: MESOS-6839
https://issues.apache.org/jira/browse/MESOS-6839


Repository: mesos


Description
---

Currently, when the containerizer launches a process with the POSIX
launcher, it will check if the `$PATH` environment variable (or `%PATH%`
in the case of Windows) is set in the `LaunchInfo`. If it is not, we
supply a default path. Unfortunately, this default path is specific to
POSIX. In many of our tests, this causes many of our tests to be unable
to find Windows-standard executables like `ping`, and subsequently fail.

This commit will introduce a function, `defaultPath` that returns a
sensible default path for both POSIX and Windows. Since the Windows
implementation of this depends on the configuration of the host running
the containerizer (rather than, say, the one creating the `TaskInfo`),
we choose to implement this in `launch.cpp` instead of Stout, where a
user could mistakenly call it and expect the same output on all hosts.


Diffs (updated)
-

  src/slave/containerizer/docker.cpp 7a8a7271b54af0b4dcdae7a2aa8a90a8a7d05fd6 
  src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
ea91c71fdfac48a2fc1d31a0ee088a73244be367 
  
src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/port_mapper.cpp
 ab4b88acddc7503e16e3730320df39a2f104539a 
  src/slave/containerizer/mesos/launch.cpp 
e482ab8bdfc358f695b87cda72ca59fb64cd8c4d 

Diff: https://reviews.apache.org/r/55037/diff/


Testing
---


Thanks,

Alex Clemmer



Re: Review Request 55037: Added default PATH value in `launch.cpp` for Windows case.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55037/
---

(Updated Jan. 19, 2017, 2:09 a.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Bugs: MESOS-6839
https://issues.apache.org/jira/browse/MESOS-6839


Repository: mesos


Description
---

Currently, when the containerizer launches a process with the POSIX
launcher, it will check if the `$PATH` environment variable (or `%PATH%`
in the case of Windows) is set in the `LaunchInfo`. If it is not, we
supply a default path. Unfortunately, this default path is specific to
POSIX. In many of our tests, this causes many of our tests to be unable
to find Windows-standard executables like `ping`, and subsequently fail.

This commit will introduce a function, `defaultPath` that returns a
sensible default path for both POSIX and Windows. Since the Windows
implementation of this depends on the configuration of the host running
the containerizer (rather than, say, the one creating the `TaskInfo`),
we choose to implement this in `launch.cpp` instead of Stout, where a
user could mistakenly call it and expect the same output on all hosts.


Diffs (updated)
-

  src/slave/containerizer/docker.cpp 7a8a7271b54af0b4dcdae7a2aa8a90a8a7d05fd6 
  src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
ea91c71fdfac48a2fc1d31a0ee088a73244be367 
  
src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/port_mapper.cpp
 ab4b88acddc7503e16e3730320df39a2f104539a 
  src/slave/containerizer/mesos/launch.cpp 
e482ab8bdfc358f695b87cda72ca59fb64cd8c4d 
  src/slave/containerizer/mesos/utils.hpp 
a54106dc4893bb222f42ede936ac9029e817faf9 

Diff: https://reviews.apache.org/r/55037/diff/


Testing
---


Thanks,

Alex Clemmer



Re: Review Request 55037: Added default PATH value in `launch.cpp` for Windows case.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55037/
---

(Updated Jan. 19, 2017, 2:06 a.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Changes
---

Addressed Joseph's comments.


Bugs: MESOS-6839
https://issues.apache.org/jira/browse/MESOS-6839


Repository: mesos


Description
---

Currently, when the containerizer launches a process with the POSIX
launcher, it will check if the `$PATH` environment variable (or `%PATH%`
in the case of Windows) is set in the `LaunchInfo`. If it is not, we
supply a default path. Unfortunately, this default path is specific to
POSIX. In many of our tests, this causes many of our tests to be unable
to find Windows-standard executables like `ping`, and subsequently fail.

This commit will introduce a function, `defaultPath` that returns a
sensible default path for both POSIX and Windows. Since the Windows
implementation of this depends on the configuration of the host running
the containerizer (rather than, say, the one creating the `TaskInfo`),
we choose to implement this in `launch.cpp` instead of Stout, where a
user could mistakenly call it and expect the same output on all hosts.


Diffs (updated)
-

  src/slave/containerizer/docker.cpp 7a8a7271b54af0b4dcdae7a2aa8a90a8a7d05fd6 
  src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
ea91c71fdfac48a2fc1d31a0ee088a73244be367 
  
src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/port_mapper.cpp
 ab4b88acddc7503e16e3730320df39a2f104539a 
  src/slave/containerizer/mesos/launch.cpp 
e482ab8bdfc358f695b87cda72ca59fb64cd8c4d 
  src/slave/containerizer/mesos/utils.hpp 
a54106dc4893bb222f42ede936ac9029e817faf9 

Diff: https://reviews.apache.org/r/55037/diff/


Testing
---


Thanks,

Alex Clemmer



Review Request 55699: Stout: Added `host_default_path`.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55699/
---

Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.


Repository: mesos


Description
---

Stout: Added `host_default_path`.


Diffs
-

  3rdparty/stout/include/stout/posix/os.hpp 
397c2d6b06ddb607d254eae8258da5151c5f57e0 
  3rdparty/stout/include/stout/windows/os.hpp 
b4a66ba8ddbc60f7bcc9aad7fef4e158ae3a96d6 

Diff: https://reviews.apache.org/r/55699/diff/


Testing
---


Thanks,

Alex Clemmer



Re: Review Request 55271: Disallow multi-role frameworks to change their roles.

2017-01-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55271/#review162226
---



Patch looks great!

Reviews applied: [55381, 55571, 55271]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Jan. 18, 2017, 7:07 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55271/
> ---
> 
> (Updated Jan. 18, 2017, 7:07 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Jay Guo, and Guangya Liu.
> 
> 
> Bugs: MESOS-6631
> https://issues.apache.org/jira/browse/MESOS-6631
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> We currently do not allow `MULTI_ROLE` frameworks to change their
> roles. This restriction will be lifted later.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 8e8a9037af11cf95961b6498540a0fd486ed091b 
>   src/tests/master_validation_tests.cpp 
> c092362152e1fe8a6b615c2eda171d852c1bbd86 
> 
> Diff: https://reviews.apache.org/r/55271/diff/
> 
> 
> Testing
> ---
> 
> Tested on various Linux configurations in internal CI.
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Review Request 55694: CMake: Separated Stout system headers from Stout API headers.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55694/
---

Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.


Bugs: MESOS-3932
https://issues.apache.org/jira/browse/MESOS-3932


Repository: mesos


Description
---

MESOS-3932 specifies a bug that causes spurious warnings to be printed
when building Mesos.

Many of these warnings can be eliminated simply by correctly marking off
the third-party dependencies as `SYSTEM` headers when we specify the
include path.

This commit will thus separate out the headers of Stout's third-party
dependencies from the core Stout API headers, and include them as
`SYSTEM` headers.


Diffs
-

  3rdparty/stout/cmake/StoutConfigure.cmake 
bc27ac687bae4e1798eece562027ba33c6b32348 
  3rdparty/stout/cmake/StoutTestsConfigure.cmake 
d3bd72e8eba77213095da6cabb3a6d6f4d30941c 
  3rdparty/stout/tests/CMakeLists.txt a8a3ac772aa243c848a6fd8d7a0d45acfe1b98ae 

Diff: https://reviews.apache.org/r/55694/diff/


Testing
---


Thanks,

Alex Clemmer



Review Request 55695: CMake: Separated Libprocess system headers from Libprocess API headers.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55695/
---

Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.


Bugs: MESOS-3932
https://issues.apache.org/jira/browse/MESOS-3932


Repository: mesos


Description
---

MESOS-3932 specifies a bug that causes spurious warnings to be printed
when building Mesos.

Many of these warnings can be eliminated simply by correctly marking off
the third-party dependencies as `SYSTEM` headers when we specify the
include path.

This commit will thus separate out the headers of Libprocesses's
third-party dependencies from the core Libprocess API headers, and
include them as `SYSTEM` headers.

Notably, we choose to include Stout's headers as third-party system
headers, since the goal is for them to be completely independent
projects.


Diffs
-

  3rdparty/libprocess/cmake/ProcessConfigure.cmake 
873f41d844faa0d442f77411e94314a89be5f046 
  3rdparty/libprocess/cmake/ProcessTestsConfigure.cmake 
49ad836d5fa3f84cdf5ae0e08f449cd7ef2537a1 
  3rdparty/libprocess/src/CMakeLists.txt 
60f0e76dfd237d9a12a366b413802d1a96892b55 

Diff: https://reviews.apache.org/r/55695/diff/


Testing
---


Thanks,

Alex Clemmer



Review Request 55696: CMake: Separated Mesos system headers from Mesos API headers.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55696/
---

Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.


Bugs: MESOS-3932
https://issues.apache.org/jira/browse/MESOS-3932


Repository: mesos


Description
---

MESOS-3932 specifies a bug that causes spurious warnings to be printed
when building Mesos.

Many of these warnings can be eliminated simply by correctly marking off
the third-party dependencies as `SYSTEM` headers when we specify the
include path.

This commit will thus separate out the headers of Mesos's
third-party dependencies from the core Mesos API headers, and
include them as `SYSTEM` headers.

Notably, we choose to include Stout's and Libprocess's headers as
third-party system headers, since the goal is for them to be completely
independent projects.


Diffs
-

  src/CMakeLists.txt ce71afc73f85a70cd8f97a6e913662ff7ef0d94c 
  src/docker/CMakeLists.txt 892370d603d42b649afbf05deece5e0b1d97ee98 
  src/master/CMakeLists.txt a9fb34fc7099f05e279ee0506dd18bc0745d546c 
  src/master/cmake/MasterConfigure.cmake 
3d316d6ff2910fc360b0faecb5e6ac9687a77883 
  src/slave/CMakeLists.txt 03466bda934290ce41fa6408d35ccae10c9a6f32 
  src/slave/cmake/AgentConfigure.cmake 1582127ccce773af6031a5b09252192b05a13cdc 
  src/tests/CMakeLists.txt b7adf4f3126611a57baf4f62b62b208bc0236da5 
  src/tests/cmake/MesosTestsConfigure.cmake 
8d416388a8e45a2832ae3841b58541ba5b0613bc 

Diff: https://reviews.apache.org/r/55696/diff/


Testing
---


Thanks,

Alex Clemmer



Re: Review Request 55683: Rationalize process wait error checking.

2017-01-18 Thread James Peach


> On Jan. 19, 2017, 12:43 a.m., Jiang Yan Xu wrote:
> > src/launcher/fetcher.cpp, line 114
> > 
> >
> > It's worth noting that 
> > 
> > ```
> > !WSUCCESS()
> > ```
> > 
> > translates to 
> > 
> > ```
> > !WIFEXITED(status) || WEXITSTATUS(status) != 0
> > ```
> > 
> > so it's not exactly the same as 
> > 
> > 
> > ```
> > WIFEXITED(status) && WEXITSTATUS(status) != 0
> > ```
> > 
> > But the result seems to be more correct than the previous: previously 
> > if the subprocess is signaled, the fetcher would return `true`.
> > 
> > This is then a bug fix which we could note in the description.

Right :)


> On Jan. 19, 2017, 12:43 a.m., Jiang Yan Xu wrote:
> > src/common/status_utils.hpp, line 26
> > 
> >
> > Use an adjective? `WSUCCESSFUL` for a boolean concept?

Agreed to rename it to `WSUCCEEDED`.


> On Jan. 19, 2017, 12:43 a.m., Jiang Yan Xu wrote:
> > src/common/status_utils.hpp, line 28
> > 
> >
> > I guess the same `#ifdef __WINDOWS__` guards apply?

Windows folks think this is probably OK.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55683/#review162196
---


On Jan. 18, 2017, 6:25 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55683/
> ---
> 
> (Updated Jan. 18, 2017, 6:25 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Joris Van Remoortere, Michael 
> Park, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6946
> https://issues.apache.org/jira/browse/MESOS-6946
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Introduce the WSUCCESS() API to test that a forked process exited
> successfully. Use it in all the places that were performing this test
> bespokely, and update error logs to user WSTRINGIFY() if appropriate.
> 
> 
> Diffs
> -
> 
>   src/common/status_utils.hpp 26b94d49973dd61a6138e6a64d5ec0bdec6dea7e 
>   src/docker/executor.cpp 9b5c469e2d0f33e228ec746711e6bc6ed352cbc7 
>   src/hdfs/hdfs.cpp bebba8bb3c16f40293d21b5e00c353135a0f8f26 
>   src/launcher/default_executor.cpp 57e4799e750f8f5352a9fec58af40efe432ea865 
>   src/launcher/executor.cpp e035a4ee4438a3342f68a5548a5fd8d57a315cfa 
>   src/launcher/fetcher.cpp 5f9a38d3ae8fb9b4b21203f96d53c5bef0909eba 
>   src/linux/ns.hpp da43266ba67f76c9ca856584a540794062097691 
>   src/slave/containerizer/docker.cpp 7a8a7271b54af0b4dcdae7a2aa8a90a8a7d05fd6 
>   src/slave/containerizer/mesos/io/switchboard.cpp 
> 2c4eed8f2dc7f91b9fd08f91f1b08484fca9a9af 
>   src/slave/containerizer/mesos/launch.cpp 
> e482ab8bdfc358f695b87cda72ca59fb64cd8c4d 
>   src/tests/script.cpp ef6b22aaeb056ed489f2c519e1a02644f2653329 
> 
> Diff: https://reviews.apache.org/r/55683/diff/
> 
> 
> Testing
> ---
> 
> make check && sudo make check (Fedora 25)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 55683: Rationalize process wait error checking.

2017-01-18 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55683/
---

(Updated Jan. 19, 2017, 1:11 a.m.)


Review request for mesos, Benjamin Bannier, Joris Van Remoortere, Michael Park, 
and Jiang Yan Xu.


Changes
---

Rebase and address review feedback.


Bugs: MESOS-6946
https://issues.apache.org/jira/browse/MESOS-6946


Repository: mesos


Description (updated)
---

Introduce the WSUCCEEDED() API to test that a forked process exited
successfully. Use it in all the places that were performing this test
bespokely, and update error logs to user WSTRINGIFY() if appropriate.

This also fixes called that were checking the exit status but not the
signaled status.


Diffs (updated)
-

  src/common/status_utils.hpp 26b94d49973dd61a6138e6a64d5ec0bdec6dea7e 
  src/docker/executor.cpp 9b5c469e2d0f33e228ec746711e6bc6ed352cbc7 
  src/hdfs/hdfs.cpp bebba8bb3c16f40293d21b5e00c353135a0f8f26 
  src/launcher/default_executor.cpp 57e4799e750f8f5352a9fec58af40efe432ea865 
  src/launcher/executor.cpp e035a4ee4438a3342f68a5548a5fd8d57a315cfa 
  src/launcher/fetcher.cpp 5f9a38d3ae8fb9b4b21203f96d53c5bef0909eba 
  src/linux/ns.hpp da43266ba67f76c9ca856584a540794062097691 
  src/slave/containerizer/docker.cpp 7a8a7271b54af0b4dcdae7a2aa8a90a8a7d05fd6 
  src/slave/containerizer/mesos/io/switchboard.cpp 
2c4eed8f2dc7f91b9fd08f91f1b08484fca9a9af 
  src/slave/containerizer/mesos/launch.cpp 
e482ab8bdfc358f695b87cda72ca59fb64cd8c4d 
  src/tests/script.cpp ef6b22aaeb056ed489f2c519e1a02644f2653329 

Diff: https://reviews.apache.org/r/55683/diff/


Testing
---

make check && sudo make check (Fedora 25)


Thanks,

James Peach



Re: Review Request 55683: Rationalize process wait error checking.

2017-01-18 Thread Jiang Yan Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55683/#review162196
---


Fix it, then Ship it!





src/common/status_utils.hpp (line 26)


Use an adjective? `WSUCCESSFUL` for a boolean concept?



src/common/status_utils.hpp (line 28)


I guess the same `#ifdef __WINDOWS__` guards apply?



src/launcher/fetcher.cpp (line 114)


It's worth noting that 

```
!WSUCCESS()
```

translates to 

```
!WIFEXITED(status) || WEXITSTATUS(status) != 0
```

so it's not exactly the same as 

```
WIFEXITED(status) && WEXITSTATUS(status) != 0
```

But the result seems to be more correct than the previous: previously if 
the subprocess is signaled, the fetcher would return `true`.

This is then a bug fix which we could note in the description.


- Jiang Yan Xu


On Jan. 18, 2017, 10:25 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55683/
> ---
> 
> (Updated Jan. 18, 2017, 10:25 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Joris Van Remoortere, Michael 
> Park, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6946
> https://issues.apache.org/jira/browse/MESOS-6946
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Introduce the WSUCCESS() API to test that a forked process exited
> successfully. Use it in all the places that were performing this test
> bespokely, and update error logs to user WSTRINGIFY() if appropriate.
> 
> 
> Diffs
> -
> 
>   src/common/status_utils.hpp 26b94d49973dd61a6138e6a64d5ec0bdec6dea7e 
>   src/docker/executor.cpp 9b5c469e2d0f33e228ec746711e6bc6ed352cbc7 
>   src/hdfs/hdfs.cpp bebba8bb3c16f40293d21b5e00c353135a0f8f26 
>   src/launcher/default_executor.cpp 57e4799e750f8f5352a9fec58af40efe432ea865 
>   src/launcher/executor.cpp e035a4ee4438a3342f68a5548a5fd8d57a315cfa 
>   src/launcher/fetcher.cpp 5f9a38d3ae8fb9b4b21203f96d53c5bef0909eba 
>   src/linux/ns.hpp da43266ba67f76c9ca856584a540794062097691 
>   src/slave/containerizer/docker.cpp 7a8a7271b54af0b4dcdae7a2aa8a90a8a7d05fd6 
>   src/slave/containerizer/mesos/io/switchboard.cpp 
> 2c4eed8f2dc7f91b9fd08f91f1b08484fca9a9af 
>   src/slave/containerizer/mesos/launch.cpp 
> e482ab8bdfc358f695b87cda72ca59fb64cd8c4d 
>   src/tests/script.cpp ef6b22aaeb056ed489f2c519e1a02644f2653329 
> 
> Diff: https://reviews.apache.org/r/55683/diff/
> 
> 
> Testing
> ---
> 
> make check && sudo make check (Fedora 25)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 55464: Made the Agent API able to handle containers nested at arbitrary levels.

2017-01-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55464/#review162210
---



Patch looks great!

Reviews applied: [55676, 55677, 55678, 55679, 55464]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Jan. 18, 2017, 2:48 p.m., Gastón Kleiman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55464/
> ---
> 
> (Updated Jan. 18, 2017, 2:48 p.m.)
> 
> 
> Review request for mesos, Adam B, Alexander Rukletsov, Anand Mazumdar, 
> Alexander Rojas, haosdent huang, and Vinod Kone.
> 
> 
> Bugs: MESOS-6864
> https://issues.apache.org/jira/browse/MESOS-6864
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Made the Agent API able to handle containers nested at arbitrary levels.
> 
> 
> Diffs
> -
> 
>   src/slave/http.cpp ee7119179a6ddd935c3f43a618ef645619c305ee 
>   src/slave/slave.cpp ae183e6b8b40094b4764525a6d63164f210ef9d8 
>   src/tests/api_tests.cpp ea62226fbc5dd0bcc5ea60d1bbcd3748f9367467 
> 
> Diff: https://reviews.apache.org/r/55464/diff/
> 
> 
> Testing
> ---
> 
> `GTEST_FILTER="" make -j 8 check` in macOS.
> `make check` in Linux.
> 
> I also did some manual testing using a proof of concept  that makes the 
> `DefaultExecutor` leverage this change to perform CMD health checks.
> 
> 
> Thanks,
> 
> Gastón Kleiman
> 
>



Re: Review Request 55691: Fix XSS vulnerability in pailer invocation.

2017-01-18 Thread Jacob Janco

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55691/
---

(Updated Jan. 18, 2017, 11:37 p.m.)


Review request for mesos and Jiang Yan Xu.


Bugs: MESOS-6947
https://issues.apache.org/jira/browse/MESOS-6947


Repository: mesos


Description
---

Fix XSS vulnerability in pailer invocation.


Diffs
-

  src/webui/master/static/js/controllers.js 
388ca2447716cbc7141da6a20daf2340621a16e8 
  src/webui/master/static/pailer.html 19e0981143bd7e8372b49f4f036867e9dd05727a 

Diff: https://reviews.apache.org/r/55691/diff/


Testing
---

make -j8 + test framework + checking pailer representation of files in sandbox


Thanks,

Jacob Janco



Re: Review Request 55637: CMake: Added `test` target.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55637/
---

(Updated Jan. 18, 2017, 11:34 p.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.


Changes
---

Addressed Ilya's very good point.


Bugs: MESOS-3697
https://issues.apache.org/jira/browse/MESOS-3697


Repository: mesos


Description
---

This commit will introduce a `test` target to the CMake build system.
The semantics of this target are very similar to the autotools build
system, in that they build the tests, but do not run them.

Accomplishing this is somewhat complex in CMake, because CMake expects
to control the `test` target itself. We work around this by (1)
silencing the warning that `test` is a reserved target, (2) removing the
call to `enable_testing` that sets up CTest, and (3) adding our own
`test` target. As an additional insurance policy, we error out if any of
the `BUILD_TESTING` flags are defined, which would indicate that
`include(CTest)` was called.


Diffs (updated)
-

  CMakeLists.txt c6b1eb6a7c7e6ed3650847acbec0ddd3101bb96f 

Diff: https://reviews.apache.org/r/55637/diff/


Testing
---


Thanks,

Alex Clemmer



Re: Review Request 55637: CMake: Added `test` target.

2017-01-18 Thread Alex Clemmer


> On Jan. 17, 2017, 8:46 p.m., Ilya Pronin wrote:
> > I think the name "test" can confuse people because it's usually expected to 
> > be used for CTest. Maybe it would better be leave old {{tests}} and 
> > {{check}} targets? https://cmake.org/Wiki/CMakeEmulateMakeCheck
> 
> Alex Clemmer wrote:
> That's a great question. I'm not a CMake native, so it's not clear to me 
> how weird this would be to do. The idea was to expose a `test` target to 
> achieve something resembling script parity. Is this idea not compelling to 
> you?
> 
> Ilya Pronin wrote:
> Very compelling. I just meant that `make test` is usually expected to not 
> only build tests but also run them, i.e. act the same way as autotools 
> generated `make check`. E.g. debhelper's `dh_auto_test` looks for `test` or 
> `check` target in a Makefile. So I suggested that target that just builds 
> tests should be called `tests`.
> 
> Alex Clemmer wrote:
> Ah, ok, let me just re-state what I think is true to make sure we're on 
> the page. Right now the autotools build has the semantics that `make check` 
> will build + run tests, and `make test` will simply build them. So the 
> original idea here was that `make test` achieve script parity for our own 
> tooling. And it sounds like what you're saying is that this should actually 
> be the goal -- the goal should be that we do something more congruent with 
> the normal use of the CMake semantics of `make test` and then have another 
> target (which could be called, _e.g._, `make tests`, with an `s`) that 
> implements these semantics.
> 
> Is that all approximately correct?
> 
> If so, let's wait and see what `kaysoky` says. He has a better 
> understanding for the impact of changing the semantics of the `make test` 
> target. IMHO this is a sensible suggestion, but it's important to get input 
> for how much work this creates for the downstream consumers of the `make 
> test` target.
> 
> Ilya Pronin wrote:
> Yes, except for 1 thing: current autotools target that builds tests is 
> called `tests` :) 
> https://github.com/apache/mesos/blob/master/src/Makefile.am#L2490

Oh, oh, oh. Got it. In my defense, I don't know what the hell I'm doing. :)

Let's discard the reviews that disable CTest and re-write this to define the 
target as `tests`.


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55637/#review161940
---


On Jan. 17, 2017, 7:33 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55637/
> ---
> 
> (Updated Jan. 17, 2017, 7:33 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Bugs: MESOS-3697
> https://issues.apache.org/jira/browse/MESOS-3697
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This commit will introduce a `test` target to the CMake build system.
> The semantics of this target are very similar to the autotools build
> system, in that they build the tests, but do not run them.
> 
> Accomplishing this is somewhat complex in CMake, because CMake expects
> to control the `test` target itself. We work around this by (1)
> silencing the warning that `test` is a reserved target, (2) removing the
> call to `enable_testing` that sets up CTest, and (3) adding our own
> `test` target. As an additional insurance policy, we error out if any of
> the `BUILD_TESTING` flags are defined, which would indicate that
> `include(CTest)` was called.
> 
> 
> Diffs
> -
> 
>   CMakeLists.txt c6b1eb6a7c7e6ed3650847acbec0ddd3101bb96f 
>   cmake/MesosConfigure.cmake 6a9ed9dc02f5c9a6d1fce0866f19ffeafec35cdc 
>   src/tests/CMakeLists.txt b7adf4f3126611a57baf4f62b62b208bc0236da5 
> 
> Diff: https://reviews.apache.org/r/55637/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Review Request 55691: Fix XSS vulnerability in pailer invocation.

2017-01-18 Thread Jacob Janco

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55691/
---

Review request for mesos.


Repository: mesos


Description
---

Fix XSS vulnerability in pailer invocation.


Diffs
-

  src/webui/master/static/js/controllers.js 
388ca2447716cbc7141da6a20daf2340621a16e8 
  src/webui/master/static/pailer.html 19e0981143bd7e8372b49f4f036867e9dd05727a 

Diff: https://reviews.apache.org/r/55691/diff/


Testing
---

make -j8 + test framework + checking pailer representation of files in sandbox


Thanks,

Jacob Janco



Review Request 55689: Updated version in comment about TASK_LOST behavior.

2017-01-18 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55689/
---

Review request for mesos and Vinod Kone.


Repository: mesos


Description
---

Updated version in comment about TASK_LOST behavior.


Diffs
-

  include/mesos/mesos.proto 8f1d6957a97eff1e0a94817d38e7a7de6d69 
  include/mesos/v1/mesos.proto 74e7851b147ab821dceeab6e838d34b092f101c3 

Diff: https://reviews.apache.org/r/55689/diff/


Testing
---


Thanks,

Neil Conway



Re: Review Request 55328: Windows: Added passing GC tests to the build.

2017-01-18 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55328/#review162204
---


Ship it!





src/tests/gc_tests.cpp (lines 376 - 378)


Spacing here.  I can fix this up.


- Joseph Wu


On Jan. 8, 2017, 3:53 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55328/
> ---
> 
> (Updated Jan. 8, 2017, 3:53 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Bugs: MESOS-6707
> https://issues.apache.org/jira/browse/MESOS-6707
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This issue resolves MESOS-6707. See #55327 for the resolution that
> allows us to light up the tests; essentially, we resolved a bug in the
> Windows implementation of `os::rmdir` that causes the Agent sandbox
> deletion to error out instead of completing successfully.
> 
> 
> Diffs
> -
> 
>   src/tests/gc_tests.cpp c4841e85f785189681e218df714cac2038d77803 
> 
> Diff: https://reviews.apache.org/r/55328/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55313: Windows: Fixed the unkillable task bug, lit up executor tests.

2017-01-18 Thread Alex Clemmer


> On Jan. 18, 2017, 11:04 p.m., Joseph Wu wrote:
> > src/tests/default_executor_tests.cpp, lines 442-444
> > 
> >
> > Here too.

Oh, shoot, sorry. I should have caught that.


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55313/#review162200
---


On Jan. 18, 2017, 8:49 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55313/
> ---
> 
> (Updated Jan. 18, 2017, 8:49 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Bugs: MESOS-6698, MESOS-6839 and MESOS-6870
> https://issues.apache.org/jira/browse/MESOS-6698
> https://issues.apache.org/jira/browse/MESOS-6839
> https://issues.apache.org/jira/browse/MESOS-6870
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> MESOS-6839 tracks a bug that causes the current implementation of the
> default executor to be unable to delete any processes associated with a
> task. To understand why requires some knowledge of the differences
> between the process model of Windows and Unix.
> 
> In Unix, there is a robust notion of a process tree, with a well-defined
> notion of process groups, sessions, signal delivery on the tree, and so
> on. Windows lacks a robust notion of a process hierarchy, and therefore
> largely has no equivalents to these constructs (including, notably,
> signal semantics).
> 
> One of the problems this mismatch causes Mesos is that it complicates
> the problem of killing a task, which is at its core a group of
> processes. On Windows, the easiest way to make a process and all its
> descendents easily killable is to track these processes in a Job Object,
> which is a Windows kernel construct similar in principle to Linux's
> control groups (though with different ideas of process namespacing).
> 
> There is some subtlety in making sure _all_ processes associated with a
> task are captured inside a Job Object. The most important consideration
> is that we need to make sure to add any process to the Job Object before
> it has a chance to create any child processes; if we fail to do this,
> the children will not be captured in the Job Object.
> 
> The solution to this is fairly simple on Windows. The process creation
> API allows users to trivially create a process in a suspended state, so
> that the Windows kernel scheduler does not schedule the process to run
> until the user explicitly resumes the main thread. This allows us to
> create the process and add it to a Job Object before it has a chance to
> create children, and then start the process.
> 
> This commit will accomplish this by changing `PosixLauncher::fork` to
> use the Subprocess parent hooks API, which implements exactly this
> semantics. Essentially, the launcher will launch the containerizer
> process, which will inspect the TaskInfo or the environment for a task
> to launch, and then launch it. Using the parent hooks API, Subprocess
> will create the containerizer process on Windows in a suspended state,
> and then the parent hook supplied by the launcher will add that process
> to a Job Object before it has a chance to run. Finally, Subprocess will
> mark the process as runnable, and return.
> 
> This commit resolves MESOS-6839. We also light up the executor tests, so
> it also resolves MESOS-6870 and MESOS-6839.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/launcher.cpp 
> a6a8c01cb39f35f8174fcb5af0ef18de2da5ee78 
>   src/tests/command_executor_tests.cpp 
> 4d5c21ec427ebaac053e56ae554cb466dfeb0b8b 
>   src/tests/default_executor_tests.cpp 
> ec3e854ed58a0fbb3bfad0bd21eb0e2974548865 
> 
> Diff: https://reviews.apache.org/r/55313/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55313: Windows: Fixed the unkillable task bug, lit up executor tests.

2017-01-18 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55313/#review162202
---


Ship it!




Ship It!

- Joseph Wu


On Jan. 18, 2017, 12:49 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55313/
> ---
> 
> (Updated Jan. 18, 2017, 12:49 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Bugs: MESOS-6698, MESOS-6839 and MESOS-6870
> https://issues.apache.org/jira/browse/MESOS-6698
> https://issues.apache.org/jira/browse/MESOS-6839
> https://issues.apache.org/jira/browse/MESOS-6870
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> MESOS-6839 tracks a bug that causes the current implementation of the
> default executor to be unable to delete any processes associated with a
> task. To understand why requires some knowledge of the differences
> between the process model of Windows and Unix.
> 
> In Unix, there is a robust notion of a process tree, with a well-defined
> notion of process groups, sessions, signal delivery on the tree, and so
> on. Windows lacks a robust notion of a process hierarchy, and therefore
> largely has no equivalents to these constructs (including, notably,
> signal semantics).
> 
> One of the problems this mismatch causes Mesos is that it complicates
> the problem of killing a task, which is at its core a group of
> processes. On Windows, the easiest way to make a process and all its
> descendents easily killable is to track these processes in a Job Object,
> which is a Windows kernel construct similar in principle to Linux's
> control groups (though with different ideas of process namespacing).
> 
> There is some subtlety in making sure _all_ processes associated with a
> task are captured inside a Job Object. The most important consideration
> is that we need to make sure to add any process to the Job Object before
> it has a chance to create any child processes; if we fail to do this,
> the children will not be captured in the Job Object.
> 
> The solution to this is fairly simple on Windows. The process creation
> API allows users to trivially create a process in a suspended state, so
> that the Windows kernel scheduler does not schedule the process to run
> until the user explicitly resumes the main thread. This allows us to
> create the process and add it to a Job Object before it has a chance to
> create children, and then start the process.
> 
> This commit will accomplish this by changing `PosixLauncher::fork` to
> use the Subprocess parent hooks API, which implements exactly this
> semantics. Essentially, the launcher will launch the containerizer
> process, which will inspect the TaskInfo or the environment for a task
> to launch, and then launch it. Using the parent hooks API, Subprocess
> will create the containerizer process on Windows in a suspended state,
> and then the parent hook supplied by the launcher will add that process
> to a Job Object before it has a chance to run. Finally, Subprocess will
> mark the process as runnable, and return.
> 
> This commit resolves MESOS-6839. We also light up the executor tests, so
> it also resolves MESOS-6870 and MESOS-6839.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/launcher.cpp 
> a6a8c01cb39f35f8174fcb5af0ef18de2da5ee78 
>   src/tests/command_executor_tests.cpp 
> 4d5c21ec427ebaac053e56ae554cb466dfeb0b8b 
>   src/tests/default_executor_tests.cpp 
> ec3e854ed58a0fbb3bfad0bd21eb0e2974548865 
> 
> Diff: https://reviews.apache.org/r/55313/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55313: Windows: Fixed the unkillable task bug, lit up executor tests.

2017-01-18 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55313/#review162200
---




src/tests/command_executor_tests.cpp (lines 425 - 427)


I'll fix up the spacing here, now that we've gotten rid of that absurdly 
long macro :)



src/tests/default_executor_tests.cpp (lines 442 - 444)


Here too.


- Joseph Wu


On Jan. 18, 2017, 12:49 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55313/
> ---
> 
> (Updated Jan. 18, 2017, 12:49 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Bugs: MESOS-6698, MESOS-6839 and MESOS-6870
> https://issues.apache.org/jira/browse/MESOS-6698
> https://issues.apache.org/jira/browse/MESOS-6839
> https://issues.apache.org/jira/browse/MESOS-6870
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> MESOS-6839 tracks a bug that causes the current implementation of the
> default executor to be unable to delete any processes associated with a
> task. To understand why requires some knowledge of the differences
> between the process model of Windows and Unix.
> 
> In Unix, there is a robust notion of a process tree, with a well-defined
> notion of process groups, sessions, signal delivery on the tree, and so
> on. Windows lacks a robust notion of a process hierarchy, and therefore
> largely has no equivalents to these constructs (including, notably,
> signal semantics).
> 
> One of the problems this mismatch causes Mesos is that it complicates
> the problem of killing a task, which is at its core a group of
> processes. On Windows, the easiest way to make a process and all its
> descendents easily killable is to track these processes in a Job Object,
> which is a Windows kernel construct similar in principle to Linux's
> control groups (though with different ideas of process namespacing).
> 
> There is some subtlety in making sure _all_ processes associated with a
> task are captured inside a Job Object. The most important consideration
> is that we need to make sure to add any process to the Job Object before
> it has a chance to create any child processes; if we fail to do this,
> the children will not be captured in the Job Object.
> 
> The solution to this is fairly simple on Windows. The process creation
> API allows users to trivially create a process in a suspended state, so
> that the Windows kernel scheduler does not schedule the process to run
> until the user explicitly resumes the main thread. This allows us to
> create the process and add it to a Job Object before it has a chance to
> create children, and then start the process.
> 
> This commit will accomplish this by changing `PosixLauncher::fork` to
> use the Subprocess parent hooks API, which implements exactly this
> semantics. Essentially, the launcher will launch the containerizer
> process, which will inspect the TaskInfo or the environment for a task
> to launch, and then launch it. Using the parent hooks API, Subprocess
> will create the containerizer process on Windows in a suspended state,
> and then the parent hook supplied by the launcher will add that process
> to a Job Object before it has a chance to run. Finally, Subprocess will
> mark the process as runnable, and return.
> 
> This commit resolves MESOS-6839. We also light up the executor tests, so
> it also resolves MESOS-6870 and MESOS-6839.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/launcher.cpp 
> a6a8c01cb39f35f8174fcb5af0ef18de2da5ee78 
>   src/tests/command_executor_tests.cpp 
> 4d5c21ec427ebaac053e56ae554cb466dfeb0b8b 
>   src/tests/default_executor_tests.cpp 
> ec3e854ed58a0fbb3bfad0bd21eb0e2974548865 
> 
> Diff: https://reviews.apache.org/r/55313/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55314: Windows: Added working Agent tests, moving pass rate ~75% -> ~87%.

2017-01-18 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55314/#review162198
---


Ship it!




Ship It!

- Joseph Wu


On Jan. 7, 2017, 11:20 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55314/
> ---
> 
> (Updated Jan. 7, 2017, 11:20 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Bugs: MESOS-6714
> https://issues.apache.org/jira/browse/MESOS-6714
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Windows: Added working Agent tests, moving pass rate ~75% -> ~87%.
> 
> This commit partially addresses MESOS-6714.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_tests.cpp 67a6aed8c66a21c94c106b52dff75cbdc41fcf69 
> 
> Diff: https://reviews.apache.org/r/55314/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55637: CMake: Added `test` target.

2017-01-18 Thread Ilya Pronin


> On Jan. 17, 2017, 8:46 p.m., Ilya Pronin wrote:
> > I think the name "test" can confuse people because it's usually expected to 
> > be used for CTest. Maybe it would better be leave old {{tests}} and 
> > {{check}} targets? https://cmake.org/Wiki/CMakeEmulateMakeCheck
> 
> Alex Clemmer wrote:
> That's a great question. I'm not a CMake native, so it's not clear to me 
> how weird this would be to do. The idea was to expose a `test` target to 
> achieve something resembling script parity. Is this idea not compelling to 
> you?
> 
> Ilya Pronin wrote:
> Very compelling. I just meant that `make test` is usually expected to not 
> only build tests but also run them, i.e. act the same way as autotools 
> generated `make check`. E.g. debhelper's `dh_auto_test` looks for `test` or 
> `check` target in a Makefile. So I suggested that target that just builds 
> tests should be called `tests`.
> 
> Alex Clemmer wrote:
> Ah, ok, let me just re-state what I think is true to make sure we're on 
> the page. Right now the autotools build has the semantics that `make check` 
> will build + run tests, and `make test` will simply build them. So the 
> original idea here was that `make test` achieve script parity for our own 
> tooling. And it sounds like what you're saying is that this should actually 
> be the goal -- the goal should be that we do something more congruent with 
> the normal use of the CMake semantics of `make test` and then have another 
> target (which could be called, _e.g._, `make tests`, with an `s`) that 
> implements these semantics.
> 
> Is that all approximately correct?
> 
> If so, let's wait and see what `kaysoky` says. He has a better 
> understanding for the impact of changing the semantics of the `make test` 
> target. IMHO this is a sensible suggestion, but it's important to get input 
> for how much work this creates for the downstream consumers of the `make 
> test` target.

Yes, except for 1 thing: current autotools target that builds tests is called 
`tests` :) https://github.com/apache/mesos/blob/master/src/Makefile.am#L2490


- Ilya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55637/#review161940
---


On Jan. 17, 2017, 7:33 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55637/
> ---
> 
> (Updated Jan. 17, 2017, 7:33 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Bugs: MESOS-3697
> https://issues.apache.org/jira/browse/MESOS-3697
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This commit will introduce a `test` target to the CMake build system.
> The semantics of this target are very similar to the autotools build
> system, in that they build the tests, but do not run them.
> 
> Accomplishing this is somewhat complex in CMake, because CMake expects
> to control the `test` target itself. We work around this by (1)
> silencing the warning that `test` is a reserved target, (2) removing the
> call to `enable_testing` that sets up CTest, and (3) adding our own
> `test` target. As an additional insurance policy, we error out if any of
> the `BUILD_TESTING` flags are defined, which would indicate that
> `include(CTest)` was called.
> 
> 
> Diffs
> -
> 
>   CMakeLists.txt c6b1eb6a7c7e6ed3650847acbec0ddd3101bb96f 
>   cmake/MesosConfigure.cmake 6a9ed9dc02f5c9a6d1fce0866f19ffeafec35cdc 
>   src/tests/CMakeLists.txt b7adf4f3126611a57baf4f62b62b208bc0236da5 
> 
> Diff: https://reviews.apache.org/r/55637/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55312: Windows: Added parent hooks to subprocess.

2017-01-18 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55312/#review162189
---


Ship it!




`CREATE_JOB` code can use a few minor tweaks, which I'll make before committing.


3rdparty/libprocess/src/subprocess.cpp (lines 62 - 70)


I'm going to move most of this comment into the definition of the function, 
as a header comment.



3rdparty/libprocess/src/subprocess.cpp (lines 75 - 77)


Remove the `else` here, but keep the `return`.


- Joseph Wu


On Jan. 18, 2017, 12:49 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55312/
> ---
> 
> (Updated Jan. 18, 2017, 12:49 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The subprocess API exposes various hooks to customize behavior as we
> create the child process. Currently, these hooks are unimplemented in
> the Windows codepaths. This commit will implement the parent hooks --
> which are executed after the child process is created, but before it
> begins execution -- on the Windows codepath.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/subprocess_base.hpp 
> 74a4bef0706334b4d3c1040a08a8921fbc300344 
>   3rdparty/libprocess/include/process/windows/subprocess.hpp 
> 3bc7f1992d9c38dac2ec23d5bc57415f37d0318a 
>   3rdparty/libprocess/src/subprocess.cpp 
> ad19b0896b4a2e9c60f573cc854c10c69e909e86 
> 
> Diff: https://reviews.apache.org/r/55312/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55657: Added cpack to create source package.

2017-01-18 Thread Srinivas Brahmaroutu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55657/
---

(Updated Jan. 18, 2017, 10:25 p.m.)


Review request for mesos, Alex Clemmer and Joseph Wu.


Bugs: MESOS-4245
https://issues.apache.org/jira/browse/MESOS-4245


Repository: mesos


Description
---

Added cpack to create source package.


Diffs (updated)
-

  CMakeLists.txt c6b1eb6a7c7e6ed3650847acbec0ddd3101bb96f 

Diff: https://reviews.apache.org/r/55657/diff/


Testing
---


Thanks,

Srinivas Brahmaroutu



Re: Review Request 55686: Added custom target for 'make distcheck'.

2017-01-18 Thread Srinivas Brahmaroutu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55686/
---

(Updated Jan. 18, 2017, 10:25 p.m.)


Review request for mesos, Alex Clemmer and Joseph Wu.


Changes
---

make distcheck will untar the TGZ file under build directory
The custom target will then create a build directory into that
directory and runs a 'make check' to validate the source archive.


Bugs: MESOS-5433
https://issues.apache.org/jira/browse/MESOS-5433


Repository: mesos


Description
---

Added custom target for 'make distcheck'.


Diffs (updated)
-

  CMakeLists.txt c6b1eb6a7c7e6ed3650847acbec0ddd3101bb96f 

Diff: https://reviews.apache.org/r/55686/diff/


Testing
---

make dist


Thanks,

Srinivas Brahmaroutu



Re: Review Request 55657: Added cpack to create source package.

2017-01-18 Thread Srinivas Brahmaroutu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55657/
---

(Updated Jan. 18, 2017, 10:23 p.m.)


Review request for mesos, Alex Clemmer and Joseph Wu.


Changes
---

Currently the cpack source package is create in TGZ format.
The sources packages are comparable to the make dist output.
There may be a need to add support to add additional source
files to the archive at some point.


Summary (updated)
-

Added cpack to create source package.


Bugs: MESOS-4245
https://issues.apache.org/jira/browse/MESOS-4245


Repository: mesos


Description (updated)
---

Added cpack to create source package.


Diffs (updated)
-

  CMakeLists.txt c6b1eb6a7c7e6ed3650847acbec0ddd3101bb96f 

Diff: https://reviews.apache.org/r/55657/diff/


Testing
---


Thanks,

Srinivas Brahmaroutu



Re: Review Request 55686: Added custom target for 'make distcheck'.

2017-01-18 Thread Srinivas Brahmaroutu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55686/
---

(Updated Jan. 18, 2017, 10:22 p.m.)


Review request for mesos, Alex Clemmer and Joseph Wu.


Changes
---

Currently the cpack source package is create in TGZ format.
The sources packages are comparable to the make dist output.
There may be a need to add support to add additional source
files to the archive at some point.


Summary (updated)
-

Added custom target for 'make distcheck'.


Bugs: MESOS-5433
https://issues.apache.org/jira/browse/MESOS-5433


Repository: mesos


Description (updated)
---

Added custom target for 'make distcheck'.


Diffs (updated)
-

  CMakeLists.txt c6b1eb6a7c7e6ed3650847acbec0ddd3101bb96f 

Diff: https://reviews.apache.org/r/55686/diff/


Testing
---

make dist


Thanks,

Srinivas Brahmaroutu



Re: Review Request 55311: Added `process::initialize` to default executor's `main`.

2017-01-18 Thread Alex Clemmer


> On Jan. 18, 2017, 1:50 a.m., Joseph Wu wrote:
> > src/launcher/default_executor.cpp, line 1055
> > 
> >
> > It actually feels like `process::Winsock winsock;` is more appropriate 
> > here, as we want the socket stack to be cleaned up afterwards as well.
> > Either than, or you should add a `process::finalize(true)` befow.
> 
> Alex Clemmer wrote:
> Tearing down winsock before libprocess is terminated will cause ugly 
> failures as the test framework disposes itself. Let's do a call to 
> `process::finalize(true)`.

This might not have been clear, btw, so let me be more specific: if we use the 
`Winsock` class, we will shut down the winsock stack when we exit `main`. But, 
at that point, libprocess will typically still be running, and it is therefore 
highly likely that it will start throwing errors as (_e.g._) its open sockets 
start to recieve errors. The desired dispose path is for libprocess to dispose 
of all of its assets, _and then_ dispose of the winsock stack last. This is the 
semantics of `process::finalize(true)`, which is why we should use it here and 
not the `Winsock` class.


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55311/#review162013
---


On Jan. 18, 2017, 6:37 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55311/
> ---
> 
> (Updated Jan. 18, 2017, 6:37 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The default executor's `main` currently depends on `UPID`, which
> networking libraries to do things like retrieve the address of the
> current host. On Windows, this means that it is required to initialize
> the winsock stack to be initialized before `UPID` is used.
> 
> To resolve this issue, we add a call to `process::initialize` at the
> beginning of the `main`, which will cause the stack to be initialized
> correctly.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 57e4799e750f8f5352a9fec58af40efe432ea865 
> 
> Diff: https://reviews.apache.org/r/55311/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55683: Rationalize process wait error checking.

2017-01-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55683/#review162188
---



Patch looks great!

Reviews applied: [55683]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Jan. 18, 2017, 6:25 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55683/
> ---
> 
> (Updated Jan. 18, 2017, 6:25 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Joris Van Remoortere, Michael 
> Park, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6946
> https://issues.apache.org/jira/browse/MESOS-6946
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Introduce the WSUCCESS() API to test that a forked process exited
> successfully. Use it in all the places that were performing this test
> bespokely, and update error logs to user WSTRINGIFY() if appropriate.
> 
> 
> Diffs
> -
> 
>   src/common/status_utils.hpp 26b94d49973dd61a6138e6a64d5ec0bdec6dea7e 
>   src/docker/executor.cpp 9b5c469e2d0f33e228ec746711e6bc6ed352cbc7 
>   src/hdfs/hdfs.cpp bebba8bb3c16f40293d21b5e00c353135a0f8f26 
>   src/launcher/default_executor.cpp 57e4799e750f8f5352a9fec58af40efe432ea865 
>   src/launcher/executor.cpp e035a4ee4438a3342f68a5548a5fd8d57a315cfa 
>   src/launcher/fetcher.cpp 5f9a38d3ae8fb9b4b21203f96d53c5bef0909eba 
>   src/linux/ns.hpp da43266ba67f76c9ca856584a540794062097691 
>   src/slave/containerizer/docker.cpp 7a8a7271b54af0b4dcdae7a2aa8a90a8a7d05fd6 
>   src/slave/containerizer/mesos/io/switchboard.cpp 
> 2c4eed8f2dc7f91b9fd08f91f1b08484fca9a9af 
>   src/slave/containerizer/mesos/launch.cpp 
> e482ab8bdfc358f695b87cda72ca59fb64cd8c4d 
>   src/tests/script.cpp ef6b22aaeb056ed489f2c519e1a02644f2653329 
> 
> Diff: https://reviews.apache.org/r/55683/diff/
> 
> 
> Testing
> ---
> 
> make check && sudo make check (Fedora 25)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 55311: Added `process::initialize` to default executor's `main`.

2017-01-18 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55311/#review162187
---


Ship it!




Ship It!

- Joseph Wu


On Jan. 18, 2017, 10:37 a.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55311/
> ---
> 
> (Updated Jan. 18, 2017, 10:37 a.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The default executor's `main` currently depends on `UPID`, which
> networking libraries to do things like retrieve the address of the
> current host. On Windows, this means that it is required to initialize
> the winsock stack to be initialized before `UPID` is used.
> 
> To resolve this issue, we add a call to `process::initialize` at the
> beginning of the `main`, which will cause the stack to be initialized
> correctly.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 57e4799e750f8f5352a9fec58af40efe432ea865 
> 
> Diff: https://reviews.apache.org/r/55311/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55024: Windows: Start the socket stack in `process::initialize`.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55024/
---

(Updated Jan. 18, 2017, 9:55 p.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Changes
---

Address Joseph's last comment.


Repository: mesos


Description
---

Currently libprocess will attempt to use sockets without initializing
the socket stack on Windows. This commit will resolving this problem by
causing `process::initialize` to initialize the socket stack.


Diffs (updated)
-

  3rdparty/libprocess/include/process/process.hpp 
b118f1a2bf5aac12b53ae204253b88c9b1c65f46 
  3rdparty/libprocess/src/process.cpp f475fe78f801924f70f51fdc4ab190c2dbecd656 

Diff: https://reviews.apache.org/r/55024/diff/


Testing
---


Thanks,

Alex Clemmer



Re: Review Request 55023: Windows: Fixed malformatted containerizer command in launcher.

2017-01-18 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55023/#review162186
---


Ship it!




Ship It!

- Joseph Wu


On Jan. 18, 2017, 12:49 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55023/
> ---
> 
> (Updated Jan. 18, 2017, 12:49 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
> and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently in `MesosContainerizerProcess::_launch`, we are passing a
> malformatted shell command to the launcher. This causes the
> containerizer process to crash immediately upon invocation in all
> executor tests.
> 
> This commit will fix this command.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 8bf8a7774a38131c53f6d91c7f09f5dedd9d4cb4 
> 
> Diff: https://reviews.apache.org/r/55023/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55024: Windows: Start the socket stack in `process::initialize`.

2017-01-18 Thread Alex Clemmer


> On Jan. 18, 2017, 1:06 a.m., Joseph Wu wrote:
> > 3rdparty/libprocess/src/process.cpp, line 1328
> > 
> >
> > At the moment, I would not expect this to ever fail, 
> > `EXIT(EXIT_FAILURE)` is preferable.  If it does, it seems like a severe 
> > enough error (as the docs suggest most errors are due to programmer error).
> > 
> > See: 
> > https://msdn.microsoft.com/en-us/library/windows/desktop/ms741549(v=vs.85).aspx
> 
> Alex Clemmer wrote:
> The reason I decided to `LOG(ERROR)` is because there didn't seem to be 
> any scenarios where we would encounter errors because we haven't shut down 
> the socket stack. We can change it if you feel strongly about it, though. Do 
> you think there is a strong possibility of an error condition that I'm 
> missing?
> 
> Joseph Wu wrote:
> We generally err on the side of failing-fast.  So if we don't expect 
> something to occur, we first place a `CHECK` or `EXIT`.  If that turns out to 
> be incorrect later, we consider relaxing the code at that point.
> 
> This usually gives more visibility into problems, as people are more 
> likely to notice a crash than an error log.

I agree about fail-fast, which is good distributed systems hygiene. It just 
seems to me that the dispose path is basically never the cause of service 
errors.

But, anyway, let's change it. I just wanted to double-check there wasn't a 
specific error case I was missing.


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55024/#review162008
---


On Jan. 18, 2017, 5:41 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55024/
> ---
> 
> (Updated Jan. 18, 2017, 5:41 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
> and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently libprocess will attempt to use sockets without initializing
> the socket stack on Windows. This commit will resolving this problem by
> causing `process::initialize` to initialize the socket stack.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/process.hpp 
> b118f1a2bf5aac12b53ae204253b88c9b1c65f46 
>   3rdparty/libprocess/src/process.cpp 
> f475fe78f801924f70f51fdc4ab190c2dbecd656 
> 
> Diff: https://reviews.apache.org/r/55024/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55024: Windows: Start the socket stack in `process::initialize`.

2017-01-18 Thread Joseph Wu


> On Jan. 17, 2017, 5:06 p.m., Joseph Wu wrote:
> > 3rdparty/libprocess/src/process.cpp, line 1328
> > 
> >
> > At the moment, I would not expect this to ever fail, 
> > `EXIT(EXIT_FAILURE)` is preferable.  If it does, it seems like a severe 
> > enough error (as the docs suggest most errors are due to programmer error).
> > 
> > See: 
> > https://msdn.microsoft.com/en-us/library/windows/desktop/ms741549(v=vs.85).aspx
> 
> Alex Clemmer wrote:
> The reason I decided to `LOG(ERROR)` is because there didn't seem to be 
> any scenarios where we would encounter errors because we haven't shut down 
> the socket stack. We can change it if you feel strongly about it, though. Do 
> you think there is a strong possibility of an error condition that I'm 
> missing?

We generally err on the side of failing-fast.  So if we don't expect something 
to occur, we first place a `CHECK` or `EXIT`.  If that turns out to be 
incorrect later, we consider relaxing the code at that point.

This usually gives more visibility into problems, as people are more likely to 
notice a crash than an error log.


- Joseph


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55024/#review162008
---


On Jan. 18, 2017, 9:41 a.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55024/
> ---
> 
> (Updated Jan. 18, 2017, 9:41 a.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
> and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently libprocess will attempt to use sockets without initializing
> the socket stack on Windows. This commit will resolving this problem by
> causing `process::initialize` to initialize the socket stack.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/process.hpp 
> b118f1a2bf5aac12b53ae204253b88c9b1c65f46 
>   3rdparty/libprocess/src/process.cpp 
> f475fe78f801924f70f51fdc4ab190c2dbecd656 
> 
> Diff: https://reviews.apache.org/r/55024/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55327: Windows: Fixed hanging symlink bug in `os::rmdir`.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55327/
---

(Updated Jan. 18, 2017, 8:50 p.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.


Changes
---

Address Joseph's comments, fix embarrasing mistake.


Repository: mesos


Description
---

The Windows implementation of `os::rmdir` will fail to delete "hanging"
symlinks (i.e., symlinks whose targets do not exist). Note that on
Windows this bug is specific to symlinks whose targets are _deleted_,
since it is impossible to create a symlink whose target does not exist.

The primary issue that causes this problem is that it is very difficult
to tell whether a symlink points at a directory or a file unless you
resolve the symlink and determine whether the target is a directory or a
file. In situations where the target does not exist, we can't use this
information, and so `os::rmdir` occasionally mis-routes a symlink to
(what was) a directory to a `::remove` call, which will fail with a
cryptic error.

To fix this behavior, this commit will introduce code that simply tries
to remove the reparse point with both `RemoveDirectory` and
`DeleteFile`, and if either succeeds, we report success for the
operation. This represents a "best effort"; in the case that the reparse
point represents something more exotic than a symlink, we will still
fail, but by choosing not to verify whether the target is a directory or
a file, we simplify the code and still obtain the outcome of having
deleted the directory.

This commit is the primary blocker for MESOS-6707, as deleting the Agent
sandbox will sometimes cause us to delete the latest run directory for
the executor before the symlinked `latest` directory itself. This causes
the delete to fail, and then the GC tests to fail, since they tend to
assert the directory does not exist.


Diffs (updated)
-

  3rdparty/stout/include/stout/os/windows/rmdir.hpp 
4437484c068e9ef046e0be14683c97db447f2da1 
  3rdparty/stout/tests/os/rmdir_tests.cpp 
988d41b7fdd11cc96ce005671a7c62d1b5a3615d 

Diff: https://reviews.apache.org/r/55327/diff/


Testing
---


Thanks,

Alex Clemmer



Re: Review Request 55023: Windows: Fixed malformatted containerizer command in launcher.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55023/
---

(Updated Jan. 18, 2017, 8:49 p.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Changes
---

Address Joseph's comments.


Repository: mesos


Description
---

Currently in `MesosContainerizerProcess::_launch`, we are passing a
malformatted shell command to the launcher. This causes the
containerizer process to crash immediately upon invocation in all
executor tests.

This commit will fix this command.


Diffs (updated)
-

  src/slave/containerizer/mesos/containerizer.cpp 
8bf8a7774a38131c53f6d91c7f09f5dedd9d4cb4 

Diff: https://reviews.apache.org/r/55023/diff/


Testing
---


Thanks,

Alex Clemmer



Re: Review Request 55313: Windows: Fixed the unkillable task bug, lit up executor tests.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55313/
---

(Updated Jan. 18, 2017, 8:49 p.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.


Changes
---

Address Joseph's comments.


Bugs: MESOS-6698, MESOS-6839 and MESOS-6870
https://issues.apache.org/jira/browse/MESOS-6698
https://issues.apache.org/jira/browse/MESOS-6839
https://issues.apache.org/jira/browse/MESOS-6870


Repository: mesos


Description
---

MESOS-6839 tracks a bug that causes the current implementation of the
default executor to be unable to delete any processes associated with a
task. To understand why requires some knowledge of the differences
between the process model of Windows and Unix.

In Unix, there is a robust notion of a process tree, with a well-defined
notion of process groups, sessions, signal delivery on the tree, and so
on. Windows lacks a robust notion of a process hierarchy, and therefore
largely has no equivalents to these constructs (including, notably,
signal semantics).

One of the problems this mismatch causes Mesos is that it complicates
the problem of killing a task, which is at its core a group of
processes. On Windows, the easiest way to make a process and all its
descendents easily killable is to track these processes in a Job Object,
which is a Windows kernel construct similar in principle to Linux's
control groups (though with different ideas of process namespacing).

There is some subtlety in making sure _all_ processes associated with a
task are captured inside a Job Object. The most important consideration
is that we need to make sure to add any process to the Job Object before
it has a chance to create any child processes; if we fail to do this,
the children will not be captured in the Job Object.

The solution to this is fairly simple on Windows. The process creation
API allows users to trivially create a process in a suspended state, so
that the Windows kernel scheduler does not schedule the process to run
until the user explicitly resumes the main thread. This allows us to
create the process and add it to a Job Object before it has a chance to
create children, and then start the process.

This commit will accomplish this by changing `PosixLauncher::fork` to
use the Subprocess parent hooks API, which implements exactly this
semantics. Essentially, the launcher will launch the containerizer
process, which will inspect the TaskInfo or the environment for a task
to launch, and then launch it. Using the parent hooks API, Subprocess
will create the containerizer process on Windows in a suspended state,
and then the parent hook supplied by the launcher will add that process
to a Job Object before it has a chance to run. Finally, Subprocess will
mark the process as runnable, and return.

This commit resolves MESOS-6839. We also light up the executor tests, so
it also resolves MESOS-6870 and MESOS-6839.


Diffs (updated)
-

  src/slave/containerizer/mesos/launcher.cpp 
a6a8c01cb39f35f8174fcb5af0ef18de2da5ee78 
  src/tests/command_executor_tests.cpp 4d5c21ec427ebaac053e56ae554cb466dfeb0b8b 
  src/tests/default_executor_tests.cpp ec3e854ed58a0fbb3bfad0bd21eb0e2974548865 

Diff: https://reviews.apache.org/r/55313/diff/


Testing
---


Thanks,

Alex Clemmer



Re: Review Request 55312: Windows: Added parent hooks to subprocess.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55312/
---

(Updated Jan. 18, 2017, 8:49 p.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.


Changes
---

Address Joseph's comments from r/55313.


Repository: mesos


Description
---

The subprocess API exposes various hooks to customize behavior as we
create the child process. Currently, these hooks are unimplemented in
the Windows codepaths. This commit will implement the parent hooks --
which are executed after the child process is created, but before it
begins execution -- on the Windows codepath.


Diffs (updated)
-

  3rdparty/libprocess/include/process/subprocess_base.hpp 
74a4bef0706334b4d3c1040a08a8921fbc300344 
  3rdparty/libprocess/include/process/windows/subprocess.hpp 
3bc7f1992d9c38dac2ec23d5bc57415f37d0318a 
  3rdparty/libprocess/src/subprocess.cpp 
ad19b0896b4a2e9c60f573cc854c10c69e909e86 

Diff: https://reviews.apache.org/r/55312/diff/


Testing
---


Thanks,

Alex Clemmer



Review Request 55686: Added cpack to create source package. Currently the cpack source package is create in TGZ format. The sources packages are comparable to the make dist output. There may be a need

2017-01-18 Thread Srinivas Brahmaroutu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55686/
---

Review request for mesos, Alex Clemmer and Joseph Wu.


Bugs: MESOS-5433
https://issues.apache.org/jira/browse/MESOS-5433


Repository: mesos


Description
---

Added cpack to create source package. Currently the cpack source package is 
create in TGZ format. The sources packages are comparable to the make dist 
output. There may be a need to add support to add additional source files to 
the archive at some point.


Diffs
-

  CMakeLists.txt c6b1eb6a7c7e6ed3650847acbec0ddd3101bb96f 

Diff: https://reviews.apache.org/r/55686/diff/


Testing
---

make dist


Thanks,

Srinivas Brahmaroutu



Re: Review Request 55491: Added a `support/README.md` and the `support/jenkins` directory.

2017-01-18 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55491/#review162167
---



Patch looks great!

Reviews applied: [55667, 55490, 55654, 55491]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On Jan. 18, 2017, 4:38 p.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55491/
> ---
> 
> (Updated Jan. 18, 2017, 4:38 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   support/README.md PRE-CREATION 
>   support/jenkins/README.md PRE-CREATION 
>   support/jenkins/buildbot.sh PRE-CREATION 
>   support/jenkins/reviewbot.sh PRE-CREATION 
>   support/jenkins/tidybot.sh PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/55491/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 55637: CMake: Added `test` target.

2017-01-18 Thread Alex Clemmer


> On Jan. 17, 2017, 8:46 p.m., Ilya Pronin wrote:
> > I think the name "test" can confuse people because it's usually expected to 
> > be used for CTest. Maybe it would better be leave old {{tests}} and 
> > {{check}} targets? https://cmake.org/Wiki/CMakeEmulateMakeCheck
> 
> Alex Clemmer wrote:
> That's a great question. I'm not a CMake native, so it's not clear to me 
> how weird this would be to do. The idea was to expose a `test` target to 
> achieve something resembling script parity. Is this idea not compelling to 
> you?
> 
> Ilya Pronin wrote:
> Very compelling. I just meant that `make test` is usually expected to not 
> only build tests but also run them, i.e. act the same way as autotools 
> generated `make check`. E.g. debhelper's `dh_auto_test` looks for `test` or 
> `check` target in a Makefile. So I suggested that target that just builds 
> tests should be called `tests`.

Ah, ok, let me just re-state what I think is true to make sure we're on the 
page. Right now the autotools build has the semantics that `make check` will 
build + run tests, and `make test` will simply build them. So the original idea 
here was that `make test` achieve script parity for our own tooling. And it 
sounds like what you're saying is that this should actually be the goal -- the 
goal should be that we do something more congruent with the normal use of the 
CMake semantics of `make test` and then have another target (which could be 
called, _e.g._, `make tests`, with an `s`) that implements these semantics.

Is that all approximately correct?

If so, let's wait and see what `kaysoky` says. He has a better understanding 
for the impact of changing the semantics of the `make test` target. IMHO this 
is a sensible suggestion, but it's important to get input for how much work 
this creates for the downstream consumers of the `make test` target.


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55637/#review161940
---


On Jan. 17, 2017, 7:33 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55637/
> ---
> 
> (Updated Jan. 17, 2017, 7:33 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Bugs: MESOS-3697
> https://issues.apache.org/jira/browse/MESOS-3697
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This commit will introduce a `test` target to the CMake build system.
> The semantics of this target are very similar to the autotools build
> system, in that they build the tests, but do not run them.
> 
> Accomplishing this is somewhat complex in CMake, because CMake expects
> to control the `test` target itself. We work around this by (1)
> silencing the warning that `test` is a reserved target, (2) removing the
> call to `enable_testing` that sets up CTest, and (3) adding our own
> `test` target. As an additional insurance policy, we error out if any of
> the `BUILD_TESTING` flags are defined, which would indicate that
> `include(CTest)` was called.
> 
> 
> Diffs
> -
> 
>   CMakeLists.txt c6b1eb6a7c7e6ed3650847acbec0ddd3101bb96f 
>   cmake/MesosConfigure.cmake 6a9ed9dc02f5c9a6d1fce0866f19ffeafec35cdc 
>   src/tests/CMakeLists.txt b7adf4f3126611a57baf4f62b62b208bc0236da5 
> 
> Diff: https://reviews.apache.org/r/55637/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 54232: Shutdown tasks of completed frameworks on agent re-registration.

2017-01-18 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54232/
---

(Updated Jan. 18, 2017, 7:33 p.m.)


Review request for mesos and Vinod Kone.


Changes
---

Remove helper function.


Bugs: MESOS-6602
https://issues.apache.org/jira/browse/MESOS-6602


Repository: mesos


Description
---

Previously, if a framework completed (e.g., due to a teardown operation
or framework shutdown), any framework tasks running on partitioned
agents would not be shutdown when the agent re-registered. For tasks
that are not partition-aware, the task would be shutdown on agent
re-registration anyway. But for partition-aware tasks, this could lead
to orphan tasks.

Fix this by changing the master to shutdown such tasks when the agent
reregisters.

Note that if the master fails over between the time the framework
completes and a partitioned agent re-registers, any framework tasks
running on the agent will NOT be shutdown. This is a known bug; fixing
it requires persisting the framework shutdown operation to the registry
(MESOS-1719).


Diffs (updated)
-

  src/master/master.cpp 73159328ce3fd838e02eba0e6a30cf69efc319ba 
  src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 

Diff: https://reviews.apache.org/r/54232/diff/


Testing
---

`make check`


Thanks,

Neil Conway



Re: Review Request 55491: Added a `support/README.md` and the `support/jenkins` directory.

2017-01-18 Thread Michael Park


> On Jan. 18, 2017, 4:50 a.m., Benjamin Bannier wrote:
> > support/jenkins/reviewbot.sh, line 32
> > 
> >
> > Hardcoding a password here seems like a pretty bad idea. Could we 
> > instead pass it in from the outside, e.g., via an environment variable?
> > 
> > We very likely also need to soon change this password now since it is 
> > known to everybody.

I've figured out how to hide the password. I'll change the password shortly as 
well.


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55491/#review162080
---


On Jan. 18, 2017, 8:38 a.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55491/
> ---
> 
> (Updated Jan. 18, 2017, 8:38 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   support/README.md PRE-CREATION 
>   support/jenkins/README.md PRE-CREATION 
>   support/jenkins/buildbot.sh PRE-CREATION 
>   support/jenkins/reviewbot.sh PRE-CREATION 
>   support/jenkins/tidybot.sh PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/55491/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 55476: Changed TASK_UNREACHABLE to be a non-terminal state.

2017-01-18 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55476/
---

(Updated Jan. 18, 2017, 7:33 p.m.)


Review request for mesos and Vinod Kone.


Changes
---

Tweak commit message.


Bugs: MESOS-6619
https://issues.apache.org/jira/browse/MESOS-6619


Repository: mesos


Description (updated)
---

This task state was always conceptually non-terminal, but previously
`isTerminalState` returned true for it. The master now distinguishes
between "removable" and "terminal" task states, so we can correctly
classify TASK_UNREACHABLE as a removable but non-terminal task state.


Diffs (updated)
-

  src/common/protobuf_utils.cpp af3c2a660f91c366ca5ffe8671d10c4a270bbbcc 

Diff: https://reviews.apache.org/r/55476/diff/


Testing
---

`make check`


Thanks,

Neil Conway



Re: Review Request 55327: Windows: Fixed hanging symlink bug in `os::rmdir`.

2017-01-18 Thread Alex Clemmer


> On Jan. 18, 2017, 2:32 a.m., Joseph Wu wrote:
> > 3rdparty/stout/include/stout/os/windows/rmdir.hpp, lines 120-123
> > 
> >
> > Note: This is actually unreachable.

:| Embarrassing.


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55327/#review162019
---


On Jan. 15, 2017, 8:40 a.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55327/
> ---
> 
> (Updated Jan. 15, 2017, 8:40 a.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The Windows implementation of `os::rmdir` will fail to delete "hanging"
> symlinks (i.e., symlinks whose targets do not exist). Note that on
> Windows this bug is specific to symlinks whose targets are _deleted_,
> since it is impossible to create a symlink whose target does not exist.
> 
> The primary issue that causes this problem is that it is very difficult
> to tell whether a symlink points at a directory or a file unless you
> resolve the symlink and determine whether the target is a directory or a
> file. In situations where the target does not exist, we can't use this
> information, and so `os::rmdir` occasionally mis-routes a symlink to
> (what was) a directory to a `::remove` call, which will fail with a
> cryptic error.
> 
> To fix this behavior, this commit will introduce code that simply tries
> to remove the reparse point with both `RemoveDirectory` and
> `DeleteFile`, and if either succeeds, we report success for the
> operation. This represents a "best effort"; in the case that the reparse
> point represents something more exotic than a symlink, we will still
> fail, but by choosing not to verify whether the target is a directory or
> a file, we simplify the code and still obtain the outcome of having
> deleted the directory.
> 
> This commit is the primary blocker for MESOS-6707, as deleting the Agent
> sandbox will sometimes cause us to delete the latest run directory for
> the executor before the symlinked `latest` directory itself. This causes
> the delete to fail, and then the GC tests to fail, since they tend to
> assert the directory does not exist.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/os/windows/rmdir.hpp 
> 4437484c068e9ef046e0be14683c97db447f2da1 
>   3rdparty/stout/tests/os/rmdir_tests.cpp 
> 988d41b7fdd11cc96ce005671a7c62d1b5a3615d 
> 
> Diff: https://reviews.apache.org/r/55327/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

2017-01-18 Thread Neil Conway


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5507
> > 
> >
> > can we inline this?
> 
> Neil Conway wrote:
> We could, but personally I find it more readable to make it a separate 
> function. The logic is a bit involved here (since we need to check two 
> different data structures due to backward compatibility requirements), and 
> there's already a lot going on in `Master::_reregisterSlave`.
> 
> Vinod Kone wrote:
> In general, we tend to factor out something into a function if it is 
> reusable and generic, not to reduce the number of lines someone has to read 
> in a function. In fact the argument was to inline in such cases because it 
> will avoid the need to context switch. 
> 
> Both `findPartitonAwareFrameworks` and `filterPartitionTasks` seem very 
> specific to the `_reregisterSlave` method, hence the suggestion.
> 
> For example, instead of looping through all the agent's tasks 3 times to 
> figure out which tasks to add to `Slave*`, it's probably easier to do it in 
> one inline loop here?
> 
> Neil Conway wrote:
> My goal isn't to reduce the # of lines to read the function, but to raise 
> the level of abstraction a little bit. For example, 
> `findPartitonAwareFrameworks` has to check _two_ data structures, depending 
> on both agent version and whether the master has failed over. That is quite 
> strange and takes a bit of thought to validate. With a separate function, the 
> reader of `_reregisterSlave` doesn't need to think about those details if 
> they don't want to -- instead they can just think "okay, we find all the 
> partition-aware frameworks on the agent and then ..."
> 
> I'm fine with looping over the agent's tasks three times, because each of 
> the three things we want to do is logically independent -- it takes a bit of 
> thought to validate whether it is actually correct to combine them into a 
> single loop (because each loop iteration would access a collection that is 
> modified by the loop itself. It would actually be okay, but IMO current 
> approach is safer).
> 
> Vinod Kone wrote:
> This is the code I had in mind. Let me know what you think:
> 
> ```
> // Add recovered frameworks into a hashset for easy lookup.
> hashset frameworks_;
> foreach (const FrameworkInfo& framework, frameworks) {
>   frameworks_.put(framework.id(), framework);
> }
> 
> vector tasks_;
> foreach (const Task& task, tasks) {
>   const FrameworkInfo& frameworkInfo = frameworks_[task.framework_id()];
>   Framework* framework = getFramework(framework.id());
>   
>   if (protobuf::frameworkHasCapability(frameworkInfo, PARTITION_AWARE)) {
> // Add the task if it is partition aware.
> tasks_.push_back(task);
> 
> // Remove the partition aware task from `unreachableTasks`.
> if (framework != null) {
>   framework->unreachableTasks.erase(task.task_id());
> }
>   } else if (slaves.removed.get(slaveInfo.id()).isSome()) {
> // Add a non-partion task iff the master failed over;
> tasks_.push_back(task);
>   }
> }
>   
> Slave* slave = new Slave(
>   this,
>   slaveInfo,
>   pid,
>   machineId,
>   version,
>   Clock::now(),
>   checkpointedResources,
>   executorInfos,
>   tasks_);
> }
> 
> ```

Revised the RR to use something similar to the above.


- Neil


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/#review160019
---


On Jan. 18, 2017, 7:18 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54183/
> ---
> 
> (Updated Jan. 18, 2017, 7:18 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6619
> https://issues.apache.org/jira/browse/MESOS-6619
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Before partition-awareness, when an agent failed health checks, the
> master removed the agent from the registry, marked all of its tasks
> TASK_LOST, and moved them to the `completedTasks` list in the master's
> memory. Although "lost" tasks might still be running, partitioned agents
> would only be allowed to re-register if the master failed over, in which
> case the `completedTasks` map would be emptied.
> 
> When partition-awareness was introduced, we initially followed the same
> scheme, with the only difference that partition-aware tasks are marked
> TASK_UNREACHABLE, not TASK_LOST.
> 
> This scheme has a few shortcomings. First, partition-aware 

Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

2017-01-18 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/
---

(Updated Jan. 18, 2017, 7:18 p.m.)


Review request for mesos and Vinod Kone.


Changes
---

Remove helper functions.


Bugs: MESOS-6619
https://issues.apache.org/jira/browse/MESOS-6619


Repository: mesos


Description
---

Before partition-awareness, when an agent failed health checks, the
master removed the agent from the registry, marked all of its tasks
TASK_LOST, and moved them to the `completedTasks` list in the master's
memory. Although "lost" tasks might still be running, partitioned agents
would only be allowed to re-register if the master failed over, in which
case the `completedTasks` map would be emptied.

When partition-awareness was introduced, we initially followed the same
scheme, with the only difference that partition-aware tasks are marked
TASK_UNREACHABLE, not TASK_LOST.

This scheme has a few shortcomings. First, partition-aware tasks might
resume running when the partitioned agent re-registers. Second, we
re-added non-partition aware tasks when the agent re-registered but then
marked them completed when the framework is shutdown, resulting in two
entries in `completedTasks`.

This commit introduces a separate bounded map, `unreachableTasks`. These
tasks are reported separately via the HTTP endpoints, because they have
different semantics (unlike completed tasks, unreachable tasks can
resume running). The size of this map is limited by a new master flag,
`--max_unreachable_tasks_per_framework`. This commit also changes the
master to omit re-adding non-partition-aware tasks on re-registering
agents (unless the master has failed over): those tasks will shortly be
shutdown anyway.

Finally, this commit fixes a minor bug in the previous code: the
previous coding neglected to shutdown non-partition-aware frameworks
running on pre-1.0 Mesos agents that re-register with the master after
a network partition.


Diffs (updated)
-

  docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
  include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
  include/mesos/v1/master/master.proto f8edf39a68752c8601cece345f52bce8b9a8a68b 
  src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
  src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
  src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
  src/master/http.cpp a44621f39cb059e654a56f57f75b38947f3a4230 
  src/master/master.hpp 8e8a9037af11cf95961b6498540a0fd486ed091b 
  src/master/master.cpp 73159328ce3fd838e02eba0e6a30cf69efc319ba 
  src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 

Diff: https://reviews.apache.org/r/54183/diff/


Testing
---

`make check`


Thanks,

Neil Conway



Re: Review Request 55462: WIP: Validate resource reservation against allocated role.

2017-01-18 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55462/
---

(Updated Jan. 18, 2017, 8:10 p.m.)


Review request for mesos, Benjamin Mahler, Jay Guo, and Guangya Liu.


Bugs: MESOS-6730
https://issues.apache.org/jira/browse/MESOS-6730


Repository: mesos


Description
---

This just adds possible modifications as comments. The changes can only be 
implemented once MESOS-6635 landed.


Diffs
-

  src/master/validation.cpp 96aa36585ded4bd7cf98526f710ccbc4f23b1f0f 

Diff: https://reviews.apache.org/r/55462/diff/


Testing
---

N/A yet.


Thanks,

Benjamin Bannier



Re: Review Request 55271: Disallow multi-role frameworks to change their roles.

2017-01-18 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55271/
---

(Updated Jan. 18, 2017, 8:07 p.m.)


Review request for mesos, Benjamin Mahler, Jay Guo, and Guangya Liu.


Changes
---

Added another test for roles change on master failover. Suggested by guoger 
here: https://reviews.apache.org/r/55571/#comment233128.


Bugs: MESOS-6631
https://issues.apache.org/jira/browse/MESOS-6631


Repository: mesos


Description
---

We currently do not allow `MULTI_ROLE` frameworks to change their
roles. This restriction will be lifted later.


Diffs (updated)
-

  src/master/master.hpp 8e8a9037af11cf95961b6498540a0fd486ed091b 
  src/tests/master_validation_tests.cpp 
c092362152e1fe8a6b615c2eda171d852c1bbd86 

Diff: https://reviews.apache.org/r/55271/diff/


Testing
---

Tested on various Linux configurations in internal CI.


Thanks,

Benjamin Bannier



Re: Review Request 55571: Changed Master::Framework::updateFrameworkInfo so it can return errors.

2017-01-18 Thread Benjamin Bannier


> On Jan. 17, 2017, 10:07 a.m., Jay Guo wrote:
> > src/master/master.cpp, lines 7201-7203
> > 
> >
> > There may be some corner cases that causes this `CHECK` to coredump. 
> > For example:
> > 1. Start a Mesos cluster with master, agent and a scheduler with single 
> > role `role1`.
> > 2. Master failover
> > 3. Agent re-register before scheduler does
> > 4. Scheduler re-subscribe with multi-role `{role2}`
> > 
> > In this case, master re-construct FrameworkInfo from re-registered 
> > agent, which contains old role. However, in current implementation, `Error` 
> > of `updateFrameworkInfo` is not caught for recovered frameworks.
> > 
> > In theory, this test case shouldn't cause master coredump:
> > ```cpp
> >   master::Flags masterFlags = CreateMasterFlags();
> >   Try master = StartMaster(masterFlags);
> >   ASSERT_SOME(master);
> > 
> >   MockExecutor exec(DEFAULT_EXECUTOR_ID);
> >   TestContainerizer containerizer();
> > 
> >   StandaloneMasterDetector slaveDetector(master.get()->pid);
> >   Try slave = StartSlave(, 
> > );
> >   ASSERT_SOME(slave);
> > 
> >   FrameworkInfo frameworkInfo = DEFAULT_FRAMEWORK_INFO;
> >   frameworkInfo.set_failover_timeout(1024);
> >   frameworkInfo.set_role("role1");
> > 
> >   Future frameworkId;
> > 
> >   {
> > MockScheduler sched;
> > MesosSchedulerDriver driver(
> > , frameworkInfo, master.get()->pid, DEFAULT_CREDENTIAL);
> > 
> > EXPECT_CALL(sched, registered(, _, _))
> >   .WillOnce(FutureArg<1>());
> > 
> > Future offers;
> > EXPECT_CALL(sched, resourceOffers(, _))
> >   .WillOnce(FutureArg<1>())
> >   .WillRepeatedly(Return()); // Ignore subsequent offers.
> > 
> > driver.start();
> > 
> > Clock::pause();
> > Clock::settle();
> > Clock::resume();
> > 
> > AWAIT_READY(frameworkId);
> > 
> > AWAIT_READY(offers);
> > EXPECT_FALSE(offers->empty());
> > 
> > TaskInfo task;
> > task.set_name("");
> > task.mutable_task_id()->set_value("1");
> > task.mutable_slave_id()->MergeFrom(offers.get()[0].slave_id());
> > task.mutable_resources()->MergeFrom(offers.get()[0].resources());
> > task.mutable_executor()->MergeFrom(DEFAULT_EXECUTOR_INFO);
> > 
> > EXPECT_CALL(exec, registered(_, _, _, _))
> >   .Times(1);
> > 
> > EXPECT_CALL(exec, launchTask(_, _))
> >   .WillOnce(SendStatusUpdateFromTask(TASK_RUNNING));
> > 
> > Future runningStatus;
> > EXPECT_CALL(sched, statusUpdate(, _))
> >   .WillOnce(FutureArg<1>());
> > 
> > driver.launchTasks(offers.get()[0].id(), {task});
> > 
> > AWAIT_READY(runningStatus);
> > EXPECT_EQ(TASK_RUNNING, runningStatus.get().state());
> > EXPECT_EQ(task.task_id(), runningStatus.get().task_id());
> >   }
> > 
> >   slaveDetector.appoint(None());
> > 
> >   master->reset();
> >   master = StartMaster(masterFlags);
> >   ASSERT_SOME(master);
> > 
> >   Future slaveReregisteredMessage =
> > FUTURE_PROTOBUF(SlaveReregisteredMessage(), _, _);
> > 
> >   slaveDetector.appoint(master.get()->pid);
> > 
> >   AWAIT_READY(slaveReregisteredMessage);
> > 
> >   frameworkInfo.mutable_id()->CopyFrom(frameworkId.get());
> >   frameworkInfo.add_roles("role2");
> >   frameworkInfo.clear_role();
> >   frameworkInfo.add_capabilities()->set_type(
> >   FrameworkInfo::Capability::MULTI_ROLE);
> > 
> >   MockScheduler sched;
> >   MesosSchedulerDriver driver(
> >   , frameworkInfo, master.get()->pid, DEFAULT_CREDENTIAL);
> > 
> >   Future registered;
> >   EXPECT_CALL(sched, registered(, _, _))
> > .WillOnce(FutureSatisfy());
> > 
> >   driver.start();
> > 
> >   Clock::pause();
> >   Clock::settle();
> >   Clock::resume();
> > 
> >   AWAIT_READY(registered);
> > 
> >   EXPECT_CALL(exec, shutdown(_))
> > .Times(AtMost(1));
> > 
> >   driver.stop();
> >   driver.join();
> > ```
> 
> Benjamin Bannier wrote:
> Thanks for spotting this, this is of course unacceptable. Also your test 
> case was helpful. I updated the patch to allow 
> `Master::activateRecoveredFramework` to fail.
> 
> Jay Guo wrote:
> Maybe you could add this test case as part of our test suite as well?

I added the test case to the follow-up patch, 
https://reviews.apache.org/r/55271.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55571/#review161827

Re: Review Request 55464: Made the Agent API able to handle containers nested at arbitrary levels.

2017-01-18 Thread Gastón Kleiman


> On Jan. 16, 2017, 7:33 p.m., Anand Mazumdar wrote:
> > src/slave/http.cpp, line 2184
> > 
> >
> > Not yours, but we should rename this to `getExecutor(...)` to be 
> > consistent with other similar functions in the same file.
> > 
> > Worth doing in an earlier review with this review as dependent on it.

https://reviews.apache.org/r/55678/


> On Jan. 16, 2017, 7:33 p.m., Anand Mazumdar wrote:
> > src/slave/http.cpp, line 2199
> > 
> >
> > hmm, it looks error prone to look up frameworks only when an executor 
> > could be found and then do error checking on L2192. It was fine to do so 
> > earlier since the information was being populated while looping at one 
> > place. 
> > 
> > How about:
> > ```cpp
> > Executor* executor = slave->getExecutor(containerId);
> > if (executor == nullptr) {
> >   return NotFound("Unable to locate executor for container"
> >   " " + stringify(containerId));
> > }
> > 
> > Framework* framework = slave->getFramework(executor->frameworkId);
> > CHECK_NOTNULL(framework);
> > ```
> > 
> > Note that the following invariant is always true that is an executor is 
> > *found*, it would be associated with a framework.

https://reviews.apache.org/r/55679/


> On Jan. 16, 2017, 7:33 p.m., Anand Mazumdar wrote:
> > src/slave/http.cpp, lines 2209-2214
> > 
> >
> > hmm, why return a `BadRequest` here? This looks inconsistent with other 
> > instances where we return `NotFound`. It does not seem that there was 
> > anything malformed with the client request. This can also happen when the 
> > nested container exited right before the request was received on the agent.
> > 
> > Also, can we make the other error messages consistent with this?
> > (See the snippet I posted in the previous comment)

Also in https://reviews.apache.org/r/55679/


> On Jan. 16, 2017, 7:33 p.m., Anand Mazumdar wrote:
> > src/slave/slave.cpp, lines 6227-6236
> > 
> >
> > hmm, can we somehow reuse the helper `getRootContainerId()` in 
> > `slave/containerizer/mesos/utils.hpp`?
> > 
> > You might have to pull it out though first and then re-use it in this 
> > review? Strictly speaking, this method is pretty generic and should not be 
> > a part of the Mesos containerizer code.

https://reviews.apache.org/r/55676/


- Gastón


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55464/#review161753
---


On Jan. 18, 2017, 2:48 p.m., Gastón Kleiman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55464/
> ---
> 
> (Updated Jan. 18, 2017, 2:48 p.m.)
> 
> 
> Review request for mesos, Adam B, Alexander Rukletsov, Anand Mazumdar, 
> Alexander Rojas, haosdent huang, and Vinod Kone.
> 
> 
> Bugs: MESOS-6864
> https://issues.apache.org/jira/browse/MESOS-6864
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Made the Agent API able to handle containers nested at arbitrary levels.
> 
> 
> Diffs
> -
> 
>   src/slave/http.cpp ee7119179a6ddd935c3f43a618ef645619c305ee 
>   src/slave/slave.cpp ae183e6b8b40094b4764525a6d63164f210ef9d8 
>   src/tests/api_tests.cpp ea62226fbc5dd0bcc5ea60d1bbcd3748f9367467 
> 
> Diff: https://reviews.apache.org/r/55464/diff/
> 
> 
> Testing
> ---
> 
> `GTEST_FILTER="" make -j 8 check` in macOS.
> `make check` in Linux.
> 
> I also did some manual testing using a proof of concept  that makes the 
> `DefaultExecutor` leverage this change to perform CMD health checks.
> 
> 
> Thanks,
> 
> Gastón Kleiman
> 
>



Review Request 55678: Renamed `locateExecutor` to `getExecutor`.

2017-01-18 Thread Gastón Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55678/
---

Review request for mesos, Adam B, Alexander Rukletsov, Anand Mazumdar, 
Alexander Rojas, haosdent huang, and Vinod Kone.


Bugs: MESOS-6864
https://issues.apache.org/jira/browse/MESOS-6864


Repository: mesos


Description
---

Rnamed `locateExecutor` to `getExecutor` in `slave.cpp` to be consistent
with other similar functions in the same file.


Diffs
-

  src/slave/http.cpp ee7119179a6ddd935c3f43a618ef645619c305ee 
  src/slave/slave.hpp 8c21ce95683b3d8f0f732e87cf3b3dbe7860f1da 
  src/slave/slave.cpp ae183e6b8b40094b4764525a6d63164f210ef9d8 

Diff: https://reviews.apache.org/r/55678/diff/


Testing
---

`make check` in macOS and Linux.


Thanks,

Gastón Kleiman



Review Request 55677: Made `AttachContainerOutput/Input` tests use an existing container ID.

2017-01-18 Thread Gastón Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55677/
---

Review request for mesos, Adam B, Alexander Rukletsov, Anand Mazumdar, 
Alexander Rojas, haosdent huang, and Vinod Kone.


Bugs: MESOS-6864
https://issues.apache.org/jira/browse/MESOS-6864


Repository: mesos


Description
---

These tests verify that the Agent returns a 500 if the containerizer
doesn't support the `attach` call.

This chain refactors those methods and make them return a 404 instead of
a 500 if the container can't be found, so we need to pass the ID of an
existing container.


Diffs
-

  src/tests/api_tests.cpp ea62226fbc5dd0bcc5ea60d1bbcd3748f9367467 

Diff: https://reviews.apache.org/r/55677/diff/


Testing
---

`make check` in macOS and Linux.


Thanks,

Gastón Kleiman



Review Request 55676: Moved `getRootContainerId` to `protobuf_utils`.

2017-01-18 Thread Gastón Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55676/
---

Review request for mesos, Adam B, Alexander Rukletsov, Anand Mazumdar, 
Alexander Rojas, haosdent huang, and Vinod Kone.


Bugs: MESOS-6864
https://issues.apache.org/jira/browse/MESOS-6864


Repository: mesos


Description
---

This method is pretty generic and should not be a part of the Mesos
containerizer code.


Diffs
-

  CHANGELOG a8d080555a3948c796116af90c17dd71e1dea5d2 
  src/common/protobuf_utils.hpp bca9ace6cb37047bb08466c4fb14028bbb818446 
  src/common/protobuf_utils.cpp af3c2a660f91c366ca5ffe8671d10c4a270bbbcc 
  src/slave/containerizer/composing.cpp 
4fb7ba5cadd36be00be61aa00a69c38c44649aee 
  src/slave/containerizer/mesos/containerizer.cpp 
8bf8a7774a38131c53f6d91c7f09f5dedd9d4cb4 
  src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
b3a269535517222a1d85d8be1fdf5b865d6a3fc6 
  src/slave/containerizer/mesos/isolators/network/cni/cni.cpp 
e1cac954848e618a03ddb82fd6d040ae1d948e82 
  src/slave/containerizer/mesos/utils.hpp 
a54106dc4893bb222f42ede936ac9029e817faf9 
  src/slave/containerizer/mesos/utils.cpp 
4e2a01495909c07f320af416ff4dc59f7328c710 
  src/slave/slave.cpp 205138add7a63289ff8ed81138f8b603828c748e 
  src/tests/slave_recovery_tests.cpp 9cf1d461cf5c6ac1f51ac963360a17e37f558505 

Diff: https://reviews.apache.org/r/55676/diff/


Testing
---

`make check` in macOS and Linux.


Thanks,

Gastón Kleiman



Review Request 55679: Unified the way in which the Slave API handlers perform AuthZ.

2017-01-18 Thread Gastón Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55679/
---

Review request for mesos, Adam B, Alexander Rukletsov, Anand Mazumdar, 
Alexander Rojas, haosdent huang, and Vinod Kone.


Bugs: MESOS-6864
https://issues.apache.org/jira/browse/MESOS-6864


Repository: mesos


Description
---

Unified the way in which the Slave API handlers perform AuthZ.


Diffs
-

  src/slave/http.cpp ee7119179a6ddd935c3f43a618ef645619c305ee 

Diff: https://reviews.apache.org/r/55679/diff/


Testing
---

`make check` in macOS and Linux.


Thanks,

Gastón Kleiman



Re: Review Request 55311: Added `process::initialize` to default executor's `main`.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55311/
---

(Updated Jan. 18, 2017, 6:37 p.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.


Changes
---

Address Joseph's comments.


Repository: mesos


Description
---

The default executor's `main` currently depends on `UPID`, which
networking libraries to do things like retrieve the address of the
current host. On Windows, this means that it is required to initialize
the winsock stack to be initialized before `UPID` is used.

To resolve this issue, we add a call to `process::initialize` at the
beginning of the `main`, which will cause the stack to be initialized
correctly.


Diffs (updated)
-

  src/launcher/default_executor.cpp 57e4799e750f8f5352a9fec58af40efe432ea865 

Diff: https://reviews.apache.org/r/55311/diff/


Testing
---


Thanks,

Alex Clemmer



Review Request 55683: Rationalize process wait error checking.

2017-01-18 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55683/
---

Review request for mesos, Benjamin Bannier, Joris Van Remoortere, Michael Park, 
and Jiang Yan Xu.


Bugs: MESOS-6946
https://issues.apache.org/jira/browse/MESOS-6946


Repository: mesos


Description
---

Introduce the WSUCCESS() API to test that a forked process exited
successfully. Use it in all the places that were performing this test
bespokely, and update error logs to user WSTRINGIFY() if appropriate.


Diffs
-

  src/common/status_utils.hpp 26b94d49973dd61a6138e6a64d5ec0bdec6dea7e 
  src/docker/executor.cpp 9b5c469e2d0f33e228ec746711e6bc6ed352cbc7 
  src/hdfs/hdfs.cpp bebba8bb3c16f40293d21b5e00c353135a0f8f26 
  src/launcher/default_executor.cpp 57e4799e750f8f5352a9fec58af40efe432ea865 
  src/launcher/executor.cpp e035a4ee4438a3342f68a5548a5fd8d57a315cfa 
  src/launcher/fetcher.cpp 5f9a38d3ae8fb9b4b21203f96d53c5bef0909eba 
  src/linux/ns.hpp da43266ba67f76c9ca856584a540794062097691 
  src/slave/containerizer/docker.cpp 7a8a7271b54af0b4dcdae7a2aa8a90a8a7d05fd6 
  src/slave/containerizer/mesos/io/switchboard.cpp 
2c4eed8f2dc7f91b9fd08f91f1b08484fca9a9af 
  src/slave/containerizer/mesos/launch.cpp 
e482ab8bdfc358f695b87cda72ca59fb64cd8c4d 
  src/tests/script.cpp ef6b22aaeb056ed489f2c519e1a02644f2653329 

Diff: https://reviews.apache.org/r/55683/diff/


Testing
---

make check && sudo make check (Fedora 25)


Thanks,

James Peach



Re: Review Request 55313: Windows: Fixed the unkillable task bug, lit up executor tests.

2017-01-18 Thread Alex Clemmer


> On Jan. 18, 2017, 2:08 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/launcher.cpp, lines 117-133
> > 
> >
> > I suspect that we'll use this lambda for other subprocesses in future, 
> > so let's move it into `include/process/subprocess_base.hpp` and 
> > `src/subprocess.cpp`:
> > 
> > ```
> > struct ParentHook
> > {
> >   ...
> > #ifdef __WINDOWS__
> >   static ParentHook CREATE_JOB();
> > #endif // __WINDOWS__
> > };
> > ```

Ok, we can do this. I do question whether this really belongs as part of the 
`Subprocess` API, but I do not will not block that change. :)


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55313/#review162016
---


On Jan. 8, 2017, 6:30 a.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55313/
> ---
> 
> (Updated Jan. 8, 2017, 6:30 a.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Bugs: MESOS-6698, MESOS-6839 and MESOS-6870
> https://issues.apache.org/jira/browse/MESOS-6698
> https://issues.apache.org/jira/browse/MESOS-6839
> https://issues.apache.org/jira/browse/MESOS-6870
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> MESOS-6839 tracks a bug that causes the current implementation of the
> default executor to be unable to delete any processes associated with a
> task. To understand why requires some knowledge of the differences
> between the process model of Windows and Unix.
> 
> In Unix, there is a robust notion of a process tree, with a well-defined
> notion of process groups, sessions, signal delivery on the tree, and so
> on. Windows lacks a robust notion of a process hierarchy, and therefore
> largely has no equivalents to these constructs (including, notably,
> signal semantics).
> 
> One of the problems this mismatch causes Mesos is that it complicates
> the problem of killing a task, which is at its core a group of
> processes. On Windows, the easiest way to make a process and all its
> descendents easily killable is to track these processes in a Job Object,
> which is a Windows kernel construct similar in principle to Linux's
> control groups (though with different ideas of process namespacing).
> 
> There is some subtlety in making sure _all_ processes associated with a
> task are captured inside a Job Object. The most important consideration
> is that we need to make sure to add any process to the Job Object before
> it has a chance to create any child processes; if we fail to do this,
> the children will not be captured in the Job Object.
> 
> The solution to this is fairly simple on Windows. The process creation
> API allows users to trivially create a process in a suspended state, so
> that the Windows kernel scheduler does not schedule the process to run
> until the user explicitly resumes the main thread. This allows us to
> create the process and add it to a Job Object before it has a chance to
> create children, and then start the process.
> 
> This commit will accomplish this by changing `PosixLauncher::fork` to
> use the Subprocess parent hooks API, which implements exactly this
> semantics. Essentially, the launcher will launch the containerizer
> process, which will inspect the TaskInfo or the environment for a task
> to launch, and then launch it. Using the parent hooks API, Subprocess
> will create the containerizer process on Windows in a suspended state,
> and then the parent hook supplied by the launcher will add that process
> to a Job Object before it has a chance to run. Finally, Subprocess will
> mark the process as runnable, and return.
> 
> This commit resolves MESOS-6839. We also light up the executor tests, so
> it also resolves MESOS-6870 and MESOS-6839.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/launcher.cpp 
> a6a8c01cb39f35f8174fcb5af0ef18de2da5ee78 
>   src/tests/command_executor_tests.cpp 
> f4e7044b82e8e81d6430551dc0b2a288db10bc3c 
>   src/tests/default_executor_tests.cpp 
> 340e8c8b36a6a3cc6e5bae021e19dfa7a54852c3 
> 
> Diff: https://reviews.apache.org/r/55313/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 54573: Updated metrics counter during scheduler api calls.

2017-01-18 Thread Zhitao Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54573/#review162142
---


Ship it!




Ship It!

- Zhitao Li


On Jan. 17, 2017, 6:33 p.m., Abhishek Dasgupta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54573/
> ---
> 
> (Updated Jan. 17, 2017, 6:33 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar, Benjamin Mahler, and Zhitao Li.
> 
> 
> Bugs: MESOS-6082
> https://issues.apache.org/jira/browse/MESOS-6082
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Updated metrics counter during  scheduler api calls.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp a44621f39cb059e654a56f57f75b38947f3a4230 
> 
> Diff: https://reviews.apache.org/r/54573/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Abhishek Dasgupta
> 
>



Re: Review Request 54572: Added metrics counter for scheduler calls.

2017-01-18 Thread Zhitao Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54572/#review162145
---


Ship it!




can you also update the `docs/monitoring.md` file to reflect the new counters?

- Zhitao Li


On Jan. 17, 2017, 7:18 p.m., Abhishek Dasgupta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54572/
> ---
> 
> (Updated Jan. 17, 2017, 7:18 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar, Benjamin Mahler, and Zhitao Li.
> 
> 
> Bugs: MESOS-6082
> https://issues.apache.org/jira/browse/MESOS-6082
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added metrics counter for scheduler calls.
> 
> 
> Diffs
> -
> 
>   src/master/metrics.hpp 056d290c1e14a6afd5056873ef7bb07b5892ed32 
>   src/master/metrics.cpp 1f049f3794c1bca45d2684cbbec3b08c1a78c494 
> 
> Diff: https://reviews.apache.org/r/54572/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Abhishek Dasgupta
> 
>



Re: Review Request 55491: Added a `support/README.md` and the `support/jenkins` directory.

2017-01-18 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55491/#review162141
---


Ship it!




Ship It!

- Benjamin Bannier


On Jan. 18, 2017, 5:38 p.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55491/
> ---
> 
> (Updated Jan. 18, 2017, 5:38 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   support/README.md PRE-CREATION 
>   support/jenkins/README.md PRE-CREATION 
>   support/jenkins/buildbot.sh PRE-CREATION 
>   support/jenkins/reviewbot.sh PRE-CREATION 
>   support/jenkins/tidybot.sh PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/55491/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 55311: Added `process::initialize` to default executor's `main`.

2017-01-18 Thread Alex Clemmer


> On Jan. 18, 2017, 1:50 a.m., Joseph Wu wrote:
> > src/launcher/default_executor.cpp, line 1055
> > 
> >
> > It actually feels like `process::Winsock winsock;` is more appropriate 
> > here, as we want the socket stack to be cleaned up afterwards as well.
> > Either than, or you should add a `process::finalize(true)` befow.

Tearing down winsock before libprocess is terminated will cause ugly failures 
as the test framework disposes itself. Let's do a call to 
`process::finalize(true)`.


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55311/#review162013
---


On Jan. 8, 2017, 6:28 a.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55311/
> ---
> 
> (Updated Jan. 8, 2017, 6:28 a.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The default executor's `main` currently depends on `UPID`, which
> networking libraries to do things like retrieve the address of the
> current host. On Windows, this means that it is required to initialize
> the winsock stack to be initialized before `UPID` is used.
> 
> To resolve this issue, we add a call to `process::initialize` at the
> beginning of the `main`, which will cause the stack to be initialized
> correctly.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp fd9963996c82461b60888397989e309d51b60aaa 
> 
> Diff: https://reviews.apache.org/r/55311/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55024: Windows: Start the socket stack in `process::initialize`.

2017-01-18 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55024/
---

(Updated Jan. 18, 2017, 5:41 p.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Changes
---

Address Joseph's comments.


Repository: mesos


Description
---

Currently libprocess will attempt to use sockets without initializing
the socket stack on Windows. This commit will resolving this problem by
causing `process::initialize` to initialize the socket stack.


Diffs (updated)
-

  3rdparty/libprocess/include/process/process.hpp 
b118f1a2bf5aac12b53ae204253b88c9b1c65f46 
  3rdparty/libprocess/src/process.cpp f475fe78f801924f70f51fdc4ab190c2dbecd656 

Diff: https://reviews.apache.org/r/55024/diff/


Testing
---


Thanks,

Alex Clemmer



Re: Review Request 55024: Windows: Start the socket stack in `process::initialize`.

2017-01-18 Thread Alex Clemmer


> On Jan. 18, 2017, 1:06 a.m., Joseph Wu wrote:
> > 3rdparty/libprocess/src/process.cpp, line 1328
> > 
> >
> > At the moment, I would not expect this to ever fail, 
> > `EXIT(EXIT_FAILURE)` is preferable.  If it does, it seems like a severe 
> > enough error (as the docs suggest most errors are due to programmer error).
> > 
> > See: 
> > https://msdn.microsoft.com/en-us/library/windows/desktop/ms741549(v=vs.85).aspx

The reason I decided to `LOG(ERROR)` is because there didn't seem to be any 
scenarios where we would encounter errors because we haven't shut down the 
socket stack. We can change it if you feel strongly about it, though. Do you 
think there is a strong possibility of an error condition that I'm missing?


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55024/#review162008
---


On Jan. 15, 2017, 8:46 a.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55024/
> ---
> 
> (Updated Jan. 15, 2017, 8:46 a.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
> and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently libprocess will attempt to use sockets without initializing
> the socket stack on Windows. This commit will resolving this problem by
> causing `process::initialize` to initialize the socket stack.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/process.hpp 
> b118f1a2bf5aac12b53ae204253b88c9b1c65f46 
>   3rdparty/libprocess/src/process.cpp 
> f475fe78f801924f70f51fdc4ab190c2dbecd656 
> 
> Diff: https://reviews.apache.org/r/55024/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55612: Fixed SlaveRecoveryTest.RecoverTerminatedExecutor test.

2017-01-18 Thread Greg Mann


> On Jan. 18, 2017, 7:24 a.m., Greg Mann wrote:
> > src/tests/slave_recovery_tests.cpp, line 1204
> > 
> >
> > As long as you're cleaning this up, might as well fix the spacing - 
> > should be just one space before the comment begins.
> 
> Vinod Kone wrote:
> Do you mean the spacing between "Return());" and "//" ? Quite a few of 
> such comments in this file use more than one space for this case.

OK, I was just referencing our style guide which specifies a single space for 
trailing comments.


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55612/#review162031
---


On Jan. 17, 2017, 1:06 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55612/
> ---
> 
> (Updated Jan. 17, 2017, 1:06 p.m.)
> 
> 
> Review request for mesos and Greg Mann.
> 
> 
> Bugs: MESOS-6922
> https://issues.apache.org/jira/browse/MESOS-6922
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Just like RecoverHTTPTerminatedExecutor test, the scheduler should
> handle duplicate status updates after agent restarts. See the
> attached ticket for details.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_recovery_tests.cpp 9323cbb74dd555fe6c44af1ef6da70e0c2d76dcb 
> 
> Diff: https://reviews.apache.org/r/55612/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> Ran in a loop for 20 times.
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 55023: Windows: Fixed malformatted containerizer command in launcher.

2017-01-18 Thread Alex Clemmer


> On Jan. 18, 2017, 12:47 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, lines 1608-1617
> > 
> >
> > The effect of this is to change `argv` from:
> > 
> > ```
> > mesos-containerizer launch
> > ```
> > 
> > To:
> > 
> > ```
> > .../mesos/libexec/mesos-containerizer launch
> > ```
> > 
> > This change is harmless on POSIX, so there's no need for ifdef-ing.
> > 
> > If so, can't you accomplish the same thing with a diff like:
> > ```
> > diff --git a/src/slave/containerizer/mesos/containerizer.cpp 
> > b/src/slave/containerizer/mesos/containerizer.cpp
> > index 8bf8a77..c760130 100644
> > --- a/src/slave/containerizer/mesos/containerizer.cpp
> > +++ b/src/slave/containerizer/mesos/containerizer.cpp
> > @@ -1603,12 +1603,12 @@ Future MesosContainerizerProcess::_launch(
> >  
> >// Fork the child using launcher.
> >vector argv(2);
> > -  argv[0] = MESOS_CONTAINERIZER;
> > +  argv[0] = path::join(flags.launcher_dir, MESOS_CONTAINERIZER);
> >argv[1] = MesosContainerizerLaunch::NAME;
> >  
> >Try forked = launcher->fork(
> >containerId,
> > -  path::join(flags.launcher_dir, MESOS_CONTAINERIZER),
> > +  argv[0],
> >argv,
> >in.isSome() ? in.get() : Subprocess::FD(STDIN_FILENO),
> >out.isSome() ? out.get() : Subprocess::FD(STDOUT_FILENO),
> > ```

We've historically been super conservative about changing the POSIX path, but 
with your sign-off, I'm happy to do something like this.


> On Jan. 18, 2017, 12:47 a.m., Joseph Wu wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, lines 1608-1609
> > 
> >
> > It would appear that this TODO no longer applies.

This issue is not yet resolved, actually. The point I was trying to get at is 
that the first argument to `subprocess` (if memory serves) is a no-op. So this 
call is likely to change when we refactor `subprocess`.

We can still delete it if you want, though, but I think it's reasonable to keep 
it.


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55023/#review162001
---


On Jan. 15, 2017, 10:46 a.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55023/
> ---
> 
> (Updated Jan. 15, 2017, 10:46 a.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
> and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently in `MesosContainerizerProcess::_launch`, we are passing a
> malformatted shell command to the launcher. This causes the
> containerizer process to crash immediately upon invocation in all
> executor tests.
> 
> This commit will fix this command.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 8bf8a7774a38131c53f6d91c7f09f5dedd9d4cb4 
> 
> Diff: https://reviews.apache.org/r/55023/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55472: Moved `Slave` definitions out-of-line to master.cpp.

2017-01-18 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55472/
---

(Updated Jan. 18, 2017, 5 p.m.)


Review request for mesos and Vinod Kone.


Changes
---

Rebase.


Bugs: MESOS-6619
https://issues.apache.org/jira/browse/MESOS-6619


Repository: mesos


Description
---

Previously, one of the `Slave` member functions was defined out-of-line,
but the rest were defined inline; make them all defined out-of-line for
consistency, and also to allow the function implementations to access
members of `Master` in the future.


Diffs (updated)
-

  src/master/master.hpp 8e8a9037af11cf95961b6498540a0fd486ed091b 
  src/master/master.cpp 73159328ce3fd838e02eba0e6a30cf69efc319ba 

Diff: https://reviews.apache.org/r/55472/diff/


Testing
---

`make check`


Thanks,

Neil Conway



Re: Review Request 55491: Added a `support/README.md` and the `support/jenkins` directory.

2017-01-18 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55491/
---

(Updated Jan. 18, 2017, 8:38 a.m.)


Review request for mesos, Benjamin Bannier and Vinod Kone.


Repository: mesos


Description (updated)
---

See summary.


Diffs
-

  support/README.md PRE-CREATION 
  support/jenkins/README.md PRE-CREATION 
  support/jenkins/buildbot.sh PRE-CREATION 
  support/jenkins/reviewbot.sh PRE-CREATION 
  support/jenkins/tidybot.sh PRE-CREATION 

Diff: https://reviews.apache.org/r/55491/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 55491: Added a `support/README.md` and the `support/jenkins` directory.

2017-01-18 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55491/
---

(Updated Jan. 18, 2017, 8:38 a.m.)


Review request for mesos, Benjamin Bannier and Vinod Kone.


Changes
---

Hid the password behind `PASSWORD` env var provided by the Jenkins job.


Repository: mesos


Description
---

Added a `support/README.md` and the `support/jenkins` directory.


Diffs (updated)
-

  support/README.md PRE-CREATION 
  support/jenkins/README.md PRE-CREATION 
  support/jenkins/buildbot.sh PRE-CREATION 
  support/jenkins/reviewbot.sh PRE-CREATION 
  support/jenkins/tidybot.sh PRE-CREATION 

Diff: https://reviews.apache.org/r/55491/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 55491: Added a `support/README.md` and the `support/jenkins` directory.

2017-01-18 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55491/#review162120
---



Patch looks great!

Reviews applied: [55667, 55490, 55654, 55491]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos ReviewBot


On Jan. 18, 2017, 3:12 p.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55491/
> ---
> 
> (Updated Jan. 18, 2017, 3:12 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a `support/README.md` and the `support/jenkins` directory.
> 
> 
> Diffs
> -
> 
>   support/README.md PRE-CREATION 
>   support/jenkins/README.md PRE-CREATION 
>   support/jenkins/buildbot.sh PRE-CREATION 
>   support/jenkins/reviewbot.sh PRE-CREATION 
>   support/jenkins/tidybot.sh PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/55491/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 55381: Added test for framework upgrading to multi-role capability.

2017-01-18 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55381/
---

(Updated Jan. 18, 2017, 4:26 p.m.)


Review request for mesos, Benjamin Mahler, Jay Guo, and Guangya Liu.


Changes
---

Rebased.


Bugs: MESOS-6900
https://issues.apache.org/jira/browse/MESOS-6900


Repository: mesos


Description
---

See summary.


Diffs (updated)
-

  src/tests/master_validation_tests.cpp 
c092362152e1fe8a6b615c2eda171d852c1bbd86 

Diff: https://reviews.apache.org/r/55381/diff/


Testing
---

make check (OS X)


Thanks,

Benjamin Bannier



Re: Review Request 55571: Changed Master::Framework::updateFrameworkInfo so it can return errors.

2017-01-18 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55571/
---

(Updated Jan. 18, 2017, 4:26 p.m.)


Review request for mesos, Benjamin Mahler, Jay Guo, and Guangya Liu.


Changes
---

Rebased.


Bugs: MESOS-6631
https://issues.apache.org/jira/browse/MESOS-6631


Repository: mesos


Description
---

This allows us to confirm that an updated FrameworkInfo is fully
compatible with an already known one. By performing this check in
updateFrameworkInfo when can be sure that both the new and old
FrameworkInfo are available (e.g., after framework reregistration, or
master or framework failover).


Diffs (updated)
-

  src/master/master.hpp 8e8a9037af11cf95961b6498540a0fd486ed091b 
  src/master/master.cpp 73159328ce3fd838e02eba0e6a30cf69efc319ba 

Diff: https://reviews.apache.org/r/55571/diff/


Testing
---

Tested as part of https://reviews.apache.org/r/55271/.


Thanks,

Benjamin Bannier



Re: Review Request 55271: Disallow multi-role frameworks to change their roles.

2017-01-18 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55271/
---

(Updated Jan. 18, 2017, 4:25 p.m.)


Review request for mesos, Benjamin Mahler, Jay Guo, and Guangya Liu.


Changes
---

Addressed comments from guoger.


Bugs: MESOS-6631
https://issues.apache.org/jira/browse/MESOS-6631


Repository: mesos


Description (updated)
---

We currently do not allow `MULTI_ROLE` frameworks to change their
roles. This restriction will be lifted later.


Diffs (updated)
-

  src/master/master.hpp 8e8a9037af11cf95961b6498540a0fd486ed091b 
  src/tests/master_validation_tests.cpp 
c092362152e1fe8a6b615c2eda171d852c1bbd86 

Diff: https://reviews.apache.org/r/55271/diff/


Testing
---

Tested on various Linux configurations in internal CI.


Thanks,

Benjamin Bannier



Re: Review Request 55271: Disallow multi-role frameworks to change their roles.

2017-01-18 Thread Benjamin Bannier


> On Jan. 18, 2017, 3:36 a.m., Jay Guo wrote:
> > src/master/master.hpp, lines 2522-2528
> > 
> >
> > This check is only valid **iff** both `src` and `dest` frameworkInfo 
> > are single role, isn't it? maybe
> > ```cpp
> > if (!capabilities.multiRole &&
> > !protobuf::frameworkHasCapability(
> > source, FrameworkInfo::Capability::MULTI_ROLE) &&
> > source.role() != info.role()) {
> >   ...
> > }

Also updated the comment. I did use `protobuf::frameworkHasCapability` in both 
cases for uniformity.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55271/#review162021
---


On Jan. 17, 2017, 11:20 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55271/
> ---
> 
> (Updated Jan. 17, 2017, 11:20 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Jay Guo, and Guangya Liu.
> 
> 
> Bugs: MESOS-6631
> https://issues.apache.org/jira/browse/MESOS-6631
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> We currently do not allow `MULTI_ROLE` frameworks to change their
> roles. This restriction will be lifted later.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 44f4fecb1fbe8bebf830990a59a5462338e6e004 
>   src/tests/master_validation_tests.cpp 
> e5d55e03648cb218d42adc594d6fa7d40ea9bcbb 
> 
> Diff: https://reviews.apache.org/r/55271/diff/
> 
> 
> Testing
> ---
> 
> Tested on various Linux configurations in internal CI.
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 55491: Added a `support/README.md` and the `support/jenkins` directory.

2017-01-18 Thread Michael Park


> On Jan. 18, 2017, 4:50 a.m., Benjamin Bannier wrote:
> > support/jenkins/buildbot.sh, line 22
> > 
> >
> > Any reason this isn't named `MESOS_DIRECTORY` like similar variables in 
> > other places?

See https://reviews.apache.org/r/55490/#comment233317


> On Jan. 18, 2017, 4:50 a.m., Benjamin Bannier wrote:
> > support/jenkins/tidybot.sh, line 22
> > 
> >
> > `MESOS_DIRECTORY`?

See https://reviews.apache.org/r/55490/#comment233317


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55491/#review162080
---


On Jan. 18, 2017, 7:12 a.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55491/
> ---
> 
> (Updated Jan. 18, 2017, 7:12 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a `support/README.md` and the `support/jenkins` directory.
> 
> 
> Diffs
> -
> 
>   support/README.md PRE-CREATION 
>   support/jenkins/README.md PRE-CREATION 
>   support/jenkins/buildbot.sh PRE-CREATION 
>   support/jenkins/reviewbot.sh PRE-CREATION 
>   support/jenkins/tidybot.sh PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/55491/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 55490: Used the `mesos/mesos-tidy` image from DockerHub.

2017-01-18 Thread Benjamin Bannier


> On Jan. 18, 2017, 1:53 p.m., Benjamin Bannier wrote:
> > support/mesos-tidy.sh, line 22
> > 
> >
> > Let's keep this further down so to not take away from `CHECKS` 
> > prominence. Any reason we'd want to deviate from `MESOS_DIRECTORY` used 
> > elsewhere? They seem to mean the same thing (extracted in slightly 
> > different ways).
> 
> Michael Park wrote:
> I thought here was that most (if not all) of our scripts should start 
> like this:
> 
> ```bash
> #!/usr/bin/env bash
> 
> // LICENSE ...
> 
> set -e
> set -o pipefail
> 
> MESOS_DIR=$(git rev-parse --show-toplevel)
> ```
> 
> In terms of the use of `git`, I didn't like that I couldn't use
> `$(cd "$(dirname "$0")/.." && pwd)` in scripts within deeper directories
> (e.g., `support/jenkins`) because of the `..` portion of the pattern.
> 
> It seems like the `git` command is more readable and also more portable.
> 
> I did think about the "added" dependency on `git`, but `git` is an 
> essential tool,
> and we would've needed `git` to end up with this repository somehow 
> anyway.
> So I figured that shouldn't be an issue.
> 
> Michael Park wrote:
> The `s/MESOS_DIRECTORY/MESOS_DIR/` is simply because we seem to use
> `_DIR` suffix for directories more widely than `DIRECTORY` in general.

Makes sense, dropping.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55490/#review162068
---


On Jan. 18, 2017, 9:58 a.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55490/
> ---
> 
> (Updated Jan. 18, 2017, 9:58 a.m.)
> 
> 
> Review request for mesos and Benjamin Bannier.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Used the `mesos/mesos-tidy` image from DockerHub.
> 
> 
> Diffs
> -
> 
>   support/mesos-tidy.sh 96d3ecb91f5476ff499ca5f043c527681c30efe9 
> 
> Diff: https://reviews.apache.org/r/55490/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

2017-01-18 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/#review162098
---



- Vinod Kone


On Jan. 17, 2017, 6:36 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54183/
> ---
> 
> (Updated Jan. 17, 2017, 6:36 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6619
> https://issues.apache.org/jira/browse/MESOS-6619
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Before partition-awareness, when an agent failed health checks, the
> master removed the agent from the registry, marked all of its tasks
> TASK_LOST, and moved them to the `completedTasks` list in the master's
> memory. Although "lost" tasks might still be running, partitioned agents
> would only be allowed to re-register if the master failed over, in which
> case the `completedTasks` map would be emptied.
> 
> When partition-awareness was introduced, we initially followed the same
> scheme, with the only difference that partition-aware tasks are marked
> TASK_UNREACHABLE, not TASK_LOST.
> 
> This scheme has a few shortcomings. First, partition-aware tasks might
> resume running when the partitioned agent re-registers. Second, we
> re-added non-partition aware tasks when the agent re-registered but then
> marked them completed when the framework is shutdown, resulting in two
> entries in `completedTasks`.
> 
> This commit introduces a separate bounded map, `unreachableTasks`. These
> tasks are reported separately via the HTTP endpoints, because they have
> different semantics (unlike completed tasks, unreachable tasks can
> resume running). The size of this map is limited by a new master flag,
> `--max_unreachable_tasks_per_framework`. This commit also changes the
> master to omit re-adding non-partition-aware tasks on re-registering
> agents (unless the master has failed over): those tasks will shortly be
> shutdown anyway.
> 
> Finally, this commit fixes a minor bug in the previous code: the
> previous coding neglected to shutdown non-partition-aware frameworks
> running on pre-1.0 Mesos agents that re-register with the master after
> a network partition.
> 
> 
> Diffs
> -
> 
>   docs/configuration.md 2113d06f58dddc0a28ae1241a24096266fe39801 
>   include/mesos/master/master.proto 03203c8ddd05876339b1c56789b7bb140499d49d 
>   include/mesos/v1/master/master.proto 
> f8edf39a68752c8601cece345f52bce8b9a8a68b 
>   src/master/constants.hpp 900b69460de9280f30b16b86925e436054e080ca 
>   src/master/flags.hpp 6a17b763dc76daa10073394f416b049e97a44238 
>   src/master/flags.cpp 124bfbca6be9d5d60597b9a1793bd807202da8ec 
>   src/master/http.cpp a44621f39cb059e654a56f57f75b38947f3a4230 
>   src/master/master.hpp 44f4fecb1fbe8bebf830990a59a5462338e6e004 
>   src/master/master.cpp b863ff6e93931c3d1ee056248084c7f44caf2fd9 
>   src/tests/partition_tests.cpp 72013d1bfee275c6f3cb90173f0c408d55e0bc5d 
> 
> Diff: https://reviews.apache.org/r/54183/diff/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Neil Conway
> 
>



Re: Review Request 54183: Improved management of unreachable and completed tasks in master.

2017-01-18 Thread Vinod Kone


> On Jan. 3, 2017, 10:25 p.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5507
> > 
> >
> > can we inline this?
> 
> Neil Conway wrote:
> We could, but personally I find it more readable to make it a separate 
> function. The logic is a bit involved here (since we need to check two 
> different data structures due to backward compatibility requirements), and 
> there's already a lot going on in `Master::_reregisterSlave`.
> 
> Vinod Kone wrote:
> In general, we tend to factor out something into a function if it is 
> reusable and generic, not to reduce the number of lines someone has to read 
> in a function. In fact the argument was to inline in such cases because it 
> will avoid the need to context switch. 
> 
> Both `findPartitonAwareFrameworks` and `filterPartitionTasks` seem very 
> specific to the `_reregisterSlave` method, hence the suggestion.
> 
> For example, instead of looping through all the agent's tasks 3 times to 
> figure out which tasks to add to `Slave*`, it's probably easier to do it in 
> one inline loop here?
> 
> Neil Conway wrote:
> My goal isn't to reduce the # of lines to read the function, but to raise 
> the level of abstraction a little bit. For example, 
> `findPartitonAwareFrameworks` has to check _two_ data structures, depending 
> on both agent version and whether the master has failed over. That is quite 
> strange and takes a bit of thought to validate. With a separate function, the 
> reader of `_reregisterSlave` doesn't need to think about those details if 
> they don't want to -- instead they can just think "okay, we find all the 
> partition-aware frameworks on the agent and then ..."
> 
> I'm fine with looping over the agent's tasks three times, because each of 
> the three things we want to do is logically independent -- it takes a bit of 
> thought to validate whether it is actually correct to combine them into a 
> single loop (because each loop iteration would access a collection that is 
> modified by the loop itself. It would actually be okay, but IMO current 
> approach is safer).

This is the code I had in mind. Let me know what you think:

```
// Add recovered frameworks into a hashset for easy lookup.
hashset frameworks_;
foreach (const FrameworkInfo& framework, frameworks) {
  frameworks_.put(framework.id(), framework);
}

vector tasks_;
foreach (const Task& task, tasks) {
  const FrameworkInfo& frameworkInfo = frameworks_[task.framework_id()];
  Framework* framework = getFramework(framework.id());
  
  if (protobuf::frameworkHasCapability(frameworkInfo, PARTITION_AWARE)) {
// Add the task if it is partition aware.
tasks_.push_back(task);

// Remove the partition aware task from `unreachableTasks`.
if (framework != null) {
  framework->unreachableTasks.erase(task.task_id());
}
  } else if (slaves.removed.get(slaveInfo.id()).isSome()) {
// Add a non-partion task iff the master failed over;
tasks_.push_back(task);
  }
}
  
Slave* slave = new Slave(
  this,
  slaveInfo,
  pid,
  machineId,
  version,
  Clock::now(),
  checkpointedResources,
  executorInfos,
  tasks_);
}

```


- Vinod


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54183/#review160019
---


On Jan. 17, 2017, 6:36 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54183/
> ---
> 
> (Updated Jan. 17, 2017, 6:36 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6619
> https://issues.apache.org/jira/browse/MESOS-6619
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Before partition-awareness, when an agent failed health checks, the
> master removed the agent from the registry, marked all of its tasks
> TASK_LOST, and moved them to the `completedTasks` list in the master's
> memory. Although "lost" tasks might still be running, partitioned agents
> would only be allowed to re-register if the master failed over, in which
> case the `completedTasks` map would be emptied.
> 
> When partition-awareness was introduced, we initially followed the same
> scheme, with the only difference that partition-aware tasks are marked
> TASK_UNREACHABLE, not TASK_LOST.
> 
> This scheme has a few shortcomings. First, partition-aware tasks might
> resume running when the partitioned agent re-registers. Second, we
> re-added non-partition aware tasks when the agent re-registered but then
> marked them completed when the framework is shutdown, resulting in two
> entries in `completedTasks`.
> 
> This commit introduces a separate bounded map, 

Re: Review Request 55491: Added a `support/README.md` and the `support/jenkins` directory.

2017-01-18 Thread Michael Park


> On Jan. 18, 2017, 4:50 a.m., Benjamin Bannier wrote:
> > support/README.md, lines 5-7
> > 
> >
> > > The scripts directly in this directory are intended to be used by 
> > Mesos developers, while the build scripts in [`jenkins`](jenkins) are used 
> > by the ASF Jenkins CI in a bot-specific way.

I decided to just drop the `jenkins` stuff. People can look at 
`jenkins/README.md`.


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55491/#review162080
---


On Jan. 18, 2017, 7:12 a.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55491/
> ---
> 
> (Updated Jan. 18, 2017, 7:12 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a `support/README.md` and the `support/jenkins` directory.
> 
> 
> Diffs
> -
> 
>   support/README.md PRE-CREATION 
>   support/jenkins/README.md PRE-CREATION 
>   support/jenkins/buildbot.sh PRE-CREATION 
>   support/jenkins/reviewbot.sh PRE-CREATION 
>   support/jenkins/tidybot.sh PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/55491/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 55491: Added a `support/README.md` and the `support/jenkins` directory.

2017-01-18 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55491/
---

(Updated Jan. 18, 2017, 7:12 a.m.)


Review request for mesos, Benjamin Bannier and Vinod Kone.


Changes
---

Addressed Vinod and Benjamin's comments.


Repository: mesos


Description (updated)
---

Added a `support/README.md` and the `support/jenkins` directory.


Diffs (updated)
-

  support/README.md PRE-CREATION 
  support/jenkins/README.md PRE-CREATION 
  support/jenkins/buildbot.sh PRE-CREATION 
  support/jenkins/reviewbot.sh PRE-CREATION 
  support/jenkins/tidybot.sh PRE-CREATION 

Diff: https://reviews.apache.org/r/55491/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 55490: Used the `mesos/mesos-tidy` image from DockerHub.

2017-01-18 Thread Michael Park


> On Jan. 18, 2017, 4:53 a.m., Benjamin Bannier wrote:
> > support/mesos-tidy.sh, line 22
> > 
> >
> > Let's keep this further down so to not take away from `CHECKS` 
> > prominence. Any reason we'd want to deviate from `MESOS_DIRECTORY` used 
> > elsewhere? They seem to mean the same thing (extracted in slightly 
> > different ways).
> 
> Michael Park wrote:
> I thought here was that most (if not all) of our scripts should start 
> like this:
> 
> ```bash
> #!/usr/bin/env bash
> 
> // LICENSE ...
> 
> set -e
> set -o pipefail
> 
> MESOS_DIR=$(git rev-parse --show-toplevel)
> ```
> 
> In terms of the use of `git`, I didn't like that I couldn't use
> `$(cd "$(dirname "$0")/.." && pwd)` in scripts within deeper directories
> (e.g., `support/jenkins`) because of the `..` portion of the pattern.
> 
> It seems like the `git` command is more readable and also more portable.
> 
> I did think about the "added" dependency on `git`, but `git` is an 
> essential tool,
> and we would've needed `git` to end up with this repository somehow 
> anyway.
> So I figured that shouldn't be an issue.

The `s/MESOS_DIRECTORY/MESOS_DIR/` is simply because we seem to use
`_DIR` suffix for directories more widely than `DIRECTORY` in general.


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55490/#review162068
---


On Jan. 18, 2017, 12:58 a.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55490/
> ---
> 
> (Updated Jan. 18, 2017, 12:58 a.m.)
> 
> 
> Review request for mesos and Benjamin Bannier.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Used the `mesos/mesos-tidy` image from DockerHub.
> 
> 
> Diffs
> -
> 
>   support/mesos-tidy.sh 96d3ecb91f5476ff499ca5f043c527681c30efe9 
> 
> Diff: https://reviews.apache.org/r/55490/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 55654: Renamed `CONFIGURE_FLAGS` to `CMAKE_ARGS`.

2017-01-18 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55654/
---

(Updated Jan. 18, 2017, 6:54 a.m.)


Review request for mesos and Benjamin Bannier.


Changes
---

Addressed Benjamin's comments.


Repository: mesos


Description (updated)
---

Updated `support/mesos-tidy/README.md` accordingly + minor updates.


Diffs (updated)
-

  support/mesos-tidy.sh 96d3ecb91f5476ff499ca5f043c527681c30efe9 
  support/mesos-tidy/README.md e285632c53613d67a5317bbc8c443fe2d06ba38a 
  support/mesos-tidy/entrypoint.sh c1de1c6ea8622fb7b4a7aa313af8d46ed256 

Diff: https://reviews.apache.org/r/55654/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 55490: Used the `mesos/mesos-tidy` image from DockerHub.

2017-01-18 Thread Michael Park


> On Jan. 18, 2017, 4:53 a.m., Benjamin Bannier wrote:
> > support/mesos-tidy.sh, line 22
> > 
> >
> > Let's keep this further down so to not take away from `CHECKS` 
> > prominence. Any reason we'd want to deviate from `MESOS_DIRECTORY` used 
> > elsewhere? They seem to mean the same thing (extracted in slightly 
> > different ways).

I thought here was that most (if not all) of our scripts should start like this:

```bash
#!/usr/bin/env bash

// LICENSE ...

set -e
set -o pipefail

MESOS_DIR=$(git rev-parse --show-toplevel)
```

In terms of the use of `git`, I didn't like that I couldn't use
`$(cd "$(dirname "$0")/.." && pwd)` in scripts within deeper directories
(e.g., `support/jenkins`) because of the `..` portion of the pattern.

It seems like the `git` command is more readable and also more portable.

I did think about the "added" dependency on `git`, but `git` is an essential 
tool,
and we would've needed `git` to end up with this repository somehow anyway.
So I figured that shouldn't be an issue.


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55490/#review162068
---


On Jan. 18, 2017, 12:58 a.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55490/
> ---
> 
> (Updated Jan. 18, 2017, 12:58 a.m.)
> 
> 
> Review request for mesos and Benjamin Bannier.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Used the `mesos/mesos-tidy` image from DockerHub.
> 
> 
> Diffs
> -
> 
>   support/mesos-tidy.sh 96d3ecb91f5476ff499ca5f043c527681c30efe9 
> 
> Diff: https://reviews.apache.org/r/55490/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 55464: Made the Agent API able to handle containers nested at arbitrary levels.

2017-01-18 Thread Gastón Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55464/
---

(Updated Jan. 18, 2017, 2:48 p.m.)


Review request for mesos, Adam B, Alexander Rukletsov, Anand Mazumdar, 
Alexander Rojas, haosdent huang, and Vinod Kone.


Changes
---

Addressed Anand's feedback.


Bugs: MESOS-6864
https://issues.apache.org/jira/browse/MESOS-6864


Repository: mesos


Description
---

Made the Agent API able to handle containers nested at arbitrary levels.


Diffs (updated)
-

  src/slave/http.cpp ee7119179a6ddd935c3f43a618ef645619c305ee 
  src/slave/slave.cpp ae183e6b8b40094b4764525a6d63164f210ef9d8 
  src/tests/api_tests.cpp ea62226fbc5dd0bcc5ea60d1bbcd3748f9367467 

Diff: https://reviews.apache.org/r/55464/diff/


Testing
---

`GTEST_FILTER="" make -j 8 check` in macOS.
`make check` in Linux.

I also did some manual testing using a proof of concept  that makes the 
`DefaultExecutor` leverage this change to perform CMD health checks.


Thanks,

Gastón Kleiman



Re: Review Request 55490: Used the `mesos/mesos-tidy` image from DockerHub.

2017-01-18 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55490/#review162095
---


Ship it!




Modulo my comments.

- Benjamin Bannier


On Jan. 18, 2017, 9:58 a.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55490/
> ---
> 
> (Updated Jan. 18, 2017, 9:58 a.m.)
> 
> 
> Review request for mesos and Benjamin Bannier.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Used the `mesos/mesos-tidy` image from DockerHub.
> 
> 
> Diffs
> -
> 
>   support/mesos-tidy.sh 96d3ecb91f5476ff499ca5f043c527681c30efe9 
> 
> Diff: https://reviews.apache.org/r/55490/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 55667: Added `libssl-dev` to `support/mesos-tidy/Dockerfile`.

2017-01-18 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55667/#review162094
---


Ship it!




Ship It!

- Benjamin Bannier


On Jan. 18, 2017, 9:57 a.m., Michael Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55667/
> ---
> 
> (Updated Jan. 18, 2017, 9:57 a.m.)
> 
> 
> Review request for mesos and Benjamin Bannier.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary. Refer to MESOS-6942.
> 
> 
> Diffs
> -
> 
>   support/mesos-tidy/Dockerfile 8d8c476d6f65ea3ce8ded75ed26b0967a9aced82 
> 
> Diff: https://reviews.apache.org/r/55667/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Michael Park
> 
>



Re: Review Request 55621: Fixed issues with the Docker fetcher when using a proxy.

2017-01-18 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55621/
---

(Updated Jan. 18, 2017, 2:45 p.m.)


Review request for mesos, Benjamin Bannier, Gilbert Song, and Jie Yu.


Changes
---

Addressed issues, added more checks for CONNECT response.


Bugs: MESOS-6010
https://issues.apache.org/jira/browse/MESOS-6010


Repository: mesos


Description
---

When behing a proxy, 'curl' uses HTTP CONNECT tunneling to access
HTTPS. This lead to problems with our HTTP parser because the response
of a 'CONNECT' doesn't have neither headers nor a body.


Diffs (updated)
-

  src/uri/fetchers/docker.cpp 027e7480bd91038b2e51031dc77aad779d6d5a88 

Diff: https://reviews.apache.org/r/55621/diff/


Testing
---

./bin/mesos-test.sh without a proxy (to test that it's not breaking existing 
behavior)
./bin/mesos-tests.sh behind a proxy.
For example by running:
```
docker run -d -p 3128:3128 minumum2scp/squid
export https_proxy=127.0.0.1:3128
./bin/mesos-tests.sh
```
Without this diff, tests cases in the `DockerFetcherPluginTest` fixture should 
fail.


Thanks,

Jan Schlicht



Re: Review Request 55496: Added support for HTTP responses with unspecified length.

2017-01-18 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55496/
---

(Updated Jan. 18, 2017, 2:44 p.m.)


Review request for mesos, Benjamin Bannier, Gilbert Song, and Jie Yu.


Changes
---

Added a unit test and addressed issues.


Bugs: MESOS-6010
https://issues.apache.org/jira/browse/MESOS-6010


Repository: mesos


Description
---

Some HTTP responses might not set the 'Content-Length' header. This
usually means that the message body should be read until EOF. This
will be done by an additional call of 'decoder.decode' with an empty
string.


Diffs (updated)
-

  3rdparty/libprocess/src/http.cpp e2e9b8f7563f8756ee406fecca5900d644834974 
  3rdparty/libprocess/src/tests/decoder_tests.cpp 
87563f4fcdfaa2a33c4482533ddff24e062e603a 

Diff: https://reviews.apache.org/r/55496/diff/


Testing
---

make check


Thanks,

Jan Schlicht



Re: Review Request 55637: CMake: Added `test` target.

2017-01-18 Thread Ilya Pronin


> On Jan. 17, 2017, 8:46 p.m., Ilya Pronin wrote:
> > I think the name "test" can confuse people because it's usually expected to 
> > be used for CTest. Maybe it would better be leave old {{tests}} and 
> > {{check}} targets? https://cmake.org/Wiki/CMakeEmulateMakeCheck
> 
> Alex Clemmer wrote:
> That's a great question. I'm not a CMake native, so it's not clear to me 
> how weird this would be to do. The idea was to expose a `test` target to 
> achieve something resembling script parity. Is this idea not compelling to 
> you?

Very compelling. I just meant that `make test` is usually expected to not only 
build tests but also run them, i.e. act the same way as autotools generated 
`make check`. E.g. debhelper's `dh_auto_test` looks for `test` or `check` 
target in a Makefile. So I suggested that target that just builds tests should 
be called `tests`.


- Ilya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55637/#review161940
---


On Jan. 17, 2017, 7:33 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55637/
> ---
> 
> (Updated Jan. 17, 2017, 7:33 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, and Joseph Wu.
> 
> 
> Bugs: MESOS-3697
> https://issues.apache.org/jira/browse/MESOS-3697
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This commit will introduce a `test` target to the CMake build system.
> The semantics of this target are very similar to the autotools build
> system, in that they build the tests, but do not run them.
> 
> Accomplishing this is somewhat complex in CMake, because CMake expects
> to control the `test` target itself. We work around this by (1)
> silencing the warning that `test` is a reserved target, (2) removing the
> call to `enable_testing` that sets up CTest, and (3) adding our own
> `test` target. As an additional insurance policy, we error out if any of
> the `BUILD_TESTING` flags are defined, which would indicate that
> `include(CTest)` was called.
> 
> 
> Diffs
> -
> 
>   CMakeLists.txt c6b1eb6a7c7e6ed3650847acbec0ddd3101bb96f 
>   cmake/MesosConfigure.cmake 6a9ed9dc02f5c9a6d1fce0866f19ffeafec35cdc 
>   src/tests/CMakeLists.txt b7adf4f3126611a57baf4f62b62b208bc0236da5 
> 
> Diff: https://reviews.apache.org/r/55637/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 55621: Fixed issues with the Docker fetcher when using a proxy.

2017-01-18 Thread Jan Schlicht


> On Jan. 17, 2017, 7:11 p.m., Jie Yu wrote:
> > src/uri/fetchers/docker.cpp, line 171
> > 
> >
> > I think we need a case insensitive os::getenv because one can use 
> > `HTTPS_PROXY` as well.
> > 
> > Maybe introduce a default parameter `Option caseSensitive` in 
> > os::getenv, default to None(). None() means use system default.
> > 
> > If you feel the above is too much for now, add a helper to do that for 
> > now in this file and add a TODO.
> 
> Jan Schlicht wrote:
> Making `os::getenv` case insensitive wouldn't do the trick, as they 
> aren't case insensitive in Linux and both the lower-case as well as the 
> upper-case variable could be defined. For curl the lower case version has 
> precedence (see https://curl.haxx.se/docs/manpage.html in the 'Environment' 
> section), but this may not be the general case.

But, of course, we'll have to check for `HTTPS_PROXY` here. I'll add that.


- Jan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55621/#review161893
---


On Jan. 17, 2017, 3:53 p.m., Jan Schlicht wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55621/
> ---
> 
> (Updated Jan. 17, 2017, 3:53 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Gilbert Song, and Jie Yu.
> 
> 
> Bugs: MESOS-6010
> https://issues.apache.org/jira/browse/MESOS-6010
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When behing a proxy, 'curl' uses HTTP CONNECT tunneling to access
> HTTPS. This lead to problems with our HTTP parser because the response
> of a 'CONNECT' doesn't have neither headers nor a body.
> 
> 
> Diffs
> -
> 
>   src/uri/fetchers/docker.cpp 3f38dddfb4c089322fe4e13b1ef2070b4835885c 
> 
> Diff: https://reviews.apache.org/r/55621/diff/
> 
> 
> Testing
> ---
> 
> ./bin/mesos-test.sh without a proxy (to test that it's not breaking existing 
> behavior)
> ./bin/mesos-tests.sh behind a proxy.
> For example by running:
> ```
> docker run -d -p 3128:3128 minumum2scp/squid
> export https_proxy=127.0.0.1:3128
> ./bin/mesos-tests.sh
> ```
> Without this diff, tests cases in the `DockerFetcherPluginTest` fixture 
> should fail.
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>



  1   2   >