Re: Review Request 52181: Added synchronization in link logic to prevent relinking races.

2016-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52181/#review150296
---



Patch looks great!

Reviews applied: [52180, 52182, 52181]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Sept. 24, 2016, 1:04 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52181/
> ---
> 
> (Updated Sept. 24, 2016, 1:04 a.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Artem Harutyunyan, and Joris Van 
> Remoortere.
> 
> 
> Bugs: MESOS-6234
> https://issues.apache.org/jira/browse/MESOS-6234
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> There is a general pattern in the `SocketManager` in which most methods
> will grab the mutex and then check if the socket to manage exists in
> the `SocketManager`s mapping.  If the socket does not exist, the
> `SocketManager` silently returns.
> 
> This adds similar logic inside two critical sections of the `link`
> codepath.  If there are multiple calls to `link` in-flight at once,
> this prevents sockets from being leaked into unmanaged callback loops.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/process.cpp 
> 02a192529e53479d5a163fa6a20873674b51ee2c 
> 
> Diff: https://reviews.apache.org/r/52181/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> ---
> 
> This test is not included in the review diff because it cannot be run in a 
> loop (i.e. `--gtest_repeat`).  The system will run out of emphemeral sockets 
> after ~80 iterations and the test will begin to fail due to failing socket 
> connections.
> 
> 3rdparty/libprocess/libprocess-tests 
> --gtest_filter="ProcessRemoteLinkTest.RemoteRelinkLoop" --gtest_repeat=50 
> --gtest_break_on_failure
> 
> ```
> // Verifies that repeated relinking of a remote process
> // maintains monitoring of the remote process.
> TEST_F(ProcessRemoteLinkTest, RemoteRelinkLoop)
> {
>   RemoteLinkTestProcess relinker(pid);
>   Future relinkerExitedPid;
> 
>   EXPECT_CALL(relinker, exited(pid))
> .WillOnce(FutureArg<0>(&relinkerExitedPid));
> 
>   spawn(relinker);
> 
>   // NOTE: The remote process we are linking to never closes sockets.
>   // Therefore, we can only make a limited number of connections before
>   // the remote process runs out of file descriptors.
>   for (int i = 0; i < 200; i++) {
> relinker.relink();
>   }
> 
>   os::killtree(linkee->pid(), SIGKILL);
>   reap_linkee();
>   linkee = None();
> 
>   AWAIT_ASSERT_EQ(pid, relinkerExitedPid);
> 
>   terminate(relinker);
>   wait(relinker);
> }
> ```
> 
> ---
> 
> In order to test the SSL-downgrade path, you need to tweak the way we launch 
> `test-linkee` in `ProcessRemoteLinkTest::SetUp`:
> ```
> std::map empty_environment;
> Try s = process::subprocess(
> path::join(BUILD_DIR, "test-linkee") +
>   " '" + stringify(coordinator.self()) + "'",
> process::Subprocess::FD(STDIN_FILENO),
> process::Subprocess::FD(STDOUT_FILENO),
> process::Subprocess::FD(STDERR_FILENO),
> empty_environment);
> ```
> 
> LIBPROCESS_SSL_ENABLED=true LIBPROCESS_SSL_SUPPORT_DOWNGRADE=true 
> LIBPROCESS_SSL_KEY_FILE=key.pem LIBPROCESS_SSL_CERT_FILE=cert.pem 
> 3rdparty/libprocess/libprocess-tests 
> --gtest_filter="ProcessRemoteLinkTest.RemoteRelinkLoop" --gtest_repeat=50 
> --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-23 Thread Jacob Janco


> On Sept. 23, 2016, 2:40 a.m., Guangya Liu wrote:
> > It is really weired that the performance of 
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7`
> >  does not improve much when calling `addSlave`, need check more for why 
> > `addSlave` was same? Without fix, the `addSlave` will call `allocate` for 
> > each agent, but with the fix, only one `allocate` will be called
> > 
> > ```
> > without fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 122268us
> > Added 1000 agents in 42.037104secs
> > 
> > With fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 116107us
> > Added 1000 agents in 41.615396secs
> > ```
> 
> Guangya Liu wrote:
> Jacob, I did more test with the code on Aug 23, at which I posted some 
> result in this RR, and found that the test result is different, I did 
> following to get Aug 23 code.
> 
> ```
> LiuGuangyas-MacBook-Pro:build gyliu$ git checkout 
> 2f78a440ef4201c5b11fb92c225694e84a60369c
> 
> LiuGuangyas-MacBook-Pro:build gyliu$ git log -1
> commit 2f78a440ef4201c5b11fb92c225694e84a60369c
> Author: Gilbert Song 
> Date:   Mon Aug 22 13:00:58 2016 -0700
> 
> Fixed potential flakiness in ROOT_RecoverOrphanedPersistentVolume.
> 
> Review: https://reviews.apache.org/r/51271/
> ```
> 
> The test result seems still same as now (without your patch and the code 
> is get from Aug 23):
> 
> ```
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> Using 1000 agents and 6000 frameworks
> Added 6000 frameworks in 144272us
> Added 1000 agents in 43.107001secs
> ```
> 
> But anyway, I think that we need find out why the performance for 
> `addSlave` was not improved based on your patch.
> 
> Jacob Janco wrote:
> Yes agreed, per our Slack discussions, I'll look into this. Thanks for 
> posting the followup.
> 
> Benjamin Mahler wrote:
> `addSlave()` is asynchronous and we do not wait for all of the 
> `addSlave()` futures to complete, so any speedup in `addSlave()` will only 
> affect the next caller that waits for a result from the allocator.
> 
> Benjamin Mahler wrote:
> Ah I missed that we do a `Clock::settle()`, nevermind :)
> 
> Guangya Liu wrote:
> Some thinking for why `addSlave` does not improve much...
> 
> Without Jacob's patch, the logic woule be:
> 
> ```
> addSlave -> allocate the single slave
> addSlave -> allocate the single slave
> addSlave -> allocate the single slave
> ...
> addSlave -> allocate the single slave
> ```
> 
> With Jacob's patch, the logic would be:
> 
> ```
> addSlave
> addSlave
> addSlave
> ...
> addSlave - > allocate for **all** of the slaves
> ```
> 
> The time elapsed by `allocate a single slave N times` with `allocate N 
> slaves in one allocate` request should not different much, the only 
> difference is one is looping the event queue while another is looping in 
> allocator, that's why there are not enough performance change for this.
> 
> But this will impact a lot when adding frameworks or some other events in 
> allocator which will call `allocate(slaves)`, one proposal is we may need to 
> add some new benchmark test cases which do the following logic, the following 
> logic will trigger each `addframework` operation call `allocate(slaves)` 
> without Jacob's patch, but will only call `allocate(slaves)` one time with 
> Jacob's patch.
> 
> ```
> 1) Add slaves first
> 2) Add frameworks
> ```
> 
> We may get some performance improvement with above case.
> 
> Currently, all of the benchmark test are using 
> 
> ```
> 1) Add frameworks
> 2) Add agents
> ```
> 
> That's why not much performance improvement...
> 
> Jacob Janco wrote:
> This makes sense Guangya, I'm in the process of creating a minimal 
> benchmark adding a set of slaves then adding frameworks. I'll post here if 
> the results are interesting.
> 
> Guangya Liu wrote:
> Jacob, just FYI, your fix does help a lot fo

Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-23 Thread Guangya Liu


> On 九月 23, 2016, 2:40 a.m., Guangya Liu wrote:
> > It is really weired that the performance of 
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7`
> >  does not improve much when calling `addSlave`, need check more for why 
> > `addSlave` was same? Without fix, the `addSlave` will call `allocate` for 
> > each agent, but with the fix, only one `allocate` will be called
> > 
> > ```
> > without fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 122268us
> > Added 1000 agents in 42.037104secs
> > 
> > With fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 116107us
> > Added 1000 agents in 41.615396secs
> > ```
> 
> Guangya Liu wrote:
> Jacob, I did more test with the code on Aug 23, at which I posted some 
> result in this RR, and found that the test result is different, I did 
> following to get Aug 23 code.
> 
> ```
> LiuGuangyas-MacBook-Pro:build gyliu$ git checkout 
> 2f78a440ef4201c5b11fb92c225694e84a60369c
> 
> LiuGuangyas-MacBook-Pro:build gyliu$ git log -1
> commit 2f78a440ef4201c5b11fb92c225694e84a60369c
> Author: Gilbert Song 
> Date:   Mon Aug 22 13:00:58 2016 -0700
> 
> Fixed potential flakiness in ROOT_RecoverOrphanedPersistentVolume.
> 
> Review: https://reviews.apache.org/r/51271/
> ```
> 
> The test result seems still same as now (without your patch and the code 
> is get from Aug 23):
> 
> ```
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> Using 1000 agents and 6000 frameworks
> Added 6000 frameworks in 144272us
> Added 1000 agents in 43.107001secs
> ```
> 
> But anyway, I think that we need find out why the performance for 
> `addSlave` was not improved based on your patch.
> 
> Jacob Janco wrote:
> Yes agreed, per our Slack discussions, I'll look into this. Thanks for 
> posting the followup.
> 
> Benjamin Mahler wrote:
> `addSlave()` is asynchronous and we do not wait for all of the 
> `addSlave()` futures to complete, so any speedup in `addSlave()` will only 
> affect the next caller that waits for a result from the allocator.
> 
> Benjamin Mahler wrote:
> Ah I missed that we do a `Clock::settle()`, nevermind :)
> 
> Guangya Liu wrote:
> Some thinking for why `addSlave` does not improve much...
> 
> Without Jacob's patch, the logic woule be:
> 
> ```
> addSlave -> allocate the single slave
> addSlave -> allocate the single slave
> addSlave -> allocate the single slave
> ...
> addSlave -> allocate the single slave
> ```
> 
> With Jacob's patch, the logic would be:
> 
> ```
> addSlave
> addSlave
> addSlave
> ...
> addSlave - > allocate for **all** of the slaves
> ```
> 
> The time elapsed by `allocate a single slave N times` with `allocate N 
> slaves in one allocate` request should not different much, the only 
> difference is one is looping the event queue while another is looping in 
> allocator, that's why there are not enough performance change for this.
> 
> But this will impact a lot when adding frameworks or some other events in 
> allocator which will call `allocate(slaves)`, one proposal is we may need to 
> add some new benchmark test cases which do the following logic, the following 
> logic will trigger each `addframework` operation call `allocate(slaves)` 
> without Jacob's patch, but will only call `allocate(slaves)` one time with 
> Jacob's patch.
> 
> ```
> 1) Add slaves first
> 2) Add frameworks
> ```
> 
> We may get some performance improvement with above case.
> 
> Currently, all of the benchmark test are using 
> 
> ```
> 1) Add frameworks
> 2) Add agents
> ```
> 
> That's why not much performance improvement...
> 
> Jacob Janco wrote:
> This makes sense Guangya, I'm in the process of creating a minimal 
> benchmark adding a set of slaves then adding frameworks. I'll post here if 
> the results are interesting.

Jacob, just FYI, your fix does help a lot for the case I listed above!

A ne

Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-23 Thread Guangya Liu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review150281
---



I think that you are still working on the test cases as I saw many test are 
failing


src/master/allocator/mesos/hierarchical.hpp (line 214)


```
Allocate resources just from the specifed agents.
```



src/master/allocator/mesos/hierarchical.hpp (line 217)


```
// Allocate resources from all agents.
```



src/master/allocator/mesos/hierarchical.cpp (lines 1291 - 1292)


1) Kill the `/` to the end
2) Why using union of `pendingAllocationAgentIds and slaveIds` here? Why 
not use `pendingAllocationAgentIds` directly? Can you please add more comments 
for the reason here?


- Guangya Liu


On 九月 23, 2016, 4:32 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> ---
> 
> (Updated 九月 23, 2016, 4:32 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus 
> Ma, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> 2c31471ee0f5d6836393bf87ff9ecfd8df835013 
>   src/master/allocator/mesos/hierarchical.cpp 
> 2d56bd011f2c87c67a02d0ae467a4a537d36867e 
> 
> Diff: https://reviews.apache.org/r/51027/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> note: check without filters depends on https://reviews.apache.org/r/51028
> 
> With new benchmark https://reviews.apache.org/r/49617: 
> Sample output without 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 1 agents in 3.2134535333mins
> allocator settled after  1.6123603833mins
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (290578 ms)
> 
> Sample output with 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 39817us
> Added 1 agents in 3.2286054167mins
> allocator settled after  25.525654secs
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (220137 ms)
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 51407: Updated mesos containerizer to checkpoint container runtime information.

2016-09-23 Thread Jie Yu


> On Sept. 18, 2016, 11:51 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/containerizer.cpp, lines 1794-1800
> > 
> >
> > Hum, should we do the deletion here if we don't even sure the processes 
> > are killed properly?
> > 
> > I think my suggestion on RuntimePath for container is:
> > 1) we create the RuntimePath as the first thing in 'launch', even 
> > before we call any provisioner/isolator functions.
> > 2) we checkpoint the pid right after fork
> > 3) we delete the RuntimePath after the destroy is successful.
> > 
> > The invariants we have if using the above way are:
> > 1) If RuntimePath exists, we know that provisioner/isolator prepare 
> > might be called, so cleanup is necessary during recovery.
> > 2) If RuntimePath does not exist, we know that all cleanups have been 
> > done properly and we no longer need to worry about cleanup.
> > 3) If pid file exists, we know that the process has been forked.
> > 4) If the pid file does not exist, we may or maynot have process being 
> > forked.
> > 
> > For the upgrade situation, some checkpointed containers or orphans may 
> > not have RuntimePath. In such case, we should not create RuntimePath (i.e., 
> > do not checkpoint pid or exit status) in order to maintain the above 
> > invariant. So for old containers launched by previous version of agent, 
> > there will be no RuntimePath for it at any time (this is another invariant).
> > 
> > It's likely that a container has RuntimePath, but launcher does not 
> > know about it. This is the case where launcher->destroy has been called 
> > (and being successful), but agent crashes before removing RuntimePath.
> > 
> > It's also likely that a container does not have RuntimePath, but 
> > launcher knows about it. This is the legacy container case.

Hum, I don't see all the comments here are addressed? Especially the following:

> 1) we create the RuntimePath as the first thing in 'launch', even before we 
> call any provisioner/isolator functions

> For the upgrade situation, some checkpointed containers or orphans may not 
> have RuntimePath. In such case, we should not create RuntimePath (i.e., do 
> not checkpoint pid or exit status) in order to maintain the above invariant.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51407/#review149406
---


On Sept. 23, 2016, 9:01 p.m., Kevin Klues wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51407/
> ---
> 
> (Updated Sept. 23, 2016, 9:01 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Bugs: MESOS-6204
> https://issues.apache.org/jira/browse/MESOS-6204
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This includes checkpointing both the container pid and the status of
> the container upon exit. This also includes an update to tests to
> account for new 'init' process semantics in a container. That is, the
> name of the init process of the container is now "mesos-containerizer"
> not "sh".
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.hpp 
> 16f9e3e92e90fe7f8a0ebd24e567800e1f285bc9 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 144b0db501d40d4e0bba12672723616bedd76e7e 
>   src/tests/containerizer/isolator_tests.cpp 
> b4d25e57df7f0e157769c9ae4f7847657c505e78 
> 
> Diff: https://reviews.apache.org/r/51407/diff/
> 
> 
> Testing
> ---
> 
> $ GTEST_FILTER="" make -j check
> $ src/mesos-tests
> $ sudo src/mesos-tests
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>



Re: Review Request 52198: Fixed warning in `statistics_tests`.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52198/#review150285
---


Ship it!




Ship It!

- Joseph Wu


On Sept. 22, 2016, 10:09 p.m., Daniel Pravat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52198/
> ---
> 
> (Updated Sept. 22, 2016, 10:09 p.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed warning in `statistics_tests`.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/tests/statistics_tests.cpp 
> 54849a0a95be334a7203d7725185a402947755e5 
> 
> Diff: https://reviews.apache.org/r/52198/diff/
> 
> 
> Testing
> ---
> 
> Windows: build
> OSX: make check
> 
> 
> Thanks,
> 
> Daniel Pravat
> 
>



Re: Review Request 51407: Updated mesos containerizer to checkpoint container runtime information.

2016-09-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51407/#review150282
---




src/slave/containerizer/mesos/containerizer.cpp (lines 1252 - 1269)


I think we should do that before provisioner/isolators prepare are called. 
The reason is because we need to do cleanups for provisioner/isolators to undo 
what's done in prepare after agent restarts if agent crashes after prepare is 
done but before here. This is not important for top level containers because 
agent will checkpoint it anyway. But this is important for nested containers, 
because agent won't checkpoint them.



src/slave/containerizer/mesos/containerizer.cpp (lines 1339 - 1345)


Let's avoid checkpoint the pid for legacy container (i.e., those containers 
whose runtime dir does not exist). See my comments last time.



src/slave/containerizer/mesos/containerizer.cpp (line 1340)


4 space indentation



src/slave/containerizer/mesos/containerizer.cpp (line 1887)


This should be LOG(ERROR) or LOG(WARNING)?

I think for legacy container case, we should do a exists check and if 
exists, LOG(ERROR) if we fail to remove it.


- Jie Yu


On Sept. 23, 2016, 9:01 p.m., Kevin Klues wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51407/
> ---
> 
> (Updated Sept. 23, 2016, 9:01 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Bugs: MESOS-6204
> https://issues.apache.org/jira/browse/MESOS-6204
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This includes checkpointing both the container pid and the status of
> the container upon exit. This also includes an update to tests to
> account for new 'init' process semantics in a container. That is, the
> name of the init process of the container is now "mesos-containerizer"
> not "sh".
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.hpp 
> 16f9e3e92e90fe7f8a0ebd24e567800e1f285bc9 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 144b0db501d40d4e0bba12672723616bedd76e7e 
>   src/tests/containerizer/isolator_tests.cpp 
> b4d25e57df7f0e157769c9ae4f7847657c505e78 
> 
> Diff: https://reviews.apache.org/r/51407/diff/
> 
> 
> Testing
> ---
> 
> $ GTEST_FILTER="" make -j check
> $ src/mesos-tests
> $ sudo src/mesos-tests
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>



Re: Review Request 52193: Fixed VC warnings in windows.hpp.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52193/#review150284
---


Ship it!




Ship It!

- Joseph Wu


On Sept. 22, 2016, 9:57 p.m., Daniel Pravat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52193/
> ---
> 
> (Updated Sept. 22, 2016, 9:57 p.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed VC warnings in windows.hpp.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/windows.hpp 
> 5cca853c4b090c4516ad4ec04c6e5830a4e7ccfa 
> 
> Diff: https://reviews.apache.org/r/52193/diff/
> 
> 
> Testing
> ---
> 
> Windows: build
> OSX: make check
> 
> 
> Thanks,
> 
> Daniel Pravat
> 
>



Re: Review Request 52192: Fixed VC warnings in bytes.hpp.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52192/#review150283
---



It's a bit odd that this function signature has a double.  We should 
investigate why and decided whether or not we should change the interface to 
`uint64_t`.

- Joseph Wu


On Sept. 22, 2016, 9:47 p.m., Daniel Pravat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52192/
> ---
> 
> (Updated Sept. 22, 2016, 9:47 p.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed VC warnings in bytes.hpp.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/bytes.hpp 
> 9debe2f5579b9b2d67933ecc2bfcc40c2730f7f1 
> 
> Diff: https://reviews.apache.org/r/52192/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Daniel Pravat
> 
>



Re: Review Request 52065: Fixed warnings in Windows `shell.hpp`.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52065/#review150277
---


Fix it, then Ship it!




I can fix this before committing.


3rdparty/stout/include/stout/os/windows/shell.hpp (lines 101 - 102)


s/valu/value/
s/was/is/
Surround `_P_WAIT` in backticks.

Ditto for the other comments.


- Joseph Wu


On Sept. 19, 2016, 1:04 p.m., Daniel Pravat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52065/
> ---
> 
> (Updated Sept. 19, 2016, 1:04 p.m.)
> 
> 
> Review request for mesos, Alex Naparu, Artem Harutyunyan, Alex Clemmer, 
> Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed warnings in Windows `shell.hpp`.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/os/windows/shell.hpp 
> 8fb48c1c4178485cac2934833b5c440d873e4fc4 
> 
> Diff: https://reviews.apache.org/r/52065/diff/
> 
> 
> Testing
> ---
> 
> Windows: build
> 
> 
> Thanks,
> 
> Daniel Pravat
> 
>



Re: Review Request 52048: Fixed OsTest.User test failure due to gids ordering.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52048/#review150279
---


Ship it!




Ship It!

- Joseph Wu


On Sept. 23, 2016, 4:59 p.m., Mao Geng wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52048/
> ---
> 
> (Updated Sept. 23, 2016, 4:59 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Kapil Arya, and Joseph Wu.
> 
> 
> Bugs: MESOS-5909
> https://issues.apache.org/jira/browse/MESOS-5909
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed OsTest.User test failure due to gids ordering.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/tests/os_tests.cpp c2900b8bf25b06a498c32c81bd2f8c852eb682f5 
> 
> Diff: https://reviews.apache.org/r/52048/diff/
> 
> 
> Testing
> ---
> 
> Just changed the OsTest.User test. `make check` passed on Ubuntu 12.04.5 LTS.
> 
> 
> Thanks,
> 
> Mao Geng
> 
>



Re: Review Request 51406: Updated launch helper to optionally spawn an 'init' process on linux.

2016-09-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51406/#review150266
---


Fix it, then Ship it!





src/slave/containerizer/mesos/launch.cpp (line 33)


No need for this as we already included `os.hpp`



src/slave/containerizer/mesos/launch.cpp (line 140)


Looks like sprintf is not async safe in glibc:

http://stackoverflow.com/questions/14573000/print-int-from-signal-handler-using-write-or-async-safe-functions

cerr << below is not async safe either.

I think it's highly unlikely that we'll hit the async safe problem here 
because most of the time, our code will be in syscall. So I suggest we simply 
this function and add a TODO about async safety (we try to be async safe, but 
if it's too hard, use a simple way):

```
Try write = os::write(
containerStatusFd.get(),
std::to_string(status));

if (write.isError()) {
  os::write(
  STDERR_FILENO,
  "Failed to write container status '" + statusString +
  "': " + ::strerror(errno));
}
```

I'd not do the `_exit` here because this helper is meant to be writting the 
status. Leave the `_exit` to the caller.



src/slave/containerizer/mesos/launch.cpp (line 303)


Not your, but what I am thinking here is that if we should 
`exitWithStatus(EXIT_FAILURE)` here or simply `exitWithSignal(SIGABRT)`.

The reason being: If we exit with EXIT_FAILURE, the user might think their 
task (or nested container) failed, rather than something wrong with our init 
process.

Let me know what you think. We don't need to do it now as this is the 
current semantics.


- Jie Yu


On Sept. 23, 2016, 8:59 p.m., Kevin Klues wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51406/
> ---
> 
> (Updated Sept. 23, 2016, 8:59 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Bugs: MESOS-6088
> https://issues.apache.org/jira/browse/MESOS-6088
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously the 'mesos-containerizer launch' binary would simply exec
> into the actual command we wanted to launch after doing some set of
> preperatory work. The problem with this approach, however, is that
> this gives us no opportunity to checkpoint the wait status of the
> command so the agent can recover it in cases where it is offline at
> the time the command completes.
> 
> To compensate for this, we now allow 'mesos-containerizer launch' to
> take an optional '--runtime_directory' argument, which indicates where
> to checkpint the status of the launched command (as returned by
> 'waitpid()'). In order to do this checkpointing, however, we cannot
> simply exec into the command anymore. Instead we now fork-exec the
> command and reap it once it completes. We then checkpoint its status
> and return it as our own. We call the original process the 'init'
> process of the container and the fork-exec'd process its child.  As a
> side effect of doing things this way, we also have to be careful to
> forward all signals sent to the init process down to the child.
> 
> In a subsequent commit we will update the mesos containerizer to use
> this new functionality.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/launch.hpp 
> 859990cb85e9e8c06400397256cfc512f0811800 
>   src/slave/containerizer/mesos/launch.cpp 
> 777121cffaa5fc76adfefc1e617409d997c7aac8 
> 
> Diff: https://reviews.apache.org/r/51406/diff/
> 
> 
> Testing
> ---
> 
> $ GTEST_FILTER="" make -j check
> $ src/mesos-tests
> $ sudo src/mesos-tests
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>



Re: Review Request 52231: Fixed driver based schedulers to ACK updates from HTTP executors.

2016-09-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52231/#review150278
---


Fix it, then Ship it!





src/slave/slave.cpp (lines 3622 - 3628)


How about checking that they're both set and they match the slave's id?

```
  if (!update.has_slave_id() || update.slave_id() != info.id()) {
// log warning, increment metric, return  
  }

  if (!update.status().has_slave_id() || update.status().slave_id() != 
info.id()) {
// log warning, increment metric, return
  }
```

As it stands it seems this check will allow the wrong slave id or unset 
slave ids?



src/tests/command_executor_tests.cpp (lines 276 - 283)


Is this part necessary?


- Benjamin Mahler


On Sept. 24, 2016, 1:07 a.m., Anand Mazumdar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52231/
> ---
> 
> (Updated Sept. 24, 2016, 1:07 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-6245
> https://issues.apache.org/jira/browse/MESOS-6245
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, driver based schedulers were not able to acknowledge
> status updates from HTTP based executors if they had explicit
> acknowledgements enabled. This was due to the fact that we
> were not populating `TaskStatus.slave_id` correctly if not
> set. It would also be great to validate `TaskStatus.slave_id`
> set by the executor and let them know if the value is incorrect.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.cpp 5db4be4bdc7b9a3a2a66a17f8a9ac74c8d3dfbf6 
>   src/slave/slave.cpp 11e9c8af87aa5153f72f2a20cc578fe3d729b153 
>   src/tests/command_executor_tests.cpp 
> 07e5eb4d7c2ace2b6714fbe02f29d41663152556 
> 
> Diff: https://reviews.apache.org/r/52231/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>



Re: Review Request 52105: Prevent `/redirect/foo` loop.

2016-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52105/#review150276
---



Patch looks great!

Reviews applied: [52105]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Sept. 23, 2016, 6:04 p.m., Charles Allen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52105/
> ---
> 
> (Updated Sept. 23, 2016, 6:04 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6210
> https://issues.apache.org/jira/browse/MESOS-6210
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Prevent `/redirect/foo` loop.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 9005e7c308d5f57c6f5c573951d468a3ba730740 
> 
> Diff: https://reviews.apache.org/r/52105/diff/
> 
> 
> Testing
> ---
> 
> Single Node started with `./src/mesos-master --work_dir=/tmp/mesos`
> 
> 
> ```
> Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Fri, 23 Sep 2016 17:30:17 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect/
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect/ HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Fri, 23 Sep 2016 17:30:19 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/redirect/foo
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect/foo HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Fri, 23 Sep 2016 17:30:21 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/redirect_foo
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect_foo HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Fri, 23 Sep 2016 17:30:23 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/ HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Fri, 23 Sep 2016 17:30:49 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Fri, 23 Sep 2016 17:30:50 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect_foo
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect_foo HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Fri, 23 Sep 2016 17:30:53 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/foo
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/foo HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Fri, 23 Sep 2016 17:30:55 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> ```
> 
> It is worth noting that in this PR /master/redirect/  and /redirect/ return 
> 404, not 307. This behavior is consistent with master.
> 
> 
> Thanks,
> 
> Charles Allen
> 
>



Review Request 52231: Fixed driver based schedulers to ACK updates from HTTP executors.

2016-09-23 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52231/
---

Review request for mesos and Benjamin Mahler.


Bugs: MESOS-6245
https://issues.apache.org/jira/browse/MESOS-6245


Repository: mesos


Description
---

Previously, driver based schedulers were not able to acknowledge
status updates from HTTP based executors if they had explicit
acknowledgements enabled. This was due to the fact that we
were not populating `TaskStatus.slave_id` correctly if not
set. It would also be great to validate `TaskStatus.slave_id`
set by the executor and let them know if the value is incorrect.


Diffs
-

  src/common/protobuf_utils.cpp 5db4be4bdc7b9a3a2a66a17f8a9ac74c8d3dfbf6 
  src/slave/slave.cpp 11e9c8af87aa5153f72f2a20cc578fe3d729b153 
  src/tests/command_executor_tests.cpp 07e5eb4d7c2ace2b6714fbe02f29d41663152556 

Diff: https://reviews.apache.org/r/52231/diff/


Testing
---

make check


Thanks,

Anand Mazumdar



Re: Review Request 52180: Added ExitedEvents for links whose sockets fail on creation.

2016-09-23 Thread Joseph Wu


> On Sept. 23, 2016, 4:06 p.m., Benjamin Mahler wrote:
> > 3rdparty/libprocess/src/tests/process_tests.cpp, line 972
> > 
> >
> > underscores?

Oops, looks like this test file uses camelCase, even though the libprocess 
style prefers snake_case...

I should do `s/fd_hogs/fdHogs/` to be consistent.


> On Sept. 23, 2016, 4:06 p.m., Benjamin Mahler wrote:
> > 3rdparty/libprocess/src/tests/process_tests.cpp, lines 959-969
> > 
> >
> > Can this be moved down?

Yes.  This is only needed directly prior to the `.relink()` command.


- Joseph


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52180/#review150256
---


On Sept. 23, 2016, 6:04 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52180/
> ---
> 
> (Updated Sept. 23, 2016, 6:04 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Artem Harutyunyan, and Joris Van 
> Remoortere.
> 
> 
> Bugs: MESOS-6246
> https://issues.apache.org/jira/browse/MESOS-6246
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When we try to link to a remote process, we will send the linkee an
> `ExitedEvent` when the link is broken or if the connection fails.
> This patch adds an `ExitedEvent` when the socket creation step fails.
> This is logically equivalent to having the connection step fail.
> 
> Because this is an entirely unexpected case, the log level should be
> a WARNING or higher.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/process.cpp 
> 02a192529e53479d5a163fa6a20873674b51ee2c 
>   3rdparty/libprocess/src/tests/process_tests.cpp 
> b9feec7e34cffe19e49035f8865b150f79258f54 
> 
> Diff: https://reviews.apache.org/r/52180/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 3rdparty/libprocess/libprocess-tests 
> --gtest_filter="ProcessRemoteLinkTest.RemoteLinkLeak" --gtest_repeat=5000 
> --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 52181: Added synchronization in link logic to prevent relinking races.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52181/
---

(Updated Sept. 23, 2016, 6:04 p.m.)


Review request for mesos, Benjamin Mahler, Artem Harutyunyan, and Joris Van 
Remoortere.


Changes
---

Moved test into the "Testing Done" as it has some repeatability issues.


Summary (updated)
-

Added synchronization in link logic to prevent relinking races.


Bugs: MESOS-6234
https://issues.apache.org/jira/browse/MESOS-6234


Repository: mesos


Description
---

There is a general pattern in the `SocketManager` in which most methods
will grab the mutex and then check if the socket to manage exists in
the `SocketManager`s mapping.  If the socket does not exist, the
`SocketManager` silently returns.

This adds similar logic inside two critical sections of the `link`
codepath.  If there are multiple calls to `link` in-flight at once,
this prevents sockets from being leaked into unmanaged callback loops.


Diffs (updated)
-

  3rdparty/libprocess/src/process.cpp 02a192529e53479d5a163fa6a20873674b51ee2c 

Diff: https://reviews.apache.org/r/52181/diff/


Testing (updated)
---

make check

---

This test is not included in the review diff because it cannot be run in a loop 
(i.e. `--gtest_repeat`).  The system will run out of emphemeral sockets after 
~80 iterations and the test will begin to fail due to failing socket 
connections.

3rdparty/libprocess/libprocess-tests 
--gtest_filter="ProcessRemoteLinkTest.RemoteRelinkLoop" --gtest_repeat=50 
--gtest_break_on_failure

```
// Verifies that repeated relinking of a remote process
// maintains monitoring of the remote process.
TEST_F(ProcessRemoteLinkTest, RemoteRelinkLoop)
{
  RemoteLinkTestProcess relinker(pid);
  Future relinkerExitedPid;

  EXPECT_CALL(relinker, exited(pid))
.WillOnce(FutureArg<0>(&relinkerExitedPid));

  spawn(relinker);

  // NOTE: The remote process we are linking to never closes sockets.
  // Therefore, we can only make a limited number of connections before
  // the remote process runs out of file descriptors.
  for (int i = 0; i < 200; i++) {
relinker.relink();
  }

  os::killtree(linkee->pid(), SIGKILL);
  reap_linkee();
  linkee = None();

  AWAIT_ASSERT_EQ(pid, relinkerExitedPid);

  terminate(relinker);
  wait(relinker);
}
```

---

In order to test the SSL-downgrade path, you need to tweak the way we launch 
`test-linkee` in `ProcessRemoteLinkTest::SetUp`:
```
std::map empty_environment;
Try s = process::subprocess(
path::join(BUILD_DIR, "test-linkee") +
  " '" + stringify(coordinator.self()) + "'",
process::Subprocess::FD(STDIN_FILENO),
process::Subprocess::FD(STDOUT_FILENO),
process::Subprocess::FD(STDERR_FILENO),
empty_environment);
```

LIBPROCESS_SSL_ENABLED=true LIBPROCESS_SSL_SUPPORT_DOWNGRADE=true 
LIBPROCESS_SSL_KEY_FILE=key.pem LIBPROCESS_SSL_CERT_FILE=cert.pem 
3rdparty/libprocess/libprocess-tests 
--gtest_filter="ProcessRemoteLinkTest.RemoteRelinkLoop" --gtest_repeat=50 
--gtest_break_on_failure


Thanks,

Joseph Wu



Re: Review Request 52182: Prevented a race when relinking with SSL downgrade enabled.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52182/
---

(Updated Sept. 23, 2016, 6:04 p.m.)


Review request for mesos, Benjamin Mahler, Artem Harutyunyan, and Joris Van 
Remoortere.


Changes
---

Reword comment.


Bugs: MESOS-6233
https://issues.apache.org/jira/browse/MESOS-6233


Repository: mesos


Description
---

If there are two or more actors, on the same OS process, that try to
relink at the same time, there is a potential CHECK failure when SSL
downgrade is also enabled.  The following interleaving is problematic:
| Actor A | Actor B |
|-+-|
|Starts to relink.|Starts to relink.|
|Creates socket A.| |
|   Replaces link with Socket A.  | |
| Tries to connect with Socket A. | |
| |Creates socket B.|
|  Connection fails due to SSL.   |   Replaces link with Socket B.  |
| Attempts to downgrade.  | |
| Tries to replace link.  | |
The last step in the interleaving fails because we assert that the
socket we are swapping out exists in the `SocketManager`s state.


Diffs (updated)
-

  3rdparty/libprocess/src/process.cpp 02a192529e53479d5a163fa6a20873674b51ee2c 

Diff: https://reviews.apache.org/r/52182/diff/


Testing
---

See next review.


Thanks,

Joseph Wu



Re: Review Request 52180: Added ExitedEvents for links whose sockets fail on creation.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52180/
---

(Updated Sept. 23, 2016, 6:04 p.m.)


Review request for mesos, Benjamin Mahler, Artem Harutyunyan, and Joris Van 
Remoortere.


Changes
---

Renamed `fd_hogs` and moved that code down a bit.


Bugs: MESOS-6246
https://issues.apache.org/jira/browse/MESOS-6246


Repository: mesos


Description
---

When we try to link to a remote process, we will send the linkee an
`ExitedEvent` when the link is broken or if the connection fails.
This patch adds an `ExitedEvent` when the socket creation step fails.
This is logically equivalent to having the connection step fail.

Because this is an entirely unexpected case, the log level should be
a WARNING or higher.


Diffs (updated)
-

  3rdparty/libprocess/src/process.cpp 02a192529e53479d5a163fa6a20873674b51ee2c 
  3rdparty/libprocess/src/tests/process_tests.cpp 
b9feec7e34cffe19e49035f8865b150f79258f54 

Diff: https://reviews.apache.org/r/52180/diff/


Testing
---

make check

3rdparty/libprocess/libprocess-tests 
--gtest_filter="ProcessRemoteLinkTest.RemoteLinkLeak" --gtest_repeat=5000 
--gtest_break_on_failure


Thanks,

Joseph Wu



Re: Review Request 52182: Prevented a race when relinking with SSL downgrade enabled.

2016-09-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52182/#review150272
---


Ship it!





3rdparty/libprocess/src/process.cpp (lines 1466 - 1467)


Something like?

In this case, we simply stop here and allow the latest created socket to 
continue.

Defer might suggest process::defer to some.


- Benjamin Mahler


On Sept. 23, 2016, 9:49 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52182/
> ---
> 
> (Updated Sept. 23, 2016, 9:49 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Artem Harutyunyan, and Joris Van 
> Remoortere.
> 
> 
> Bugs: MESOS-6233
> https://issues.apache.org/jira/browse/MESOS-6233
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> If there are two or more actors, on the same OS process, that try to
> relink at the same time, there is a potential CHECK failure when SSL
> downgrade is also enabled.  The following interleaving is problematic:
> | Actor A | Actor B |
> |-+-|
> |Starts to relink.|Starts to relink.|
> |Creates socket A.| |
> |   Replaces link with Socket A.  | |
> | Tries to connect with Socket A. | |
> | |Creates socket B.|
> |  Connection fails due to SSL.   |   Replaces link with Socket B.  |
> | Attempts to downgrade.  | |
> | Tries to replace link.  | |
> The last step in the interleaving fails because we assert that the
> socket we are swapping out exists in the `SocketManager`s state.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/process.cpp 
> 02a192529e53479d5a163fa6a20873674b51ee2c 
> 
> Diff: https://reviews.apache.org/r/52182/diff/
> 
> 
> Testing
> ---
> 
> See next review.
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 52048: Fixed OsTest.User test failure due to gids ordering.

2016-09-23 Thread Mao Geng


> On Sept. 22, 2016, 1:47 a.m., Joseph Wu wrote:
> > 3rdparty/stout/tests/os_tests.cpp, line 691
> > 
> >
> > Let's call this variable `expected_gids`.

Done


> On Sept. 22, 2016, 1:47 a.m., Joseph Wu wrote:
> > 3rdparty/stout/tests/os_tests.cpp, line 695
> > 
> >
> > We don't actually include `std::sort` in the headers, so this shouldn't 
> > be able to compile...
> > 
> > Either way, you should fully qualify this function.
> 
> Mao Geng wrote:
> Agree it should be fully qualified. However, last diff didn't fail to 
> compile - I suspect a header file with "using namespace std" is included by 
> accident.

Done


> On Sept. 22, 2016, 1:47 a.m., Joseph Wu wrote:
> > 3rdparty/stout/tests/os_tests.cpp, line 700
> > 
> >
> > No need to clear this vector, especially since it won't be re-used.

Done


- Mao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52048/#review149936
---


On Sept. 23, 2016, 11:59 p.m., Mao Geng wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52048/
> ---
> 
> (Updated Sept. 23, 2016, 11:59 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Kapil Arya, and Joseph Wu.
> 
> 
> Bugs: MESOS-5909
> https://issues.apache.org/jira/browse/MESOS-5909
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed OsTest.User test failure due to gids ordering.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/tests/os_tests.cpp c2900b8bf25b06a498c32c81bd2f8c852eb682f5 
> 
> Diff: https://reviews.apache.org/r/52048/diff/
> 
> 
> Testing
> ---
> 
> Just changed the OsTest.User test. `make check` passed on Ubuntu 12.04.5 LTS.
> 
> 
> Thanks,
> 
> Mao Geng
> 
>



Re: Review Request 52225: Added missing WSTOPSIG macro to windows.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52225/#review150269
---


Ship it!




If this breaks anything on Windows, I'll take responsibility for fixing it :)

- Joseph Wu


On Sept. 23, 2016, 1:41 p.m., Kevin Klues wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52225/
> ---
> 
> (Updated Sept. 23, 2016, 1:41 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> These macros are still not "correct", in that they don't accurately
> represent the completion status of a process on windows, but they are
> now defined so we can fill them in with the proper implementations
> later.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/windows.hpp 
> 5cca853c4b090c4516ad4ec04c6e5830a4e7ccfa 
> 
> Diff: https://reviews.apache.org/r/52225/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>



Re: Review Request 52048: Fixed OsTest.User test failure due to gids ordering.

2016-09-23 Thread Mao Geng


> On Sept. 23, 2016, 4:43 p.m., haosdent huang wrote:
> > 3rdparty/stout/tests/os_tests.cpp, lines 697-699
> > 
> >
> > How about `EXPECT_EQ(tokens.get(), expected_gids);` here?
> 
> Mao Geng wrote:
> Not sure if EXPECT_EQ can compare two vectors directly? 
> I saw 
> https://github.com/google/googletest/blob/master/googletest/docs/Primer.md 
> has an example:  
> ```
> for (int i = 0; i < x.size(); ++i) {
>   EXPECT_EQ(x[i], y[i]) << "Vectors x and y differ at index " << i;
> }
> ```
> 
> Mao Geng wrote:
> Just tested EXPECT_EQ can compare two vectors perfectly. Thanks @haosdent 
> for pointing it out. I am updating the diff

Done


- Mao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52048/#review150202
---


On Sept. 23, 2016, 11:59 p.m., Mao Geng wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52048/
> ---
> 
> (Updated Sept. 23, 2016, 11:59 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Kapil Arya, and Joseph Wu.
> 
> 
> Bugs: MESOS-5909
> https://issues.apache.org/jira/browse/MESOS-5909
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed OsTest.User test failure due to gids ordering.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/tests/os_tests.cpp c2900b8bf25b06a498c32c81bd2f8c852eb682f5 
> 
> Diff: https://reviews.apache.org/r/52048/diff/
> 
> 
> Testing
> ---
> 
> Just changed the OsTest.User test. `make check` passed on Ubuntu 12.04.5 LTS.
> 
> 
> Thanks,
> 
> Mao Geng
> 
>



Re: Review Request 52224: Updated check for instantiating the "windows" launcher.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52224/#review150268
---


Ship it!




LGTM!

- Joseph Wu


On Sept. 23, 2016, 1:40 p.m., Kevin Klues wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52224/
> ---
> 
> (Updated Sept. 23, 2016, 1:40 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we allowed the posix launcher to specified on windows and
> just blindly ignored it and instantiated a windows launcher anyway.
> This was likely due to legacy code, not setting the default launcher
> properly in flags.cpp. Now that we set the default to "windows"
> properly, this hack is no longer needed.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 144b0db501d40d4e0bba12672723616bedd76e7e 
> 
> Diff: https://reviews.apache.org/r/52224/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>



Re: Review Request 52048: Fixed OsTest.User test failure due to gids ordering.

2016-09-23 Thread Mao Geng

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52048/
---

(Updated Sept. 23, 2016, 11:59 p.m.)


Review request for mesos, Gilbert Song, Kapil Arya, and Joseph Wu.


Bugs: MESOS-5909
https://issues.apache.org/jira/browse/MESOS-5909


Repository: mesos


Description
---

Fixed OsTest.User test failure due to gids ordering.


Diffs (updated)
-

  3rdparty/stout/tests/os_tests.cpp c2900b8bf25b06a498c32c81bd2f8c852eb682f5 

Diff: https://reviews.apache.org/r/52048/diff/


Testing
---

Just changed the OsTest.User test. `make check` passed on Ubuntu 12.04.5 LTS.


Thanks,

Mao Geng



Re: Review Request 52012: Reverted "Added extra functions to the 'Launcher' abstraction.".

2016-09-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52012/#review150265
---


Ship it!




Ship It!

- Jie Yu


On Sept. 23, 2016, 8:59 p.m., Kevin Klues wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52012/
> ---
> 
> (Updated Sept. 23, 2016, 8:59 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Gilbert Song, and Jie Yu.
> 
> 
> Bugs: MESOS-6204
> https://issues.apache.org/jira/browse/MESOS-6204
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This reverts commit bb047cd72936aa1836b6e959127a572c72fb824b.
> 
> We decided to revert this commit in favor of keeping all container
> checkpointing information in the containerizer. A subsequent commit
> will reintroduce this functionality there.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/launcher.hpp 
> 6d9fb8dacb8f697b48d3088fabe2cff321bb31c1 
>   src/slave/containerizer/mesos/launcher.cpp 
> 2155e7a1d94368ba18efbd77fc0c1c57010df17f 
>   src/slave/containerizer/mesos/linux_launcher.hpp 
> 2cd1e4d4b6d9e656c2447a9d3ab8e1f49ba5fffa 
>   src/slave/containerizer/mesos/linux_launcher.cpp 
> f27757032cc28be0f0067bed045536431dde4d6c 
>   src/tests/containerizer/launcher.hpp 
> a149220a41f12d11b774660381409105f8d70512 
> 
> Diff: https://reviews.apache.org/r/52012/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>



Re: Review Request 52013: Added helpers for maintaining files in a container runtime directory.

2016-09-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52013/#review150264
---


Fix it, then Ship it!





src/slave/containerizer/mesos/paths.hpp (line 17)


`__MESOS_CONTAINERIZER_PATHS_HPP__`



src/slave/containerizer/mesos/paths.hpp (line 33)


2 lines apart



src/slave/containerizer/mesos/paths.hpp (line 63)


camelCase here.


- Jie Yu


On Sept. 23, 2016, 8:51 p.m., Kevin Klues wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52013/
> ---
> 
> (Updated Sept. 23, 2016, 8:51 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Gilbert Song, and Jie Yu.
> 
> 
> Bugs: MESOS-6204
> https://issues.apache.org/jira/browse/MESOS-6204
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added helpers for maintaining files in a container runtime directory.
> 
> 
> Diffs
> -
> 
>   src/CMakeLists.txt 42c52b60cc850901f2eff1545cf7900f4a65ca81 
>   src/Makefile.am bfdb66a6969a35660d545210c1c6951926117ef3 
>   src/slave/containerizer/mesos/paths.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/paths.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/52013/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>



Re: Review Request 52225: Added missing WSTOPSIG macro to windows.

2016-09-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52225/#review150263
---


Ship it!




Ship It!

- Jie Yu


On Sept. 23, 2016, 8:41 p.m., Kevin Klues wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52225/
> ---
> 
> (Updated Sept. 23, 2016, 8:41 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> These macros are still not "correct", in that they don't accurately
> represent the completion status of a process on windows, but they are
> now defined so we can fill them in with the proper implementations
> later.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/windows.hpp 
> 5cca853c4b090c4516ad4ec04c6e5830a4e7ccfa 
> 
> Diff: https://reviews.apache.org/r/52225/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>



Re: Review Request 52224: Updated check for instantiating the "windows" launcher.

2016-09-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52224/#review150261
---


Ship it!




Ship It!

- Jie Yu


On Sept. 23, 2016, 8:40 p.m., Kevin Klues wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52224/
> ---
> 
> (Updated Sept. 23, 2016, 8:40 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we allowed the posix launcher to specified on windows and
> just blindly ignored it and instantiated a windows launcher anyway.
> This was likely due to legacy code, not setting the default launcher
> properly in flags.cpp. Now that we set the default to "windows"
> properly, this hack is no longer needed.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 144b0db501d40d4e0bba12672723616bedd76e7e 
> 
> Diff: https://reviews.apache.org/r/52224/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>



Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-23 Thread Jacob Janco


> On Sept. 23, 2016, 2:40 a.m., Guangya Liu wrote:
> > It is really weired that the performance of 
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7`
> >  does not improve much when calling `addSlave`, need check more for why 
> > `addSlave` was same? Without fix, the `addSlave` will call `allocate` for 
> > each agent, but with the fix, only one `allocate` will be called
> > 
> > ```
> > without fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 122268us
> > Added 1000 agents in 42.037104secs
> > 
> > With fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 116107us
> > Added 1000 agents in 41.615396secs
> > ```
> 
> Guangya Liu wrote:
> Jacob, I did more test with the code on Aug 23, at which I posted some 
> result in this RR, and found that the test result is different, I did 
> following to get Aug 23 code.
> 
> ```
> LiuGuangyas-MacBook-Pro:build gyliu$ git checkout 
> 2f78a440ef4201c5b11fb92c225694e84a60369c
> 
> LiuGuangyas-MacBook-Pro:build gyliu$ git log -1
> commit 2f78a440ef4201c5b11fb92c225694e84a60369c
> Author: Gilbert Song 
> Date:   Mon Aug 22 13:00:58 2016 -0700
> 
> Fixed potential flakiness in ROOT_RecoverOrphanedPersistentVolume.
> 
> Review: https://reviews.apache.org/r/51271/
> ```
> 
> The test result seems still same as now (without your patch and the code 
> is get from Aug 23):
> 
> ```
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> Using 1000 agents and 6000 frameworks
> Added 6000 frameworks in 144272us
> Added 1000 agents in 43.107001secs
> ```
> 
> But anyway, I think that we need find out why the performance for 
> `addSlave` was not improved based on your patch.
> 
> Jacob Janco wrote:
> Yes agreed, per our Slack discussions, I'll look into this. Thanks for 
> posting the followup.
> 
> Benjamin Mahler wrote:
> `addSlave()` is asynchronous and we do not wait for all of the 
> `addSlave()` futures to complete, so any speedup in `addSlave()` will only 
> affect the next caller that waits for a result from the allocator.
> 
> Benjamin Mahler wrote:
> Ah I missed that we do a `Clock::settle()`, nevermind :)
> 
> Guangya Liu wrote:
> Some thinking for why `addSlave` does not improve much...
> 
> Without Jacob's patch, the logic woule be:
> 
> ```
> addSlave -> allocate the single slave
> addSlave -> allocate the single slave
> addSlave -> allocate the single slave
> ...
> addSlave -> allocate the single slave
> ```
> 
> With Jacob's patch, the logic would be:
> 
> ```
> addSlave
> addSlave
> addSlave
> ...
> addSlave - > allocate for **all** of the slaves
> ```
> 
> The time elapsed by `allocate a single slave N times` with `allocate N 
> slaves in one allocate` request should not different much, the only 
> difference is one is looping the event queue while another is looping in 
> allocator, that's why there are not enough performance change for this.
> 
> But this will impact a lot when adding frameworks or some other events in 
> allocator which will call `allocate(slaves)`, one proposal is we may need to 
> add some new benchmark test cases which do the following logic, the following 
> logic will trigger each `addframework` operation call `allocate(slaves)` 
> without Jacob's patch, but will only call `allocate(slaves)` one time with 
> Jacob's patch.
> 
> ```
> 1) Add slaves first
> 2) Add frameworks
> ```
> 
> We may get some performance improvement with above case.
> 
> Currently, all of the benchmark test are using 
> 
> ```
> 1) Add frameworks
> 2) Add agents
> ```
> 
> That's why not much performance improvement...

This makes sense Guangya, I'm in the process of creating a minimal benchmark 
adding a set of slaves then adding frameworks. I'll post here if the results 
are interesting.


- Jacob


---
This is an automatically generated 

Re: Review Request 52048: Fixed OsTest.User test failure due to gids ordering.

2016-09-23 Thread Mao Geng

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52048/
---

(Updated Sept. 23, 2016, 11:43 p.m.)


Review request for mesos, Gilbert Song, Kapil Arya, and Joseph Wu.


Bugs: MESOS-5909
https://issues.apache.org/jira/browse/MESOS-5909


Repository: mesos


Description
---

Fixed OsTest.User test failure due to gids ordering.


Diffs (updated)
-

  3rdparty/stout/tests/os_tests.cpp c2900b8 

Diff: https://reviews.apache.org/r/52048/diff/


Testing
---

Just changed the OsTest.User test. `make check` passed on Ubuntu 12.04.5 LTS.


Thanks,

Mao Geng



Re: Review Request 52048: Fixed OsTest.User test failure due to gids ordering.

2016-09-23 Thread Mao Geng


> On Sept. 23, 2016, 4:43 p.m., haosdent huang wrote:
> > 3rdparty/stout/tests/os_tests.cpp, lines 697-699
> > 
> >
> > How about `EXPECT_EQ(tokens.get(), expected_gids);` here?
> 
> Mao Geng wrote:
> Not sure if EXPECT_EQ can compare two vectors directly? 
> I saw 
> https://github.com/google/googletest/blob/master/googletest/docs/Primer.md 
> has an example:  
> ```
> for (int i = 0; i < x.size(); ++i) {
>   EXPECT_EQ(x[i], y[i]) << "Vectors x and y differ at index " << i;
> }
> ```

Just tested EXPECT_EQ can compare two vectors perfectly. Thanks @haosdent for 
pointing it out. I am updating the diff


- Mao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52048/#review150202
---


On Sept. 22, 2016, 7:29 a.m., Mao Geng wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52048/
> ---
> 
> (Updated Sept. 22, 2016, 7:29 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Kapil Arya, and Joseph Wu.
> 
> 
> Bugs: MESOS-5909
> https://issues.apache.org/jira/browse/MESOS-5909
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed OsTest.User test failure due to gids ordering.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/tests/os_tests.cpp c2900b8 
> 
> Diff: https://reviews.apache.org/r/52048/diff/
> 
> 
> Testing
> ---
> 
> Just changed the OsTest.User test. `make check` passed on Ubuntu 12.04.5 LTS.
> 
> 
> Thanks,
> 
> Mao Geng
> 
>



Re: Review Request 52180: Added ExitedEvents for links whose sockets fail on creation.

2016-09-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52180/#review150256
---


Ship it!




Seems like the wrong ticket to link to, mind creating a new one for the lack of 
ExitedEvent during socket creation error?


3rdparty/libprocess/src/tests/process_tests.cpp (lines 959 - 969)


Can this be moved down?



3rdparty/libprocess/src/tests/process_tests.cpp (line 972)


underscores?


- Benjamin Mahler


On Sept. 23, 2016, 9:49 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52180/
> ---
> 
> (Updated Sept. 23, 2016, 9:49 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Artem Harutyunyan, and Joris Van 
> Remoortere.
> 
> 
> Bugs: MESOS-6234
> https://issues.apache.org/jira/browse/MESOS-6234
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When we try to link to a remote process, we will send the linkee an
> `ExitedEvent` when the link is broken or if the connection fails.
> This patch adds an `ExitedEvent` when the socket creation step fails.
> This is logically equivalent to having the connection step fail.
> 
> Because this is an entirely unexpected case, the log level should be
> a WARNING or higher.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/process.cpp 
> 02a192529e53479d5a163fa6a20873674b51ee2c 
>   3rdparty/libprocess/src/tests/process_tests.cpp 
> b9feec7e34cffe19e49035f8865b150f79258f54 
> 
> Diff: https://reviews.apache.org/r/52180/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 3rdparty/libprocess/libprocess-tests 
> --gtest_filter="ProcessRemoteLinkTest.RemoteLinkLeak" --gtest_repeat=5000 
> --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-23 Thread Guangya Liu


> On 九月 23, 2016, 2:40 a.m., Guangya Liu wrote:
> > It is really weired that the performance of 
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7`
> >  does not improve much when calling `addSlave`, need check more for why 
> > `addSlave` was same? Without fix, the `addSlave` will call `allocate` for 
> > each agent, but with the fix, only one `allocate` will be called
> > 
> > ```
> > without fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 122268us
> > Added 1000 agents in 42.037104secs
> > 
> > With fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 116107us
> > Added 1000 agents in 41.615396secs
> > ```
> 
> Guangya Liu wrote:
> Jacob, I did more test with the code on Aug 23, at which I posted some 
> result in this RR, and found that the test result is different, I did 
> following to get Aug 23 code.
> 
> ```
> LiuGuangyas-MacBook-Pro:build gyliu$ git checkout 
> 2f78a440ef4201c5b11fb92c225694e84a60369c
> 
> LiuGuangyas-MacBook-Pro:build gyliu$ git log -1
> commit 2f78a440ef4201c5b11fb92c225694e84a60369c
> Author: Gilbert Song 
> Date:   Mon Aug 22 13:00:58 2016 -0700
> 
> Fixed potential flakiness in ROOT_RecoverOrphanedPersistentVolume.
> 
> Review: https://reviews.apache.org/r/51271/
> ```
> 
> The test result seems still same as now (without your patch and the code 
> is get from Aug 23):
> 
> ```
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> Using 1000 agents and 6000 frameworks
> Added 6000 frameworks in 144272us
> Added 1000 agents in 43.107001secs
> ```
> 
> But anyway, I think that we need find out why the performance for 
> `addSlave` was not improved based on your patch.
> 
> Jacob Janco wrote:
> Yes agreed, per our Slack discussions, I'll look into this. Thanks for 
> posting the followup.
> 
> Benjamin Mahler wrote:
> `addSlave()` is asynchronous and we do not wait for all of the 
> `addSlave()` futures to complete, so any speedup in `addSlave()` will only 
> affect the next caller that waits for a result from the allocator.
> 
> Benjamin Mahler wrote:
> Ah I missed that we do a `Clock::settle()`, nevermind :)

Some thinking for why `addSlave` does not improve much...

Without Jacob's patch, the logic woule be:

```
addSlave -> allocate the single slave
addSlave -> allocate the single slave
addSlave -> allocate the single slave
...
addSlave -> allocate the single slave
```

With Jacob's patch, the logic would be:

```
addSlave
addSlave
addSlave
...
addSlave - > allocate for **all** of the slaves
```

The time elapsed by `allocate a single slave N times` with `allocate N slaves 
in one allocate` request should not different much, the only difference is one 
is looping the event queue while another is looping in allocator, that's why 
there are not enough performance change for this.

But this will impact a lot when adding frameworks or some other events in 
allocator which will call `allocate(slaves)`, one proposal is we may need to 
add some new benchmark test cases which do the following logic, the following 
logic will trigger each `addframework` operation call `allocate(slaves)` 
without Jacob's patch, but will only call `allocate(slaves)` one time with 
Jacob's patch.

```
1) Add slaves first
2) Add frameworks
```

We may get some performance improvement with above case.

Currently, all of the benchmark test are using 

```
1) Add frameworks
2) Add agents
```

That's why not much performance improvement...


- Guangya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review150123
---


On 九月 23, 2016, 4:32 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> ---
> 
> (Updated 九月 23, 2016, 4:32 p.m.)
> 
> 
> Revie

Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-23 Thread Benjamin Mahler


> On Sept. 23, 2016, 2:40 a.m., Guangya Liu wrote:
> > It is really weired that the performance of 
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7`
> >  does not improve much when calling `addSlave`, need check more for why 
> > `addSlave` was same? Without fix, the `addSlave` will call `allocate` for 
> > each agent, but with the fix, only one `allocate` will be called
> > 
> > ```
> > without fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 122268us
> > Added 1000 agents in 42.037104secs
> > 
> > With fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 116107us
> > Added 1000 agents in 41.615396secs
> > ```
> 
> Guangya Liu wrote:
> Jacob, I did more test with the code on Aug 23, at which I posted some 
> result in this RR, and found that the test result is different, I did 
> following to get Aug 23 code.
> 
> ```
> LiuGuangyas-MacBook-Pro:build gyliu$ git checkout 
> 2f78a440ef4201c5b11fb92c225694e84a60369c
> 
> LiuGuangyas-MacBook-Pro:build gyliu$ git log -1
> commit 2f78a440ef4201c5b11fb92c225694e84a60369c
> Author: Gilbert Song 
> Date:   Mon Aug 22 13:00:58 2016 -0700
> 
> Fixed potential flakiness in ROOT_RecoverOrphanedPersistentVolume.
> 
> Review: https://reviews.apache.org/r/51271/
> ```
> 
> The test result seems still same as now (without your patch and the code 
> is get from Aug 23):
> 
> ```
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> Using 1000 agents and 6000 frameworks
> Added 6000 frameworks in 144272us
> Added 1000 agents in 43.107001secs
> ```
> 
> But anyway, I think that we need find out why the performance for 
> `addSlave` was not improved based on your patch.
> 
> Jacob Janco wrote:
> Yes agreed, per our Slack discussions, I'll look into this. Thanks for 
> posting the followup.
> 
> Benjamin Mahler wrote:
> `addSlave()` is asynchronous and we do not wait for all of the 
> `addSlave()` futures to complete, so any speedup in `addSlave()` will only 
> affect the next caller that waits for a result from the allocator.

Ah I missed that we do a `Clock::settle()`, nevermind :)


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review150123
---


On Sept. 23, 2016, 4:32 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> ---
> 
> (Updated Sept. 23, 2016, 4:32 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus 
> Ma, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> 2c31471ee0f5d6836393bf87ff9ecfd8df835013 
>   src/master/allocator/mesos/hierarchical.cpp 
> 2d56bd011f2c87c67a02d0ae467a4a537d36867e 
> 
> Diff: https://reviews.apache.org/r/51027/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> note: check without filters depends on https://reviews.apache.org/r/51028
> 
> With new benchmark https://reviews.apache.org/r/49617: 
> Sample output without 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 1 agents in 3.2134535333mins
> allocator settled after  1.6123603833mins
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (290578 ms)
> 
> Sample output with 51027:
> [ RUN  ] 
> S

Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-23 Thread Benjamin Mahler


> On Sept. 23, 2016, 2:40 a.m., Guangya Liu wrote:
> > It is really weired that the performance of 
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7`
> >  does not improve much when calling `addSlave`, need check more for why 
> > `addSlave` was same? Without fix, the `addSlave` will call `allocate` for 
> > each agent, but with the fix, only one `allocate` will be called
> > 
> > ```
> > without fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 122268us
> > Added 1000 agents in 42.037104secs
> > 
> > With fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 116107us
> > Added 1000 agents in 41.615396secs
> > ```
> 
> Guangya Liu wrote:
> Jacob, I did more test with the code on Aug 23, at which I posted some 
> result in this RR, and found that the test result is different, I did 
> following to get Aug 23 code.
> 
> ```
> LiuGuangyas-MacBook-Pro:build gyliu$ git checkout 
> 2f78a440ef4201c5b11fb92c225694e84a60369c
> 
> LiuGuangyas-MacBook-Pro:build gyliu$ git log -1
> commit 2f78a440ef4201c5b11fb92c225694e84a60369c
> Author: Gilbert Song 
> Date:   Mon Aug 22 13:00:58 2016 -0700
> 
> Fixed potential flakiness in ROOT_RecoverOrphanedPersistentVolume.
> 
> Review: https://reviews.apache.org/r/51271/
> ```
> 
> The test result seems still same as now (without your patch and the code 
> is get from Aug 23):
> 
> ```
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> Using 1000 agents and 6000 frameworks
> Added 6000 frameworks in 144272us
> Added 1000 agents in 43.107001secs
> ```
> 
> But anyway, I think that we need find out why the performance for 
> `addSlave` was not improved based on your patch.
> 
> Jacob Janco wrote:
> Yes agreed, per our Slack discussions, I'll look into this. Thanks for 
> posting the followup.

`addSlave()` is asynchronous and we do not wait for all of the `addSlave()` 
futures to complete, so any speedup in `addSlave()` will only affect the next 
caller that waits for a result from the allocator.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review150123
---


On Sept. 23, 2016, 4:32 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> ---
> 
> (Updated Sept. 23, 2016, 4:32 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus 
> Ma, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> 2c31471ee0f5d6836393bf87ff9ecfd8df835013 
>   src/master/allocator/mesos/hierarchical.cpp 
> 2d56bd011f2c87c67a02d0ae467a4a537d36867e 
> 
> Diff: https://reviews.apache.org/r/51027/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> note: check without filters depends on https://reviews.apache.org/r/51028
> 
> With new benchmark https://reviews.apache.org/r/49617: 
> Sample output without 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 1 agents in 3.2134535333mins
> allocator settled after  1.6123603833mins
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (290578 ms)
> 
> Sample output with 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 a

Re: Review Request 52056: Exposed unknown container case from Containerizer::destroy.

2016-09-23 Thread Guangya Liu


> On 九月 20, 2016, 1:27 p.m., Guangya Liu wrote:
> > src/slave/containerizer/composing.cpp, line 462
> > 
> >
> > Why kill this message? Ditto for others
> 
> Benjamin Mahler wrote:
> Hm.. I opted not to log since the caller is now responsible. But since 
> the agent isn't logging destroy failures or unknown containers yet, I'll 
> restore these and add a TODO to have the caller log instead eventually. Note 
> that the agent logs the unknown container case for `wait()` (so I'll keep 
> that omitted).

Good to know, thanks Ben.


- Guangya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52056/#review149620
---


On 九月 23, 2016, 7:46 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52056/
> ---
> 
> (Updated 九月 23, 2016, 7:46 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Vinod Kone.
> 
> 
> Bugs: MESOS-6243
> https://issues.apache.org/jira/browse/MESOS-6243
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently the callers of `destroy` cannot determine if the call
> succeeds or fails (without a secondary call to `wait()`).
> 
> This also allows the caller to distinguish between a failure and
> waiting on an unknown container. This is important for the upcoming
> agent child container API, as the end-user would benefit from the
> distinction.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/composing.hpp 
> ef3c2ee22cf0ced35c64025c7f9f7ca75165d54d 
>   src/slave/containerizer/composing.cpp 
> 5ff3e65facd5831e2637ae1cee3ea01b21d0f6b2 
>   src/slave/containerizer/containerizer.hpp 
> f13669d0dfc4ce3287cfe630cabd0470dc765b51 
>   src/slave/containerizer/docker.hpp a378fa9772742b4934578efcec54aeeaf5791a83 
>   src/slave/containerizer/docker.cpp 110a1eb41b1ff7cb94f3630a1843d9f01efbe09c 
>   src/slave/containerizer/mesos/containerizer.hpp 
> 078ef4f4e7bf5e1522804a720c51cfa5518d8efd 
>   src/slave/containerizer/mesos/containerizer.cpp 
> dc18e4e3b0eca3f116f1e240217bbebf64a75e3a 
>   src/tests/containerizer.hpp f1fd57945c09fb80b7790f9124843a5d4ea785ee 
>   src/tests/containerizer.cpp bda3e6f84f77a95f0eb9ff3aabdc9513f0f18b3f 
>   src/tests/containerizer/composing_containerizer_tests.cpp 
> 51aab33cd190b53328339e39fd12853714882454 
>   src/tests/containerizer/docker_containerizer_tests.cpp 
> 28cd3fa66886dbdbae3fdeca77707147faafcb7a 
>   src/tests/containerizer/mesos_containerizer_tests.cpp 
> 96e24500a12825161553eb050da389088b122695 
> 
> Diff: https://reviews.apache.org/r/52056/diff/
> 
> 
> Testing
> ---
> 
> Added tests.
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-23 Thread Jacob Janco


> On Sept. 23, 2016, 2:40 a.m., Guangya Liu wrote:
> > It is really weired that the performance of 
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7`
> >  does not improve much when calling `addSlave`, need check more for why 
> > `addSlave` was same? Without fix, the `addSlave` will call `allocate` for 
> > each agent, but with the fix, only one `allocate` will be called
> > 
> > ```
> > without fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 122268us
> > Added 1000 agents in 42.037104secs
> > 
> > With fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 116107us
> > Added 1000 agents in 41.615396secs
> > ```
> 
> Guangya Liu wrote:
> Jacob, I did more test with the code on Aug 23, at which I posted some 
> result in this RR, and found that the test result is different, I did 
> following to get Aug 23 code.
> 
> ```
> LiuGuangyas-MacBook-Pro:build gyliu$ git checkout 
> 2f78a440ef4201c5b11fb92c225694e84a60369c
> 
> LiuGuangyas-MacBook-Pro:build gyliu$ git log -1
> commit 2f78a440ef4201c5b11fb92c225694e84a60369c
> Author: Gilbert Song 
> Date:   Mon Aug 22 13:00:58 2016 -0700
> 
> Fixed potential flakiness in ROOT_RecoverOrphanedPersistentVolume.
> 
> Review: https://reviews.apache.org/r/51271/
> ```
> 
> The test result seems still same as now (without your patch and the code 
> is get from Aug 23):
> 
> ```
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> Using 1000 agents and 6000 frameworks
> Added 6000 frameworks in 144272us
> Added 1000 agents in 43.107001secs
> ```
> 
> But anyway, I think that we need find out why the performance for 
> `addSlave` was not improved based on your patch.

Yes agreed, per our Slack discussions, I'll look into this. Thanks for posting 
the followup.


- Jacob


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review150123
---


On Sept. 23, 2016, 4:32 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> ---
> 
> (Updated Sept. 23, 2016, 4:32 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus 
> Ma, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> 2c31471ee0f5d6836393bf87ff9ecfd8df835013 
>   src/master/allocator/mesos/hierarchical.cpp 
> 2d56bd011f2c87c67a02d0ae467a4a537d36867e 
> 
> Diff: https://reviews.apache.org/r/51027/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> note: check without filters depends on https://reviews.apache.org/r/51028
> 
> With new benchmark https://reviews.apache.org/r/49617: 
> Sample output without 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 1 agents in 3.2134535333mins
> allocator settled after  1.6123603833mins
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (290578 ms)
> 
> Sample output with 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 39817us
> Added 1 agents in 3.2286054167mins
> allocator settled after  25.525654secs
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover

Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-23 Thread Guangya Liu


> On 九月 23, 2016, 2:40 a.m., Guangya Liu wrote:
> > It is really weired that the performance of 
> > `SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7`
> >  does not improve much when calling `addSlave`, need check more for why 
> > `addSlave` was same? Without fix, the `addSlave` will call `allocate` for 
> > each agent, but with the fix, only one `allocate` will be called
> > 
> > ```
> > without fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 122268us
> > Added 1000 agents in 42.037104secs
> > 
> > With fix:
> > [==] Running 1 test from 1 test case.
> > [--] Global test environment set-up.
> > [--] 1 test from 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
> > [ RUN  ] 
> > SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
> > Using 1000 agents and 6000 frameworks
> > Added 6000 frameworks in 116107us
> > Added 1000 agents in 41.615396secs
> > ```

Jacob, I did more test with the code on Aug 23, at which I posted some result 
in this RR, and found that the test result is different, I did following to get 
Aug 23 code.

```
LiuGuangyas-MacBook-Pro:build gyliu$ git checkout 
2f78a440ef4201c5b11fb92c225694e84a60369c

LiuGuangyas-MacBook-Pro:build gyliu$ git log -1
commit 2f78a440ef4201c5b11fb92c225694e84a60369c
Author: Gilbert Song 
Date:   Mon Aug 22 13:00:58 2016 -0700

Fixed potential flakiness in ROOT_RecoverOrphanedPersistentVolume.

Review: https://reviews.apache.org/r/51271/
```

The test result seems still same as now (without your patch and the code is get 
from Aug 23):

```
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/7
Using 1000 agents and 6000 frameworks
Added 6000 frameworks in 144272us
Added 1000 agents in 43.107001secs
```

But anyway, I think that we need find out why the performance for `addSlave` 
was not improved based on your patch.


- Guangya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/#review150123
---


On 九月 23, 2016, 4:32 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51027/
> ---
> 
> (Updated 九月 23, 2016, 4:32 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus 
> Ma, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Triggered allocations dispatch allocate() only
>   if there is no pending allocation in the queue.
> - Allocation candidates are accumulated and only
>   cleared when enqueued allocations are processed.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> 2c31471ee0f5d6836393bf87ff9ecfd8df835013 
>   src/master/allocator/mesos/hierarchical.cpp 
> 2d56bd011f2c87c67a02d0ae467a4a537d36867e 
> 
> Diff: https://reviews.apache.org/r/51027/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> note: check without filters depends on https://reviews.apache.org/r/51028
> 
> With new benchmark https://reviews.apache.org/r/49617: 
> Sample output without 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 57251us
> Added 1 agents in 3.2134535333mins
> allocator settled after  1.6123603833mins
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (290578 ms)
> 
> Sample output with 51027:
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
> Using 1 agents and 3000 frameworks
> Added 3000 frameworks in 39817us
> Added 1 agents in 3.2286054167mins
> allocator settled after  25.525654secs
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
>  (220137 ms)
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Review Request 52181: WIP: Added synchronization in link logic to prevent relinking races.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52181/
---

Review request for mesos, Benjamin Mahler, Artem Harutyunyan, and Joris Van 
Remoortere.


Bugs: MESOS-6234
https://issues.apache.org/jira/browse/MESOS-6234


Repository: mesos


Description
---

There is a general pattern in the `SocketManager` in which most methods
will grab the mutex and then check if the socket to manage exists in
the `SocketManager`s mapping.  If the socket does not exist, the
`SocketManager` silently returns.

This adds similar logic inside two critical sections of the `link`
codepath.  If there are multiple calls to `link` in-flight at once,
this prevents sockets from being leaked into unmanaged callback loops.


Diffs
-

  3rdparty/libprocess/src/process.cpp 02a192529e53479d5a163fa6a20873674b51ee2c 
  3rdparty/libprocess/src/tests/process_tests.cpp 
b9feec7e34cffe19e49035f8865b150f79258f54 

Diff: https://reviews.apache.org/r/52181/diff/


Testing
---

make check

3rdparty/libprocess/libprocess-tests 
--gtest_filter="ProcessRemoteLinkTest.RemoteRelinkLoop" --gtest_repeat=5000 
--gtest_break_on_failure

Investigating a consistent failure after ~80 iterations:
```
I0923 13:26:08.611300 2060546816 process.cpp:1446] Failed to link, connect: 
Failed to connect to 0.0.0.0:51459: Can't assign requested address
```


Thanks,

Joseph Wu



Re: Review Request 52219: Renamed slave to agent in subprocess.cpp.

2016-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52219/#review150242
---



Patch looks great!

Reviews applied: [52219]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Sept. 23, 2016, 3:46 p.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52219/
> ---
> 
> (Updated Sept. 23, 2016, 3:46 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Joerg Schad.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Renamed slave to agent in subprocess.cpp.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/subprocess.cpp 
> 8b244434d61884e4bddc4694350265111414e935 
> 
> Diff: https://reviews.apache.org/r/52219/diff/
> 
> 
> Testing
> ---
> 
> These typos are introduced in https://reviews.apache.org/r/45491/
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Review Request 52180: Added ExitedEvents for links whose sockets fail on creation.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52180/
---

Review request for mesos, Benjamin Mahler, Artem Harutyunyan, and Joris Van 
Remoortere.


Bugs: MESOS-6234
https://issues.apache.org/jira/browse/MESOS-6234


Repository: mesos


Description
---

When we try to link to a remote process, we will send the linkee an
`ExitedEvent` when the link is broken or if the connection fails.
This patch adds an `ExitedEvent` when the socket creation step fails.
This is logically equivalent to having the connection step fail.

Because this is an entirely unexpected case, the log level should be
a WARNING or higher.


Diffs
-

  3rdparty/libprocess/src/process.cpp 02a192529e53479d5a163fa6a20873674b51ee2c 
  3rdparty/libprocess/src/tests/process_tests.cpp 
b9feec7e34cffe19e49035f8865b150f79258f54 

Diff: https://reviews.apache.org/r/52180/diff/


Testing
---

make check

3rdparty/libprocess/libprocess-tests 
--gtest_filter="ProcessRemoteLinkTest.RemoteLinkLeak" --gtest_repeat=5000 
--gtest_break_on_failure


Thanks,

Joseph Wu



Review Request 52182: Prevented a race when relinking with SSL downgrade enabled.

2016-09-23 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52182/
---

Review request for mesos, Benjamin Mahler, Artem Harutyunyan, and Joris Van 
Remoortere.


Bugs: MESOS-6233
https://issues.apache.org/jira/browse/MESOS-6233


Repository: mesos


Description
---

If there are two or more actors, on the same OS process, that try to
relink at the same time, there is a potential CHECK failure when SSL
downgrade is also enabled.  The following interleaving is problematic:
| Actor A | Actor B |
|-+-|
|Starts to relink.|Starts to relink.|
|Creates socket A.| |
|   Replaces link with Socket A.  | |
| Tries to connect with Socket A. | |
| |Creates socket B.|
|  Connection fails due to SSL.   |   Replaces link with Socket B.  |
| Attempts to downgrade.  | |
| Tries to replace link.  | |
The last step in the interleaving fails because we assert that the
socket we are swapping out exists in the `SocketManager`s state.


Diffs
-

  3rdparty/libprocess/src/process.cpp 02a192529e53479d5a163fa6a20873674b51ee2c 

Diff: https://reviews.apache.org/r/52182/diff/


Testing
---

See next review.


Thanks,

Joseph Wu



Re: Review Request 52222: Changed the name of a constant in the tests.

2016-09-23 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5/#review150236
---


Ship it!




Ship It!

- Neil Conway


On Sept. 23, 2016, 7:31 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/5/
> ---
> 
> (Updated Sept. 23, 2016, 7:31 p.m.)
> 
> 
> Review request for mesos, Michael Park and Neil Conway.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This changes the `DEFAULT_ROLE` constant to
> `DEFAULT_TEST_ROLE` to avoid confusion with
> the master's `--default_role`.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp 3cd63bdf34eaad2a9c9cb897c62294c7f0c622dc 
>   src/tests/persistent_volume_tests.cpp 
> a726d784a9e32fb81aa579d37205baed0f4734cc 
> 
> Diff: https://reviews.apache.org/r/5/diff/
> 
> 
> Testing
> ---
> 
> `GTEST_FILTER="*PersistentVolumeTest*" bin/mesos-tests.sh`
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 51407: Updated mesos containerizer to checkpoint container runtime information.

2016-09-23 Thread Kevin Klues

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51407/#review150235
---




src/slave/containerizer/mesos/containerizer.cpp (line 155)


This line should be removed. Shephers, please remove this before committing 
if there are no other major problems with the patch.



src/slave/containerizer/mesos/containerizer.cpp (lines 682 - 683)


This comment should be deleted. Shephers, please remove this before 
committing if there are no other major problems with the patch.


- Kevin Klues


On Sept. 23, 2016, 9:01 p.m., Kevin Klues wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51407/
> ---
> 
> (Updated Sept. 23, 2016, 9:01 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Bugs: MESOS-6204
> https://issues.apache.org/jira/browse/MESOS-6204
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This includes checkpointing both the container pid and the status of
> the container upon exit. This also includes an update to tests to
> account for new 'init' process semantics in a container. That is, the
> name of the init process of the container is now "mesos-containerizer"
> not "sh".
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/containerizer.hpp 
> 16f9e3e92e90fe7f8a0ebd24e567800e1f285bc9 
>   src/slave/containerizer/mesos/containerizer.cpp 
> 144b0db501d40d4e0bba12672723616bedd76e7e 
>   src/tests/containerizer/isolator_tests.cpp 
> b4d25e57df7f0e157769c9ae4f7847657c505e78 
> 
> Diff: https://reviews.apache.org/r/51407/diff/
> 
> 
> Testing
> ---
> 
> $ GTEST_FILTER="" make -j check
> $ src/mesos-tests
> $ sudo src/mesos-tests
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>



Re: Review Request 51407: Updated mesos containerizer to checkpoint container runtime information.

2016-09-23 Thread Kevin Klues

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51407/
---

(Updated Sept. 23, 2016, 9:01 p.m.)


Review request for mesos and Jie Yu.


Changes
---

Updated based on changes in previous patches.


Bugs: MESOS-6204
https://issues.apache.org/jira/browse/MESOS-6204


Repository: mesos


Description
---

This includes checkpointing both the container pid and the status of
the container upon exit. This also includes an update to tests to
account for new 'init' process semantics in a container. That is, the
name of the init process of the container is now "mesos-containerizer"
not "sh".


Diffs (updated)
-

  src/slave/containerizer/mesos/containerizer.hpp 
16f9e3e92e90fe7f8a0ebd24e567800e1f285bc9 
  src/slave/containerizer/mesos/containerizer.cpp 
144b0db501d40d4e0bba12672723616bedd76e7e 
  src/tests/containerizer/isolator_tests.cpp 
b4d25e57df7f0e157769c9ae4f7847657c505e78 

Diff: https://reviews.apache.org/r/51407/diff/


Testing
---

$ GTEST_FILTER="" make -j check
$ src/mesos-tests
$ sudo src/mesos-tests


Thanks,

Kevin Klues



Re: Review Request 52012: Reverted "Added extra functions to the 'Launcher' abstraction.".

2016-09-23 Thread Kevin Klues

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52012/
---

(Updated Sept. 23, 2016, 8:59 p.m.)


Review request for mesos, Benjamin Hindman, Gilbert Song, and Jie Yu.


Changes
---

Rebased on master.


Bugs: MESOS-6204
https://issues.apache.org/jira/browse/MESOS-6204


Repository: mesos


Description
---

This reverts commit bb047cd72936aa1836b6e959127a572c72fb824b.

We decided to revert this commit in favor of keeping all container
checkpointing information in the containerizer. A subsequent commit
will reintroduce this functionality there.


Diffs (updated)
-

  src/slave/containerizer/mesos/launcher.hpp 
6d9fb8dacb8f697b48d3088fabe2cff321bb31c1 
  src/slave/containerizer/mesos/launcher.cpp 
2155e7a1d94368ba18efbd77fc0c1c57010df17f 
  src/slave/containerizer/mesos/linux_launcher.hpp 
2cd1e4d4b6d9e656c2447a9d3ab8e1f49ba5fffa 
  src/slave/containerizer/mesos/linux_launcher.cpp 
f27757032cc28be0f0067bed045536431dde4d6c 
  src/tests/containerizer/launcher.hpp a149220a41f12d11b774660381409105f8d70512 

Diff: https://reviews.apache.org/r/52012/diff/


Testing
---


Thanks,

Kevin Klues



Re: Review Request 51406: Updated launch helper to optionally spawn an 'init' process on linux.

2016-09-23 Thread Kevin Klues

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51406/
---

(Updated Sept. 23, 2016, 8:59 p.m.)


Review request for mesos and Jie Yu.


Changes
---

Updated for more robust signal forwarding, as well as consistent checkpointing 
of the exit status. We also now support checkpointing on Posix as well as 
Linux. Tests are forthcoming.


Bugs: MESOS-6088
https://issues.apache.org/jira/browse/MESOS-6088


Repository: mesos


Description (updated)
---

Previously the 'mesos-containerizer launch' binary would simply exec
into the actual command we wanted to launch after doing some set of
preperatory work. The problem with this approach, however, is that
this gives us no opportunity to checkpoint the wait status of the
command so the agent can recover it in cases where it is offline at
the time the command completes.

To compensate for this, we now allow 'mesos-containerizer launch' to
take an optional '--runtime_directory' argument, which indicates where
to checkpint the status of the launched command (as returned by
'waitpid()'). In order to do this checkpointing, however, we cannot
simply exec into the command anymore. Instead we now fork-exec the
command and reap it once it completes. We then checkpoint its status
and return it as our own. We call the original process the 'init'
process of the container and the fork-exec'd process its child.  As a
side effect of doing things this way, we also have to be careful to
forward all signals sent to the init process down to the child.

In a subsequent commit we will update the mesos containerizer to use
this new functionality.


Diffs (updated)
-

  src/slave/containerizer/mesos/launch.hpp 
859990cb85e9e8c06400397256cfc512f0811800 
  src/slave/containerizer/mesos/launch.cpp 
777121cffaa5fc76adfefc1e617409d997c7aac8 

Diff: https://reviews.apache.org/r/51406/diff/


Testing
---

$ GTEST_FILTER="" make -j check
$ src/mesos-tests
$ sudo src/mesos-tests


Thanks,

Kevin Klues



Re: Review Request 52222: Changed the name of a constant in the tests.

2016-09-23 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5/#review150233
---


Ship it!




Ship It!

- Michael Park


On Sept. 23, 2016, 7:31 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/5/
> ---
> 
> (Updated Sept. 23, 2016, 7:31 p.m.)
> 
> 
> Review request for mesos, Michael Park and Neil Conway.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This changes the `DEFAULT_ROLE` constant to
> `DEFAULT_TEST_ROLE` to avoid confusion with
> the master's `--default_role`.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp 3cd63bdf34eaad2a9c9cb897c62294c7f0c622dc 
>   src/tests/persistent_volume_tests.cpp 
> a726d784a9e32fb81aa579d37205baed0f4734cc 
> 
> Diff: https://reviews.apache.org/r/5/diff/
> 
> 
> Testing
> ---
> 
> `GTEST_FILTER="*PersistentVolumeTest*" bin/mesos-tests.sh`
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 52013: Added helpers for maintaining files in a container runtime directory.

2016-09-23 Thread Kevin Klues

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52013/
---

(Updated Sept. 23, 2016, 8:51 p.m.)


Review request for mesos, Benjamin Hindman, Gilbert Song, and Jie Yu.


Changes
---

Reworked the API for the helpers and added a few more helpers to actually pull 
out the PID and STATUS files from the runtime directory.


Summary (updated)
-

Added helpers for maintaining files in a container runtime directory.


Bugs: MESOS-6204
https://issues.apache.org/jira/browse/MESOS-6204


Repository: mesos


Description (updated)
---

Added helpers for maintaining files in a container runtime directory.


Diffs (updated)
-

  src/CMakeLists.txt 42c52b60cc850901f2eff1545cf7900f4a65ca81 
  src/Makefile.am bfdb66a6969a35660d545210c1c6951926117ef3 
  src/slave/containerizer/mesos/paths.hpp PRE-CREATION 
  src/slave/containerizer/mesos/paths.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/52013/diff/


Testing (updated)
---


Thanks,

Kevin Klues



Review Request 52225: Added missing WSTOPSIG macro to windows.

2016-09-23 Thread Kevin Klues

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52225/
---

Review request for mesos and Jie Yu.


Repository: mesos


Description
---

These macros are still not "correct", in that they don't accurately
represent the completion status of a process on windows, but they are
now defined so we can fill them in with the proper implementations
later.


Diffs
-

  3rdparty/stout/include/stout/windows.hpp 
5cca853c4b090c4516ad4ec04c6e5830a4e7ccfa 

Diff: https://reviews.apache.org/r/52225/diff/


Testing
---


Thanks,

Kevin Klues



Review Request 52224: Updated check for instantiating the "windows" launcher.

2016-09-23 Thread Kevin Klues

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52224/
---

Review request for mesos and Jie Yu.


Repository: mesos


Description
---

Previously, we allowed the posix launcher to specified on windows and
just blindly ignored it and instantiated a windows launcher anyway.
This was likely due to legacy code, not setting the default launcher
properly in flags.cpp. Now that we set the default to "windows"
properly, this hack is no longer needed.


Diffs
-

  src/slave/containerizer/mesos/containerizer.cpp 
144b0db501d40d4e0bba12672723616bedd76e7e 

Diff: https://reviews.apache.org/r/52224/diff/


Testing
---


Thanks,

Kevin Klues



Re: Review Request 51720: Updated tests to properly set 'flags.launcher' with correct value.

2016-09-23 Thread Kevin Klues

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51720/
---

(Updated Sept. 23, 2016, 8:38 p.m.)


Review request for mesos, Benjamin Hindman and Jie Yu.


Changes
---

Rebased on master.


Bugs: MESOS-6141
https://issues.apache.org/jira/browse/MESOS-6141


Repository: mesos


Description
---

In some of our tests we manually create a 'PosixLauncher' rather than
relying on the value of 'flags.launcher' to decide which type of
launcher to create. Since calls to 'CreateSlaveFlags()' set
'flags.launcher' to 'linux' by default, there was a discrepency in
what the flags said, and what actual launcher type we were creating.

This commit fixes this to explicitly set 'flags.launcher' to the
appropriate type.


Diffs (updated)
-

  src/tests/container_logger_tests.cpp 1b121a2fcce2d874aeefc4257b9d4e594866e78d 
  src/tests/containerizer/mesos_containerizer_tests.cpp 
f5cc9d52c9629eb18b8f2dd1a380996955cf 

Diff: https://reviews.apache.org/r/51720/diff/


Testing
---

$ GTEST_FILTER="" make -j check
$ src/mesos-tests
$ sudo src/mesos-tests


Thanks,

Kevin Klues



Re: Review Request 51405: Updated some tests to use the CreateSlaveFlags() helper.

2016-09-23 Thread Kevin Klues

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51405/
---

(Updated Sept. 23, 2016, 8:36 p.m.)


Review request for mesos and Jie Yu.


Changes
---

Rebased on master.


Bugs: MESOS-6050
https://issues.apache.org/jira/browse/MESOS-6050


Repository: mesos


Description
---

This change was motivated by the desire to set the new `runtime_dir`
flag properly in these tests.


Diffs (updated)
-

  src/tests/containerizer/mesos_containerizer_tests.cpp 
f5cc9d52c9629eb18b8f2dd1a380996955cf 

Diff: https://reviews.apache.org/r/51405/diff/


Testing
---

$ GTEST_FILTER="" make -j check
$ src/mesos-tests
$ sudo src/mesos-tests


Thanks,

Kevin Klues



Re: Review Request 52135: Implemented the *_NESTED_CONTAINER calls in the agent API.

2016-09-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52135/
---

(Updated Sept. 23, 2016, 7:48 p.m.)


Review request for mesos and Vinod Kone.


Changes
---

Added a more precise JIRA ticket under the epic.


Bugs: MESOS-6241
https://issues.apache.org/jira/browse/MESOS-6241


Repository: mesos


Description
---

This patch adds the wiring for the *_NESTED_CONTAINER calls,
including validation and calling into the containerizer.


Diffs
-

  src/slave/http.cpp 73135be5651300e1fe7a20428aeb034392f915ed 
  src/tests/api_tests.cpp 26f99f7c337fbbc5278d1b30d3d5c8f659ddf5ca 

Diff: https://reviews.apache.org/r/52135/diff/


Testing
---

Added a test that mocks out the containerizer.


Thanks,

Benjamin Mahler



Re: Review Request 52057: Introduced an agent API for managing nested containers.

2016-09-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52057/
---

(Updated Sept. 23, 2016, 7:48 p.m.)


Review request for mesos, Anand Mazumdar, Benjamin Hindman, Jie Yu, and Vinod 
Kone.


Changes
---

Added a more precise JIRA ticket under the epic.


Bugs: MESOS-6241
https://issues.apache.org/jira/browse/MESOS-6241


Repository: mesos


Description
---

This allows the executor to manage nested containers within its
container. This is useful for the implementation of task groups
("pod"-like containers) as well as for executors that would like
to implement isolation between sub-tasks.

This initial patch provides the ability to launch, wait, and
destroy nested containers. This was added as an agent API rather
than an executor API in order to enable operators to inject
subcontainers for future use-cases. For example, an operator
may want to inject a debugging container at run-time.


Diffs
-

  include/mesos/agent/agent.proto 5b91677487f8e23301e67af14c2115c6b4a533ac 
  include/mesos/v1/agent/agent.proto 8145669073553dc8aa56cfe2c0a0b756d70fee0e 
  src/slave/http.cpp 0aae14583faf95fee678b3a3424c96cdba531968 
  src/slave/slave.hpp 20ca2fb0fe89cc9e156b76711be67cac69476e13 
  src/slave/validation.cpp a9f318214182a87783a985cd8495a0ec00c95378 

Diff: https://reviews.apache.org/r/52057/diff/


Testing
---

Tests will be added alongside the implementation.


Thanks,

Benjamin Mahler



Re: Review Request 52100: Added validation of *_NESTED_CONTAINER calls in the agent API.

2016-09-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52100/
---

(Updated Sept. 23, 2016, 7:48 p.m.)


Review request for mesos and Vinod Kone.


Changes
---

Added a more precise JIRA ticket under the epic.


Bugs: MESOS-6241
https://issues.apache.org/jira/browse/MESOS-6241


Repository: mesos


Description
---

Added validation of *_NESTED_CONTAINER calls in the agent API.


Diffs
-

  src/Makefile.am 478fb5ae01dbdf1a735680edf7c5f30867726e07 
  src/slave/validation.cpp a9f318214182a87783a985cd8495a0ec00c95378 
  src/tests/slave_validation_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/52100/diff/


Testing
---

Added tests.


Thanks,

Benjamin Mahler



Re: Review Request 52056: Exposed unknown container case from Containerizer::destroy.

2016-09-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52056/
---

(Updated Sept. 23, 2016, 7:46 p.m.)


Review request for mesos, Jie Yu and Vinod Kone.


Changes
---

Added JIRA ticket.


Bugs: MESOS-6243
https://issues.apache.org/jira/browse/MESOS-6243


Repository: mesos


Description
---

Currently the callers of `destroy` cannot determine if the call
succeeds or fails (without a secondary call to `wait()`).

This also allows the caller to distinguish between a failure and
waiting on an unknown container. This is important for the upcoming
agent child container API, as the end-user would benefit from the
distinction.


Diffs
-

  src/slave/containerizer/composing.hpp 
ef3c2ee22cf0ced35c64025c7f9f7ca75165d54d 
  src/slave/containerizer/composing.cpp 
5ff3e65facd5831e2637ae1cee3ea01b21d0f6b2 
  src/slave/containerizer/containerizer.hpp 
f13669d0dfc4ce3287cfe630cabd0470dc765b51 
  src/slave/containerizer/docker.hpp a378fa9772742b4934578efcec54aeeaf5791a83 
  src/slave/containerizer/docker.cpp 110a1eb41b1ff7cb94f3630a1843d9f01efbe09c 
  src/slave/containerizer/mesos/containerizer.hpp 
078ef4f4e7bf5e1522804a720c51cfa5518d8efd 
  src/slave/containerizer/mesos/containerizer.cpp 
dc18e4e3b0eca3f116f1e240217bbebf64a75e3a 
  src/tests/containerizer.hpp f1fd57945c09fb80b7790f9124843a5d4ea785ee 
  src/tests/containerizer.cpp bda3e6f84f77a95f0eb9ff3aabdc9513f0f18b3f 
  src/tests/containerizer/composing_containerizer_tests.cpp 
51aab33cd190b53328339e39fd12853714882454 
  src/tests/containerizer/docker_containerizer_tests.cpp 
28cd3fa66886dbdbae3fdeca77707147faafcb7a 
  src/tests/containerizer/mesos_containerizer_tests.cpp 
96e24500a12825161553eb050da389088b122695 

Diff: https://reviews.apache.org/r/52056/diff/


Testing
---

Added tests.

make check


Thanks,

Benjamin Mahler



Re: Review Request 52055: Exposed unknown container case from Containerizer::wait.

2016-09-23 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52055/
---

(Updated Sept. 23, 2016, 7:45 p.m.)


Review request for mesos, Jie Yu and Vinod Kone.


Changes
---

Added JIRA ticket.


Bugs: MESOS-6242
https://issues.apache.org/jira/browse/MESOS-6242


Repository: mesos


Description
---

This allows the caller to distinguish between a failure and waiting
on an unknown container. This is important for the upcoming agent
child container API, as the end-user would benefit from the
distinction.


Diffs
-

  src/slave/containerizer/composing.hpp 
ef3c2ee22cf0ced35c64025c7f9f7ca75165d54d 
  src/slave/containerizer/composing.cpp 
5ff3e65facd5831e2637ae1cee3ea01b21d0f6b2 
  src/slave/containerizer/containerizer.hpp 
f13669d0dfc4ce3287cfe630cabd0470dc765b51 
  src/slave/containerizer/docker.hpp a378fa9772742b4934578efcec54aeeaf5791a83 
  src/slave/containerizer/docker.cpp 110a1eb41b1ff7cb94f3630a1843d9f01efbe09c 
  src/slave/containerizer/mesos/containerizer.hpp 
078ef4f4e7bf5e1522804a720c51cfa5518d8efd 
  src/slave/containerizer/mesos/containerizer.cpp 
dc18e4e3b0eca3f116f1e240217bbebf64a75e3a 
  src/slave/slave.hpp e659c44fcc415f10a6980e7b29cc1ce536dfe657 
  src/slave/slave.cpp 2c7b5ada1aa81babb9ceee0c9f928685878a778c 
  src/tests/cluster.cpp b04653af97d17aaa9d0d3ee872169b66cd67126b 
  src/tests/containerizer.hpp f1fd57945c09fb80b7790f9124843a5d4ea785ee 
  src/tests/containerizer.cpp bda3e6f84f77a95f0eb9ff3aabdc9513f0f18b3f 
  src/tests/containerizer/composing_containerizer_tests.cpp 
51aab33cd190b53328339e39fd12853714882454 
  src/tests/containerizer/docker_containerizer_tests.cpp 
28cd3fa66886dbdbae3fdeca77707147faafcb7a 
  src/tests/containerizer/filesystem_isolator_tests.cpp 
df4642d2667407b1758ffe2efcfdbf9968cf2c33 
  src/tests/containerizer/isolator_tests.cpp 
9bb1e69209f34b18b5b64c3daf5ea26780f2ab74 
  src/tests/containerizer/mesos_containerizer_tests.cpp 
96e24500a12825161553eb050da389088b122695 
  src/tests/containerizer/port_mapping_tests.cpp 
7fac6ca1550acf54616fb4cef2eab1335f5e9760 
  src/tests/health_check_tests.cpp 50a252b5267cbc3b21ddc75e1da86c4975f6faf1 
  src/tests/hook_tests.cpp d864ef3d6153dedab8cc47403b1b9f76d75aeb5c 
  src/tests/slave_recovery_tests.cpp d58d9bc0dbb0a60e114699485d1306202981ecf5 

Diff: https://reviews.apache.org/r/52055/diff/


Testing
---

Added tests.

make check


Thanks,

Benjamin Mahler



Review Request 52222: Changed the name of a constant in the tests.

2016-09-23 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5/
---

Review request for mesos, Michael Park and Neil Conway.


Repository: mesos


Description
---

This changes the `DEFAULT_ROLE` constant to
`DEFAULT_TEST_ROLE` to avoid confusion with
the master's `--default_role`.


Diffs
-

  src/tests/mesos.hpp 3cd63bdf34eaad2a9c9cb897c62294c7f0c622dc 
  src/tests/persistent_volume_tests.cpp 
a726d784a9e32fb81aa579d37205baed0f4734cc 

Diff: https://reviews.apache.org/r/5/diff/


Testing
---

`GTEST_FILTER="*PersistentVolumeTest*" bin/mesos-tests.sh`


Thanks,

Greg Mann



Re: Review Request 51410: Enabled meta discovery using appc labels.

2016-09-23 Thread Srinivas Brahmaroutu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51410/
---

(Updated Sept. 23, 2016, 7:25 p.m.)


Review request for mesos, Gilbert Song and Jie Yu.


Bugs: MESOS-4288
https://issues.apache.org/jira/browse/MESOS-4288


Repository: mesos


Description
---

Added appc_labels to mesos-execute so that we can indicate meta
discovery using useMetaDiscovery="true" along with other appc labels
that determines the os, arch of the image we are discovering. This logic
needs to be documented so that user can switch between simple and meta
discovery methods. This implementation is necessary as there is no other
way to determine from the URI.


Diffs (updated)
-

  src/cli/execute.cpp c9f56af7f37d5b79b51f350d6c533714c170e889 
  src/slave/containerizer/mesos/provisioner/appc/fetcher.cpp 
907e761856e8996b72e4231de27fa15e884d52e3 

Diff: https://reviews.apache.org/r/51410/diff/


Testing
---

make check


Thanks,

Srinivas Brahmaroutu



Re: Review Request 51124: Support more layers through symlink for overlay backend.

2016-09-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51124/#review150221
---


Ship it!




Ship It!

- Jie Yu


On Sept. 22, 2016, 8:52 p.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51124/
> ---
> 
> (Updated Sept. 22, 2016, 8:52 p.m.)
> 
> 
> Review request for mesos, Xiaojian Huang, Gilbert Song, and Jie Yu.
> 
> 
> Bugs: MESOS-6000
> https://issues.apache.org/jira/browse/MESOS-6000
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Support more layers through symlink for overlay backend.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/provisioner/backend.hpp 
> 93819c887f2f801f06aae7020084b56cc81ce818 
>   src/slave/containerizer/mesos/provisioner/backends/aufs.hpp 
> a5ac13bc17b8f291472b622ebfdd0cd1dd072273 
>   src/slave/containerizer/mesos/provisioner/backends/aufs.cpp 
> 4c5cdb660dea205aea29d217ba80d845d1108fdf 
>   src/slave/containerizer/mesos/provisioner/backends/bind.hpp 
> 19d44c6a9a2cd5f8de79f33c76e9d79900ab003f 
>   src/slave/containerizer/mesos/provisioner/backends/bind.cpp 
> 62ccaec1a5cfb466e929f4db7bf7e5376f2a0c2d 
>   src/slave/containerizer/mesos/provisioner/backends/copy.hpp 
> 5bd6a5279d8dd6d088afcd58e4839bf6087ccd1e 
>   src/slave/containerizer/mesos/provisioner/backends/copy.cpp 
> e57bb3d2fb4e67e9caae416824a6a748db1460a1 
>   src/slave/containerizer/mesos/provisioner/backends/overlay.hpp 
> 387f28a331813c75a509b4a31dbbdc764080b8c1 
>   src/slave/containerizer/mesos/provisioner/backends/overlay.cpp 
> e20cd48463a78ecd8a652a4d5ac923dc02cca4d9 
>   src/slave/containerizer/mesos/provisioner/provisioner.cpp 
> f37c45ccfa572876dfbba6a0797c223896db5a7f 
>   src/tests/containerizer/provisioner_backend_tests.cpp 
> 7c9a7cd5c733f74e10316fc1234115e6820cc2cb 
> 
> Diff: https://reviews.apache.org/r/51124/diff/
> 
> 
> Testing
> ---
> 
> 1. Make sure `OverlayBackendTest.*` passes;
> 2. Use mesos-execute to provision a container using overlay backend. Observed 
> following log lines:
> 
> ```
> I0816 01:04:27.823420 46584 overlay.cpp:167] Created tempLink at 
> '/var/lib/mesos/provisioner/containers/fd574bbf-4bc0-4538-9ce5-8c2cc93b94c7/backends/overlay/links'
>  to '/tmp/NcmRZt'
> I0816 01:04:27.824834 46584 overlay.cpp:191] Provisioning image rootfs with 
> overlayfs: 
> 'lowerdir=/tmp/NcmRZt/42:/tmp/NcmRZt/41:/tmp/NcmRZt/40:/tmp/NcmRZt/39:/tmp/NcmRZt/38:/tmp/NcmRZt/37:/tmp/NcmRZt/36:/tmp/NcmRZt/35:/tmp/NcmRZt/34:/tmp/NcmRZt/33:/tmp/NcmRZt/32:/tmp/NcmRZt/31:/tmp/NcmRZt/30:/tmp/NcmRZt/29:/tmp/NcmRZt/28:/tmp/NcmRZt/27:/tmp/NcmRZt/26:/tmp/NcmRZt/25:/tmp/NcmRZt/24:/tmp/NcmRZt/23:/tmp/NcmRZt/22:/tmp/NcmRZt/21:/tmp/NcmRZt/20:/tmp/NcmRZt/19:/tmp/NcmRZt/18:/tmp/NcmRZt/17:/tmp/NcmRZt/16:/tmp/NcmRZt/15:/tmp/NcmRZt/14:/tmp/NcmRZt/13:/tmp/NcmRZt/12:/tmp/NcmRZt/11:/tmp/NcmRZt/10:/tmp/NcmRZt/9:/tmp/NcmRZt/8:/tmp/NcmRZt/7:/tmp/NcmRZt/6:/tmp/NcmRZt/5:/tmp/NcmRZt/4:/tmp/NcmRZt/3:/tmp/NcmRZt/2:/tmp/NcmRZt/1:/tmp/NcmRZt/0,upperdir=/var/lib/mesos/provisioner/containers/fd574bbf-4bc0-4538-9ce5-8c2cc93b94c7/backends/overlay/scratch/25d9b6e4-fd0e-401e-bb9e-754dbaec4f55/upperdir,workdir=/var/lib/mesos/provisioner/containers/fd574bbf-4bc0-4538-9ce5-8c2cc93b94c7/backends/overlay/scratch/2
 5d9b6e4-fd0e-401e-bb9e-754dbaec4f55/workdir'
> ...
> (after executor exited)
> I0816 01:04:34.859851 46584 overlay.cpp:281] Removed temporary directory 
> '/tmp/NcmRZt' pointed by 
> '/var/lib/mesos/provisioner/containers/fd574bbf-4bc0-4538-9ce5-8c2cc93b94c7/backends/overlay/links'
> ```
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Re: Review Request 52105: Prevent `/redirect/foo` loop.

2016-09-23 Thread Charles Allen


> On Sept. 23, 2016, 3:05 p.m., haosdent huang wrote:
> > src/master/http.cpp, lines 2038-2040
> > 
> >
> > Another problem here is if we call `redirect(/redirect_a)` in http.cpp, 
> > would get unexpected result. So how about just return 404 which consistent 
> > with `/redirect/`
> > 
> > ```
> >   // When current master is not the leader, redirect to the leading 
> > master.
> >   if (!master->elected()) {
> > return redirect(request);
> >   }
> > ```
> 
> Charles Allen wrote:
> Would that mean things like `/redirect/foo` should also return 404?
> 
> haosdent huang wrote:
> yes.
> 
> haosdent huang wrote:
> This is consistent with `/redirect/`. And I think something like 
> `/redirect/foo/redirect/bar/xxx` make it messy if we keep redirecting to 
> `/foo/redirect/xxx` here.

Please see updated examples. Pretty much everything is 404 except explicitly 
`/redirect` and `/{master_id}/redirect`


- Charles


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52105/#review150185
---


On Sept. 23, 2016, 6:04 p.m., Charles Allen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52105/
> ---
> 
> (Updated Sept. 23, 2016, 6:04 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6210
> https://issues.apache.org/jira/browse/MESOS-6210
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Prevent `/redirect/foo` loop.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 9005e7c308d5f57c6f5c573951d468a3ba730740 
> 
> Diff: https://reviews.apache.org/r/52105/diff/
> 
> 
> Testing
> ---
> 
> Single Node started with `./src/mesos-master --work_dir=/tmp/mesos`
> 
> 
> ```
> Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Fri, 23 Sep 2016 17:30:17 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect/
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect/ HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Fri, 23 Sep 2016 17:30:19 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/redirect/foo
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect/foo HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Fri, 23 Sep 2016 17:30:21 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/redirect_foo
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect_foo HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Fri, 23 Sep 2016 17:30:23 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/ HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Fri, 23 Sep 2016 17:30:49 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Fri, 23 Sep 2016 17:30:50 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect_foo
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect_foo HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Fri, 23 Sep 2016 17:30:53 GMT
> < Content-Length: 0
> <
> * Connect

Re: Review Request 51124: Support more layers through symlink for overlay backend.

2016-09-23 Thread Zhitao Li


> On Sept. 21, 2016, 6:26 p.m., Jie Yu wrote:
> > src/tests/containerizer/provisioner_backend_tests.cpp, lines 116-131
> > 
> >
> > I'd suggest we create a separate test. Also, i suggest we construct a 
> > lot of layers (using a for loop), and make sure the test fail without 
> > applying this patch and succeed after applying this patch. This better 
> > captures the regression.
> 
> Zhitao Li wrote:
> +1 for the for loop based test w/ many layers.
> For the comment of "create a separate test", do you mean move these link 
> related comparisons to that test?
> 
> Jie Yu wrote:
> well, i don't think we need to test the links in the unit tests. It's an 
> impl. detail. We just need to test if we can handle many layers, if without 
> the link solution will definitely fail.

I get your idea now. I personally prefer testing certain impl. details as long 
as the test is still robust, unflaky and it does not slow test down, because it 
helps people to read and understand the code, and also makes sure some 
behaviors are not broken w/o noticed.

In this case, there is nothing checking that symlinks are cleaned up to the 
best effort if we move these tests.

Still, if you think strong on this, I'm happy to remove these assertions for 
links.

The test for many layers part is added.


- Zhitao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51124/#review149863
---


On Sept. 22, 2016, 8:52 p.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51124/
> ---
> 
> (Updated Sept. 22, 2016, 8:52 p.m.)
> 
> 
> Review request for mesos, Xiaojian Huang, Gilbert Song, and Jie Yu.
> 
> 
> Bugs: MESOS-6000
> https://issues.apache.org/jira/browse/MESOS-6000
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Support more layers through symlink for overlay backend.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/provisioner/backend.hpp 
> 93819c887f2f801f06aae7020084b56cc81ce818 
>   src/slave/containerizer/mesos/provisioner/backends/aufs.hpp 
> a5ac13bc17b8f291472b622ebfdd0cd1dd072273 
>   src/slave/containerizer/mesos/provisioner/backends/aufs.cpp 
> 4c5cdb660dea205aea29d217ba80d845d1108fdf 
>   src/slave/containerizer/mesos/provisioner/backends/bind.hpp 
> 19d44c6a9a2cd5f8de79f33c76e9d79900ab003f 
>   src/slave/containerizer/mesos/provisioner/backends/bind.cpp 
> 62ccaec1a5cfb466e929f4db7bf7e5376f2a0c2d 
>   src/slave/containerizer/mesos/provisioner/backends/copy.hpp 
> 5bd6a5279d8dd6d088afcd58e4839bf6087ccd1e 
>   src/slave/containerizer/mesos/provisioner/backends/copy.cpp 
> e57bb3d2fb4e67e9caae416824a6a748db1460a1 
>   src/slave/containerizer/mesos/provisioner/backends/overlay.hpp 
> 387f28a331813c75a509b4a31dbbdc764080b8c1 
>   src/slave/containerizer/mesos/provisioner/backends/overlay.cpp 
> e20cd48463a78ecd8a652a4d5ac923dc02cca4d9 
>   src/slave/containerizer/mesos/provisioner/provisioner.cpp 
> f37c45ccfa572876dfbba6a0797c223896db5a7f 
>   src/tests/containerizer/provisioner_backend_tests.cpp 
> 7c9a7cd5c733f74e10316fc1234115e6820cc2cb 
> 
> Diff: https://reviews.apache.org/r/51124/diff/
> 
> 
> Testing
> ---
> 
> 1. Make sure `OverlayBackendTest.*` passes;
> 2. Use mesos-execute to provision a container using overlay backend. Observed 
> following log lines:
> 
> ```
> I0816 01:04:27.823420 46584 overlay.cpp:167] Created tempLink at 
> '/var/lib/mesos/provisioner/containers/fd574bbf-4bc0-4538-9ce5-8c2cc93b94c7/backends/overlay/links'
>  to '/tmp/NcmRZt'
> I0816 01:04:27.824834 46584 overlay.cpp:191] Provisioning image rootfs with 
> overlayfs: 
> 'lowerdir=/tmp/NcmRZt/42:/tmp/NcmRZt/41:/tmp/NcmRZt/40:/tmp/NcmRZt/39:/tmp/NcmRZt/38:/tmp/NcmRZt/37:/tmp/NcmRZt/36:/tmp/NcmRZt/35:/tmp/NcmRZt/34:/tmp/NcmRZt/33:/tmp/NcmRZt/32:/tmp/NcmRZt/31:/tmp/NcmRZt/30:/tmp/NcmRZt/29:/tmp/NcmRZt/28:/tmp/NcmRZt/27:/tmp/NcmRZt/26:/tmp/NcmRZt/25:/tmp/NcmRZt/24:/tmp/NcmRZt/23:/tmp/NcmRZt/22:/tmp/NcmRZt/21:/tmp/NcmRZt/20:/tmp/NcmRZt/19:/tmp/NcmRZt/18:/tmp/NcmRZt/17:/tmp/NcmRZt/16:/tmp/NcmRZt/15:/tmp/NcmRZt/14:/tmp/NcmRZt/13:/tmp/NcmRZt/12:/tmp/NcmRZt/11:/tmp/NcmRZt/10:/tmp/NcmRZt/9:/tmp/NcmRZt/8:/tmp/NcmRZt/7:/tmp/NcmRZt/6:/tmp/NcmRZt/5:/tmp/NcmRZt/4:/tmp/NcmRZt/3:/tmp/NcmRZt/2:/tmp/NcmRZt/1:/tmp/NcmRZt/0,upperdir=/var/lib/mesos/provisioner/containers/fd574bbf-4bc0-4538-9ce5-8c2cc93b94c7/backends/overlay/scratch/25d9b6e4-fd0e-401e-bb9e-754dbaec4f55/upperdir,workdir=/var/lib/mesos/provisioner/containers/fd574bbf-4bc0-4538-9ce5-8c2cc93b94c7/backends/overlay/scratch/2
 5d9b6e4-fd0e-401e-bb9e-754dbaec4f55/workdir'
> ...
> (after executor exited)
> I0816 01:04:34.859851 46584 overlay.cpp:281] R

Re: Review Request 52105: Prevent `/redirect/foo` loop.

2016-09-23 Thread Charles Allen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52105/
---

(Updated Sept. 23, 2016, 6:04 p.m.)


Review request for mesos and Vinod Kone.


Changes
---

Fixing review comments


Bugs: MESOS-6210
https://issues.apache.org/jira/browse/MESOS-6210


Repository: mesos


Description
---

Prevent `/redirect/foo` loop.


Diffs (updated)
-

  src/master/http.cpp 9005e7c308d5f57c6f5c573951d468a3ba730740 

Diff: https://reviews.apache.org/r/52105/diff/


Testing (updated)
---

Single Node started with `./src/mesos-master --work_dir=/tmp/mesos`


```
Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> GET /redirect HTTP/1.1
> Host: 127.0.0.1:5050
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 307 Temporary Redirect
< Date: Fri, 23 Sep 2016 17:30:17 GMT
< Location: //10.17.97.185:5050
< Content-Length: 0
<
* Connection #0 to host 127.0.0.1 left intact
Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect/
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> GET /redirect/ HTTP/1.1
> Host: 127.0.0.1:5050
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Date: Fri, 23 Sep 2016 17:30:19 GMT
< Content-Length: 0
<
* Connection #0 to host 127.0.0.1 left intact
Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect/foo
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> GET /redirect/foo HTTP/1.1
> Host: 127.0.0.1:5050
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Date: Fri, 23 Sep 2016 17:30:21 GMT
< Content-Length: 0
<
* Connection #0 to host 127.0.0.1 left intact
Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect_foo
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> GET /redirect_foo HTTP/1.1
> Host: 127.0.0.1:5050
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Date: Fri, 23 Sep 2016 17:30:23 GMT
< Content-Length: 0
<
* Connection #0 to host 127.0.0.1 left intact
Charless-MacBook-Pro:~ charlesallen$ curl -v 
http://127.0.0.1:5050/master/redirect/
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> GET /master/redirect/ HTTP/1.1
> Host: 127.0.0.1:5050
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Date: Fri, 23 Sep 2016 17:30:49 GMT
< Content-Length: 0
<
* Connection #0 to host 127.0.0.1 left intact
Charless-MacBook-Pro:~ charlesallen$ curl -v 
http://127.0.0.1:5050/master/redirect
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> GET /master/redirect HTTP/1.1
> Host: 127.0.0.1:5050
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 307 Temporary Redirect
< Date: Fri, 23 Sep 2016 17:30:50 GMT
< Location: //10.17.97.185:5050
< Content-Length: 0
<
* Connection #0 to host 127.0.0.1 left intact
Charless-MacBook-Pro:~ charlesallen$ curl -v 
http://127.0.0.1:5050/master/redirect_foo
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> GET /master/redirect_foo HTTP/1.1
> Host: 127.0.0.1:5050
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Date: Fri, 23 Sep 2016 17:30:53 GMT
< Content-Length: 0
<
* Connection #0 to host 127.0.0.1 left intact
Charless-MacBook-Pro:~ charlesallen$ curl -v 
http://127.0.0.1:5050/master/redirect/foo
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> GET /master/redirect/foo HTTP/1.1
> Host: 127.0.0.1:5050
> User-Agent: curl/7.43.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Date: Fri, 23 Sep 2016 17:30:55 GMT
< Content-Length: 0
<
* Connection #0 to host 127.0.0.1 left intact
```

It is worth noting that in this PR /master/redirect/  and /redirect/ return 
404, not 307. This behavior is consistent with master.


Thanks,

Charles Allen



Re: Review Request 51124: Support more layers through symlink for overlay backend.

2016-09-23 Thread Jie Yu


> On Sept. 21, 2016, 6:26 p.m., Jie Yu wrote:
> > src/tests/containerizer/provisioner_backend_tests.cpp, lines 116-131
> > 
> >
> > I'd suggest we create a separate test. Also, i suggest we construct a 
> > lot of layers (using a for loop), and make sure the test fail without 
> > applying this patch and succeed after applying this patch. This better 
> > captures the regression.
> 
> Zhitao Li wrote:
> +1 for the for loop based test w/ many layers.
> For the comment of "create a separate test", do you mean move these link 
> related comparisons to that test?

well, i don't think we need to test the links in the unit tests. It's an impl. 
detail. We just need to test if we can handle many layers, if without the link 
solution will definitely fail.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51124/#review149863
---


On Sept. 22, 2016, 8:52 p.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51124/
> ---
> 
> (Updated Sept. 22, 2016, 8:52 p.m.)
> 
> 
> Review request for mesos, Xiaojian Huang, Gilbert Song, and Jie Yu.
> 
> 
> Bugs: MESOS-6000
> https://issues.apache.org/jira/browse/MESOS-6000
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Support more layers through symlink for overlay backend.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/provisioner/backend.hpp 
> 93819c887f2f801f06aae7020084b56cc81ce818 
>   src/slave/containerizer/mesos/provisioner/backends/aufs.hpp 
> a5ac13bc17b8f291472b622ebfdd0cd1dd072273 
>   src/slave/containerizer/mesos/provisioner/backends/aufs.cpp 
> 4c5cdb660dea205aea29d217ba80d845d1108fdf 
>   src/slave/containerizer/mesos/provisioner/backends/bind.hpp 
> 19d44c6a9a2cd5f8de79f33c76e9d79900ab003f 
>   src/slave/containerizer/mesos/provisioner/backends/bind.cpp 
> 62ccaec1a5cfb466e929f4db7bf7e5376f2a0c2d 
>   src/slave/containerizer/mesos/provisioner/backends/copy.hpp 
> 5bd6a5279d8dd6d088afcd58e4839bf6087ccd1e 
>   src/slave/containerizer/mesos/provisioner/backends/copy.cpp 
> e57bb3d2fb4e67e9caae416824a6a748db1460a1 
>   src/slave/containerizer/mesos/provisioner/backends/overlay.hpp 
> 387f28a331813c75a509b4a31dbbdc764080b8c1 
>   src/slave/containerizer/mesos/provisioner/backends/overlay.cpp 
> e20cd48463a78ecd8a652a4d5ac923dc02cca4d9 
>   src/slave/containerizer/mesos/provisioner/provisioner.cpp 
> f37c45ccfa572876dfbba6a0797c223896db5a7f 
>   src/tests/containerizer/provisioner_backend_tests.cpp 
> 7c9a7cd5c733f74e10316fc1234115e6820cc2cb 
> 
> Diff: https://reviews.apache.org/r/51124/diff/
> 
> 
> Testing
> ---
> 
> 1. Make sure `OverlayBackendTest.*` passes;
> 2. Use mesos-execute to provision a container using overlay backend. Observed 
> following log lines:
> 
> ```
> I0816 01:04:27.823420 46584 overlay.cpp:167] Created tempLink at 
> '/var/lib/mesos/provisioner/containers/fd574bbf-4bc0-4538-9ce5-8c2cc93b94c7/backends/overlay/links'
>  to '/tmp/NcmRZt'
> I0816 01:04:27.824834 46584 overlay.cpp:191] Provisioning image rootfs with 
> overlayfs: 
> 'lowerdir=/tmp/NcmRZt/42:/tmp/NcmRZt/41:/tmp/NcmRZt/40:/tmp/NcmRZt/39:/tmp/NcmRZt/38:/tmp/NcmRZt/37:/tmp/NcmRZt/36:/tmp/NcmRZt/35:/tmp/NcmRZt/34:/tmp/NcmRZt/33:/tmp/NcmRZt/32:/tmp/NcmRZt/31:/tmp/NcmRZt/30:/tmp/NcmRZt/29:/tmp/NcmRZt/28:/tmp/NcmRZt/27:/tmp/NcmRZt/26:/tmp/NcmRZt/25:/tmp/NcmRZt/24:/tmp/NcmRZt/23:/tmp/NcmRZt/22:/tmp/NcmRZt/21:/tmp/NcmRZt/20:/tmp/NcmRZt/19:/tmp/NcmRZt/18:/tmp/NcmRZt/17:/tmp/NcmRZt/16:/tmp/NcmRZt/15:/tmp/NcmRZt/14:/tmp/NcmRZt/13:/tmp/NcmRZt/12:/tmp/NcmRZt/11:/tmp/NcmRZt/10:/tmp/NcmRZt/9:/tmp/NcmRZt/8:/tmp/NcmRZt/7:/tmp/NcmRZt/6:/tmp/NcmRZt/5:/tmp/NcmRZt/4:/tmp/NcmRZt/3:/tmp/NcmRZt/2:/tmp/NcmRZt/1:/tmp/NcmRZt/0,upperdir=/var/lib/mesos/provisioner/containers/fd574bbf-4bc0-4538-9ce5-8c2cc93b94c7/backends/overlay/scratch/25d9b6e4-fd0e-401e-bb9e-754dbaec4f55/upperdir,workdir=/var/lib/mesos/provisioner/containers/fd574bbf-4bc0-4538-9ce5-8c2cc93b94c7/backends/overlay/scratch/2
 5d9b6e4-fd0e-401e-bb9e-754dbaec4f55/workdir'
> ...
> (after executor exited)
> I0816 01:04:34.859851 46584 overlay.cpp:281] Removed temporary directory 
> '/tmp/NcmRZt' pointed by 
> '/var/lib/mesos/provisioner/containers/fd574bbf-4bc0-4538-9ce5-8c2cc93b94c7/backends/overlay/links'
> ```
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Re: Review Request 52219: Renamed slave to agent in subprocess.cpp.

2016-09-23 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52219/#review150212
---




3rdparty/libprocess/src/subprocess.cpp (line 83)


why libprocess is aware of Mesos agent? We should work it differently. 
Maybe just remove the agenth here completely


- Jie Yu


On Sept. 23, 2016, 3:46 p.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52219/
> ---
> 
> (Updated Sept. 23, 2016, 3:46 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Joerg Schad.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Renamed slave to agent in subprocess.cpp.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/subprocess.cpp 
> 8b244434d61884e4bddc4694350265111414e935 
> 
> Diff: https://reviews.apache.org/r/52219/diff/
> 
> 
> Testing
> ---
> 
> These typos are introduced in https://reviews.apache.org/r/45491/
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 51560: Deprecated using health checks without setting the type.

2016-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51560/#review150204
---



Patch looks great!

Reviews applied: [51561, 51803, 51560]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Sept. 23, 2016, 3:38 p.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51560/
> ---
> 
> (Updated Sept. 23, 2016, 3:38 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Joseph Wu, Silas Snider, and 
> Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6110
> https://issues.apache.org/jira/browse/MESOS-6110
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Health checks must supply set type field from now on. Additionally,
> `HealthCheck.HTTP` message has been renamed to
> `HealthCheck.HttpCheckInfo` to avoid naming collisions.
> 
> 
> Diffs
> -
> 
>   src/health-check/health_checker.cpp 
> 736725c4ef954ece8580f383cfd31d289795903f 
>   src/tests/health_check_tests.cpp e6b02f23dc1b1b84381ab0af0e3df5918b60ae40 
> 
> Diff: https://reviews.apache.org/r/51560/diff/
> 
> 
> Testing
> ---
> 
> Could verfied from 
> https://github.com/haosdent/mesos/blob/85abb6da058bf8daf4e794df7da8eb5f240b671c/docs/upgrades.md
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 52048: Fixed OsTest.User test failure due to gids ordering.

2016-09-23 Thread Mao Geng


> On Sept. 23, 2016, 4:43 p.m., haosdent huang wrote:
> > 3rdparty/stout/tests/os_tests.cpp, lines 697-699
> > 
> >
> > How about `EXPECT_EQ(tokens.get(), expected_gids);` here?

Not sure if EXPECT_EQ can compare two vectors directly? 
I saw 
https://github.com/google/googletest/blob/master/googletest/docs/Primer.md has 
an example:  
```
for (int i = 0; i < x.size(); ++i) {
  EXPECT_EQ(x[i], y[i]) << "Vectors x and y differ at index " << i;
}
```


- Mao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52048/#review150202
---


On Sept. 22, 2016, 7:29 a.m., Mao Geng wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52048/
> ---
> 
> (Updated Sept. 22, 2016, 7:29 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Kapil Arya, and Joseph Wu.
> 
> 
> Bugs: MESOS-5909
> https://issues.apache.org/jira/browse/MESOS-5909
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed OsTest.User test failure due to gids ordering.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/tests/os_tests.cpp c2900b8 
> 
> Diff: https://reviews.apache.org/r/52048/diff/
> 
> 
> Testing
> ---
> 
> Just changed the OsTest.User test. `make check` passed on Ubuntu 12.04.5 LTS.
> 
> 
> Thanks,
> 
> Mao Geng
> 
>



Re: Review Request 52048: Fixed OsTest.User test failure due to gids ordering.

2016-09-23 Thread haosdent huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52048/#review150202
---




3rdparty/stout/tests/os_tests.cpp (lines 697 - 699)


How about `EXPECT_EQ(tokens.get(), expected_gids);` here?


- haosdent huang


On Sept. 22, 2016, 7:29 a.m., Mao Geng wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52048/
> ---
> 
> (Updated Sept. 22, 2016, 7:29 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Kapil Arya, and Joseph Wu.
> 
> 
> Bugs: MESOS-5909
> https://issues.apache.org/jira/browse/MESOS-5909
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed OsTest.User test failure due to gids ordering.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/tests/os_tests.cpp c2900b8 
> 
> Diff: https://reviews.apache.org/r/52048/diff/
> 
> 
> Testing
> ---
> 
> Just changed the OsTest.User test. `make check` passed on Ubuntu 12.04.5 LTS.
> 
> 
> Thanks,
> 
> Mao Geng
> 
>



Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-09-23 Thread Jacob Janco

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/
---

(Updated Sept. 23, 2016, 4:32 p.m.)


Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus Ma, 
and Jiang Yan Xu.


Changes
---

Small update to batch.


Summary (updated)
-

Track allocation candidates to bound allocator.


Bugs: MESOS-3157
https://issues.apache.org/jira/browse/MESOS-3157


Repository: mesos


Description (updated)
---

- Triggered allocations dispatch allocate() only
  if there is no pending allocation in the queue.
- Allocation candidates are accumulated and only
  cleared when enqueued allocations are processed.


Diffs (updated)
-

  src/master/allocator/mesos/hierarchical.hpp 
2c31471ee0f5d6836393bf87ff9ecfd8df835013 
  src/master/allocator/mesos/hierarchical.cpp 
2d56bd011f2c87c67a02d0ae467a4a537d36867e 

Diff: https://reviews.apache.org/r/51027/diff/


Testing
---

make check

note: check without filters depends on https://reviews.apache.org/r/51028

With new benchmark https://reviews.apache.org/r/49617: 
Sample output without 51027:
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
Using 1 agents and 3000 frameworks
Added 3000 frameworks in 57251us
Added 1 agents in 3.2134535333mins
allocator settled after  1.6123603833mins
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
 (290578 ms)

Sample output with 51027:
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
Using 1 agents and 3000 frameworks
Added 3000 frameworks in 39817us
Added 1 agents in 3.2286054167mins
allocator settled after  25.525654secs
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.FrameworkFailover/22
 (220137 ms)


Thanks,

Jacob Janco



Re: Review Request 52105: Prevent `/redirect/foo` loop.

2016-09-23 Thread haosdent huang


> On Sept. 23, 2016, 3:05 p.m., haosdent huang wrote:
> > src/master/http.cpp, lines 2038-2040
> > 
> >
> > Another problem here is if we call `redirect(/redirect_a)` in http.cpp, 
> > would get unexpected result. So how about just return 404 which consistent 
> > with `/redirect/`
> > 
> > ```
> >   // When current master is not the leader, redirect to the leading 
> > master.
> >   if (!master->elected()) {
> > return redirect(request);
> >   }
> > ```
> 
> Charles Allen wrote:
> Would that mean things like `/redirect/foo` should also return 404?
> 
> haosdent huang wrote:
> yes.

This is consistent with `/redirect/`. And I think something like 
`/redirect/foo/redirect/bar/xxx` make it messy if we keep redirecting to 
`/foo/redirect/xxx` here.


- haosdent


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52105/#review150185
---


On Sept. 22, 2016, 8:47 p.m., Charles Allen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52105/
> ---
> 
> (Updated Sept. 22, 2016, 8:47 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6210
> https://issues.apache.org/jira/browse/MESOS-6210
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Prevent `/redirect/foo` loop.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 9005e7c308d5f57c6f5c573951d468a3ba730740 
> 
> Diff: https://reviews.apache.org/r/52105/diff/
> 
> 
> Testing
> ---
> 
> Single Node started with `./src/mesos-master --work_dir=/tmp/mesos`
> 
> 
> ```
> Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:18 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:23 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/ HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Thu, 22 Sep 2016 20:35:29 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/foo
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/foo HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:31 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/foo/bar
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/foo/bar HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:34 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/redirect/foo/bar
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect/foo/bar HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:41 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/redirectNOTFOUND/foo/bar
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirectNOTFOUND/foo/bar HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Thu, 22 Sep 2016 20:35:47 GMT

Re: Review Request 52105: Prevent `/redirect/foo` loop.

2016-09-23 Thread haosdent huang


> On Sept. 23, 2016, 3:05 p.m., haosdent huang wrote:
> > src/master/http.cpp, lines 2038-2040
> > 
> >
> > Another problem here is if we call `redirect(/redirect_a)` in http.cpp, 
> > would get unexpected result. So how about just return 404 which consistent 
> > with `/redirect/`
> > 
> > ```
> >   // When current master is not the leader, redirect to the leading 
> > master.
> >   if (!master->elected()) {
> > return redirect(request);
> >   }
> > ```
> 
> Charles Allen wrote:
> Would that mean things like `/redirect/foo` should also return 404?

yes.


- haosdent


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52105/#review150185
---


On Sept. 22, 2016, 8:47 p.m., Charles Allen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52105/
> ---
> 
> (Updated Sept. 22, 2016, 8:47 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6210
> https://issues.apache.org/jira/browse/MESOS-6210
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Prevent `/redirect/foo` loop.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 9005e7c308d5f57c6f5c573951d468a3ba730740 
> 
> Diff: https://reviews.apache.org/r/52105/diff/
> 
> 
> Testing
> ---
> 
> Single Node started with `./src/mesos-master --work_dir=/tmp/mesos`
> 
> 
> ```
> Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:18 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:23 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/ HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Thu, 22 Sep 2016 20:35:29 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/foo
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/foo HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:31 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/foo/bar
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/foo/bar HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:34 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/redirect/foo/bar
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect/foo/bar HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:41 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/redirectNOTFOUND/foo/bar
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirectNOTFOUND/foo/bar HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Thu, 22 Sep 2016 20:35:47 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> ```
> 
> It is worth noting that in this PR /master/redirect/  and /redirect/ return 
> 404, not 307. This behavior is con

Re: Review Request 52105: Prevent `/redirect/foo` loop.

2016-09-23 Thread Charles Allen


> On Sept. 23, 2016, 3:05 p.m., haosdent huang wrote:
> > src/master/http.cpp, lines 2038-2040
> > 
> >
> > Another problem here is if we call `redirect(/redirect_a)` in http.cpp, 
> > would get unexpected result. So how about just return 404 which consistent 
> > with `/redirect/`
> > 
> > ```
> >   // When current master is not the leader, redirect to the leading 
> > master.
> >   if (!master->elected()) {
> > return redirect(request);
> >   }
> > ```

Would that mean things like `/redirect/foo` should also return 404?


- Charles


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52105/#review150185
---


On Sept. 22, 2016, 8:47 p.m., Charles Allen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52105/
> ---
> 
> (Updated Sept. 22, 2016, 8:47 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6210
> https://issues.apache.org/jira/browse/MESOS-6210
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Prevent `/redirect/foo` loop.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 9005e7c308d5f57c6f5c573951d468a3ba730740 
> 
> Diff: https://reviews.apache.org/r/52105/diff/
> 
> 
> Testing
> ---
> 
> Single Node started with `./src/mesos-master --work_dir=/tmp/mesos`
> 
> 
> ```
> Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:18 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:23 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/ HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Thu, 22 Sep 2016 20:35:29 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/foo
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/foo HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:31 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/foo/bar
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/foo/bar HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:34 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/redirect/foo/bar
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect/foo/bar HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:41 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/redirectNOTFOUND/foo/bar
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirectNOTFOUND/foo/bar HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Thu, 22 Sep 2016 20:35:47 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> ```
> 
> It is worth noting that in this PR /master/redirect/  and /redirect/ return 
> 404, not 307. This behavior is consistent with master.
> 
> 
> Thanks,
>

Re: Review Request 52219: Renamed slave to agent in subprocess.cpp.

2016-09-23 Thread Joerg Schad

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52219/#review150194
---


Ship it!




Ship It!

- Joerg Schad


On Sept. 23, 2016, 3:46 p.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52219/
> ---
> 
> (Updated Sept. 23, 2016, 3:46 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Joerg Schad.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Renamed slave to agent in subprocess.cpp.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/subprocess.cpp 
> 8b244434d61884e4bddc4694350265111414e935 
> 
> Diff: https://reviews.apache.org/r/52219/diff/
> 
> 
> Testing
> ---
> 
> These typos are introduced in https://reviews.apache.org/r/45491/
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Review Request 52219: Renamed slave to agent in subprocess.cpp.

2016-09-23 Thread haosdent huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52219/
---

Review request for mesos, Jie Yu and Joerg Schad.


Repository: mesos


Description
---

Renamed slave to agent in subprocess.cpp.


Diffs
-

  3rdparty/libprocess/src/subprocess.cpp 
8b244434d61884e4bddc4694350265111414e935 

Diff: https://reviews.apache.org/r/52219/diff/


Testing
---

These typos are introduced in https://reviews.apache.org/r/45491/


Thanks,

haosdent huang



Re: Review Request 52062: Fixed warnings in `numify.hpp`.

2016-09-23 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52062/#review150192
---



(1) Is it possible to narrow the warning disabling statements like this?
```
#ifdef __WINDOWS__
  #pragma warning(disable:4146)
#endif
result = -result;
#ifdef __WINDOWS__
  #pragma warning(default:4146)
#endif
```

(2) Could just leave a short comment around the reason for turning off this 
warning at this specific spot?

- Michael Park


On Sept. 23, 2016, 4:46 a.m., Daniel Pravat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52062/
> ---
> 
> (Updated Sept. 23, 2016, 4:46 a.m.)
> 
> 
> Review request for mesos, Alex Naparu, Artem Harutyunyan, Alex Clemmer, 
> Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When the template is instatiated for unsigned scalars
> the unary negation generates warnings.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/numify.hpp 
> c174fcb8cb9d809f443e44058f07b58751bed9dd 
> 
> Diff: https://reviews.apache.org/r/52062/diff/
> 
> 
> Testing
> ---
> 
> Windows: build
> 
> 
> Thanks,
> 
> Daniel Pravat
> 
>



Re: Review Request 51560: Deprecated using health checks without setting the type.

2016-09-23 Thread haosdent huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51560/
---

(Updated Sept. 23, 2016, 3:38 p.m.)


Review request for mesos, Alexander Rukletsov, Joseph Wu, Silas Snider, and 
Jiang Yan Xu.


Changes
---

Address @alexr's comments.


Bugs: MESOS-6110
https://issues.apache.org/jira/browse/MESOS-6110


Repository: mesos


Description
---

Health checks must supply set type field from now on. Additionally,
`HealthCheck.HTTP` message has been renamed to
`HealthCheck.HttpCheckInfo` to avoid naming collisions.


Diffs (updated)
-

  src/health-check/health_checker.cpp 736725c4ef954ece8580f383cfd31d289795903f 
  src/tests/health_check_tests.cpp e6b02f23dc1b1b84381ab0af0e3df5918b60ae40 

Diff: https://reviews.apache.org/r/51560/diff/


Testing
---

Could verfied from 
https://github.com/haosdent/mesos/blob/85abb6da058bf8daf4e794df7da8eb5f240b671c/docs/upgrades.md


Thanks,

haosdent huang



Re: Review Request 51803: Ensured `HealthCheck::HTTPCheckInfo` compatible with the old one.

2016-09-23 Thread haosdent huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51803/
---

(Updated Sept. 23, 2016, 3:28 p.m.)


Review request for mesos, Alexander Rukletsov, Joseph Wu, Silas Snider, and 
Jiang Yan Xu.


Changes
---

Address @alexr and @xujyan's comments.


Bugs: MESOS-6110
https://issues.apache.org/jira/browse/MESOS-6110


Repository: mesos


Description
---

Ensured `HealthCheck::HTTPCheckInfo` compatible with the old one.


Diffs (updated)
-

  include/mesos/mesos.proto 2209ea2fb0bf39c773d60f8a0eea865320a03bb6 
  include/mesos/v1/mesos.proto 00c623450268a990d48b4e119aa9429fabf2f135 

Diff: https://reviews.apache.org/r/51803/diff/


Testing
---


Thanks,

haosdent huang



Re: Review Request 51803: Ensured `HealthCheck::HTTPCheckInfo` compatible with the old one.

2016-09-23 Thread Alexander Rukletsov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51803/#review150184
---




include/mesos/mesos.proto (line 338)


s/"strategy"/type



include/mesos/mesos.proto (line 339)


s/strategy/type



include/mesos/mesos.proto (lines 349 - 352)


Let's rephrase a bit. How about

```
  // Describes an HTTP health check. Sends a GET request to
  // scheme://:port/path. Note that  is not configurable and is
  // resolved automatically, in most cases to 127.0.0.1. Default executors
  // treat return codes between 200 and 399 as success; custom executors
  // may employ a different strategy, e.g. leveraging the `statuses` field.
```



include/mesos/mesos.proto (line 352)


s/localhost/127.0.0.1



include/mesos/mesos.proto (lines 371 - 372)


Let's follow Yan's suggestion and add something like

```
// NOTE: It is up to the custom executor to interpret and act on this
// field. Setting this field has no effect on the default executors.
//
// TODO(haosdent): Deprecate this field when we add better support for
// success and possibly failure statuses, e.g. ranges of success and
// failure statuses.
```


- Alexander Rukletsov


On Sept. 21, 2016, 6:21 p.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51803/
> ---
> 
> (Updated Sept. 21, 2016, 6:21 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Joseph Wu, Silas Snider, and 
> Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6110
> https://issues.apache.org/jira/browse/MESOS-6110
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Ensured `HealthCheck::HTTPCheckInfo` compatible with the old one.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 2209ea2fb0bf39c773d60f8a0eea865320a03bb6 
>   include/mesos/v1/mesos.proto 00c623450268a990d48b4e119aa9429fabf2f135 
> 
> Diff: https://reviews.apache.org/r/51803/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 52105: Prevent `/redirect/foo` loop.

2016-09-23 Thread haosdent huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52105/#review150185
---




src/master/http.cpp (lines 2036 - 2044)


Becasue we have an util function `startsWith`, how about change it like

```
diff --git a/src/master/http.cpp b/src/master/http.cpp
index e9f9d16..c77229a 100644
--- a/src/master/http.cpp
+++ b/src/master/http.cpp
@@ -2038,6 +2038,11 @@ Future Master::Http::redirect(const 
Request& request) const
 // When request url is '/redirect' or '/master/redirect', redirect to 
the
 // base url of leading master to avoid infinite redirect loop.
 return TemporaryRedirect(basePath);
+  } else if (strings::startsWith(request.url.path, "/redirect/") ||
+ strings::startsWith(request.url.path,
+ "/" + master->self().id + "/redirect/")) {
+// Stop redirection here to avoid fall into an infinite redirection 
loop.
+return NotFound();
   } else {
```



src/master/http.cpp (lines 2038 - 2040)


Another problem here is if we call `redirect(/redirect_a)` in http.cpp, 
would get unexpected result. So how about just return 404 which consistent with 
`/redirect/`

```
  // When current master is not the leader, redirect to the leading master.
  if (!master->elected()) {
return redirect(request);
  }
```


- haosdent huang


On Sept. 22, 2016, 8:47 p.m., Charles Allen wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52105/
> ---
> 
> (Updated Sept. 22, 2016, 8:47 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6210
> https://issues.apache.org/jira/browse/MESOS-6210
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Prevent `/redirect/foo` loop.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 9005e7c308d5f57c6f5c573951d468a3ba730740 
> 
> Diff: https://reviews.apache.org/r/52105/diff/
> 
> 
> Testing
> ---
> 
> Single Node started with `./src/mesos-master --work_dir=/tmp/mesos`
> 
> 
> ```
> Charless-MacBook-Pro:~ charlesallen$ curl -v http://127.0.0.1:5050/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:18 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:23 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/ HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> < Date: Thu, 22 Sep 2016 20:35:29 GMT
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/foo
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/foo HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:31 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/master/redirect/foo/bar
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /master/redirect/foo/bar HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> < Date: Thu, 22 Sep 2016 20:35:34 GMT
> < Location: //10.17.97.185:5050
> < Content-Length: 0
> <
> * Connection #0 to host 127.0.0.1 left intact
> Charless-MacBook-Pro:~ charlesallen$ curl -v 
> http://127.0.0.1:5050/redirect/foo/bar
> *   Trying 127.0.0.1...
> * Connected to 127.0.0.1 (127.0.0.1) port 5050 (#0)
> > GET /redirect/foo/bar HTTP/1.1
> > Host: 127.0.0.1:5050
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
>

Re: Review Request 51560: Deprecated using health checks without setting the type.

2016-09-23 Thread Alexander Rukletsov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51560/#review150182
---




src/health-check/health_checker.cpp (line 188)


`UNREACHABLE` does not say what the problem is. How about this:
`LOG(FATAL) << "Invalid health check";`



src/health-check/health_checker.cpp (lines 191 - 192)


s/. U/; u
please move trailing space onto the next line



src/health-check/health_checker.cpp (lines 620 - 621)


Blank comment line, please



src/tests/health_check_tests.cpp (lines 210 - 211)


Blank comment line, please.



src/tests/health_check_tests.cpp (lines 223 - 224)


Ditto.


- Alexander Rukletsov


On Sept. 21, 2016, 6:21 p.m., haosdent huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51560/
> ---
> 
> (Updated Sept. 21, 2016, 6:21 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Joseph Wu, Silas Snider, and 
> Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6110
> https://issues.apache.org/jira/browse/MESOS-6110
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Health checks must supply set type field from now on. Additionally,
> `HealthCheck.HTTP` message has been renamed to
> `HealthCheck.HttpCheckInfo` to avoid naming collisions.
> 
> 
> Diffs
> -
> 
>   src/health-check/health_checker.cpp 
> 758cbb382b08743f518022ba66eaafbea7615592 
>   src/tests/health_check_tests.cpp 30b5a2f6b256b795853553f2e5cb3c5e08512a5f 
> 
> Diff: https://reviews.apache.org/r/51560/diff/
> 
> 
> Testing
> ---
> 
> Could verfied from 
> https://github.com/haosdent/mesos/blob/85abb6da058bf8daf4e794df7da8eb5f240b671c/docs/upgrades.md
> 
> 
> Thanks,
> 
> haosdent huang
> 
>



Re: Review Request 51651: Updated sorting in 'MountInfoTable::read()' to use iterative algorithm.

2016-09-23 Thread Ian Babrou

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51651/#review150181
---




src/linux/fs.cpp (line 156)


Can there be a newline after `:`? Otherwise the first line is glued to the 
error itself.


- Ian Babrou


On Sept. 6, 2016, 2:18 p.m., Kevin Klues wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51651/
> ---
> 
> (Updated Sept. 6, 2016, 2:18 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, and Joseph Wu.
> 
> 
> Bugs: MESOS-6100 and MESOS-6118
> https://issues.apache.org/jira/browse/MESOS-6100
> https://issues.apache.org/jira/browse/MESOS-6118
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, we were using a recursive lambda with captured state
> as part of the algorithm for sorting the entries in the mount info
> table. In certain edge cases, it appeared that doing things this way
> may have caused memory corruption that was difficult to reason about.
> 
> To avoid this, we have changed the recursive algorithm to an iterative
> one. As a side effect, the actual ordering of the entries is now
> different with the new algorithm (the original algorithm did a
> depth-first traversal of the mount hierarchy and the new one does a
> breadth-first traversal).  However, both algorithms maintain the same
> invariant -- all parent mount points appear earlier in the sorted list
> than their children.
> 
> 
> Diffs
> -
> 
>   src/linux/fs.cpp 14ae5a9089916549f691363bc2269e13c5260a14 
> 
> Diff: https://reviews.apache.org/r/51651/diff/
> 
> 
> Testing
> ---
> 
> `GTEST_FILTER="*MountInfoTableReadSorted*" make -j24 check`
> 
> Everything looks as expected according to the Mesosphere CI.
> 15x :smile: - 2x :expressionless: - 2x :disappointed:
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>



Re: Review Request 52175: Tweaked test comments.

2016-09-23 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52175/#review150180
---



Patch looks great!

Reviews applied: [52214, 52174, 52175]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Sept. 23, 2016, 11:21 a.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52175/
> ---
> 
> (Updated Sept. 23, 2016, 11:21 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Tweaked test comments.
> 
> 
> Diffs
> -
> 
>   src/tests/partition_tests.cpp 7c38f0efa414447e6292b2d6b334fb9879c92eb5 
> 
> Diff: https://reviews.apache.org/r/52175/diff/
> 
> 
> Testing
> ---
> 
> `make check` on OSX.
> 
> 
> Thanks,
> 
> Neil Conway
> 
>



Re: Review Request 52175: Tweaked test comments.

2016-09-23 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52175/
---

(Updated Sept. 23, 2016, 11:21 a.m.)


Review request for mesos and Vinod Kone.


Changes
---

Rebase.


Repository: mesos


Description
---

Tweaked test comments.


Diffs (updated)
-

  src/tests/partition_tests.cpp 7c38f0efa414447e6292b2d6b334fb9879c92eb5 

Diff: https://reviews.apache.org/r/52175/diff/


Testing
---

`make check` on OSX.


Thanks,

Neil Conway



Re: Review Request 52174: Fixed bug with unreachable tasks and disconnected frameworks.

2016-09-23 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52174/
---

(Updated Sept. 23, 2016, 11:17 a.m.)


Review request for mesos and Vinod Kone.


Changes
---

Fetch `FrameworkInfo` for recovered frameworks.


Bugs: MESOS-6226
https://issues.apache.org/jira/browse/MESOS-6226


Repository: mesos


Description (updated)
---

We previously assumed that when marking a task unreachable, we would
have access to the `FrameworkInfo` for that task's framework. However,
that is not the case if the master has failed over and the framework has
not yet reregistered with the new master.


Diffs (updated)
-

  src/master/master.cpp 66a672f6d16233e96b29e330a9e6c474546fa851 
  src/tests/partition_tests.cpp 7c38f0efa414447e6292b2d6b334fb9879c92eb5 

Diff: https://reviews.apache.org/r/52174/diff/


Testing
---

`make check` on OSX.


Thanks,

Neil Conway



Re: Review Request 52174: Fixed bug with unreachable tasks and disconnected frameworks.

2016-09-23 Thread Neil Conway


> On Sept. 23, 2016, 3:02 a.m., Vinod Kone wrote:
> > src/master/master.cpp, line 5926
> > 
> >
> > No need for this hack. You can look in the `recovered` map for the 
> > FrameworkInfo of a recovered framework that has not resubscribed yet.

Good point, thanks.


- Neil


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52174/#review150127
---


On Sept. 22, 2016, 9:12 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52174/
> ---
> 
> (Updated Sept. 22, 2016, 9:12 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-6226
> https://issues.apache.org/jira/browse/MESOS-6226
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> We previously assumed that when marking a task unreachable, we would
> have access to the `FrameworkInfo` for that task's framework. However,
> that is not the case if the master has failed over and the framework has
> not yet reregistered with the new master.
> 
> In that situation, we don't have the framework's FrameworkInfo, and
> hence we cannot determine if the framework is partition-aware. For now,
> we assume the framework is partition-aware, which isn't great behavior
> (MESOS-6232).
> 
> 
> Diffs
> -
> 
>   src/master/master.cpp 66a672f6d16233e96b29e330a9e6c474546fa851 
>   src/tests/partition_tests.cpp 7c38f0efa414447e6292b2d6b334fb9879c92eb5 
> 
> Diff: https://reviews.apache.org/r/52174/diff/
> 
> 
> Testing
> ---
> 
> `make check` on OSX.
> 
> 
> Thanks,
> 
> Neil Conway
> 
>



Review Request 52214: Cleaned up some code for managing recovered frameworks.

2016-09-23 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52214/
---

Review request for mesos and Vinod Kone.


Repository: mesos


Description
---

Cleaned up some code for managing recovered frameworks.


Diffs
-

  src/master/master.hpp 48011eabda03986df3dfac124506645a398eaff4 
  src/master/master.cpp 66a672f6d16233e96b29e330a9e6c474546fa851 

Diff: https://reviews.apache.org/r/52214/diff/


Testing
---

`make check` on OSX and Linux.


Thanks,

Neil Conway