Re: Review Request 52534: Dispatch filter expiration twice.

2016-12-20 Thread Jacob Janco


> On Dec. 12, 2016, 5 p.m., Jiang Yan Xu wrote:
> > I feel it's hard to explain this patch without /r/51027/. It's probably 
> > better to have it go after /r/51027/ in the chain. It's fine if /r/51027/ 
> > requires this patch to pass the tests.

Yep, this this patch is necessary because of 51027. I'll chain accordingly.


- Jacob


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52534/#review158858
---


On Dec. 21, 2016, 6:07 a.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52534/
> ---
> 
> (Updated Dec. 21, 2016, 6:07 a.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Benjamin Mahler, and Jiang Yan 
> Xu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - With an asynchronous `batch()` allocation,
>   this ensures that filters do not expire
>   before the next allocation.
> - This patch should be reverted when allocation
>   occurs on resource recovery.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> a6424d624864155e1c87a28a63b784512c5c8722 
>   src/master/allocator/mesos/hierarchical.cpp 
> 91b1ec43940a788459f045ca4a4b82d4e8373bca 
> 
> Diff: https://reviews.apache.org/r/52534/diff/
> 
> 
> Testing
> ---
> 
> With https://reviews.apache.org/r/51027/: 
> 
> GTEST_FILTER="-*SmallOfferFilter*" make check -j8
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 52534: Dispatch filter expiration twice.

2016-12-20 Thread Jacob Janco


> On Dec. 12, 2016, 5:33 p.m., Alexander Rukletsov wrote:
> > src/master/allocator/mesos/hierarchical.cpp, lines 1084-1085
> > 
> >
> > If you want to add such a TODO, it should probably live close to 
> > `recoverResources()`.

Removed this TODO.


- Jacob


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52534/#review158879
---


On Dec. 21, 2016, 6:07 a.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52534/
> ---
> 
> (Updated Dec. 21, 2016, 6:07 a.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Benjamin Mahler, and Jiang Yan 
> Xu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - With an asynchronous `batch()` allocation,
>   this ensures that filters do not expire
>   before the next allocation.
> - This patch should be reverted when allocation
>   occurs on resource recovery.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> a6424d624864155e1c87a28a63b784512c5c8722 
>   src/master/allocator/mesos/hierarchical.cpp 
> 91b1ec43940a788459f045ca4a4b82d4e8373bca 
> 
> Diff: https://reviews.apache.org/r/52534/diff/
> 
> 
> Testing
> ---
> 
> With https://reviews.apache.org/r/51027/: 
> 
> GTEST_FILTER="-*SmallOfferFilter*" make check -j8
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 52534: Dispatch filter expiration twice.

2016-12-20 Thread Jacob Janco

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52534/
---

(Updated Dec. 21, 2016, 6:07 a.m.)


Review request for mesos, Alexander Rukletsov, Benjamin Mahler, and Jiang Yan 
Xu.


Repository: mesos


Description
---

- With an asynchronous `batch()` allocation,
  this ensures that filters do not expire
  before the next allocation.
- This patch should be reverted when allocation
  occurs on resource recovery.


Diffs (updated)
-

  src/master/allocator/mesos/hierarchical.hpp 
a6424d624864155e1c87a28a63b784512c5c8722 
  src/master/allocator/mesos/hierarchical.cpp 
91b1ec43940a788459f045ca4a4b82d4e8373bca 

Diff: https://reviews.apache.org/r/52534/diff/


Testing
---

With https://reviews.apache.org/r/51027/: 

GTEST_FILTER="-*SmallOfferFilter*" make check -j8


Thanks,

Jacob Janco



Re: Review Request 51027: Track allocation candidates to bound allocator.

2016-12-20 Thread Jacob Janco

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51027/
---

(Updated Dec. 21, 2016, 6:02 a.m.)


Review request for mesos, Benjamin Mahler, Guangya Liu, James Peach, Klaus Ma, 
and Jiang Yan Xu.


Bugs: MESOS-3157
https://issues.apache.org/jira/browse/MESOS-3157


Repository: mesos


Description
---

- Triggered allocations dispatch allocate() only
  if there is no pending allocation in the queue.
- Allocation candidates are accumulated and only
  cleared when enqueued allocations are processed.


Diffs
-

  src/master/allocator/mesos/hierarchical.hpp 
a6424d624864155e1c87a28a63b784512c5c8722 
  src/master/allocator/mesos/hierarchical.cpp 
91b1ec43940a788459f045ca4a4b82d4e8373bca 

Diff: https://reviews.apache.org/r/51027/diff/


Testing
---

make check with the filters below

Broken tests: 
- TEST_F(HierarchicalAllocatorTest, SuppressAndReviveOffers), fix in 51028
- TEST_F(HierarchicalAllocatorTest, AllocationRunsMetric), fix in 51028
- TEST_F(HierarchicalAllocatorTest, AllocationRunTimerMetrics), fix in 51028
- TEST_F(HierarchicalAllocatorTest, UpdateWeight), fix in 51028
- TEST_P(HierarchicalAllocator_BENCHMARK_Test, AddAndUpdateSlave), fix in 51028
- TEST_F(HierarchicalAllocatorTest, SmallOfferFilterTimeout), fix in 52534
- TEST_F(OversubscriptionTest, RescindRevocableOfferWithIncreasedRevocable), 
fix in 51621


Thanks,

Jacob Janco



Re: Review Request 51028: Fix tests with rapidly triggered allocations.

2016-12-20 Thread Jacob Janco


> On Sept. 29, 2016, 3:06 a.m., Guangya Liu wrote:
> > src/tests/hierarchical_allocator_tests.cpp, lines 3007-3011
> > 
> >
> > Just a question here, can we always guarantee only one `allocate()` 
> > will be called? I think that this depends on the speed of dispatching the 
> > `run` request.

Right, I took a look at this. I believe pausing the clock between the addSlave 
and addFramework messages is correct, this does depend on speed.


- Jacob


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51028/#review150798
---


On Dec. 21, 2016, 5:58 a.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51028/
> ---
> 
> (Updated Dec. 21, 2016, 5:58 a.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Per MESOS-3157, if cluster events with the possibility
>   of triggering an allocation occur rapidly and test
>   assertions depend on gleaning information from assumed
>   order, it is likely they will fail due to lack of parity
>   between event and actual allocation. Ensure that expected
>   allocations occur.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_tests.cpp 
> 1edd0ecc8a93cd41532e1cf3641f67c780ab23a5 
> 
> Diff: https://reviews.apache.org/r/51028/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 51028: Fix tests with rapidly triggered allocations.

2016-12-20 Thread Jacob Janco

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51028/
---

(Updated Dec. 21, 2016, 5:58 a.m.)


Review request for mesos and Jiang Yan Xu.


Changes
---

Fix AllocationRunTimerMetrics logic.


Summary (updated)
-

Fix tests with rapidly triggered allocations.


Bugs: MESOS-3157
https://issues.apache.org/jira/browse/MESOS-3157


Repository: mesos


Description
---

- Per MESOS-3157, if cluster events with the possibility
  of triggering an allocation occur rapidly and test
  assertions depend on gleaning information from assumed
  order, it is likely they will fail due to lack of parity
  between event and actual allocation. Ensure that expected
  allocations occur.


Diffs (updated)
-

  src/tests/hierarchical_allocator_tests.cpp 
1edd0ecc8a93cd41532e1cf3641f67c780ab23a5 

Diff: https://reviews.apache.org/r/51028/diff/


Testing
---

make check


Thanks,

Jacob Janco



Re: Review Request 54083: Made headers in stout standalone.

2016-12-20 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54083/#review159785
---


Ship it!




Ship It!

- Michael Park


On Dec. 14, 2016, 1:04 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54083/
> ---
> 
> (Updated Dec. 14, 2016, 1:04 a.m.)
> 
> 
> Review request for mesos, Michael Park and Qian Zhang.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Missing includes were found by attempting to parse the header files
> standalone with
> 
> $ clang++ -fsyntax-only -I  HEADER_FILE.hpp
> 
> By fixing failures we can make sure that all required symbols are
> defined.
> 
> Note this change does not address the issue of transitive includes.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/os.hpp 
> 63265eed28d7345dfa18ac1c5a22ef58a7902c62 
>   3rdparty/stout/include/stout/os/osx.hpp 
> 770c85d2a2569a4242d07b221f15a87565dd0bb5 
>   3rdparty/stout/include/stout/os/permissions.hpp 
> 1fa7614878cb0f58270824c88ab0013a9d9f14e9 
>   3rdparty/stout/include/stout/os/posix/socket.hpp 
> 836e4b3c37435a7f952321a13f3193043c1055d9 
>   3rdparty/stout/include/stout/os/posix/su.hpp 
> c3ce87e35761ab8543051cd3fcc6f0188a61eae1 
>   3rdparty/stout/include/stout/os/posix/xattr.hpp 
> 56f7308677dabebb6a84058bc210e48f10adfbc7 
> 
> Diff: https://reviews.apache.org/r/54083/diff/
> 
> 
> Testing
> ---
> 
> make check (OS X, clang trunk w/o optimizations, SSL)
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Review Request 54928: Added initial random delay to agent (re)registration.

2016-12-20 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54928/
---

Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Bugs: MESOS-6803
https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description
---

Currently when a master fails over, agents that do not use
authentication will choose a random time to re-register to avoid the
"thundering herd" problem. This is not true of the authenticated
codepath -- agents that have a credential will attempt to re-register
immediately.

This issue adds a random delay to the initial authentication to match
the non-authenticated code path.

This issue caps off a chain of reviews that fixes tests to work with
this change. This issue also resolves MESOS-6803.


Diffs
-

  src/slave/slave.cpp a7a3a394e5e4b7f40a051663cd70add3890bdf18 

Diff: https://reviews.apache.org/r/54928/diff/


Testing
---

`make check` on Unix and I ran the test suite 1000 times overnight, minus the 
few flaky tests.


Thanks,

Alex Clemmer



Review Request 54927: Fixed partition to pass when `HAS_AUTHENTICATED` is undefined.

2016-12-20 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54927/
---

Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Bugs: MESOS-6803
https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description
---

This is a follow-on to 54803.

Currently, when `HAS_AUTHENTICATION` is undefined, the Agent will
use `delay` to schedule a random time in the future to register with the
Master, to avoid the thundering herd problem after a Master failover.
The authentication codepath, in contrast, schedules the registration
immediately.

In tests where we have `Clock::pause`'d when we are supposed to be
registering the slave, the authention codepath will succeeed, while
no-authentication codepath will hang forever.

A much more detailed analysis of this situation exists in MESOS-6803.

This commit will resolve this issue for a few existing test files by
changing the tests to not use `Clock::pause` when we are waiting for
Agent registration.


Diffs
-

  src/tests/partition_tests.cpp e1e2025bd0f078836323cbd8c6d7836815c4c38d 

Diff: https://reviews.apache.org/r/54927/diff/


Testing
---

`make check` and each test was run 1000 times to avoid flakey tests.


Thanks,

Alex Clemmer



Re: Review Request 54909: Fixed spurious registration bug in framework and agent.

2016-12-20 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54909/
---

(Updated Dec. 21, 2016, 3:09 a.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Changes
---

Let's fix the spurious framework registration bug while we're at it! :)


Summary (updated)
-

Fixed spurious registration bug in framework and agent.


Bugs: MESOS-6803
https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description (updated)
---

This issue appears when `HAS_AUTHENTICATION` is not defined. This commit
will resolve the issue, and fix tests that surfaced it (as well as some
associated errors that cause them to be flaky when this symbol is not
defined).

Currently when a new master is detected and no credential is provided,
the agent and framework (which have very similar registration code
paths) will attempt to (re)register after some random initial `delay`,
to avoid a "thundering herd" problem.  It is hence possible to have
spurious double-(re)registrations, since a new master could be detected
after we add the `delay`d registration, but before we execute it.

In a degenerate case, suppose a single agent has a registration delay
of one minute.  A master is brought up, to which, the agent successfully
registers.  Prior to this commit, the agent will still have a scheduled
registration loop (`doReliableRegistration`) with a backoff factor.
If the master goes down and a new master is brought up, the agent
will race against itself (two ongoing loops of `doReliableRegistration`)
to register with the new master.  If the first loop reaches the new
master first, authentication will fail and cause the agent to commit
suicide.

To resolve this problem, we store the value of the `Timer` returned by
`delay` in `doReliableRegistration` and cancel it when we have either
registered, or need to start a new cycle of registration. This solution
is implemented in both the framework code and the agent code.


Diffs (updated)
-

  src/sched/sched.cpp 1eecb8b9f8d75c123d13e73d3dd25129e6cd93c4 
  src/slave/slave.hpp 03860b5d0242289034d4574bd36a85ab6fb87a79 
  src/slave/slave.cpp a7a3a394e5e4b7f40a051663cd70add3890bdf18 
  src/tests/master_tests.cpp b3b5630899082a69119d2cccb6460766d1352963 
  src/tests/reservation_tests.cpp ffbb50bdf16fdeb0ad0aa98afbe71c38c784cd71 

Diff: https://reviews.apache.org/r/54909/diff/


Testing
---

`make check` and `mesos-tests --gtest_repeat=1000 --gtest_break_on_failure` to 
catch intermittent failures, which is how we caught the failing test in 
`reservation_tests.cpp`. Note that this bug was discovered when we added a 
`delay` to the call to `authenticate` in `slave::detected` (in order to get it 
to match the behavior of the non-authenticated call to `doReliableRegistration`.


Thanks,

Alex Clemmer



Review Request 54926: Augmented a fault_tolerance_test to cover update of role.

2016-12-20 Thread Jay Guo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54926/
---

Review request for mesos, Benjamin Mahler and Guangya Liu.


Repository: mesos


Description
---

In currently implementation, update of role during framework
failover is ignored. This behavior should be reflected in test.


Diffs
-

  src/tests/fault_tolerance_tests.cpp 05937a917a2c175aa53b52488febb7cfd8400a13 

Diff: https://reviews.apache.org/r/54926/diff/


Testing
---

make check


Thanks,

Jay Guo



Re: Review Request 54693: Add ProtoBuf schema for Blkio cgroup subsystem

2016-12-20 Thread Jason Lai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54693/
---

(Updated Dec. 21, 2016, 2:19 a.m.)


Review request for mesos, Xiaojian Huang, Gilbert Song, haosdent huang, Jie Yu, 
Kunal Thakar, and Zhitao Li.


Bugs: MESOS-6162
https://issues.apache.org/jira/browse/MESOS-6162


Repository: mesos


Description
---

Add ProtoBuf schema for Blkio cgroups per the kernel Blkio controller doc.


Diffs
-

  include/mesos/mesos.proto 0017d3d9d46433b391703025f611ce437bbc7ebe 

Diff: https://reviews.apache.org/r/54693/diff/


Testing
---


Thanks,

Jason Lai



Re: Review Request 54693: Add ProtoBuf schema for Blkio cgroup subsystem

2016-12-20 Thread Jason Lai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54693/
---

(Updated Dec. 21, 2016, 2:16 a.m.)


Review request for .


Bugs: MESOS-6162
https://issues.apache.org/jira/browse/MESOS-6162


Repository: mesos


Description
---

Add ProtoBuf schema for Blkio cgroups per the kernel Blkio controller doc.


Diffs (updated)
-

  include/mesos/mesos.proto 0017d3d9d46433b391703025f611ce437bbc7ebe 

Diff: https://reviews.apache.org/r/54693/diff/


Testing
---


Thanks,

Jason Lai



Re: Review Request 54693: Add ProtoBuf schema for Blkio cgroup subsystem

2016-12-20 Thread Jason Lai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54693/
---

(Updated Dec. 21, 2016, 2:16 a.m.)


Review request for .


Bugs: MESOS-6162
https://issues.apache.org/jira/browse/MESOS-6162


Repository: mesos


Description
---

Add ProtoBuf schema for Blkio cgroups per the kernel Blkio controller doc.


Diffs (updated)
-

  include/mesos/mesos.proto 0017d3d9d46433b391703025f611ce437bbc7ebe 

Diff: https://reviews.apache.org/r/54693/diff/


Testing
---


Thanks,

Jason Lai



Re: Review Request 54693: Add ProtoBuf schema for Blkio cgroup subsystem

2016-12-20 Thread Jason Lai


> On Dec. 13, 2016, 7:29 a.m., haosdent huang wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/subsystems/blkio.hpp, lines 
> > 54-57
> > 
> >
> > Let's only add defintions for proto in this file.

Sure. I was initially hoping to add code for implementations, but seems like it 
will make the scope of this diff much bigger. I'll revert these 2 files in the 
diff.


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54693/#review158963
---


On Dec. 13, 2016, 5:12 a.m., Jason Lai wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54693/
> ---
> 
> (Updated Dec. 13, 2016, 5:12 a.m.)
> 
> 
> Review request for mesos, Xiaojian Huang, Gilbert Song, haosdent huang, Jie 
> Yu, Kunal Thakar, and Zhitao Li.
> 
> 
> Bugs: MESOS-6162
> https://issues.apache.org/jira/browse/MESOS-6162
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Add ProtoBuf schema for Blkio cgroups per the kernel Blkio controller doc.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 0017d3d9d46433b391703025f611ce437bbc7ebe 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/blkio.hpp 
> PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/blkio.cpp 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/54693/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jason Lai
> 
>



Re: Review Request 54921: Added blog post for the 0.28.3 release.

2016-12-20 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54921/#review159780
---


Ship it!




Ship It!

- Joseph Wu


On Dec. 20, 2016, 3:33 p.m., Anand Mazumdar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54921/
> ---
> 
> (Updated Dec. 20, 2016, 3:33 p.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added blog post for the 0.28.3 release.
> 
> 
> Diffs
> -
> 
>   site/source/blog/2016-12-20-mesos-0-28-3-released.md PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/54921/diff/
> 
> 
> Testing
> ---
> 
> Viewed on the website docker container. Renders fine.
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>



Re: Review Request 54920: Updated releases.yml for Mesos 0.28.3.

2016-12-20 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54920/#review159779
---


Ship it!




Ship It!

- Joseph Wu


On Dec. 20, 2016, 3:33 p.m., Anand Mazumdar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54920/
> ---
> 
> (Updated Dec. 20, 2016, 3:33 p.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Updated releases.yml for Mesos 0.28.3.
> 
> 
> Diffs
> -
> 
>   site/data/releases.yml 755300aa83a1017362129bad256381d10c815609 
> 
> Diff: https://reviews.apache.org/r/54920/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>



Re: Review Request 54919: Fixed target version for the 1.1.0 release.

2016-12-20 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54919/#review159778
---


Ship it!




Ship It!

- Joseph Wu


On Dec. 20, 2016, 3:34 p.m., Anand Mazumdar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54919/
> ---
> 
> (Updated Dec. 20, 2016, 3:34 p.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed target version for the 1.1.0 release.
> 
> 
> Diffs
> -
> 
>   site/data/releases.yml 755300aa83a1017362129bad256381d10c815609 
> 
> Diff: https://reviews.apache.org/r/54919/diff/
> 
> 
> Testing
> ---
> 
> Not a functional change.
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>



Review Request 54920: Updated releases.yml for Mesos 0.28.3.

2016-12-20 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54920/
---

Review request for mesos and Joseph Wu.


Repository: mesos


Description
---

Updated releases.yml for Mesos 0.28.3.


Diffs
-

  site/data/releases.yml 755300aa83a1017362129bad256381d10c815609 

Diff: https://reviews.apache.org/r/54920/diff/


Testing
---


Thanks,

Anand Mazumdar



Review Request 54919: Fixed target version for the 1.1.0 release.

2016-12-20 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54919/
---

Review request for mesos and Joseph Wu.


Repository: mesos


Description
---

Fixed target version for the 1.1.0 release.


Diffs
-

  site/data/releases.yml 755300aa83a1017362129bad256381d10c815609 

Diff: https://reviews.apache.org/r/54919/diff/


Testing
---

Not a functional change.


Thanks,

Anand Mazumdar



Review Request 54921: Added blog post for the 0.28.3 release.

2016-12-20 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54921/
---

Review request for mesos and Joseph Wu.


Repository: mesos


Description
---

Added blog post for the 0.28.3 release.


Diffs
-

  site/source/blog/2016-12-20-mesos-0-28-3-released.md PRE-CREATION 

Diff: https://reviews.apache.org/r/54921/diff/


Testing
---

Viewed on the website docker container. Renders fine.


Thanks,

Anand Mazumdar



Re: Review Request 51028: Fix broken tests post batched event allocation.

2016-12-20 Thread Jacob Janco


> On Dec. 13, 2016, 4:30 p.m., Jiang Yan Xu wrote:
> > src/tests/hierarchical_allocator_tests.cpp, lines 3621-3629
> > 
> >
> > I think we are still intereted in the number of allocations and we 
> > expect to see them decrease because of batching. We don't use EXPECT on 
> > them but we print them out as part of the benchmark?
> 
> Jacob Janco wrote:
> Added this in. Ex test output: 
> 
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5
> Using 1000 agents and 1000 frameworks
> Added 1000 frameworks in 18064us
> Added 1000 agents in 5.229825secs performing 1000 allocations
> Updated 1000 agents in 5.39987secs performing 1000 allocations
> [   OK ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5
>  (10708 ms)

A better snippet showing trend: 

[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0
Using 1000 agents and 1 frameworks
Added 1 frameworks in 866us
Added 1000 agents in 616149us performing 5 allocations
Updated 1000 agents in 351825us performing 3 allocations
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/0 
(1014 ms)
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1
Using 1000 agents and 50 frameworks
Added 50 frameworks in 952us
Added 1000 agents in 744594us performing 142 allocations
Updated 1000 agents in 508415us performing 111 allocations
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/1 
(1310 ms)
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/2
Using 1000 agents and 100 frameworks
Added 100 frameworks in 1877us
Added 1000 agents in 1.088271secs performing 179 allocations
Updated 1000 agents in 867190us performing 135 allocations
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/2 
(2006 ms)
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/3
Using 1000 agents and 200 frameworks
Added 200 frameworks in 3357us
Added 1000 agents in 1.516704secs performing 294 allocations
Updated 1000 agents in 1.319724secs performing 215 allocations
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/3 
(2894 ms)
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/4
Using 1000 agents and 500 frameworks
Added 500 frameworks in 8704us
Added 1000 agents in 2.890338secs performing 685 allocations
Updated 1000 agents in 2.783646secs performing 513 allocations
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/4 
(5734 ms)
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5
Using 1000 agents and 1000 frameworks
Added 1000 frameworks in 16440us
Added 1000 agents in 5.211817secs performing 1000 allocations
Updated 1000 agents in 5.439422secs performing 1000 allocations
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5 
(10726 ms)
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/6
Using 1000 agents and 3000 frameworks
Added 3000 frameworks in 46501us
Added 1000 agents in 14.495025secs performing 1000 allocations
Updated 1000 agents in 14.99748secs performing 1000 allocations
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/6 
(29611 ms)
[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/7
Using 1000 agents and 6000 frameworks
Added 6000 frameworks in 83773us
Added 1000 agents in 28.992552secs performing 1000 allocations
Updated 1000 agents in 29.068944secs performing 1000 allocations
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/7 
(58237 ms)


- Jacob


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51028/#review158967
---


On Dec. 20, 2016, 12:05 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51028/
> ---
> 
> (Updated Dec. 20, 2016, 12:05 p.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Per MESOS-3157, if cluster events with the possibility
>   of triggering an allocation occur rapidly and test
>   assertions depend on gleaning information from assumed
>   

Re: Review Request 54909: Added member to agent to avoid spurious re-registrations.

2016-12-20 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54909/
---

(Updated Dec. 20, 2016, 3:18 p.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Changes
---

Discussed this case with Alex and fleshed out the commit message.  There is a 
potential failure case that this timer + cancel logic will prevent -- Joseph Wu.


Bugs: MESOS-6803
https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description (updated)
---

Currently when a new master is detected and no credential is provided,
the agent will attempt to (re)register after some random initial
`delay`, to avoid a "thundering herd" problem.  It is hence possible
to have spurious double-(re)registrations, since a new master could 
be detected after we add the `delay`d registration, but before we 
execute it.

In a degenerate case, suppose a single agent has a registration delay
of one minute.  A master is brought up, to which, the agent successfully
registers.  Prior to this commit, the agent will still have a scheduled
registration loop (`doReliableRegistration`) with a backoff factor.
If the master goes down and a new master is brought up, the agent 
will race against itself (two ongoing loops of `doReliableRegistration`)
to register with the new master.  If the first loop reaches the new 
master first, authentication will fail and cause the agent to commit
suicide.

To resolve this problem, we store the value of the `Timer` returned by
`delay` in `doReliableRegistration` and cancel it when we have either
registered, or need to start a new cycle of registration.


Diffs
-

  src/slave/slave.hpp 03860b5d0242289034d4574bd36a85ab6fb87a79 
  src/slave/slave.cpp a7a3a394e5e4b7f40a051663cd70add3890bdf18 
  src/tests/reservation_tests.cpp ffbb50bdf16fdeb0ad0aa98afbe71c38c784cd71 

Diff: https://reviews.apache.org/r/54909/diff/


Testing
---

`make check` and `mesos-tests --gtest_repeat=1000 --gtest_break_on_failure` to 
catch intermittent failures, which is how we caught the failing test in 
`reservation_tests.cpp`. Note that this bug was discovered when we added a 
`delay` to the call to `authenticate` in `slave::detected` (in order to get it 
to match the behavior of the non-authenticated call to `doReliableRegistration`.


Thanks,

Alex Clemmer



Re: Review Request 54913: Added some missing log statements for scheduler calls.

2016-12-20 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54913/
---

(Updated Dec. 20, 2016, 11:09 p.m.)


Review request for mesos and Joris Van Remoortere.


Changes
---

Review comments, NNFR


Repository: mesos


Description (updated)
---

Some Scheduler API calls ('KILL'/'SHUTDOWN') were missing info
level logging statements. In the case of `KILL', the logging
statement was there but was invoked in `killTask()` but was
not consistent with how we were logging for the other calls
in the continuation helper.


Diffs (updated)
-

  src/master/master.cpp b664550d57ef9805bd23ea35ca7f9cd8f4b1ab78 

Diff: https://reviews.apache.org/r/54913/diff/


Testing
---

make check


Thanks,

Anand Mazumdar



Re: Review Request 54763: Replaced `::pipe` with `os::pipe` in libprocess.

2016-12-20 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54763/
---

(Updated Dec. 20, 2016, 2:36 p.m.)


Review request for mesos.


Repository: mesos


Description
---

Replaced `::pipe` with `os::pipe` in libprocess.


Diffs
-

  3rdparty/libprocess/include/process/posix/subprocess.hpp 
a1054e788ef6a322901c262380aceab8192235ac 

Diff: https://reviews.apache.org/r/54763/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 54880: Added unit-test for dynamic addition/deletion of CNI config.

2016-12-20 Thread Avinash sridharan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54880/
---

(Updated Dec. 20, 2016, 10:36 p.m.)


Review request for mesos, Jie Yu and Qian Zhang.


Changes
---

Fixed some typos in comments.


Bugs: MESOS-6567
https://issues.apache.org/jira/browse/MESOS-6567


Repository: mesos


Description
---

Added unit-test for dynamic addition/deletion of CNI config.


Diffs (updated)
-

  src/tests/containerizer/cni_isolator_tests.cpp 
0380f2ca122c7b8ab1dac351b14f546526580e72 

Diff: https://reviews.apache.org/r/54880/diff/


Testing
---

sudo ./bin/mesos-tests.sh --gtest_filter=*Cni*


Thanks,

Avinash sridharan



Re: Review Request 54593: Replaced `::lseek` with `os::lseek` in stout.

2016-12-20 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54593/
---

(Updated Dec. 20, 2016, 2:36 p.m.)


Review request for mesos and Daniel Pravat.


Repository: mesos


Description
---

Replaced `::lseek` with `os::lseek` in stout.


Diffs
-

  3rdparty/stout/include/stout/protobuf.hpp 
80cb20f40a7ddd4309d27973eef9fca9e4052b64 

Diff: https://reviews.apache.org/r/54593/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 54596: Replaced `::dup` with `os::dup` in libprocess.

2016-12-20 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54596/
---

(Updated Dec. 20, 2016, 2:35 p.m.)


Review request for mesos and Daniel Pravat.


Repository: mesos


Description
---

Replaced `::dup` with `os::dup` in libprocess.


Diffs
-

  3rdparty/libprocess/src/io.cpp 27da897894e12941a6bba5f5eda04c35100d2d73 
  3rdparty/libprocess/src/libevent_ssl_socket.cpp 
0e292a8b0d470f4e199db08f09a0c863d73c 
  3rdparty/libprocess/src/tests/ssl_tests.cpp 
a32d20e47f67d88bbe5928e0ddc129745c5f42e0 

Diff: https://reviews.apache.org/r/54596/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 54603: Replaced `int` with `int_fd` in mesos.

2016-12-20 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54603/
---

(Updated Dec. 20, 2016, 2:34 p.m.)


Review request for mesos, Daniel Pravat and Joris Van Remoortere.


Repository: mesos


Description
---

Replaced `int` with `int_fd` in mesos.


Diffs (updated)
-

  include/mesos/slave/container_logger.hpp 
5781fecd08a2c92c3bc936598ca707a9a15f0e23 
  src/docker/docker.cpp b2a547d9b1c5ac3922ed21bceb5c9b60fa9f6568 
  src/files/files.cpp 9c634ac928887b3f1a111f67ebb3fc5229c6fa16 
  src/slave/containerizer/fetcher.cpp 94cfa0e33adc622d77a75e256fb52fc75d3db18a 
  src/slave/containerizer/mesos/containerizer.hpp 
9fcad933e1ec3b52d4dc366ba80d282deea04c0b 
  src/slave/containerizer/mesos/containerizer.cpp 
d9d5619e45ae1199fc91878f17a33b5647f48305 
  src/slave/containerizer/mesos/io/switchboard.cpp 
c80627986f729255e3b3ad1fc9cfa6213e19c521 
  src/slave/containerizer/mesos/io/switchboard_main.cpp 
87056c3f76525151834b4ff782e2009050941ccb 
  src/slave/containerizer/mesos/launch.hpp 
5bba13998e879afb490c44c427dc6844c1c3c38d 
  src/slave/containerizer/mesos/launch.cpp 
e482ab8bdfc358f695b87cda72ca59fb64cd8c4d 
  src/slave/state.cpp a94bb8d7029295abef70d6595ebc732ac1ab87a8 
  src/slave/status_update_manager.hpp 9fad0c51ec0fc00e134265850ad2236a585e46ec 
  src/slave/status_update_manager.cpp 056a684b52756d5c6309e7e2167a1532c4e60957 
  src/tests/containerizer/cgroups_tests.cpp 
0afaec6ae948cabf1472bf01103210d8f9809cb1 
  src/tests/containerizer/io_switchboard_tests.cpp 
e32df64c73df7f99f4c1a940db8807109be94696 
  src/tests/containerizer/memory_test_helper.cpp 
b94f0aac331da2e4f586245796176893195ad137 
  src/tests/containerizer/nested_mesos_containerizer_tests.cpp 
9a6bcb54f77f8343786f42e3cb511f2e95aaff5e 
  src/tests/credentials_tests.cpp ed50609cc5ed1937d2bcd782451245e0b5889ed5 
  src/tests/mesos.cpp adbb697c2bb908a584163dc9954b9e3138164433 
  src/tests/protobuf_io_tests.cpp 9263d8c2763cb30c93fd94be5344f83a6ce3e3b1 

Diff: https://reviews.apache.org/r/54603/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 54602: Replaced `int` with `int_fd` in libprocess.

2016-12-20 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54602/
---

(Updated Dec. 20, 2016, 2:33 p.m.)


Review request for mesos, Daniel Pravat and Joris Van Remoortere.


Repository: mesos


Description
---

Replaced `int` with `int_fd` in libprocess.


Diffs (updated)
-

  3rdparty/libprocess/include/process/io.hpp 
f5489dc66adb136d9f53f98ac64c3fbe7831a1c2 
  3rdparty/libprocess/include/process/network.hpp 
8234765e23bb3d434da0b0f818fd42569d554ab8 
  3rdparty/libprocess/include/process/posix/subprocess.hpp 
a1054e788ef6a322901c262380aceab8192235ac 
  3rdparty/libprocess/include/process/socket.hpp 
a7bbeed995106fa33bb488facfbf69fcc3759efe 
  3rdparty/libprocess/include/process/subprocess_base.hpp 
74a4bef0706334b4d3c1040a08a8921fbc300344 
  3rdparty/libprocess/include/process/windows/subprocess.hpp 
3bc7f1992d9c38dac2ec23d5bc57415f37d0318a 
  3rdparty/libprocess/src/encoder.hpp 1447d6f93c15b9f3d134507ecb0bda020a218a49 
  3rdparty/libprocess/src/http.cpp 97d1424be20e217401519c2bee79857bcf087023 
  3rdparty/libprocess/src/io.cpp 27da897894e12941a6bba5f5eda04c35100d2d73 
  3rdparty/libprocess/src/libev_poll.cpp 
da2a78bd8aa4ed8c37bd2ca16d148b73112aa0e7 
  3rdparty/libprocess/src/libevent_poll.cpp 
0803ba33622c86df38b3efd4f1e3197edf93a0af 
  3rdparty/libprocess/src/libevent_ssl_socket.hpp 
57eaf4f607d0628db466cc1a139772eeeaa51136 
  3rdparty/libprocess/src/libevent_ssl_socket.cpp 
0e292a8b0d470f4e199db08f09a0c863d73c 
  3rdparty/libprocess/src/poll_socket.hpp 
89789e6bb91d79e4de1c4f4be3882df851845930 
  3rdparty/libprocess/src/poll_socket.cpp 
93ca37f105527fb225574107480114ee5ac74c76 
  3rdparty/libprocess/src/process.cpp 889a03444eaee7b5ad2be65bb414c30062d4a4f0 
  3rdparty/libprocess/src/socket.cpp 2eaa5fcdfed698bf1a71b4fe399baa3623c921ab 
  3rdparty/libprocess/src/subprocess.cpp 
ad19b0896b4a2e9c60f573cc854c10c69e909e86 
  3rdparty/libprocess/src/subprocess_posix.cpp 
19271414f145d23f50ac07570c48782819f382b4 
  3rdparty/libprocess/src/subprocess_windows.cpp 
20cad52d4a4d7fc51487e150a849972eb19ed08e 
  3rdparty/libprocess/src/tests/io_tests.cpp 
b9825e8633f64c23e4b1ea904537cdc8da64ed5b 
  3rdparty/libprocess/src/tests/ssl_tests.cpp 
a32d20e47f67d88bbe5928e0ddc129745c5f42e0 
  3rdparty/libprocess/src/tests/subprocess_tests.cpp 
59c17692012ddfb540ecdd48560c73c42a15f061 

Diff: https://reviews.apache.org/r/54602/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 54803: Fixed `SlaveTests` to pass when `HAS_AUTHENTICATION` is undefined.

2016-12-20 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54803/
---

(Updated Dec. 20, 2016, 10:32 p.m.)


Review request for mesos, Adam B, Andrew Schwartzmeyer, Daniel Pravat, Greg 
Mann, John Kordich, Joseph Wu, and Vinod Kone.


Changes
---

Add comment explaining our use of `Clock::settle`.


Bugs: MESOS-6803
https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description
---

Currently, when `HAS_AUTHENTICATION` is undefined, the Agent will
use `delay` to schedule a random time in the future to register with the
Master, to avoid the thundering herd problem after a Master failover.
The authentication codepath, in contrast, schedules the registration
immediately.

In tests where we have `Clock::pause`'d when we are supposed to be
registering the slave, the authention codepath will succeeed, while
no-authentication codepath will hang forever.

A much more detailed analysis of this situation exists in MESOS-6803.

This commit will resolve this issue for `slave_tests.cpp` by changing
the tests to not use `Clock::pause` when we are waiting for Agent
registration.


Diffs (updated)
-

  src/tests/slave_tests.cpp d956a326ef29bf29837e0587a14bae457147cbca 

Diff: https://reviews.apache.org/r/54803/diff/


Testing
---

Added `delay` to the call to `authenticate` in `Slave::detected`, ran tests to 
find failing tests in `SlaveTest.*`, then fixed, then ran again.


Thanks,

Alex Clemmer



Re: Review Request 54601: Replaced `int` with `int_fd` in stout.

2016-12-20 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54601/
---

(Updated Dec. 20, 2016, 2:32 p.m.)


Review request for mesos, Daniel Pravat and Joris Van Remoortere.


Repository: mesos


Description
---

Replaced `int` with `int_fd` in stout.


Diffs (updated)
-

  3rdparty/stout/include/stout/net.hpp 4803aeb2fc1c89d42a02c631ef0450d5053ea8a2 
  3rdparty/stout/include/stout/os/mktemp.hpp 
2dfd35605eb4202a5475fe0e0d2f1fd27690a2de 
  3rdparty/stout/include/stout/os/open.hpp 
2a357926860b1523c51f12c7edee2babe6afbfa3 
  3rdparty/stout/include/stout/os/posix/socket.hpp 
836e4b3c37435a7f952321a13f3193043c1055d9 
  3rdparty/stout/include/stout/os/read.hpp 
d0b657db5b7dc0c1e11edc13fd6830a9f6273867 
  3rdparty/stout/include/stout/os/socket.hpp 
e9d9306e63aff2be083b4254fbf6f31c37edc424 
  3rdparty/stout/include/stout/os/sunos.hpp 
e0398d25c0a5d0b55934594dd59b2629957ab921 
  3rdparty/stout/include/stout/os/touch.hpp 
84d49bb8adc2612e380f037fd42c47166d55f593 
  3rdparty/stout/include/stout/os/windows/close.hpp 
9f1566bff19ee872418bce8a43a119c4f0a70a7c 
  3rdparty/stout/include/stout/os/windows/fcntl.hpp 
f331cd324bc609f5b8081fa86d66d9947daf04f6 
  3rdparty/stout/include/stout/os/windows/fsync.hpp 
e1c4868b02b320906f446af1d9371374e90651bc 
  3rdparty/stout/include/stout/os/windows/ftruncate.hpp 
a7f53ad2e8735b515590af84c0efce3edcc1bebf 
  3rdparty/stout/include/stout/os/windows/read.hpp 
09190f97f33db5dfa023f937f8f2a4a62ed1ec15 
  3rdparty/stout/include/stout/os/windows/sendfile.hpp 
d6358cc02c1eea9298907da1f74eb7eeaeec7d21 
  3rdparty/stout/include/stout/os/windows/shell.hpp 
17e3d564564abebf1d558b7a7a277aef3c87e5ae 
  3rdparty/stout/include/stout/os/windows/socket.hpp 
c787341181da53abc5a3b2eadc76c85fef181310 
  3rdparty/stout/include/stout/os/windows/write.hpp 
23488708ae366b8571bb8b4805f67d2054223fff 
  3rdparty/stout/include/stout/os/write.hpp 
9bb9fbd4593866b6193a1e253e65852da0ff5ed5 
  3rdparty/stout/include/stout/protobuf.hpp 
80cb20f40a7ddd4309d27973eef9fca9e4052b64 
  3rdparty/stout/include/stout/windows.hpp 
e641c46d033372e1b6c9f9c066b1ad4957d55088 
  3rdparty/stout/include/stout/windows/os.hpp 
5cd92545a49648e39e8eb7cf131895e9cfc97902 
  3rdparty/stout/tests/os/filesystem_tests.cpp 
22460842c0db5dc5b6effbc2bdfce043ed47db6d 
  3rdparty/stout/tests/os/sendfile_tests.cpp 
17e10d92a0a004cb7ac83f4ff9c71844418a35b0 
  3rdparty/stout/tests/os_tests.cpp 8d2005b1f109b4025aa8368600763db9c829d0c5 
  3rdparty/stout/tests/path_tests.cpp 8275e38a503e64d242da6be0fe6bd0a0c21869a3 

Diff: https://reviews.apache.org/r/54601/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 54762: Introduced an `os::pipe` abstraction to stout.

2016-12-20 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54762/
---

(Updated Dec. 20, 2016, 2:31 p.m.)


Review request for mesos.


Summary (updated)
-

Introduced an `os::pipe` abstraction to stout.


Repository: mesos


Description (updated)
---

Introduced an `os::pipe` abstraction to stout.


Diffs (updated)
-

  3rdparty/stout/include/Makefile.am 53d04a9b6c4a0d8b35d3c84ef24d619fdb8a2c82 
  3rdparty/stout/include/stout/os.hpp 63265eed28d7345dfa18ac1c5a22ef58a7902c62 
  3rdparty/stout/include/stout/os/pipe.hpp PRE-CREATION 
  3rdparty/stout/include/stout/os/posix/pipe.hpp PRE-CREATION 
  3rdparty/stout/include/stout/os/windows/pipe.hpp PRE-CREATION 
  3rdparty/stout/include/stout/posix/os.hpp 
397c2d6b06ddb607d254eae8258da5151c5f57e0 
  3rdparty/stout/include/stout/windows/os.hpp 
5cd92545a49648e39e8eb7cf131895e9cfc97902 

Diff: https://reviews.apache.org/r/54762/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 54595: Introduced an `os::dup` abstraction in stout.

2016-12-20 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54595/
---

(Updated Dec. 20, 2016, 2:30 p.m.)


Review request for mesos and Daniel Pravat.


Repository: mesos


Description
---

Introduced an `os::dup` abstraction in stout.


Diffs (updated)
-

  3rdparty/stout/include/Makefile.am 53d04a9b6c4a0d8b35d3c84ef24d619fdb8a2c82 
  3rdparty/stout/include/stout/os.hpp 63265eed28d7345dfa18ac1c5a22ef58a7902c62 
  3rdparty/stout/include/stout/os/dup.hpp PRE-CREATION 
  3rdparty/stout/include/stout/os/posix/dup.hpp PRE-CREATION 
  3rdparty/stout/include/stout/os/windows/dup.hpp PRE-CREATION 

Diff: https://reviews.apache.org/r/54595/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 54592: Introduced an `os::lseek` abstraction in stout.

2016-12-20 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54592/
---

(Updated Dec. 20, 2016, 2:31 p.m.)


Review request for mesos and Daniel Pravat.


Repository: mesos


Description
---

Introduced an `os::lseek` abstraction in stout.


Diffs (updated)
-

  3rdparty/stout/include/Makefile.am 53d04a9b6c4a0d8b35d3c84ef24d619fdb8a2c82 
  3rdparty/stout/include/stout/os.hpp 63265eed28d7345dfa18ac1c5a22ef58a7902c62 
  3rdparty/stout/include/stout/os/lseek.hpp PRE-CREATION 

Diff: https://reviews.apache.org/r/54592/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 54591: Introduced `WindowsFD` class which is analogous to an `int` in POSIX.

2016-12-20 Thread Michael Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54591/
---

(Updated Dec. 20, 2016, 2:29 p.m.)


Review request for mesos and Daniel Pravat.


Changes
---

- Fixed formatting
- Added comparison operators to allow closer treatment to `int` (as discussed 
with BenH)


Repository: mesos


Description
---

In POSIX the socket, pipe and a file are represented by the `int` type.
In Windows:
  - A socket is kept in a `SOCKET` type (64 bit wide)
  - A pipe or a WinAPI file descriptor in a `HANDLE` (64 bit wide)
  - A CRT file descriptor in an `int`

The `WindowsFD` class is a type that brings all of these things
together and behaves analogously to an `int` in POSIX.


Diffs (updated)
-

  3rdparty/stout/include/stout/os.hpp 63265eed28d7345dfa18ac1c5a22ef58a7902c62 
  3rdparty/stout/include/stout/os/int_fd.hpp PRE-CREATION 
  3rdparty/stout/include/stout/os/windows/fd.hpp PRE-CREATION 

Diff: https://reviews.apache.org/r/54591/diff/


Testing
---


Thanks,

Michael Park



Re: Review Request 54889: Handled all possible offers in test.

2016-12-20 Thread Benjamin Bannier


> On Dec. 20, 2016, 5:41 p.m., Neil Conway wrote:
> > Can we add an expectation for exactly when we expect to receive the offer? 
> > i.e., another `WillOnce(FutureSatisfy(...))` and then wait for the future 
> > at the appropriate time.
> > 
> > The thing I don't like about the `WillRepeatedly(...)` pattern is that it 
> > obscures whether additional offers are actually expected or not. Lots of 
> > tests have copied this pattern and use it, even when another offer is not 
> > actually expected.

This test only uses a single offer, and we do not know when or even if a second 
offer would arrive since the test does not run with a paused clock, 
https://github.com/apache/mesos/blob/1e72605e9892eb4e518442ab9c1fe2a1a1696748/src/tests/fault_tolerance_tests.cpp#L853.
 If execution on the thread running in the test body is delayed sufficiently 
after the clock is resumed, additional offers might be made concurrently with 
that part of the test body.

It looks to me that as a general rule of thumb, if tests do not pause the 
clock, one should always ignore subsequent calls to mock interfaces in order to 
avoid this gmock assertion.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54889/#review159727
---


On Dec. 20, 2016, 11:15 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54889/
> ---
> 
> (Updated Dec. 20, 2016, 11:15 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Neil Conway.
> 
> 
> Bugs: MESOS-6820
> https://issues.apache.org/jira/browse/MESOS-6820
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test FaultToleranceTest.FrameworkReregister observes a single
> offer in order to monitor progress and as a trigger to initiate other
> actions later in the test. The change installs expectations for
> possible additional offers. This allows for the thread running the
> test body to be delayed with respect to the master thread which could
> in the meantime make additional offers.
> 
> 
> Diffs
> -
> 
>   src/tests/fault_tolerance_tests.cpp 
> 05937a917a2c175aa53b52488febb7cfd8400a13 
> 
> Diff: https://reviews.apache.org/r/54889/diff/
> 
> 
> Testing
> ---
> 
> `make check` (OS X, clang-trunk, w/ optimizations, SSL)
> 
> Before this fix `FaultToleranceTest.FrameworkReregister` failed for me 
> multiple times in ~10k iterations; after this patch it didn't fail in a 
> single run I stopped after 170k iterations.
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 54889: Handled all possible offers in test.

2016-12-20 Thread Benjamin Bannier


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54889/#review159727
---


On Dec. 20, 2016, 11:15 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54889/
> ---
> 
> (Updated Dec. 20, 2016, 11:15 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Neil Conway.
> 
> 
> Bugs: MESOS-6820
> https://issues.apache.org/jira/browse/MESOS-6820
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test FaultToleranceTest.FrameworkReregister observes a single
> offer in order to monitor progress and as a trigger to initiate other
> actions later in the test. The change installs expectations for
> possible additional offers. This allows for the thread running the
> test body to be delayed with respect to the master thread which could
> in the meantime make additional offers.
> 
> 
> Diffs
> -
> 
>   src/tests/fault_tolerance_tests.cpp 
> 05937a917a2c175aa53b52488febb7cfd8400a13 
> 
> Diff: https://reviews.apache.org/r/54889/diff/
> 
> 
> Testing
> ---
> 
> `make check` (OS X, clang-trunk, w/ optimizations, SSL)
> 
> Before this fix `FaultToleranceTest.FrameworkReregister` failed for me 
> multiple times in ~10k iterations; after this patch it didn't fail in a 
> single run I stopped after 170k iterations.
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 54889: Handled all possible offers in test.

2016-12-20 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54889/
---

(Updated Dec. 20, 2016, 11:15 p.m.)


Review request for mesos, Alexander Rukletsov and Neil Conway.


Bugs: MESOS-6820
https://issues.apache.org/jira/browse/MESOS-6820


Repository: mesos


Description (updated)
---

The test FaultToleranceTest.FrameworkReregister observes a single
offer in order to monitor progress and as a trigger to initiate other
actions later in the test. The change installs expectations for
possible additional offers. This allows for the thread running the
test body to be delayed with respect to the master thread which could
in the meantime make additional offers.


Diffs
-

  src/tests/fault_tolerance_tests.cpp 05937a917a2c175aa53b52488febb7cfd8400a13 

Diff: https://reviews.apache.org/r/54889/diff/


Testing
---

`make check` (OS X, clang-trunk, w/ optimizations, SSL)

Before this fix `FaultToleranceTest.FrameworkReregister` failed for me multiple 
times in ~10k iterations; after this patch it didn't fail in a single run I 
stopped after 170k iterations.


Thanks,

Benjamin Bannier



Re: Review Request 54910: Fixed more tests to pass when `HAS_AUTHENTICATION` is undefined.

2016-12-20 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54910/
---

(Updated Dec. 20, 2016, 10:09 p.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, Greg Mann, and 
Joseph Wu.


Changes
---

Rearranging things, small changes.


Bugs: MESOS-6803
https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description
---

This is a follow-on to 54803.

Currently, when `HAS_AUTHENTICATION` is undefined, the Agent will
use `delay` to schedule a random time in the future to register with the
Master, to avoid the thundering herd problem after a Master failover.
The authentication codepath, in contrast, schedules the registration
immediately.

In tests where we have `Clock::pause`'d when we are supposed to be
registering the slave, the authention codepath will succeeed, while
no-authentication codepath will hang forever.

A much more detailed analysis of this situation exists in MESOS-6803.

This commit will resolve this issue for a few existing test files by
changing the tests to not use `Clock::pause` when we are waiting for
Agent registration.


Diffs (updated)
-

  src/tests/fault_tolerance_tests.cpp 05937a917a2c175aa53b52488febb7cfd8400a13 
  src/tests/master_allocator_tests.cpp 0ff4221d510347f83daea8fc9cbf32eb83d19d80 
  src/tests/master_tests.cpp b3b5630899082a69119d2cccb6460766d1352963 
  src/tests/oversubscription_tests.cpp 027f5492150bfb5a39e7b97bcc41102f769e8922 
  src/tests/persistent_volume_endpoints_tests.cpp 
241970f3027b2a2017919325303eb58717bc93b0 
  src/tests/persistent_volume_tests.cpp 
b7c313781f3dd9ab4fd68ce23ccc044d50b734d8 
  src/tests/reconciliation_tests.cpp aeeb1097cb9325f9a8a0fb65b899aa2d71207cff 
  src/tests/reservation_endpoints_tests.cpp 
7d271085d822e85d27cea5361bfee0d0d91a5691 
  src/tests/reservation_tests.cpp ffbb50bdf16fdeb0ad0aa98afbe71c38c784cd71 

Diff: https://reviews.apache.org/r/54910/diff/


Testing
---

`make check` and `mesos-tests --gtest_repeat=1000 --gtest_break_on_failure` 
filtering on the tests in this review to catch intermittent failures.


Thanks,

Alex Clemmer



Re: Review Request 53716: Used `Shell::entrypoint`.

2016-12-20 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53716/#review159755
---


Fix it, then Ship it!




I can tweak this before committing.


src/docker/docker.cpp (line 790)


See: https://reviews.apache.org/r/53715/#comment230690

For some reason, this file/class is not namespaced, so defining constants 
will be awkward.  However, as there are only these two lines to change, we can 
just `#ifdef` in place.



src/docker/docker.cpp (line 805)


Ditto here.


- Joseph Wu


On Nov. 13, 2016, 9:43 p.m., Daniel Pravat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53716/
> ---
> 
> (Updated Nov. 13, 2016, 9:43 p.m.)
> 
> 
> Review request for mesos, Alex Naparu, Artem Harutyunyan, Alex Clemmer, 
> Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Used `Shell::entrypoint`.
> 
> 
> Diffs
> -
> 
>   src/docker/docker.cpp 50fda393a42afefc70790a26b44911e4cf17185e 
> 
> Diff: https://reviews.apache.org/r/53716/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Daniel Pravat
> 
>



Re: Review Request 54828: Fixed flags::fetch() to support Windows file paths.

2016-12-20 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54828/#review159757
---




3rdparty/stout/include/stout/flags/fetch.hpp (line 54)


I think this is superfluous given path is a `std::string`.


- Andrew Schwartzmeyer


On Dec. 16, 2016, 11:27 p.m., John Kordich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54828/
> ---
> 
> (Updated Dec. 16, 2016, 11:27 p.m.)
> 
> 
> Review request for mesos, Alex Clemmer and Joseph Wu.
> 
> 
> Bugs: MESOS-5937
> https://issues.apache.org/jira/browse/MESOS-5937
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed flags::fetch() to support Windows file paths.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/flags/fetch.hpp 
> 788d09e0a2083ead7d6ccb0c233cfdfd5d1ddb81 
>   3rdparty/stout/tests/flags_tests.cpp 
> e2681f8f68f6478d4c8a20c1e75ddb050d16b56d 
> 
> Diff: https://reviews.apache.org/r/54828/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> John Kordich
> 
>



Re: Review Request 54913: Added some missing log statements for scheduler calls.

2016-12-20 Thread Joris Van Remoortere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54913/#review159756
---


Fix it, then Ship it!





src/master/master.cpp (line 4697)


You can stream the `taskId` from above.
Please double check on the single quotes.



src/master/master.cpp (line 5947)


Please double check the quotes here as well.


- Joris Van Remoortere


On Dec. 20, 2016, 9:40 p.m., Anand Mazumdar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54913/
> ---
> 
> (Updated Dec. 20, 2016, 9:40 p.m.)
> 
> 
> Review request for mesos and Joris Van Remoortere.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Some Scheduler API calls ('KILL'/'SHUTDOWN') were missing info
> level logging statements. In the case of `KILL', the logging
> statement was there but was invoked in `killTask()` that is
> no longer used by the scheduler driver.
> 
> 
> Diffs
> -
> 
>   src/master/master.cpp b664550d57ef9805bd23ea35ca7f9cd8f4b1ab78 
> 
> Diff: https://reviews.apache.org/r/54913/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>



Re: Review Request 53715: Define docker `--entrypoint` for `Windows`.

2016-12-20 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53715/#review159671
---




3rdparty/stout/include/stout/os/posix/shell.hpp (lines 41 - 42)


The motivation behind this variable is to re-use this variable for Docker.  
However, it is worthwhile to note that `/bin/sh` is *not* the canonical path 
for the shell.

See: http://pubs.opengroup.org/onlinepubs/009695399/utilities/sh.html
> Applications should note that the standard PATH to the shell cannot be 
assumed to be either /bin/sh or /usr/bin/sh, and should be determined by 
interrogation of the PATH returned by getconf PATH , ensuring that the returned 
pathname is an absolute pathname and not a shell built-in.

Docker containers generally hardcode the `/bin/sh` for a variety of reasons 
(like having an inconsistent/incorrect `PATH` inside the container).  `/bin/sh` 
appears to be the canonical default for docker, but this is liable to change 
according to docker's whims.

Instead of defining this constant in Stout, we should just define it in the 
docker containerizer.  That way, in case they make a breaking API change later 
on, this won't change the existing (correct) way of shelling out.


- Joseph Wu


On Nov. 22, 2016, 10:59 p.m., Daniel Pravat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53715/
> ---
> 
> (Updated Nov. 22, 2016, 10:59 p.m.)
> 
> 
> Review request for mesos, Alex Naparu, Artem Harutyunyan, Alex Clemmer, 
> Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Define docker `--entrypoint` for `Windows`.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/os/posix/shell.hpp 
> 70a9184e85ba322f1cd4ce5d12b963cd4b3ad500 
>   3rdparty/stout/include/stout/os/windows/shell.hpp 
> 17e3d564564abebf1d558b7a7a277aef3c87e5ae 
> 
> Diff: https://reviews.apache.org/r/53715/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Daniel Pravat
> 
>



Review Request 54913: Added some missing log statements for scheduler calls.

2016-12-20 Thread Anand Mazumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54913/
---

Review request for mesos and Joris Van Remoortere.


Repository: mesos


Description
---

Some Scheduler API calls ('KILL'/'SHUTDOWN') were missing info
level logging statements. In the case of `KILL', the logging
statement was there but was invoked in `killTask()` that is
no longer used by the scheduler driver.


Diffs
-

  src/master/master.cpp b664550d57ef9805bd23ea35ca7f9cd8f4b1ab78 

Diff: https://reviews.apache.org/r/54913/diff/


Testing
---

make check


Thanks,

Anand Mazumdar



Re: Review Request 53719: Created default mount point on Windows.

2016-12-20 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53719/#review159751
---


Ship it!





src/slave/flags.cpp (lines 634 - 638)


nit: The indenting is a bit off, but I can tweak that before committing.


- Joseph Wu


On Nov. 13, 2016, 9:42 p.m., Daniel Pravat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53719/
> ---
> 
> (Updated Nov. 13, 2016, 9:42 p.m.)
> 
> 
> Review request for mesos, Alex Naparu, Artem Harutyunyan, Alex Clemmer, 
> Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Created default mount point on Windows.
> 
> 
> Diffs
> -
> 
>   src/slave/flags.cpp 76609f94482a792f4c8db480e604d98d172abb04 
> 
> Diff: https://reviews.apache.org/r/53719/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Daniel Pravat
> 
>



Re: Review Request 54877: Windows: Stout: Removed dependency on Shell API.

2016-12-20 Thread Andrew Schwartzmeyer


> On Dec. 20, 2016, 12:29 a.m., Daniel Pravat wrote:
> > 3rdparty/stout/include/stout/windows/os.hpp, line 748
> > 
> >
> > I don't think the conversion to UTF-8 is appropiate here.
> 
> Andrew Schwartzmeyer wrote:
> What would you convert it to? It's currently in UTF-16, and Windows paths 
> are allowed to have (almost) any Unicode character.
> 
> Alex Clemmer wrote:
> I think he's saying that bad things will happen if you use strings that 
> have Unicode in them, so it's probably better to just error out.
> 
> Andrew Schwartzmeyer wrote:
> I'm not following... That is a case our codebase must deal with, NTFS 
> paths are stored in Unicode:
> 
> 
> https://msdn.microsoft.com/en-us/library/windows/desktop/dd317748(v=vs.85).aspx
> 
> and on top of that, Windows file paths can:
> 
> > Use any character in the current code page for a name, including 
> Unicode characters and characters in the extended character set (128–255)
> 
> 
> https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx
> 
> So what I'm saying is that bad things _must not happen_ if we have paths 
> with non-ASCII characters in them; we need to handle that at any point we 
> deal with file paths on Windows.
> 
> I'm not saying this function is likely to return a path like `C:\ÂÑÐ¥`, 
> but that would be a valid file path, and so we must be treating all our paths 
> on Windows as Unicode.
> 
> If this is not the case, are purposefully constraining ourselves to 
> ASCII, and if so, why?
> 
> Alex Clemmer wrote:
> Sorry, I meant bad things will happen to Mesos if you give it strings 
> with unicode in them. Mesos itself does not deal with Unicode gracefully.

Oooh. We should probably fix that.


- Andrew


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54877/#review159672
---


On Dec. 19, 2016, 11:20 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54877/
> ---
> 
> (Updated Dec. 19, 2016, 11:20 p.m.)
> 
> 
> Review request for mesos, Daniel Pravat, Alex Clemmer, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The API `SHGetKnownFolderPath` requires `Shell32.dll`,
> which is not available on Nano server.
> The equivalent API `GetAllUsersProfileDirectory`
> only requires `Userenv.dll`, which is available on Nano.
> 
> This API is also friendlier, as we own the allocation.
> 
> The Unicode version `GetAllUsersProfileDirectoryW` is
> explicitly used so that we are guaranteed a Unicode path,
> which we then convert from UTF-16 to UTF-8,
> instead of using the ASCII version which depends on a
> varying Windows code-page, and is not recommended.
> 
> A `vector` is used over a `wstring` to avoid dealing
> with the placement of the null-terminating character.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/windows.hpp 
> e641c46d033372e1b6c9f9c066b1ad4957d55088 
>   3rdparty/stout/include/stout/windows/os.hpp 
> 5cd92545a49648e39e8eb7cf131895e9cfc97902 
> 
> Diff: https://reviews.apache.org/r/54877/diff/
> 
> 
> Testing
> ---
> 
> cmake && msbuild, attach agent to master and check default `runtime_dir` 
> value.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 53718: Defined and used Windows string `docker-mesos-executor.exe`.

2016-12-20 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53718/#review159747
---


Fix it, then Ship it!




I can fix this up before committing.


src/slave/containerizer/docker.cpp (line 368)


There are two other occurrences of the string that should be substituted.



src/slave/containerizer/docker.cpp (line 1377)


And here.  

Technically, argv[0] doesn't *need* to have the `.exe`.  But we might as 
well be consistent.


- Joseph Wu


On Nov. 13, 2016, 9:42 p.m., Daniel Pravat wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53718/
> ---
> 
> (Updated Nov. 13, 2016, 9:42 p.m.)
> 
> 
> Review request for mesos, Alex Naparu, Artem Harutyunyan, Alex Clemmer, 
> Joseph Wu, and Michael Park.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Defined and used  Windows string `docker-mesos-executor.exe`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/docker.cpp c2ed5240aab9ea9d1a386c44c94e5ae7e98d313c 
> 
> Diff: https://reviews.apache.org/r/53718/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Daniel Pravat
> 
>



Re: Review Request 54613: Install a symlink rather than building mesos-slave twice.

2016-12-20 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54613/#review159750
---



Patch looks great!

Reviews applied: [54613]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Dec. 20, 2016, 12:11 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54613/
> ---
> 
> (Updated Dec. 20, 2016, 12:11 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Alex Clemmer, Joseph Wu, Michael 
> Park, Vinod Kone, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6772
> https://issues.apache.org/jira/browse/MESOS-6772
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Install a symlink rather than building mesos-slave twice.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 0f62ec70816e8b48e19d35036285656a6e7cd02b 
> 
> Diff: https://reviews.apache.org/r/54613/diff/
> 
> 
> Testing
> ---
> 
> Make install:
> ```
> [jpeach@jpeach mesos.git]$ ls -l /opt/mesos/sbin/
> total 10448
> -rwxr-xr-x 1 root root 5116104 Dec  9 15:27 mesos-agent*
> -rwxr-xr-x 1 root root 5539096 Dec  9 15:27 mesos-master*
> lrwxrwxrwx 1 root root  11 Dec  9 15:27 mesos-slave -> mesos-agent*
> ```
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 54910: Fixed more tests to pass when `HAS_AUTHENTICATION` is undefined.

2016-12-20 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54910/
---

(Updated Dec. 20, 2016, 8:37 p.m.)


Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, Greg Mann, and 
Joseph Wu.


Changes
---

Updated testing description.


Bugs: MESOS-6803
https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description
---

This is a follow-on to 54803.

Currently, when `HAS_AUTHENTICATION` is undefined, the Agent will
use `delay` to schedule a random time in the future to register with the
Master, to avoid the thundering herd problem after a Master failover.
The authentication codepath, in contrast, schedules the registration
immediately.

In tests where we have `Clock::pause`'d when we are supposed to be
registering the slave, the authention codepath will succeeed, while
no-authentication codepath will hang forever.

A much more detailed analysis of this situation exists in MESOS-6803.

This commit will resolve this issue for a few existing test files by
changing the tests to not use `Clock::pause` when we are waiting for
Agent registration.


Diffs
-

  src/tests/fault_tolerance_tests.cpp 05937a917a2c175aa53b52488febb7cfd8400a13 
  src/tests/master_allocator_tests.cpp 0ff4221d510347f83daea8fc9cbf32eb83d19d80 
  src/tests/master_tests.cpp b3b5630899082a69119d2cccb6460766d1352963 
  src/tests/oversubscription_tests.cpp 027f5492150bfb5a39e7b97bcc41102f769e8922 
  src/tests/persistent_volume_endpoints_tests.cpp 
241970f3027b2a2017919325303eb58717bc93b0 
  src/tests/persistent_volume_tests.cpp 
b7c313781f3dd9ab4fd68ce23ccc044d50b734d8 
  src/tests/reconciliation_tests.cpp aeeb1097cb9325f9a8a0fb65b899aa2d71207cff 
  src/tests/reservation_endpoints_tests.cpp 
7d271085d822e85d27cea5361bfee0d0d91a5691 
  src/tests/reservation_tests.cpp ffbb50bdf16fdeb0ad0aa98afbe71c38c784cd71 

Diff: https://reviews.apache.org/r/54910/diff/


Testing (updated)
---

`make check` and `mesos-tests --gtest_repeat=1000 --gtest_break_on_failure` 
filtering on the tests in this review to catch intermittent failures.


Thanks,

Alex Clemmer



Re: Review Request 54896: Fixed copy-template-and-create-symlink make target.

2016-12-20 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54896/#review159743
---




src/Makefile.am (line 2428)


Should this be a symlink too? Making a copy creates the possibility that 
you could add environment variables in one file but not the other?


- James Peach


On Dec. 20, 2016, 1:53 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54896/
> ---
> 
> (Updated Dec. 20, 2016, 1:53 p.m.)
> 
> 
> Review request for mesos, James Peach and Till Toenshoff.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This target was `PHONY`, but not declared as such. It also explicitly
> used `ln` with a `--force` flag instead of the more generally
> applicable and suggested `$(LN_S)` we use elsewhere.
> 
> This change fixes above issues. We also use independent shell commands
> for independent commands and fix some whitespace issues.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 0f62ec70816e8b48e19d35036285656a6e7cd02b 
> 
> Diff: https://reviews.apache.org/r/54896/diff/
> 
> 
> Testing
> ---
> 
> `make distcheck` (Fedora 25)
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 54877: Windows: Stout: Removed dependency on Shell API.

2016-12-20 Thread Alex Clemmer


> On Dec. 20, 2016, 12:29 a.m., Daniel Pravat wrote:
> > 3rdparty/stout/include/stout/windows/os.hpp, line 748
> > 
> >
> > I don't think the conversion to UTF-8 is appropiate here.
> 
> Andrew Schwartzmeyer wrote:
> What would you convert it to? It's currently in UTF-16, and Windows paths 
> are allowed to have (almost) any Unicode character.
> 
> Alex Clemmer wrote:
> I think he's saying that bad things will happen if you use strings that 
> have Unicode in them, so it's probably better to just error out.
> 
> Andrew Schwartzmeyer wrote:
> I'm not following... That is a case our codebase must deal with, NTFS 
> paths are stored in Unicode:
> 
> 
> https://msdn.microsoft.com/en-us/library/windows/desktop/dd317748(v=vs.85).aspx
> 
> and on top of that, Windows file paths can:
> 
> > Use any character in the current code page for a name, including 
> Unicode characters and characters in the extended character set (128–255)
> 
> 
> https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx
> 
> So what I'm saying is that bad things _must not happen_ if we have paths 
> with non-ASCII characters in them; we need to handle that at any point we 
> deal with file paths on Windows.
> 
> I'm not saying this function is likely to return a path like `C:\ÂÑÐ¥`, 
> but that would be a valid file path, and so we must be treating all our paths 
> on Windows as Unicode.
> 
> If this is not the case, are purposefully constraining ourselves to 
> ASCII, and if so, why?

Sorry, I meant bad things will happen to Mesos if you give it strings with 
unicode in them. Mesos itself does not deal with Unicode gracefully.


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54877/#review159672
---


On Dec. 19, 2016, 11:20 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54877/
> ---
> 
> (Updated Dec. 19, 2016, 11:20 p.m.)
> 
> 
> Review request for mesos, Daniel Pravat, Alex Clemmer, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The API `SHGetKnownFolderPath` requires `Shell32.dll`,
> which is not available on Nano server.
> The equivalent API `GetAllUsersProfileDirectory`
> only requires `Userenv.dll`, which is available on Nano.
> 
> This API is also friendlier, as we own the allocation.
> 
> The Unicode version `GetAllUsersProfileDirectoryW` is
> explicitly used so that we are guaranteed a Unicode path,
> which we then convert from UTF-16 to UTF-8,
> instead of using the ASCII version which depends on a
> varying Windows code-page, and is not recommended.
> 
> A `vector` is used over a `wstring` to avoid dealing
> with the placement of the null-terminating character.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/windows.hpp 
> e641c46d033372e1b6c9f9c066b1ad4957d55088 
>   3rdparty/stout/include/stout/windows/os.hpp 
> 5cd92545a49648e39e8eb7cf131895e9cfc97902 
> 
> Diff: https://reviews.apache.org/r/54877/diff/
> 
> 
> Testing
> ---
> 
> cmake && msbuild, attach agent to master and check default `runtime_dir` 
> value.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 54880: Added unit-test for dynamic addition/deletion of CNI config.

2016-12-20 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54880/#review159739
---



Patch looks great!

Reviews applied: [54716, 54717, 54718, 54880]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Dec. 20, 2016, 3:37 a.m., Avinash sridharan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54880/
> ---
> 
> (Updated Dec. 20, 2016, 3:37 a.m.)
> 
> 
> Review request for mesos, Jie Yu and Qian Zhang.
> 
> 
> Bugs: MESOS-6567
> https://issues.apache.org/jira/browse/MESOS-6567
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added unit-test for dynamic addition/deletion of CNI config.
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/cni_isolator_tests.cpp 
> 0380f2ca122c7b8ab1dac351b14f546526580e72 
> 
> Diff: https://reviews.apache.org/r/54880/diff/
> 
> 
> Testing
> ---
> 
> sudo ./bin/mesos-tests.sh --gtest_filter=*Cni*
> 
> 
> Thanks,
> 
> Avinash sridharan
> 
>



Re: Review Request 54035: Extended test coverage of posix/rlimits isolator.

2016-12-20 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54035/#review159658
---


Fix it, then Ship it!





src/tests/containerizer/posix_rlimits_isolator_tests.cpp (line 282)


i'll stick to slave for now given all the rest of the funciton names uses 
Slave (and 'slave') still.


- Jie Yu


On Dec. 19, 2016, 11:07 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54035/
> ---
> 
> (Updated Dec. 19, 2016, 11:07 a.m.)
> 
> 
> Review request for mesos, Jie Yu and Till Toenshoff.
> 
> 
> Bugs: MESOS-6402
> https://issues.apache.org/jira/browse/MESOS-6402
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This change adds additional tests of the posix/rlimits isolator.
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/posix_rlimits_isolator_tests.cpp 
> 24b19d9eb1cc3e19f8325fbd76182837a5690581 
> 
> Diff: https://reviews.apache.org/r/54035/diff/
> 
> 
> Testing
> ---
> 
> `make check` (OS X, clang trunk w/o optimizations)
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 54877: Windows: Stout: Removed dependency on Shell API.

2016-12-20 Thread Andrew Schwartzmeyer


> On Dec. 20, 2016, 12:29 a.m., Daniel Pravat wrote:
> > 3rdparty/stout/include/stout/windows/os.hpp, line 748
> > 
> >
> > I don't think the conversion to UTF-8 is appropiate here.
> 
> Andrew Schwartzmeyer wrote:
> What would you convert it to? It's currently in UTF-16, and Windows paths 
> are allowed to have (almost) any Unicode character.
> 
> Alex Clemmer wrote:
> I think he's saying that bad things will happen if you use strings that 
> have Unicode in them, so it's probably better to just error out.

I'm not following... That is a case our codebase must deal with, NTFS paths are 
stored in Unicode:

https://msdn.microsoft.com/en-us/library/windows/desktop/dd317748(v=vs.85).aspx

and on top of that, Windows file paths can:

> Use any character in the current code page for a name, including Unicode 
> characters and characters in the extended character set (128–255)

https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx

So what I'm saying is that bad things _must not happen_ if we have paths with 
non-ASCII characters in them; we need to handle that at any point we deal with 
file paths on Windows.

I'm not saying this function is likely to return a path like `C:\ÂÑÐ¥`, but 
that would be a valid file path, and so we must be treating all our paths on 
Windows as Unicode.

If this is not the case, are purposefully constraining ourselves to ASCII, and 
if so, why?


- Andrew


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54877/#review159672
---


On Dec. 19, 2016, 11:20 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54877/
> ---
> 
> (Updated Dec. 19, 2016, 11:20 p.m.)
> 
> 
> Review request for mesos, Daniel Pravat, Alex Clemmer, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The API `SHGetKnownFolderPath` requires `Shell32.dll`,
> which is not available on Nano server.
> The equivalent API `GetAllUsersProfileDirectory`
> only requires `Userenv.dll`, which is available on Nano.
> 
> This API is also friendlier, as we own the allocation.
> 
> The Unicode version `GetAllUsersProfileDirectoryW` is
> explicitly used so that we are guaranteed a Unicode path,
> which we then convert from UTF-16 to UTF-8,
> instead of using the ASCII version which depends on a
> varying Windows code-page, and is not recommended.
> 
> A `vector` is used over a `wstring` to avoid dealing
> with the placement of the null-terminating character.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/windows.hpp 
> e641c46d033372e1b6c9f9c066b1ad4957d55088 
>   3rdparty/stout/include/stout/windows/os.hpp 
> 5cd92545a49648e39e8eb7cf131895e9cfc97902 
> 
> Diff: https://reviews.apache.org/r/54877/diff/
> 
> 
> Testing
> ---
> 
> cmake && msbuild, attach agent to master and check default `runtime_dir` 
> value.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Review Request 54910: Fixed more tests to pass when `HAS_AUTHENTICATION` is undefined.

2016-12-20 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54910/
---

Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, Greg Mann, and 
Joseph Wu.


Bugs: MESOS-6803
https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description
---

This is a follow-on to 54803.

Currently, when `HAS_AUTHENTICATION` is undefined, the Agent will
use `delay` to schedule a random time in the future to register with the
Master, to avoid the thundering herd problem after a Master failover.
The authentication codepath, in contrast, schedules the registration
immediately.

In tests where we have `Clock::pause`'d when we are supposed to be
registering the slave, the authention codepath will succeeed, while
no-authentication codepath will hang forever.

A much more detailed analysis of this situation exists in MESOS-6803.

This commit will resolve this issue for a few existing test files by
changing the tests to not use `Clock::pause` when we are waiting for
Agent registration.


Diffs
-

  src/tests/fault_tolerance_tests.cpp 05937a917a2c175aa53b52488febb7cfd8400a13 
  src/tests/master_allocator_tests.cpp 0ff4221d510347f83daea8fc9cbf32eb83d19d80 
  src/tests/master_tests.cpp b3b5630899082a69119d2cccb6460766d1352963 
  src/tests/oversubscription_tests.cpp 027f5492150bfb5a39e7b97bcc41102f769e8922 
  src/tests/persistent_volume_endpoints_tests.cpp 
241970f3027b2a2017919325303eb58717bc93b0 
  src/tests/persistent_volume_tests.cpp 
b7c313781f3dd9ab4fd68ce23ccc044d50b734d8 
  src/tests/reconciliation_tests.cpp aeeb1097cb9325f9a8a0fb65b899aa2d71207cff 
  src/tests/reservation_endpoints_tests.cpp 
7d271085d822e85d27cea5361bfee0d0d91a5691 
  src/tests/reservation_tests.cpp ffbb50bdf16fdeb0ad0aa98afbe71c38c784cd71 

Diff: https://reviews.apache.org/r/54910/diff/


Testing
---

`make check` and `mesos-tests --gtest_repeat=1000 --gtest_break_on_failure` to 
catch intermittent failures, which is how we caught the failing test in 
`reservation_tests.cpp`. Note that this bug was discovered when we added a 
`delay` to the call to `authenticate` in `slave::detected` (in order to get it 
to match the behavior of the non-authenticated call to `doReliableRegistration`.


Thanks,

Alex Clemmer



Review Request 54909: Added member to agent to avoid spurious re-registrations.

2016-12-20 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54909/
---

Review request for mesos, Andrew Schwartzmeyer, Daniel Pravat, John Kordich, 
and Joseph Wu.


Bugs: MESOS-6803
https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description
---

Currently when a new master is detected and no credential is provided,
the agent will attempt to (re-)register after some random initial
`delay`, to avoid thundering herds. It is hence possible to have
spurious double-registrations, since a new master could be detected
after we add the `delay`'d registration, but before we execute it.

To resolve this problem, we add a member, `agentRegistrationTimer` to
the agent, and call `Clock::cancel` on it when we successfully register
with the master.


Diffs
-

  src/slave/slave.hpp 03860b5d0242289034d4574bd36a85ab6fb87a79 
  src/slave/slave.cpp a7a3a394e5e4b7f40a051663cd70add3890bdf18 
  src/tests/reservation_tests.cpp ffbb50bdf16fdeb0ad0aa98afbe71c38c784cd71 

Diff: https://reviews.apache.org/r/54909/diff/


Testing
---

`make check` and `mesos-tests --gtest_repeat=1000 --gtest_break_on_failure` to 
catch intermittent failures, which is how we caught the failing test in 
`reservation_tests.cpp`. Note that this bug was discovered when we added a 
`delay` to the call to `authenticate` in `slave::detected` (in order to get it 
to match the behavior of the non-authenticated call to `doReliableRegistration`.


Thanks,

Alex Clemmer



Re: Review Request 54803: Fixed `SlaveTests` to pass when `HAS_AUTHENTICATION` is undefined.

2016-12-20 Thread Alex Clemmer


> On Dec. 20, 2016, 5:36 p.m., Greg Mann wrote:
> > LGTM, thanks Alex! Have you tested these on Windows as well? If so, it 
> > would be good to include in the "testing" section.

I have, but unfortunately, some of these tests have other issues that I need to 
deal with on Windows before they're ready to ship. I have another review in the 
chain that deals with this orthogonal concern.


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54803/#review159732
---


On Dec. 20, 2016, 10:37 a.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54803/
> ---
> 
> (Updated Dec. 20, 2016, 10:37 a.m.)
> 
> 
> Review request for mesos, Adam B, Andrew Schwartzmeyer, Daniel Pravat, Greg 
> Mann, John Kordich, Joseph Wu, and Vinod Kone.
> 
> 
> Bugs: MESOS-6803
> https://issues.apache.org/jira/browse/MESOS-6803
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently, when `HAS_AUTHENTICATION` is undefined, the Agent will
> use `delay` to schedule a random time in the future to register with the
> Master, to avoid the thundering herd problem after a Master failover.
> The authentication codepath, in contrast, schedules the registration
> immediately.
> 
> In tests where we have `Clock::pause`'d when we are supposed to be
> registering the slave, the authention codepath will succeeed, while
> no-authentication codepath will hang forever.
> 
> A much more detailed analysis of this situation exists in MESOS-6803.
> 
> This commit will resolve this issue for `slave_tests.cpp` by changing
> the tests to not use `Clock::pause` when we are waiting for Agent
> registration.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_tests.cpp d956a326ef29bf29837e0587a14bae457147cbca 
> 
> Diff: https://reviews.apache.org/r/54803/diff/
> 
> 
> Testing
> ---
> 
> Added `delay` to the call to `authenticate` in `Slave::detected`, ran tests 
> to find failing tests in `SlaveTest.*`, then fixed, then ran again.
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 54803: Fixed `SlaveTests` to pass when `HAS_AUTHENTICATION` is undefined.

2016-12-20 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54803/#review159732
---


Ship it!




LGTM, thanks Alex! Have you tested these on Windows as well? If so, it would be 
good to include in the "testing" section.

- Greg Mann


On Dec. 20, 2016, 10:37 a.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54803/
> ---
> 
> (Updated Dec. 20, 2016, 10:37 a.m.)
> 
> 
> Review request for mesos, Adam B, Andrew Schwartzmeyer, Daniel Pravat, Greg 
> Mann, John Kordich, Joseph Wu, and Vinod Kone.
> 
> 
> Bugs: MESOS-6803
> https://issues.apache.org/jira/browse/MESOS-6803
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently, when `HAS_AUTHENTICATION` is undefined, the Agent will
> use `delay` to schedule a random time in the future to register with the
> Master, to avoid the thundering herd problem after a Master failover.
> The authentication codepath, in contrast, schedules the registration
> immediately.
> 
> In tests where we have `Clock::pause`'d when we are supposed to be
> registering the slave, the authention codepath will succeeed, while
> no-authentication codepath will hang forever.
> 
> A much more detailed analysis of this situation exists in MESOS-6803.
> 
> This commit will resolve this issue for `slave_tests.cpp` by changing
> the tests to not use `Clock::pause` when we are waiting for Agent
> registration.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_tests.cpp d956a326ef29bf29837e0587a14bae457147cbca 
> 
> Diff: https://reviews.apache.org/r/54803/diff/
> 
> 
> Testing
> ---
> 
> Added `delay` to the call to `authenticate` in `Slave::detected`, ran tests 
> to find failing tests in `SlaveTest.*`, then fixed, then ran again.
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 54889: Handled all possible offers in test.

2016-12-20 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54889/#review159727
---



Can we add an expectation for exactly when we expect to receive the offer? 
i.e., another `WillOnce(FutureSatisfy(...))` and then wait for the future at 
the appropriate time.

The thing I don't like about the `WillRepeatedly(...)` pattern is that it 
obscures whether additional offers are actually expected or not. Lots of tests 
have copied this pattern and use it, even when another offer is not actually 
expected.

- Neil Conway


On Dec. 20, 2016, 11:54 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54889/
> ---
> 
> (Updated Dec. 20, 2016, 11:54 a.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Neil Conway.
> 
> 
> Bugs: MESOS-6820
> https://issues.apache.org/jira/browse/MESOS-6820
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test FaultToleranceTest.FrameworkReregister observes a single
> offer in order to monitor progress and as a trigger to initiate other
> orders later in the test. The change installs expectations for
> possibly additional offers. This allows for the thread running the
> test body to be delayed with respect to the master thread which could
> in the meantime hand out additional offers.
> 
> 
> Diffs
> -
> 
>   src/tests/fault_tolerance_tests.cpp 
> 05937a917a2c175aa53b52488febb7cfd8400a13 
> 
> Diff: https://reviews.apache.org/r/54889/diff/
> 
> 
> Testing
> ---
> 
> `make check` (OS X, clang-trunk, w/ optimizations, SSL)
> 
> Before this fix `FaultToleranceTest.FrameworkReregister` failed for me 
> multiple times in ~10k iterations; after this patch it didn't fail in a 
> single run I stopped after 170k iterations.
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 54901: Updated description of "--agent_reregister_timeout".

2016-12-20 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54901/#review159724
---


Ship it!




Ship It!

- Vinod Kone


On Dec. 20, 2016, 4:09 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54901/
> ---
> 
> (Updated Dec. 20, 2016, 4:09 p.m.)
> 
> 
> Review request for mesos, haosdent huang and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The behavior of this flag changed in Mesos 1.1.0.
> 
> 
> Diffs
> -
> 
>   docs/configuration.md efe3e9bd9d203a7ba44adf4ead24f14b8b577637 
>   src/master/constants.hpp 5dd0667f62d2c0617cc0d5aed8cc005bd8344c88 
>   src/master/flags.cpp e5edfd0a0c529a10dc602ef9a88a0ec60c69 
> 
> Diff: https://reviews.apache.org/r/54901/diff/
> 
> 
> Testing
> ---
> 
> No functional change.
> 
> 
> Thanks,
> 
> Neil Conway
> 
>



Re: Review Request 54901: Updated description of "--agent_reregister_timeout".

2016-12-20 Thread haosdent huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54901/#review159722
---


Ship it!




Ship It!

- haosdent huang


On Dec. 20, 2016, 4:09 p.m., Neil Conway wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54901/
> ---
> 
> (Updated Dec. 20, 2016, 4:09 p.m.)
> 
> 
> Review request for mesos, haosdent huang and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The behavior of this flag changed in Mesos 1.1.0.
> 
> 
> Diffs
> -
> 
>   docs/configuration.md efe3e9bd9d203a7ba44adf4ead24f14b8b577637 
>   src/master/constants.hpp 5dd0667f62d2c0617cc0d5aed8cc005bd8344c88 
>   src/master/flags.cpp e5edfd0a0c529a10dc602ef9a88a0ec60c69 
> 
> Diff: https://reviews.apache.org/r/54901/diff/
> 
> 
> Testing
> ---
> 
> No functional change.
> 
> 
> Thanks,
> 
> Neil Conway
> 
>



Review Request 54901: Updated description of "--agent_reregister_timeout".

2016-12-20 Thread Neil Conway

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54901/
---

Review request for mesos, haosdent huang and Vinod Kone.


Repository: mesos


Description
---

The behavior of this flag changed in Mesos 1.1.0.


Diffs
-

  docs/configuration.md efe3e9bd9d203a7ba44adf4ead24f14b8b577637 
  src/master/constants.hpp 5dd0667f62d2c0617cc0d5aed8cc005bd8344c88 
  src/master/flags.cpp e5edfd0a0c529a10dc602ef9a88a0ec60c69 

Diff: https://reviews.apache.org/r/54901/diff/


Testing
---

No functional change.


Thanks,

Neil Conway



Review Request 54898: Added a CHECK in updateFrameworkInfo.

2016-12-20 Thread Jay Guo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54898/
---

Review request for mesos, Guangya Liu and Joris Van Remoortere.


Repository: mesos


Description
---

Added a CHECK in updateFrameworkInfo.


Diffs
-

  src/master/master.hpp 89b3c394b268a8645885412aeb19980db8d73c69 

Diff: https://reviews.apache.org/r/54898/diff/


Testing
---

make check


Thanks,

Jay Guo



Re: Review Request 52071: Updated docs to handle resources with no size in agent flags.

2016-12-20 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52071/#review159717
---



Patch looks great!

Reviews applied: [51999, 52002, 51879, 51880, 52071]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Dec. 20, 2016, 2:05 a.m., Anindya Sinha wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52071/
> ---
> 
> (Updated Dec. 20, 2016, 2:05 a.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6062
> https://issues.apache.org/jira/browse/MESOS-6062
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When a resource with no size is specified in `--resources` startup
> flag of mesos agent, the size is auto detected by the agent. This
> is enabled for all known resource types (except gpus) only when
> the resources are specified in JSON format.
> For disk resources, this is done for both root disks as well as
> MOUNT and PATH disks.
> 
> 
> Diffs
> -
> 
>   docs/attributes-resources.md 2caa44927d45c0ab13f8160794b50f4fde3711aa 
> 
> Diff: https://reviews.apache.org/r/52071/diff/
> 
> 
> Testing
> ---
> 
> Documentation change only. All tests passed.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>



Re: Review Request 54613: Install a symlink rather than building mesos-slave twice.

2016-12-20 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54613/#review159714
---


Ship it!





src/Makefile.am (line 2446)


Not yours, but since you touch this it would be great if you could make 
these independent commands (i.e., remove the ` &&` here).

I added more cleanups here, https://reviews.apache.org/r/54896/.


- Benjamin Bannier


On Dec. 20, 2016, 1:11 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54613/
> ---
> 
> (Updated Dec. 20, 2016, 1:11 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Alex Clemmer, Joseph Wu, Michael 
> Park, Vinod Kone, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6772
> https://issues.apache.org/jira/browse/MESOS-6772
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Install a symlink rather than building mesos-slave twice.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 0f62ec70816e8b48e19d35036285656a6e7cd02b 
> 
> Diff: https://reviews.apache.org/r/54613/diff/
> 
> 
> Testing
> ---
> 
> Make install:
> ```
> [jpeach@jpeach mesos.git]$ ls -l /opt/mesos/sbin/
> total 10448
> -rwxr-xr-x 1 root root 5116104 Dec  9 15:27 mesos-agent*
> -rwxr-xr-x 1 root root 5539096 Dec  9 15:27 mesos-master*
> lrwxrwxrwx 1 root root  11 Dec  9 15:27 mesos-slave -> mesos-agent*
> ```
> 
> 
> Thanks,
> 
> James Peach
> 
>



Review Request 54896: Fixed copy-template-and-create-symlink make target.

2016-12-20 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54896/
---

Review request for mesos, James Peach and Till Toenshoff.


Repository: mesos


Description
---

This target was `PHONY`, but not declared as such. It also explicitly
used `ln` with a `--force` flag instead of the more generally
applicable and suggested `$(LN_S)` we use elsewhere.

This change fixes above issues. We also use independent shell commands
for independent commands and fix some whitespace issues.


Diffs
-

  src/Makefile.am 0f62ec70816e8b48e19d35036285656a6e7cd02b 

Diff: https://reviews.apache.org/r/54896/diff/


Testing
---

`make distcheck` (Fedora 25)


Thanks,

Benjamin Bannier



Re: Review Request 53791: Use the stout ELF parser to collect Linux rootfs files.

2016-12-20 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53791/#review159711
---



Patch looks great!

Reviews applied: [53790, 54712, 54878, 53791]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Dec. 19, 2016, 11:24 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53791/
> ---
> 
> (Updated Dec. 19, 2016, 11:24 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jie Yu, Kevin Klues, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6588
> https://issues.apache.org/jira/browse/MESOS-6588
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The dependencies for the programs we need in the Linux root
> filesystem vary over time and across distributions. Since Stout
> has support for parsing the library dependencies of ELF binaries,
> use this to collect the necessary dependencies when constructing
> a root filesystem for the tests.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 0f62ec70816e8b48e19d35036285656a6e7cd02b 
>   src/tests/containerizer/rootfs.hpp 6bc3835cbb62536ec933ef38c9e15138b8611e5f 
>   src/tests/containerizer/rootfs.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/53791/diff/
> 
> 
> Testing
> ---
> 
> sudo make check (Fedora 24)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 51028: Fix broken tests post batched event allocation.

2016-12-20 Thread Jacob Janco

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51028/
---

(Updated Dec. 20, 2016, 12:05 p.m.)


Review request for mesos and Jiang Yan Xu.


Summary (updated)
-

Fix broken tests post batched event allocation.


Bugs: MESOS-3157
https://issues.apache.org/jira/browse/MESOS-3157


Repository: mesos


Description
---

- Per MESOS-3157, if cluster events with the possibility
  of triggering an allocation occur rapidly and test
  assertions depend on gleaning information from assumed
  order, it is likely they will fail due to lack of parity
  between event and actual allocation. Ensure that expected
  allocations occur.


Diffs (updated)
-

  src/tests/hierarchical_allocator_tests.cpp 
1edd0ecc8a93cd41532e1cf3641f67c780ab23a5 

Diff: https://reviews.apache.org/r/51028/diff/


Testing
---

make check


Thanks,

Jacob Janco



Re: Review Request 51028: Fix tests with rapidly triggered allocations.

2016-12-20 Thread Jacob Janco


> On Sept. 29, 2016, 3:06 a.m., Guangya Liu wrote:
> > src/tests/hierarchical_allocator_tests.cpp, lines 3623-3627
> > 
> >
> > Instead of killing this part, what about updating `offerCallback` 
> > function to record the number of `agents`? And then assert the number of 
> > `agents` here is same as number of agents in cluster.

Added in logging for the offer callbacks which do correspond to actual runs 
performed as suggested below, what do you think?


- Jacob


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51028/#review150798
---


On Sept. 28, 2016, 9:16 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51028/
> ---
> 
> (Updated Sept. 28, 2016, 9:16 p.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Per MESOS-3157, if cluster events with the possibility
>   of triggering an allocation occur rapidly and test
>   assertions depend on gleaning information from assumed
>   order, it is likely they will fail due to lack of parity
>   between event and actual allocation. Ensure that expected
>   allocations occur.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_tests.cpp 
> 46553f6d501deb37862dd5ba48be1c114f6e6cb8 
> 
> Diff: https://reviews.apache.org/r/51028/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 51028: Fix tests with rapidly triggered allocations.

2016-12-20 Thread Jacob Janco


> On Dec. 13, 2016, 4:30 p.m., Jiang Yan Xu wrote:
> > src/tests/hierarchical_allocator_tests.cpp, lines 3005-3006
> > 
> >
> > Similar to Guangya's suggestion, I think we can elaborate on the reason 
> > to use tow agents and two frameworks.
> > 
> > ```
> > Trigger at least two allocation runs to generate the window statistics. 
> > Here we are using two pairs of `addSlave()` and `addFramework()` calls 
> > because each guarantees one allocation run that yields an `Allocation`.
> > ```

I realized all we need is to await the allocation after `addSlave()` as well as 
`addFramework()` to ensure that we have window metrics.


> On Dec. 13, 2016, 4:30 p.m., Jiang Yan Xu wrote:
> > src/tests/hierarchical_allocator_tests.cpp, lines 3621-3629
> > 
> >
> > I think we are still intereted in the number of allocations and we 
> > expect to see them decrease because of batching. We don't use EXPECT on 
> > them but we print them out as part of the benchmark?

Added this in. Ex test output: 

[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5
Using 1000 agents and 1000 frameworks
Added 1000 frameworks in 18064us
Added 1000 agents in 5.229825secs performing 1000 allocations
Updated 1000 agents in 5.39987secs performing 1000 allocations
[   OK ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.AddAndUpdateSlave/5 
(10708 ms)


- Jacob


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51028/#review158967
---


On Sept. 28, 2016, 9:16 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51028/
> ---
> 
> (Updated Sept. 28, 2016, 9:16 p.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Per MESOS-3157, if cluster events with the possibility
>   of triggering an allocation occur rapidly and test
>   assertions depend on gleaning information from assumed
>   order, it is likely they will fail due to lack of parity
>   between event and actual allocation. Ensure that expected
>   allocations occur.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_tests.cpp 
> 46553f6d501deb37862dd5ba48be1c114f6e6cb8 
> 
> Diff: https://reviews.apache.org/r/51028/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Re: Review Request 51028: Fix tests with rapidly triggered allocations.

2016-12-20 Thread Jacob Janco


> On Sept. 28, 2016, 11:02 p.m., Guangya Liu wrote:
> > src/tests/hierarchical_allocator_tests.cpp, line 2974
> > 
> >
> > Can you please add some comments here for why not using 
> > `Clock::pause()`?

Added this back.


- Jacob


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51028/#review150773
---


On Sept. 28, 2016, 9:16 p.m., Jacob Janco wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51028/
> ---
> 
> (Updated Sept. 28, 2016, 9:16 p.m.)
> 
> 
> Review request for mesos and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-3157
> https://issues.apache.org/jira/browse/MESOS-3157
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> - Per MESOS-3157, if cluster events with the possibility
>   of triggering an allocation occur rapidly and test
>   assertions depend on gleaning information from assumed
>   order, it is likely they will fail due to lack of parity
>   between event and actual allocation. Ensure that expected
>   allocations occur.
> 
> 
> Diffs
> -
> 
>   src/tests/hierarchical_allocator_tests.cpp 
> 46553f6d501deb37862dd5ba48be1c114f6e6cb8 
> 
> Diff: https://reviews.apache.org/r/51028/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jacob Janco
> 
>



Review Request 54889: Handled all possible offers in test.

2016-12-20 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54889/
---

Review request for mesos and Alexander Rukletsov.


Bugs: MESOS-6820
https://issues.apache.org/jira/browse/MESOS-6820


Repository: mesos


Description
---

The test FaultToleranceTest.FrameworkReregister observes a single
offer in order to monitor progress and as a trigger to initiate other
orders later in the test. The change installs expectations for
possibly additional offers. This allows for the thread running the
test body to be delayed with respect to the master thread which could
in the meantime hand out additional offers.


Diffs
-

  src/tests/fault_tolerance_tests.cpp 05937a917a2c175aa53b52488febb7cfd8400a13 

Diff: https://reviews.apache.org/r/54889/diff/


Testing
---

`make check` (OS X, clang-trunk, w/ optimizations, SSL)

Before this fix `FaultToleranceTest.FrameworkReregister` failed for me multiple 
times in ~10k iterations; after this patch it didn't fail in a single run I 
stopped after 170k iterations.


Thanks,

Benjamin Bannier



Re: Review Request 54875: CMake: Fixed typo `-DHAS_AUTHENTICATION=0` instructions.

2016-12-20 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54875/#review159699
---



Patch looks great!

Reviews applied: [54875]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker_build.sh

- Mesos ReviewBot


On Dec. 19, 2016, 10:12 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54875/
> ---
> 
> (Updated Dec. 19, 2016, 10:12 p.m.)
> 
> 
> Review request for mesos and Alex Clemmer.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Windows builds without this flag instruct you to add it,
> but the instructions were missing the prefix `D` for define,
> and so as-is resulted in an error message.
> This could be a source of frustration for those who
> do not know CMake.
> 
> 
> Diffs
> -
> 
>   cmake/CompilationConfigure.cmake d8575873eb315f812ba281564c42d848915349de 
> 
> Diff: https://reviews.apache.org/r/54875/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 54803: Fixed `SlaveTests` to pass when `HAS_AUTHENTICATION` is undefined.

2016-12-20 Thread Alex Clemmer


> On Dec. 20, 2016, 12:01 a.m., Greg Mann wrote:
> > I was able to catch one flaky test by running the agent tests in 
> > repetition. For the other patches you're working on, I would recommend 
> > running the altered tests for a while with `--gtest_repeat=-1 
> > --gtest_break_on_failure` to check for flakiness.

Thanks for the tip. This time I verified this solution with:

```
make mesos-tests -j4 && ./src/mesos-tests --gtest_repeat=1000 
--gtest_break_on_failure 
--gtest_filter="SlaveTest.DuplicateTerminalUpdateBeforeAck:SlaveTest.MetricsSlaveLaunchErrors:SlaveTest.StateEndpoint:SlaveTest.PingTimeoutNoPings:SlaveTest.PingTimeoutSomePings:SlaveTest.ReregisterWithStatusUpdateTaskState"
```


- Alex


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54803/#review159647
---


On Dec. 17, 2016, 11:01 p.m., Alex Clemmer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54803/
> ---
> 
> (Updated Dec. 17, 2016, 11:01 p.m.)
> 
> 
> Review request for mesos, Adam B, Andrew Schwartzmeyer, Daniel Pravat, Greg 
> Mann, John Kordich, Joseph Wu, and Vinod Kone.
> 
> 
> Bugs: MESOS-6803
> https://issues.apache.org/jira/browse/MESOS-6803
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently, when `HAS_AUTHENTICATION` is undefined, the Agent will
> use `delay` to schedule a random time in the future to register with the
> Master, to avoid the thundering herd problem after a Master failover.
> The authentication codepath, in contrast, schedules the registration
> immediately.
> 
> In tests where we have `Clock::pause`'d when we are supposed to be
> registering the slave, the authention codepath will succeeed, while
> no-authentication codepath will hang forever.
> 
> A much more detailed analysis of this situation exists in MESOS-6803.
> 
> This commit will resolve this issue for `slave_tests.cpp` by changing
> the tests to not use `Clock::pause` when we are waiting for Agent
> registration.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_tests.cpp d956a326ef29bf29837e0587a14bae457147cbca 
> 
> Diff: https://reviews.apache.org/r/54803/diff/
> 
> 
> Testing
> ---
> 
> Added `delay` to the call to `authenticate` in `Slave::detected`, ran tests 
> to find failing tests in `SlaveTest.*`, then fixed, then ran again.
> 
> 
> Thanks,
> 
> Alex Clemmer
> 
>



Re: Review Request 54803: Fixed `SlaveTests` to pass when `HAS_AUTHENTICATION` is undefined.

2016-12-20 Thread Alex Clemmer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54803/
---

(Updated Dec. 20, 2016, 10:37 a.m.)


Review request for mesos, Adam B, Andrew Schwartzmeyer, Daniel Pravat, Greg 
Mann, John Kordich, Joseph Wu, and Vinod Kone.


Changes
---

Addressed Greg's comments.


Bugs: MESOS-6803
https://issues.apache.org/jira/browse/MESOS-6803


Repository: mesos


Description
---

Currently, when `HAS_AUTHENTICATION` is undefined, the Agent will
use `delay` to schedule a random time in the future to register with the
Master, to avoid the thundering herd problem after a Master failover.
The authentication codepath, in contrast, schedules the registration
immediately.

In tests where we have `Clock::pause`'d when we are supposed to be
registering the slave, the authention codepath will succeeed, while
no-authentication codepath will hang forever.

A much more detailed analysis of this situation exists in MESOS-6803.

This commit will resolve this issue for `slave_tests.cpp` by changing
the tests to not use `Clock::pause` when we are waiting for Agent
registration.


Diffs (updated)
-

  src/tests/slave_tests.cpp d956a326ef29bf29837e0587a14bae457147cbca 

Diff: https://reviews.apache.org/r/54803/diff/


Testing
---

Added `delay` to the call to `authenticate` in `Slave::detected`, ran tests to 
find failing tests in `SlaveTest.*`, then fixed, then ran again.


Thanks,

Alex Clemmer