Re: Review Request 63200: Added ContainerMountInfo to avoid pre_exec_commands for mounts.

2017-10-20 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63200/
---

(Updated Oct. 21, 2017, 4:50 a.m.)


Review request for mesos, Gilbert Song, James Peach, and Joseph Wu.


Bugs: MESOS-7306
https://issues.apache.org/jira/browse/MESOS-7306


Repository: mesos


Description
---

Added ContainerMountInfo protobuf in ContainerLaunchInfo. It is used
by isolators and MesosContainerizer to describe mounts. Previously,
the additional mounts for each container is specified using
'pre_exec_commands', which is not very flexible and slow (requires a
fork/exec for each mount).


Diffs
-

  include/mesos/slave/containerizer.proto 
689acfcbbb07f071b6195472118a7a7520a44abd 
  src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp 
ed6556830c1d6a72df22c1780ac77fe7d6225f99 
  src/slave/containerizer/mesos/isolators/filesystem/linux.hpp 
c7fbda55e619052edfa5cd97114544d8fefc3ea5 
  src/slave/containerizer/mesos/isolators/filesystem/linux.cpp 
faf94909f995f7486b5f9cb7532af58a90a9eed3 
  src/slave/containerizer/mesos/isolators/filesystem/shared.cpp 
927d95be2c25ceb324acac99edaee24a28a5d59c 
  src/slave/containerizer/mesos/isolators/gpu/isolator.cpp 
25636b50314cb4ea53d1931c97c87c65ecb658a8 
  src/slave/containerizer/mesos/isolators/namespaces/pid.cpp 
42bc2e1245c9c957604e6625244a48a564ca5a8c 
  src/slave/containerizer/mesos/isolators/volume/host_path.cpp 
949510aba6b1f30a64f5f9a0777a4f03e2701302 
  src/slave/containerizer/mesos/isolators/volume/image.cpp 
35966aa9a8702e53fa95b3e045cba9327623b99a 
  src/slave/containerizer/mesos/isolators/volume/sandbox_path.cpp 
ee5ea3dee3be197e923be544aab96806f0adf1cf 
  src/slave/containerizer/mesos/launch.cpp 
49f11f1d586672bb46f6eccabcfda9321cc3c607 


Diff: https://reviews.apache.org/r/63200/diff/1/


Testing
---

sudo make check


Thanks,

Jie Yu



Review Request 63200: Added ContainerMountInfo to avoid pre_exec_commands for mounts.

2017-10-20 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63200/
---

Review request for mesos, Gilbert Song, James Peach, and Joseph Wu.


Repository: mesos


Description
---

Added ContainerMountInfo protobuf in ContainerLaunchInfo. It is used
by isolators and MesosContainerizer to describe mounts. Previously,
the additional mounts for each container is specified using
'pre_exec_commands', which is not very flexible and slow (requires a
fork/exec for each mount).


Diffs
-

  include/mesos/slave/containerizer.proto 
689acfcbbb07f071b6195472118a7a7520a44abd 
  src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp 
ed6556830c1d6a72df22c1780ac77fe7d6225f99 
  src/slave/containerizer/mesos/isolators/filesystem/linux.hpp 
c7fbda55e619052edfa5cd97114544d8fefc3ea5 
  src/slave/containerizer/mesos/isolators/filesystem/linux.cpp 
faf94909f995f7486b5f9cb7532af58a90a9eed3 
  src/slave/containerizer/mesos/isolators/filesystem/shared.cpp 
927d95be2c25ceb324acac99edaee24a28a5d59c 
  src/slave/containerizer/mesos/isolators/gpu/isolator.cpp 
25636b50314cb4ea53d1931c97c87c65ecb658a8 
  src/slave/containerizer/mesos/isolators/namespaces/pid.cpp 
42bc2e1245c9c957604e6625244a48a564ca5a8c 
  src/slave/containerizer/mesos/isolators/volume/host_path.cpp 
949510aba6b1f30a64f5f9a0777a4f03e2701302 
  src/slave/containerizer/mesos/isolators/volume/image.cpp 
35966aa9a8702e53fa95b3e045cba9327623b99a 
  src/slave/containerizer/mesos/isolators/volume/sandbox_path.cpp 
ee5ea3dee3be197e923be544aab96806f0adf1cf 
  src/slave/containerizer/mesos/launch.cpp 
49f11f1d586672bb46f6eccabcfda9321cc3c607 


Diff: https://reviews.apache.org/r/63200/diff/1/


Testing
---

sudo make check


Thanks,

Jie Yu



Re: Review Request 63174: Added a benchmark for agent reregistration during master failover.

2017-10-20 Thread Jiang Yan Xu


> On Oct. 19, 2017, 6:38 p.m., Benjamin Mahler wrote:
> > Thanks Yan! I will dig in soon.
> > 
> > Just some quick questions:
> > 
> > (1) I thought during the meeting you said it was taking a minute, but 
> > looking at all the benchmark timings they're all under a second? Is it only 
> > the benchmark setup that's expensive here?
> > (2) Is this with the lock free event & run queues? If not, how much do they 
> > help?
> > (3) As an aside, it has come up before, but it would be useful to be able 
> > to force the messages to go through the remote stack rather than the local 
> > stack. No need to think about this yet, but just something to keep in mind 
> > as not being accurate in this benchmark.

1) Yeah looks like it. I used to include the setup time so it was large. 
2) Yeah I have used `--enable-optimize --enable-lock-free-run-queue 
--enable-lock-free-event-queue 
--enable-last-in-first-out-fixed-size-semaphore`. I could compare with the perf 
without them.
3) Right right I think we should keep that in mind and we should have tests 
that cover the remote stack. For the case here I thought it would be a simple 
and good-enough start since the local stack alright coveres the proto 
(de)serliazation and the rest of the libprocess optimization that we recently 
have improved.


- Jiang Yan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63174/#review188799
---


On Oct. 19, 2017, 4:28 p.m., Jiang Yan Xu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63174/
> ---
> 
> (Updated Oct. 19, 2017, 4:28 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Dmitry Zhuk, and Ilya Pronin.
> 
> 
> Bugs: MESOS-8098
> https://issues.apache.org/jira/browse/MESOS-8098
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The current benchmark is very simple: without framework involvement and 
> without agent retries but it's possible to add a number of others so I am 
> creating a new file for them.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 936bc49ddfca03b9278ab11b6d317f3ff635cb00 
>   src/tests/CMakeLists.txt 386e0473c93d0a993248c7818067071d0c761c76 
>   src/tests/master_benchmarks.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/63174/diff/1/
> 
> 
> Testing
> ---
> 
> Benchmark based off 
> https://github.com/apache/mesos/commit/41193181d6b75eeecae2729bf98007d9318e351a
>  (close to current HEAD).
> 
> ```
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
> Reregistered 2000 agents with a total of 50 running tasks and 50 
> completed tasks in 45.075488ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
>  (48126 ms)
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
> Reregistered 2000 agents with a total of 100 running tasks and 0 
> completed tasks in 14.172361ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
>  (45979 ms)
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
> Reregistered 2 agents with a total of 100 running tasks and 0 
> completed tasks in 413.508328ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
>  (49487 ms)
> [--] 3 tests from 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (143596 ms total)
> 
> ...
> 
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
> Reregistered 2000 agents with a total of 50 running tasks and 50 
> completed tasks in 32.787363ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
>  (48266 ms)
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
> Reregistered 2000 agents with a total of 100 running tasks and 0 
> completed tasks in 19.735003ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
>  (46169 ms)
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
> Reregistered 2 agents with a total of 100 running tasks and 0 
> completed tasks in 321.267267ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
>  (51550 ms)
> [--] 3 tests from 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (145987 ms total)
> ```
> 
> Benchmark based off 
> https://github.com/apache/mesos/commit/d9c90bf1d9c8b3a7dcc47be0cb773efff57cfb9d
>  (before 

Re: Review Request 63190: Moved the search for the containerizer executable launch path.

2017-10-20 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63190/#review188873
---


Ship it!




Ship It!

- Jie Yu


On Oct. 20, 2017, 9:19 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63190/
> ---
> 
> (Updated Oct. 20, 2017, 9:19 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Improved code readability by hoisting the code that searches for
> the containerizer executable launch path up to be near where the
> executable path is first determined.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/launch.cpp 
> d1372987cfbd3761d344479992b2ca6bb918540c 
> 
> 
> Diff: https://reviews.apache.org/r/63190/diff/1/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26).
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 63191: Refactored entering the chroot in the container launch.

2017-10-20 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63191/#review188872
---


Ship it!




Ship It!

- Jie Yu


On Oct. 20, 2017, 9:20 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63191/
> ---
> 
> (Updated Oct. 20, 2017, 9:20 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Hoisted the platform-dependent chroot code out into a helper function
> to improve readability. Improved the error message to always include
> the rootfs we attempted to enter.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/launch.cpp 
> d1372987cfbd3761d344479992b2ca6bb918540c 
> 
> 
> Diff: https://reviews.apache.org/r/63191/diff/1/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 63189: Refactored installing rlimits in the container launch.

2017-10-20 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63189/#review188874
---


Ship it!




Ship It!

- Jie Yu


On Oct. 20, 2017, 9:19 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63189/
> ---
> 
> (Updated Oct. 20, 2017, 9:19 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Hoisted the platform-dependent rlimit installation into a helper
> function to improve readability. Improved the error message to
> specify which rlimit we failed to set.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/launch.cpp 
> d1372987cfbd3761d344479992b2ca6bb918540c 
> 
> 
> Diff: https://reviews.apache.org/r/63189/diff/1/
> 
> 
> Testing
> ---
> 
> make check (Fedora 26).
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 63183: Stopped awaiting the connected event in ports isolator tests.

2017-10-20 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63183/#review188871
---



Patch looks great!

Reviews applied: [63183]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Oct. 20, 2017, 4:27 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63183/
> ---
> 
> (Updated Oct. 20, 2017, 4:27 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Gaston Kleiman.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Rather than explicitly waiting for the `connected` scheduler event,
> consistently apply the testing pattern from the default executor
> tests. We expect that the `connected` event happens, but we only
> need to synchronize the test on the `subscribed` event.
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/ports_isolator_tests.cpp 
> 7a19193d2027d51319c3a69a97bfae1fa819845b 
> 
> 
> Diff: https://reviews.apache.org/r/63183/diff/1/
> 
> 
> Testing
> ---
> 
> sudo ./mesos-tests --gtest_repeat=10 --gtest_break_on_failure 
> --gtest_filter=NetworkPorts*
> 
> (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Review Request 63191: Refactored entering the chroot in the container launch.

2017-10-20 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63191/
---

Review request for mesos and Jie Yu.


Repository: mesos


Description
---

Hoisted the platform-dependent chroot code out into a helper function
to improve readability. Improved the error message to always include
the rootfs we attempted to enter.


Diffs
-

  src/slave/containerizer/mesos/launch.cpp 
d1372987cfbd3761d344479992b2ca6bb918540c 


Diff: https://reviews.apache.org/r/63191/diff/1/


Testing
---

make check (Fedora 26)


Thanks,

James Peach



Review Request 63190: Moved the search for the containerizer executable launch path.

2017-10-20 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63190/
---

Review request for mesos and Jie Yu.


Repository: mesos


Description
---

Improved code readability by hoisting the code that searches for
the containerizer executable launch path up to be near where the
executable path is first determined.


Diffs
-

  src/slave/containerizer/mesos/launch.cpp 
d1372987cfbd3761d344479992b2ca6bb918540c 


Diff: https://reviews.apache.org/r/63190/diff/1/


Testing
---

make check (Fedora 26).


Thanks,

James Peach



Review Request 63189: Refactored installing rlimits in the container launch.

2017-10-20 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63189/
---

Review request for mesos and Jie Yu.


Repository: mesos


Description
---

Hoisted the platform-dependent rlimit installation into a helper
function to improve readability. Improved the error message to
specify which rlimit we failed to set.


Diffs
-

  src/slave/containerizer/mesos/launch.cpp 
d1372987cfbd3761d344479992b2ca6bb918540c 


Diff: https://reviews.apache.org/r/63189/diff/1/


Testing
---

make check (Fedora 26).


Thanks,

James Peach



Re: Review Request 63018: Added filesystem layout for local resource providers.

2017-10-20 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63018/
---

(Updated Oct. 20, 2017, 8:28 p.m.)


Review request for mesos, Jie Yu, Joseph Wu, and Jan Schlicht.


Changes
---

Addressed Jie's comments.


Bugs: MESOS-8097
https://issues.apache.org/jira/browse/MESOS-8097


Repository: mesos


Description
---

A local resource provide can use
`work_dir/resource_providers///` to store whatever data
whose lifecycle is not tied to the agent ID, and store checkpoints,
whose lifecycles should be tied to the agent ID, under
`work_dir/meta/slaves/latest/resource_providers///`.


Diffs (updated)
-

  src/resource_provider/daemon.cpp d584eb9d7aa75522aec97277674321061b90fbed 
  src/resource_provider/local.hpp ebaa07d03ad77d516066ee2d4b60864be0611b5f 
  src/resource_provider/local.cpp ad98f333c5668ca81de6e7ed3fc8f59323b151da 
  src/resource_provider/storage/provider.hpp 
6de88c2329b358fcf48bc39ddda0132170991c3c 
  src/resource_provider/storage/provider.cpp 
46224997430ac0c568904d80014166a6f059907f 
  src/slave/paths.hpp f000508d414daf9f943561f89c7105503a8a98b3 
  src/slave/paths.cpp fd546525b900cb6524fb9196d19616ef18de0f30 


Diff: https://reviews.apache.org/r/63018/diff/2/

Changes: https://reviews.apache.org/r/63018/diff/1-2/


Testing
---

make


Thanks,

Chun-Hung Hsiao



Re: Review Request 63021: Added functions to launch CSI plugin in storage local resource provider.

2017-10-20 Thread Chun-Hung Hsiao


> On Oct. 20, 2017, 12:26 a.m., Jie Yu wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 237 (patched)
> > 
> >
> > I'd just use lambda here:
> > ```
> > driver->send(evolve(call))
> >   .onFailed(defer(self(), [=](const string& failure) {
> > LOG(ERROR) << "Failed to subscribe resource provider with type "
> ><< "'" info.type() << "' and name "
> ><< "'" << info.name() << "': " << failure;
> >   }))
> >   ...
> > ```

Is it perferred to have `onFailed(...).onDiscarded(...)` with similar error 
logging code in both, or use `onAny(... LOG(ERROR) << ... << (future.isFailed() 
? future.failure() : "future discarded"; ...)`?


> On Oct. 20, 2017, 12:26 a.m., Jie Yu wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 291 (patched)
> > 
> >
> > Let's introduce a paths.hpp helper for SLRP:
> > ```
> > src/resource_provider/storage/paths.hpp
> > ```
> > 
> > This can be `getCSIEndpointDirSymlinkPath`

Do you want to expose this in `src/slave/paths.hpp`? I didn't create a helper 
because I think it's an SLRP-internal thing.


> On Oct. 20, 2017, 12:26 a.m., Jie Yu wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 296 (patched)
> > 
> >
> > What if someone choose a very long name for the provider? I would 
> > suggest we just use `os::mkdtemp` here. In fact, `os::temp` depends on 
> > TMPDIR env var. For instance, on my mac:
> > 
> > ```
> > Jies-MacBook-Pro:tmp jie$ echo $TMPDIR
> > /var/folders/cs/xp4tynrs69v0bbx55l4k5lx8gn/T/
> > ```
> > 
> > It can be very long too. But let's punt that for now as it should be 
> > rare.

I'm aware of `TMPDIR`. I've seen that both `os::temp` and 
`path::join(os::PATH_SEPARATOR, "tmp")` have been used in the codepath. Since 
we have the concern about the length, let's just use the latter, as it is safe 
to assume the standard `/tmp` dir exists in Un*x systems.


> On Oct. 20, 2017, 12:26 a.m., Jie Yu wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 337 (patched)
> > 
> >
> > You need to create metadata dir as well.
> > 
> > Although I don't quite like the fact that it'll be a CHECK failure if 
> > the creation fails. Ideally, we check the return value and call `fatal` if 
> > the creation failed.
> > 
> > Also, depends on the order you create both directories, the recovery 
> > logic in `initialize` might be different. In the recovery logic, it's 
> > likely one directory does not have a symlink (depends on the order here).
> > 
> > One way is to only treat this RP as an existing RP if both directories 
> > exists in `initialize`.

If the agent reregistered with a different slave ID, then only the top-level RP 
work dir will exist. I'd treat the lifecycle of the RP and the plugins 
differently: RP is an existing one if the meta dir exists, and plugin exists if 
the csi endpoint dir exists. The recovery logic now does not involve anything 
about the RP itself yet.


> On Oct. 20, 2017, 12:26 a.m., Jie Yu wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 341-363 (patched)
> > 
> >
> > Any reason do not move the logic in `initialize` related to csi 
> > endpoint dir here? It'll be much more readable if we group all logics 
> > related to CSIEndpointDir initialization here:
> > 
> > Basically, you want to "ensure" that the directory and symlink is 
> > properly setup:
> > 
> > 1) If symlink is there, and the linked dir is there, nothing needs to 
> > be one
> > 2) If symlink is there, and the linked dir is not there, do the same as 
> > 3)
> > 3) If symlink does not exist, create the tmp dir and link it

The only reason is to fail fast. But yes I agree that it will be more readible 
if we put all related logic here. Will do.


> On Oct. 20, 2017, 12:26 a.m., Jie Yu wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 385-386 (patched)
> > 
> >
> > My thinking is that in `StorageLocalResourceProviderProcess`, we 
> > maintain two `Future`, one for controller service and one for 
> > node service.
> > 
> > ```
> > Future controllerService;
> > Future nodeService;
> > ```
> > 
> > The above two will become ready once CSI plugin containers are properly 
> > launched.
> > 
> > Whenever the SLRP wants to use the csi client to talk to the plugin, 
> > it'll do something like this:
> > 
> > ```
> > 

Re: Review Request 63095: Added the Getting Started landing page.

2017-10-20 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63095/#review188857
---


Ship it!




Ship It!

- Greg Mann


On Oct. 18, 2017, 6:46 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63095/
> ---
> 
> (Updated Oct. 18, 2017, 6:46 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-8117
> https://issues.apache.org/jira/browse/MESOS-8117
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> After moving the build documentation to its own page, we can now have a
> real "Getting Started" page suitable for anyone to get started with
> Mesos. It is purposefully short, and therefore not overwhelming.
> 
> 
> Diffs
> -
> 
>   site/config.rb 9e6738fccc48365e16221188269c1ed40772eca7 
>   site/source/downloads.html.erb 3e24100a76a99b23ebc6589ee4f7332f43ef0ac9 
>   site/source/getting-started.html.md PRE-CREATION 
>   site/source/index.html.erb b4e847fadf6682503a4e09139491b539bc59e0fd 
>   site/source/layouts/documentation.erb 
> a91f916a5fb7348b2702c272e7a2059bdf628c66 
>   site/source/layouts/getting_started_section.erb PRE-CREATION 
>   site/source/layouts/layout.erb 9213c63e6fef3f57fc225a87a7d7e0abdcc11a88 
> 
> 
> Diff: https://reviews.apache.org/r/63095/diff/2/
> 
> 
> Testing
> ---
> 
> Built the site and tested it out locally.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 63177: Fixed 'SlaveRecoveryTest.RegisterDisconnectedSlave'.

2017-10-20 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63177/
---

(Updated Oct. 20, 2017, 4:58 p.m.)


Review request for mesos and Alexander Rukletsov.


Bugs: MESOS-7941
https://issues.apache.org/jira/browse/MESOS-7941


Repository: mesos


Description
---

The test was broken in 'de11f0ffed36fbabc8fb4167859fd23b75f43f10'
due to the executor sending one more status update.

Additionally, an explicit verification of the status update
acknowledgement was added to eliminate a common source of
test flakyness.


Diffs
-

  src/tests/slave_recovery_tests.cpp c2d9cc88a25e58887e280bbb7d3a4466525055d5 


Diff: https://reviews.apache.org/r/63177/diff/1/


Testing
---

`./mesos-tests --gtest_filter="SlaveRecoveryTest/0.RegisterDisconnectedSlave" 
--break_on_failure --gtest_repeat=5000`


Thanks,

Benno Evers



Re: Review Request 63076: Windows: Fixed off-by-one error in long path support.

2017-10-20 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63076/
---

(Updated Oct. 20, 2017, 9:57 a.m.)


Review request for mesos, James Peach and Joseph Wu.


Changes
---

Fixed typo and changed some ASSERT to EXPECT.


Bugs: MESOS-8116
https://issues.apache.org/jira/browse/MESOS-8116


Repository: mesos


Description
---

The `CreateDirectoryW` API will fail with a path of 248 characters
exactly, so this needed to be `>=`.


Diffs (updated)
-

  3rdparty/stout/include/stout/internal/windows/longpath.hpp 
446b038b0828c25208dd27f9b7f14ea2d32dfbea 
  3rdparty/stout/tests/os/filesystem_tests.cpp 
0982fd5d35ae1088ec7e768e55b143166c923f8a 


Diff: https://reviews.apache.org/r/63076/diff/3/

Changes: https://reviews.apache.org/r/63076/diff/2-3/


Testing
---

Tested on Windows with a patch of exactly 248 characters.


Thanks,

Andrew Schwartzmeyer



Re: Review Request 63177: Fixed 'SlaveRecoveryTest.RegisterDisconnectedSlave'.

2017-10-20 Thread Alexander Rukletsov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63177/#review188850
---


Ship it!




Ship It!

- Alexander Rukletsov


On Oct. 20, 2017, 9:47 a.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63177/
> ---
> 
> (Updated Oct. 20, 2017, 9:47 a.m.)
> 
> 
> Review request for mesos and Alexander Rukletsov.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test was broken in 'de11f0ffed36fbabc8fb4167859fd23b75f43f10'
> due to the executor sending one more status update.
> 
> Additionally, an explicit verification of the status update
> acknowledgement was added to eliminate a common source of
> test flakyness.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_recovery_tests.cpp c2d9cc88a25e58887e280bbb7d3a4466525055d5 
> 
> 
> Diff: https://reviews.apache.org/r/63177/diff/1/
> 
> 
> Testing
> ---
> 
> `./mesos-tests --gtest_filter="SlaveRecoveryTest/0.RegisterDisconnectedSlave" 
> --break_on_failure --gtest_repeat=5000`
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Re: Review Request 63076: Windows: Fixed off-by-one error in long path support.

2017-10-20 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63076/#review188847
---


Fix it, then Ship it!





3rdparty/stout/include/stout/internal/windows/longpath.hpp
Line 36 (original), 36 (patched)


Not yours, but s/Uniode/Unicode/.



3rdparty/stout/tests/os/filesystem_tests.cpp
Lines 232 (patched)


Nit, this can be an `EXPECT_SOME`.


- James Peach


On Oct. 19, 2017, 11:37 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63076/
> ---
> 
> (Updated Oct. 19, 2017, 11:37 p.m.)
> 
> 
> Review request for mesos, James Peach and Joseph Wu.
> 
> 
> Bugs: MESOS-8116
> https://issues.apache.org/jira/browse/MESOS-8116
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The `CreateDirectoryW` API will fail with a path of 248 characters
> exactly, so this needed to be `>=`.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/internal/windows/longpath.hpp 
> 446b038b0828c25208dd27f9b7f14ea2d32dfbea 
>   3rdparty/stout/tests/os/filesystem_tests.cpp 
> 0982fd5d35ae1088ec7e768e55b143166c923f8a 
> 
> 
> Diff: https://reviews.apache.org/r/63076/diff/2/
> 
> 
> Testing
> ---
> 
> Tested on Windows with a patch of exactly 248 characters.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 63183: Stopped awaiting the connected event in ports isolator tests.

2017-10-20 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63183/#review188848
---


Ship it!




Ship It!

- Gaston Kleiman


On Oct. 20, 2017, 9:27 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63183/
> ---
> 
> (Updated Oct. 20, 2017, 9:27 a.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Gaston Kleiman.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Rather than explicitly waiting for the `connected` scheduler event,
> consistently apply the testing pattern from the default executor
> tests. We expect that the `connected` event happens, but we only
> need to synchronize the test on the `subscribed` event.
> 
> 
> Diffs
> -
> 
>   src/tests/containerizer/ports_isolator_tests.cpp 
> 7a19193d2027d51319c3a69a97bfae1fa819845b 
> 
> 
> Diff: https://reviews.apache.org/r/63183/diff/1/
> 
> 
> Testing
> ---
> 
> sudo ./mesos-tests --gtest_repeat=10 --gtest_break_on_failure 
> --gtest_filter=NetworkPorts*
> 
> (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Review Request 63183: Stopped awaiting the connected event in ports isolator tests.

2017-10-20 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63183/
---

Review request for mesos, Alexander Rukletsov and Gaston Kleiman.


Repository: mesos


Description
---

Rather than explicitly waiting for the `connected` scheduler event,
consistently apply the testing pattern from the default executor
tests. We expect that the `connected` event happens, but we only
need to synchronize the test on the `subscribed` event.


Diffs
-

  src/tests/containerizer/ports_isolator_tests.cpp 
7a19193d2027d51319c3a69a97bfae1fa819845b 


Diff: https://reviews.apache.org/r/63183/diff/1/


Testing
---

sudo ./mesos-tests --gtest_repeat=10 --gtest_break_on_failure 
--gtest_filter=NetworkPorts*

(Fedora 26)


Thanks,

James Peach



Re: Review Request 63104: Added a helper to extract resources from storage operations.

2017-10-20 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63104/#review188846
---


Ship it!




Ship It!

- Jie Yu


On Oct. 20, 2017, 12:50 p.m., Jan Schlicht wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63104/
> ---
> 
> (Updated Oct. 20, 2017, 12:50 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Jie Yu.
> 
> 
> Bugs: MESOS-7594
> https://issues.apache.org/jira/browse/MESOS-7594
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a helper to extract resources from storage operations.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp c43ab75b5492320dfe19a7c723a72ac52b8ab722 
>   src/common/protobuf_utils.cpp fd4858a64dfc136dd03cb1eef4c97d0f8d43bdae 
> 
> 
> Diff: https://reviews.apache.org/r/63104/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>



Re: Review Request 63107: Added operation feedback for storage operations.

2017-10-20 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63107/#review188845
---



Bad patch!

Reviews applied: [63107, 63106, 61947, 63105, 61946, 61810, 63104, 63094, 
62903, 63001]

Failed command: python support/apply-reviews.py -n -r 63107

Error:
2017-10-20 16:16:20 URL:https://reviews.apache.org/r/63107/diff/raw/ 
[13052/13052] -> "63107.patch" [1]
error: patch failed: src/resource_provider/message.hpp:18
error: src/resource_provider/message.hpp: patch does not apply
error: patch failed: src/slave/slave.cpp:6644
error: src/slave/slave.cpp: patch does not apply

Full log: https://builds.apache.org/job/Mesos-Reviewbot/19860/console

- Mesos Reviewbot


On Oct. 20, 2017, 5:56 a.m., Jan Schlicht wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63107/
> ---
> 
> (Updated Oct. 20, 2017, 5:56 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Jie Yu.
> 
> 
> Bugs: MESOS-8087
> https://issues.apache.org/jira/browse/MESOS-8087
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When a framework accepts an offer that contains resource provider
> resources with a storage operation (`CREATE_VOLUME`, `DESTROY_VOLUME`,
> `CREATE_BLOCK`, `DESTROY_BLOCK`), the result of this operation cannot
> be guessed and will only be known after the operation has been
> successfully applied by the resource provider.
> This patch introduces the necessary handling for such operations. The
> internal bookkeeping of available resources in the master and agent has
> been updated to update resources only after operation feedback has been
> received. This ensures that converted resources can only be offered
> after the operation was executed by a resource provider.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp a3c9530e340b14b739da90851227f8eed254b4ac 
>   src/master/master.cpp 3603878f02ae3dba82811a4a5770dd21ec790ef6 
>   src/resource_provider/manager.cpp 31fcb789f5ab907511e868c374c49f7457a33ed3 
>   src/resource_provider/message.hpp 3c7c3f2baeb726e04edd6ffbb9784699d7afe521 
>   src/slave/slave.hpp aea1e948209c7c8945665915bc2f6d8eb47814ef 
>   src/slave/slave.cpp d8477b4e364270f5ad61311aa8fe48f823bc7aac 
> 
> 
> Diff: https://reviews.apache.org/r/63107/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>



Re: Review Request 63177: Fixed 'SlaveRecoveryTest.RegisterDisconnectedSlave'.

2017-10-20 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63177/#review188839
---



PASS: Mesos patch 63177 was successfully built and tested.

Reviews applied: `['63177']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/63177

- Mesos Reviewbot Windows


On Oct. 20, 2017, 5:47 a.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63177/
> ---
> 
> (Updated Oct. 20, 2017, 5:47 a.m.)
> 
> 
> Review request for mesos and Alexander Rukletsov.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test was broken in 'de11f0ffed36fbabc8fb4167859fd23b75f43f10'
> due to the executor sending one more status update.
> 
> Additionally, an explicit verification of the status update
> acknowledgement was added to eliminate a common source of
> test flakyness.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_recovery_tests.cpp c2d9cc88a25e58887e280bbb7d3a4466525055d5 
> 
> 
> Diff: https://reviews.apache.org/r/63177/diff/1/
> 
> 
> Testing
> ---
> 
> `./mesos-tests --gtest_filter="SlaveRecoveryTest/0.RegisterDisconnectedSlave" 
> --break_on_failure --gtest_repeat=5000`
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Re: Review Request 63107: Added operation feedback for storage operations.

2017-10-20 Thread Jan Schlicht


> On Oct. 19, 2017, 3:08 a.m., Jie Yu wrote:
> > src/master/master.cpp
> > Lines 7088-7089 (patched)
> > 
> >
> > This shouldn't be necessary.

I need to update the allocator with updateSlave here to inform it about the new 
resources. Otherwise these new resources would never appear in an offer. That's 
also the reason why slave->totalResources has to be changed.


> On Oct. 19, 2017, 3:08 a.m., Jie Yu wrote:
> > src/resource_provider/message.hpp
> > Line 36 (original), 37 (patched)
> > 
> >
> > Not yours, this should probably be `UpdateState`

I'll rebase on https://reviews.apache.org/r/62903/, it'll be UPDATE_STATE then.


- Jan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63107/#review188618
---


On Oct. 20, 2017, 2:56 p.m., Jan Schlicht wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63107/
> ---
> 
> (Updated Oct. 20, 2017, 2:56 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Jie Yu.
> 
> 
> Bugs: MESOS-8087
> https://issues.apache.org/jira/browse/MESOS-8087
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When a framework accepts an offer that contains resource provider
> resources with a storage operation (`CREATE_VOLUME`, `DESTROY_VOLUME`,
> `CREATE_BLOCK`, `DESTROY_BLOCK`), the result of this operation cannot
> be guessed and will only be known after the operation has been
> successfully applied by the resource provider.
> This patch introduces the necessary handling for such operations. The
> internal bookkeeping of available resources in the master and agent has
> been updated to update resources only after operation feedback has been
> received. This ensures that converted resources can only be offered
> after the operation was executed by a resource provider.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp a3c9530e340b14b739da90851227f8eed254b4ac 
>   src/master/master.cpp 3603878f02ae3dba82811a4a5770dd21ec790ef6 
>   src/resource_provider/manager.cpp 31fcb789f5ab907511e868c374c49f7457a33ed3 
>   src/resource_provider/message.hpp 3c7c3f2baeb726e04edd6ffbb9784699d7afe521 
>   src/slave/slave.hpp aea1e948209c7c8945665915bc2f6d8eb47814ef 
>   src/slave/slave.cpp d8477b4e364270f5ad61311aa8fe48f823bc7aac 
> 
> 
> Diff: https://reviews.apache.org/r/63107/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>



Re: Review Request 63107: Added operation feedback for storage operations.

2017-10-20 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63107/
---

(Updated Oct. 20, 2017, 2:56 p.m.)


Review request for mesos, Benjamin Bannier and Jie Yu.


Changes
---

Rebased and addresses issues.


Bugs: MESOS-8087
https://issues.apache.org/jira/browse/MESOS-8087


Repository: mesos


Description
---

When a framework accepts an offer that contains resource provider
resources with a storage operation (`CREATE_VOLUME`, `DESTROY_VOLUME`,
`CREATE_BLOCK`, `DESTROY_BLOCK`), the result of this operation cannot
be guessed and will only be known after the operation has been
successfully applied by the resource provider.
This patch introduces the necessary handling for such operations. The
internal bookkeeping of available resources in the master and agent has
been updated to update resources only after operation feedback has been
received. This ensures that converted resources can only be offered
after the operation was executed by a resource provider.


Diffs (updated)
-

  src/master/master.hpp a3c9530e340b14b739da90851227f8eed254b4ac 
  src/master/master.cpp 3603878f02ae3dba82811a4a5770dd21ec790ef6 
  src/resource_provider/manager.cpp 31fcb789f5ab907511e868c374c49f7457a33ed3 
  src/resource_provider/message.hpp 3c7c3f2baeb726e04edd6ffbb9784699d7afe521 
  src/slave/slave.hpp aea1e948209c7c8945665915bc2f6d8eb47814ef 
  src/slave/slave.cpp d8477b4e364270f5ad61311aa8fe48f823bc7aac 


Diff: https://reviews.apache.org/r/63107/diff/2/

Changes: https://reviews.apache.org/r/63107/diff/1-2/


Testing
---

make check


Thanks,

Jan Schlicht



Re: Review Request 61947: Implemented handling of resource provider offer operations.

2017-10-20 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61947/
---

(Updated Oct. 20, 2017, 2:54 p.m.)


Review request for mesos, Benjamin Bannier and Jie Yu.


Changes
---

Rebased, addressed issues: No longer alters `slave->totalResources` by calling 
`apply`.


Bugs: MESOS-7594
https://issues.apache.org/jira/browse/MESOS-7594


Repository: mesos


Description
---

While the resource provider manager is responsible to apply offer
operations by sending events to the respective resource providers,
the master takes care of accepting these operations. Hence, for local
resource providers the master sends an `ApplyResourceOperationMessage`
to the agent where the resource provider is running on. The agent
then relays the operation contained in the message to the resource
provider manager.


Diffs (updated)
-

  src/master/master.cpp 3603878f02ae3dba82811a4a5770dd21ec790ef6 
  src/slave/slave.hpp aea1e948209c7c8945665915bc2f6d8eb47814ef 
  src/slave/slave.cpp d8477b4e364270f5ad61311aa8fe48f823bc7aac 


Diff: https://reviews.apache.org/r/61947/diff/4/

Changes: https://reviews.apache.org/r/61947/diff/3-4/


Testing
---

make check


Thanks,

Jan Schlicht



Re: Review Request 63105: Removed TODOs from storage operation 'apply' handlers.

2017-10-20 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63105/
---

(Updated Oct. 20, 2017, 2:53 p.m.)


Review request for mesos, Benjamin Bannier and Jie Yu.


Changes
---

Changed scope. No longer alters resources.


Summary (updated)
-

Removed TODOs from storage operation 'apply' handlers.


Bugs: MESOS-7594
https://issues.apache.org/jira/browse/MESOS-7594


Repository: mesos


Description (updated)
---

These operations don't alter the offered resources immediately.
Only after operation feedback has been received, resources might
be altered. But this will be handled in a different code path.


Diffs (updated)
-

  src/common/resources.cpp 7ee4dae1389e037531aec533a3d235ee06443ea8 


Diff: https://reviews.apache.org/r/63105/diff/2/

Changes: https://reviews.apache.org/r/63105/diff/1-2/


Testing
---

make check


Thanks,

Jan Schlicht



Re: Review Request 61810: Added a function to apply offer operations.

2017-10-20 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61810/
---

(Updated Oct. 20, 2017, 2:51 p.m.)


Review request for mesos, Benjamin Bannier and Jie Yu.


Changes
---

Rebased, changed logging behavior to include operation id.


Bugs: MESOS-7594
https://issues.apache.org/jira/browse/MESOS-7594


Repository: mesos


Description
---

The resource provider manager provides an 'apply' function for offer
operation affecting resource providers. The resource on which the
operation should be applied contains a resource provider ID. This will
be extracted and an event will be sent to the respective resource
provider.


Diffs (updated)
-

  src/resource_provider/manager.hpp 3b70e75c6b6721864ae0ee9c4a593b5035d8388f 
  src/resource_provider/manager.cpp 31fcb789f5ab907511e868c374c49f7457a33ed3 
  src/tests/resource_provider_manager_tests.cpp 
ca49e1f0203494fc8b4a4507c33e5a3885a14a59 


Diff: https://reviews.apache.org/r/61810/diff/3/

Changes: https://reviews.apache.org/r/61810/diff/2-3/


Testing
---

make check


Thanks,

Jan Schlicht



Re: Review Request 63104: Added a helper to extract resources from storage operations.

2017-10-20 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63104/
---

(Updated Oct. 20, 2017, 2:50 p.m.)


Review request for mesos, Benjamin Bannier and Jie Yu.


Changes
---

Addressed issues and changed dependencies of chain.


Bugs: MESOS-7594
https://issues.apache.org/jira/browse/MESOS-7594


Repository: mesos


Description
---

Added a helper to extract resources from storage operations.


Diffs (updated)
-

  src/common/protobuf_utils.hpp c43ab75b5492320dfe19a7c723a72ac52b8ab722 
  src/common/protobuf_utils.cpp fd4858a64dfc136dd03cb1eef4c97d0f8d43bdae 


Diff: https://reviews.apache.org/r/63104/diff/2/

Changes: https://reviews.apache.org/r/63104/diff/1-2/


Testing
---

make check


Thanks,

Jan Schlicht



Re: Review Request 63094: Added resource version uuid for offer operations.

2017-10-20 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63094/#review188829
---




src/messages/messages.proto
Lines 613 (patched)


See below.



src/messages/messages.proto
Lines 663 (patched)


Even though currently operation on agent resources cannot exhibit operation 
speculation failures, I would still prefer would we include some version uuid 
for the agent resources here as well.

AFAICT this could happen either by adjusting `ResourceVersionUUID` so that 
`resource_provider_id` becomes `optional` or by introducing an explicit proto3 
`map` here, or alternatively, by the agent maintaining its own uuid which 
changes whenever a RP uuid changes and using that uuid instead of a vector here 
(we discussed this offline).



src/messages/messages.proto
Lines 712 (patched)


See comment on `UpdateSlaveMessage`.


- Benjamin Bannier


On Oct. 19, 2017, 11:35 p.m., Jie Yu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63094/
> ---
> 
> (Updated Oct. 19, 2017, 11:35 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, Chun-Hung Hsiao, 
> Gaston Kleiman, Greg Mann, Joseph Wu, Jan Schlicht, and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The resource version UUID is used to establish the relationship
> between the operation and the resources that the operation is
> operating on. Each resource provider will keep a resource version
> UUID, and changes it when it believes that the resources from this
> resource provider are out of sync from the master's view.  The master
> will keep track of the last known resource version UUID for each
> resource provider, and attach the resource version UUID in each
> operation it sends out. The resource provider should reject operations
> that have a different resource version UUID than that it maintains,
> because this means the operation is operating on resources that might
> have already been invalidated.
> 
> 
> Diffs
> -
> 
>   include/mesos/resource_provider/resource_provider.proto 
> f5a9073075327019fd133bd51265f695ef464845 
>   include/mesos/v1/resource_provider/resource_provider.proto 
> e5cbede5b6e57a8641fca1ebfee5454f292cc24c 
>   src/messages/messages.proto 0a32b3457e9143a7d48670610ca3e56dd516136f 
>   src/tests/resource_provider_manager_tests.cpp 
> ca49e1f0203494fc8b4a4507c33e5a3885a14a59 
>   src/tests/resource_provider_validation_tests.cpp 
> f182bff4670318e9de22c2915c5dbb423a74ad6c 
> 
> 
> Diff: https://reviews.apache.org/r/63094/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jie Yu
> 
>



Re: Review Request 63177: Fixed 'SlaveRecoveryTest.RegisterDisconnectedSlave'.

2017-10-20 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63177/#review188828
---



Patch looks great!

Reviews applied: [63177]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Oct. 20, 2017, 2:47 a.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63177/
> ---
> 
> (Updated Oct. 20, 2017, 2:47 a.m.)
> 
> 
> Review request for mesos and Alexander Rukletsov.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test was broken in 'de11f0ffed36fbabc8fb4167859fd23b75f43f10'
> due to the executor sending one more status update.
> 
> Additionally, an explicit verification of the status update
> acknowledgement was added to eliminate a common source of
> test flakyness.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_recovery_tests.cpp c2d9cc88a25e58887e280bbb7d3a4466525055d5 
> 
> 
> Diff: https://reviews.apache.org/r/63177/diff/1/
> 
> 
> Testing
> ---
> 
> `./mesos-tests --gtest_filter="SlaveRecoveryTest/0.RegisterDisconnectedSlave" 
> --break_on_failure --gtest_repeat=5000`
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Re: Review Request 63076: Windows: Fixed off-by-one error in long path support.

2017-10-20 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63076/#review188827
---



PASS: Mesos patch 63076 was successfully built and tested.

Reviews applied: `['63076']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/63076

- Mesos Reviewbot Windows


On Oct. 19, 2017, 11:37 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63076/
> ---
> 
> (Updated Oct. 19, 2017, 11:37 p.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The `CreateDirectoryW` API will fail with a path of 248 characters
> exactly, so this needed to be `>=`.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/internal/windows/longpath.hpp 
> 446b038b0828c25208dd27f9b7f14ea2d32dfbea 
>   3rdparty/stout/tests/os/filesystem_tests.cpp 
> 0982fd5d35ae1088ec7e768e55b143166c923f8a 
> 
> 
> Diff: https://reviews.apache.org/r/63076/diff/2/
> 
> 
> Testing
> ---
> 
> Tested on Windows with a patch of exactly 248 characters.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 63107: Added operation feedback for storage operations.

2017-10-20 Thread Jan Schlicht

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63107/#review188826
---




src/master/master.cpp
Lines 7088-7089 (patched)


I need to update the allocator with `updateSlave` here to inform it about 
the new resources. Otherwise these new resources would never appear in an 
offer. That's also the reason why `slave->totalResources` has to be changed.



src/resource_provider/message.hpp
Line 36 (original), 37 (patched)


I'll rebase on https://reviews.apache.org/r/62903/, it'll be `UPDATE_STATE` 
then.


- Jan Schlicht


On Oct. 18, 2017, 4:39 p.m., Jan Schlicht wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63107/
> ---
> 
> (Updated Oct. 18, 2017, 4:39 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Jie Yu.
> 
> 
> Bugs: MESOS-8087
> https://issues.apache.org/jira/browse/MESOS-8087
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When a framework accepts an offer that contains resource provider
> resources with a storage operation (`CREATE_VOLUME`, `DESTROY_VOLUME`,
> `CREATE_BLOCK`, `DESTROY_BLOCK`), the result of this operation cannot
> be guessed and will only be known after the operation has been
> successfully applied by the resource provider.
> This patch introduces the necessary handling for such operations. The
> internal bookkeeping of available resources in the master and agent has
> been updated to update resources only after operation feedback has been
> received. This ensures that converted resources can only be offered
> after the operation was executed by a resource provider.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp 0ddc98259f64b3921d08f5f4ec81543bb0826378 
>   src/master/master.cpp 3603878f02ae3dba82811a4a5770dd21ec790ef6 
>   src/resource_provider/manager.cpp 31fcb789f5ab907511e868c374c49f7457a33ed3 
>   src/resource_provider/message.hpp 3c7c3f2baeb726e04edd6ffbb9784699d7afe521 
>   src/slave/slave.hpp aea1e948209c7c8945665915bc2f6d8eb47814ef 
>   src/slave/slave.cpp 4d7dc8e9a3901b00103031e24e5d6328d0f2e2ad 
> 
> 
> Diff: https://reviews.apache.org/r/63107/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>



Review Request 63177: Fixed 'SlaveRecoveryTest.RegisterDisconnectedSlave'.

2017-10-20 Thread Benno Evers

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63177/
---

Review request for mesos and Alexander Rukletsov.


Repository: mesos


Description
---

The test was broken in 'de11f0ffed36fbabc8fb4167859fd23b75f43f10'
due to the executor sending one more status update.

Additionally, an explicit verification of the status update
acknowledgement was added to eliminate a common source of
test flakyness.


Diffs
-

  src/tests/slave_recovery_tests.cpp c2d9cc88a25e58887e280bbb7d3a4466525055d5 


Diff: https://reviews.apache.org/r/63177/diff/1/


Testing
---

`./mesos-tests --gtest_filter="SlaveRecoveryTest/0.RegisterDisconnectedSlave" 
--break_on_failure --gtest_repeat=5000`


Thanks,

Benno Evers



Re: Review Request 63175: Do not generate UnavailableResources for inactive frameworks.

2017-10-20 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63175/#review188824
---



PASS: Mesos patch 63175 was successfully built and tested.

Reviews applied: `['63175']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/63175

- Mesos Reviewbot Windows


On Oct. 20, 2017, 12:02 a.m., Jiang Yan Xu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63175/
> ---
> 
> (Updated Oct. 20, 2017, 12:02 a.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Bugs: MESOS-8085
> https://issues.apache.org/jira/browse/MESOS-8085
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Do not generate UnavailableResources for inactive frameworks.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> c2346054b2c98516f15ab8ce2dc798224ff4def4 
>   src/master/allocator/mesos/hierarchical.cpp 
> 5b6efe5faa3c3b10f1f714f582a155b368f8ccaf 
> 
> 
> Diff: https://reviews.apache.org/r/63175/diff/1/
> 
> 
> Testing
> ---
> 
> make check.
> 
> I didn' write a new test as the externally observable behavior doesn't change.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>



Re: Review Request 63175: Do not generate UnavailableResources for inactive frameworks.

2017-10-20 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63175/#review188823
---



Patch looks great!

Reviews applied: [63175]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Oct. 20, 2017, 12:02 a.m., Jiang Yan Xu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63175/
> ---
> 
> (Updated Oct. 20, 2017, 12:02 a.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Bugs: MESOS-8085
> https://issues.apache.org/jira/browse/MESOS-8085
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Do not generate UnavailableResources for inactive frameworks.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.hpp 
> c2346054b2c98516f15ab8ce2dc798224ff4def4 
>   src/master/allocator/mesos/hierarchical.cpp 
> 5b6efe5faa3c3b10f1f714f582a155b368f8ccaf 
> 
> 
> Diff: https://reviews.apache.org/r/63175/diff/1/
> 
> 
> Testing
> ---
> 
> make check.
> 
> I didn' write a new test as the externally observable behavior doesn't change.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>



Re: Review Request 63174: Added a benchmark for agent reregistration during master failover.

2017-10-20 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63174/#review188819
---



PASS: Mesos patch 63174 was successfully built and tested.

Reviews applied: `['63174']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/63174

- Mesos Reviewbot Windows


On Oct. 19, 2017, 11:28 p.m., Jiang Yan Xu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63174/
> ---
> 
> (Updated Oct. 19, 2017, 11:28 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Dmitry Zhuk, and Ilya Pronin.
> 
> 
> Bugs: MESOS-8098
> https://issues.apache.org/jira/browse/MESOS-8098
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The current benchmark is very simple: without framework involvement and 
> without agent retries but it's possible to add a number of others so I am 
> creating a new file for them.
> 
> 
> Diffs
> -
> 
>   src/Makefile.am 936bc49ddfca03b9278ab11b6d317f3ff635cb00 
>   src/tests/CMakeLists.txt 386e0473c93d0a993248c7818067071d0c761c76 
>   src/tests/master_benchmarks.cpp PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/63174/diff/1/
> 
> 
> Testing
> ---
> 
> Benchmark based off 
> https://github.com/apache/mesos/commit/41193181d6b75eeecae2729bf98007d9318e351a
>  (close to current HEAD).
> 
> ```
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
> Reregistered 2000 agents with a total of 50 running tasks and 50 
> completed tasks in 45.075488ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
>  (48126 ms)
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
> Reregistered 2000 agents with a total of 100 running tasks and 0 
> completed tasks in 14.172361ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
>  (45979 ms)
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
> Reregistered 2 agents with a total of 100 running tasks and 0 
> completed tasks in 413.508328ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
>  (49487 ms)
> [--] 3 tests from 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (143596 ms total)
> 
> ...
> 
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
> Reregistered 2000 agents with a total of 50 running tasks and 50 
> completed tasks in 32.787363ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
>  (48266 ms)
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
> Reregistered 2000 agents with a total of 100 running tasks and 0 
> completed tasks in 19.735003ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
>  (46169 ms)
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
> Reregistered 2 agents with a total of 100 running tasks and 0 
> completed tasks in 321.267267ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
>  (51550 ms)
> [--] 3 tests from 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (145987 ms total)
> ```
> 
> Benchmark based off 
> https://github.com/apache/mesos/commit/d9c90bf1d9c8b3a7dcc47be0cb773efff57cfb9d
>  (before https://issues.apache.org/jira/browse/MESOS-7713 was merged)
> ```
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
> Reregistered 2000 agents with a total of 50 running tasks and 50 
> completed tasks in 85.800335ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
>  (59247 ms)
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
> Reregistered 2000 agents with a total of 100 running tasks and 0 
> completed tasks in 35.342066ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
>  (93662 ms)
> [ RUN  ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
> Reregistered 2 agents with a total of 100 running tasks and 0 
> completed tasks in 798.738642ms
> [   OK ] 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
>  (116078 ms)
> [--] 3 tests from 
> AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (268987 ms 

Re: Review Request 63076: Windows: Fixed off-by-one error in long path support.

2017-10-20 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63076/#review188816
---



Patch looks great!

Reviews applied: [63076]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Oct. 19, 2017, 11:37 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63076/
> ---
> 
> (Updated Oct. 19, 2017, 11:37 p.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The `CreateDirectoryW` API will fail with a path of 248 characters
> exactly, so this needed to be `>=`.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/internal/windows/longpath.hpp 
> 446b038b0828c25208dd27f9b7f14ea2d32dfbea 
>   3rdparty/stout/tests/os/filesystem_tests.cpp 
> 0982fd5d35ae1088ec7e768e55b143166c923f8a 
> 
> 
> Diff: https://reviews.apache.org/r/63076/diff/2/
> 
> 
> Testing
> ---
> 
> Tested on Windows with a patch of exactly 248 characters.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>