Re: Review Request 69211: Improved the code comments for `getContainerDevicesPath`.

2018-12-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69211/#review211015
---


Ship it!




Ship It!

- Jie Yu


On Dec. 4, 2018, 1:19 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69211/
> ---
> 
> (Updated Dec. 4, 2018, 1:19 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jason Lai, Jie Yu, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-9319
> https://issues.apache.org/jira/browse/MESOS-9319
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Improved the code comments for `getContainerDevicesPath`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/paths.hpp 
> de3981db7eb08e53901547037c947f594c8d46ab 
> 
> 
> Diff: https://reviews.apache.org/r/69211/diff/10/
> 
> 
> Testing
> ---
> 
> make check (Fedora 28)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 69450: Applied the `ContainerMountInfo` protobuf helper.

2018-12-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69450/#review211014
---


Ship it!




Ship It!

- Jie Yu


On Nov. 27, 2018, 12:49 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69450/
> ---
> 
> (Updated Nov. 27, 2018, 12:49 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jason Lai, Jie Yu, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-9319
> https://issues.apache.org/jira/browse/MESOS-9319
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Now that there is a protobuf helper to manufacture `ContainerMountInfo`
> messages, apply it where appropriate.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 507665cdd3c08f6716840f2b10f2b9659e7fcf1a 
>   src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp 
> d1938836997b177ec89c1e8d671dc6e0379aa061 
>   src/slave/containerizer/mesos/isolators/filesystem/shared.cpp 
> 5b481b5e8e66104b8b7af6fa97dcb411fb0a1f5e 
>   src/slave/containerizer/mesos/isolators/namespaces/pid.cpp 
> 73e09636808081ce411c022bf2bcb2a157a3ad25 
>   src/slave/containerizer/mesos/isolators/volume/host_path.cpp 
> 88ecf91d91e2bebd484a4ac94510a14b3500dbfb 
>   src/slave/containerizer/mesos/isolators/volume/image.cpp 
> bb3fc659f1322d26fe3d035c961d3942c73353f0 
>   src/slave/containerizer/mesos/isolators/volume/sandbox_path.cpp 
> 300b3d95d74b73fbe0221096f3f3f172be745081 
> 
> 
> Diff: https://reviews.apache.org/r/69450/diff/2/
> 
> 
> Testing
> ---
> 
> sudo make check (Fedora 29)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 69210: Used the MS_SILENT mount flag to elide unwanted logging.

2018-12-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69210/#review211013
---




src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
Line 189 (original), 189 (patched)


This is a bit weird to me. I'd prefer just add a new optional field in 
ContainerMountInfo


- Jie Yu


On Nov. 17, 2018, 12:49 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69210/
> ---
> 
> (Updated Nov. 17, 2018, 12:49 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jason Lai, Jie Yu, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-9319
> https://issues.apache.org/jira/browse/MESOS-9319
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> After moving the container root filesystem mounts to the
> `filesystem/linux` isolator, these mounts are not logged into the task
> sandbox by the container launcher. Since these logs are not generally
> useful to tasks, and we did't previously emit them, use the `MS_SILENT`
> flag to indicate that they don't need to be logged. Since the kernel
> doesn't use `MS_SILENT` for anything, we can safely use it internally.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/isolators/filesystem/linux.cpp 
> c7d753ac2e5575a8d687600bfb9e0617fa72c990 
>   src/slave/containerizer/mesos/launch.cpp 
> 882bcdf89e2b0cca3d3f62e6d017849a51ceaead 
> 
> 
> Diff: https://reviews.apache.org/r/69210/diff/9/
> 
> 
> Testing
> ---
> 
> make check (Fedora 28)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 69086: Moved the container root construction to the isolators.

2018-12-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69086/#review211012
---


Ship it!




Ship It!

- Jie Yu


On Nov. 27, 2018, 12:49 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69086/
> ---
> 
> (Updated Nov. 27, 2018, 12:49 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jason Lai, Jie Yu, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-9319
> https://issues.apache.org/jira/browse/MESOS-9319
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Previously, if the container was configured with a root filesytem,
> the root was populated by a combination of the `fs::chroot:prepare`
> API and the various isolators. The implementation details of some
> isolators had leaked into the chroot code, which had a special case
> for adding GPU devices.
> 
> This change moves all the responsibility for defining the
> root filesystem from the `fs::chroot::prepare()` API to the
> `filesystem/linux` isolator. The `filesystem/linux` isolator is
> now the single place that captures how to mount the container
> pseudo-filesystems as well as how to construct a proper `/dev`
> directory.
> 
> Since the `linux/filesystem` isolator is now entirely responsible
> for creating and mounting the container `/dev`, any other isolators
> that enable access to devices should populate device nodes in the
> container devices directory and add a corresponding bind mount.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 1662125ed3e47b179ee32d08c1d3af75553066ba 
>   src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
>   src/linux/fs.hpp 31969f6ba82bf5ee549bfdf9698a21adaa486a90 
>   src/linux/fs.cpp 5cdffe1f4c7f00aee5b8f640e7cfa4a0018cfa0a 
>   src/slave/containerizer/mesos/isolators/filesystem/linux.cpp 
> c7d753ac2e5575a8d687600bfb9e0617fa72c990 
>   src/slave/containerizer/mesos/isolators/gpu/isolator.hpp 
> 4645c625877d9451516133b24bd3959e0f49c0a9 
>   src/slave/containerizer/mesos/isolators/gpu/isolator.cpp 
> 56d835779618fd965d928c6926664583e9141f79 
>   src/slave/containerizer/mesos/isolators/linux/devices.cpp 
> 8f8ff95ec3856ba06647637a80315365d0e66e23 
>   src/slave/containerizer/mesos/launch.cpp 
> 882bcdf89e2b0cca3d3f62e6d017849a51ceaead 
> 
> 
> Diff: https://reviews.apache.org/r/69086/diff/13/
> 
> 
> Testing
> ---
> 
> sudo make check (Fedora 28)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 69163: Set agent and/or resource provider ID in operation status updates.

2018-12-03 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69163/#review211010
---




src/tests/master_tests.cpp
Lines 9276 (patched)


Will we hit MESOS-6033 here?

Also, will `disconnected` be called during teardown? If yes maybe we want 
to expect it to be called at most once?


- Chun-Hung Hsiao


On Dec. 3, 2018, 1:20 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69163/
> ---
> 
> (Updated Dec. 3, 2018, 1:20 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch sets agent and/or resource provider ID operation status
> update messages. This is not always possible, e.g., some operations
> might fail validation so that no corresponding IDs can be extracted.
> 
> Since operations failing validation are currently directly rejected by
> the master without going through a status update manager, they are not
> retried either. If a master status update manager for operations is
> introduced at a later point it should be possible to forward
> acknowledgements for updates to the master's update manager (not agent
> ID, not resource provider ID).
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 1662125ed3e47b179ee32d08c1d3af75553066ba 
>   src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
>   src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
>   src/resource_provider/manager.cpp 6c81c430e9e1205d71982a7fa2bcd9aa15fc01b2 
>   src/resource_provider/storage/provider.cpp 
> a22c82c442304979fbdec0fcb74543077751a135 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
>   src/tests/master_tests.cpp ef2c00101fc3d30c564a9ca34884dece2cdd2651 
>   src/tests/mesos.hpp c08e7e6c1dbc3dd9eb980868e43368c0a423c3ee 
> 
> 
> Diff: https://reviews.apache.org/r/69163/diff/6/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69211: Improved the code comments for `getContainerDevicesPath`.

2018-12-03 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69211/#review211008
---



PASS: Mesos patch 69211 was successfully built and tested.

Reviews applied: `['69086', '69210', '69450', '69211']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2667/mesos-review-69211

- Mesos Reviewbot Windows


On Dec. 4, 2018, 9:19 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69211/
> ---
> 
> (Updated Dec. 4, 2018, 9:19 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jason Lai, Jie Yu, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-9319
> https://issues.apache.org/jira/browse/MESOS-9319
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Improved the code comments for `getContainerDevicesPath`.
> 
> 
> Diffs
> -
> 
>   src/slave/containerizer/mesos/paths.hpp 
> de3981db7eb08e53901547037c947f594c8d46ab 
> 
> 
> Diff: https://reviews.apache.org/r/69211/diff/10/
> 
> 
> Testing
> ---
> 
> make check (Fedora 28)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 69211: Improved the code comments for `getContainerDevicesPath`.

2018-12-03 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69211/
---

(Updated Dec. 4, 2018, 1:19 a.m.)


Review request for mesos, Gilbert Song, Jason Lai, Jie Yu, and Jiang Yan Xu.


Bugs: MESOS-9319
https://issues.apache.org/jira/browse/MESOS-9319


Repository: mesos


Description
---

Improved the code comments for `getContainerDevicesPath`.


Diffs (updated)
-

  src/slave/containerizer/mesos/paths.hpp 
de3981db7eb08e53901547037c947f594c8d46ab 


Diff: https://reviews.apache.org/r/69211/diff/10/

Changes: https://reviews.apache.org/r/69211/diff/9-10/


Testing
---

make check (Fedora 28)


Thanks,

James Peach



Re: Review Request 69163: Set agent and/or resource provider ID in operation status updates.

2018-12-03 Thread Greg Mann


> On Dec. 1, 2018, 3:16 a.m., Greg Mann wrote:
> > src/tests/master_tests.cpp
> > Lines 9226 (patched)
> > 
> >
> > Could you run this test in repetition to ensure that we're not 
> > introducing any flakiness?
> 
> Benjamin Bannier wrote:
> I already did run this test in repetition and under stress during 
> development. Are you hinting that you saw a failure, or see a problem with 
> the test setup?

Nope I hadn't noticed anything - since only `make check` was listed in the 
"Testing Done" field, I assumed that it hadn't been run in repetition. As long 
as you've done that, I'm satisfied! Thanks :)


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69163/#review210993
---


On Dec. 3, 2018, 1:20 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69163/
> ---
> 
> (Updated Dec. 3, 2018, 1:20 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch sets agent and/or resource provider ID operation status
> update messages. This is not always possible, e.g., some operations
> might fail validation so that no corresponding IDs can be extracted.
> 
> Since operations failing validation are currently directly rejected by
> the master without going through a status update manager, they are not
> retried either. If a master status update manager for operations is
> introduced at a later point it should be possible to forward
> acknowledgements for updates to the master's update manager (not agent
> ID, not resource provider ID).
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 1662125ed3e47b179ee32d08c1d3af75553066ba 
>   src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
>   src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
>   src/resource_provider/manager.cpp 6c81c430e9e1205d71982a7fa2bcd9aa15fc01b2 
>   src/resource_provider/storage/provider.cpp 
> a22c82c442304979fbdec0fcb74543077751a135 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
>   src/tests/master_tests.cpp ef2c00101fc3d30c564a9ca34884dece2cdd2651 
>   src/tests/mesos.hpp c08e7e6c1dbc3dd9eb980868e43368c0a423c3ee 
> 
> 
> Diff: https://reviews.apache.org/r/69163/diff/6/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69398: Added validation for `FrameworkID`s.

2018-12-03 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69398/#review211004
---



Seems a bit clearer to have a separate ticket for validating framework IDs and 
say on the crash ticket that we fix it via the validation ticket?

Probably we need to talk about the fact that mesos is the one that generates 
the frameowrk IDs (and document the format we currently use in the protobuf), 
and mention that the behavior of framework generating them is not supported 
(but this has been an abuse used by a few frameworks at certain points in 
time). In the case of this ticket, do you know how the invalid one showed up?

Also, note that we write framework Id as a part of file system paths, so need 
to deal with same validation issues that came up for windows recently with 
executor IDs.

- Benjamin Mahler


On Nov. 19, 2018, 3:52 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69398/
> ---
> 
> (Updated Nov. 19, 2018, 3:52 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-8470
> https://issues.apache.org/jira/browse/MESOS-8470
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> We never explicitly supported `FrameworkID`s containing literal `/`.
> Moreover, since the introduction of hierarchical roles such
> `FrameworkID`s would cause fatal errors.
> 
> This patch adds validation for `FrameworkID`s so such IDs are
> rejected.
> 
> 
> Diffs
> -
> 
>   src/master/validation.hpp 9af903970795a5c8c479d1984a580e41d91f6c91 
>   src/master/validation.cpp 5768ac8fe802f28855fbd7be135c75532771 
>   src/tests/master_validation_tests.cpp 
> aa7c8f70c09459be32c6c415497e95fcdc218efd 
> 
> 
> Diff: https://reviews.apache.org/r/69398/diff/1/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69163: Set agent and/or resource provider ID in operation status updates.

2018-12-03 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69163/
---

(Updated Dec. 3, 2018, 2:20 p.m.)


Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.


Changes
---

Addressed issues raised by Greg.


Bugs: MESOS-9293
https://issues.apache.org/jira/browse/MESOS-9293


Repository: mesos


Description
---

This patch sets agent and/or resource provider ID operation status
update messages. This is not always possible, e.g., some operations
might fail validation so that no corresponding IDs can be extracted.

Since operations failing validation are currently directly rejected by
the master without going through a status update manager, they are not
retried either. If a master status update manager for operations is
introduced at a later point it should be possible to forward
acknowledgements for updates to the master's update manager (not agent
ID, not resource provider ID).


Diffs (updated)
-

  src/common/protobuf_utils.hpp 1662125ed3e47b179ee32d08c1d3af75553066ba 
  src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
  src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
  src/resource_provider/manager.cpp 6c81c430e9e1205d71982a7fa2bcd9aa15fc01b2 
  src/resource_provider/storage/provider.cpp 
a22c82c442304979fbdec0fcb74543077751a135 
  src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
  src/tests/master_tests.cpp ef2c00101fc3d30c564a9ca34884dece2cdd2651 
  src/tests/mesos.hpp c08e7e6c1dbc3dd9eb980868e43368c0a423c3ee 


Diff: https://reviews.apache.org/r/69163/diff/6/

Changes: https://reviews.apache.org/r/69163/diff/5-6/


Testing
---

`make check`


Thanks,

Benjamin Bannier



Re: Review Request 69163: Set agent and/or resource provider ID in operation status updates.

2018-12-03 Thread Benjamin Bannier


> On Dec. 1, 2018, 4:16 a.m., Greg Mann wrote:
> > src/resource_provider/manager.cpp
> > Lines 579 (patched)
> > 
> >
> > Did you intend to do this in the current patch or make this a TODO?

No, this was intended to be a `TODO`.


> On Dec. 1, 2018, 4:16 a.m., Greg Mann wrote:
> > src/tests/master_tests.cpp
> > Lines 9226 (patched)
> > 
> >
> > Could you run this test in repetition to ensure that we're not 
> > introducing any flakiness?

I already did run this test in repetition and under stress during development. 
Are you hinting that you saw a failure, or see a problem with the test setup?


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69163/#review210993
---


On Dec. 3, 2018, 2:20 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69163/
> ---
> 
> (Updated Dec. 3, 2018, 2:20 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch sets agent and/or resource provider ID operation status
> update messages. This is not always possible, e.g., some operations
> might fail validation so that no corresponding IDs can be extracted.
> 
> Since operations failing validation are currently directly rejected by
> the master without going through a status update manager, they are not
> retried either. If a master status update manager for operations is
> introduced at a later point it should be possible to forward
> acknowledgements for updates to the master's update manager (not agent
> ID, not resource provider ID).
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 1662125ed3e47b179ee32d08c1d3af75553066ba 
>   src/common/protobuf_utils.cpp a45607eed4c4bae5010bcc3f3ffeabd6d911062a 
>   src/master/master.cpp 3b3824a67f46866cd64e32d7f9f92484b5891aa2 
>   src/resource_provider/manager.cpp 6c81c430e9e1205d71982a7fa2bcd9aa15fc01b2 
>   src/resource_provider/storage/provider.cpp 
> a22c82c442304979fbdec0fcb74543077751a135 
>   src/slave/slave.cpp 858b78620e1ef33f3587d0bd95a684917aaf5bbb 
>   src/tests/master_tests.cpp ef2c00101fc3d30c564a9ca34884dece2cdd2651 
>   src/tests/mesos.hpp c08e7e6c1dbc3dd9eb980868e43368c0a423c3ee 
> 
> 
> Diff: https://reviews.apache.org/r/69163/diff/6/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>