Re: Review Request 70765: Renamed struct `ProfileRecord` to `ProfileData` for consistency.

2019-06-05 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70765/#review215711
---



Bad review!

Reviews applied: [70765, 70766, 70622, 70621, 70620]

Error:
2019-06-06 05:28:23 URL:https://reviews.apache.org/r/70765/diff/raw/ 
[2910/2910] -> "70765.patch" [1]
error: patch failed: 
src/resource_provider/storage/uri_disk_profile_adaptor.cpp:278
error: src/resource_provider/storage/uri_disk_profile_adaptor.cpp: patch does 
not apply

- Mesos Reviewbot


On May 31, 2019, 3:12 a.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70765/
> ---
> 
> (Updated May 31, 2019, 3:12 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Renamed struct `ProfileRecord` to `ProfileData` for consistency.
> 
> 
> Diffs
> -
> 
>   src/resource_provider/storage/uri_disk_profile_adaptor.hpp 
> a5a34dc4dc1d518dc69aeb15fe62bd124d828ed3 
>   src/resource_provider/storage/uri_disk_profile_adaptor.cpp 
> 215f7f92b5c2a0e60555134ce9887e8a187e3b1d 
> 
> 
> Diff: https://reviews.apache.org/r/70765/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 70766: Fixed chaining futures infinitely in `UriDiskProfileAdaptor`.

2019-06-05 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70766/
---

(Updated June 6, 2019, 4:58 a.m.)


Review request for mesos, Benjamin Bannier and Benjamin Mahler.


Changes
---

Made `watchers` a vector and avoided the use of iterators.


Bugs: MESOS-9803
https://issues.apache.org/jira/browse/MESOS-9803


Repository: mesos


Description
---

Previously it is possible to have an infinite chain of futures when
`UriDiskProfileAdaptor::watch` is called: if the set of profiles remains
fixed for every poll, each poll would satisfy a promise that triggers
an asynchronous recursive call to `UriDiskProfileAdaptor::watch` again.

This patch fixes the problem by removing the asynchronous recursion.
Instead, we maintain a separated promise for each watcher that is never
associated to another promise. After each poll, we check if the current
set of profiles differs from the known set for a watcher, and satisfy
its own promise if so.


Diffs (updated)
-

  src/resource_provider/storage/uri_disk_profile_adaptor.hpp 
a5a34dc4dc1d518dc69aeb15fe62bd124d828ed3 
  src/resource_provider/storage/uri_disk_profile_adaptor.cpp 
215f7f92b5c2a0e60555134ce9887e8a187e3b1d 


Diff: https://reviews.apache.org/r/70766/diff/2/

Changes: https://reviews.apache.org/r/70766/diff/1-2/


Testing
---

make check

Manually tested with the unit test in r/70764. The unit test will fail at the 
5th poll without the fix and will pass with the fix.


Thanks,

Chun-Hung Hsiao



Review Request 70788: Garbage-collected disappeared RPs when agent resources remain unchanged.

2019-06-05 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70788/
---

Review request for mesos and Benjamin Bannier.


Bugs: MESOS-9831
https://issues.apache.org/jira/browse/MESOS-9831


Repository: mesos


Description
---

Previously when there is a missing resource provider in an
`UpdateSlaveMessage` but the agent's total resources remain unchanged,
the update message will be completely ignored, so the missing resource
provider will still be cached in the master's state, which is not the
desired behavior. This patch ensures that the master's state gets
updated if any resource provider is missing.


Diffs
-

  src/master/master.cpp 4d7c37cf8814e45432b3fe15173f5343676a372b 
  src/tests/api_tests.cpp 37d0cb11ddb24b32f1e83feefe168f2aa2eb659c 


Diff: https://reviews.apache.org/r/70788/diff/1/


Testing
---

make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.

2019-06-05 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70783/#review215710
---



Patch looks great!

Reviews applied: [70783]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On June 5, 2019, 4:13 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70783/
> ---
> 
> (Updated June 5, 2019, 4:13 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Joseph Wu, and Vinod 
> Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added debug logging for metrics which are slow to become ready.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/metrics/metrics.hpp 
> 75711edbaf46797e5eb54ba720ea11cf3de81522 
>   3rdparty/libprocess/src/metrics/metrics.cpp 
> 623d44adbe838f995ddbe89ee26f5bcc9c600be5 
>   3rdparty/libprocess/src/tests/metrics_tests.cpp 
> 881275693e67f3c9fb670c7e70cb5014090ed7a5 
> 
> 
> Diff: https://reviews.apache.org/r/70783/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 70741: Adopted container file operations for secrets volumes.

2019-06-05 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70741/#review215708
---



Patch looks great!

Reviews applied: [70741]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On June 5, 2019, 8:44 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70741/
> ---
> 
> (Updated June 5, 2019, 8:44 a.m.)
> 
> 
> Review request for mesos, Xudong Ni, Gilbert Song, Jie Yu, Jacob Janco, and 
> Jiang Yan Xu.
> 
> 
> Bugs: MESOS-9799
> https://issues.apache.org/jira/browse/MESOS-9799
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Switched the `volumes/secrets` isolator from using container
> pre-exec commands to using file operations. This allows us to
> reduce the number of forked child processes, consolidate code
> and leaves the `network/port_mapping` isolator as the last
> consumer of pre-exec commands.
> 
> 
> Diffs
> -
> 
>   include/mesos/slave/containerizer.proto 
> b2e35cbf01caea6c1e4f45c7b7d833bc7f065099 
>   src/common/protobuf_utils.hpp 8aa9aff71cbbd8e98573d0095cfc007cdea163bc 
>   src/common/protobuf_utils.cpp 870859de93b44a1debc6786a562b4bc28955ddab 
>   src/slave/containerizer/mesos/isolators/volume/secret.cpp 
> 7a9bb82b23b35728408fb37bac53d79883c0a19f 
>   src/slave/containerizer/mesos/launch.cpp 
> 0c482f46a97063133edfe29ae3c6a2721d29f6c6 
> 
> 
> Diff: https://reviews.apache.org/r/70741/diff/4/
> 
> 
> Testing
> ---
> 
> sudo make check (Fedora 30)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 70786: Moved `class ResourceQuantities` to public header.

2019-06-05 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70786/#review215707
---



Patch looks great!

Reviews applied: [70786]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On June 5, 2019, 4:17 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70786/
> ---
> 
> (Updated June 5, 2019, 4:17 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Some public facing classes such as `Resources` already depends
> on `ResourceQuantities` and more are coming.
> 
> 
> Diffs
> -
> 
>   include/mesos/allocator/allocator.hpp 
> f1868387b28ea1d303cd3ed0176e32d66debd4c1 
>   include/mesos/resources.hpp 609ca61a0ffa80a7b10e6a3ebb0caa15966c3371 
>   include/mesos/v1/resources.hpp 8afccbe2673cc892667d74ea982d69a3ef122066 
>   src/Makefile.am 5f97523fbe2d80733fbdcc7706f2761f5a071f9f 
>   src/common/resource_quantities.hpp 538256bed96e72b4b61daa3052a111f59e252284 
>   src/common/resource_quantities.cpp f79b6786ccc93fff32447c9f9c3a0801500e7527 
>   src/common/resources.cpp 543820de435eb4869fbb93534fa00c58a9c5dfdd 
>   src/common/resources_utils.hpp 9b78ca231fe24aa88edfc3c14c39563737c5856f 
>   src/common/resources_utils.cpp 58fdb317809dd2b4dbc865e7d3eae5ebcdf5f174 
>   src/master/allocator/mesos/hierarchical.cpp 
> aa4c43882d0befb2100c92f383e87ab637fa4c7e 
>   src/master/allocator/sorter/drf/sorter.hpp 
> 91a9d668b87079158f7072780dc86bb08865166e 
>   src/master/allocator/sorter/random/sorter.hpp 
> 55e22d7705f163fe47d5aa47416ee0714c5a87c0 
>   src/master/master.cpp 4d7c37cf8814e45432b3fe15173f5343676a372b 
>   src/master/quota.cpp bad19a2b8c1f8e4e044b0dc31374cbd34673ef1d 
>   src/master/validation.cpp 65af1a773ed98651daa1aed2ee44aff15f2fe724 
>   src/tests/hierarchical_allocator_benchmarks.cpp 
> 797cf7eab8e31124db3046019684aae20f78f781 
>   src/tests/hierarchical_allocator_tests.cpp 
> 7221a642a8dda1c7b8ed5ebe7f9b840b5e949e88 
>   src/tests/resource_quantities_tests.cpp 
> f018c8a10a169d196a36c441929f477df6ef0df1 
>   src/tests/resources_tests.cpp bbac85180214a8197c42395990484cfdd4ee8a38 
>   src/v1/resources.cpp 7d6113fa8b45aedbd7135d29083983ae7641a9da 
> 
> 
> Diff: https://reviews.apache.org/r/70786/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 70728: Backed `MockResourceProvider` by a process.

2019-06-05 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70728/#review215705
---



s/completelt/completely/ in the description.


src/tests/master_tests.cpp
Line 9017 (original), 9017 (patched)


Not yours, but it seems to me that the dequeueing of the 
`UpdateSlaveMessage` in the master does not causally happen before the 
`subscribedDefault` call (which sets up the resource provider ID).


- Chun-Hung Hsiao


On May 27, 2019, 4:32 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70728/
> ---
> 
> (Updated May 27, 2019, 4:32 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Jan Schlicht.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> We were passing callbacks into `MockResourceProvider` to the HTTP
> driver. Since the lifecycle of the callbacks and the mock provider were
> decoupled and these callbacks were binding the mock provider instance
> the code was not safe as written as the driver could invoke the callback
> after the provider had been destructed.
> 
> This patch makes sure that the callbacks are defered to a process. We
> also dispatch a number of other functions which are strongly coupled to
> the lifecycle of the provider. We still do not hide the provider away
> completelt so the provider can be mocked in tests.
> 
> 
> Diffs
> -
> 
>   src/tests/api_tests.cpp 2220cecc22778a86f0c29317adf495927e1a900d 
>   src/tests/master_slave_reconciliation_tests.cpp 
> 7b6ac50adacc8416b91c0dde55ff7ba46a20515c 
>   src/tests/master_tests.cpp 097f1b77a59e29c6690210773d1556ebf2bb701e 
>   src/tests/mesos.hpp d0c82fa4c1800ed2a1825d563a9462ecd41b77fd 
>   src/tests/operation_reconciliation_tests.cpp 
> eae318da2273faae904f0155e49bb23cdb24f007 
>   src/tests/resource_provider_manager_tests.cpp 
> 7d48f18e89f046df6c92e52edeef592acfb13b10 
>   src/tests/slave_tests.cpp c2035976713abb31b3646c0d23771fa40df93271 
> 
> 
> Diff: https://reviews.apache.org/r/70728/diff/1/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 70738: Allow roles to burst up to quota limits in the allocator.

2019-06-05 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70738/#review215704
---



Patch looks great!

Reviews applied: [70779, 70780, 70738]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers 
--disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; 
./support/docker-build.sh

- Mesos Reviewbot


On June 4, 2019, 9:15 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70738/
> ---
> 
> (Updated June 4, 2019, 9:15 p.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8456
> https://issues.apache.org/jira/browse/MESOS-8456
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch allows roles to burst above their quota guarantees
> up to the quota limits.
> 
> In the first allocation stage which is dedicated to roles
> with non-default guarantees, we allow these roles to burst
> up to their limits for resources that they have guarantees
> set as well as for other resources allocated together
> with the guaranteed resources. These bursts are subject to
> global headroom enforcement.
> 
> In the second allocation stage, we allow all roles to burst
> up to their limits while maintaining the global headroom.
> In this stage, if some unreserved non-revocable scalar resources
> on the agent need to be held back to maintain the headroom,
> we will "chop" them to make the largest offer possible without
> breaking the quota headroom. Previously, if any resources are
> needed on this agent to maintain the headroom, NO unreserved
> scalar resources will be offered at all.
> 
> Also fixed an affected test.
> 
> 
> Diffs
> -
> 
>   src/master/allocator/mesos/hierarchical.cpp 
> aa4c43882d0befb2100c92f383e87ab637fa4c7e 
>   src/tests/hierarchical_allocator_tests.cpp 
> 7221a642a8dda1c7b8ed5ebe7f9b840b5e949e88 
> 
> 
> Diff: https://reviews.apache.org/r/70738/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.

2019-06-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70783/
---

(Updated June 5, 2019, 4:13 p.m.)


Review request for mesos, Benno Evers, Benjamin Mahler, Joseph Wu, and Vinod 
Kone.


Repository: mesos


Description
---

Added debug logging for metrics which are slow to become ready.


Diffs (updated)
-

  3rdparty/libprocess/include/process/metrics/metrics.hpp 
75711edbaf46797e5eb54ba720ea11cf3de81522 
  3rdparty/libprocess/src/metrics/metrics.cpp 
623d44adbe838f995ddbe89ee26f5bcc9c600be5 
  3rdparty/libprocess/src/tests/metrics_tests.cpp 
881275693e67f3c9fb670c7e70cb5014090ed7a5 


Diff: https://reviews.apache.org/r/70783/diff/4/

Changes: https://reviews.apache.org/r/70783/diff/3-4/


Testing
---


Thanks,

Greg Mann



Re: Review Request 70782: Added a non-deterministic test for MESOS-9808.

2019-06-05 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70782/
---

(Updated June 5, 2019, 3:47 p.m.)


Review request for mesos, Benjamin Mahler, Chun-Hung Hsiao, and Greg Mann.


Bugs: MESOS-9808
https://issues.apache.org/jira/browse/MESOS-9808


Repository: mesos


Description
---

Added a non-deterministic test for MESOS-9808.


Diffs
-

  3rdparty/libprocess/src/tests/process_tests.cpp 
05dc5ec2fdc74a989689e4378bef775bcf2b7a87 


Diff: https://reviews.apache.org/r/70782/diff/2/


Testing (updated)
---

Without any of two fixes from https://reviews.apache.org/r/70778/ - deadlocks 
100 out of 100 times on the hardware I used. 
Without the first fix the deadlock is due to the same reason as initially 
observed in in MESOS-9808.


With both fixes applied, one run takes around 400 ms on a release build. No 
deadlock observed in 1000 runs.


Thanks,

Andrei Sekretenko



Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.

2019-06-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70783/#review215701
---




3rdparty/libprocess/src/metrics/metrics.cpp
Lines 212-226 (patched)


Whoops, just realized that I am not discarding these timers. Will update 
the patch again to do so.


- Greg Mann


On June 5, 2019, 3:39 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70783/
> ---
> 
> (Updated June 5, 2019, 3:39 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Joseph Wu, and Vinod 
> Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added debug logging for metrics which are slow to become ready.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/metrics/metrics.hpp 
> 75711edbaf46797e5eb54ba720ea11cf3de81522 
>   3rdparty/libprocess/src/metrics/metrics.cpp 
> 623d44adbe838f995ddbe89ee26f5bcc9c600be5 
>   3rdparty/libprocess/src/tests/metrics_tests.cpp 
> 881275693e67f3c9fb670c7e70cb5014090ed7a5 
> 
> 
> Diff: https://reviews.apache.org/r/70783/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 70782: Added a non-deterministic test for MESOS-9808.

2019-06-05 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70782/
---

(Updated June 5, 2019, 3:42 p.m.)


Review request for mesos, Benjamin Mahler and Chun-Hung Hsiao.


Changes
---

Optimized this test for catching the deletion by the decomissioned EventQueue:
- introduced the gate as suggested (improves deadlock probability) and removed 
the dispatches before terminating process A (when the gate is added, they have 
no effect and only waste time)
- reduced number of dispatches in each iteration to 10
- increased iteration count from 100 to 5000

Also cleaned up the code.


Summary (updated)
-

Added a non-deterministic test for MESOS-9808.


Bugs: MESOS-9808
https://issues.apache.org/jira/browse/MESOS-9808


Repository: mesos


Description (updated)
---

Added a non-deterministic test for MESOS-9808.


Diffs (updated)
-

  3rdparty/libprocess/src/tests/process_tests.cpp 
05dc5ec2fdc74a989689e4378bef775bcf2b7a87 


Diff: https://reviews.apache.org/r/70782/diff/2/

Changes: https://reviews.apache.org/r/70782/diff/1-2/


Testing
---

Without a fix from https://reviews.apache.org/r/70778/ - deadlocks 100 out of 
100 times on the hardware I used, due to the same reason as the code in 
MESOS-9808.

With a fix from https://reviews.apache.org/r/70778/ - one run takes around 2 
seconds. Sometimes still deadlocks with similar stacks.


Thanks,

Andrei Sekretenko



Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.

2019-06-05 Thread Greg Mann


> On June 5, 2019, 1:05 a.m., Benjamin Mahler wrote:
> > 3rdparty/libprocess/src/metrics/metrics.cpp
> > Lines 199-207 (patched)
> > 
> >
> > If we consolidate the logging between user timeout and slow timeout can 
> > we have one path?
> > 
> > ```
> >   Future waited =  > timeout>;
> > 
> >   return waited
> > .then(defer(self(), ::_snapshotHandler, ...));
> > }
> >   
> > _snapshotHandler(...)
> > {
> >   // LOG(INFO) not ready futures after timeout (user or built-in SLOW 
> > one)
> >   ...
> > 
> >   Future waited = ;
> > 
> >   return waited
> > .then(defer(self(), ::__snapshotHandler, ...));
> > }
> >   
> > __snapshotHandler(...)
> > {
> >   ...
> >   return response;
> > }
> > ```
> > 
> > We would log those not ready after either the user provided one, or the 
> > built-in slow one, whichever is smaller (I guess we can use a min() for 
> > that too if we want). The downside here seems to be that if a user 
> > specifies a small one and still lots of pull gauges are used, will lead to 
> > a lot of INFO logging. Not sure how much we should worry about this case.
> > 
> > We should probably name the endpoint handler functions and snapshot map 
> > function differently or house the handler ones in an 'Http' sub-struct, a 
> > bit confusing right now.

I had considered a similar option previously but ended up opting for a more 
minimal diff. I wrote a new solution along these lines and although the patch 
is now much more complex, I think the end result is cleaner. The complexity 
largely arises from the fact that if the user specifies a timeout longer than 
the hard-coded timeout, then we can potentially execute the logging code twice. 
Let me know what you think.


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70783/#review215693
---


On June 5, 2019, 3:39 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70783/
> ---
> 
> (Updated June 5, 2019, 3:39 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Joseph Wu, and Vinod 
> Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added debug logging for metrics which are slow to become ready.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/metrics/metrics.hpp 
> 75711edbaf46797e5eb54ba720ea11cf3de81522 
>   3rdparty/libprocess/src/metrics/metrics.cpp 
> 623d44adbe838f995ddbe89ee26f5bcc9c600be5 
>   3rdparty/libprocess/src/tests/metrics_tests.cpp 
> 881275693e67f3c9fb670c7e70cb5014090ed7a5 
> 
> 
> Diff: https://reviews.apache.org/r/70783/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.

2019-06-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70783/
---

(Updated June 5, 2019, 3:39 p.m.)


Review request for mesos, Benno Evers, Benjamin Mahler, Joseph Wu, and Vinod 
Kone.


Repository: mesos


Description
---

Added debug logging for metrics which are slow to become ready.


Diffs (updated)
-

  3rdparty/libprocess/include/process/metrics/metrics.hpp 
75711edbaf46797e5eb54ba720ea11cf3de81522 
  3rdparty/libprocess/src/metrics/metrics.cpp 
623d44adbe838f995ddbe89ee26f5bcc9c600be5 
  3rdparty/libprocess/src/tests/metrics_tests.cpp 
881275693e67f3c9fb670c7e70cb5014090ed7a5 


Diff: https://reviews.apache.org/r/70783/diff/3/

Changes: https://reviews.apache.org/r/70783/diff/2-3/


Testing
---


Thanks,

Greg Mann



Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.

2019-06-05 Thread Greg Mann


> On June 4, 2019, 8:42 p.m., Joseph Wu wrote:
> > I think it is a little unfortunate that we end up with a conditional 
> > `select()` and an extra chaining step in this endpoint.  But since we 
> > aren't saving the vector of keys/futures/statistics anywhere, the chaining 
> > is mostly unavoidable.
> > 
> > Since this adds some significant logic, I would like there to be at least 
> > an addition to the tests.  We can't read the resulting log line, but we can 
> > at least exercise the code.  For example, copying this chunk of existing 
> > test code and increasing the in-test timeout to 10+ seconds:
> > 
> > ```
> > TEST_F(MetricsTest, THREADSAFE_SnapshotTimeout)
> > {
> >   ...
> >   
> >   // Get the snapshot.
> >   Future response = http::get(upid, "snapshot", "timeout=2secs"); 
> > // <- Increase this timeout.
> > 
> >   // Make sure the request is pending before the timeout is exceeded.
> >   //
> >   // TODO(neilc): Replace the `sleep` here with a less flaky
> >   // synchronization method.
> >   os::sleep(Milliseconds(10));
> >   Clock::settle();
> > 
> >   ASSERT_TRUE(response.isPending());
> > 
> >   // Advance the clock to trigger the timeout.
> >   Clock::advance(Seconds(2)); // <- And this 
> > timeout.
> > 
> >   AWAIT_EXPECT_RESPONSE_STATUS_EQ(OK().status, response);
> > 
> >   // Parse the response.
> >   Try responseJSON = 
> > JSON::parse(response->body);
> >   ASSERT_SOME(responseJSON);
> > 
> >   // We can't use simple JSON equality testing here as initializing
> >   // libprocess adds metrics to the system. We want to only check if
> >   // the metrics from this test are correctly handled.
> >   map values = responseJSON->values;
> > 
> >   EXPECT_EQ(1u, values.count("test/counter"));
> >   EXPECT_DOUBLE_EQ(0.0, 
> > values["test/counter"].as().as());
> > 
> >   EXPECT_EQ(1u, values.count("test/gauge"));
> >   EXPECT_DOUBLE_EQ(42.0, 
> > values["test/gauge"].as().as());
> > 
> >   EXPECT_EQ(0u, values.count("test/gauge_fail"));
> >   EXPECT_EQ(0u, values.count("test/gauge_timeout"));
> > ```

Good call, I updated the test.


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70783/#review215689
---


On June 5, 2019, 3:39 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70783/
> ---
> 
> (Updated June 5, 2019, 3:39 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Benjamin Mahler, Joseph Wu, and Vinod 
> Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added debug logging for metrics which are slow to become ready.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/include/process/metrics/metrics.hpp 
> 75711edbaf46797e5eb54ba720ea11cf3de81522 
>   3rdparty/libprocess/src/metrics/metrics.cpp 
> 623d44adbe838f995ddbe89ee26f5bcc9c600be5 
>   3rdparty/libprocess/src/tests/metrics_tests.cpp 
> 881275693e67f3c9fb670c7e70cb5014090ed7a5 
> 
> 
> Diff: https://reviews.apache.org/r/70783/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 70783: Added debug logging for metrics which are slow to become ready.

2019-06-05 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70783/
---

(Updated June 5, 2019, 3:37 p.m.)


Review request for mesos, Benno Evers, Benjamin Mahler, Joseph Wu, and Vinod 
Kone.


Repository: mesos


Description (updated)
---

Added debug logging for metrics which are slow to become ready.


Diffs (updated)
-

  3rdparty/libprocess/include/process/metrics/metrics.hpp 
75711edbaf46797e5eb54ba720ea11cf3de81522 
  3rdparty/libprocess/src/metrics/metrics.cpp 
623d44adbe838f995ddbe89ee26f5bcc9c600be5 
  3rdparty/libprocess/src/tests/metrics_tests.cpp 
881275693e67f3c9fb670c7e70cb5014090ed7a5 


Diff: https://reviews.apache.org/r/70783/diff/2/

Changes: https://reviews.apache.org/r/70783/diff/1-2/


Testing
---


Thanks,

Greg Mann



Re: Review Request 70741: Adopted container file operations for secrets volumes.

2019-06-05 Thread James Peach


> On June 5, 2019, 7:16 a.m., Jiang Yan Xu wrote:
> > src/slave/containerizer/mesos/isolators/volume/secret.cpp
> > Line 138 (original), 147 (patched)
> > 
> >
> > Do you know why `launchInfo.mounts` was not used (and for the same 
> > reason it couldn't be used here)?

You need to sequence the mount with other operations, so you can't use the 
container mounts. We can revisit coalescing these in the future, but for now I 
opted for compatibility and simplicity.


> On June 5, 2019, 7:16 a.m., Jiang Yan Xu wrote:
> > src/slave/containerizer/mesos/launch.cpp
> > Lines 516 (patched)
> > 
> >
> > Is it more explicit if you just name the operation 
> > `ContainerFileOperation::MV` or `ContainerFileOperation::MOVE` and 
> > implement it as such (rename + fallback) either here or in stout since you 
> > are essentially implementing how we use a `mv` command?

`rename` vs. `move` seems pretty similar. To me, `rename` seemed clear so I 
prefer it. I didn't add a stout implementation because I didn't have the time 
to handle all the corner cases and write proper tests for a general utility.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70741/#review215696
---


On May 29, 2019, 4:29 a.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70741/
> ---
> 
> (Updated May 29, 2019, 4:29 a.m.)
> 
> 
> Review request for mesos, Xudong Ni, Gilbert Song, Jie Yu, Jacob Janco, and 
> Jiang Yan Xu.
> 
> 
> Bugs: MESOS-9799
> https://issues.apache.org/jira/browse/MESOS-9799
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Switched the `volumes/secrets` isolator from using container
> pre-exec commands to using file operations. This allows us to
> reduce the number of forked child processes, consolidate code
> and leaves the `network/port_mapping` isolator as the last
> consumer of pre-exec commands.
> 
> 
> Diffs
> -
> 
>   include/mesos/slave/containerizer.proto 
> b2e35cbf01caea6c1e4f45c7b7d833bc7f065099 
>   src/common/protobuf_utils.hpp 8aa9aff71cbbd8e98573d0095cfc007cdea163bc 
>   src/common/protobuf_utils.cpp 870859de93b44a1debc6786a562b4bc28955ddab 
>   src/slave/containerizer/mesos/isolators/volume/secret.cpp 
> 7a9bb82b23b35728408fb37bac53d79883c0a19f 
>   src/slave/containerizer/mesos/launch.cpp 
> 0c482f46a97063133edfe29ae3c6a2721d29f6c6 
> 
> 
> Diff: https://reviews.apache.org/r/70741/diff/2/
> 
> 
> Testing
> ---
> 
> sudo make check (Fedora 30)
> 
> 
> Thanks,
> 
> James Peach
> 
>



Re: Review Request 70741: Adopted container file operations for secrets volumes.

2019-06-05 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70741/
---

(Updated June 5, 2019, 8:44 a.m.)


Review request for mesos, Xudong Ni, Gilbert Song, Jie Yu, Jacob Janco, and 
Jiang Yan Xu.


Bugs: MESOS-9799
https://issues.apache.org/jira/browse/MESOS-9799


Repository: mesos


Description
---

Switched the `volumes/secrets` isolator from using container
pre-exec commands to using file operations. This allows us to
reduce the number of forked child processes, consolidate code
and leaves the `network/port_mapping` isolator as the last
consumer of pre-exec commands.


Diffs (updated)
-

  include/mesos/slave/containerizer.proto 
b2e35cbf01caea6c1e4f45c7b7d833bc7f065099 
  src/common/protobuf_utils.hpp 8aa9aff71cbbd8e98573d0095cfc007cdea163bc 
  src/common/protobuf_utils.cpp 870859de93b44a1debc6786a562b4bc28955ddab 
  src/slave/containerizer/mesos/isolators/volume/secret.cpp 
7a9bb82b23b35728408fb37bac53d79883c0a19f 
  src/slave/containerizer/mesos/launch.cpp 
0c482f46a97063133edfe29ae3c6a2721d29f6c6 


Diff: https://reviews.apache.org/r/70741/diff/3/

Changes: https://reviews.apache.org/r/70741/diff/2-3/


Testing
---

sudo make check (Fedora 30)


Thanks,

James Peach



Re: Review Request 70741: Adopted container file operations for secrets volumes.

2019-06-05 Thread Jiang Yan Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70741/#review215696
---




src/common/protobuf_utils.hpp
Lines 100 (patched)


Why this? The paragraphs above are still about `validateProtobufUnion` 
right?



src/common/protobuf_utils.hpp
Lines 436 (patched)


One empty line between the last element and the ending `}`



src/slave/containerizer/mesos/isolators/volume/secret.cpp
Line 138 (original), 147 (patched)


Do you know why `launchInfo.mounts` was not used (and for the same reason 
it couldn't be used here)?



src/slave/containerizer/mesos/launch.cpp
Lines 516 (patched)


Is it more explicit if you just name the operation 
`ContainerFileOperation::MV` or `ContainerFileOperation::MOVE` and implement it 
as such (rename + fallback) either here or in stout since you are essentially 
implementing how we use a `mv` command?


- Jiang Yan Xu


On May 28, 2019, 9:29 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70741/
> ---
> 
> (Updated May 28, 2019, 9:29 p.m.)
> 
> 
> Review request for mesos, Xudong Ni, Gilbert Song, Jie Yu, Jacob Janco, and 
> Jiang Yan Xu.
> 
> 
> Bugs: MESOS-9799
> https://issues.apache.org/jira/browse/MESOS-9799
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Switched the `volumes/secrets` isolator from using container
> pre-exec commands to using file operations. This allows us to
> reduce the number of forked child processes, consolidate code
> and leaves the `network/port_mapping` isolator as the last
> consumer of pre-exec commands.
> 
> 
> Diffs
> -
> 
>   include/mesos/slave/containerizer.proto 
> b2e35cbf01caea6c1e4f45c7b7d833bc7f065099 
>   src/common/protobuf_utils.hpp 8aa9aff71cbbd8e98573d0095cfc007cdea163bc 
>   src/common/protobuf_utils.cpp 870859de93b44a1debc6786a562b4bc28955ddab 
>   src/slave/containerizer/mesos/isolators/volume/secret.cpp 
> 7a9bb82b23b35728408fb37bac53d79883c0a19f 
>   src/slave/containerizer/mesos/launch.cpp 
> 0c482f46a97063133edfe29ae3c6a2721d29f6c6 
> 
> 
> Diff: https://reviews.apache.org/r/70741/diff/2/
> 
> 
> Testing
> ---
> 
> sudo make check (Fedora 30)
> 
> 
> Thanks,
> 
> James Peach
> 
>