Re: Review Request 65962: Made the default executor use references to `Owned` pointers.

2018-03-07 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65962/#review198861
---




src/launcher/default_executor.cpp
Line 729 (original), 729 (patched)


We don't make use of `Owned` semantics here, and should just use

Container* container = containers.at(taskId).get();


- Benjamin Bannier


On March 8, 2018, 3:07 a.m., Gaston Kleiman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65962/
> ---
> 
> (Updated March 8, 2018, 3:07 a.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> `Owned` pointers are copied in multiple places of the default executor.
> This violates the semantic of owned pointers and works only because
> `Owned` is currently implemented with `shared_ptr`, it'd otherwise lead
> to double-freeing the pointers.
> 
> This patch changes those places to use references to the original
> `Owned` objects instead of copies.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 906836f3b8e0af79d7c61f90fd8a95f193b26e84 
> 
> 
> Diff: https://reviews.apache.org/r/65962/diff/1/
> 
> 
> Testing
> ---
> 
> `sudo bin/mesos-tests.sh --gtest_filter="*Default*"` on GNU/Linux
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>



Re: Review Request 65692: Changed the signature of a DefaultExecutor kill method.

2018-03-07 Thread Benjamin Bannier


> On March 8, 2018, 12:42 a.m., Joseph Wu wrote:
> > src/launcher/default_executor.cpp
> > Line 986 (original), 986 (patched)
> > 
> >
> > Here's another copy that should be made into a reference.
> 
> Gaston Kleiman wrote:
> This happens in many other places as well. My initial plan was to create 
> a tech debt issue to audit the copies of `Owned` pointers, but I instead 
> fixed this file here: https://reviews.apache.org/r/65962/

https://issues.apache.org/jira/browse/MESOS-5122


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65692/#review198667
---


On Feb. 17, 2018, 1:24 a.m., Gaston Kleiman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65692/
> ---
> 
> (Updated Feb. 17, 2018, 1:24 a.m.)
> 
> 
> Review request for mesos, Joseph Wu, Qian Zhang, and Vinod Kone.
> 
> 
> Bugs: MESOS-8530
> https://issues.apache.org/jira/browse/MESOS-8530
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This method is being passed the container to be killed as
> `Owned`.
> 
> The container object is owned by the executor and stored in
> `LinkedHashMap containers`.
> 
> We should pass around raw pointers and not make copies of the stored
> `Owned` object, so this patch changes the signature of the `kill`
> method.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 906836f3b8e0af79d7c61f90fd8a95f193b26e84 
> 
> 
> Diff: https://reviews.apache.org/r/65692/diff/2/
> 
> 
> Testing
> ---
> 
> `sudo bin/mesos-tests.sh` on GNU/Linux
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>



Re: Review Request 65962: Made the default executor use references to `Owned` pointers.

2018-03-07 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65962/#review198860
---



Patch looks great!

Reviews applied: [65692, 65693, 65694, 65695, 65962]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On March 8, 2018, 2:07 a.m., Gaston Kleiman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65962/
> ---
> 
> (Updated March 8, 2018, 2:07 a.m.)
> 
> 
> Review request for mesos and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> `Owned` pointers are copied in multiple places of the default executor.
> This violates the semantic of owned pointers and works only because
> `Owned` is currently implemented with `shared_ptr`, it'd otherwise lead
> to double-freeing the pointers.
> 
> This patch changes those places to use references to the original
> `Owned` objects instead of copies.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 906836f3b8e0af79d7c61f90fd8a95f193b26e84 
> 
> 
> Diff: https://reviews.apache.org/r/65962/diff/1/
> 
> 
> Testing
> ---
> 
> `sudo bin/mesos-tests.sh --gtest_filter="*Default*"` on GNU/Linux
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>



Re: Review Request 65973: Avoid copying executor->scheduler message data in the master.

2018-03-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65973/#review198857
---



PASS: Mesos patch 65973 was successfully built and tested.

Reviews applied: `['65964', '65965', '65966', '65967', '65968', '65969', 
'65970', '65971', '65972', '65973']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65973

- Mesos Reviewbot Windows


On March 7, 2018, 8:36 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65973/
> ---
> 
> (Updated March 7, 2018, 8:36 p.m.)
> 
> 
> Review request for mesos and Meng Zhu.
> 
> 
> Bugs: MESOS-8628
> https://issues.apache.org/jira/browse/MESOS-8628
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This avoids a copy of the executor->scheduler message for the
> v0 message path in the master.
> 
> 
> Diffs
> -
> 
>   src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
>   src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 
> 
> 
> Diff: https://reviews.apache.org/r/65973/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 65312: Refactored authorization logic in the master.

2018-03-07 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65312/#review198853
---


Fix it, then Ship it!




Thanks Alexander, this is amazing! Much appreciated. And thanks to BenH for the 
inspiration :)


src/master/http.cpp
Line 867 (original), 838 (patched)


Nit: indented too far.



src/master/http.cpp
Line 4442 (original), 4138 (patched)


Nit: `&` is mis-placed.



src/master/master.cpp
Lines 11241-11242 (original), 11228-11229 (patched)


Nit: not indented enough.


- Greg Mann


On Feb. 27, 2018, 8:29 a.m., Alexander Rojas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65312/
> ---
> 
> (Updated Feb. 27, 2018, 8:29 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Greg Mann.
> 
> 
> Bugs: MESOS-8434
> https://issues.apache.org/jira/browse/MESOS-8434
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch makes uses of the new `ObjectApprovers` class which greatly
> simplifies the logic for constructing and using authorization.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 9a940db188436b9c5a39b0637cb4c15ee2ab5266 
>   src/common/protobuf_utils.cpp f516091e9208552488c154f34adde6f20d3413bf 
>   src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
>   src/master/master.hpp 1fadbe61f774e8ca9926af4e39b3d6c8e24fe5df 
>   src/master/master.cpp f2c2dbe6a4568027f11ad14121c2f9a1d1a43f80 
> 
> 
> Diff: https://reviews.apache.org/r/65312/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Alexander Rojas
> 
>



Re: Review Request 65968: Updated the master's decline handler to take an rvalue reference.

2018-03-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65968/#review198856
---



PASS: Mesos patch 65968 was successfully built and tested.

Reviews applied: `['65964', '65965', '65966', '65967', '65968']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65968

- Mesos Reviewbot Windows


On March 8, 2018, 3:52 a.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65968/
> ---
> 
> (Updated March 8, 2018, 3:52 a.m.)
> 
> 
> Review request for mesos and Meng Zhu.
> 
> 
> Bugs: MESOS-8628
> https://issues.apache.org/jira/browse/MESOS-8628
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The master's handlers should all take rvalue references for
> consistency.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
>   src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
>   src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 
> 
> 
> Diff: https://reviews.apache.org/r/65968/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Review Request 65976: Allowed profiles to be missing from `DiskProfileAdaptor`.

2018-03-07 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65976/
---

Review request for mesos, Benjamin Bannier, Jie Yu, and Joseph Wu.


Bugs: MESOS-8492
https://issues.apache.org/jira/browse/MESOS-8492


Repository: mesos


Description
---

Allowed profiles to be missing from `DiskProfileAdaptor`.


Diffs
-

  src/resource_provider/storage/provider.cpp 
63dde512fd8cc9f68f5f48a96869eb09b23b6f4a 
  src/resource_provider/storage/uri_disk_profile.cpp 
665798fdb085ea34f93bd287fe6f9ab29a265cbf 


Diff: https://reviews.apache.org/r/65976/diff/1/


Testing
---

sudo make check


Thanks,

Chun-Hung Hsiao



Review Request 65975: Sequentialize reconciliations of storage pools in SLRP.

2018-03-07 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65975/
---

Review request for mesos, Benjamin Bannier, Jie Yu, and Joseph Wu.


Bugs: MESOS-8492
https://issues.apache.org/jira/browse/MESOS-8492


Repository: mesos


Description
---

The storage pools needs to be reconciled in the following two scenarios:

1. When there is a change in the set of known profiles.
2. When a volume/block of an unknown profile is destroyed, because the
   disk space being freed up may belong to a known profile.

This patch adds a sequence to coordinate the reconciliations for the
above two cases.


Diffs
-

  src/resource_provider/storage/provider.cpp 
63dde512fd8cc9f68f5f48a96869eb09b23b6f4a 


Diff: https://reviews.apache.org/r/65975/diff/1/


Testing
---

sudo make check

NOTE: `StorageLocalResourceProviderTest.ROOT_RetryOperationStatusUpdate` and 
`StorageLocalResourceProviderTest.ROOT_RetryOperationStatusUpdateAfterRecovery` 
seems flaky. Will investigate tomorrow.


Thanks,

Chun-Hung Hsiao



Re: Review Request 65594: Checkpointed profiles in storage local resource provider.

2018-03-07 Thread Chun-Hung Hsiao


> On Feb. 24, 2018, 12:11 a.m., Jie Yu wrote:
> > src/resource_provider/storage/provider.cpp
> > Line 1304 (original), 1329 (patched)
> > 
> >
> > This flag is a bit confusing. When we will set this flag to `true`? Do 
> > we need to set it to true when `reconcileStoragePools` is called 
> > (currently, it's not)?
> > 
> > What if `GetCapacity` reports shrinked capacity? Should we drop 
> > `CreateVolume` in that case?

Refactored in a followup patch: https://reviews.apache.org/r/65975/

It is fine for `GetCapacity` to shrink the resources since will wait for all 
pending `CreateVolume` to complete before calling `GetCapacity`.


> On Feb. 24, 2018, 12:11 a.m., Jie Yu wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 2968-2969 (patched)
> > 
> >
> > Do you need a dispatch here?

Removed in https://reviews.apache.org/r/65975/.


> On Feb. 24, 2018, 12:11 a.m., Jie Yu wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 2970-2971 (patched)
> > 
> >
> > This is a bit confusing to me initially, and a bit hard to follow.
> > 
> > I think the goal here is trying to prevent concurrent reconciliation. 
> > However, currently in the code, we used two ways to enforce that:
> > 1) the loop for detecting profile changes (to prevent two 
> > reconcileProfiles run concurrently
> > 2) operationSequence to prevent reconcileProfiles and 
> > reconcileStoragePool run concurrently
> > 
> > Maybe we should use a unified way.

Refactored in https://reviews.apache.org/r/65975/.


> On Feb. 24, 2018, 12:11 a.m., Jie Yu wrote:
> > src/resource_provider/storage/provider.cpp
> > Lines 3087 (patched)
> > 
> >
> > Do you want to add some CHECK here to check resource `has_disk`, 
> > `has_source` and `has_profile`?

I think `CHECK(resource.disk().source().has_profile())` is sufficient and 
concise since if either `disk` or `source` is missing, we will get an empty 
`Resource::DiskInfo::Source`.


- Chun-Hung


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65594/#review197572
---


On March 8, 2018, 5:04 a.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65594/
> ---
> 
> (Updated March 8, 2018, 5:04 a.m.)
> 
> 
> Review request for mesos, James DeFelice, Jie Yu, and Joseph Wu.
> 
> 
> Bugs: MESOS-8492
> https://issues.apache.org/jira/browse/MESOS-8492
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> SLRP now checkpoints profiles associated with storage pools, and does
> not depend on the `DiskProfileAdaptor` module to return the set of
> previously-known profiles during recovery.
> 
> 
> Diffs
> -
> 
>   src/resource_provider/state.proto 8577b58b8cdb63b3daddf06ea5d12f80f9d42f2b 
>   src/resource_provider/storage/provider.cpp 
> 63dde512fd8cc9f68f5f48a96869eb09b23b6f4a 
> 
> 
> Diff: https://reviews.apache.org/r/65594/diff/4/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 65594: Checkpointed profiles in storage local resource provider.

2018-03-07 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65594/
---

(Updated March 8, 2018, 5:04 a.m.)


Review request for mesos, James DeFelice, Jie Yu, and Joseph Wu.


Changes
---

Refactored and addressed Jie's comments. Will address Ben's comments tomorrow.


Bugs: MESOS-8492
https://issues.apache.org/jira/browse/MESOS-8492


Repository: mesos


Description (updated)
---

SLRP now checkpoints profiles associated with storage pools, and does
not depend on the `DiskProfileAdaptor` module to return the set of
previously-known profiles during recovery.


Diffs (updated)
-

  src/resource_provider/state.proto 8577b58b8cdb63b3daddf06ea5d12f80f9d42f2b 
  src/resource_provider/storage/provider.cpp 
63dde512fd8cc9f68f5f48a96869eb09b23b6f4a 


Diff: https://reviews.apache.org/r/65594/diff/4/

Changes: https://reviews.apache.org/r/65594/diff/3-4/


Testing (updated)
---

sudo make check


Thanks,

Chun-Hung Hsiao



Review Request 65974: Added comments and made some renaming in SLRP.

2018-03-07 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65974/
---

Review request for mesos and Jie Yu.


Repository: mesos


Description
---

Added comments and made some renaming in SLRP.


Diffs
-

  src/resource_provider/storage/provider.cpp 
63dde512fd8cc9f68f5f48a96869eb09b23b6f4a 


Diff: https://reviews.apache.org/r/65974/diff/1/


Testing
---

sudo make check


Thanks,

Chun-Hung Hsiao



Review Request 65973: Avoid copying executor->scheduler message data in the master.

2018-03-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65973/
---

Review request for mesos and Meng Zhu.


Bugs: MESOS-8628
https://issues.apache.org/jira/browse/MESOS-8628


Repository: mesos


Description
---

This avoids a copy of the executor->scheduler message for the
v0 message path in the master.


Diffs
-

  src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
  src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 


Diff: https://reviews.apache.org/r/65973/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler



Review Request 65972: Avoid copying scheduler->executor message data in the master.

2018-03-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65972/
---

Review request for mesos and Meng Zhu.


Bugs: MESOS-8628
https://issues.apache.org/jira/browse/MESOS-8628


Repository: mesos


Description
---

This avoids a copy of the scheduler->executor message for the
v0 message path in the master.


Diffs
-

  src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
  src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 


Diff: https://reviews.apache.org/r/65972/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler



Review Request 65971: Avoid a copy of the scheduler->executor message in the master.

2018-03-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65971/
---

Review request for mesos and Meng Zhu.


Bugs: MESOS-8628
https://issues.apache.org/jira/browse/MESOS-8628


Repository: mesos


Description
---

This avoids a copy in both the v1 http and v0 message code paths.


Diffs
-

  src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
  src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
  src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 


Diff: https://reviews.apache.org/r/65971/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler



Review Request 65970: Avoid copying acknowledgement message data in the master.

2018-03-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65970/
---

Review request for mesos and Meng Zhu.


Bugs: MESOS-8628
https://issues.apache.org/jira/browse/MESOS-8628


Repository: mesos


Description
---

This updates the acknowledge call handler to take an rvalue reference
for consistency. This also updates the handler to move the data into
the outgoing message.


Diffs
-

  src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
  src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
  src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 


Diff: https://reviews.apache.org/r/65970/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler



Review Request 65969: Avoid a copy of the status update acknowledgements in the master.

2018-03-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65969/
---

Review request for mesos and Meng Zhu.


Bugs: MESOS-8628
https://issues.apache.org/jira/browse/MESOS-8628


Repository: mesos


Description
---

This updates the handler to take an rvalue reference and move the
data into the `scheduler::Call::Acknowledge` message.


Diffs
-

  src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
  src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 


Diff: https://reviews.apache.org/r/65969/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler



Re: Review Request 65311: Added the ObjectApprovers to which unifies authorization logic.

2018-03-07 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65311/#review198811
---




src/common/http.hpp
Lines 215-220 (patched)


Looks like we'll end up logging an extra space when `principal.isNone()`.



src/common/http.hpp
Lines 244 (patched)


Could you verify that this is true in the case of resource-provider-capable 
agents? See 
https://github.com/apache/mesos/blob/66bae088de61ae0cc3f9c5fac6f4b6f9695a27d1/src/master/master.cpp#L7087-L7117

I haven't yet identified code in the agent which downgrades resources 
before checkpointing.



src/common/http.cpp
Lines 874 (patched)


Indented too far.



src/common/http.cpp
Lines 894 (patched)


Should we use an underscore for this `approvers` variable?


- Greg Mann


On Feb. 5, 2018, 1:52 p.m., Alexander Rojas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65311/
> ---
> 
> (Updated Feb. 5, 2018, 1:52 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Greg Mann.
> 
> 
> Bugs: MESOS-8434
> https://issues.apache.org/jira/browse/MESOS-8434
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Introduces the class `ObjectApprovers` which unifies the different
> mechanisms used in Mesos in order to perform authorization.
> 
> This new class will supersede the `Acceptor` interfaces and their
> logic.
> 
> Follow up patches will make use of this interface and remove
> deprecated code.
> 
> 
> Diffs
> -
> 
>   src/common/http.hpp 750d3bfba1647624983108bdd23295a3b3091fe4 
>   src/common/http.cpp 8d3a4ad63bd74e0313082bad3d89c21fbf54f256 
> 
> 
> Diff: https://reviews.apache.org/r/65311/diff/5/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Alexander Rojas
> 
>



Re: Review Request 65959: Added a test to make sure `slave/recovery_secs` is reported in metrics.

2018-03-07 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65959/#review198852
---



Patch looks great!

Reviews applied: [65954, 65959]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On March 7, 2018, 11:21 p.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65959/
> ---
> 
> (Updated March 7, 2018, 11:21 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Greg Mann, Jason Lai, and James Peach.
> 
> 
> Bugs: MESOS-8609
> https://issues.apache.org/jira/browse/MESOS-8609
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a test to make sure `slave/recovery_secs` is reported in metrics.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_tests.cpp e253317a67019302f18afe11e2a314e716cec226 
> 
> 
> Diff: https://reviews.apache.org/r/65959/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Review Request 65968: Updated the master's decline handler to take an rvalue reference.

2018-03-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65968/
---

Review request for mesos and Meng Zhu.


Bugs: MESOS-8628
https://issues.apache.org/jira/browse/MESOS-8628


Repository: mesos


Description
---

The master's handlers should all take rvalue references for
consistency.


Diffs
-

  src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
  src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
  src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 


Diff: https://reviews.apache.org/r/65968/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler



Review Request 65964: Updated the master's status update handler to avoid unnecessary copying.

2018-03-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65964/
---

Review request for mesos and Meng Zhu.


Bugs: MESOS-8628
https://issues.apache.org/jira/browse/MESOS-8628


Repository: mesos


Description
---

Previously, we took a copy in order to mutate the UUID within the
update if communicating with an agent prior to 0.26. We now no longer
need to copy the update.


Diffs
-

  src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
  src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 


Diff: https://reviews.apache.org/r/65964/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler



Review Request 65966: Updated the master's receive handler to take an rvalue reference.

2018-03-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65966/
---

Review request for mesos and Meng Zhu.


Bugs: MESOS-8628
https://issues.apache.org/jira/browse/MESOS-8628


Repository: mesos


Description
---

This allows us to perform copy elimination in subsequent patches.


Diffs
-

  src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
  src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 


Diff: https://reviews.apache.org/r/65966/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler



Review Request 65965: Avoid copying of launch task messages in the master.

2018-03-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65965/
---

Review request for mesos and Meng Zhu.


Bugs: MESOS-8628
https://issues.apache.org/jira/browse/MESOS-8628


Repository: mesos


Description
---

Prior to this patch, the launch task message was copied into an
Accept or Decline call. This updates the handler to perform moves
instead.


Diffs
-

  src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
  src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 


Diff: https://reviews.apache.org/r/65965/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler



Review Request 65967: Eliminated some copying in the master's Accept call handler path.

2018-03-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65967/
---

Review request for mesos and Meng Zhu.


Bugs: MESOS-8628
https://issues.apache.org/jira/browse/MESOS-8628


Repository: mesos


Description
---

This eliminates two copies of the `Accept` message.


Diffs
-

  src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
  src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
  src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 


Diff: https://reviews.apache.org/r/65967/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler



Re: Review Request 65962: Made the default executor use references to `Owned` pointers.

2018-03-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65962/#review198851
---



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['65692', '65693', '65694', '65695', '65962']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65962

Relevant logs:

- 
[mesos-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65962/logs/mesos-tests-stdout.log):

```

[--] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest
[ RUN  ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0
[   OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0 (51 
ms)
[ RUN  ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1
[   OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1 (58 
ms)
[--] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest (112 ms 
total)

[--] 1 test from IsolationFlag/CpuIsolatorTest
[ RUN  ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0
[   OK ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0 (2437 ms)
[--] 1 test from IsolationFlag/CpuIsolatorTest (2466 ms total)

[--] 1 test from IsolationFlag/MemoryIsolatorTest
[ RUN  ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0
[   OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (2454 ms)
[--] 1 test from IsolationFlag/MemoryIsolatorTest (2481 ms total)

[--] Global test environment tear-down
[==] 916 tests from 91 test cases ran. (511941 ms total)
[  PASSED  ] 913 tests.
[  FAILED  ] 3 tests, listed below:
[  FAILED  ] MesosContainerizer/DefaultExecutorTest.KillTask/0, where 
GetParam() = "mesos"
[  FAILED  ] MesosContainerizer/DefaultExecutorTest.KillMultipleTasks/0, where 
GetParam() = "mesos"
[  FAILED  ] MesosContainerizer/DefaultExecutorTest.CommitSuicideOnKillTask/0, 
where GetParam() = "mesos"

 3 FAILED TESTS
  YOU HAVE 210 DISABLED TESTS

```

- 
[mesos-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65962/logs/mesos-tests-stderr.log):

```
I0308 03:28:53.669703  1996 master.cpp:10210] Updating the state of task 
660c7063-72eb-4aac-a218-62dd4718a515 of framework 
c6767351-4881-42fb-9832-930b5b2bee2a- (latest state: TASK_KILLED, status 
update state: TASKI0308 03:28:52.950726 10868 exec.cpp:162] Version: 1.6.0
I0308 03:28:52.978703  6172 exec.cpp:236] Executor registered on agent 
c6767351-4881-42fb-9832-930b5b2bee2a-S0
I0308 03:28:52.983726  9212 executor.cpp:176] Received SUBSCRIBED event
I0308 03:28:52.987726  9212 executor.cpp:180] Subscribed executor on 
build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net
I0308 03:28:52.988725  9212 executor.cpp:176] Received LAUNCH event
I0308 03:28:52.993731  9212 executor.cpp:648] Starting task 
660c7063-72eb-4aac-a218-62dd4718a515
I0308 03:28:53.073702  9212 executor.cpp:483] Running 
'D:\DCOS\mesos\src\mesos-containerizer.exe launch '
I0308 03:28:53.618705  9212 executor.cpp:661] Forked command at 5876
I0308 03:28:53.674851  5116 exec.cpp:445] Executor asked to shutdown
I0308 03:28:53.675705  9212 executor.cpp:176] Received SHUTDOWN event
I0308 03:28:53.675705  9212 executor.cpp:758] Shutting down
I0308 03:28:53.675705  9212 executor.cpp:868] Sending SIGTERM to process tree 
at pid 5_KILLED)
I0308 03:28:53.669703  6528 slave.cpp:3878] Shutting down framework 
c6767351-4881-42fb-9832-930b5b2bee2a-
I0308 03:28:53.670702  6528 slave.cpp:6571] Shutting down executor 
'660c7063-72eb-4aac-a218-62dd4718a515' of framework 
c6767351-4881-42fb-9832-930b5b2bee2a- at executor(1)@10.3.1.5:55132
I0308 03:28:53.672705  6528 slave.cpp:924] Agent terminating
W0308 03:28:53.672705  6528 slave.cpp:3874] Ignoring shutdown framework 
c6767351-4881-42fb-9832-930b5b2bee2a- because it is terminating
I0308 03:28:53.674851  1996 master.cpp:10309] Removing task 
660c7063-72eb-4aac-a218-62dd4718a515 with resources cpus(allocated: *):4; 
mem(allocated: *):2048; disk(allocated: *):1024; ports(allocated: 
*):[31000-32000] of framework c6767351-4881-42fb-9832-930b5b2bee2a- on 
agent c6767351-4881-42fb-9832-930b5b2bee2a-S0 at slave(398)@10.3.1.5:55111 
(build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0308 03:28:53.678709 10112 containerizer.cpp:2338] Destroying container 
2ac36cfa-e124-4dc7-8ae7-adeeed493141 in RUNNING state
I0308 03:28:53.678709 10328 hierarchical.cpp:344] Removed framework 
c6767351-4881-42fb-9832-930b5b2bee2a-
I0308 03:28:53.679710 10112 containerizer.cpp:2952] Transitioning the state of 
container 2ac36cfa-e124-4dc7-8ae7-adeeed493141 from RUNNING to DESTROYING
I0308 03:28:53.679710  1996 master.cpp:1306] Agent 
c6767351-4881-42fb-9832-930b5b2bee2a-S0 at slave(398)@10.3.1.5:55111 

Re: Review Request 59987: Added protobuf map support to stout JSON<->protobuf conversion.

2018-03-07 Thread Chun-Hung Hsiao


> On March 1, 2018, 4:10 a.m., Benjamin Mahler wrote:
> > 3rdparty/stout/include/stout/protobuf.hpp
> > Lines 388 (patched)
> > 
> >
> > s/name/key/?
> 
> Qian Zhang wrote:
> I'd like to be consistent with the existing code: 
> https://github.com/apache/mesos/blob/1.5.0/3rdparty/stout/include/stout/protobuf.hpp#L575:L576.
>  And in another hand, in the code below, I already have another local 
> variable named `key`, e.g.,:
> ```
> Try key = numify(name);
> ```
> So I'd like to differentiate them.

I have no objection to keep the nameing as is. But on the other hand, we name 
it `name` probably because of `FindFieldByName(name)`. I'll leave this up to 
you.


> On March 1, 2018, 4:10 a.m., Benjamin Mahler wrote:
> > 3rdparty/stout/include/stout/protobuf.hpp
> > Lines 391-487 (patched)
> > 
> >
> > It looks like we can avoid this logic since protobuf's JSON conversion 
> > allows the protobuf key types (bool, integers, strings) to be converted 
> > from JSON strings. This means we could just recurse for both key and value 
> > here:
> > 
> > ```
> > // Some comment explaining what map is equivalent to with
> > // a reference to the google documentation.
> > google::protobuf::Message* entry =
> >   reflection->AddMessage(message, field);
> > 
> > const google::protobuf::FieldDescriptor* key_field =
> >   entry->GetDescriptor()->FindFieldByNumber(1);
> >   
> > // Maybe passing just 'key' works due to implicit 
> > construction?
> > // TODO(...): This copies the key, consider avoiding this.
> > Try apply =
> >   boost::apply_visitor(Parser(entry, key_field), 
> > JSON::String(key));
> > 
> > if (apply.isError()) {
> >   return Error(apply.error());
> > }
> >   
> > const google::protobuf::FieldDescriptor* value_field =
> >   entry->GetDescriptor()->FindFieldByNumber(2);
> >   
> > Try apply =
> >   boost::apply_visitor(Parser(entry, value_field), value);
> > 
> > if (apply.isError()) {
> >   return Error(apply.error());
> > }
> > ```
> > 
> > Now, to make this simplification work, we need to update our JSON 
> > conversion in a separate patch to allow bools and integers to be parsed 
> > from JSON strings to match google's logic for conversion:
> > 
> > 
> > https://github.com/google/protobuf/blob/v3.5.1/src/google/protobuf/util/internal/datapiece.cc#L203
> > 
> > Documentation here: 
> > https://developers.google.com/protocol-buffers/docs/proto3#json
> 
> Qian Zhang wrote:
> > // Maybe passing just 'key' works due to implicit construction?
> 
> We cannot pass `key` or even `JSON::String` to `boost::apply_visitor()` 
> because it cannot pass compilation:
> ```
> 
> ../3rdparty/boost-1.65.0/boost/variant/detail/apply_visitor_unary.hpp:84:22: 
> error: no member named 'apply_visitor' in 'std::__1::basic_string'
> ```
> or
> ```
> 
> ../3rdparty/boost-1.65.0/boost/variant/detail/apply_visitor_unary.hpp:84:22: 
> error: no member named 'apply_visitor' in 'JSON::String'
> ```
> That is because `std::string` and `JSON::String` do not have a member 
> `apply_visitor`, so I think we have to use `JSON::Value` like this:
> ```
> google::protobuf::Message* entry =
>   reflection->AddMessage(message, field);
> 
> const google::protobuf::FieldDescriptor* key_field =
>   entry->GetDescriptor()->FindFieldByNumber(1);
> 
> JSON::Value key(name);
> 
> Try apply =
>   boost::apply_visitor(Parser(entry, key_field), key);
> 
> if (apply.isError()) {
>   return Error(apply.error());
> }
> 
> const google::protobuf::FieldDescriptor* value_field =
>   entry->GetDescriptor()->FindFieldByNumber(2);
> 
> apply = boost::apply_visitor(Parser(entry, value_field), 
> value);
> if (apply.isError()) {
>   return Error(apply.error());
> }
> ```
> 
> > we need to update our JSON conversion in a separate patch to allow 
> bools and integers to be parsed from JSON strings to match google's logic for 
> conversion
> 
> Did you mean we should improve the method below by adding more cases for 
> converting `JSON::String` to bools and integers?
> ```
>   Try operator()(const JSON::String& string) const
>   {

Review Request 65962: Made the default executor use references to `Owned` pointers.

2018-03-07 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65962/
---

Review request for mesos and Joseph Wu.


Repository: mesos


Description
---

`Owned` pointers are copied in multiple places of the default executor.
This violates the semantic of owned pointers and works only because
`Owned` is currently implemented with `shared_ptr`, it'd otherwise lead
to double-freeing the pointers.

This patch changes those places to use references to the original
`Owned` objects instead of copies.


Diffs
-

  src/launcher/default_executor.cpp 906836f3b8e0af79d7c61f90fd8a95f193b26e84 


Diff: https://reviews.apache.org/r/65962/diff/1/


Testing
---

`sudo bin/mesos-tests.sh --gtest_filter="*Default*"` on GNU/Linux


Thanks,

Gaston Kleiman



Re: Review Request 65693: Made the default executor fail kills if the response isn't "200 OK".

2018-03-07 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65693/
---

(Updated March 7, 2018, 6:06 p.m.)


Review request for mesos, Joseph Wu, Qian Zhang, and Vinod Kone.


Changes
---

Improved style.


Bugs: MESOS-8530
https://issues.apache.org/jira/browse/MESOS-8530


Repository: mesos


Description
---

The default executor's `Future kill(const ContainerID&, int)`
method returns `Nothing()` if the agent responded to the
`KILL_NESTED_CONTAINER` call, regardless of the response.

This patch updates the method, so that it returns a failure if the
response is not "200 OK".


Diffs (updated)
-

  src/launcher/default_executor.cpp 906836f3b8e0af79d7c61f90fd8a95f193b26e84 


Diff: https://reviews.apache.org/r/65693/diff/2/

Changes: https://reviews.apache.org/r/65693/diff/1-2/


Testing
---

`sudo bin/mesos-tests.sh` on GNU/Linux


Thanks,

Gaston Kleiman



Re: Review Request 65694: Made the default executor's handling of kill escalations more robust.

2018-03-07 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65694/
---

(Updated March 7, 2018, 6:07 p.m.)


Review request for mesos, Joseph Wu, Qian Zhang, and Vinod Kone.


Changes
---

Fixed a typo in the patch description.


Bugs: MESOS-8530
https://issues.apache.org/jira/browse/MESOS-8530


Repository: mesos


Description (updated)
---

This patch makes the default executor retry SIGKILL escalations if the
executor is disconnected from the agent or the kill call fails.


Diffs (updated)
-

  src/launcher/default_executor.cpp 906836f3b8e0af79d7c61f90fd8a95f193b26e84 


Diff: https://reviews.apache.org/r/65694/diff/2/

Changes: https://reviews.apache.org/r/65694/diff/1-2/


Testing
---

`sudo bin/mesos-tests.sh` on GNU/Linux


Thanks,

Gaston Kleiman



Re: Review Request 65692: Changed the signature of a DefaultExecutor kill method.

2018-03-07 Thread Gaston Kleiman


> On March 7, 2018, 3:42 p.m., Joseph Wu wrote:
> > src/launcher/default_executor.cpp
> > Line 986 (original), 986 (patched)
> > 
> >
> > Here's another copy that should be made into a reference.

This happens in many other places as well. My initial plan was to create a tech 
debt issue to audit the copies of `Owned` pointers, but I instead fixed this 
file here: https://reviews.apache.org/r/65962/


- Gaston


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65692/#review198667
---


On Feb. 16, 2018, 4:24 p.m., Gaston Kleiman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65692/
> ---
> 
> (Updated Feb. 16, 2018, 4:24 p.m.)
> 
> 
> Review request for mesos, Joseph Wu, Qian Zhang, and Vinod Kone.
> 
> 
> Bugs: MESOS-8530
> https://issues.apache.org/jira/browse/MESOS-8530
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This method is being passed the container to be killed as
> `Owned`.
> 
> The container object is owned by the executor and stored in
> `LinkedHashMap containers`.
> 
> We should pass around raw pointers and not make copies of the stored
> `Owned` object, so this patch changes the signature of the `kill`
> method.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 906836f3b8e0af79d7c61f90fd8a95f193b26e84 
> 
> 
> Diff: https://reviews.apache.org/r/65692/diff/2/
> 
> 
> Testing
> ---
> 
> `sudo bin/mesos-tests.sh` on GNU/Linux
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>



Re: Review Request 65930: Start heartbeater after SUBSCRIBED event.

2018-03-07 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65930/#review198845
---



Patch looks great!

Reviews applied: [65930]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On March 7, 2018, 6:54 p.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65930/
> ---
> 
> (Updated March 7, 2018, 6:54 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Jason Lai, Quinn Leng, and Vinod Kone.
> 
> 
> Bugs: MESOS-8641
> https://issues.apache.org/jira/browse/MESOS-8641
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> https://reviews.apache.org/r/61262 added heartbeat support to
> event stream, but the implementation could send a `HEARTBEAT`
> event before `SUBSCRIBED` event, which changed previous behavior.
> 
> This patch fixed the issue by starting heartbeater only after
> `SUBSCRIBED` event is sent.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
> 
> 
> Diff: https://reviews.apache.org/r/65930/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Re: Review Request 65936: Changed flags to CreateFile to support Windows symlink path resolution.

2018-03-07 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65936/#review198844
---




3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp
Lines 192-199 (original), 192-199 (patched)


Adding here just so we remember:

Let's add a regression test around `os::realpath` where we open a writable 
handle to file, and then call `os::realpath`.


- Andrew Schwartzmeyer


On March 7, 2018, 10:24 a.m., John Kordich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65936/
> ---
> 
> (Updated March 7, 2018, 10:24 a.m.)
> 
> 
> Review request for mesos, Akash Gupta, Andrew Schwartzmeyer, Jeff Coffler, 
> and Joseph Wu.
> 
> 
> Bugs: MESOS-8646
> https://issues.apache.org/jira/browse/MESOS-8646
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Changed flags to CreateFile to support Windows symlink path resolution.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp 
> 858b3c7c27e95b9cf46a7457c1aaa05a0e19188d 
> 
> 
> Diff: https://reviews.apache.org/r/65936/diff/1/
> 
> 
> Testing
> ---
> 
> I encountered a permissions error stating "The process cannot access the file 
> because it is being used by another process." when attempting to download 
> stderr/stdout from a task running on a Windows agent, or when attempting to 
> download the mesos agent log on a Windows agent. 
> 
> After some very helpful information from Andy and Akash, I made this change 
> and confirmed it fixed this issue with the task/agent logs.
> 
> I also ran the automated tests on Windows and all tests pass.
> 
> I did not test this on Linux, but because this file is compiled only on 
> Windows, it shouldn't be necessary.
> 
> 
> Thanks,
> 
> John Kordich
> 
>



Re: Review Request 65780: Removed an unnecessary argument from master::Call validation.

2018-03-07 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65780/#review198843
---


Ship it!




Ship It!

- Meng Zhu


On Feb. 28, 2018, 6:31 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65780/
> ---
> 
> (Updated Feb. 28, 2018, 6:31 p.m.)
> 
> 
> Review request for mesos, Michael Park and Meng Zhu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The principal is not needed by the validation function.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
>   src/master/validation.hpp 7c129ceb929596acbb64d37025e055661277e6bf 
>   src/master/validation.cpp 9254cd22e67d9aca53135701575fb0857ff6f7d6 
> 
> 
> Diff: https://reviews.apache.org/r/65780/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 65851: Disallow limit in /quota requests and SET_QUOTA calls.

2018-03-07 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65851/#review198842
---




src/master/quota.cpp
Lines 166-171 (patched)


Hmm, it is surprising to me that this check is put here. I think the 
assumption here is that this function is called only by quota `set()`. This is 
not obvious. Let's add a todo here to either consolidate this with the new 
`quotaRequest` validation or put this limit check into the master call 
validation.


- Meng Zhu


On Feb. 28, 2018, 6:31 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65851/
> ---
> 
> (Updated Feb. 28, 2018, 6:31 p.m.)
> 
> 
> Review request for mesos, Michael Park and Meng Zhu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This ensures that after the protobuf changes to introduce limit
> are committed, users cannot set limit in the existing API.
> 
> 
> Diffs
> -
> 
>   src/master/quota.cpp 58bab6a678bac9e41a7994ba0b7cc1ed069a8a18 
> 
> 
> Diff: https://reviews.apache.org/r/65851/diff/1/
> 
> 
> Testing
> ---
> 
> Test added in the subsequent patch.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 65695: Made the default executor allow schedulers to retry task kills.

2018-03-07 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65695/#review198841
---


Ship it!




LGTM.

I wonder if we could repurpose/extend any of our existing tests to cover this 
case...  The DefaultExecutorTests already cover a couple of kill-cases, but 
none where the `KILL` call itself fails.

- Joseph Wu


On Feb. 16, 2018, 4:27 p.m., Gaston Kleiman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65695/
> ---
> 
> (Updated Feb. 16, 2018, 4:27 p.m.)
> 
> 
> Review request for mesos, Joseph Wu, Qian Zhang, and Vinod Kone.
> 
> 
> Bugs: MESOS-8530
> https://issues.apache.org/jira/browse/MESOS-8530
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The default executor transitions a task to `TASK_KILLING` and marks its
> child container as being killed before posting a `KILL` call to the
> agent.
> 
> The executor ignores kill requests for containers that are marked as
> being killed, and it doesn't remove this mark if the `KILL` call fails.
> This means that it's possible for tasks to get stuck in a `TASK_KILLING`
> state.
> 
> This patch makes the default executor remove the killing mark if a
> `KILL` call fails. That way a scheduler can retry a kill.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 8720dada8bc6ca66f9e0fec6dc265eda3dcc7407 
> 
> 
> Diff: https://reviews.apache.org/r/65695/diff/1/
> 
> 
> Testing
> ---
> 
> `sudo bin/mesos-tests.sh` on GNU/Linux
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>



Re: Review Request 65334: Added quota limit to the master API protos.

2018-03-07 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65334/#review198834
---




include/mesos/v1/quota/quota.proto
Lines 38-40 (original), 30 (patched)


`QuotaInfo` is used as internal message. Users will most likely look at 
`QuotaRequest`, these user-facing comments are better to be placed in 
`QuotaRequest`?



include/mesos/v1/quota/quota.proto
Lines 46-49 (original), 36-82 (patched)


As discussed offline, let's give more thoughts on `force` semantic. What 
kind of check do we want to do regarding guarantee? Do we want to check the 
attainability of limit?



include/mesos/v1/quota/quota.proto
Line 57 (original), 93-94 (patched)


This `force` flag is used for both guarantee and limit. What if the user 
only wants to `force` one of the fields?

Also as discussed offline, one alternative is to not have the `force` 
option all-together.


- Meng Zhu


On Feb. 23, 2018, 2:23 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65334/
> ---
> 
> (Updated Feb. 23, 2018, 2:23 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov, Michael Park, and Meng Zhu.
> 
> 
> Bugs: MESOS-8487
> https://issues.apache.org/jira/browse/MESOS-8487
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This introduces a `limit` in `QuotaInfo` and `QuotaRequest`.
> 
> A new `UPDATE_QUOTA` master call is added as a replacement of
> `SET_QUOTA` and `REMOVE_QUOTA`, which provides the new semantics
> of unset limit meaning "no limit" and unset guarantee meaning
> "no guarantee".
> 
> The new APIs are marked as experimental for now.
> 
> 
> Diffs
> -
> 
>   include/mesos/master/master.proto f40caa2a6338ff58a325444cfb72cd24f2c99472 
>   include/mesos/quota/quota.proto f2ed6f555e1e7b0290fefa3e301c9bdbf550bd11 
>   include/mesos/v1/master/master.proto 
> 67c9560eaa7bd6b20b73af07f7666f0d98c4453d 
>   include/mesos/v1/quota/quota.proto 5dd772fee46f740b062827ed7c9704c26187cb12 
>   src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
>   src/master/validation.cpp 9254cd22e67d9aca53135701575fb0857ff6f7d6 
> 
> 
> Diff: https://reviews.apache.org/r/65334/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 65957: Updated Windows documentation.

2018-03-07 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65957/#review198838
---




docs/windows.md
Line 17 (original), 17 (patched)


That works. I just didn't want to conflate with "free software" because VS 
is definitely not. Thanks!



docs/windows.md
Lines 99 (patched)


Eh, if you have GCC installed on Windows, I figure you already know what 
you're doing. This was more of a "getting started" not how to fix any bug with 
it. Setting CXX and CC applies to more than just Ninja, too, that's generic 
CMake.


- Andrew Schwartzmeyer


On March 7, 2018, 12:58 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65957/
> ---
> 
> (Updated March 7, 2018, 12:58 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Benjamin Bannier, Jeff Coffler, Eric 
> Mumau, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This notes that Developer Mode is required for non-admin symlinks,
> removes reference to the close JIRA epic, and adds information on how
> to build with Ninja on Windows.
> 
> 
> Diffs
> -
> 
>   docs/windows.md cf2c7c230ec9fc062b4a4fe721d7669ef646253c 
> 
> 
> Diff: https://reviews.apache.org/r/65957/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 65959: Added a test to make sure `slave/recovery_secs` is reported in metrics.

2018-03-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65959/#review198837
---



PASS: Mesos patch 65959 was successfully built and tested.

Reviews applied: `['65954', '65959']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65959

- Mesos Reviewbot Windows


On March 7, 2018, 11:21 p.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65959/
> ---
> 
> (Updated March 7, 2018, 11:21 p.m.)
> 
> 
> Review request for mesos, Gilbert Song, Greg Mann, Jason Lai, and James Peach.
> 
> 
> Bugs: MESOS-8609
> https://issues.apache.org/jira/browse/MESOS-8609
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a test to make sure `slave/recovery_secs` is reported in metrics.
> 
> 
> Diffs
> -
> 
>   src/tests/slave_tests.cpp e253317a67019302f18afe11e2a314e716cec226 
> 
> 
> Diff: https://reviews.apache.org/r/65959/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Re: Review Request 65957: Updated Windows documentation.

2018-03-07 Thread Akash Gupta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65957/#review198835
---


Fix it, then Ship it!





docs/windows.md
Line 17 (original), 17 (patched)


I don't think "unpaid" can be used in that way. "free of charge" works.



docs/windows.md
Lines 99 (patched)


Ninja will pick the first compiler that's in your `$PATH`, so it picked GCC 
for me. You should mention that you can fix that by setting CC and CXX 
environmental variables like `$env:CC=$(Get-Command).Source` and 
`$env:CXX=$(Get-Command cl).Source`


- Akash Gupta


On March 7, 2018, 8:58 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65957/
> ---
> 
> (Updated March 7, 2018, 8:58 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Benjamin Bannier, Jeff Coffler, Eric 
> Mumau, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This notes that Developer Mode is required for non-admin symlinks,
> removes reference to the close JIRA epic, and adds information on how
> to build with Ninja on Windows.
> 
> 
> Diffs
> -
> 
>   docs/windows.md cf2c7c230ec9fc062b4a4fe721d7669ef646253c 
> 
> 
> Diff: https://reviews.apache.org/r/65957/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 65939: Updated role endpoints for hierarchical accounting.

2018-03-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65939/#review198755
---



Thanks Till! Can you add Meng to any hierachical role related reviews?

I looked through master and agent metrics and agent endpoints and didn't see 
any other places where this should be used.

>From an API perspective, I took at look at what cgroups did for flat vs 
>hierarchical information, look for "total_" here:
https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt

We could probably do something similar here, like:

```
{
  "name": "role",
  "weight": 1,
  
  "quota": 1 cpus, 10GB mem, 100GB disk,
  
  "allocation": 0
  "total_allocation": 1 cpus, 10GB mem, 100GB disk,
  
  "frameworks": ..., // frameworks directly subscribed to "role",
  "total_frameworks": ..., // all frameworks subscribed to roles within "role" 
tree.
}
```

Make sense?

(1) I was also expecting to see the master's `Role::allocatedResources` return 
the correct information for the protobuf rather than logic on top having to do 
the aggregation. Can we have the `Role` struct in the master be able to return 
all the information shown above? (you can tackle the different pieces in 
separate patches)

(2) This also lets us update the 'Frameworks' column of the Roles table to be 
hierarchical via 'total_frameworks', at which point we probably want to call 
that column something like 'Total Frameworks'. I think that's probably what a 
user expects to see in the UI? (E.g. "engineering" contains 10 frameworks, 2 of 
which are in "engineering/frontend" and 8 of which are in 
"engineering/backend").

Also, this would be a great opportunity (once we think we're happy with the API 
change) to publish it to the dev@ list to give folks a chance to give feedback. 
We're trying to start this practice for API changes as part of the API working 
group.


src/master/http.cpp
Lines 3476-3477 (patched)


Any reason you avoided being specific about this being an "allocation" 
tree? (e.g. you wanted to re-use it for other cases?)

Otherwise, I would suggest calling this:

```
// Provides hierarchical accounting of resource allocation.
// The allocation for a role is the aggregated allocation of
// its children (and itself since allocations can be made to
// non-leaf roles).
//
// For example: a (allocation = 1 + 3 = 5)
// / \
//   (allocation = 1) b   c (allocation = 3)
//
// TODO(tillt): This only supports building up the tree and
// querying allocation. More interface is needed to keep a
// tree up-to-date as things change.
class ResourceAllocationTree
```



src/master/http.cpp
Lines 3487 (patched)


s/resources/allocation/ ?



src/master/http.cpp
Lines 3491 (patched)


Just to be sure, do we disallow double slashes? E.g. "a//b"? If so, then 
tokenize and split provide the same functionality here, and I would be inclined 
to just use split to avoid having to reason about what the implications of 
stripping empty tokens are (e.g. it breaks the lookup logic?)



src/master/http.cpp
Lines 3494-3513 (patched)


The way this is written the root 'Node' does not contain the sum of its 
children's allocations, unlike all other nodes. This doesn't seem like an issue 
since the API does not allow you to query for the root allocation. However, 
might as well make the root 'Node.resources' contain the right information.



src/master/http.cpp
Lines 3539-3542 (patched)


Do we need pointers here? It seems to me the only pointer we need is in 
`nodes`. Then you won't need to do any `delete`s?



src/master/http.cpp
Lines 3702-3710 (patched)


Rather than having to do this in the http code, can we have the Role struct 
directly expose the aggregated information as well? I'd also like the Role 
struct to have quota information, but that's for another patch :)


- Benjamin Mahler


On March 7, 2018, 12:22 a.m., Till Toenshoff wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65939/
> ---
> 
> (Updated March 7, 2018, 12:22 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, Kapil Arya, and 
> Michael Park.
> 
> 
> Bugs: MESOS-8069
> https://issues.apache.org/jira/browse/MESOS-8069
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> When listing role "a", resources allocated to role "a/b" are added to
> 

Re: Review Request 65932: Added a generic mechanism to check for isolator requirements.

2018-03-07 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65932/#review198824
---



See my comment around the CNI isolator. Need more discussion on this one


include/mesos/slave/isolator.hpp
Lines 57 (patched)


why returning `Option` instead of `bool`?

What does `None()` mean here?



src/slave/containerizer/mesos/containerizer.cpp
Lines 468 (patched)


Should we use `geteuid` here (so that mesos agent can be a setuid program)?



src/slave/containerizer/mesos/isolators/docker/volume/isolator.hpp
Line 56 (original), 56 (patched)


not yours, can you follow up with patch to make those `const` as well (just 
to be consistent)



src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
Line 226 (original), 222 (patched)


2 lines apart



src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
Lines 229-232 (patched)


Looks like this override is not necessary. I'd suggest remove this 
override. That also allows us to perform optimization (fine grind capabilities 
check) in the future.



src/slave/containerizer/mesos/isolators/filesystem/posix.hpp
Lines 37 (patched)


`{` to the next line



src/slave/containerizer/mesos/isolators/filesystem/posix.hpp
Lines 38 (patched)


hum, why posix filesystem isolator requires root?



src/slave/containerizer/mesos/isolators/filesystem/shared.cpp
Lines 221-224 (patched)


Kill



src/slave/containerizer/mesos/isolators/gpu/isolator.cpp
Lines 239 (patched)


gpu isolator requires mount namespaces so that the injected volumes won't 
show up on the host.



src/slave/containerizer/mesos/isolators/namespaces/ipc.cpp
Lines 70-73 (patched)


kill



src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
Lines 88-101 (original), 88-96 (patched)


CNI isolator is a bit tricky here. We do enable that currently when the 
agent is running not under 'root' so that it can reject container launch 
requesting joining a named CNI network.

So this change will break it.

I was thinking maybe we should make `root` permission check a per container 
check during launch time (rather than an isolator level global check). Of 
course, the downside is that we cannot fail fast.

Also, the namespace requirements are different for each container. If a 
container joins host network and does not define container image, no namespace 
is required.

If we really want to add an isolator level check, that'll probably check 
the "minimal" requirement.


- Jie Yu


On March 6, 2018, 6:18 p.m., James Peach wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65932/
> ---
> 
> (Updated March 6, 2018, 6:18 p.m.)
> 
> 
> Review request for mesos, Andrew Schwartzmeyer and Jie Yu.
> 
> 
> Bugs: MESOS-6555
> https://issues.apache.org/jira/browse/MESOS-6555
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Many isolators require root privileges and support for specific
> Linux namespaces. By adding some additional virtual functions
> to the `Isolator` interface, we can let isolators declare
> their needs and perform the checks in a containerizer single
> pass when we create the isolators.
> 
> 
> Diffs
> -
> 
>   include/mesos/slave/isolator.hpp f682e3037523b1c210b5ba36246556518f261e1b 
>   src/linux/ns.cpp 64722c785cecce5b861efbc3f3f2027a61f666d4 
>   src/slave/containerizer/mesos/containerizer.cpp 
> a61ed5bb4f60ec63aebd6bdabbc9a4b01235623e 
>   src/slave/containerizer/mesos/isolator.hpp 
> b7af3946155e28d80013a650973ae230de7d01c5 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp 
> 5763c9880728f02e44116fd50e62b592a8ef69b6 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp 
> 4431ce13d28035b0c5c037b2848ae03aeaf65562 
>   src/slave/containerizer/mesos/isolators/docker/volume/isolator.hpp 
> 0e1761d44f7dce51bef138c5eda2ac3d8eb6ea21 
>   src/slave/containerizer/mesos/isolators/docker/volume/isolator.cpp 
> a6f05a8cf7eab2b2cc4c2142efdf3125462ec68e 
>   src/slave/containerizer/mesos/isolators/filesystem/linux.hpp 
> d933af4e5a605f63d3c4bc584e82de7f3f80da55 
>   

Re: Review Request 65694: Made the default executor's handling of kill escalations more robust.

2018-03-07 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65694/#review198833
---


Ship it!




LGTM.

Note to self: `s/it/executor/` in the commit description.

- Joseph Wu


On Feb. 16, 2018, 4:30 p.m., Gaston Kleiman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65694/
> ---
> 
> (Updated Feb. 16, 2018, 4:30 p.m.)
> 
> 
> Review request for mesos, Joseph Wu, Qian Zhang, and Vinod Kone.
> 
> 
> Bugs: MESOS-8530
> https://issues.apache.org/jira/browse/MESOS-8530
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch makes the default executor retry SIGKILL escalations if the
> it is disconnected from the agent or the kill call fails.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 8720dada8bc6ca66f9e0fec6dc265eda3dcc7407 
> 
> 
> Diff: https://reviews.apache.org/r/65694/diff/1/
> 
> 
> Testing
> ---
> 
> `sudo bin/mesos-tests.sh` on GNU/Linux
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>



Re: Review Request 65693: Made the default executor fail kills if the response isn't "200 OK".

2018-03-07 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65693/#review198670
---



What-is-basically-a-style-nit:


src/launcher/default_executor.cpp
Lines 1172-1176 (original), 1174-1183 (patched)


If you start the lambda with:
```
[=](const Response& response) -> Future
```

Then you won't need to wrap the two return statements with 
`Future(...)`.


- Joseph Wu


On Feb. 16, 2018, 4:25 p.m., Gaston Kleiman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65693/
> ---
> 
> (Updated Feb. 16, 2018, 4:25 p.m.)
> 
> 
> Review request for mesos, Joseph Wu, Qian Zhang, and Vinod Kone.
> 
> 
> Bugs: MESOS-8530
> https://issues.apache.org/jira/browse/MESOS-8530
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The default executor's `Future kill(const ContainerID&, int)`
> method returns `Nothing()` if the agent responded to the
> `KILL_NESTED_CONTAINER` call, regardless of the response.
> 
> This patch updates the method, so that it returns a failure if the
> response is not "200 OK".
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 8720dada8bc6ca66f9e0fec6dc265eda3dcc7407 
> 
> 
> Diff: https://reviews.apache.org/r/65693/diff/1/
> 
> 
> Testing
> ---
> 
> `sudo bin/mesos-tests.sh` on GNU/Linux
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>



Re: Review Request 65692: Changed the signature of a DefaultExecutor kill method.

2018-03-07 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65692/#review198667
---



I suppose it's lucky that `Owned` is implemented with `shared_ptr`, otherwise, 
we'd be double-freeing most of these.


src/launcher/default_executor.cpp
Line 986 (original), 986 (patched)


Here's another copy that should be made into a reference.



src/launcher/default_executor.cpp
Lines 1074-1076 (original), 1074-1076 (patched)


As an alternative, you could change this to take a `Owned&` 
instead.


- Joseph Wu


On Feb. 16, 2018, 4:24 p.m., Gaston Kleiman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65692/
> ---
> 
> (Updated Feb. 16, 2018, 4:24 p.m.)
> 
> 
> Review request for mesos, Joseph Wu, Qian Zhang, and Vinod Kone.
> 
> 
> Bugs: MESOS-8530
> https://issues.apache.org/jira/browse/MESOS-8530
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This method is being passed the container to be killed as
> `Owned`.
> 
> The container object is owned by the executor and stored in
> `LinkedHashMap containers`.
> 
> We should pass around raw pointers and not make copies of the stored
> `Owned` object, so this patch changes the signature of the `kill`
> method.
> 
> 
> Diffs
> -
> 
>   src/launcher/default_executor.cpp 8720dada8bc6ca66f9e0fec6dc265eda3dcc7407 
> 
> 
> Diff: https://reviews.apache.org/r/65692/diff/1/
> 
> 
> Testing
> ---
> 
> `sudo bin/mesos-tests.sh` on GNU/Linux
> 
> 
> Thanks,
> 
> Gaston Kleiman
> 
>



Review Request 65959: Added a test to make sure `slave/recovery_secs` is reported in metrics.

2018-03-07 Thread Zhitao Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65959/
---

Review request for mesos, Gilbert Song, Greg Mann, Jason Lai, and James Peach.


Bugs: MESOS-8609
https://issues.apache.org/jira/browse/MESOS-8609


Repository: mesos


Description
---

Added a test to make sure `slave/recovery_secs` is reported in metrics.


Diffs
-

  src/tests/slave_tests.cpp e253317a67019302f18afe11e2a314e716cec226 


Diff: https://reviews.apache.org/r/65959/diff/1/


Testing
---


Thanks,

Zhitao Li



Re: Review Request 65936: Changed flags to CreateFile to support Windows symlink path resolution.

2018-03-07 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65936/#review198829
---



Patch looks great!

Reviews applied: [65936]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On March 7, 2018, 6:24 p.m., John Kordich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65936/
> ---
> 
> (Updated March 7, 2018, 6:24 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Andrew Schwartzmeyer, Jeff Coffler, 
> and Joseph Wu.
> 
> 
> Bugs: MESOS-8646
> https://issues.apache.org/jira/browse/MESOS-8646
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Changed flags to CreateFile to support Windows symlink path resolution.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp 
> 858b3c7c27e95b9cf46a7457c1aaa05a0e19188d 
> 
> 
> Diff: https://reviews.apache.org/r/65936/diff/1/
> 
> 
> Testing
> ---
> 
> I encountered a permissions error stating "The process cannot access the file 
> because it is being used by another process." when attempting to download 
> stderr/stdout from a task running on a Windows agent, or when attempting to 
> download the mesos agent log on a Windows agent. 
> 
> After some very helpful information from Andy and Akash, I made this change 
> and confirmed it fixed this issue with the task/agent logs.
> 
> I also ran the automated tests on Windows and all tests pass.
> 
> I did not test this on Linux, but because this file is compiled only on 
> Windows, it shouldn't be necessary.
> 
> 
> Thanks,
> 
> John Kordich
> 
>



Re: Review Request 65954: Add a gauge for how long agent recovery takes.

2018-03-07 Thread Zhitao Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65954/
---

(Updated March 7, 2018, 11:20 p.m.)


Review request for mesos, Gilbert Song, Greg Mann, Jason Lai, and James Peach.


Bugs: MESOS-8609
https://issues.apache.org/jira/browse/MESOS-8609


Repository: mesos


Description
---

The new metric `slave/recover_secs` can be used to tell us how long
Mesos agent needed to finish its recovery cycle. This is an important
metric on agent machines which have a lot of completed executor
sandboxes.

Note that the metric 1) will only be available after recovery succeeded
and 2) never change its value across agent process lifecycle afterwards.


Diffs (updated)
-

  src/slave/metrics.hpp 3fc933ca65690d6fad63156398ad9c2c53789296 
  src/slave/metrics.cpp 0eb2b59ed67e14e73b29d7592c239441df0008d5 
  src/slave/slave.cpp e2facb3c15a2f907f6497c58a36842ed707f2c70 


Diff: https://reviews.apache.org/r/65954/diff/2/

Changes: https://reviews.apache.org/r/65954/diff/1-2/


Testing
---


Thanks,

Zhitao Li



Re: Review Request 65781: Added a TODO to clean up master::Call validation.

2018-03-07 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65781/#review198828
---


Ship it!




Ship It!

- Meng Zhu


On Feb. 23, 2018, 2:43 p.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65781/
> ---
> 
> (Updated Feb. 23, 2018, 2:43 p.m.)
> 
> 
> Review request for mesos, Michael Park and Meng Zhu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> See summary.
> 
> 
> Diffs
> -
> 
>   src/master/validation.hpp 7c129ceb929596acbb64d37025e055661277e6bf 
> 
> 
> Diff: https://reviews.apache.org/r/65781/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 65955: Fixed implicit conversion warnings.

2018-03-07 Thread James Peach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65955/#review198827
---


Ship it!




Ship It!

- James Peach


On March 7, 2018, 6:45 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65955/
> ---
> 
> (Updated March 7, 2018, 6:45 p.m.)
> 
> 
> Review request for mesos, Jie Yu, James Peach, and Zhitao Li.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch fixes three new "possible loss of data" implicit conversion
> warnings by using `static_cast` to the LHS type.
> 
> 
> Diffs
> -
> 
>   src/docker/docker.cpp bb81c4d68df95b0ccb2a0830e651c75f4493752e 
>   src/slave/containerizer/fetcher.cpp 
> a49411b7bac2d5a50a75d0b802842c2f61fe58c6 
>   src/slave/containerizer/mesos/containerizer.cpp 
> a61ed5bb4f60ec63aebd6bdabbc9a4b01235623e 
> 
> 
> Diff: https://reviews.apache.org/r/65955/diff/1/
> 
> 
> Testing
> ---
> 
> Ninja build on Linux and VS build on Windows. Warnings gone AFAICT.
> 
> We really need to turn on conversion warnings on the Linux build. Do any of 
> you know how? Until we do, I'm hesitant to add -Werror to the Windows build.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 65957: Updated Windows documentation.

2018-03-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65957/#review198823
---



PASS: Mesos patch 65957 was successfully built and tested.

Reviews applied: `['65719', '65721', '65720', '65957']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65957

- Mesos Reviewbot Windows


On March 7, 2018, 8:58 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65957/
> ---
> 
> (Updated March 7, 2018, 8:58 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Benjamin Bannier, Jeff Coffler, Eric 
> Mumau, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This notes that Developer Mode is required for non-admin symlinks,
> removes reference to the close JIRA epic, and adds information on how
> to build with Ninja on Windows.
> 
> 
> Diffs
> -
> 
>   docs/windows.md cf2c7c230ec9fc062b4a4fe721d7669ef646253c 
> 
> 
> Diff: https://reviews.apache.org/r/65957/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 65720: Windows: Fixed location of imported libraries for Ninja.

2018-03-07 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65720/
---

(Updated March 7, 2018, 1:44 p.m.)


Review request for mesos, Akash Gupta, Benjamin Bannier, Jeff Coffler, and 
Joseph Wu.


Bugs: MESOS-8599
https://issues.apache.org/jira/browse/MESOS-8599


Repository: mesos


Description (updated)
---

The Ninja generator does not emit binaries to "Debug" or "Release"
folders (unlike the Visual Studio generator), as it is not a
multi-configuration generator. To fix this, we explicitly set the
`IMPORTED_LOCATION` instead of `IMPORTED_LOCATION_(DEBUG|RELEASE)`
when using the Ninja generator on Windows. This does not reuse the
location used on non-Windows platforms, as they're not consistent
despite the generator being the same.

This also simplifies the `GET_BYPRODUCTS` function to set `BYPRODUCTS`
to an empty string when the generator is not Ninja, as it should be a
no-op for other generators.


Diffs (updated)
-

  3rdparty/CMakeLists.txt da605707b89bbe9b3db9e60bc0b0a26dac46e56e 


Diff: https://reviews.apache.org/r/65720/diff/4/

Changes: https://reviews.apache.org/r/65720/diff/3-4/


Testing
---


Thanks,

Andrew Schwartzmeyer



Re: Review Request 65719: Windows: Fixed CMake configuration for Ninja.

2018-03-07 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65719/
---

(Updated March 7, 2018, 1:43 p.m.)


Review request for mesos, Akash Gupta, Benjamin Bannier, Jeff Coffler, and 
Joseph Wu.


Summary (updated)
-

Windows: Fixed CMake configuration for Ninja.


Bugs: MESOS-8599
https://issues.apache.org/jira/browse/MESOS-8599


Repository: mesos


Description (updated)
---

Only the Visual Studio generator(s) should check the toolset. The
Ninja generator requires the environment to be setup in advance, and
does not have the concept of a toolset, so this warning broke the
Ninja build.

Instead of warning that we're deprecating some arbitrary generator, we
should just say we don't officially support it so the user knows it
may not work as expected.

Fixed the assumption that all generators on Windows are
multi-configuration. Ninja is single-configuration, so the debug CRT
flag must sometimes be set for `CMAKE_CXX_FLAGS` depending on
`CMAKE_BUILD_TYPE`.


Diffs (updated)
-

  cmake/CompilationConfigure.cmake 50cddf9476c8c5196c4824a7b060c2680a96b277 


Diff: https://reviews.apache.org/r/65719/diff/3/

Changes: https://reviews.apache.org/r/65719/diff/2-3/


Testing
---


Thanks,

Andrew Schwartzmeyer



Review Request 65957: Updated Windows documentation.

2018-03-07 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65957/
---

Review request for mesos, Akash Gupta, Benjamin Bannier, Jeff Coffler, Eric 
Mumau, and Joseph Wu.


Repository: mesos


Description
---

This notes that Developer Mode is required for non-admin symlinks,
removes reference to the close JIRA epic, and adds information on how
to build with Ninja on Windows.


Diffs
-

  docs/windows.md cf2c7c230ec9fc062b4a4fe721d7669ef646253c 


Diff: https://reviews.apache.org/r/65957/diff/1/


Testing
---


Thanks,

Andrew Schwartzmeyer



Re: Review Request 65720: Windows: Fixed location of imported libraries for Ninja.

2018-03-07 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65720/
---

(Updated March 7, 2018, 12:54 p.m.)


Review request for mesos, Akash Gupta, Benjamin Bannier, Jeff Coffler, and 
Joseph Wu.


Changes
---

Updated per review comments.


Bugs: MESOS-8599
https://issues.apache.org/jira/browse/MESOS-8599


Repository: mesos


Description
---

The Ninja generator does not emit binaries to "Debug" or "Release"
folders (unlike the Visual Studio generator), as it is not a
multi-configuration generator. To fix this, we explicitly set the
`IMPORTED_LOCATION` instead of `IMPORTED_LOCATION_(DEBUG|RELEASE)`
when using the Ninja generator on Windows. This does not reuse the
location used on non-Windows platforms, as they're not consistent
despite the generator being the same.


Diffs (updated)
-

  3rdparty/CMakeLists.txt da605707b89bbe9b3db9e60bc0b0a26dac46e56e 


Diff: https://reviews.apache.org/r/65720/diff/3/

Changes: https://reviews.apache.org/r/65720/diff/2-3/


Testing
---


Thanks,

Andrew Schwartzmeyer



Re: Review Request 65930: Start heartbeater after SUBSCRIBED event.

2018-03-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65930/#review198817
---



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['65930']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65930

Relevant logs:

- 
[mesos-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65930/logs/mesos-tests-stdout.log):

```
[   OK ] Endpoint/SlaveEndpointTest.NoAuthorizer/2 (137 ms)
[--] 9 tests from Endpoint/SlaveEndpointTest (1254 ms total)

[--] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest
[ RUN  ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0
[   OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0 (38 
ms)
[ RUN  ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1
[   OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1 (42 
ms)
[--] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest (84 ms 
total)

[--] 1 test from IsolationFlag/CpuIsolatorTest
[ RUN  ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0
[   OK ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0 (2474 ms)
[--] 1 test from IsolationFlag/CpuIsolatorTest (2499 ms total)

[--] 1 test from IsolationFlag/MemoryIsolatorTest
[ RUN  ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0
[   OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (2480 ms)
[--] 1 test from IsolationFlag/MemoryIsolatorTest (2506 ms total)

[--] Global test environment tear-down
[==] 916 tests from 91 test cases ran. (474304 ms total)
[  PASSED  ] 915 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] CommandExecutorCheckTest.CommandCheckTimeout

 1 FAILED TEST
  YOU HAVE 210 DISABLED TESTS

```

- 
[mesos-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65930/logs/mesos-tests-stderr.log):

```
I0307 20:19:11.253438  1632 master.cpp:10210] Updating the state of task 
2cf6c68f-5454-4267-a4d3-5b853f619eca of framework 
0c730431-b34c-46e1-95ac-1b545365f0a1- (latest state: TASK_KILLED, status 
update state: TASK_KILLED)
I0307 20:19:11.253438  3464 slave.cpp:3878] Shutting down framework 
0c730431-b34c-46e1-95ac-1b545365f0a1-
I0307 20:19:11.254441  3464 slave.cpp:6571] Shutting down executor 
'2cf6c68f-5454-4267-a4d3-5b853f619eca' of framework 
0c730431-b34c-46e1-95ac-1b545365f0a1- at executor(1)@10.3.1.11:50038
I0307 20:19:11.255725  3464 slave.cpp:924] Agent terminating
W0307 20:19:11.255725  3464 slave.cpp:3874] Ignoring shutdown framework 
0c730431-b34c-46e1-95aI0307 20:19:10.544463  8536 exec.cpp:162] Version: 1.6.0
I0307 20:19:10.575470  6352 exec.cpp:236] Executor registered on agent 
0c730431-b34c-46e1-95ac-1b545365f0a1-S0
I0307 20:19:10.580438  9528 executor.cpp:176] Received SUBSCRIBED event
I0307 20:19:10.585465  9528 executor.cpp:180] Subscribed executor on 
build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net
I0307 20:19:10.586465  9528 executor.cpp:176] Received LAUNCH event
I0307 20:19:10.591440  9528 executor.cpp:648] Starting task 
2cf6c68f-5454-4267-a4d3-5b853f619eca
I0307 20:19:10.674466  9528 executor.cpp:483] Running 
'D:\DCOS\mesos\src\mesos-containerizer.exe launch '
I0307 20:19:11.219439  9528 executor.cpp:661] Forked command at 4920
I0307 20:19:11.256438  9780 exec.cpp:445] Executor asked to shutdown
I0307 20:19:11.257442  9528 executor.cpp:176] Received SHUTDOWN event
I0307 20:19:11.257442  9528 executor.cpp:758] Shutting down
I0307 20:19:11.258440  9528 executor.cpp:868] Sending SIGTERM to process tree 
at pid 4c-1b545365f0a1- because it is terminating
I0307 20:19:11.256438  1632 master.cpp:10309] Removing task 
2cf6c68f-5454-4267-a4d3-5b853f619eca with resources cpus(allocated: *):4; 
mem(allocated: *):2048; disk(allocated: *):1024; ports(allocated: 
*):[31000-32000] of framework 0c730431-b34c-46e1-95ac-1b545365f0a1- on 
agent 0c730431-b34c-46e1-95ac-1b545365f0a1-S0 at slave(398)@10.3.1.11:50017 
(build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0307 20:19:11.259440  1632 master.cpp:1306] Agent 
0c730431-b34c-46e1-95ac-1b545365f0a1-S0 at slave(398)@10.3.1.11:50017 
(build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net) disconnected
I0307 20:19:11.259440  1632 master.cpp:3276] Disconnecting agent 
0c730431-b34c-46e1-95ac-1b545365f0a1-S0 at slave(398)@10.3.1.11:50017 
(build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0307 20:19:11.259440  1632 master.cpp:3295] Deactivating agent 
0c730431-b34c-46e1-95ac-1b545365f0a1-S0 at slave(398)@10.3.1.11:50017 
(build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0307 20:19:11.260437  3464 hierarchical.cpp:344] 

Re: Review Request 65955: Fixed implicit conversion warnings.

2018-03-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65955/#review198816
---



PASS: Mesos patch 65955 was successfully built and tested.

Reviews applied: `['65955']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65955

- Mesos Reviewbot Windows


On March 7, 2018, 6:45 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65955/
> ---
> 
> (Updated March 7, 2018, 6:45 p.m.)
> 
> 
> Review request for mesos, Jie Yu, James Peach, and Zhitao Li.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch fixes three new "possible loss of data" implicit conversion
> warnings by using `static_cast` to the LHS type.
> 
> 
> Diffs
> -
> 
>   src/docker/docker.cpp bb81c4d68df95b0ccb2a0830e651c75f4493752e 
>   src/slave/containerizer/fetcher.cpp 
> a49411b7bac2d5a50a75d0b802842c2f61fe58c6 
>   src/slave/containerizer/mesos/containerizer.cpp 
> a61ed5bb4f60ec63aebd6bdabbc9a4b01235623e 
> 
> 
> Diff: https://reviews.apache.org/r/65955/diff/1/
> 
> 
> Testing
> ---
> 
> Ninja build on Linux annd VS build on Windows. Warnings gone AFAICT.
> 
> We really need to turn on conversion warnings on the Linux build. Do any of 
> you know how? Until we do, I'm hesitant to add -Werror to the Windows build.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 65954: Add a gauge for how long agent recovery takes.

2018-03-07 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65954/#review198813
---



Patch looks great!

Reviews applied: [65954]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On March 7, 2018, 9:38 a.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65954/
> ---
> 
> (Updated March 7, 2018, 9:38 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jason Lai, and James Peach.
> 
> 
> Bugs: MESOS-8609
> https://issues.apache.org/jira/browse/MESOS-8609
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The new metric `slave/recover_secs` can be used to tell us how long
> Mesos agent needed to finish its recovery cycle. This is an important
> metric on agent machines which have a lot of completed executor
> sandboxes.
> 
> Note that the metric 1) will only be available after recovery succeeded
> and 2) never change its value across agent process lifecycle afterwards.
> 
> 
> Diffs
> -
> 
>   src/slave/metrics.hpp 3fc933ca65690d6fad63156398ad9c2c53789296 
>   src/slave/metrics.cpp 0eb2b59ed67e14e73b29d7592c239441df0008d5 
>   src/slave/slave.cpp e2facb3c15a2f907f6497c58a36842ed707f2c70 
> 
> 
> Diff: https://reviews.apache.org/r/65954/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Re: Review Request 63992: Handled incorrect offer operation IDs in master's ACCEPT handler.

2018-03-07 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63992/#review198812
---




src/master/master.cpp
Line 4136 (original), 4184 (patched)


If the operation has an operation ID, we'll always land on this `continue` 
even if the validations/checks pass.

This means that line #4187 will be skipped, and operations with an operatin 
ID won't be added to the object passed to the `_accept` continuation.


- Gaston Kleiman


On March 2, 2018, 2:52 p.m., Greg Mann wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63992/
> ---
> 
> (Updated March 2, 2018, 2:52 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Gaston Kleiman, and Jie Yu.
> 
> 
> Bugs: MESOS-8190
> https://issues.apache.org/jira/browse/MESOS-8190
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds code to send operation status updates in the master's
> ACCEPT call handler. In cases where operations are dropped and in
> cases where offer operation IDs are set when they should not be, the
> master will send operation status updates for the dropped operations.
> 
> 
> Diffs
> -
> 
>   src/master/master.cpp 9cea7bb6a6ee8bf5f4226d07111bcfa6f5d3a88c 
> 
> 
> Diff: https://reviews.apache.org/r/63992/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Greg Mann
> 
>



Re: Review Request 65936: Changed flags to CreateFile to support Windows symlink path resolution.

2018-03-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65936/#review198810
---



PASS: Mesos patch 65936 was successfully built and tested.

Reviews applied: `['65936']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65936

- Mesos Reviewbot Windows


On March 7, 2018, 6:24 p.m., John Kordich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65936/
> ---
> 
> (Updated March 7, 2018, 6:24 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Andrew Schwartzmeyer, Jeff Coffler, 
> and Joseph Wu.
> 
> 
> Bugs: MESOS-8646
> https://issues.apache.org/jira/browse/MESOS-8646
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Changed flags to CreateFile to support Windows symlink path resolution.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp 
> 858b3c7c27e95b9cf46a7457c1aaa05a0e19188d 
> 
> 
> Diff: https://reviews.apache.org/r/65936/diff/1/
> 
> 
> Testing
> ---
> 
> I encountered a permissions error stating "The process cannot access the file 
> because it is being used by another process." when attempting to download 
> stderr/stdout from a task running on a Windows agent, or when attempting to 
> download the mesos agent log on a Windows agent. 
> 
> After some very helpful information from Andy and Akash, I made this change 
> and confirmed it fixed this issue with the task/agent logs.
> 
> I also ran the automated tests on Windows and all tests pass.
> 
> I did not test this on Linux, but because this file is compiled only on 
> Windows, it shouldn't be necessary.
> 
> 
> Thanks,
> 
> John Kordich
> 
>



Re: Review Request 65930: Start heartbeater after SUBSCRIBED event.

2018-03-07 Thread Zhitao Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65930/
---

(Updated March 7, 2018, 6:54 p.m.)


Review request for mesos, Greg Mann, Jason Lai, Quinn Leng, and Vinod Kone.


Bugs: MESOS-8641
https://issues.apache.org/jira/browse/MESOS-8641


Repository: mesos


Description
---

https://reviews.apache.org/r/61262 added heartbeat support to
event stream, but the implementation could send a `HEARTBEAT`
event before `SUBSCRIBED` event, which changed previous behavior.

This patch fixed the issue by starting heartbeater only after
`SUBSCRIBED` event is sent.


Diffs (updated)
-

  src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 


Diff: https://reviews.apache.org/r/65930/diff/3/

Changes: https://reviews.apache.org/r/65930/diff/2-3/


Testing
---


Thanks,

Zhitao Li



Re: Review Request 65936: Changed flags to CreateFile to support Windows symlink path resolution.

2018-03-07 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65936/#review198806
---




3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp
Lines 182-184 (original), 182-184 (patched)


This comment needs to be updated.



3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp
Line 194 (original), 194 (patched)


Actually, since other changes should be made, can you add the comment?


- Andrew Schwartzmeyer


On March 7, 2018, 10:24 a.m., John Kordich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65936/
> ---
> 
> (Updated March 7, 2018, 10:24 a.m.)
> 
> 
> Review request for mesos, Akash Gupta, Andrew Schwartzmeyer, Jeff Coffler, 
> and Joseph Wu.
> 
> 
> Bugs: MESOS-8646
> https://issues.apache.org/jira/browse/MESOS-8646
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Changed flags to CreateFile to support Windows symlink path resolution.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp 
> 858b3c7c27e95b9cf46a7457c1aaa05a0e19188d 
> 
> 
> Diff: https://reviews.apache.org/r/65936/diff/1/
> 
> 
> Testing
> ---
> 
> I encountered a permissions error stating "The process cannot access the file 
> because it is being used by another process." when attempting to download 
> stderr/stdout from a task running on a Windows agent, or when attempting to 
> download the mesos agent log on a Windows agent. 
> 
> After some very helpful information from Andy and Akash, I made this change 
> and confirmed it fixed this issue with the task/agent logs.
> 
> I also ran the automated tests on Windows and all tests pass.
> 
> I did not test this on Linux, but because this file is compiled only on 
> Windows, it shouldn't be necessary.
> 
> 
> Thanks,
> 
> John Kordich
> 
>



Re: Review Request 65310: Added lambda::zip.

2018-03-07 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65310/#review198805
---


Fix it, then Ship it!





3rdparty/stout/tests/lambda_tests.cpp
Lines 18-19 (patched)


I don't think these includes are necessary?

I do see a few missing, however:









- Greg Mann


On March 6, 2018, 10:11 a.m., Alexander Rojas wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65310/
> ---
> 
> (Updated March 6, 2018, 10:11 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Greg Mann.
> 
> 
> Bugs: MESOS-8434
> https://issues.apache.org/jira/browse/MESOS-8434
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Introduces the abstractions `lambda::zip` and `lambda::zipto` which
> provides a functionality similar to python's  `zip()` call.
> 
> This patch supersedes the stalled review
> https://reviews.apache.org/r/63292/.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/lambda.hpp 
> c9212be27b45da45f15371689f2aabf18c98657b 
>   3rdparty/stout/tests/lambda_tests.cpp 
> 11b034306b11b95efa9bcd9811c5145ad39314a5 
> 
> 
> Diff: https://reviews.apache.org/r/65310/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Alexander Rojas
> 
>



Re: Review Request 65936: Changed flags to CreateFile to support Windows symlink path resolution.

2018-03-07 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65936/#review198804
---




3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp
Lines 190-200 (original), 190-200 (patched)


What about `get_handle_no_follow`?


- Andrew Schwartzmeyer


On March 7, 2018, 10:24 a.m., John Kordich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65936/
> ---
> 
> (Updated March 7, 2018, 10:24 a.m.)
> 
> 
> Review request for mesos, Akash Gupta, Andrew Schwartzmeyer, Jeff Coffler, 
> and Joseph Wu.
> 
> 
> Bugs: MESOS-8646
> https://issues.apache.org/jira/browse/MESOS-8646
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Changed flags to CreateFile to support Windows symlink path resolution.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp 
> 858b3c7c27e95b9cf46a7457c1aaa05a0e19188d 
> 
> 
> Diff: https://reviews.apache.org/r/65936/diff/1/
> 
> 
> Testing
> ---
> 
> I encountered a permissions error stating "The process cannot access the file 
> because it is being used by another process." when attempting to download 
> stderr/stdout from a task running on a Windows agent, or when attempting to 
> download the mesos agent log on a Windows agent. 
> 
> After some very helpful information from Andy and Akash, I made this change 
> and confirmed it fixed this issue with the task/agent logs.
> 
> I also ran the automated tests on Windows and all tests pass.
> 
> I did not test this on Linux, but because this file is compiled only on 
> Windows, it shouldn't be necessary.
> 
> 
> Thanks,
> 
> John Kordich
> 
>



Review Request 65955: Fixed implicit conversion warnings.

2018-03-07 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65955/
---

Review request for mesos, Jie Yu, James Peach, and Zhitao Li.


Repository: mesos


Description
---

This patch fixes three new "possible loss of data" implicit conversion
warnings by using `static_cast` to the RHS type.


Diffs
-

  src/docker/docker.cpp bb81c4d68df95b0ccb2a0830e651c75f4493752e 
  src/slave/containerizer/fetcher.cpp a49411b7bac2d5a50a75d0b802842c2f61fe58c6 
  src/slave/containerizer/mesos/containerizer.cpp 
a61ed5bb4f60ec63aebd6bdabbc9a4b01235623e 


Diff: https://reviews.apache.org/r/65955/diff/1/


Testing
---

Ninja build on Linux annd VS build on Windows. Warnings gone AFAICT.

We really need to turn on conversion warnings on the Linux build. Do any of you 
know how? Until we do, I'm hesitant to add -Werror to the Windows build.


Thanks,

Andrew Schwartzmeyer



Re: Review Request 65720: Windows: Fixed location of imported libraries for Ninja.

2018-03-07 Thread Andrew Schwartzmeyer


> On March 7, 2018, 4:21 a.m., Benjamin Bannier wrote:
> > 3rdparty/CMakeLists.txt
> > Lines 158-160 (original), 158-165 (patched)
> > 
> >
> > The paths for vs look more specific to me, so I wonder whether it maes 
> > sense to either turn the logic around (`if (VC) ... else ()`), or use 
> > `elseif` instead of a catch all `else`.
> > 
> > Here and below.

I'm not following. There are only two supported generators on Windows: VS and 
Ninja. Flipping this `if` around is just style that doesn't change anything. 
(NOTE: This is all in the WIN32 block.)


- Andrew


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65720/#review198783
---


On March 6, 2018, 12:05 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65720/
> ---
> 
> (Updated March 6, 2018, 12:05 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Benjamin Bannier, Jeff Coffler, and 
> Joseph Wu.
> 
> 
> Bugs: MESOS-8599
> https://issues.apache.org/jira/browse/MESOS-8599
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The Ninja generator does not emit binaries to "Debug" or "Release"
> folders (unlike the Visual Studio generator), as it is not a
> multi-configuration generator. To fix this, we explicitly set the
> `IMPORTED_LOCATION` instead of `IMPORTED_LOCATION_(DEBUG|RELEASE)`
> when using the Ninja generator on Windows. This does not reuse the
> location used on non-Windows platforms, as they're not consistent
> despite the generator being the same.
> 
> 
> Diffs
> -
> 
>   3rdparty/CMakeLists.txt da605707b89bbe9b3db9e60bc0b0a26dac46e56e 
> 
> 
> Diff: https://reviews.apache.org/r/65720/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 65936: Changed flags to CreateFile to support Windows symlink path resolution.

2018-03-07 Thread Andrew Schwartzmeyer

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65936/#review198801
---


Fix it, then Ship it!





3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp
Line 194 (original), 194 (patched)


I'll add a comment detailing why we open with these sharing permissions 
when I commit.


- Andrew Schwartzmeyer


On March 7, 2018, 10:24 a.m., John Kordich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65936/
> ---
> 
> (Updated March 7, 2018, 10:24 a.m.)
> 
> 
> Review request for mesos, Akash Gupta, Andrew Schwartzmeyer, Jeff Coffler, 
> and Joseph Wu.
> 
> 
> Bugs: MESOS-8646
> https://issues.apache.org/jira/browse/MESOS-8646
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Changed flags to CreateFile to support Windows symlink path resolution.
> 
> 
> Diffs
> -
> 
>   3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp 
> 858b3c7c27e95b9cf46a7457c1aaa05a0e19188d 
> 
> 
> Diff: https://reviews.apache.org/r/65936/diff/1/
> 
> 
> Testing
> ---
> 
> I encountered a permissions error stating "The process cannot access the file 
> because it is being used by another process." when attempting to download 
> stderr/stdout from a task running on a Windows agent, or when attempting to 
> download the mesos agent log on a Windows agent. 
> 
> After some very helpful information from Andy and Akash, I made this change 
> and confirmed it fixed this issue with the task/agent logs.
> 
> I also ran the automated tests on Windows and all tests pass.
> 
> I did not test this on Linux, but because this file is compiled only on 
> Windows, it shouldn't be necessary.
> 
> 
> Thanks,
> 
> John Kordich
> 
>



Re: Review Request 65936: Changed flags to CreateFile to support Windows symlink path resolution.

2018-03-07 Thread John Kordich via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65936/
---

(Updated March 7, 2018, 10:24 a.m.)


Review request for mesos, Akash Gupta, Andrew Schwartzmeyer, and Jeff Coffler.


Changes
---

Added bug.


Bugs: MESOS-8646
https://issues.apache.org/jira/browse/MESOS-8646


Repository: mesos


Description
---

Changed flags to CreateFile to support Windows symlink path resolution.


Diffs
-

  3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp 
858b3c7c27e95b9cf46a7457c1aaa05a0e19188d 


Diff: https://reviews.apache.org/r/65936/diff/1/


Testing
---

I encountered a permissions error stating "The process cannot access the file 
because it is being used by another process." when attempting to download 
stderr/stdout from a task running on a Windows agent, or when attempting to 
download the mesos agent log on a Windows agent. 

After some very helpful information from Andy and Akash, I made this change and 
confirmed it fixed this issue with the task/agent logs.

I also ran the automated tests on Windows and all tests pass.

I did not test this on Linux, but because this file is compiled only on 
Windows, it shouldn't be necessary.


Thanks,

John Kordich



Re: Review Request 65930: Start heartbeater after SUBSCRIBED event.

2018-03-07 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65930/#review198800
---


Fix it, then Ship it!





src/master/http.cpp
Lines 873 (patched)


Nit: could you move the declaration of the connection above L861?

Then we have the pattern:










- Greg Mann


On March 6, 2018, 9:07 p.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65930/
> ---
> 
> (Updated March 6, 2018, 9:07 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Jason Lai, Quinn Leng, and Vinod Kone.
> 
> 
> Bugs: MESOS-8641
> https://issues.apache.org/jira/browse/MESOS-8641
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> https://reviews.apache.org/r/61262 added heartbeat support to
> event stream, but the implementation could send a `HEARTBEAT`
> event before `SUBSCRIBED` event, which changed previous behavior.
> 
> This patch fixed the issue by starting heartbeater only after
> `SUBSCRIBED` event is sent.
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
> 
> 
> Diff: https://reviews.apache.org/r/65930/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Re: Review Request 65591: Explicitly tracked resource providers in master.

2018-03-07 Thread Greg Mann


> On March 7, 2018, 3:13 a.m., Greg Mann wrote:
> > src/master/master.cpp
> > Lines 7324 (patched)
> > 
> >
> > Is there any case where `message.has_resource_version_uuid() && 
> > slave->resourceVersion.isNone()` will be true?
> > 
> > I'm wondering if we should do something like:
> > ```
> >   if (!updated && message.has_resource_version_uuid()) {
> > CHECK_SOME(slave->resourceVersion);
> > if (message.resource_version_uuid() != 
> > slave->resourceVersion.get()) {
> >   updated = true;
> > }
> >   }
> > ```
> 
> Benjamin Bannier wrote:
> It is possible on the proto level since e.g., 
> `RegisterSlaveMessage.resource_version_uuid` is `optional`. I'd argue that 
> the currently proposed implementation is more conservative in that it would 
> run the full handler even in the case you point out instead of aborting.
> 
> I'd suggest we keep the impl like it is currently. Dropping.

What does it mean if `message.has_resource_version_uuid() && 
slave->resourceVersion.isNone()` is true? IIUC, that means that the agent 
previously registered with no resource version UUID for agent default 
resources, and it is now providing a resource version UUID in an 
UpdateSlaveMessage. Doesn't this indicate an error in the agent? Does it make 
sense to run the rest of the handler in such a case? I'm not sure what kind of 
precedent we have in the code for aborting the master on unexpected input from 
an agent, but if that's something we're in the habit of doing, this would seem 
like a suitable place for it.


> On March 7, 2018, 3:13 a.m., Greg Mann wrote:
> > src/master/master.cpp
> > Line 7722 (original), 7680-7681 (patched)
> > 
> >
> > Perhaps we also want to rescind offers when 
> > `message.has_resource_providers()` is false? In that case, the agent has 
> > told us that it has no registered RPs, so offers with RP resources would 
> > not be valid?
> 
> Benjamin Bannier wrote:
> I do not think that is true. Only when the `resource_providers` field is 
> set does the agent send any resource provider information. The field could 
> contain information on zero or more resource providers in its `providers` 
> field, but we do not distinguish that here, exactly as you suggest.
> 
> Dropping.

IIUC, the agent always sends info on all registered RPs in each 
UpdateSlaveMessage. So, if the master receives an UpdateSlaveMessage which 
indicates that no RPs are currently registered on an agent, doesn't that mean 
that offers for that agent with RP resources are currently invalid?


> On March 7, 2018, 3:13 a.m., Greg Mann wrote:
> > src/master/master.cpp
> > Lines 10572-10574 (original), 10534-10536 (patched)
> > 
> >
> > Are you sure this will always be true? Could a framework send us a 
> > well-formed, but unknown RP ID in a message and make this false?
> 
> Benjamin Bannier wrote:
> This is an important invariant of the current implementation, see 
> https://issues.apache.org/jira/browse/MESOS-8321.
> 
> Dropping.

Are we sure that invariant holds in this case? The resource provider ID here is 
framework-supplied, and I don't think we validate that it's present in the 
master state before we hit this code.


- Greg


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65591/#review198741
---


On March 7, 2018, 11:36 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65591/
> ---
> 
> (Updated March 7, 2018, 11:36 a.m.)
> 
> 
> Review request for mesos, Greg Mann, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-8382
> https://issues.apache.org/jira/browse/MESOS-8382
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds explicit tracking of resource providers to the master
> process. While we already had explicitly send resource provider
> information in e.g., `UpdateSlaveMessage`, we only stored that
> information aggregated over the full agent in the master up to now.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.cpp 9c5fb97afb58f98013b79f3cbaea7dacc3603271 
>   src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
>   src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
>   src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 
> 
> 
> Diff: https://reviews.apache.org/r/65591/diff/8/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 65954: Add a gauge for how long agent recovery takes.

2018-03-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65954/#review198799
---



PASS: Mesos patch 65954 was successfully built and tested.

Reviews applied: `['65954']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65954

- Mesos Reviewbot Windows


On March 7, 2018, 9:38 a.m., Zhitao Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65954/
> ---
> 
> (Updated March 7, 2018, 9:38 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Jason Lai, and James Peach.
> 
> 
> Bugs: MESOS-8609
> https://issues.apache.org/jira/browse/MESOS-8609
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The new metric `slave/recover_secs` can be used to tell us how long
> Mesos agent needed to finish its recovery cycle. This is an important
> metric on agent machines which have a lot of completed executor
> sandboxes.
> 
> Note that the metric 1) will only be available after recovery succeeded
> and 2) never change its value across agent process lifecycle afterwards.
> 
> 
> Diffs
> -
> 
>   src/slave/metrics.hpp 3fc933ca65690d6fad63156398ad9c2c53789296 
>   src/slave/metrics.cpp 0eb2b59ed67e14e73b29d7592c239441df0008d5 
>   src/slave/slave.cpp e2facb3c15a2f907f6497c58a36842ed707f2c70 
> 
> 
> Diff: https://reviews.apache.org/r/65954/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>



Review Request 65936: Changed flags to CreateFile to support Windows symlink path resolution.

2018-03-07 Thread John Kordich via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65936/
---

Review request for mesos, Akash Gupta, Andrew Schwartzmeyer, and Jeff Coffler.


Repository: mesos


Description
---

Changed flags to CreateFile to support Windows symlink path resolution.


Diffs
-

  3rdparty/stout/include/stout/internal/windows/reparsepoint.hpp 
858b3c7c27e95b9cf46a7457c1aaa05a0e19188d 


Diff: https://reviews.apache.org/r/65936/diff/1/


Testing
---

I encountered a permissions error stating "The process cannot access the file 
because it is being used by another process." when attempting to download 
stderr/stdout from a task running on a Windows agent, or when attempting to 
download the mesos agent log on a Windows agent. 

After some very helpful information from Andy and Akash, I made this change and 
confirmed it fixed this issue with the task/agent logs.

I also ran the automated tests on Windows and all tests pass.

I did not test this on Linux, but because this file is compiled only on 
Windows, it shouldn't be necessary.


Thanks,

John Kordich



Review Request 65954: Add a gauge for how long agent recovery takes.

2018-03-07 Thread Zhitao Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65954/
---

Review request for mesos, Gilbert Song and Jason Lai.


Bugs: MESOS-8609
https://issues.apache.org/jira/browse/MESOS-8609


Repository: mesos


Description
---

The new metric `slave/recover_secs` can be used to tell us how long
Mesos agent needed to finish its recovery cycle. This is an important
metric on agent machines which have a lot of completed executor
sandboxes.

Note that the metric 1) will only be available after recovery succeeded
and 2) never change its value across agent process lifecycle afterwards.


Diffs
-

  src/slave/metrics.hpp 3fc933ca65690d6fad63156398ad9c2c53789296 
  src/slave/metrics.cpp 0eb2b59ed67e14e73b29d7592c239441df0008d5 
  src/slave/slave.cpp e2facb3c15a2f907f6497c58a36842ed707f2c70 


Diff: https://reviews.apache.org/r/65954/diff/1/


Testing
---


Thanks,

Zhitao Li



Re: Review Request 65905: Used SHA512 for release file checksums.

2018-03-07 Thread Till Toenshoff

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65905/#review198794
---


Ship it!





support/vote.sh
Lines 36-38 (patched)


Thanks a bunch for this - much appreciated.


- Till Toenshoff


On March 7, 2018, 12:37 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65905/
> ---
> 
> (Updated March 7, 2018, 12:37 p.m.)
> 
> 
> Review request for mesos, Kapil Arya, Till Toenshoff, and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Apache now requires SHA checksum files instead of the previously
> required MD5, see the [signing recommendations](1). This patch updates
> the Mesos vote and release tooling to accommodate that change in
> policy. We use SHA512 as recommended in the [Apache SHA checksum
> FAQ](2).
> 
> We also fix the format of the produced digest file to be compatible
> with `sha512sum` to ease automatic release verification.
> 
> [1]: http://www.apache.org/dev/release-distribution#sigs-and-sums
> [2]: http://www.apache.org/dev/release-signing#sha-checksum
> 
> 
> Diffs
> -
> 
>   support/release.sh 3aeda92e6bd48683cf609fa527633cd47b9f7dce 
>   support/vote.sh 649eebc6b5fe1b3783ae0c2c1706f1349ddc436c 
> 
> 
> Diff: https://reviews.apache.org/r/65905/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 65906: Updated site's middleman to latest 3.x release.

2018-03-07 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65906/#review198792
---



Patch looks great!

Reviews applied: [65906]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On March 5, 2018, 4:42 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65906/
> ---
> 
> (Updated March 5, 2018, 4:42 p.m.)
> 
> 
> Review request for mesos, Armand Grillet, Benjamin Mahler, Greg Mann, and 
> haosdent huang.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Updated site's middleman to latest 3.x release.
> 
> 
> Diffs
> -
> 
>   site/Gemfile 877fe914a9787c140848fdf9958571fec5fa58ff 
>   site/Gemfile.lock 909f3f3badeaa47c80929e243ce36307766edee4 
> 
> 
> Diff: https://reviews.apache.org/r/65906/diff/2/
> 
> 
> Testing
> ---
> 
> Locally tested the generated website: `bundle install && bundle exec rake && 
> bundle exec rake dev`.
> 
> The only manually updated file was `site/Gemfile`; `site/Gemfile.lock` was 
> updated automatically by `bundler`.
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 65905: Used SHA512 for release file checksums.

2018-03-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65905/#review198789
---



PASS: Mesos patch 65905 was successfully built and tested.

Reviews applied: `['65905']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65905

- Mesos Reviewbot Windows


On March 7, 2018, 12:37 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65905/
> ---
> 
> (Updated March 7, 2018, 12:37 p.m.)
> 
> 
> Review request for mesos, Kapil Arya, Till Toenshoff, and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Apache now requires SHA checksum files instead of the previously
> required MD5, see the [signing recommendations](1). This patch updates
> the Mesos vote and release tooling to accommodate that change in
> policy. We use SHA512 as recommended in the [Apache SHA checksum
> FAQ](2).
> 
> We also fix the format of the produced digest file to be compatible
> with `sha512sum` to ease automatic release verification.
> 
> [1]: http://www.apache.org/dev/release-distribution#sigs-and-sums
> [2]: http://www.apache.org/dev/release-signing#sha-checksum
> 
> 
> Diffs
> -
> 
>   support/release.sh 3aeda92e6bd48683cf609fa527633cd47b9f7dce 
>   support/vote.sh 649eebc6b5fe1b3783ae0c2c1706f1349ddc436c 
> 
> 
> Diff: https://reviews.apache.org/r/65905/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 65906: Updated site's middleman to latest 3.x release.

2018-03-07 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65906/#review198788
---



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['65906']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65906

Relevant logs:

- 
[mesos-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65906/logs/mesos-tests-stdout.log):

```
[   OK ] Endpoint/SlaveEndpointTest.NoAuthorizer/2 (130 ms)
[--] 9 tests from Endpoint/SlaveEndpointTest (1177 ms total)

[--] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest
[ RUN  ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0
[   OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/0 (38 
ms)
[ RUN  ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1
[   OK ] ContainerizerType/DefaultContainerDNSFlagTest.ValidateFlag/1 (47 
ms)
[--] 2 tests from ContainerizerType/DefaultContainerDNSFlagTest (86 ms 
total)

[--] 1 test from IsolationFlag/CpuIsolatorTest
[ RUN  ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0
[   OK ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0 (2434 ms)
[--] 1 test from IsolationFlag/CpuIsolatorTest (2459 ms total)

[--] 1 test from IsolationFlag/MemoryIsolatorTest
[ RUN  ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0
[   OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (2468 ms)
[--] 1 test from IsolationFlag/MemoryIsolatorTest (2494 ms total)

[--] Global test environment tear-down
[==] 916 tests from 91 test cases ran. (474789 ms total)
[  PASSED  ] 915 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] CommandExecutorCheckTest.CommandCheckTimeout

 1 FAILED TEST
  YOU HAVE 210 DISABLED TESTS

```

- 
[mesos-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65906/logs/mesos-tests-stderr.log):

```
I0307 13:42:59.527459  7508 slave.cpp:3878] Shutting down framework 
cffb23f4-885b-43eb-91b7-806efc5320b5-
I0307 13:42:59.527459  6056 master.cpp:10210] Updating the state of task 
41121d5f-1fcb-4680-b525-3d7effe48d50 of framework 
cffb23f4-885b-43eb-91b7-806efc5320b5- (latest state: TASK_KILLED, status 
update state: TASK_KILLED)
I0307 13:42:59.528472  7508 slave.cpp:6571] Shutting down executor 
'41121d5f-1fcb-4680-b525-3d7effe48d50' of framework 
cffb23f4-885b-43eb-91b7-806efc5320b5- at executor(1)@10.3.1.11:61511
I0307 13:42:59.529444  7508 slave.cpp:924] Agent terminating
W0307 13:42:59.529444  7508 slave.cpp:3874] Ignoring shutdown framework 
cffb23f4-885b-43eb-91b7-806efc5320b5- because it is terminatiI0307 
13:42:58.816445  8468 exec.cpp:162] Version: 1.6.0
I0307 13:42:58.846470  8848 exec.cpp:236] Executor registered on agent 
cffb23f4-885b-43eb-91b7-806efc5320b5-S0
I0307 13:42:58.850469  7500 executor.cpp:176] Received SUBSCRIBED event
I0307 13:42:58.855469  7500 executor.cpp:180] Subscribed executor on 
build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net
I0307 13:42:58.856470  7500 executor.cpp:176] Received LAUNCH event
I0307 13:42:58.861469  7500 executor.cpp:648] Starting task 
41121d5f-1fcb-4680-b525-3d7effe48d50
I0307 13:42:58.944473  7500 executor.cpp:483] Running 
'D:\DCOS\mesos\src\mesos-containerizer.exe launch '
I0307 13:42:59.494446  7500 executor.cpp:661] Forked command at 4264
I0307 13:42:59.530447  7808 exec.cpp:445] Executor asked to shutdown
I0307 13:42:59.531447  7500 executor.cpp:176] Received SHUTDOWN event
I0307 13:42:59.531447  7500 executor.cpp:758] Shutting down
I0307 13:42:59.531447  7500 executor.cpp:868] Sending SIGTERM to process tree 
at pid 4ng
I0307 13:42:59.530447  6056 master.cpp:10309] Removing task 
41121d5f-1fcb-4680-b525-3d7effe48d50 with resources cpus(allocated: *):4; 
mem(allocated: *):2048; disk(allocated: *):1024; ports(allocated: 
*):[31000-32000] of framework cffb23f4-885b-43eb-91b7-806efc5320b5- on 
agent cffb23f4-885b-43eb-91b7-806efc5320b5-S0 at slave(398)@10.3.1.11:61490 
(build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I0307 13:42:59.533447  2664 containerizer.cpp:2338] Destroying container 
a814ea24-5d22-4996-95c6-14df4f1aa6aa in RUNNING state
I0307 13:42:59.533447  2664 containerizer.cpp:2952] Transitioning the state of 
container a814ea24-5d22-4996-95c6-14df4f1aa6aa from RUNNING to DESTROYING
I0307 13:42:59.533447  6056 master.cpp:1306] Agent 
cffb23f4-885b-43eb-91b7-806efc5320b5-S0 at slave(398)@10.3.1.11:61490 
(build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net) disconnected
I0307 13:42:59.533447  6056 master.cpp:3276] Disconnecting agent 
cffb23f4-885b-43eb-91b7-806efc5320b5-S0 at slave(398)@10.3.1.11:61490 

Re: Review Request 65906: Updated site's middleman to latest 3.x release.

2018-03-07 Thread Armand Grillet

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65906/#review198786
---



Running `mesos-website-dev.sh` with this patch fails:
```
Fetching gem metadata from https://rubygems.org/
nokogiri-1.8.2 requires ruby version >= 2.1.0, which is incompatible with the 
current version, ruby 2.0.0p648
```
The issue does not not exist using `master` and is thus likely introduced by 
this patch.

- Armand Grillet


On March 5, 2018, 4:42 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65906/
> ---
> 
> (Updated March 5, 2018, 4:42 p.m.)
> 
> 
> Review request for mesos, Armand Grillet, Benjamin Mahler, Greg Mann, and 
> haosdent huang.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Updated site's middleman to latest 3.x release.
> 
> 
> Diffs
> -
> 
>   site/Gemfile 877fe914a9787c140848fdf9958571fec5fa58ff 
>   site/Gemfile.lock 909f3f3badeaa47c80929e243ce36307766edee4 
> 
> 
> Diff: https://reviews.apache.org/r/65906/diff/2/
> 
> 
> Testing
> ---
> 
> Locally tested the generated website: `bundle install && bundle exec rake && 
> bundle exec rake dev`.
> 
> The only manually updated file was `site/Gemfile`; `site/Gemfile.lock` was 
> updated automatically by `bundler`.
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 65354: Fix flakyness in MasterTest.RegistryGcByCount.

2018-03-07 Thread Alexander Rukletsov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65354/#review198785
---


Ship it!




Ship It!

- Alexander Rukletsov


On Feb. 1, 2018, 4:46 p.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65354/
> ---
> 
> (Updated Feb. 1, 2018, 4:46 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Vinod Kone.
> 
> 
> Bugs: MESOS-8485
> https://issues.apache.org/jira/browse/MESOS-8485
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The test could fail on very slow hosts due to a timeout
> in the agent waiting for the registration and a subsequent
> re-sent `SlaveRegisteredMessage. See MESOS-8485.
> 
> 
> Diffs
> -
> 
>   src/tests/master_tests.cpp 7112bb4efcc312dc4f68bdc44fb685c7624002b1 
> 
> 
> Diff: https://reviews.apache.org/r/65354/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Benno Evers
> 
>



Re: Review Request 65571: Handle 'None' passed from the MasterDetector in 'Master::detect()'.

2018-03-07 Thread Alexander Rukletsov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65571/#review198784
---



Looks like the test failure ^ is related to this change. Benno, can you please 
investigate?

- Alexander Rukletsov


On Feb. 13, 2018, 8:05 p.m., Benno Evers wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65571/
> ---
> 
> (Updated Feb. 13, 2018, 8:05 p.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Vinod Kone.
> 
> 
> Bugs: MESOS-8550
> https://issues.apache.org/jira/browse/MESOS-8550
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The function `MasterDetector::detect()` returns a value of type
> `Future

Re: Review Request 65905: Used SHA512 for release file checksums.

2018-03-07 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65905/
---

(Updated March 7, 2018, 1:37 p.m.)


Review request for mesos, Kapil Arya, Till Toenshoff, and Vinod Kone.


Changes
---

Make script with homebrew's coreutils as suggested by Till offline.


Repository: mesos


Description
---

Apache now requires SHA checksum files instead of the previously
required MD5, see the [signing recommendations](1). This patch updates
the Mesos vote and release tooling to accommodate that change in
policy. We use SHA512 as recommended in the [Apache SHA checksum
FAQ](2).

We also fix the format of the produced digest file to be compatible
with `sha512sum` to ease automatic release verification.

[1]: http://www.apache.org/dev/release-distribution#sigs-and-sums
[2]: http://www.apache.org/dev/release-signing#sha-checksum


Diffs (updated)
-

  support/release.sh 3aeda92e6bd48683cf609fa527633cd47b9f7dce 
  support/vote.sh 649eebc6b5fe1b3783ae0c2c1706f1349ddc436c 


Diff: https://reviews.apache.org/r/65905/diff/4/

Changes: https://reviews.apache.org/r/65905/diff/3-4/


Testing
---


Thanks,

Benjamin Bannier



Re: Review Request 65720: Windows: Fixed location of imported libraries for Ninja.

2018-03-07 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65720/#review198783
---




3rdparty/CMakeLists.txt
Lines 158-160 (original), 158-165 (patched)


The paths for vs look more specific to me, so I wonder whether it maes 
sense to either turn the logic around (`if (VC) ... else ()`), or use `elseif` 
instead of a catch all `else`.

Here and below.



3rdparty/CMakeLists.txt
Lines 229 (patched)


Unneeded empty line here?



3rdparty/CMakeLists.txt
Lines 230-239 (patched)


Ditto.



3rdparty/CMakeLists.txt
Lines 319-331 (patched)


Ditto.



3rdparty/CMakeLists.txt
Lines 396-399 (original), 421-430 (patched)


Ditto.



3rdparty/CMakeLists.txt
Lines 441-444 (original), 472-481 (patched)


Ditto.



3rdparty/CMakeLists.txt
Lines 557-566 (patched)


Ditto.



3rdparty/CMakeLists.txt
Lines 648-660 (patched)


Ditto.



3rdparty/CMakeLists.txt
Lines 725-733 (patched)


Ditto.



3rdparty/CMakeLists.txt
Lines 796-804 (patched)


Ditto.



3rdparty/CMakeLists.txt
Lines 872-881 (patched)


Ditto.



3rdparty/CMakeLists.txt
Lines 941-949 (patched)


Ditto.


- Benjamin Bannier


On March 6, 2018, 9:05 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65720/
> ---
> 
> (Updated March 6, 2018, 9:05 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Benjamin Bannier, Jeff Coffler, and 
> Joseph Wu.
> 
> 
> Bugs: MESOS-8599
> https://issues.apache.org/jira/browse/MESOS-8599
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The Ninja generator does not emit binaries to "Debug" or "Release"
> folders (unlike the Visual Studio generator), as it is not a
> multi-configuration generator. To fix this, we explicitly set the
> `IMPORTED_LOCATION` instead of `IMPORTED_LOCATION_(DEBUG|RELEASE)`
> when using the Ninja generator on Windows. This does not reuse the
> location used on non-Windows platforms, as they're not consistent
> despite the generator being the same.
> 
> 
> Diffs
> -
> 
>   3rdparty/CMakeLists.txt da605707b89bbe9b3db9e60bc0b0a26dac46e56e 
> 
> 
> Diff: https://reviews.apache.org/r/65720/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 65721: Windows: Specified byproducts of `sasl2` imported target.

2018-03-07 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65721/#review198782
---


Ship it!




Ship It!

- Benjamin Bannier


On March 6, 2018, 9:05 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65721/
> ---
> 
> (Updated March 6, 2018, 9:05 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Benjamin Bannier, Jeff Coffler, and 
> Joseph Wu.
> 
> 
> Bugs: MESOS-8599
> https://issues.apache.org/jira/browse/MESOS-8599
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This is necessary to enable the Ninja build. This was not previously
> fixed because we have not yet used Ninja on Windows. The same pattern
> is used for other imported targets.
> 
> 
> Diffs
> -
> 
>   3rdparty/CMakeLists.txt da605707b89bbe9b3db9e60bc0b0a26dac46e56e 
> 
> 
> Diff: https://reviews.apache.org/r/65721/diff/2/
> 
> 
> Testing
> ---
> 
> Built with Ninja on Windows in 17 minutes, versus 21 minutes using the VS 
> generator (both builds completely from scratch).
> 
> No more Ninja warnings (only expected compiler warnings).
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 65719: Windows: Fixed CMake check of toolset for Ninja.

2018-03-07 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65719/#review198781
---


Ship it!




Ship It!

- Benjamin Bannier


On March 6, 2018, 9:04 p.m., Andrew Schwartzmeyer wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65719/
> ---
> 
> (Updated March 6, 2018, 9:04 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Benjamin Bannier, Jeff Coffler, and 
> Joseph Wu.
> 
> 
> Bugs: MESOS-8599
> https://issues.apache.org/jira/browse/MESOS-8599
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Only the Visual Studio generator(s) should check the toolset. The
> Ninja generator requires the environment to be setup in advance, and
> does not have the concept of a toolset, so this warning broke the
> Ninja build.
> 
> 
> Diffs
> -
> 
>   cmake/CompilationConfigure.cmake 50cddf9476c8c5196c4824a7b060c2680a96b277 
> 
> 
> Diff: https://reviews.apache.org/r/65719/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>



Re: Review Request 65591: Explicitly tracked resource providers in master.

2018-03-07 Thread Benjamin Bannier


> On March 7, 2018, 4:13 a.m., Greg Mann wrote:
> > src/master/master.cpp
> > Lines 7324 (patched)
> > 
> >
> > Is there any case where `message.has_resource_version_uuid() && 
> > slave->resourceVersion.isNone()` will be true?
> > 
> > I'm wondering if we should do something like:
> > ```
> >   if (!updated && message.has_resource_version_uuid()) {
> > CHECK_SOME(slave->resourceVersion);
> > if (message.resource_version_uuid() != 
> > slave->resourceVersion.get()) {
> >   updated = true;
> > }
> >   }
> > ```

It is possible on the proto level since e.g., 
`RegisterSlaveMessage.resource_version_uuid` is `optional`. I'd argue that the 
currently proposed implementation is more conservative in that it would run the 
full handler even in the case you point out instead of aborting.

I'd suggest we keep the impl like it is currently. Dropping.


> On March 7, 2018, 4:13 a.m., Greg Mann wrote:
> > src/master/master.cpp
> > Line 7722 (original), 7680-7681 (patched)
> > 
> >
> > Perhaps we also want to rescind offers when 
> > `message.has_resource_providers()` is false? In that case, the agent has 
> > told us that it has no registered RPs, so offers with RP resources would 
> > not be valid?

I do not think that is true. Only when the `resource_providers` field is set 
does the agent send any resource provider information. The field could contain 
information on zero or more resource providers in its `providers` field, but we 
do not distinguish that here, exactly as you suggest.

Dropping.


> On March 7, 2018, 4:13 a.m., Greg Mann wrote:
> > src/master/master.cpp
> > Lines 10572-10574 (original), 10534-10536 (patched)
> > 
> >
> > Are you sure this will always be true? Could a framework send us a 
> > well-formed, but unknown RP ID in a message and make this false?

This is an important invariant of the current implementation, see 
https://issues.apache.org/jira/browse/MESOS-8321.

Dropping.


> On March 7, 2018, 4:13 a.m., Greg Mann wrote:
> > src/master/master.cpp
> > Lines 11515 (patched)
> > 
> >
> > Ditto with this CHECK - are we sure that a well-formed but unknown RP 
> > ID wouldn't hit this?

Ditto, see https://issues.apache.org/jira/browse/MESOS-8321.

Dropping.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65591/#review198741
---


On March 7, 2018, 12:36 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65591/
> ---
> 
> (Updated March 7, 2018, 12:36 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-8382
> https://issues.apache.org/jira/browse/MESOS-8382
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds explicit tracking of resource providers to the master
> process. While we already had explicitly send resource provider
> information in e.g., `UpdateSlaveMessage`, we only stored that
> information aggregated over the full agent in the master up to now.
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.cpp 9c5fb97afb58f98013b79f3cbaea7dacc3603271 
>   src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
>   src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
>   src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 
> 
> 
> Diff: https://reviews.apache.org/r/65591/diff/8/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 65591: Explicitly tracked resource providers in master.

2018-03-07 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65591/
---

(Updated March 7, 2018, 12:36 p.m.)


Review request for mesos, Greg Mann, Jie Yu, and Jan Schlicht.


Changes
---

Addressed issues raised by Greg.


Bugs: MESOS-8382
https://issues.apache.org/jira/browse/MESOS-8382


Repository: mesos


Description
---

This patch adds explicit tracking of resource providers to the master
process. While we already had explicitly send resource provider
information in e.g., `UpdateSlaveMessage`, we only stored that
information aggregated over the full agent in the master up to now.


Diffs (updated)
-

  src/common/protobuf_utils.cpp 9c5fb97afb58f98013b79f3cbaea7dacc3603271 
  src/master/http.cpp acf10a0f396234ec9f444c9b41515d657379d6ad 
  src/master/master.hpp c9c8a968b6f56fe261ac9fe374e926a28d40fccb 
  src/master/master.cpp e666b664dd125317cda5d16285d444b9c21e1f35 


Diff: https://reviews.apache.org/r/65591/diff/8/

Changes: https://reviews.apache.org/r/65591/diff/7-8/


Testing
---

`make check`


Thanks,

Benjamin Bannier



Re: Review Request 65772: Moved some docker image setups to dedicated directory.

2018-03-07 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65772/
---

(Updated March 7, 2018, 11:05 a.m.)


Review request for mesos and Michael Park.


Bugs: MESOS-8464
https://issues.apache.org/jira/browse/MESOS-8464


Repository: mesos


Description
---

Moved some docker image setups to dedicated directory.


Diffs
-

  support/mesos-build/centos-7.dockerfile  
  support/mesos-build/enable-devtoolset-4.sh  
  support/mesos-build/entrypoint.sh  
  support/mesos-build/ubuntu-16.04.dockerfile  
  support/mesos-tidy/Dockerfile  
  support/mesos-tidy/README.md  
  support/mesos-tidy/entrypoint.sh  


Diff: https://reviews.apache.org/r/65772/diff/1/


Testing
---


Thanks,

Benjamin Bannier



Re: Review Request 65773: Introduced Docker image build and publish scripts.

2018-03-07 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65773/
---

(Updated March 7, 2018, 11:05 a.m.)


Review request for mesos and Michael Park.


Bugs: MESOS-8464
https://issues.apache.org/jira/browse/MESOS-8464


Repository: mesos


Description
---

This patch introduces a build and publish script for Mesos-related
docker images. The intention is to run these scripts from CI so that
images on Dockerhub are kept up to date.


Diffs
-

  support/docker/Makefile PRE-CREATION 
  support/docker/test/Dockerfile PRE-CREATION 
  support/docker/test/foo.sh PRE-CREATION 
  support/jenkins/build_docker_images.sh PRE-CREATION 


Diff: https://reviews.apache.org/r/65773/diff/1/


Testing
---

Tested with my own account.

If image sources are unchanged, the images are only pulled down, but not 
republished. If images sources changed they are rebuild and published.


Thanks,

Benjamin Bannier