Re: Review Request 69224: Fixed a test flake in `HealthCheckTest.HealthyTaskNonShell`.

2018-10-31 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69224/#review210248
---



PASS: Mesos patch 69224 was successfully built and tested.

Reviews applied: `['69224']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2560/mesos-review-69224

- Mesos Reviewbot Windows


On Nov. 1, 2018, 3:15 a.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69224/
> ---
> 
> (Updated Nov. 1, 2018, 3:15 a.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Benno Evers.
> 
> 
> Bugs: MESOS-9366
> https://issues.apache.org/jira/browse/MESOS-9366
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed a test flake in `HealthCheckTest.HealthyTaskNonShell`.
> 
> 
> Diffs
> -
> 
>   src/tests/health_check_tests.cpp da64dc2b4de6e1526628522af34ee83b8ef9f469 
> 
> 
> Diff: https://reviews.apache.org/r/69224/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Review Request 69224: Fixed a test flake in `HealthCheckTest.HealthyTaskNonShell`.

2018-10-31 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69224/
---

Review request for mesos, Alexander Rukletsov and Benno Evers.


Bugs: MESOS-9366
https://issues.apache.org/jira/browse/MESOS-9366


Repository: mesos


Description
---

Fixed a test flake in `HealthCheckTest.HealthyTaskNonShell`.


Diffs
-

  src/tests/health_check_tests.cpp da64dc2b4de6e1526628522af34ee83b8ef9f469 


Diff: https://reviews.apache.org/r/69224/diff/1/


Testing
---

make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 69116: Added 'popen_tty' to test util functions for the new CLI.

2018-10-31 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69116/#review210246
---



Patch looks great!

Reviews applied: [69116]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Oct. 22, 2018, 12:23 p.m., Armand Grillet wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69116/
> ---
> 
> (Updated Oct. 22, 2018, 12:23 p.m.)
> 
> 
> Review request for mesos and Kevin Klues.
> 
> 
> Bugs: MESOS-9343
> https://issues.apache.org/jira/browse/MESOS-9343
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This code was pulled directly from:
> https://github.com/dcos/dcos-core-cli/blob/
> 7fd55421939a7782c237e2b8719c0fe2f543acd7/
> python/lib/dcoscli/dcoscli/test/common.py
> 
> This function will be used by tests requiring a TTY. This will be the
> case for tests concerning the 'task attach' subcommand.
> 
> 
> Diffs
> -
> 
>   src/python/cli_new/lib/cli/tests/base.py 
> 7fd39af665164eeef8bb1359de340ef923dc2272 
> 
> 
> Diff: https://reviews.apache.org/r/69116/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Armand Grillet
> 
>



Re: Review Request 69116: Added 'popen_tty' to test util functions for the new CLI.

2018-10-31 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69116/#review210245
---



PASS: Mesos patch 69116 was successfully built and tested.

Reviews applied: `['69116']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2559/mesos-review-69116

- Mesos Reviewbot Windows


On Oct. 22, 2018, 5:23 a.m., Armand Grillet wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69116/
> ---
> 
> (Updated Oct. 22, 2018, 5:23 a.m.)
> 
> 
> Review request for mesos and Kevin Klues.
> 
> 
> Bugs: MESOS-9343
> https://issues.apache.org/jira/browse/MESOS-9343
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This code was pulled directly from:
> https://github.com/dcos/dcos-core-cli/blob/
> 7fd55421939a7782c237e2b8719c0fe2f543acd7/
> python/lib/dcoscli/dcoscli/test/common.py
> 
> This function will be used by tests requiring a TTY. This will be the
> case for tests concerning the 'task attach' subcommand.
> 
> 
> Diffs
> -
> 
>   src/python/cli_new/lib/cli/tests/base.py 
> 7fd39af665164eeef8bb1359de340ef923dc2272 
> 
> 
> Diff: https://reviews.apache.org/r/69116/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Armand Grillet
> 
>



Re: Review Request 69010: Synced SLRP checkpoints to the filesystem.

2018-10-31 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69010/#review210238
---



PASS: Mesos patch 69010 was successfully built and tested.

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2558/mesos-review-69010

- Mesos Reviewbot Windows


On Oct. 31, 2018, 7:21 p.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69010/
> ---
> 
> (Updated Oct. 31, 2018, 7:21 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-9281
> https://issues.apache.org/jira/browse/MESOS-9281
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently if a system crashes, SLRP checkpoints might not be synced to
> the filesystem, so it is possible that an old or empty checkpoint will
> be read upon recovery. Moreover, if a CSI call has been issued right
> before the crash, the recovered state may be inconsistent with the
> actual state reported by the plugin. For example, the plugin might have
> created a volume but the checkpointed state does not know about it.
> 
> To avoid this inconsistency, we always call fsync()  when checkpointing
> SLRP states.
> 
> 
> Diffs
> -
> 
>   src/resource_provider/storage/provider.cpp 
> db783b53558811081fb2671e005e8bbbd9edbede 
>   src/slave/state.hpp 003211e4670c1092acb1634220d76bafd39e3a20 
> 
> 
> Diff: https://reviews.apache.org/r/69010/diff/8/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 69163: Set agent and/or resource provider ID in operation status updates.

2018-10-31 Thread James DeFelice

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69163/#review210237
---



> This patch sets agent and/or resource provider ID operation status
update messages. This is not always possible, e.g., some operations
might fail validation so that no corresponding IDs can be extracted

Will the framework be expected to ACK in these cases? If so, missing the IDs is 
problematic.

- James DeFelice


On Oct. 25, 2018, 10:54 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69163/
> ---
> 
> (Updated Oct. 25, 2018, 10:54 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch sets agent and/or resource provider ID operation status
> update messages. This is not always possible, e.g., some operations
> might fail validation so that no corresponding IDs can be extracted.
> 
> Since operations failing validation are currently directly rejected by
> the master without going through a status update manager, they are not
> retried either. If a master status update manager for operations is
> introduced at a later point it should be possible to forward
> acknowledgements for updates to the master's update manager (no agent
> ID, not resource provider ID).
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 1662125ed3e47b179ee32d08c1d3af75553066ba 
>   src/common/protobuf_utils.cpp 2ae37ac610894652a0214022a0cf04c4365396dd 
>   src/master/master.cpp f88c7c1f03f0de7236aad9e3bf4bfac82e91bc65 
>   src/resource_provider/storage/provider.cpp 
> db783b53558811081fb2671e005e8bbbd9edbede 
>   src/slave/slave.cpp 74f6fb9036a9ac4f587f53ec2df04eeb4c167bfb 
>   src/tests/slave_tests.cpp 637bedce20f63dfc1702ef549bcea8f790477be3 
> 
> 
> Diff: https://reviews.apache.org/r/69163/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69035: Added a comment for `Resource.provider_id`.

2018-10-31 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69035/#review210236
---


Ship it!




Ship It!

- Meng Zhu


On Oct. 15, 2018, 9:07 p.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69035/
> ---
> 
> (Updated Oct. 15, 2018, 9:07 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Added a comment for `Resource.provider_id`.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 5a985fca39cdfb7e9b4775650a7e5dbe68c3b8ae 
>   include/mesos/v1/mesos.proto a5ebb786a98bd3fd34745ddc553aa7a751c0e337 
> 
> 
> Diff: https://reviews.apache.org/r/69035/diff/1/
> 
> 
> Testing
> ---
> 
> N/A
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 68806: Fixed outdated comments for mocking the secret generator.

2018-10-31 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68806/#review210235
---


Ship it!




Ship It!

- Meng Zhu


On Sept. 21, 2018, 4:17 p.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68806/
> ---
> 
> (Updated Sept. 21, 2018, 4:17 p.m.)
> 
> 
> Review request for mesos and Benjamin Bannier.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Fixed outdated comments for mocking the secret generator.
> 
> 
> Diffs
> -
> 
>   src/tests/cluster.cpp cb7d3f0298dfa8c11634d6674f93f2f355b48a53 
>   src/tests/slave_authorization_tests.cpp 
> 061e230b52805c4260e93ac83734aed5454ea3b5 
>   src/tests/slave_tests.cpp 9597067799aaedf9d1c9d797454bb4bdf240cde1 
> 
> 
> Diff: https://reviews.apache.org/r/68806/diff/2/
> 
> 
> Testing
> ---
> 
> N/A
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 69010: Synced SLRP checkpoints to the filesystem.

2018-10-31 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69010/
---

(Updated Oct. 31, 2018, 6:21 p.m.)


Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.


Changes
---

Added more comments.


Bugs: MESOS-9281
https://issues.apache.org/jira/browse/MESOS-9281


Repository: mesos


Description
---

Currently if a system crashes, SLRP checkpoints might not be synced to
the filesystem, so it is possible that an old or empty checkpoint will
be read upon recovery. Moreover, if a CSI call has been issued right
before the crash, the recovered state may be inconsistent with the
actual state reported by the plugin. For example, the plugin might have
created a volume but the checkpointed state does not know about it.

To avoid this inconsistency, we always call fsync()  when checkpointing
SLRP states.


Diffs (updated)
-

  src/resource_provider/storage/provider.cpp 
db783b53558811081fb2671e005e8bbbd9edbede 
  src/slave/state.hpp 003211e4670c1092acb1634220d76bafd39e3a20 


Diff: https://reviews.apache.org/r/69010/diff/8/

Changes: https://reviews.apache.org/r/69010/diff/7-8/


Testing
---

make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 69163: Set agent and/or resource provider ID in operation status updates.

2018-10-31 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69163/#review210234
---



Patch looks great!

Reviews applied: [69159, 69160, 68147, 69157, 69158, 69161, 69162, 69163]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Oct. 25, 2018, 10:54 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69163/
> ---
> 
> (Updated Oct. 25, 2018, 10:54 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch sets agent and/or resource provider ID operation status
> update messages. This is not always possible, e.g., some operations
> might fail validation so that no corresponding IDs can be extracted.
> 
> Since operations failing validation are currently directly rejected by
> the master without going through a status update manager, they are not
> retried either. If a master status update manager for operations is
> introduced at a later point it should be possible to forward
> acknowledgements for updates to the master's update manager (no agent
> ID, not resource provider ID).
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 1662125ed3e47b179ee32d08c1d3af75553066ba 
>   src/common/protobuf_utils.cpp 2ae37ac610894652a0214022a0cf04c4365396dd 
>   src/master/master.cpp f88c7c1f03f0de7236aad9e3bf4bfac82e91bc65 
>   src/resource_provider/storage/provider.cpp 
> db783b53558811081fb2671e005e8bbbd9edbede 
>   src/slave/slave.cpp 74f6fb9036a9ac4f587f53ec2df04eeb4c167bfb 
>   src/tests/slave_tests.cpp 637bedce20f63dfc1702ef549bcea8f790477be3 
> 
> 
> Diff: https://reviews.apache.org/r/69163/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69163: Set agent and/or resource provider ID in operation status updates.

2018-10-31 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69163/#review210227
---



PASS: Mesos patch 69163 was successfully built and tested.

Reviews applied: `['69159', '69160', '68147', '69157', '69158', '69161', 
'69162', '69163']`

All the build artifacts available at: 
http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2557/mesos-review-69163

- Mesos Reviewbot Windows


On Oct. 25, 2018, 10:54 a.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69163/
> ---
> 
> (Updated Oct. 25, 2018, 10:54 a.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch sets agent and/or resource provider ID operation status
> update messages. This is not always possible, e.g., some operations
> might fail validation so that no corresponding IDs can be extracted.
> 
> Since operations failing validation are currently directly rejected by
> the master without going through a status update manager, they are not
> retried either. If a master status update manager for operations is
> introduced at a later point it should be possible to forward
> acknowledgements for updates to the master's update manager (no agent
> ID, not resource provider ID).
> 
> 
> Diffs
> -
> 
>   src/common/protobuf_utils.hpp 1662125ed3e47b179ee32d08c1d3af75553066ba 
>   src/common/protobuf_utils.cpp 2ae37ac610894652a0214022a0cf04c4365396dd 
>   src/master/master.cpp f88c7c1f03f0de7236aad9e3bf4bfac82e91bc65 
>   src/resource_provider/storage/provider.cpp 
> db783b53558811081fb2671e005e8bbbd9edbede 
>   src/slave/slave.cpp 74f6fb9036a9ac4f587f53ec2df04eeb4c167bfb 
>   src/tests/slave_tests.cpp 637bedce20f63dfc1702ef549bcea8f790477be3 
> 
> 
> Diff: https://reviews.apache.org/r/69163/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 68147: Added agent support to remove local resource providers.

2018-10-31 Thread Benjamin Bannier


> On Aug. 18, 2018, 12:29 a.m., Chun-Hung Hsiao wrote:
> > src/slave/slave.cpp
> > Lines 7934 (patched)
> > 
> >
> > `OPERATION_GONE_BY_OPERATOR` is not a terminal state:
> > 
> > https://github.com/apache/mesos/blob/808485da01387bd27a51cb82a90b1f8301d613ee/include/mesos/mesos.proto#L2342-L2349
> 
> Benjamin Bannier wrote:
> From https://reviews.apache.org/r/68410/#comment_rc290945-207560,
> 
> > ... It seems that OPERATOR_GONE_BY_OPERATOR does not carry the 
> semantics we want. Looking at the other possible error codes, it seems we 
> could either use OPERATION_ERROR or OPERATION_FAILED instead where 
> OPERATION_ERROR possibly better fits our requirements as the operation's side 
> effects might have materialized. That would mean we should drop this patch 
> completely.  
> > WDYT?
> 
> Chun-Hung Hsiao wrote:
> I'm not sure if either is a good choice:
> 
> `OPERATION_ERROR` usually means some validation failure before applying 
> the operation, the consumed resource remains (i.e., can be used by other 
> operations), but the operation is non-retryable (i.e., the caller must fix 
> the error).
> `OPERATION_FAILED` usually means something goes wrong after applying the 
> operation, the consumed resource remains, and the operation is retryable.
> 
> In the case of mark RP gone, although we can argue that any operation 
> becomes an non-retryable one and we don't care about the consumed resoure 
> because it will then be removed. There still are some issues:
> 
> 1. The reason of the operation being non-retryable is not due to any 
> client-side error, but a server-side "error," and this seems inconsistent to 
> the current `OPERATION_ERROR` usage.
> 2. We transition an operation to `OPERATION_GONE_BY_OPERATOR` when 
> marking an agent as gone, but to `OPERATION_ERROR` when marking a local RP as 
> gone. This sounds inconsistent to me from an API perspective.
> 3. For `OPERATION_ERROR` and `OPERATION_FAILED`, Mesos provides reliable 
> retry (at least under "normal" circumstances), and expects acks. For 
> `OPERATION_GONE_BY_OPERATOR`, we could just do a best-effort notification 
> (the implementation is missing though) and don't need any ack, similar to 
> `TASK_GONE_BY_OPERATOR`. In the case of mark RP as gone, we probably want to 
> go with the later.
> 
> So it seems better if we still go with `OPERATION_GONE_BY_OPERATOR` and 
> make the proper master changes. And we need to make sure that we call 
> `allocator->updateSlave` before calling `allocator->recoverResources` to 
> avoid the consumed resource being offered to other framework in between.

I added changes to use `OPERATION_GONE_BY_OPERATOR` here, and to treat it as 
terminal.


> On Aug. 18, 2018, 12:29 a.m., Chun-Hung Hsiao wrote:
> > src/slave/slave.cpp
> > Lines 8195 (patched)
> > 
> >
> > Let's validate that there is no task using the resources provided by 
> > this RP before doing the removal, or fail the call.
> 
> Benjamin Bannier wrote:
> I believe this should be treated as orthogonal issue. It might e.g., 
> happen that the user triggering the RP removal cannot kill tasks or prevent 
> them from being rescheduled. I'd suggest to leaves as is.
> 
> WDYT?
> 
> Chun-Hung Hsiao wrote:
> The scenario I am worried about is that when the RP is marked as gone, 
> the RP may fail to tear itself down if there is any task using its resources.
> 
> Benjamin Bannier wrote:
> We don't seem to have a good story on incompatible resource changes ATM. 
> If we e.g., update a RP config in an incompatible way, we make not attempt to 
> detect or rectify this in any way, either. At least in the agent we were in 
> general we were operating under the assumption that tasks would die on 
> incompatible changes to their resources.
> 
> I'd suggest to drop this issue for now and potentially file a JIRA to 
> track draining and situations where it might be needed.

Dropping this, see above comment.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68147/#review207559
---


On Oct. 25, 2018, 12:47 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68147/
> ---
> 
> (Updated Oct. 25, 2018, 12:47 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, Greg Mann, and Jan 
> Schlicht.
> 
> 
> Bugs: MESOS-8403
> https://issues.apache.org/jira/browse/MESOS-8403
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds 

Re: Review Request 69162: Added agent and resource provider IDs to operation status messages.

2018-10-31 Thread Benjamin Bannier


> On Oct. 25, 2018, 2:13 p.m., James DeFelice wrote:
> > include/mesos/v1/scheduler/scheduler.proto
> > Lines 147 (patched)
> > 
> >
> > For these "certain cases" does Mesos still expect an ACK? If so, that's 
> > a problem.

No, it currently does not as these updates are just sent from the master not a 
status update manager.

Dropping.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69162/#review210031
---


On Oct. 25, 2018, 12:54 p.m., Benjamin Bannier wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69162/
> ---
> 
> (Updated Oct. 25, 2018, 12:54 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao, Gastón Kleiman, and James DeFelice.
> 
> 
> Bugs: MESOS-9293
> https://issues.apache.org/jira/browse/MESOS-9293
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch add agent and resource provide IDs to
> `UpdateOperationStatus` and `UpdateOperationStatusMessage`. With that
> frameworks are able to reconcile enough information after failover to
> construct operation acknowledgements.
> 
> We will add code to populate these fields in a follow-up patch.
> 
> 
> Diffs
> -
> 
>   include/mesos/scheduler/scheduler.proto 
> f6a780a7b75878b9e74402a28a25bb868f7ac36f 
>   include/mesos/v1/scheduler/scheduler.proto 
> fcfec5e417463103e98dd6917722b4fde41cac7c 
>   src/internal/evolve.cpp aa60efed5a28ed4d847a4a27aa8e994cda82175d 
>   src/messages/messages.cpp dd8f60ecdbc06d10be1152bee1ddb65feaaf8fbb 
>   src/messages/messages.proto 41e6a8a2eab0ae7c2878c1d3286c5dea0eb68ed7 
> 
> 
> Diff: https://reviews.apache.org/r/69162/diff/2/
> 
> 
> Testing
> ---
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>



Re: Review Request 69036: WIP: Changed the semantics of `CREATE_DISK` and `DESTROY_DISK` operations.

2018-10-31 Thread James DeFelice


> On Oct. 23, 2018, 2:51 p.m., James DeFelice wrote:
> > include/mesos/mesos.proto
> > Lines 2030 (patched)
> > 
> >
> > Referring to an implementation artifact like the "disk profile adaptor" 
> > seems strange in this context. Perhaps re-word this?
> > 
> > Instead of:
> > > has been removed by the disk profile adaptor
> > 
> > Maybe something like this instead?
> > > is no longer known by Mesos
> 
> Chun-Hung Hsiao wrote:
> The "disk profile adaptor" is a public module interface in Mesos so I was 
> trying to be explicit. WDYT?

" the disk resource is no longer reported by the disk profile adapter." ?

the adapter interface just reports a matrix, right? it doesn't actually 
create/remove things.


- James


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69036/#review209920
---


On Oct. 23, 2018, 2:37 a.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69036/
> ---
> 
> (Updated Oct. 23, 2018, 2:37 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, James DeFelice, and Jie Yu.
> 
> 
> Bugs: MESOS-9275
> https://issues.apache.org/jira/browse/MESOS-9275
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The semantics of these two operations has been updated to provide
> primitives to import CSI volumes and recover CSI volumes against agent
> ID changes and metadata loss.
> 
> 
> Diffs
> -
> 
>   include/mesos/mesos.proto 5a985fca39cdfb7e9b4775650a7e5dbe68c3b8ae 
>   include/mesos/v1/mesos.proto a5ebb786a98bd3fd34745ddc553aa7a751c0e337 
> 
> 
> Diff: https://reviews.apache.org/r/69036/diff/4/
> 
> 
> Testing
> ---
> 
> Test will be done later in the chain.
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 69010: Synced SLRP checkpoints to the filesystem.

2018-10-31 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69010/#review210216
---



Patch looks great!

Reviews applied: [69009, 69085, 69010]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Oct. 30, 2018, 9:08 p.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69010/
> ---
> 
> (Updated Oct. 30, 2018, 9:08 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht.
> 
> 
> Bugs: MESOS-9281
> https://issues.apache.org/jira/browse/MESOS-9281
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> Currently if a system crashes, SLRP checkpoints might not be synced to
> the filesystem, so it is possible that an old or empty checkpoint will
> be read upon recovery. Moreover, if a CSI call has been issued right
> before the crash, the recovered state may be inconsistent with the
> actual state reported by the plugin. For example, the plugin might have
> created a volume but the checkpointed state does not know about it.
> 
> To avoid this inconsistency, we always call fsync()  when checkpointing
> SLRP states.
> 
> 
> Diffs
> -
> 
>   src/resource_provider/storage/provider.cpp 
> db783b53558811081fb2671e005e8bbbd9edbede 
>   src/slave/state.hpp 003211e4670c1092acb1634220d76bafd39e3a20 
> 
> 
> Diff: https://reviews.apache.org/r/69010/diff/7/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>