Re: Review Request 64475: Initialized offer operation status update manager in SLRP.

2017-12-13 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64475/
---

(Updated Dec. 13, 2017, 8:20 p.m.)


Review request for mesos, Gaston Kleiman, Greg Mann, and Jie Yu.


Changes
---

Addressed Jie's comments.


Repository: mesos


Description
---

This patch adds an agent filesystem layout for checkpointing offer
operation status updates for resource providers, and initializes
a status update manager in the storage local resource provider.


Diffs (updated)
---

  src/resource_provider/storage/provider.cpp e806f44ef33405d4a2b133576c60be56e9fe3435 
  src/slave/paths.hpp d645d871c36bbe8e766a98650f6aa23b6eab65d8 
  src/slave/paths.cpp b8004e76964abc210820368a89dbfa6928ef7bfd 


Diff: https://reviews.apache.org/r/64475/diff/5/

Changes: https://reviews.apache.org/r/64475/diff/4-5/


Testing
---

sudo make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 64475: Initialized offer operation status update manager in SLRP.

2017-12-13 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64475/#review193722
---


Fix it, then Ship it!





src/slave/paths.cpp
Lines 581-587 (patched)


Can you use `basename` here to extract the uuid part?
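
As an illustration of the suggestion above, here is a minimal standalone sketch of taking the last path component as the UUID. It assumes the UUID is the final component of the checkpoint path; `lastPathComponent` is a hypothetical helper written with the standard library only, not the function used in the patch, and the example path in the note below is made up.

```cpp
#include <string>

// Hypothetical helper: returns the last component of a '/'-separated
// path, ignoring trailing separators. Unlike POSIX basename(3), it
// returns an empty string for "/" and for an empty path.
std::string lastPathComponent(const std::string& path)
{
  // Skip any trailing '/' so that "a/b/" yields "b".
  const std::size_t end = path.find_last_not_of('/');
  if (end == std::string::npos) {
    return "";
  }

  // The component starts right after the last '/' before `end`,
  // or at the beginning if there is no separator at all.
  const std::size_t slash = path.rfind('/', end);
  const std::size_t begin = (slash == std::string::npos) ? 0 : slash + 1;

  return path.substr(begin, end - begin + 1);
}
```

For a made-up path such as `/work/operations/0f1e2d3c`, this returns `0f1e2d3c`; the caller would still need to validate that the result parses as a UUID.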


- Jie Yu


On Dec. 13, 2017, 2:33 a.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64475/
> ---
> 
> (Updated Dec. 13, 2017, 2:33 a.m.)
> 
> 
> Review request for mesos, Gaston Kleiman, Greg Mann, and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds an agent filesystem layout for checkpointing offer
> operation status updates for resource providers, and initializes
> a status update manager in the storage local resource provider.
> 
> 
> Diffs
> ---
> 
>   src/resource_provider/storage/provider.cpp e806f44ef33405d4a2b133576c60be56e9fe3435 
>   src/slave/paths.hpp d645d871c36bbe8e766a98650f6aa23b6eab65d8 
>   src/slave/paths.cpp b8004e76964abc210820368a89dbfa6928ef7bfd 
> 
> 
> Diff: https://reviews.apache.org/r/64475/diff/4/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 64475: Initialized offer operation status update manager in SLRP.

2017-12-12 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64475/#review193637
---


Ship it!




Ship It!

- Gaston Kleiman


On Dec. 12, 2017, 6:33 p.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64475/
> ---
> 
> (Updated Dec. 12, 2017, 6:33 p.m.)
> 
> 
> Review request for mesos, Gaston Kleiman, Greg Mann, and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds an agent filesystem layout for checkpointing offer
> operation status updates for resource providers, and initializes
> a status update manager in the storage local resource provider.
> 
> 
> Diffs
> ---
> 
>   src/resource_provider/storage/provider.cpp e806f44ef33405d4a2b133576c60be56e9fe3435 
>   src/slave/paths.hpp d645d871c36bbe8e766a98650f6aa23b6eab65d8 
>   src/slave/paths.cpp b8004e76964abc210820368a89dbfa6928ef7bfd 
> 
> 
> Diff: https://reviews.apache.org/r/64475/diff/4/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 64475: Initialized offer operation status update manager in SLRP.

2017-12-12 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64475/
---

(Updated Dec. 13, 2017, 2:33 a.m.)


Review request for mesos, Gaston Kleiman, Greg Mann, and Jie Yu.


Changes
---

Fixed a typo. Thanks Gaston!


Repository: mesos


Description
---

This patch adds an agent filesystem layout for checkpointing offer
operation status updates for resource providers, and initializes
a status update manager in the storage local resource provider.


Diffs (updated)
---

  src/resource_provider/storage/provider.cpp e806f44ef33405d4a2b133576c60be56e9fe3435 
  src/slave/paths.hpp d645d871c36bbe8e766a98650f6aa23b6eab65d8 
  src/slave/paths.cpp b8004e76964abc210820368a89dbfa6928ef7bfd 


Diff: https://reviews.apache.org/r/64475/diff/4/

Changes: https://reviews.apache.org/r/64475/diff/3-4/


Testing
---

sudo make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 64475: Initialized offer operation status update manager in SLRP.

2017-12-12 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64475/
---

(Updated Dec. 13, 2017, 12:31 a.m.)


Review request for mesos, Gaston Kleiman, Greg Mann, and Jie Yu.


Changes
---

Addressed Gaston's comments.


Repository: mesos


Description
---

This patch adds an agent filesystem layout for checkpointing offer
operation status updates for resource providers, and initializes
a status update manager in the storage local resource provider.


Diffs (updated)
---

  src/resource_provider/storage/provider.cpp e806f44ef33405d4a2b133576c60be56e9fe3435 
  src/slave/paths.hpp d645d871c36bbe8e766a98650f6aa23b6eab65d8 
  src/slave/paths.cpp b8004e76964abc210820368a89dbfa6928ef7bfd 


Diff: https://reviews.apache.org/r/64475/diff/3/

Changes: https://reviews.apache.org/r/64475/diff/2-3/


Testing
---

sudo make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 64475: Initialized offer operation status update manager in SLRP.

2017-12-12 Thread Gaston Kleiman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64475/#review193587
---




src/resource_provider/storage/provider.cpp
Line 362 (original), 365 (patched)


s/synchoronusly/synchronously/



src/resource_provider/storage/provider.cpp
Lines 2476-2477 (patched)


This loses the error message returned by `slave::state::checkpoint()`.

I think that the following would make debugging easier:

```
  Try<Nothing> result = slave::state::checkpoint(
      statePath, volumes.at(volumeId).state);

  CHECK_SOME(result)
    << "Failed to checkpoint volume state to '" << statePath << "': "
    << result.error();
```



src/resource_provider/storage/provider.cpp
Lines 2445-2447 (original), 2492-2494 (patched)


```
// TODO(chhsiao): Maintain a list of terminated but unacknowledged
// offer operations in memory and reconstruct it during recovery
// by querying the status update manager.
```



src/resource_provider/storage/provider.cpp
Lines 2541-2543 (patched)


Once the following patch is committed, `statusUpdate.has_latest_status()` 
will always return `true`, so we won't need the if statement: 
https://reviews.apache.org/r/64521/


- Gaston Kleiman


On Dec. 12, 2017, 9:51 a.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64475/
> ---
> 
> (Updated Dec. 12, 2017, 9:51 a.m.)
> 
> 
> Review request for mesos, Gaston Kleiman, Greg Mann, and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds an agent filesystem layout for checkpointing offer
> operation status updates for resource providers, and initializes
> a status update manager in the storage local resource provider.
> 
> 
> Diffs
> ---
> 
>   src/resource_provider/storage/provider.cpp e806f44ef33405d4a2b133576c60be56e9fe3435 
>   src/slave/paths.hpp d645d871c36bbe8e766a98650f6aa23b6eab65d8 
>   src/slave/paths.cpp b8004e76964abc210820368a89dbfa6928ef7bfd 
> 
> 
> Diff: https://reviews.apache.org/r/64475/diff/2/
> 
> 
> Testing
> ---
> 
> sudo make check
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 64475: Initialized offer operation status update manager in SLRP.

2017-12-12 Thread Chun-Hung Hsiao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64475/
---

(Updated Dec. 12, 2017, 5:51 p.m.)


Review request for mesos, Gaston Kleiman, Greg Mann, and Jie Yu.


Changes
---

Addressed Greg's comments and moved the SUM initialization after SUBSCRIBED.


Repository: mesos


Description
---

This patch adds an agent filesystem layout for checkpointing offer
operation status updates for resource providers, and initializes
a status update manager in the storage local resource provider.


Diffs (updated)
---

  src/resource_provider/storage/provider.cpp e806f44ef33405d4a2b133576c60be56e9fe3435 
  src/slave/paths.hpp d645d871c36bbe8e766a98650f6aa23b6eab65d8 
  src/slave/paths.cpp b8004e76964abc210820368a89dbfa6928ef7bfd 


Diff: https://reviews.apache.org/r/64475/diff/2/

Changes: https://reviews.apache.org/r/64475/diff/1-2/


Testing (updated)
---

sudo make check


Thanks,

Chun-Hung Hsiao



Re: Review Request 64475: Initialized offer operation status update manager in SLRP.

2017-12-11 Thread Greg Mann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64475/#review193434
---


Fix it, then Ship it!





src/resource_provider/storage/provider.cpp
Line 2533 (patched)


Since we will have multiple operation IDs once feedback is implemented, 
let's be explicit here:

"Failed to send status update for offer operation with operation_uuid "


- Greg Mann


On Dec. 9, 2017, 12:06 a.m., Chun-Hung Hsiao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/64475/
> ---
> 
> (Updated Dec. 9, 2017, 12:06 a.m.)
> 
> 
> Review request for mesos, Greg Mann and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch adds an agent filesystem layout for checkpointing offer
> operation status updates for resource providers, and initializes
> a status update manager in the storage local resource provider.
> 
> 
> Diffs
> ---
> 
>   src/resource_provider/storage/provider.cpp 2193866e83850a04a3dc8231ab07e6104485f2b6 
>   src/slave/paths.hpp d645d871c36bbe8e766a98650f6aa23b6eab65d8 
>   src/slave/paths.cpp b8004e76964abc210820368a89dbfa6928ef7bfd 
> 
> 
> Diff: https://reviews.apache.org/r/64475/diff/1/
> 
> 
> Testing
> ---
> 
> make
> 
> 
> Thanks,
> 
> Chun-Hung Hsiao
> 
>



Re: Review Request 64475: Initialized offer operation status update manager in SLRP.

2017-12-11 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64475/#review193400
---



FAIL: Some Mesos tests failed.

Reviews applied: `['64439', '64469', '64475']`

Failed command: `C:\DCOS\mesos\src\mesos-tests.exe --verbose`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64475

Relevant logs:

- [mesos-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64475/logs/mesos-tests-stdout.log):

```

[--] 1 test from IsolationFlag/CpuIsolatorTest
[ RUN  ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0
[   OK ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0 (2475 ms)
[--] 1 test from IsolationFlag/CpuIsolatorTest (2513 ms total)

[--] 1 test from IsolationFlag/MemoryIsolatorTest
[ RUN  ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0
[   OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (2565 ms)
[--] 1 test from IsolationFlag/MemoryIsolatorTest (2600 ms total)

[--] Global test environment tear-down
[==] 825 tests from 84 test cases ran. (374018 ms total)
[  PASSED  ] 815 tests.
[  FAILED  ] 10 tests, listed below:
[  FAILED  ] OfferOperationStatusUpdateManagerTest.UpdateAndAckNonTerminalUpdate
[  FAILED  ] OfferOperationStatusUpdateManagerTest.RecoverCheckpointedStream
[  FAILED  ] OfferOperationStatusUpdateManagerTest.RecoverEmptyFile
[  FAILED  ] OfferOperationStatusUpdateManagerTest.RecoverTerminatedStream
[  FAILED  ] OfferOperationStatusUpdateManagerTest.IgnoreDuplicateUpdate
[  FAILED  ] OfferOperationStatusUpdateManagerTest.IgnoreDuplicateUpdateAfterRecover
[  FAILED  ] OfferOperationStatusUpdateManagerTest.RejectDuplicateAck
[  FAILED  ] OfferOperationStatusUpdateManagerTest.RejectDuplicateAckAfterRecover
[  FAILED  ] OfferOperationStatusUpdateManagerTest.NonStrictRecoveryCorruptedFile
[  FAILED  ] SlaveTest.ResourceProviderPublishAll

10 FAILED TESTS
  YOU HAVE 201 DISABLED TESTS

```

- [mesos-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64475/logs/mesos-tests-stderr.log):

```
I1211 16:25:11.456578  6300 master.cpp:3332] Deactivating framework c63a23ba-89cc-415e-bffd-9253d345b107- (default) at scheduler-d0274afd-15fc-4d0d-9910-b2bb18a6bff4@10.3.1.11:64867
I1211 16:25:11.456578  6108 slave.cpp:3400] Shutting down framework c63a23ba-89cc-415e-bffd-9253d345b107-
I1211 16:25:11.457577  6108 slave.cpp:6091] Shutting down executor '4de9a7db-ed68-4a52-8603-cc116780390b' of framework c63a23ba-89cc-415e-bffd-9253d345b107- at executor(1)@10.3.1.11:64888
I1211 16:25:11.457577  8756 hierarchical.cpp:405] Deactivated framework c63a23ba-89cc-415e-bffd-9253d345b107-
I1211 16:25:11.457577  6300 master.cpp:10105] Updating the state of task 4I1211 16:25:10.744648  5568 exec.cpp:162] Version: 1.5.0
I1211 16:25:10.768647  3416 exec.cpp:237] Executor registered on agent c63a23ba-89cc-415e-bffd-9253d345b107-S0
I1211 16:25:10.771648  3712 executor.cpp:171] Received SUBSCRIBED event
I1211 16:25:10.775676  3712 executor.cpp:175] Subscribed executor on build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net
I1211 16:25:10.776670  3712 executor.cpp:171] Received LAUNCH event
I1211 16:25:10.779644  3712 executor.cpp:637] Starting task 4de9a7db-ed68-4a52-8603-cc116780390b
I1211 16:25:10.853644  3712 executor.cpp:477] Running 'C:\DCOS\mesos\src\mesos-containerizer.exe launch '
I1211 16:25:11.421582  3712 executor.cpp:650] Forked command at 3924
I1211 16:25:11.458578  7048 exec.cpp:435] Executor asked to shutdown
I1211 16:25:11.459578  3712 executor.cpp:171] Received SHUTDOWN event
I1211 16:25:11.459578  3712 executor.cpp:747] Shutting down
I1211 16:25:11.459578  3712 executor.cpp:854] Sending SIGTERM to process tree at pid 3de9a7db-ed68-4a52-8603-cc116780390b of framework c63a23ba-89cc-415e-bffd-9253d345b107- (latest state: TASK_KILLED, status update state: TASK_KILLED)
I1211 16:25:11.459578  9752 containerizer.cpp:2328] Destroying container fb1537d2-d1c7-4a75-b419-493d744400f3 in RUNNING state
I1211 16:25:11.462577  9752 containerizer.cpp:2930] Transitioning the state of container fb1537d2-d1c7-4a75-b419-493d744400f3 from RUNNING to DESTROYING
I1211 16:25:11.463577  9752 launcher.cpp:156] Asked to destroy container fb1537d2-d1c7-4a75-b419-493d744400f3
I1211 16:25:11.464581  6300 master.cpp:10211] Removing task 4de9a7db-ed68-4a52-8603-cc116780390b with resources cpus(allocated: *):4; mem(allocated: *):2048; disk(allocated: *):1024; ports(allocated: *):[31000-32000] of framework c63a23ba-89cc-415e-bffd-9253d345b107- on agent c63a23ba-89cc-415e-bffd-9253d345b107-S0 at slave(326)@10.3.1.11:64867 (build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I1211 16:25:11.466578  6300 master.cpp:1310] Agent