Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-12 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197383
---


Ship it!




Would be great to see some TODOs or a ticket to clean up the confusion around 
StartSlave not starting the slave when mock=true.

- Benjamin Mahler


On Feb. 12, 2018, 7:48 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 12, 2018, 7:48 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
>   src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
>   src/tests/slave_tests.cpp 628b0d0fc862264c9553e1660c7df548df9cd4a1 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-12 Thread Benjamin Mahler


> On Feb. 10, 2018, 1:17 a.m., Benjamin Mahler wrote:
> > src/tests/mesos.cpp
> > Line 391 (original), 400 (patched)
> > 
> >
> > I realize this is a copy/paste, but do you know why we don't start() 
> > when mock is true?
> 
> Meng Zhu wrote:
> If we want to set up expect calls for the agent start code path, we need 
> to (1) create the mock slave instance; (2) setup expectations; (3) start the 
> slave.
> 
> Benjamin Mahler wrote:
> Ah that make sense, but it's confusing that StartSlave doesn't start the 
> slave if mock=false. Probably we should have only had something like 
> CreateSlave or CreateUnstartedSlave to avoid this confusion?

That makes sense, but it's confusing that StartSlave doesn't start a slave, 
isn't it? =/

Ideally we would have something more accurately named, like CreateSlave.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197211
---


On Feb. 12, 2018, 7:48 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 12, 2018, 7:48 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
>   src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
>   src/tests/slave_tests.cpp 628b0d0fc862264c9553e1660c7df548df9cd4a1 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-12 Thread Benjamin Mahler


> On Feb. 10, 2018, 1:17 a.m., Benjamin Mahler wrote:
> > src/tests/mesos.cpp
> > Line 391 (original), 400 (patched)
> > 
> >
> > I realize this is a copy/paste, but do you know why we don't start() 
> > when mock is true?
> 
> Meng Zhu wrote:
> If we want to set up expect calls for the agent start code path, we need 
> to (1) create the mock slave instance; (2) setup expectations; (3) start the 
> slave.

Ah that make sense, but it's confusing that StartSlave doesn't start the slave 
if mock=false. Probably we should have only had something like CreateSlave or 
CreateUnstartedSlave to avoid this confusion?


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197211
---


On Feb. 12, 2018, 7:48 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 12, 2018, 7:48 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
>   src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
>   src/tests/slave_tests.cpp 628b0d0fc862264c9553e1660c7df548df9cd4a1 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-12 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197315
---



PASS: Mesos patch 65497 was successfully built and tested.

Reviews applied: `['65613', '65497']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65497

- Mesos Reviewbot Windows


On Feb. 12, 2018, 7:48 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 12, 2018, 7:48 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
>   src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
>   src/tests/slave_tests.cpp 628b0d0fc862264c9553e1660c7df548df9cd4a1 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-12 Thread Meng Zhu


> On Feb. 7, 2018, 6:19 p.m., Benjamin Mahler wrote:
> > src/tests/mock_slave.cpp
> > Lines 107-108 (original), 107-109 (patched)
> > 
> >
> > What's going on here?
> 
> Meng Zhu wrote:
> For the agent failover test, I need the failed agent to start with the 
> same pid. There is a parameter `id` you can pass to the Slave constructor 
> above. But it turns out the original code ignore that argument and just call 
> `process::ID::generate("slave")` every time, changed it to take the argument.
> 
> Benjamin Mahler wrote:
> I see, so this was a bug? Can you fix this in a separate patch?
> 
> I also didn't follow why you needed to add the `ProcessBase` constructor 
> call, since that's done within the slave constructor already? What does it 
> mean for it to happen twice?

OK, separate patch for `id` initialization in #65613.


> On Feb. 7, 2018, 6:19 p.m., Benjamin Mahler wrote:
> > src/tests/slave_tests.cpp
> > Lines 4926-4927 (patched)
> > 
> >
> > even after the exeutor has been registered..? I think you want to 
> > remove that?
> 
> Meng Zhu wrote:
> Why? The agent failover happens after the executor has been registered. 
> Then the executor re-registers and got killed.
> 
> Benjamin Mahler wrote:
> Oh I read it as the executor being registered after failover rather than 
> before, how about:
> 
> ```
> // This test verifies that if an agent fails over after registering
> // a v1 executor but before delivering its initial task groups, the
> // executor will be shut down since all of its initial task groups
> // were dropped. See MESOS-8411.
> ```
> 
> This also avoids the hanging sentence at the end that mentions task group 
> by stating it within the description.

Sounds good. Thank you!


> On Feb. 7, 2018, 6:19 p.m., Benjamin Mahler wrote:
> > src/tests/slave_tests.cpp
> > Lines 4945 (patched)
> > 
> >
> > "NotSlave"?
> > 
> > We should probably have found a way to use the logic from 
> > Master::newSlaveId here.
> 
> Meng Zhu wrote:
> Changed it to "slave". That we will need to change cluster::master, I do 
> not think it is important here. Let's just use "slave".
> 
> Benjamin Mahler wrote:
> Oh, I mistook this for the SlaveID rather than the PID::id. I'm not sure 
> I followed your response, but re-using the master code isn't applicable here 
> anyway.
> 
> To avoid this confusion for other readers, can you do the following?
> 
> ```
> string processId = process::ID::generate("slave");
> 
> StartSlave(..., processId, ...);
> ```

Sounds good.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197070
---


On Feb. 12, 2018, 11:48 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 12, 2018, 11:48 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
>   src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
>   src/tests/slave_tests.cpp 628b0d0fc862264c9553e1660c7df548df9cd4a1 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-12 Thread Meng Zhu


> On Feb. 9, 2018, 5:17 p.m., Benjamin Mahler wrote:
> > src/tests/mesos.cpp
> > Line 391 (original), 400 (patched)
> > 
> >
> > I realize this is a copy/paste, but do you know why we don't start() 
> > when mock is true?

If we want to set up expect calls for the agent start code path, we need to (1) 
create the mock slave instance; (2) setup expectations; (3) start the slave.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197211
---


On Feb. 12, 2018, 11:48 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 12, 2018, 11:48 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
>   src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
>   src/tests/slave_tests.cpp 628b0d0fc862264c9553e1660c7df548df9cd4a1 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-12 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/
---

(Updated Feb. 12, 2018, 11:48 a.m.)


Review request for mesos and Benjamin Mahler.


Changes
---

Patch updated. Thanks for the comments!


Bugs: MESOS-8411
https://issues.apache.org/jira/browse/MESOS-8411


Repository: mesos


Description
---

This test verifies that the v1 executor is shutdown if all of its
initial tasks could not be delivered when re-subscribing with
a recovered agent. See MESOS-8411.


Diffs (updated)
-

  src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
  src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
  src/tests/slave_tests.cpp 628b0d0fc862264c9553e1660c7df548df9cd4a1 


Diff: https://reviews.apache.org/r/65497/diff/4/

Changes: https://reviews.apache.org/r/65497/diff/3-4/


Testing
---

make check


Thanks,

Meng Zhu



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-09 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197211
---




src/tests/mesos.cpp
Line 391 (original), 400 (patched)


I realize this is a copy/paste, but do you know why we don't start() when 
mock is true?



src/tests/slave_tests.cpp
Lines 5034 (patched)


Two spaces here?


- Benjamin Mahler


On Feb. 3, 2018, 1:18 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 3, 2018, 1:18 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
>   src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
>   src/tests/mock_slave.cpp 597d7abef20dd5f89b16e4616233f02760b9d037 
>   src/tests/slave_tests.cpp b54688add9949b9c4ac2ce3a42503a411e6da09f 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-09 Thread Benjamin Mahler


> On Feb. 8, 2018, 2:19 a.m., Benjamin Mahler wrote:
> > src/tests/mock_slave.cpp
> > Lines 107-108 (original), 107-109 (patched)
> > 
> >
> > What's going on here?
> 
> Meng Zhu wrote:
> For the agent failover test, I need the failed agent to start with the 
> same pid. There is a parameter `id` you can pass to the Slave constructor 
> above. But it turns out the original code ignore that argument and just call 
> `process::ID::generate("slave")` every time, changed it to take the argument.

I see, so this was a bug? Can you fix this in a separate patch?

I also didn't follow why you needed to add the `ProcessBase` constructor call, 
since that's done within the slave constructor already? What does it mean for 
it to happen twice?


> On Feb. 8, 2018, 2:19 a.m., Benjamin Mahler wrote:
> > src/tests/slave_tests.cpp
> > Lines 4926-4927 (patched)
> > 
> >
> > even after the exeutor has been registered..? I think you want to 
> > remove that?
> 
> Meng Zhu wrote:
> Why? The agent failover happens after the executor has been registered. 
> Then the executor re-registers and got killed.

Oh I read it as the executor being registered after failover rather than 
before, how about:

```
// This test verifies that if an agent fails over after registering
// a v1 executor but before delivering its initial task groups, the
// executor will be shut down since all of its initial task groups
// were dropped. See MESOS-8411.
```

This also avoids the hanging sentence at the end that mentions task group by 
stating it within the description.


> On Feb. 8, 2018, 2:19 a.m., Benjamin Mahler wrote:
> > src/tests/slave_tests.cpp
> > Lines 4945 (patched)
> > 
> >
> > "NotSlave"?
> > 
> > We should probably have found a way to use the logic from 
> > Master::newSlaveId here.
> 
> Meng Zhu wrote:
> Changed it to "slave". That we will need to change cluster::master, I do 
> not think it is important here. Let's just use "slave".

Oh, I mistook this for the SlaveID rather than the PID::id. I'm not sure I 
followed your response, but re-using the master code isn't applicable here 
anyway.

To avoid this confusion for other readers, can you do the following?

```
string processId = process::ID::generate("slave");

StartSlave(..., processId, ...);
```


> On Feb. 8, 2018, 2:19 a.m., Benjamin Mahler wrote:
> > src/tests/slave_tests.cpp
> > Lines 5032-5038 (patched)
> > 
> >
> > Instead of using a slave mock here, can we just expect the executorLost 
> > signal from the scheduler?
> 
> Meng Zhu wrote:
> This is v1 scheduler, I think the `Failure` message is not as clear as 
> the `_shutdownExecutor` call of the slave.

Ok! Another option is to expect the container is destroyed, but this looks 
good, marking as resolved.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197070
---


On Feb. 3, 2018, 1:18 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 3, 2018, 1:18 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
>   src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
>   src/tests/mock_slave.cpp 597d7abef20dd5f89b16e4616233f02760b9d037 
>   src/tests/slave_tests.cpp b54688add9949b9c4ac2ce3a42503a411e6da09f 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-09 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197156
---



Patch looks great!

Reviews applied: [65109, 65110, 65111, 65369, 65496, 65497]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' 
CONFIGURATION='--verbose --disable-libtool-wrappers' ENVIRONMENT='GLOG_v=1 
MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Feb. 3, 2018, 9:18 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 3, 2018, 9:18 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
>   src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
>   src/tests/mock_slave.cpp 597d7abef20dd5f89b16e4616233f02760b9d037 
>   src/tests/slave_tests.cpp b54688add9949b9c4ac2ce3a42503a411e6da09f 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-08 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197126
---



PASS: Mesos patch 65497 was successfully built and tested.

Reviews applied: `['65497']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65497

- Mesos Reviewbot Windows


On Feb. 3, 2018, 1:18 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 3, 2018, 1:18 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
>   src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
>   src/tests/mock_slave.cpp 597d7abef20dd5f89b16e4616233f02760b9d037 
>   src/tests/slave_tests.cpp b54688add9949b9c4ac2ce3a42503a411e6da09f 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-08 Thread Meng Zhu


> On Feb. 7, 2018, 6:19 p.m., Benjamin Mahler wrote:
> > src/tests/mock_slave.cpp
> > Lines 107-108 (original), 107-109 (patched)
> > 
> >
> > What's going on here?

For the agent failover test, I need the failed agent to start with the same 
pid. There is a parameter `id` you can pass to the Slave constructor above. But 
it turns out the original code ignore that argument and just call 
`process::ID::generate("slave")` every time, changed it to take the argument.


> On Feb. 7, 2018, 6:19 p.m., Benjamin Mahler wrote:
> > src/tests/slave_tests.cpp
> > Lines 4926-4927 (patched)
> > 
> >
> > even after the exeutor has been registered..? I think you want to 
> > remove that?

Why? The agent failover happens after the executor has been registered. Then 
the executor re-registers and got killed.


> On Feb. 7, 2018, 6:19 p.m., Benjamin Mahler wrote:
> > src/tests/slave_tests.cpp
> > Lines 4945 (patched)
> > 
> >
> > "NotSlave"?
> > 
> > We should probably have found a way to use the logic from 
> > Master::newSlaveId here.

Changed it to "slave". That we will need to change cluster::master, I do not 
think it is important here. Let's just use "slave".


> On Feb. 7, 2018, 6:19 p.m., Benjamin Mahler wrote:
> > src/tests/slave_tests.cpp
> > Lines 5032-5038 (patched)
> > 
> >
> > Instead of using a slave mock here, can we just expect the executorLost 
> > signal from the scheduler?

This is v1 scheduler, I think the `Failure` message is not as clear as the 
`_shutdownExecutor` call of the slave.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197070
---


On Feb. 2, 2018, 5:18 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 2, 2018, 5:18 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp c5593c25dde18c005f195a1885a8586bc72c849f 
>   src/tests/mesos.cpp ac789297d6aa034121b73e4efc0aafee1ee3b60f 
>   src/tests/mock_slave.cpp 597d7abef20dd5f89b16e4616233f02760b9d037 
>   src/tests/slave_tests.cpp b54688add9949b9c4ac2ce3a42503a411e6da09f 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-07 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review197070
---



Looks good, just a few minor comments.


src/tests/mock_slave.cpp
Lines 107-108 (original), 107-109 (patched)


What's going on here?



src/tests/slave_tests.cpp
Lines 4926-4927 (patched)


even after the exeutor has been registered..? I think you want to remove 
that?



src/tests/slave_tests.cpp
Lines 4945 (patched)


"NotSlave"?

We should probably have found a way to use the logic from 
Master::newSlaveId here.



src/tests/slave_tests.cpp
Lines 5032-5038 (patched)


Instead of using a slave mock here, can we just expect the executorLost 
signal from the scheduler?


- Benjamin Mahler


On Feb. 3, 2018, 1:18 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 3, 2018, 1:18 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp a35c68e8645384f6244d17e37cad71373aba6893 
>   src/tests/mesos.cpp d751b2e9c635eb6a5039678de426467176cda908 
>   src/tests/mock_slave.cpp 8357edc7b3a35624c813eccb9ecca9d3b5dbe07c 
>   src/tests/slave_tests.cpp f393a8433a984267adc4db307ef07fcbafd1e62f 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-06 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review196908
---



PASS: Mesos patch 65497 was successfully built and tested.

Reviews applied: `['65109', '65110', '65111', '65369', '65496', '65497']`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65497

- Mesos Reviewbot Windows


On Feb. 3, 2018, 1:18 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 3, 2018, 1:18 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp a35c68e8645384f6244d17e37cad71373aba6893 
>   src/tests/mesos.cpp d751b2e9c635eb6a5039678de426467176cda908 
>   src/tests/mock_slave.cpp 8357edc7b3a35624c813eccb9ecca9d3b5dbe07c 
>   src/tests/slave_tests.cpp f393a8433a984267adc4db307ef07fcbafd1e62f 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 65497: Added test to verify task-less executor is shutdown when re-subscribing.

2018-02-02 Thread Mesos Reviewbot Windows

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65497/#review196777
---



FAIL: Failed to apply the dependent review: 65496.

Failed command: `python.exe .\support\apply-reviews.py -n -r 65496`

All the build artifacts available at: 
http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65497

Relevant logs:

- 
[apply-review-65496-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/65497/logs/apply-review-65496-stdout.log):

```
error: patch failed: src/slave/slave.cpp:4238
error: src/slave/slave.cpp: patch does not apply
```

- Mesos Reviewbot Windows


On Feb. 3, 2018, 1:18 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65497/
> ---
> 
> (Updated Feb. 3, 2018, 1:18 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-8411
> https://issues.apache.org/jira/browse/MESOS-8411
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This test verifies that the v1 executor is shutdown if all of its
> initial tasks could not be delivered when re-subscribing with
> a recovered agent. See MESOS-8411.
> 
> 
> Diffs
> -
> 
>   src/tests/mesos.hpp a35c68e8645384f6244d17e37cad71373aba6893 
>   src/tests/mesos.cpp d751b2e9c635eb6a5039678de426467176cda908 
>   src/tests/mock_slave.cpp 8357edc7b3a35624c813eccb9ecca9d3b5dbe07c 
>   src/tests/slave_tests.cpp f393a8433a984267adc4db307ef07fcbafd1e62f 
> 
> 
> Diff: https://reviews.apache.org/r/65497/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>