-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65409/
-----------------------------------------------------------

(Updated Feb. 8, 2018, 11:53 a.m.)


Review request for mesos, Akash Gupta, Jie Yu, and Joseph Wu.


Bugs: MESOS-6713
    https://issues.apache.org/jira/browse/MESOS-6713


Repository: mesos


Description
-------

Because Windows cannot delete a file (or recursively delete a folder)
that still has open handles, we have to explicitly `reset()` the agent
before removing the framework meta directory. Otherwise, the task status
update manager is destroyed too late, and its open handle to
`task.updates` causes the `os::rmdir` to fail.

This is safe because the test already destroyed the agent anyway, just
later, at the point where it was reassigned.
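
The ordering described above can be sketched in isolation. This is only
an illustration under stated assumptions, not the code in the patch:
`FakeAgent` and `resetThenRemove` are hypothetical names, a
`std::unique_ptr` stands in for the owning pointer used in the test, and
`std::filesystem` (C++17) stands in for `os::rmdir`. On POSIX the removal
would succeed either way; the explicit `reset()` matters on Windows.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <memory>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical stand-in for the agent: it keeps a file inside the meta
// directory open for as long as the object is alive.
struct FakeAgent
{
  explicit FakeAgent(const fs::path& file) : updates(file) {}

  std::ofstream updates; // Closed when the FakeAgent is destroyed.
};

// Creates `metaDir`, opens a file in it through a FakeAgent, then removes
// the directory. Returns true if the directory is gone afterwards.
bool resetThenRemove(const fs::path& metaDir)
{
  fs::create_directories(metaDir);

  auto agent = std::make_unique<FakeAgent>(metaDir / "task.updates");
  assert(fs::exists(metaDir / "task.updates"));

  // Destroy the owner first so that its open handle is closed ...
  agent.reset();

  // ... and only then remove the directory tree.
  std::error_code ec;
  fs::remove_all(metaDir, ec);

  return !ec && !fs::exists(metaDir);
}
```

Calling `resetThenRemove` with a scratch path under
`fs::temp_directory_path()` exercises the sequence. Moving the `reset()`
after the `remove_all` reproduces the ordering that this change avoids.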


Diffs (updated)
-----

  src/tests/slave_recovery_tests.cpp 77aa60c953bd0769eaba05f001755e4cec9ba028 


Diff: https://reviews.apache.org/r/65409/diff/3/

Changes: https://reviews.apache.org/r/65409/diff/2-3/


Testing
-------

`make check` on CentOS 7: all passed
`ctest` on Windows: all passed, including the new SlaveRecoveryTests

Note that while this chain enables recovery of Docker tasks on Windows, it 
explicitly does not fix MESOS-8519 (recovery of job object tasks).

```
I0131 11:52:01.545505  8316 docker.cpp:898] Recovering Docker containers
I0131 11:52:01.546005   660 containerizer.cpp:674] Recovering containerizer
I0131 11:52:01.546505   660 containerizer.cpp:725] Skipping recovery of executor 'iis.feae9d12-06ba-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000 because it was not launched from mesos containerizer
I0131 11:52:01.557006 11272 provisioner.cpp:493] Provisioner recovery complete
I0131 11:52:02.521003  8720 docker.cpp:1008] Recovering container 'f7978e90-32f5-458d-ad4e-3ffa25a7b190' for executor 'iis.feae9d12-06ba-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
I0131 11:52:02.530527  8316 slave.cpp:6695] Sending reconnect request to executor 'iis.feae9d12-06ba-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000 at executor(1)@10.123.7.41:63903
I0131 11:52:02.549062  8720 slave.cpp:4519] Received re-registration message from executor 'iis.feae9d12-06ba-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
I0131 11:52:04.548064 10556 slave.cpp:4737] Cleaning up un-reregistered executors
I0131 11:52:04.548064 10556 slave.cpp:6824] Finished recovery
I0131 11:52:04.566066   660 task_status_update_manager.cpp:181] Pausing sending task status updates
I0131 11:52:04.567059 14636 slave.cpp:1146] New master detected at master@10.123.6.78:5050
I0131 11:52:04.567059 14636 slave.cpp:1190] No credentials provided. Attempting to register without authentication
I0131 11:52:04.568047 14636 slave.cpp:1201] Detecting new master
I0131 11:52:04.604035  8720 slave.cpp:1471] Re-registered with master master@10.123.6.78:5050
I0131 11:52:04.605060   660 task_status_update_manager.cpp:188] Resuming sending task status updates
I0131 11:52:04.606036  8720 slave.cpp:1516] Forwarding agent update {"operations":{},"resource_version_uuid":{"value":"mzwol7M6SrGxOml4zYlA8Q=="},"slave_id":{"value":"7dc02270-a4e1-4f59-9ad7-56bad5182ea4-S0"},"update_oversubscribed_resources":true}
I0131 11:52:04.612036  8720 slave.cpp:3625] Updating info for framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000 with pid updated to scheduler-aaa62980-8b1b-4775-b8bb-c6890b41941e@10.123.6.78:45907
I0131 11:52:04.636543 13468 task_status_update_manager.cpp:188] Resuming sending task status updates
```


Thanks,

Andrew Schwartzmeyer
