> On Feb. 1, 2018, 2:32 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/main.cpp
> > Lines 40-50 (patched)
> > <https://reviews.apache.org/r/65465/diff/1/?file=1951378#file1951378line40>
> >
> >     Flying by. Why is this logic not in launch.cpp? It sounds to me like 
> > it's unrelated to, for example, Mount below.

Where in `launch.cpp` would you put it? The handle needs to exist for exactly 
as long as the process exists, or as close to that as we can get, and putting 
it here gets really close.
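
To illustrate the lifetime argument, here is a minimal sketch of a toy model, 
not the actual patch and not the Windows API. It assumes a named kernel object 
stays findable only while at least one handle to it is open, which matches the 
behavior observed for job objects. `KernelObjectTable`, 
`jobSurvivesAgentDeath`, and the job name are hypothetical names made up for 
this example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model (an assumption, not the Windows API): a named kernel object
// stays findable only while at least one handle to it is open.
class KernelObjectTable {
public:
  // Opens a handle to `name`, creating the object if it does not exist.
  void open(const std::string& name) { ++handles_[name]; }

  // Closes one handle; the object is removed when the count reaches zero.
  void close(const std::string& name) {
    auto it = handles_.find(name);
    if (it == handles_.end()) return;
    if (--it->second == 0) handles_.erase(it);
  }

  // True while at least one handle to `name` is still open.
  bool exists(const std::string& name) const {
    return handles_.count(name) > 0;
  }

private:
  std::map<std::string, int> handles_;
};

// Simulates the agent dying and recovery then looking up the job object.
bool jobSurvivesAgentDeath(bool containerizerHoldsHandle) {
  KernelObjectTable kernel;
  const std::string job = "example_job";  // hypothetical job object name

  kernel.open(job);                       // the agent creates the job object
  if (containerizerHoldsHandle) {
    kernel.open(job);                     // the extra, otherwise unused handle
  }
  kernel.close(job);                      // agent dies; its handle is closed

  return kernel.exists(job);              // can recovery still find the job?
}

// Small self-check of the two scenarios.
void demo() {
  assert(jobSurvivesAgentDeath(true));
  assert(!jobSurvivesAgentDeath(false));
}
```

In this model the job object is still there for recovery only when a second 
process holds a handle, which is why the handle should live as long as the 
containerizer process itself.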


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65465/#review196662
-----------------------------------------------------------


On Feb. 1, 2018, 11:57 a.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65465/
> -----------------------------------------------------------
> 
> (Updated Feb. 1, 2018, 11:57 a.m.)
> 
> 
> Review request for mesos, Akash Gupta, Jie Yu, and Joseph Wu.
> 
> 
> Bugs: MESOS-8519
>     https://issues.apache.org/jira/browse/MESOS-8519
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The Windows OS deletes the job object created in the agent process when
> the agent dies, because no other process holds a handle to it (despite
> processes being assigned to the job object). While this is
> counter-intuitive, it is the observed behavior. So in order for recovery
> to succeed, the containerizer must also hold an otherwise unused handle
> to its job object to keep it alive in the kernel, and available for
> recovery to find.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/main.cpp a53ccd68bf975d919f9d1f920cf3fa74d4e43f24 
> 
> 
> Diff: https://reviews.apache.org/r/65465/diff/1/
> 
> 
> Testing
> -------
> 
> ```
> [----------] Global test environment tear-down
> [==========] 874 tests from 85 test cases ran. (253311 ms total)
> [  PASSED  ] 874 tests.
> 
> I0201 12:46:58.159368  3116 slave.cpp:6921] Recovering framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0201 12:46:58.159368  3116 slave.cpp:8543] Recovering executor 'notepad.01d79d48-0791-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0201 12:46:58.162847  9456 task_status_update_manager.cpp:207] Recovering task status update manager
> I0201 12:46:58.162847  9456 task_status_update_manager.cpp:215] Recovering executor 'notepad.01d79d48-0791-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0201 12:46:58.166851  7344 containerizer.cpp:674] Recovering containerizer
> I0201 12:46:58.167351  7344 containerizer.cpp:731] Recovering container 69cefa53-61e0-444b-a808-e38ffb4cb18f for executor 'notepad.01d79d48-0791-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0201 12:46:58.183379 17088 provisioner.cpp:493] Provisioner recovery complete
> I0201 12:46:58.186367 16792 slave.cpp:6695] Sending reconnect request to executor 'notepad.01d79d48-0791-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000 at executor(1)@10.123.7.41:52591
> I0201 12:46:58.194370  7344 slave.cpp:4519] Received re-registration message from executor 'notepad.01d79d48-0791-11e8-8f77-02421c3bc93c' of framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0201 12:47:00.193958 16792 slave.cpp:4737] Cleaning up un-reregistered executors
> I0201 12:47:00.193958 16792 slave.cpp:6824] Finished recovery
> I0201 12:47:00.200943  9456 task_status_update_manager.cpp:181] Pausing sending task status updates
> I0201 12:47:00.200943  3116 slave.cpp:1146] New master detected at master@10.123.6.78:5050
> I0201 12:47:00.200943  3116 slave.cpp:1190] No credentials provided. Attempting to register without authentication
> I0201 12:47:00.200943  3116 slave.cpp:1201] Detecting new master
> I0201 12:47:00.214944 16792 slave.cpp:1471] Re-registered with master master@10.123.6.78:5050
> I0201 12:47:00.214944 13180 task_status_update_manager.cpp:188] Resuming sending task status updates
> I0201 12:47:00.215942 16792 slave.cpp:1516] Forwarding agent update {"operations":{},"resource_version_uuid" {"value":"jLIL1d\/PQnuwmFxpMf8CLQ=="},"slave_id":{"value":"7dc02270-a4e1-4f59-9ad7-56bad5182ea4S3"},"update_oversubscribed_resources":true}
> I0201 12:47:00.219952  3116 slave.cpp:3625] Updating info for framework eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000 with pid updated to scheduler-aaa62980-8b1b-4775-b8bb-c6890b41941e@10.123.6.78:45907
> I0201 12:47:00.233942  7344 task_status_update_manager.cpp:188] Resuming sending task status updates
> ```
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>
