> On Feb. 1, 2018, 10:32 p.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/main.cpp
> > Lines 40-50 (patched)
> > <https://reviews.apache.org/r/65465/diff/1/?file=1951378#file1951378line40>
> >
> >     Flying by. Why is this logic not in launch.cpp? It sounds to me like 
> > it's unrelated to, for example, Mount below?
> 
> Andrew Schwartzmeyer wrote:
>     Where in `launch.cpp` would you put it? The handle needs to exist for 
> exactly as long as the process exists (or as close as we can get, and 
> putting it here gets really close).

well, i don't think putting it here or in launch.cpp makes any noticeable 
difference in terms of "closeness" (probably a dozen instructions?).

my question is: is this logic only related to the launch of a container or not? 
If yes, this should be moved to launch.cpp (i.e., 
`MesosContainerizerLaunch::execute()`).
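
to illustrate what i mean, here is a rough sketch of holding the handle inside 
the launch path instead. everything in it is hypothetical: `ScopedHandle` and 
`executeSketch()` are made-up names, and the `held` flag only stands in for the 
real job object handle (on Windows the acquire/release would presumably be 
something like `OpenJobObjectW()` / `CloseHandle()`). it is not the patch, just 
the lifetime idea.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical RAII holder: runs `release` when it goes out of scope.
// In the real code, `release` would close the job object handle.
class ScopedHandle
{
public:
  explicit ScopedHandle(std::function<void()> release)
    : release_(std::move(release)) {}

  ~ScopedHandle()
  {
    if (release_) {
      release_();
    }
  }

  // Non-copyable, so the handle is released exactly once.
  ScopedHandle(const ScopedHandle&) = delete;
  ScopedHandle& operator=(const ScopedHandle&) = delete;

private:
  std::function<void()> release_;
};

// Hypothetical stand-in for the launch subcommand's execute().
// `held` models whether the (otherwise unused) handle is open.
int executeSketch(bool& held)
{
  held = true; // Assumption: stands in for opening the job object handle.
  ScopedHandle job([&held]() { held = false; });

  // ... launch the container and wait for it here ...

  assert(held); // The handle is still open while the container runs.
  return 0;     // `job` is destroyed on return, releasing the handle.
}
```

the handle is then scoped to the launch logic, and it still lives until the 
function returns, which is about when the process exits anyway.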


- Jie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65465/#review196662
-----------------------------------------------------------


On Feb. 1, 2018, 7:57 p.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65465/
> -----------------------------------------------------------
> 
> (Updated Feb. 1, 2018, 7:57 p.m.)
> 
> 
> Review request for mesos, Akash Gupta, Jie Yu, and Joseph Wu.
> 
> 
> Bugs: MESOS-8519
>     https://issues.apache.org/jira/browse/MESOS-8519
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The Windows OS deletes the job object created in the agent process when
> the agent dies, because no other process holds a handle to it (despite
> processes being assigned to the job object). While this is
> counter-intuitive, it is the observed behavior. So in order for recovery
> to succeed, the containerizer must also hold an otherwise unused handle
> to its job object to keep it alive in the kernel, and available for
> recovery to find.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/main.cpp 
> a53ccd68bf975d919f9d1f920cf3fa74d4e43f24 
> 
> 
> Diff: https://reviews.apache.org/r/65465/diff/1/
> 
> 
> Testing
> -------
> 
> ```
> [----------] Global test environment tear-down
> [==========] 874 tests from 85 test cases ran. (253311 ms total)
> [  PASSED  ] 874 tests.
> 
> I0201 12:46:58.159368  3116 slave.cpp:6921] Recovering framework 
> eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0201 12:46:58.159368  3116 slave.cpp:8543] Recovering executor 
> 'notepad.01d79d48-0791-11e8-8f77-02421c3bc93c' of framework 
> eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0201 12:46:58.162847  9456 task_status_update_manager.cpp:207] Recovering 
> task status update manager
> I0201 12:46:58.162847  9456 task_status_update_manager.cpp:215] Recovering 
> executor 'notepad.01d79d48-0791-11e8-8f77-02421c3bc93c' of framework 
> eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0201 12:46:58.166851  7344 containerizer.cpp:674] Recovering containerizer
> I0201 12:46:58.167351  7344 containerizer.cpp:731] Recovering container 
> 69cefa53-61e0-444b-a808-e38ffb4cb18f for executor 
> 'notepad.01d79d48-0791-11e8-8f77-02421c3bc93c' of framework 
> eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0201 12:46:58.183379 17088 provisioner.cpp:493] Provisioner recovery complete
> I0201 12:46:58.186367 16792 slave.cpp:6695] Sending reconnect request to 
> executor 'notepad.01d79d48-0791-11e8-8f77-02421c3bc93c' of framework 
> eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000 at executor(1)@10.123.7.41:52591
> I0201 12:46:58.194370  7344 slave.cpp:4519] Received re-registration message 
> from executor 'notepad.01d79d48-0791-11e8-8f77-02421c3bc93c' of framework 
> eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000
> I0201 12:47:00.193958 16792 slave.cpp:4737] Cleaning up un-reregistered 
> executors
> I0201 12:47:00.193958 16792 slave.cpp:6824] Finished recovery
> I0201 12:47:00.200943  9456 task_status_update_manager.cpp:181] Pausing 
> sending task status updates
> I0201 12:47:00.200943  3116 slave.cpp:1146] New master detected at 
> master@10.123.6.78:5050
> I0201 12:47:00.200943  3116 slave.cpp:1190] No credentials provided. 
> Attempting to register without authentication
> I0201 12:47:00.200943  3116 slave.cpp:1201] Detecting new master
> I0201 12:47:00.214944 16792 slave.cpp:1471] Re-registered with master 
> master@10.123.6.78:5050
> I0201 12:47:00.214944 13180 task_status_update_manager.cpp:188] Resuming 
> sending task status updates
> I0201 12:47:00.215942 16792 slave.cpp:1516] Forwarding agent update 
> {"operations":{},"resource_version_uuid" 
> {"value":"jLIL1d\/PQnuwmFxpMf8CLQ=="},"slave_id":{"value":"7dc02270-a4e1-4f59-9ad7-56bad5182ea4S3"},"update_oversubscribed_resources":true}
> I0201 12:47:00.219952  3116 slave.cpp:3625] Updating info for framework 
> eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000 with pid updated to 
> scheduler-aaa62980-8b1b-4775-b8bb-c6890b41941e@10.123.6.78:45907
> I0201 12:47:00.233942  7344 task_status_update_manager.cpp:188] Resuming 
> sending task status updates
> ```
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>
