Re: Review Request 51407: Updated mesos containerizer to checkpoint container runtime information.

Jie Yu Sun, 18 Sep 2016 16:51:50 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51407/#review149406
-----------------------------------------------------------





src/slave/containerizer/mesos/containerizer.cpp (lines 151 - 153)
<https://reviews.apache.org/r/51407/#comment217015>

    We we make this a member function of MesosContainerizer. In that way, no 
need to pass in the 'flags'
    
    Binding 'flags' sounds expensive for each reap callas flags can be pretty 
big. You can use
    ```
    .then(defer(self(), [this](...) ...)
    ```
    to avoid binding flags.



src/slave/containerizer/mesos/containerizer.cpp (lines 1344 - 1362)
<https://reviews.apache.org/r/51407/#comment217019>

    THis can be combined by simply using state::checkpoint(...)



src/slave/containerizer/mesos/containerizer.cpp (line 1353)
<https://reviews.apache.org/r/51407/#comment217018>

    I'd have a helper path function for pid file as well.



src/slave/containerizer/mesos/containerizer.cpp (lines 1791 - 1797)
<https://reviews.apache.org/r/51407/#comment217021>

    Hum, should we do the deletion here if we don't even sure the processes are 
killed properly?
    
    I think my suggestion on RuntimePath for container is:
    1) we create the RuntimePath as the first thing in 'launch', even before we 
call any provisioner/isolator functions.
    2) we checkpoint the pid right after fork
    3) we delete the RuntimePath after the destroy is successful.
    
    The invariants we have if using the above way are:
    1) If RuntimePath exists, we know that provisioner/isolator prepare might 
be called, so cleanup is necessary during recovery.
    2) If RuntimePath does not exist, we know that all cleanups have been done 
properly and we no longer need to worry about cleanup.
    3) If pid file exists, we know that the process has been forked.
    4) If the pid file does not exist, we may or maynot have process being 
forked.
    
    For the upgrade situation, some checkpointed containers or orphans may not 
have RuntimePath. In such case, we should not create RuntimePath (i.e., do not 
checkpoint pid or exit status) in order to maintain the above invariant. So for 
old containers launched by previous version of agent, there will be no 
RuntimePath for it at any time (this is another invariant).
    
    It's likely that a container has RuntimePath, but launcher does not know 
about it. This is the case where launcher->destroy has been called (and being 
successful), but agent crashes before removing RuntimePath.
    
    It's also likely that a container does not have RuntimePath, but launcher 
knows about it. This is the legacy container case.


- Jie Yu


On Sept. 18, 2016, 11:14 p.m., Kevin Klues wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51407/
> -----------------------------------------------------------
> 
> (Updated Sept. 18, 2016, 11:14 p.m.)
> 
> 
> Review request for mesos and Jie Yu.
> 
> 
> Bugs: MESOS-6204
>     https://issues.apache.org/jira/browse/MESOS-6204
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This includes checkpointing both the container pid and the status of
> the container upon exit. This also includes an update to tests to
> account for new 'init' process semantics in a container. That is, the
> name of the init process of the container is now "mesos-containerizer"
> not "sh".
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/containerizer.cpp 
> e54169ba00f6e0cdd7043075b4cdd12d714c99e3 
>   src/tests/containerizer/isolator_tests.cpp 
> 93ce75180520d382f63ce7323be844fe43c6594e 
> 
> Diff: https://reviews.apache.org/r/51407/diff/
> 
> 
> Testing
> -------
> 
> $ GTEST_FILTER="" make -j check
> $ src/mesos-tests
> $ sudo src/mesos-tests
> 
> 
> Thanks,
> 
> Kevin Klues
> 
>

Re: Review Request 51407: Updated mesos containerizer to checkpoint container runtime information.

Reply via email to