-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8535/#review14456
-----------------------------------------------------------



src/exec/exec.cpp
<https://reviews.apache.org/r/8535/#comment30767>

    Can you add these for getpid?
    
    #include <unistd.h>
    
    #include <sys/types.h>



src/messages/messages.proto
<https://reviews.apache.org/r/8535/#comment30768>

    Again, I'm scared about adding required fields.



src/slave/cgroups_isolation_module.cpp
<https://reviews.apache.org/r/8535/#comment30769>

    I know it's the same but given what you're checking I think it's clearer to 
leave the logic to read literally as your message reads:
    
    !(checkpoint && path.isNone())
    
    or
    
    if (checkpoint) {
      CHECK_SOME(path) << "Blah";
    }



src/slave/cgroups_isolation_module.cpp
<https://reviews.apache.org/r/8535/#comment30771>

    Use CHECK_SOME (you don't need to store the write result if you don't like).
    Change the message to use the 'failed' format.
    
    CHECK_SOME(os::write(path.get(), stringify(getpid())))
      << "Failed to checkpoint forked pid " << add pid to message?



src/slave/flags.hpp
<https://reviews.apache.org/r/8535/#comment30772>

    s/Enable checkpointing/checkpoint



src/slave/flags.hpp
<https://reviews.apache.org/r/8535/#comment30776>

    Can you be more explicit about the fact that checkpoint enables slave 
recovery? Or is there another flag for that?
    
    Can you mention that we checkpoint to disk? (there's a cost of disk io)



src/slave/lxc_isolation_module.cpp
<https://reviews.apache.org/r/8535/#comment30777>

    ditto on the check



src/slave/lxc_isolation_module.cpp
<https://reviews.apache.org/r/8535/#comment30778>

    ditto on my other comments with check_some and the check message



src/slave/process_based_isolation_module.cpp
<https://reviews.apache.org/r/8535/#comment30779>

    ditto on the check



src/slave/process_based_isolation_module.cpp
<https://reviews.apache.org/r/8535/#comment30780>

    ditto on the check_some and the check message



src/slave/slave.hpp
<https://reviews.apache.org/r/8535/#comment30781>

    With checkpointing enabled, will the slave id stay the same across 
recovered rolling of the slave?



src/slave/slave.cpp
<https://reviews.apache.org/r/8535/#comment30782>

    Do we end these messages with periods?



src/slave/slave.cpp
<https://reviews.apache.org/r/8535/#comment30786>

    Can you add a relevant message?



src/slave/slave.cpp
<https://reviews.apache.org/r/8535/#comment30783>

    might as well not store result, if not storing the mkdir result either:
    
    CHECK_SOME(::protobuf::write(path, id)) << "Error checkpointing slave id " 
<< id;



src/slave/slave.cpp
<https://reviews.apache.org/r/8535/#comment30787>

    ditto for adding message



src/slave/slave.cpp
<https://reviews.apache.org/r/8535/#comment30784>

    ditto for not storing result if you don't like



src/slave/slave.cpp
<https://reviews.apache.org/r/8535/#comment30788>

    ditto for adding message



src/slave/slave.cpp
<https://reviews.apache.org/r/8535/#comment30789>

    ditto for not storing if you don't like



src/slave/slave.cpp
<https://reviews.apache.org/r/8535/#comment30790>

    ditto for not storing



src/slave/slave.cpp
<https://reviews.apache.org/r/8535/#comment30791>

    ditto for adding message



src/slave/slave.cpp
<https://reviews.apache.org/r/8535/#comment30793>

    ditto for not storing


- Ben Mahler


On Dec. 12, 2012, 11:16 p.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8535/
> -----------------------------------------------------------
> 
> (Updated Dec. 12, 2012, 11:16 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Description
> -------
> 
> Checkpoints slave id, framework pid, executor pids (libprocess, execed, 
> forked) and task info.
> 
> 
> Diffs
> -----
> 
>   src/exec/exec.cpp 821a94fab1f5969183ecf9e28d7b6bc10920db24 
>   src/messages/messages.proto 815fcbbcb4a8643f50950a294cedf7281b2a187f 
>   src/slave/cgroups_isolation_module.hpp 
> 669efa14ba2603764aa68ae19a44e79dbfdec192 
>   src/slave/cgroups_isolation_module.cpp 
> 0f2975d1adf874dba4d0a539eb5c99233cef6e6b 
>   src/slave/flags.hpp 39e57f4104ee7a1538436ebbb9493581e28c99dd 
>   src/slave/isolation_module.hpp b962365ebeddd047896a66b02a327aa26ae323d3 
>   src/slave/lxc_isolation_module.hpp 2bc844f491befbe588965da2ada7cfcef0b6f0a4 
>   src/slave/lxc_isolation_module.cpp 30cff2a49339bb07030727d30352536a0a22d58c 
>   src/slave/process_based_isolation_module.hpp 
> f1817192582e3646f8dcf17934ba7998829e8fd6 
>   src/slave/process_based_isolation_module.cpp 
> 3d50a4b652e4e09dd57e744e408c8fb79ff3fbf5 
>   src/slave/slave.hpp e9f7b659ca2860501840b3d01e69915ebd162039 
>   src/slave/slave.cpp 9755b46f97173d6fcc9ab1fd63e0e4814b3bc018 
>   src/tests/protobuf_io_tests.cpp 979efac0f28c3d361f5c347f15933d89d9356bc9 
>   src/tests/status_update_manager_tests.cpp PRE-CREATION 
>   src/tests/utils.hpp be457117515ee727af101370b26bf9188afb8f45 
> 
> Diff: https://reviews.apache.org/r/8535/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> Will write new tests as part of Part 5, when I write the analogous 
> read/replay functions of the checkpoint functions.
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>

Reply via email to