----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/8535/#review14360 -----------------------------------------------------------
src/slave/slave.cpp <https://reviews.apache.org/r/8535/#comment30554> Hmm, I'm not sure how I feel about these semantics. I think I'd prefer to see the frameworks tasks keep going lost or failed with a message "Couldn't launch task because framework expects checkpointing but slave doesn't support it ..." rather than the slave being killed. Thoughts? src/slave/slave.cpp <https://reviews.apache.org/r/8535/#comment30556> I wonder if the flag should really be called 'checkpoint' and the 'recover' flag should be whether or not to recover from an existing checkpoint? src/slave/slave.cpp <https://reviews.apache.org/r/8535/#comment30553> s/.// src/slave/slave.cpp <https://reviews.apache.org/r/8535/#comment30558> Could we deal with some of these failures by just killing the framework/task? src/slave/slave.cpp <https://reviews.apache.org/r/8535/#comment30560> Ditto question above. src/slave/slave.cpp <https://reviews.apache.org/r/8535/#comment30563> CHECK_SOME? src/slave/slave.cpp <https://reviews.apache.org/r/8535/#comment30562> Not following this ... src/slave/slave.cpp <https://reviews.apache.org/r/8535/#comment30564> Why a CHECK above and just a LOG here? src/slave/slave.cpp <https://reviews.apache.org/r/8535/#comment30566> Ditto questions above. src/slave/slave.cpp <https://reviews.apache.org/r/8535/#comment30568> Same comments as above. src/tests/protobuf_io_tests.cpp <https://reviews.apache.org/r/8535/#comment30569> Did the compiler want/need these? - Benjamin Hindman On Dec. 12, 2012, 8:58 a.m., Vinod Kone wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/8535/ > ----------------------------------------------------------- > > (Updated Dec. 12, 2012, 8:58 a.m.) > > > Review request for mesos, Benjamin Hindman and Ben Mahler. > > > Description > ------- > > Checkpoints slave id, framework pid, executor pids (libprocess, execed, > forked) and task info. > > > Diffs > ----- > > src/exec/exec.cpp 11cb64dd3f6d3992072582f2b6d6e6d9220eb48d > src/messages/messages.proto 815fcbbcb4a8643f50950a294cedf7281b2a187f > src/slave/cgroups_isolation_module.hpp > 669efa14ba2603764aa68ae19a44e79dbfdec192 > src/slave/cgroups_isolation_module.cpp > 0f2975d1adf874dba4d0a539eb5c99233cef6e6b > src/slave/flags.hpp 39e57f4104ee7a1538436ebbb9493581e28c99dd > src/slave/isolation_module.hpp b962365ebeddd047896a66b02a327aa26ae323d3 > src/slave/lxc_isolation_module.hpp 2bc844f491befbe588965da2ada7cfcef0b6f0a4 > src/slave/lxc_isolation_module.cpp 30cff2a49339bb07030727d30352536a0a22d58c > src/slave/process_based_isolation_module.hpp > f1817192582e3646f8dcf17934ba7998829e8fd6 > src/slave/process_based_isolation_module.cpp > ede080069d814f410662f759806d1d5b260e8569 > src/slave/slave.hpp bbba4404e9e2b1ff1e246f017cdad704438973ba > src/slave/slave.cpp 28fd4c336d8ac658cf92811d20066a6cfdf5a95e > src/tests/protobuf_io_tests.cpp 979efac0f28c3d361f5c347f15933d89d9356bc9 > src/tests/status_update_manager_tests.cpp PRE-CREATION > src/tests/utils.hpp be457117515ee727af101370b26bf9188afb8f45 > > Diff: https://reviews.apache.org/r/8535/diff/ > > > Testing > ------- > > make check > > Will write new tests as part of Part 5, when I write the analogous > read/replay functions of the checkpoint functions. > > > Thanks, > > Vinod Kone > >
