> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > I've only gotten halfway through ... but there is a bunch here already. I'd
> > like to break this up into at least four patches. (1) The utils stuff that
> > was added. (2) The master changes. (3) The slave::path namespace stuff. (3)
> > The status update manager API + implementation (but not the slave using it
> > yet). And (4) the slave using each of these components, and the executor
> > changes that are included.
> >
> > These comments are across all of those patches, but I'll make future passes
> > on each of those components.

addressed the comments for the utils.hpp part. Will send a review for utils.hpp and protobuf_utils.hpp (forgot to include it in this review) shortly.


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 211
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line211>
> >
> > Add that this is at the current file position of the file descriptor.

done


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 218
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line218>
> >
> > How is this helpful? (If this came from my code, it should be removed.)

it did come from your code. killed it.


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 259
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line259>
> >
> > Blah.

fixed format.


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 253
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line253>
> >
> > I'd prefer if this did not seek to the beginning and read the file, but
> > rather read from the current position until the end (and have the comment
> > say as much).

line 263 was a bug. fixed it. this line should make sense now!


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 264
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line264>
> >
> > This looks like a bug ('offset' as the third argument?).

good catch! fixed.


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 267
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line267>
> >
> > No need for the space here though.

fixed


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 276
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line276>
> >
> > Again, should be killed (only makes sense in a macro).

killed


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 290
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line290>
> >
> > You should refactor the protobuf::read and protobuf::write to use these
> > versions of read and write now as well.

didn't get you?


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 329
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line329>
> >
> > man 3 dirname (and use it please).

aha..done


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 532
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line532>
> >
> > s/file_pattern/pattern

done


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 534
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line534>
> >
> > s/result/results

done


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 536
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line536>
> >
> > Kill.

done


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 538
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line538>
> >
> > Why not return a Try instead?

done


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 546
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line546>
> >
> > Why is this a hack?

because we are using the exists() function to check isDir() semantics, based on the knowledge that the entry always exists.


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 549
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line549>
> >
> > s/p/result or s/p/path

done


> On 2012-04-25 22:11:01, Benjamin Hindman wrote:
> > src/common/utils.hpp, line 555
> > <https://reviews.apache.org/r/4462/diff/3/?file=103024#file103024line555>
> >
> > Kill.

done


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4462/#review7232
-----------------------------------------------------------


On 2012-04-19 16:53:07, Vinod Kone wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4462/
> -----------------------------------------------------------
>
> (Updated 2012-04-19 16:53:07)
>
>
> Review request for mesos, Benjamin Hindman and John Sirois.
>
>
> Summary
> -------
>
> Sorry for the huge CL!
>
> Slave restart now supports recovery!
> --> Non-disruptive restart means running tasks are not lost
> --> Re-connects with live executors
> --> Checkpoints and reliably sends status updates
> --> Ability to kill executors if the slave upgrade is incompatible with
> running executors
>
>
> This addresses bug mesos-110.
>     https://issues.apache.org/jira/browse/mesos-110
>
>
> Diffs
> -----
>
>   src/Makefile.am d5edaa2
>   src/common/hashset.hpp 1feb610
>   src/common/utils.hpp 1d81e21
>   src/exec/exec.cpp e8db407
>   src/launcher/launcher.cpp a141b9a
>   src/local/local.hpp 55f9eaf
>   src/local/local.cpp affe432
>   src/master/master.cpp 4dc9ee0
>   src/messages/messages.proto 87e1548
>   src/sched/sched.cpp dcadb10
>   src/scripts/killtree.sh bceae9d
>   src/slave/constants.hpp f0c8679
>   src/slave/http.cpp 19c48a0
>   src/slave/isolation_module.hpp c896908
>   src/slave/lxc_isolation_module.hpp b7beefe
>   src/slave/lxc_isolation_module.cpp 66a2a89
>   src/slave/main.cpp 85cba25
>   src/slave/process_based_isolation_module.hpp f6f9554
>   src/slave/process_based_isolation_module.cpp 2b37d42
>   src/slave/slave.hpp 279bc7b
>   src/slave/slave.cpp 3358ec4
>   src/slave/statusupdates_manager.hpp PRE-CREATION
>   src/slave/statusupdates_manager.cpp PRE-CREATION
>   src/tests/external_tests.cpp d1b20e4
>   src/tests/fault_tolerance_tests.cpp 6772daf
>   src/tests/slave_restart_tests.cpp PRE-CREATION
>   src/tests/utils.hpp e81ec82
>
> Diff: https://reviews.apache.org/r/4462/diff
>
>
> Testing
> -------
>
> make check.
>
> Note that only the new test in tests/slave_restart_tests.cpp engages in
> recovery!
>
> Recovery is disabled for old tests (though they still checkpoint relevant
> info!)
>
>
> Thanks,
>
> Vinod
>
>
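
For the comment on src/common/utils.hpp line 253 (read from the current position until the end instead of seeking to the beginning), here is a minimal sketch of what that behaviour could look like. This is not the code from the patch: the name readToEnd and its signature are hypothetical, chosen only for illustration, and the only thing it relies on is POSIX read(2).

```cpp
#include <cassert>
#include <cerrno>
#include <string>

#include <unistd.h>

// Hypothetical helper, NOT the actual utils.hpp code.
// Reads from the current file position of 'fd' until end-of-file and
// appends the bytes to '*out'. It never seeks, so anything the caller
// has already consumed from 'fd' is not read again.
// Returns true on EOF, false on a read error (errno is set by read(2)).
inline bool readToEnd(int fd, std::string* out)
{
  assert(out != nullptr);

  char buffer[4096];
  for (;;) {
    ssize_t n = ::read(fd, buffer, sizeof(buffer));
    if (n > 0) {
      out->append(buffer, static_cast<size_t>(n));
    } else if (n == 0) {
      return true; // End of file.
    } else if (errno != EINTR) {
      return false; // Real error; retry only if interrupted.
    }
  }
}
```

A caller that has already read a header from the descriptor would get only the remaining bytes back, which is the semantics the comment asks the doc string to spell out.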
