----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/8762/#review16990 -----------------------------------------------------------
src/exec/exec.cpp <https://reviews.apache.org/r/8762/#comment35992> s/install </install</ src/exec/exec.cpp <https://reviews.apache.org/r/8762/#comment35993> Your error message here makes it sound like this case could happen? Why is this a check then? If this case shouldn't ever happen (because you programmed it that way) then you'll want an error message more along the lines of "Unexpected duplicated task ...". src/exec/exec.cpp <https://reviews.apache.org/r/8762/#comment35995> Should 'slaveId' be an option? I don't really like the .IsInitialized semantics. src/exec/exec.cpp <https://reviews.apache.org/r/8762/#comment35994> s/re-connect/reconnect/ src/exec/exec.cpp <https://reviews.apache.org/r/8762/#comment35996> I tend to put spaces around my elipsis. src/exec/exec.cpp <https://reviews.apache.org/r/8762/#comment35997> Why this change? src/exec/exec.cpp <https://reviews.apache.org/r/8762/#comment35998> At this point in the review I still don't exactly know how/why you're saving all the tasks. I see you do a duplicate check but that seems a bit crazy to save the tasks. Also, sometimes you refer to the tasks as "unacknowledged" and sometimes you refer to them as "un-updated" ... some more comments here are probably necessary. src/exec/exec.cpp <https://reviews.apache.org/r/8762/#comment35999> Or you could remove the other old style comments and put a period at the end of this comment. ;) src/exec/exec.cpp <https://reviews.apache.org/r/8762/#comment36000> Doing numify<bool> seems a little weird here. Why not just: checkpoint = string(value) == "1"; Or something like it? Likewise, if you replace all of the getenv calls here with their modern os::getenv versions then you can get a string back instead of a char*. src/slave/cgroups_isolation_module.cpp <https://reviews.apache.org/r/8762/#comment36002> . src/slave/cgroups_isolation_module.cpp <https://reviews.apache.org/r/8762/#comment36001> Not your bug, but s/tag/uuid/g. Also, I'm much more inclined to always store UUIDs as the UUID type in code and only use toString to pretty print them. Then people don't have to think about what the representation of this UUID might be given that a string could either be the pretty-printed version or just the bytes. This also isn't really your bug, but we should fix this. Note that when we use the UUID to construct the cgroup name or directories then it's totally fine to use the pretty-printed version, let's just do it there not here. src/slave/lxc_isolation_module.cpp <https://reviews.apache.org/r/8762/#comment36003> Just elide the paramater name entirely. src/slave/process_based_isolation_module.cpp <https://reviews.apache.org/r/8762/#comment36004> Ditto. src/slave/process_based_isolation_module.cpp <https://reviews.apache.org/r/8762/#comment36005> Ditto. src/slave/slave.hpp <https://reviews.apache.org/r/8762/#comment36006> Comment or add a TODO on why const&. src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36007> Single quote to make it clear what is "unknown". src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36008> s/shutdown/shut down/g src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36009> I think we need a more descriptive name for 'active' (something more along the lines of shutdown like 'halt' or 'halted' or 'halting'. src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36010> Your spelling fix is incorrect. src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36011> s/it is/it was/ src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36012> s/Re-l/rel/ src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36013> s/Re-l/Rel/ src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36014> Maybe we should do this first? Before we handle status updates or relaunch any tasks? src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36027> What "event" does this deliver? How is it used? Some comments here are necessary. Finally, who owns the promise? Is this is a memory leak? src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36015> s/:/,/ src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36022> Also, add a todo here to just embed the defer directly in the 'then' below (I have a review forthcoming for that!). src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36016> Don't seem like very useful temporaries to me. src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36017> With naming conflicts like this, I'd prefer s/fw/f/. src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36018> Same here, although s/exec/e/ is rather unfortunate. I think we probably want to change the name of framework and executor above. If you do that I could see pulling out more temporaries above too (in this case it makes more sense, the temporaries you pulled out earlier didn't really make sense). src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36019> Comment doesn't make sense. src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36020> So, we have the isolation module recover this executor even if we're just killing it? Probably needs some more comments about this interplay. src/slave/slave.cpp <https://reviews.apache.org/r/8762/#comment36021> s/un-registered/unregistered/ src/slave/status_update_manager.cpp <https://reviews.apache.org/r/8762/#comment36023> Are there other kinds of directories a recover function would need? src/tests/slave_recovery_tests.cpp <https://reviews.apache.org/r/8762/#comment36024> s/",t/", t/ src/tests/slave_recovery_tests.cpp <https://reviews.apache.org/r/8762/#comment36026> Is there a way to just pass in the type T and then instantiate that directly? I think Thomas might have done something like this in allocator tests. If we can do without using the factory here that will save someone time when we eliminate these factories for factory functions. src/tests/slave_recovery_tests.cpp <https://reviews.apache.org/r/8762/#comment36025> This is not Python. - Benjamin Hindman On Feb. 20, 2013, 2:10 a.m., Vinod Kone wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/8762/ > ----------------------------------------------------------- > > (Updated Feb. 20, 2013, 2:10 a.m.) > > > Review request for mesos, Benjamin Hindman and Ben Mahler. > > > Description > ------- > > Properly recovers and reconnects with executors. > > Also, recovers isolation module. > > This is pretty much the whole of recovery! (I'm going to send out another > tiny review for properly doing incompatible upgrades) > > > Diffs > ----- > > src/exec/exec.cpp 821a94fab1f5969183ecf9e28d7b6bc10920db24 > src/launcher/launcher.hpp ead58c0f60b6bff532f241a224b547b25e292175 > src/launcher/launcher.cpp f2d68c5fd1201b75c5a8f96599d7fb36516687fb > src/launcher/main.cpp e90df85ae433431defdd251490c74a482a58f743 > src/messages/messages.proto 815fcbbcb4a8643f50950a294cedf7281b2a187f > src/slave/cgroups_isolation_module.hpp > 669efa14ba2603764aa68ae19a44e79dbfdec192 > src/slave/cgroups_isolation_module.cpp > 14f549edaf1b37a6bca8f75309864333ae775e7c > src/slave/constants.hpp ddf02570caf3793106b3c48e158a5bb48c1ae80c > src/slave/constants.cpp 1735a6b55a93e6537a5a119e5345961f3d84a000 > src/slave/flags.hpp 39e57f4104ee7a1538436ebbb9493581e28c99dd > src/slave/isolation_module.hpp b962365ebeddd047896a66b02a327aa26ae323d3 > src/slave/lxc_isolation_module.hpp 2bc844f491befbe588965da2ada7cfcef0b6f0a4 > src/slave/lxc_isolation_module.cpp 30cff2a49339bb07030727d30352536a0a22d58c > src/slave/process_based_isolation_module.hpp > f1817192582e3646f8dcf17934ba7998829e8fd6 > src/slave/process_based_isolation_module.cpp > 12a579cba56cd3dac384bc7919b0d5537b0e429d > src/slave/slave.hpp e9f7b659ca2860501840b3d01e69915ebd162039 > src/slave/slave.cpp d4721c3eb51db87278d05f6fbe2eadb8a3a9b4dd > src/slave/state.cpp 2688567719c5b12f1d8a56fa6193c11d4077685b > src/slave/status_update_manager.hpp PRE-CREATION > src/slave/status_update_manager.cpp PRE-CREATION > src/tests/slave_recovery_tests.cpp PRE-CREATION > src/tests/utils.hpp be457117515ee727af101370b26bf9188afb8f45 > > Diff: https://reviews.apache.org/r/8762/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Vinod Kone > >
