> On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/exec/exec.cpp, line 393 > > <https://reviews.apache.org/r/8762/diff/7/?file=263157#file263157line393> > > > > Can we maybe spell this out in the public executor interfaces? > > Considering how harsh this is, I like the harshness if we have this > > contract, as people will _definitely_ discover they are doing this. > > > > One would probably assume the driver will store messages for later > > delivery if sent pre-registration. That would be a nicer contract than > > doing this, but I see the added complexity =/
Decided to pass SlaveID through environment. So, no need for enforcing this in the contract. Also, now, we actually buffer status updates. > On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/exec/exec.cpp, line 539 > > <https://reviews.apache.org/r/8762/diff/7/?file=263157#file263157line539> > > > > Were you going to add slave id to these? > > > > While you're at it: https://issues.apache.org/jira/browse/MESOS-334 > > > > Or maybe let's not do too many things in this change.. done. > On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/slave/slave.cpp, lines 415-416 > > <https://reviews.apache.org/r/8762/diff/7/?file=263174#file263174line415> > > > > given the context here, "terminating" seems more appropriate > > > > you would set terminating to true, and then you call terminate, makes > > more sense IMO "terminating" would be confusing, because intuitively it sounds like someone called 'terminate()'. > On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/slave/slave.cpp, line 1115 > > <https://reviews.apache.org/r/8762/diff/7/?file=263174#file263174line1115> > > > > Does this imply the StatusUpdateManager provides idempotent primitives > > in general? yes. > On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/slave/slave.cpp, line 1151 > > <https://reviews.apache.org/r/8762/diff/7/?file=263174#file263174line1151> > > > > Kill "If we are here, ", I know the comment is for this spot in the > > code because the comment was written in this spot for a reason! ;) i like the emphasis. > On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/slave/slave.cpp, line 1153 > > <https://reviews.apache.org/r/8762/diff/7/?file=263174#file263174line1153> > > > > Isn't the reaper only for the process isolation module? nope. > On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/slave/slave.cpp, line 1702 > > <https://reviews.apache.org/r/8762/diff/7/?file=263174#file263174line1702> > > > > quotes on executor id but not framework id? as i explained in an earlier review, this has been the convention, for better or worse. > On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/slave/slave.cpp, line 1753 > > <https://reviews.apache.org/r/8762/diff/7/?file=263174#file263174line1753> > > > > s/by/for/? for tasks, by isolation module. > On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/slave/slave.cpp, line 1797 > > <https://reviews.apache.org/r/8762/diff/7/?file=263174#file263174line1797> > > > > else? else, no action needed. > On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/tests/slave_recovery_tests.cpp, line 616 > > <https://reviews.apache.org/r/8762/diff/7/?file=263177#file263177line616> > > > > Line wrap at 70 ? > On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/slave/cgroups_isolation_module.cpp, line 620 > > <https://reviews.apache.org/r/8762/diff/7/?file=263163#file263163line620> > > > > Can you split off the killed case in terms of the failed message? > > > > I see unknown as more serious than killed (since killed can happen due > > to known races). I don't see the benefit, considering the fact that a killed executor becomes unknown as soon as it gets cleaned up. Also note, that we use the same phrasing across a bunch of functions, across isolation modules. > On March 1, 2013, 8:43 p.m., Ben Mahler wrote: > > src/slave/process_based_isolation_module.cpp, line 329 > > <https://reviews.apache.org/r/8762/diff/7/?file=263171#file263171line329> > > > > ditto on splitting the failure message see above. - Vinod ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/8762/#review17260 ----------------------------------------------------------- On Feb. 27, 2013, 7:59 p.m., Vinod Kone wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/8762/ > ----------------------------------------------------------- > > (Updated Feb. 27, 2013, 7:59 p.m.) > > > Review request for mesos, Benjamin Hindman and Ben Mahler. > > > Description > ------- > > Properly recovers and reconnects with executors. > > Also, recovers isolation module. > > This is pretty much the whole of recovery! (I'm going to send out another > tiny review for properly doing incompatible upgrades) > > > Diffs > ----- > > src/exec/exec.cpp 821a94fab1f5969183ecf9e28d7b6bc10920db24 > src/launcher/launcher.hpp ead58c0f60b6bff532f241a224b547b25e292175 > src/launcher/launcher.cpp f2d68c5fd1201b75c5a8f96599d7fb36516687fb > src/launcher/main.cpp e90df85ae433431defdd251490c74a482a58f743 > src/messages/messages.proto 815fcbbcb4a8643f50950a294cedf7281b2a187f > src/slave/cgroups_isolation_module.hpp > a04fc46b15d2741886f5847cadbdc9bed351c588 > src/slave/cgroups_isolation_module.cpp > a779de80d13c67e507d7d2ee788fcdaa5e71daca > src/slave/constants.hpp 08db75f3e5af0ae79298546386deddf00beb8de9 > src/slave/constants.cpp 159dd12ec1e6b5f33fd75fc3a66d8ac897fc40b7 > src/slave/flags.hpp 9cd436cae5772b906b3566eb1f344904d5894214 > src/slave/isolation_module.hpp d7cc02b8ea0cde591ee61059efd79fbacaa74769 > src/slave/lxc_isolation_module.hpp 6be99038fb547e43e74ef5469f2339bbe82017cf > src/slave/lxc_isolation_module.cpp 1ed3c87f884089e032bba2ed89be109abe6c1c13 > src/slave/process_based_isolation_module.hpp > 6a4e6aef9ab4b13cce9341f6f35c6b24e6f8fde7 > src/slave/process_based_isolation_module.cpp > ff98d105af675dfc66070feaa43b42c1aa438fd8 > src/slave/reaper.cpp c0ee4b4c07fd792bcb39455b666808b712eb32c2 > src/slave/slave.hpp 7648c33230c1900eda7529045c5df9ccab105d47 > src/slave/slave.cpp 8c2e1bfc363491c681177676f9dfe5f229276f7d > src/slave/state.cpp 2688567719c5b12f1d8a56fa6193c11d4077685b > src/slave/status_update_manager.cpp PRE-CREATION > src/tests/slave_recovery_tests.cpp PRE-CREATION > src/tests/utils.hpp 4600c2f6e0ceabd6a38f8bb762da314b795c23a0 > > Diff: https://reviews.apache.org/r/8762/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Vinod Kone > >
