> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote: > > src/slave/slave.hpp, line 583 > > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line583> > > > > Could you add the closed() method here to make this consistent with the > > scheduler's HttpConnection struct in the master? We may want to consolidate > > these two structs later.
Done > On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote: > > src/slave/slave.hpp, lines 622-625 > > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line622> > > > > Looks like you also want to warn about TERMINATING? > > > > How about just printing the state instead of saying "disconnected" to > > be a bit more explicit? When the executor is launched, the agent sets its `state = REGISTERING`. So it looks fine to have the `state == REGISTERING` check. However, what I had missed was, when the agent sends an executor a `Shutdown` event, it marks it's state as `TERMINATING` and then sends the event. In short, we would always be disconnected when the states are `TERMINATED/REGISTERING`. I added the state to the `LOG(WARNING)`. Thanks for the suggestion. > On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote: > > src/slave/slave.cpp, lines 5117-5119 > > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line5117> > > > > Per our chat and my other suggestion, hoping we can turn this case into > > neither 'pid' nor 'http' being set. Fixed > On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote: > > src/slave/slave.hpp, line 671 > > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line671> > > > > Does this first "when" belong? It seems a bit confusing that we have > > this pid = UPID() case, ideally we can be explicit about when we don't know > > the connection type yet by having both set to None, i.e. the equivalent of: > > > > ``` > > Either<None, UPID, HttpConnection> > > ``` As we had discussed offline, `PID/HTTP` would be both `None()` on launch. While recovering, if the agent is not able to find the `pid/http` marker file it would set `pid = UPID()`. > On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote: > > src/slave/slave.cpp, line 2391 > > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line2391> > > > > Ah, I meant to remove `reply` completely due to it being error prone > > (the implicit use of a UPID means it breaks if called asynchronously), see: > > > > https://issues.apache.org/jira/browse/MESOS-765 > > Here's a case of it causing a bug: > > https://issues.apache.org/jira/browse/MESOS-1310 > > > > We should update these in a separate patch to do explicit sending based > > on `from` and remove `reply` completely. Sending in a patch for these shortly :) > On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote: > > src/slave/slave.cpp, line 2840 > > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line2840> > > > > It's a little weird to have these pid checks down here. > > > > I was going to suggest having a CHECK_SOME at the top of each message > > handler, but I'm not sure that's safe (can these paths get triggered when > > 'pid' is none?). Previously, we supported other executors to send `statusUpdates` on behalf of other executors as you had pointed out. If the other executor has not yet `registered`, the value of `executor->pid` would still be `None()` for it. So , I had to include these extra `isSome()` checks since these paths can get triggered when `pid` is None() > On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote: > > src/slave/slave.cpp, lines 2850-2851 > > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line2850> > > > > Just for some context, we originally added this for Aurora's GC > > executor, which has now been obsoleted by reconciliation and is no longer > > in use. > > > > With the HTTP api, this gets tricker to support (we can't use PIDs and > > have to use session identifiers for this). Given that, I'm not sure if we > > should continue supporting it in the HTTP code paths. Based on my conversations with Vinod + what we had already discussed, we won't be supporting the ability of `HTTP` executors to send messages on behalf of others. Hopefully, that should make the code-paths in `MESOS-3476` much simpler. > On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote: > > src/slave/slave.hpp, lines 668-669 > > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line668> > > > > "libprocess message passing" Fixed - Anand ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/38874/#review101830 ----------------------------------------------------------- On Oct. 6, 2015, 2:22 a.m., Anand Mazumdar wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/38874/ > ----------------------------------------------------------- > > (Updated Oct. 6, 2015, 2:22 a.m.) > > > Review request for mesos, Ben Mahler, Isabel Jimenez, and Vinod Kone. > > > Bugs: MESOS-3480 > https://issues.apache.org/jira/browse/MESOS-3480 > > > Repository: mesos > > > Description > ------- > > This change refactors the Executor struct on Agent and adds support for > Executors to connect via the `api/v1/executor` endpoint on Agent. This is > similar to the change done in Master for the Scheduler HTTP API. > > > Diffs > ----- > > src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 > src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c > > Diff: https://reviews.apache.org/r/38874/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Anand Mazumdar > >
