> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.hpp, line 583
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line583>
> >
> >     Could you add the closed() method here to make this consistent with the 
> > scheduler's HttpConnection struct in the master? We may want to consolidate 
> > these two structs later.

Done


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.hpp, lines 622-625
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line622>
> >
> >     Looks like you also want to warn about TERMINATING?
> >     
> >     How about just printing the state instead of saying "disconnected" to 
> > be a bit more explicit?

When the executor is launched, the agent sets its `state = REGISTERING`. So it 
looks fine to have the `state == REGISTERING` check.

However, what I had missed was, when the agent sends an executor a `Shutdown` 
event, it marks it's state as `TERMINATING` and then sends the event. In short, 
we would always be disconnected when the states are `TERMINATED/REGISTERING`. I 
added the state to the `LOG(WARNING)`. Thanks for the suggestion.


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.cpp, lines 5117-5119
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line5117>
> >
> >     Per our chat and my other suggestion, hoping we can turn this case into 
> > neither 'pid' nor 'http' being set.

Fixed


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.hpp, line 671
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line671>
> >
> >     Does this first "when" belong? It seems a bit confusing that we have 
> > this pid = UPID() case, ideally we can be explicit about when we don't know 
> > the connection type yet by having both set to None, i.e. the equivalent of:
> >     
> >     ```
> >     Either<None, UPID, HttpConnection>
> >     ```

As we had discussed offline, `PID/HTTP` would be both `None()` on launch. While 
recovering, if the agent is not able to find the `pid/http` marker file it 
would set `pid = UPID()`.


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.cpp, line 2391
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line2391>
> >
> >     Ah, I meant to remove `reply` completely due to it being error prone 
> > (the implicit use of a UPID means it breaks if called asynchronously), see:
> >     
> >     https://issues.apache.org/jira/browse/MESOS-765
> >     Here's a case of it causing a bug: 
> > https://issues.apache.org/jira/browse/MESOS-1310
> >     
> >     We should update these in a separate patch to do explicit sending based 
> > on `from` and remove `reply` completely.

Sending in a patch for these shortly :)


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.cpp, line 2840
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line2840>
> >
> >     It's a little weird to have these pid checks down here.
> >     
> >     I was going to suggest having a CHECK_SOME at the top of each message 
> > handler, but I'm not sure that's safe (can these paths get triggered when 
> > 'pid' is none?).

Previously, we supported other executors to send `statusUpdates` on behalf of 
other executors as you had pointed out. If the other executor has not yet 
`registered`, the value of `executor->pid` would still be `None()` for it. So , 
I had to include these extra `isSome()` checks since these paths can get 
triggered when `pid` is None()


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.cpp, lines 2850-2851
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091511#file1091511line2850>
> >
> >     Just for some context, we originally added this for Aurora's GC 
> > executor, which has now been obsoleted by reconciliation and is no longer 
> > in use.
> >     
> >     With the HTTP api, this gets tricker to support (we can't use PIDs and 
> > have to use session identifiers for this). Given that, I'm not sure if we 
> > should continue supporting it in the HTTP code paths.

Based on my conversations with Vinod + what we had already discussed, we won't 
be supporting the ability of `HTTP` executors to send messages on behalf of 
others. Hopefully, that should make the code-paths in `MESOS-3476` much simpler.


> On Oct. 7, 2015, 11:31 p.m., Ben Mahler wrote:
> > src/slave/slave.hpp, lines 668-669
> > <https://reviews.apache.org/r/38874/diff/2/?file=1091510#file1091510line668>
> >
> >     "libprocess message passing"

Fixed


- Anand


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38874/#review101830
-----------------------------------------------------------


On Oct. 6, 2015, 2:22 a.m., Anand Mazumdar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38874/
> -----------------------------------------------------------
> 
> (Updated Oct. 6, 2015, 2:22 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, Isabel Jimenez, and Vinod Kone.
> 
> 
> Bugs: MESOS-3480
>     https://issues.apache.org/jira/browse/MESOS-3480
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This change refactors the Executor struct on Agent and adds support for 
> Executors to connect via the `api/v1/executor` endpoint on Agent. This is 
> similar to the change done in Master for the Scheduler HTTP API.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 18be4f8188ad34ef4d0aa4b5eba241053d071476 
>   src/slave/slave.cpp d1c9977feeb30ad43586a4560eed155865d27a6c 
> 
> Diff: https://reviews.apache.org/r/38874/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Anand Mazumdar
> 
>

Reply via email to