> On Sept. 10, 2015, 8:20 a.m., Michael Park wrote:
> > I've made a few nit comments below but I have some higher-level questions.
> > 
> >   (1) In this patch, when the destructor of `ProcessManager` is invoked we 
> > immediately start to ignore messages. It's not obvious to me that this is 
> > necessary or desired. Could we just enqueue the `TerminateEvent` by calling 
> > `process::terminate(process, false)`, let it race against other messages 
> > that may be coming in, and ignore anything that arrives after 
> > `TerminateEvent`?
> >   (2) What are the implications of shutting down the event loop via 
> > `EventLoop::stop()`? This would be useful to know to learn about the 
> > required order between process termination, event loop shutdown, and thread 
> > pool joins.

Thanks for the review mpark!!

Regarding (2), `EventLoop::stop()` calls methods from libevent/libev that 
terminate the event loop after all active callbacks have completed. This means 
that we can have messages coming in from a socket after this call is made.

Regarding (1), I was worried about messages coming in on sockets preventing the 
queues from draining, but as I think Joris mentioned in our previous 
discussion, if messages are coming in fast enough to prevent draining, then 
they're coming in too fast for just about anything to happen, so it's not worth 
worrying about. Further, it's possible that an existing process needs to 
receive a socket message in order to terminate, so they should be allowed. I 
think that we can do away with `draining_queue`.

Once the event loop is stopped, we won't have any new socket messages coming 
in, but the question is if we still need a boolean check in 
`ProcessManager::handle()` to make sure that any final callbacks on the event 
loop don't attempt to queue up events. I think this would only be necessary if 
it's possible for a socket message to spawn a new process directly, and I'm not 
yet sure if that is the case? If the socket message attempts to queue up an 
event on an existing process, the run queue will be empty, so it won't be 
possible, but if it adds a process to the run queue, we have a problem. I've 
removed the boolean check from `ProcessManager::handle()` for now.


> On Sept. 10, 2015, 8:20 a.m., Michael Park wrote:
> > 3rdparty/libprocess/src/libev.cpp, line 30
> > <https://reviews.apache.org/r/37821/diff/6/?file=1064284#file1064284line30>
> >
> >     Is there a reason why this was moved up? Seems like `async_shutdown` 
> > could've been added below this, at its original location?

This was moved up because, strictly speaking, the `async_watcher` declaration 
doesn't match the comment below, which states that we will "Define the initial 
values for all of the declarations made in libev.hpp", so it seemed more 
appropriate to place before the comment. If this order is awkward the watcher 
declarations could be moved.


> On Sept. 10, 2015, 8:20 a.m., Michael Park wrote:
> > 3rdparty/libprocess/src/process.cpp, lines 2143-2145
> > <https://reviews.apache.org/r/37821/diff/6/?file=1064286#file1064286line2143>
> >
> >     Could you explain why we want to start ignoring messages here?
> >     
> >     Below, we do `process::terminate(process, false);` which purposely does 
> > not inject the `TerminateEvent` in order to allow the actor to process its 
> > queue.
> >     
> >     This is great, but wouldn't we be dropping messages that were sent 
> > between  `draining_queue.store(true);` and when the `TerminateEvent` 
> > actually gets enqueued at the back? Is the decision that this doesn't 
> > matter?

See my first comment above.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37821/#review98329
-----------------------------------------------------------


On Sept. 8, 2015, 5:54 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37821/
> -----------------------------------------------------------
> 
> (Updated Sept. 8, 2015, 5:54 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Artem Harutyunyan, Joris Van 
> Remoortere, Joseph Wu, and Michael Park.
> 
> 
> Bugs: MESOS-3158
>     https://issues.apache.org/jira/browse/MESOS-3158
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Join threads in libprocess when shutting down.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/src/event_loop.hpp 
> 36a4cd2b1ff59f6922173ad17115bf80cc3c8f30 
>   3rdparty/libprocess/src/libev.cpp 97a2694f9b10bc61841443b21f4f96055493e840 
>   3rdparty/libprocess/src/libevent.cpp 
> ee7906470069b0391dde7cd685b1d4eb3a158c03 
>   3rdparty/libprocess/src/process.cpp 
> 0e5394acff16376809918d583d7aee582cc6da54 
> 
> Diff: https://reviews.apache.org/r/37821/diff/
> 
> 
> Testing
> -------
> 
> After configuring with both "../configure" and "../configure 
> --enable-libevent --enable-ssl":
> 
> make check
> 
> 
> Also, to check for race conditions related to the initialization/shutdown of 
> libprocess, try something like:
> 
> for n in {1..1000}; do echo $n; 3rdparty/libprocess/tests 
> --gtest_filter=ProcessTest.Spawn; done
> 
> 
> Thanks,
> 
> Greg Mann
> 
>

Reply via email to