> On Sept. 10, 2015, 8:20 a.m., Michael Park wrote:
> > I've made a few nit comments below but I have some higher-level questions.
> > 
> >   (1) In this patch, when the destructor of `ProcessManager` is invoked we 
> > immediately start to ignore messages. It's not obvious to me that this is 
> > necessary or desired. Could we just enqueue the `TerminateEvent` by calling 
> > `process::terminate(process, false)`, let it race against other messages 
> > that may be coming in, and ignore anything that arrives after 
> > `TerminateEvent`?
> >   (2) What are the implications of shutting down the event loop via 
> > `EventLoop::stop()`? This would be useful to know to learn about the 
> > required order between process termination, event loop shutdown, and thread 
> > pool joins.
> 
> Greg Mann wrote:
>     Thanks for the review mpark!!
>     
>     Regarding (2), `EventLoop::stop()` calls methods from libevent/libev that 
> terminate the event loop after all active callbacks have completed. This 
> means that we can have messages coming in from a socket after this call is 
> made.
>     
>     Regarding (1), I was worried about messages coming in on sockets 
> preventing the queues from draining, but as I think Joris mentioned in our 
> previous discussion, if messages are coming in fast enough to prevent 
> draining, then they're coming in too fast for just about anything to happen, 
> so it's not worth worrying about. Further, it's possible that an existing 
> process needs to receive a socket message in order to terminate, so they 
> should be allowed. I think that we can do away with `draining_queue`.
>     
>     Once the event loop is stopped, we won't have any new socket messages 
> coming in, but the question is if we still need a boolean check in 
> `ProcessManager::handle()` to make sure that any final callbacks on the event 
> loop don't attempt to queue up events. I think this would only be necessary 
> if it's possible for a socket message to spawn a new process directly, and 
> I'm not yet sure if that is the case? If the socket message attempts to queue 
> up an event on an existing process, the run queue will be empty, so it won't 
> be possible, but if it adds a process to the run queue, we have a problem. 
> I've removed the boolean check from `ProcessManager::handle()` for now.

So I see that new processes can be spawned and enqueued by socket messages, 
which means that at shutdown, new processes could get enqueued after the queue 
has been drained but before the process threads have exited. To avoid this, I 
added a check on `joining_threads` into `ProcessManager::enqueue()` and moved 
the `joining_threads.store(true)` to occur immediately following the queue 
draining.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37821/#review98329
-----------------------------------------------------------


On Sept. 13, 2015, 11:01 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37821/
> -----------------------------------------------------------
> 
> (Updated Sept. 13, 2015, 11:01 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Artem Harutyunyan, Joris Van 
> Remoortere, Joseph Wu, and Michael Park.
> 
> 
> Bugs: MESOS-3158
>     https://issues.apache.org/jira/browse/MESOS-3158
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Join threads in libprocess when shutting down.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/src/event_loop.hpp 
> 36a4cd2b1ff59f6922173ad17115bf80cc3c8f30 
>   3rdparty/libprocess/src/libev.cpp 97a2694f9b10bc61841443b21f4f96055493e840 
>   3rdparty/libprocess/src/libevent.cpp 
> ee7906470069b0391dde7cd685b1d4eb3a158c03 
>   3rdparty/libprocess/src/process.cpp 
> 4afa30569b4d235637b49a624602e6b199c32e0e 
> 
> Diff: https://reviews.apache.org/r/37821/diff/
> 
> 
> Testing
> -------
> 
> After configuring with both "../configure" and "../configure 
> --enable-libevent --enable-ssl":
> 
> make check
> 
> 
> Also, to check for race conditions related to the initialization/shutdown of 
> libprocess, try something like:
> 
> for n in {1..1000}; do echo $n; 3rdparty/libprocess/tests 
> --gtest_filter=ProcessTest.Spawn; done
> 
> 
> Thanks,
> 
> Greg Mann
> 
>

Reply via email to