> On Nov. 20, 2015, 2:16 a.m., Neil Conway wrote:
> > Good find. I wonder:
> > 
> > (a) is there some general advice we should give to people implementing 
> > Processes (e.g., "always provide a destructor that does terminate/wait" -- 
> > that is probably too broad though). Would be nice to add this to the 
> > libprocess README.
> > (b) does this problem occur anywhere else? and/or is there a way to detect 
> > it?
> 
> Bernd Mathiske wrote:
>     [~neilc] a) I agree that we should provide documentation in this regard. 
> This kind of pattern is too easily overlooked, also by reviewers, as 
> exemplified when the bug got introduced: https://reviews.apache.org/r/27483/
>     b) For test code (such as is the case here), we could put something into 
> the inherited fixture that scans for orphaned processes at TearDown.

Note that there is some asymmetry in the API: when *not giving* a `manage` arg 
to `spawn`, you get the default value `false`; but you then *do need to add 
code* to terminate and wait for the process.

I personally would naively have expected an API to either (i) require me to be 
explicit in both places (explicitly set `manage=false`, and manually 
`terminate` and `wait`), or (ii) just do The Right Thing if I was brief (I 
called `spawn` with default args, and do not have to worry about cleanup).

I.e., wouldn't it in the long run make more sense to change `spawn` to `PID<T> 
spawn(T& t, bool manage = true)` than to add more documentation?
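
To illustrate the asymmetry, here is a rough, self-contained toy model. It 
does not use libprocess at all: `ToyProcess`, `ToyManager`, `ScopedSpawn`, 
`leakyTestBody` and `guardedTestBody` are hypothetical names made up for this 
sketch, and the toy `spawn`/`terminateAndWait` only mimic the bookkeeping 
described in the review (a manager that keeps a non-owning pointer to a 
stack object). The guard shows one way the cleanup could be tied to scope 
exit instead of relying on the caller to remember it.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Toy stand-in for a process object; NOT the libprocess `Process` class.
struct ToyProcess {};

// Toy stand-in for a process manager. It only keeps non-owning pointers,
// so it cannot notice when a registered object has been destroyed.
class ToyManager {
public:
  void spawn(ToyProcess* p) { running_.insert(p); }
  void terminateAndWait(ToyProcess* p) { running_.erase(p); }
  std::size_t running() const { return running_.size(); }

private:
  std::set<ToyProcess*> running_;
};

// Hypothetical RAII guard: registers the process on construction and
// always deregisters it when the enclosing scope ends.
class ScopedSpawn {
public:
  ScopedSpawn(ToyManager& manager, ToyProcess& process)
    : manager_(manager), process_(process) {
    manager_.spawn(&process_);
  }

  ~ScopedSpawn() { manager_.terminateAndWait(&process_); }

  ScopedSpawn(const ScopedSpawn&) = delete;
  ScopedSpawn& operator=(const ScopedSpawn&) = delete;

private:
  ToyManager& manager_;
  ToyProcess& process_;
};

// Mimics the buggy tests: spawn a stack-allocated process and return
// without cleaning up. The manager is left with a stale entry.
void leakyTestBody(ToyManager& manager) {
  ToyProcess process;
  manager.spawn(&process);
}

// Same test body, but the guard removes the entry before `process`
// goes out of scope (members are destroyed in reverse declaration order).
void guardedTestBody(ToyManager& manager) {
  ToyProcess process;
  ScopedSpawn guard(manager, process);
}
```

In this model the manager still reports one running process after 
`leakyTestBody` returns and none after `guardedTestBody`. Whether a guard 
like this or a changed default for `manage` is the better fit for libprocess 
is exactly the open question above.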


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40501/#review107290
-----------------------------------------------------------


On Nov. 19, 2015, 9:51 p.m., Joseph Wu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40501/
> -----------------------------------------------------------
> 
> (Updated Nov. 19, 2015, 9:51 p.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Artem Harutyunyan, and Joris Van 
> Remoortere.
> 
> 
> Bugs: MESOS-3753
>     https://issues.apache.org/jira/browse/MESOS-3753
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Two of the fetcher tests will spawn a process which is stored in the stack 
> (i.e. local variable in the test).  `spawn` will store a pointer to the 
> process in libprocess's `ProcessManager`.  When the test finishes, the 
> process goes out of scope and is therefore lost.  However, the process is 
> **not** terminated.
> 
> Failing to terminate this process will lead to an infinite loop in 
> `~ProcessManager`, which is called in `process::finalize`.  In 
> `ProcessManager` 's destructor, we will loop and try to kill all processes.  
> The process spawned in the test will be running.  However, since the pointer 
> lives in the stack, the `ProcessManager` will be unable to find the process 
> and will thereby be stuck trying to kill a process it cannot find.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_tests.cpp 04079964b3539f555351d1444f3635c64700a1a8 
> 
> Diff: https://reviews.apache.org/r/40501/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> Additional testing:
> 
> Insert a `process::finalize` in `src/test/main.cpp`.  i.e.
> ```
>   // Replace `return RUN_ALL_TESTS();` with this:
>   int ret = RUN_ALL_TESTS();
>   process::finalize();
>   return ret;
> ```
> 
> Then `make check 
> GTEST_FILTER="*FetcherTest.OSNetUriTest*:*FetcherTest.OSNetUriSpaceTest*"`.
> The test program should not stall or segfault or abort in some weird way.
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>
