Re: Review Request 40266: Libprocess Reinitialization: Cleanup SocketManager along side ProcessManager.

2016-01-19 Thread haosdent huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40266/#review115322
---



3rdparty/libprocess/src/process.cpp (line 1306)


Is it possible sockets not empty when exit this loop?


- haosdent huang


On Dec. 3, 2015, 12:50 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40266/
> ---
> 
> (Updated Dec. 3, 2015, 12:50 a.m.)
> 
> 
> Review request for mesos, Artem Harutyunyan and Joris Van Remoortere.
> 
> 
> Bugs: MESOS-3910
> https://issues.apache.org/jira/browse/MESOS-3910
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The `SocketManager` and `ProcessManager` are highly inter-dependent, which 
> requires some untangling in `process::finalize`.
> 
> * Logic originally found in `~ProcessManager` has been split into 
> `ProcessManager::finalize` due to what happens during cleanup.
> * Some additional cleanup was added for dangling pointers.
> * The future from `__s__->accept()` must be explicitly discarded as libevent 
> does not detect a locally closed socket.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/process.cpp 
> a7af4671efa2f370137ed4a749e5247c91e6f95e 
> 
> Diff: https://reviews.apache.org/r/40266/diff/
> 
> 
> Testing
> ---
> 
> `make check` (libev)
> `make check` (--enable-libevent --enable-ssl)
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 40266: Libprocess Reinitialization: Cleanup SocketManager along side ProcessManager.

2015-12-02 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40266/#review108650
---



3rdparty/libprocess/src/process.cpp (line 461)


While using a raw pointer here lets you not worry about `socket_mutex`' 
lifetime, it does create a future false positive for any leak checks.

Instead you should really try to be explicit. You could e.g., use a 
`shared_ptr`, and, if you feel this adds too much noise to the 
call sites, pass that one to the callbacks as an implicit parameter, i.e. 
declare

void finalize(shared_ptr m = process::socket_mutex);

and similarly for `internal::on_accept`.



3rdparty/libprocess/src/process.cpp (line 493)


You should be more explicit about lifetimes here and use `unique_ptrs` of 
.. instead (you can always `reset` in place of `delete` if you need to destroy 
at a certain point).



3rdparty/libprocess/src/process.cpp (line 993)


Once you use a proper smart pointer for `process_route` this comment will 
be right (maybe `s/deleted/cleaned up/`).



3rdparty/libprocess/src/process.cpp (line 2200)


I find adding another manual iteration index manipulation here makes this 
even harder to read (e.g., do we really iterate over all elements? Are there 
assumptions about ordering (hopefully not)?, ...). 

You could e.g., factor out a `synchronized` helper to get the next 
not-ignored element (or a `nullptr` if nothing is left); the whole existing 
loop could then collapse to

while (true) {
  ProcessBase* process = next_cleanup(processes));
  if (!process) {
break;
  }
  process::terminate(process, false);
  process::wait(process);
}

This would also make it clear that we intent over all not-ignored processes 
(which currently is implicit through `wait` removing processes one after the 
other, and `processes` not containing any `nullptr` elements).

The helper `next_cleanup` can be implemented without querying size 
information, e.g., it could iterate `processes` until the process doesn't 
pattern-match with ignored processes.


- Benjamin Bannier


On Nov. 20, 2015, 7:19 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40266/
> ---
> 
> (Updated Nov. 20, 2015, 7:19 p.m.)
> 
> 
> Review request for mesos, Artem Harutyunyan and Joris Van Remoortere.
> 
> 
> Bugs: MESOS-3910
> https://issues.apache.org/jira/browse/MESOS-3910
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The `SocketManager` and `ProcessManager` are highly inter-dependent, which 
> requires some untangling in `process::finalize`.
> 
> * Logic originally found in `~ProcessManager` has been split into 
> `ProcessManager::finalize` due to what happens during cleanup.
> * Some additional cleanup was added for dangling pointers.
> * The future from `__s__->accept()` must be explicitly discarded as libevent 
> does not detect a locally closed socket.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/process.cpp 
> 8fa0594b671969c2713d1b361f2dbbb07d25b3a8 
> 
> Diff: https://reviews.apache.org/r/40266/diff/
> 
> 
> Testing
> ---
> 
> `make check` (libev)
> `make check` (--enable-libevent --enable-ssl)
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 40266: Libprocess Reinitialization: Cleanup SocketManager along side ProcessManager.

2015-12-02 Thread Joseph Wu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40266/
---

(Updated Dec. 2, 2015, 4:50 p.m.)


Review request for mesos, Artem Harutyunyan and Joris Van Remoortere.


Changes
---

Addressed one comment.  Found and fixed a bug while doing so.


Bugs: MESOS-3910
https://issues.apache.org/jira/browse/MESOS-3910


Repository: mesos


Description
---

The `SocketManager` and `ProcessManager` are highly inter-dependent, which 
requires some untangling in `process::finalize`.

* Logic originally found in `~ProcessManager` has been split into 
`ProcessManager::finalize` due to what happens during cleanup.
* Some additional cleanup was added for dangling pointers.
* The future from `__s__->accept()` must be explicitly discarded as libevent 
does not detect a locally closed socket.


Diffs (updated)
-

  3rdparty/libprocess/src/process.cpp a7af4671efa2f370137ed4a749e5247c91e6f95e 

Diff: https://reviews.apache.org/r/40266/diff/


Testing
---

`make check` (libev)
`make check` (--enable-libevent --enable-ssl)


Thanks,

Joseph Wu



Re: Review Request 40266: Libprocess Reinitialization: Cleanup SocketManager along side ProcessManager.

2015-12-02 Thread Joseph Wu


> On Dec. 2, 2015, 8:08 a.m., Benjamin Bannier wrote:
> > 3rdparty/libprocess/src/process.cpp, line 2200
> > 
> >
> > I find adding another manual iteration index manipulation here makes 
> > this even harder to read (e.g., do we really iterate over all elements? Are 
> > there assumptions about ordering (hopefully not)?, ...). 
> > 
> > You could e.g., factor out a `synchronized` helper to get the next 
> > not-ignored element (or a `nullptr` if nothing is left); the whole existing 
> > loop could then collapse to
> > 
> > while (true) {
> >   ProcessBase* process = next_cleanup(processes));
> >   if (!process) {
> > break;
> >   }
> >   process::terminate(process, false);
> >   process::wait(process);
> > }
> > 
> > This would also make it clear that we intent over all not-ignored 
> > processes (which currently is implicit through `wait` removing processes 
> > one after the other, and `processes` not containing any `nullptr` elements).
> > 
> > The helper `next_cleanup` can be implemented without querying size 
> > information, e.g., it could iterate `processes` until the process doesn't 
> > pattern-match with ignored processes.

This comment actually helped me uncover another edge case in the finalization 
code :)  
As it turns out, the current patch **is dependent** on ordering.

---
The existing `process::finalize` (prior to this patch) ends up dereferencing a 
terminated process.  But we don't segfault because the terminated process's 
pointer is never deleted:
1) Run a test like `ProcessTest.Http1`.
2) This leaves two processes behind (bad test cleanup?), a client 
`ConnectionProcess` and an `HttpProxy` on the server side.
3) After the libprocess tests, we call `process::finalize`.
4) We delete in alphabetical order, because `std::map` sorts the processes 
alphabetically.
4) It just so happens that `__gc__` is always the first to die.  This means no 
processes spawned via `spawn(..., true)` will be deleted.
5) The `HttpProxy` (named `__http__`) is deleted next.  We leak this pointer :(
6) The `ConnectionProcess` (named `__http_connection__`) is deleted next.  This 
also fires `process::internal::decode_recv`, which cleans up the socket.
7) During socket cleanup, we terminate the associated `HttpProxy` (which was 
terminated in step 5).  Termination actually requires a dereference (i.e. 
process->pid).  This works because we leaked the pointer.


- Joseph


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40266/#review108650
---


On Dec. 2, 2015, 4:50 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40266/
> ---
> 
> (Updated Dec. 2, 2015, 4:50 p.m.)
> 
> 
> Review request for mesos, Artem Harutyunyan and Joris Van Remoortere.
> 
> 
> Bugs: MESOS-3910
> https://issues.apache.org/jira/browse/MESOS-3910
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The `SocketManager` and `ProcessManager` are highly inter-dependent, which 
> requires some untangling in `process::finalize`.
> 
> * Logic originally found in `~ProcessManager` has been split into 
> `ProcessManager::finalize` due to what happens during cleanup.
> * Some additional cleanup was added for dangling pointers.
> * The future from `__s__->accept()` must be explicitly discarded as libevent 
> does not detect a locally closed socket.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/process.cpp 
> a7af4671efa2f370137ed4a749e5247c91e6f95e 
> 
> Diff: https://reviews.apache.org/r/40266/diff/
> 
> 
> Testing
> ---
> 
> `make check` (libev)
> `make check` (--enable-libevent --enable-ssl)
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 40266: Libprocess Reinitialization: Cleanup SocketManager along side ProcessManager.

2015-11-25 Thread Joseph Wu


> On Nov. 25, 2015, 2:30 a.m., Alexander Rojas wrote:
> > Overall, it looks good to me. But does the finalizes are called at all in 
> > this patch (Probablly just when the system process go down).

As of this patch, `process::finalize` is only called once at the end of the 
libprocess tests (3rdparty/libprocess/src/tests/main.cpp).

For tests later on (when `process::reinitialize` was finished), I did insert 
`process::reinitialize` between every `MesosTest`.


> On Nov. 25, 2015, 2:30 a.m., Alexander Rojas wrote:
> > 3rdparty/libprocess/src/process.cpp, line 735
> > 
> >
> > I've been thinking for a while about this mutexes we hold for such a 
> > short period of time, that makes me wonder if we actually want to introduce 
> > a spinlock mutex for situations like this.

Good point.  We could follow the reasoning behind `process::initialize` 's 
locks.


- Joseph


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40266/#review107959
---


On Nov. 20, 2015, 11:19 a.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40266/
> ---
> 
> (Updated Nov. 20, 2015, 11:19 a.m.)
> 
> 
> Review request for mesos, Artem Harutyunyan and Joris Van Remoortere.
> 
> 
> Bugs: MESOS-3910
> https://issues.apache.org/jira/browse/MESOS-3910
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The `SocketManager` and `ProcessManager` are highly inter-dependent, which 
> requires some untangling in `process::finalize`.
> 
> * Logic originally found in `~ProcessManager` has been split into 
> `ProcessManager::finalize` due to what happens during cleanup.
> * Some additional cleanup was added for dangling pointers.
> * The future from `__s__->accept()` must be explicitly discarded as libevent 
> does not detect a locally closed socket.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/process.cpp 
> 8fa0594b671969c2713d1b361f2dbbb07d25b3a8 
> 
> Diff: https://reviews.apache.org/r/40266/diff/
> 
> 
> Testing
> ---
> 
> `make check` (libev)
> `make check` (--enable-libevent --enable-ssl)
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>



Re: Review Request 40266: Libprocess Reinitialization: Cleanup SocketManager along side ProcessManager.

2015-11-25 Thread Alexander Rojas

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40266/#review107959
---


Overall, it looks good to me. But does the finalizes are called at all in this 
patch (Probablly just when the system process go down).


3rdparty/libprocess/src/process.cpp (line 735)


I've been thinking for a while about this mutexes we hold for such a short 
period of time, that makes me wonder if we actually want to introduce a 
spinlock mutex for situations like this.


- Alexander Rojas


On Nov. 20, 2015, 8:19 p.m., Joseph Wu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40266/
> ---
> 
> (Updated Nov. 20, 2015, 8:19 p.m.)
> 
> 
> Review request for mesos, Artem Harutyunyan and Joris Van Remoortere.
> 
> 
> Bugs: MESOS-3910
> https://issues.apache.org/jira/browse/MESOS-3910
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> The `SocketManager` and `ProcessManager` are highly inter-dependent, which 
> requires some untangling in `process::finalize`.
> 
> * Logic originally found in `~ProcessManager` has been split into 
> `ProcessManager::finalize` due to what happens during cleanup.
> * Some additional cleanup was added for dangling pointers.
> * The future from `__s__->accept()` must be explicitly discarded as libevent 
> does not detect a locally closed socket.
> 
> 
> Diffs
> -
> 
>   3rdparty/libprocess/src/process.cpp 
> 8fa0594b671969c2713d1b361f2dbbb07d25b3a8 
> 
> Diff: https://reviews.apache.org/r/40266/diff/
> 
> 
> Testing
> ---
> 
> `make check` (libev)
> `make check` (--enable-libevent --enable-ssl)
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>