Hi,

The objective of this series is to review how the sessiond teardown is conducted. It is the result of stress testing the teardown code path using [1] and of the discussion following previously proposed fixes [2][3].

The main problem is that applications could hang on application-initiated communication via the notify socket. This led to multiple deadlock scenarios in which sessiond would wait on communication with an already stuck application. The cause is the absence of certain threads at the moment the application is communicating, combined with data handling races during the teardown. I propose to impose a teardown order on threads to prevent such issues.

The teardown mechanism revolved around the use of the thread_quit_pipe. Most threads would listen on this pipe for the termination signal. The result is a non-deterministic teardown code path and scenarios that are quite hard to reproduce.

This series mostly introduces specialized quit pipes offering better control over the lifetime of important/interdependent (data-wise) threads. It also includes fixes to problems found along the way.

[1] https://github.com/PSRCode/lttng-stress
[2] https://lists.lttng.org/pipermail/lttng-dev/2017-August/027366.html
[3] https://lists.lttng.org/pipermail/lttng-dev/2017-August/027365.html
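
As a rough illustration of the specialized quit pipe idea, here is a minimal sketch in which each thread polls its own pipe and the teardown code wakes and joins the threads one at a time, in a fixed order. This is not code from the series: struct worker, worker_main(), stop_worker() and run_ordered_teardown() are hypothetical names, and the real sessiond threads do far more than wait on a pipe.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

#define NR_WORKERS 2

/* All names below are hypothetical, not identifiers from the series. */
struct worker {
	int id;
	int quit_pipe[2];	/* [0]: read end, polled; [1]: write end */
	pthread_t thread;
};

static struct worker workers[NR_WORKERS];

/* Exit order as recorded by the threads themselves. */
static int exit_order[NR_WORKERS];
static int nr_exited;
static pthread_mutex_t exit_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker_main(void *arg)
{
	struct worker *w = arg;
	struct pollfd pfd = { .fd = w->quit_pipe[0], .events = POLLIN };

	/* Sleep until this worker's own quit pipe becomes readable. */
	for (;;) {
		int ret = poll(&pfd, 1, -1);

		if (ret > 0 || (ret < 0 && errno != EINTR)) {
			break;
		}
	}

	pthread_mutex_lock(&exit_lock);
	assert(nr_exited < NR_WORKERS);
	exit_order[nr_exited++] = w->id;
	pthread_mutex_unlock(&exit_lock);
	return NULL;
}

/* Wake a single worker through its quit pipe, then wait for it to exit. */
static int stop_worker(struct worker *w)
{
	char c = 1;
	ssize_t ret;

	do {
		ret = write(w->quit_pipe[1], &c, 1);
	} while (ret < 0 && errno == EINTR);
	if (ret != 1) {
		return -1;
	}
	if (pthread_join(w->thread, NULL)) {
		return -1;
	}
	close(w->quit_pipe[0]);
	close(w->quit_pipe[1]);
	return 0;
}

/* Start the workers, tear them down in a fixed order, report that order. */
int run_ordered_teardown(int order_out[NR_WORKERS])
{
	int i, started = 0, ret = 0;

	nr_exited = 0;
	for (i = 0; i < NR_WORKERS; i++) {
		workers[i].id = i;
		if (pipe(workers[i].quit_pipe) < 0) {
			ret = -1;
			break;
		}
		if (pthread_create(&workers[i].thread, NULL, worker_main,
				&workers[i])) {
			close(workers[i].quit_pipe[0]);
			close(workers[i].quit_pipe[1]);
			ret = -1;
			break;
		}
		started++;
	}

	/* Imposed order: highest id first, so worker 0 outlives the others. */
	for (i = started - 1; i >= 0; i--) {
		if (stop_worker(&workers[i])) {
			ret = -1;
		}
	}

	for (i = 0; i < NR_WORKERS; i++) {
		order_out[i] = (i < nr_exited) ? exit_order[i] : -1;
	}
	return ret;
}
```

Calling run_ordered_teardown() with a two-entry array fills it with the ids in the order the threads actually exited. Because each thread is joined before the next pipe is written to, that order is decided by the teardown code and not by scheduling, which is the property a single shared quit pipe cannot provide.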
