* changz (zheng.ch...@emc.com) wrote:
> On 9/17/2012 21:33 PM, Mathieu Desnoyers wrote:
>> * changz (zheng.ch...@emc.com) wrote:
>>> ......
>>>
>>> The child process calls _fini when it calls API exit. It gets hung and
>>> meanwhile the parent is waiting for its termination.
>>> I think the whole life-cycle of the process should be considered. The
>>> parent's waiting in critical region is dangerous.
>>> Is it possible to refine the critical region with smaller fineness?
>>>
>>> What do you think?
>> Hrm, yes you're right. I'm looking into it.
>>
>> The main issue is that get_wait_shm() bypass the fork() wrapper (with
>> lttng_ust_nest_count), which is responsible for holding the UST mutex
>> across fork(). Therefore, when exiting the context of the child process,
>> we execute the destructor, which try to grab the UST mutex, which might
>> be in pretty much any state.
>>
>> Given that we don't want this process to try to register to
>> lttng-sessiond (because this is internal to lttng-ust), we might want to
>> let it skip the destructor execution. This would actually be the easiest
>> way out.
>>
>> Does the follow patch fix the issue for you ?
>>
>> diff --git a/liblttng-ust/lttng-ust-comm.c b/liblttng-ust/lttng-ust-comm.c
>> index be64acd..596fd7d 100644
>> --- a/liblttng-ust/lttng-ust-comm.c
>> +++ b/liblttng-ust/lttng-ust-comm.c
>> @@ -616,9 +616,9 @@ int get_wait_shm(struct sock_info *sock_info, size_t
>> mmap_size)
>>                         ret = ftruncate(wait_shm_fd, mmap_size);
>>                         if (ret) {
>>                                 PERROR("ftruncate");
>> -                               exit(EXIT_FAILURE);
>> +                               _exit(EXIT_FAILURE);
>>                         }
>> -                       exit(EXIT_SUCCESS);
>> +                       _exit(EXIT_SUCCESS);
>>                 }
>>                 /*
>>                  * For local shm, we need to have rw access to accept
> Yes, it works.
> Just a reminder, here are four callings of exit in child's path in my git
> repository.

Indeed! Thanks for the reminder! Here is the fix:

commit 5d3bc5ed74a4c9f557a75d7de82ed7056adb812e
Author: Mathieu Desnoyers <mathieu.desnoy...@efficios.com>
Date:   Tue Sep 18 00:52:10 2012 -0400

    Fix: get_wait_shm() ust mutex deadlock (add 2 missing exit calls)

    Reported-by: changz <zheng.ch...@emc.com>
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoy...@efficios.com>

backported to stable-2.0.

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
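
For readers of the archive, the exit() versus _exit() distinction that the
patch relies on can be reproduced outside of lttng-ust. The following is only
a minimal sketch, not code from the lttng-ust tree: fake_destructor(),
report_fd and handler_ran_in_child() are hypothetical names, and an atexit()
handler stands in for the library destructor. The handler only writes one byte
to a pipe, where the real destructor would try to take the UST mutex.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical: write end of a pipe, set only in the forked child. */
static int report_fd = -1;

/*
 * Hypothetical stand-in for a library destructor. The real lttng-ust
 * destructor would try to grab the UST mutex here; this one only
 * reports that it was executed.
 */
static void fake_destructor(void)
{
	if (report_fd >= 0) {
		ssize_t n = write(report_fd, "x", 1);
		(void) n;
	}
}

/*
 * Fork a child that terminates with _exit() when use_underscore_exit
 * is non-zero, and with exit() otherwise.
 * Returns 1 if the handler ran in the child, 0 if it did not,
 * and -1 on error.
 */
int handler_ran_in_child(int use_underscore_exit)
{
	static int registered;
	int fds[2];
	pid_t pid;
	ssize_t n;
	char c;

	if (!registered) {
		if (atexit(fake_destructor))
			return -1;
		registered = 1;
	}
	if (pipe(fds))
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: the handler registered by the parent is inherited. */
		close(fds[0]);
		report_fd = fds[1];
		if (use_underscore_exit)
			_exit(EXIT_SUCCESS);	/* skips atexit handlers */
		exit(EXIT_SUCCESS);		/* runs atexit handlers */
	}

	/* Parent: read() returns 0 (EOF) if the child never wrote. */
	close(fds[1]);
	n = read(fds[0], &c, 1);
	close(fds[0]);
	if (waitpid(pid, NULL, 0) < 0)
		return -1;
	return n == 1 ? 1 : 0;
}
```

Calling handler_ran_in_child(0) from a test program should report that the
handler ran in the child, while handler_ran_in_child(1) should report that it
did not. That is the property the fix depends on: a child that leaves through
_exit() never reaches a destructor that could block on a mutex left in an
unknown state by fork().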