On Tue, Jan 29, 2008 at 03:50:43PM -0600, Nicolas Williams wrote:
> On Tue, Jan 29, 2008 at 12:10:11PM -0600, Mark Martin wrote:
> > I'd appreciate comments on my fix for bug 6409970.
> >
> > The diffs are available at: http://cr.opensolaris.org/~devnull/6409970/
> >
> > I have an extremely simple test harness available at:
> > http://cr.opensolaris.org/~devnull/6409970/checkstartd.
> > Please do not judge me by my lack of script-fu.
>
> I've recently had to write code to handle signals in a daemon (idmapd)
> that has an event loop.  I considered using condition variables, but,
> looking at the code and manpages I could not convince myself that
> pthread_mutex_lock() and pthread_cond_signal() are async-signal-safe.
You can call these functions from signal handlers just fine.  But if your
code is arranged such that the signal handler can attempt to grab a lock
that could be held by the same thread whose stack is now being used to
execute the signal handler, then you will deadlock on yourself.

As Dan said, the right way to do this is to have the main thread sit in
sigsuspend(), while child threads run with all signals blocked (except
SIGABRT, so they can assert()).  When your main thread awakens from
SIGTERM or whatever, it will return from sigsuspend(), and then can cleanly
either cond_signal() or cancel your child threads, and then wait for them
to exit or join with them.

> But I was able to convince myself (and others) that port_send() is
> async-signal-safe, and got its manpage fixed to reflect that too.
>
> svc.startd is already using event ports, so using port_send() from the
> signal handler seems like the perfect solution: just make the existing
> svc.startd wait thread handle the event and exit() the process.
>
> A clean exit, where every thread exits cleanly then main() returns is
> probably going to be difficult to implement in a daemon with so many
> threads doing such varied things.  I don't think a clean exit is really
> needed either.
>
> Nico

fmd does this too.  You can look at the source to see how I do it.
It requires careful planning of how you signal threads and how they
have cleanly defined places they can unwind to, etc.  Although one
can argue that you can just exit() and let the kernel take care of
your state, I find it useful to actually spend the time necessary
to get this right because it helps you find other bugs like memory
leaks and so forth by properly executing all cleanup paths.

-Mike

-- 
Mike Shapiro, Solaris Kernel Development. blogs.sun.com/mws/