Quoth Michen Chang on Thu, Jan 03, 2008 at 11:16:24AM -0600:
> David Bustos wrote:
...
> > Well smf_get_state() reads property values, which can require an
> > authentication check, so smf_get_state() can block on getuseruid().
> > From your description, I'm not sure whether getuseruid() can block on
> > smf_get_state() or not. Can you clarify, please?
>
> I don't think deadlock can occur as there is no lock being
> held when smf_get_state() is called. However, infinite
> recursion can occur when nscd calls smf_get_state(), which
> calls getuseruid(), which causes nscd to call smf_get_state()
> again. I will look into a way to have nscd break the loop. Will
> file a CR later.

Right, now that we've identified a calling cycle, the first thing to
worry about is infinite recursion. To ward that off I think we need to
understand

  - the conditions under which smf_get_state() can block on
    getuseruid(),
  - what arguments it calls getuseruid() with,
  - the conditions under which getuseruid() can block on
    smf_get_state(), and
  - the arguments it calls smf_get_state() with.

For our part, I believe smf_get_state() will only block on getuseruid()
when the "restarter" property group of the argument is read-protected.
This should never happen during normal operation of SMF, though we
don't prevent a user from making it so. When we do call getuseruid(),
I think we do so with the uid of the invoking process.

Deadlocks don't require a direct calling cycle, but do require
examination of the form "libA`F() calls libB`G(), libB`G() calls
libA`H(), libA`H() blocks on the caller of libA`F()". If getuseruid()
can't block on a libnsl call which is blocking on smf_get_state(),
which would seem to be the case if nscd doesn't hold any locks while
calling smf_get_state(), then indeed nscd can't cause a deadlock. I'm
not confident that svc.configd can't cause a deadlock, though. We need
to determine whether an smf_get_state() call can block on a thread
blocked on a call to getuseruid().

I wonder if we should investigate implicit serialization, such as in
the doors subsystem. If nscd can only process N calls at a time, and
the Nth request causes nscd to call smf_get_state(), which triggers an
(N + 1)'th call to nscd, then the calls will be blocked until nscd can
get around to processing that (N + 1)'th call. This might not be
enough to worry about unless someone is capable of generating
N outstanding calls.


David
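
P.S. To make the implicit-serialization concern concrete, here is a rough
sketch of the scenario. It is a toy model, not nscd or doors code: a Python
thread pool stands in for a server that can process only N calls at a time,
and simulate(), request() and lookup() are names I made up for illustration.
The timeout exists only so the sketch terminates; a real caller without one
would simply hang.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def simulate(pool_size, outstanding, timeout=0.5):
    """Model a server with pool_size workers handling `outstanding` requests,
    each of which calls back into the same server.

    Returns one entry per request: "ok" if its nested call completed,
    "stalled" if the nested call could not be serviced within `timeout`.
    """
    if outstanding > pool_size:
        raise ValueError("outstanding must not exceed pool_size in this sketch")

    pool = ThreadPoolExecutor(max_workers=pool_size)
    # Every outer request must be running before any of them calls back in,
    # and none may release its worker until all outcomes are decided.
    # The two barriers make the outcome deterministic.
    started = threading.Barrier(outstanding)
    finished = threading.Barrier(outstanding)

    def lookup():
        # Hypothetical stand-in for the re-entrant call (the "N + 1'th" one).
        return "uid"

    def request():
        started.wait()
        nested = pool.submit(lookup)  # re-enters the same bounded server
        try:
            nested.result(timeout=timeout)
            outcome = "ok"
        except FutureTimeout:
            outcome = "stalled"
        finished.wait()
        return outcome

    futures = [pool.submit(request) for _ in range(outstanding)]
    results = [f.result() for f in futures]
    pool.shutdown(wait=True)
    return results


if __name__ == "__main__":
    print(simulate(pool_size=3, outstanding=3, timeout=0.3))
    print(simulate(pool_size=3, outstanding=2, timeout=2.0))
```

With all N workers occupied by requests that re-enter the server, no nested
call can be serviced; with one worker spare, every nested call gets through.
Whether nscd's door server threads actually behave like a fixed-size pool is
exactly the thing we would need to check.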