David Bustos wrote:
> Quoth Michen Chang on Thu, Jan 03, 2008 at 11:16:24AM -0600:
>> David Bustos wrote:
>> > ...
>>> Well smf_get_state() reads property values, which can require an
>>> authentication check, so smf_get_state() can block on getuseruid().
>>> From your description, I'm not sure whether getuseruid() can block on
>>> smf_get_state() or not. Can you clarify, please?
>>>
>> I don't think deadlock can occur as there is no lock being
>> held when smf_get_state() is called. However, infinite
>> recursion can occur when nscd calls smf_get_state(), which
>> calls getuseruid(), which causes nscd to call smf_get_state()
>> again. I will look into a way to have nscd break the loop. Will
>> file a CR later.
>>
>
> Right, now that we've identified a calling cycle, the first thing to
> worry about is infinite recursion. To ward that off I think we need to
> understand the conditions under which smf_get_state() can block on
> getuseruid(), what arguments it calls getuseruid() with, the conditions
> under which getuseruid() can block on smf_get_state(), and the arguments
> it calls smf_get_state() with.
>
> For our part, I believe smf_get_state() will only block on getuseruid()
> when the "restarter" property group of the argument is read-protected.
> This should never happen during normal operation of SMF, though we don't
> prevent a user from making it so. When we do call getuseruid(), I think
> we do so with the uid of the invoking process.
>
> Deadlocks don't require a direct calling cycle, but do require
> examination of the form "libA`F() calls libB`G(), libB`G() calls
> libA`H(), libA`H() blocks on the caller of libA`F()". If getuseruid()
> can't block on a libnsl call which is blocking on smf_get_state(), which
> would seem to be the case if nscd doesn't hold any locks while calling
> smf_get_state(), then indeed nscd can't cause a deadlock.
>
> I'm not confident that svc.configd can't cause a deadlock, though. We
> need to determine whether an smf_get_state() call can block on a thread
> blocked on a call to getuseruid().
>
> I wonder if we should investigate implicit serialization, such as in the
> doors subsystem. If nscd can only process N calls a time, then if the
> Nth request causes nscd to call smf_get_state(), which triggers the
> N + 1'th call to nscd, then the calls will be locked until nscd can get
> around to processing that N + 1'th call. This might not be enough to
> worry about unless someone is capable of generating N outstanding calls.

Yes. nscd does not call smf_get_state() for every lookup, only when it
has to invoke a backend and the cached state of the backend's associated
SMF service is not up. The cached states are maintained by a separate
monitor thread in nscd, which continuously checks the state of the
services being used.

To make it safer, we can also modify nscd to allow at most one
outstanding smf_get_state() call for each backend at any given time.
This way, infinite recursion won't happen, and blocking caused by the
implicit serialization shouldn't happen either.

-- Michen
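
To make the "at most one outstanding smf_get_state() call per backend" idea concrete, here is a rough sketch in Python. It is only a model, not nscd code. The names BackendStateGuard, ModelNscd and model_smf_get_state are made up for illustration, and the choice to return None when a query is already outstanding is an assumption; the real fallback behaviour would have to be decided in nscd.

```python
import threading


class BackendStateGuard:
    """Hypothetical guard: at most one outstanding state query per backend."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = set()

    def try_enter(self, backend):
        # Returns False if a query for this backend is already outstanding.
        with self._lock:
            if backend in self._in_flight:
                return False
            self._in_flight.add(backend)
            return True

    def leave(self, backend):
        with self._lock:
            self._in_flight.discard(backend)


class ModelNscd:
    """Toy model of the calling cycle discussed in this thread."""

    def __init__(self, read_protected):
        self.guard = BackendStateGuard()
        self.cached_state = {}            # backend -> last known service state
        self.read_protected = read_protected
        self.state_queries = 0            # calls to the smf_get_state() stand-in
        self.skipped = 0                  # re-entrant lookups cut short by the guard

    def model_smf_get_state(self, backend):
        # Stand-in for smf_get_state(). When the property group is
        # read-protected, the authentication check (getuseruid() in the
        # thread) comes back into nscd as another lookup.
        self.state_queries += 1
        if self.read_protected:
            self.lookup(backend)
        return "online"

    def lookup(self, backend):
        if self.cached_state.get(backend) == "online":
            return "answer from backend"
        if not self.guard.try_enter(backend):
            # Assumption: give up on this nested lookup instead of recursing.
            self.skipped += 1
            return None
        try:
            self.cached_state[backend] = self.model_smf_get_state(backend)
        finally:
            self.guard.leave(backend)
        return "answer from backend"
```

With read_protected=True, a lookup on a cold cache issues one state query; the nested lookup triggered by the authentication check is turned away by the guard, so the cycle ends after a single level. Without the guard, the same model would recurse until the stack runs out.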