Quoth Michen Chang on Thu, Jan 03, 2008 at 11:16:24AM -0600:
> David Bustos wrote:
...
> > Well smf_get_state() reads property values, which can require an
> > authentication check, so smf_get_state() can block on getuseruid().
> > From your description, I'm not sure whether getuseruid() can block on
> > smf_get_state() or not.  Can you clarify, please?
> 
> I don't think deadlock can occur as there is no lock being
> held when smf_get_state() is called. However,  infinite
> recursion can occur when nscd calls smf_get_state(),  which
> calls getuseruid(), which causes nscd to call smf_get_state()
> again. I will look into a way to have nscd break the loop. Will
> file a CR later.

Right, now that we've identified a calling cycle, the first thing to
worry about is infinite recursion.  To ward that off I think we need to
understand the conditions under which smf_get_state() can block on
getuseruid(), what arguments it calls getuseruid() with, the conditions
under which getuseruid() can block on smf_get_state(), and the arguments
it calls smf_get_state() with.
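
To make the cycle concrete, here is a toy model of it in Python, with one possible way to break the loop. Everything in it is hypothetical: ToyNscd, handle_lookup(), query_state() and toy_smf_get_state() are stand-ins invented for illustration, not real nscd or libscf interfaces. The real cycle crosses process boundaries, so a simple in-process flag like this one is only a sketch of the idea and not a proposed fix.

```python
class ToyNscd:
    """Hypothetical stand-in for nscd; not its real structure."""

    def __init__(self, break_cycle):
        self.break_cycle = break_cycle
        self.querying_state = False   # True while a state query is in flight
        self.state_queries = 0        # how many queries actually went out

    def handle_lookup(self, uid):
        """Serve a name-service lookup; it first checks a service state."""
        self.query_state()
        return f"user{uid}"

    def query_state(self):
        if self.break_cycle and self.querying_state:
            # Nested request made on behalf of our own state query:
            # answer without asking for the state again.
            return None
        self.querying_state = True
        self.state_queries += 1
        try:
            return toy_smf_get_state(self, read_protected=True)
        finally:
            self.querying_state = False


def toy_smf_get_state(nscd, read_protected):
    """Stand-in for the state query; assumed to need a user lookup
    only when the property group is read-protected."""
    if read_protected:
        nscd.handle_lookup(uid=0)     # the authorization check calls back
    return "online"
```

With break_cycle=True the outer query_state() call returns after a single state query. With break_cycle=False the same call recurses until Python raises RecursionError, which is the toy equivalent of the loop described above.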

For our part, I believe smf_get_state() will only block on getuseruid()
when the "restarter" property group of the argument is read-protected.
This should never happen during normal operation of SMF, though we don't
prevent a user from making it so.  When we do call getuseruid(), I think
we do so with the uid of the invoking process.

Deadlocks don't require a direct calling cycle, so we also need to
examine chains of the form "libA`F() calls libB`G(), libB`G() calls
libA`H(), libA`H() blocks on the caller of libA`F()".  If getuseruid()
can't block on a libnsl call which is blocking on smf_get_state(), which
would seem to be the case if nscd doesn't hold any locks while calling
smf_get_state(), then indeed nscd can't cause a deadlock.

I'm not confident that svc.configd can't cause a deadlock, though.  We
need to determine whether an smf_get_state() call can block on a thread
blocked on a call to getuseruid().

I wonder if we should investigate implicit serialization, such as in the
doors subsystem.  If nscd can only process N calls at a time, and the
Nth request causes nscd to call smf_get_state(), which triggers an
(N+1)th call to nscd, then those calls will be blocked until nscd can get
around to processing that (N+1)th call.  This might not be enough to
worry about unless someone is capable of generating N outstanding calls.
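
A small simulation of that scenario, as a sketch only: it assumes a server with a fixed number of slots, modeled with a semaphore, where every admitted request makes one nested call that needs a slot of its own. The function name, the slot model and the timeout (which stands in for blocking indefinitely) are all assumptions for illustration, and say nothing about how nscd or the doors subsystem actually schedule work.

```python
import threading


def run_nested_calls(n_slots, n_requests, timeout=0.5):
    """Return how many requests completed their nested call.

    Assumes 1 <= n_requests <= n_slots, so every request is admitted.
    """
    slots = threading.Semaphore(n_slots)          # hypothetical capacity N
    all_admitted = threading.Barrier(n_requests)
    all_attempted = threading.Barrier(n_requests)
    results = []
    results_lock = threading.Lock()

    def handle_request():
        with slots:                               # request occupies one slot
            all_admitted.wait()                   # every request is in flight
            ok = slots.acquire(timeout=timeout)   # nested call needs a slot
            if ok:
                slots.release()                   # nested call finished
            with results_lock:
                results.append(ok)
            all_attempted.wait()                  # hold our slot until all tried

    threads = [threading.Thread(target=handle_request)
               for _ in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

In this model, run_nested_calls(4, 3) lets all three nested calls through the one free slot, while run_nested_calls(4, 4, timeout=0.1) completes none of them: every slot is held by a request waiting for a slot.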


David
