Re: [9fans] That deadlock, again

Lucio De Re Thu, 18 Nov 2010 02:55:13 -0800

On Thu, Nov 18, 2010 at 10:20:33AM +0100, cinap_len...@gmx.de wrote:
> 
> hm...  thinking about it...  does the kernel assume (maybe in early
> initialization) that calling qlock() without a proc is ok as long as
> it can make sure it will not be held by another proc?
> 
That's a question for Bell Labs, I suppose, but that's precisely what I
believe.  There is no other explanation for the panic.  Moving the
up == 0 test earlier will invalidate this assumption and cause the panic
we have already seen.


The issue here is whether there is a situation where qlock() is
intentionally invoked where up == 0 (suggested by the positioning of the
up == 0 test _after_ setting the "locked" condition).  This is improbable,
though, and needs sorting out: whereas setting the lock can be done with
up == 0 - and we can also clear the lock - we cannot _fail_ to set the
lock, because then the absence of up will trigger a panic.

Now, we know that qlock() is called with up == 0: we have seen a panic
generated by such a call.  Will it suffice to locate the invocation
and somehow deal with it, or should we make qlock() more robust and cause
it to reject a request from a space where up == 0?  Definitely, if qlock()
no longer allows invocations with up == 0 there will be simplifications
in its implementation.  For example, the line

        if(up != nil && up->nlocks.ref)
                print("qlock: %#p: nlocks %lud\n", getcallerpc(&q), up->nlocks.ref);

will no longer need the up != nil test.
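
To make the ordering argument concrete, here is a minimal model in
portable C.  It is only a sketch, not the kernel's qlock(): the Proc and
QLock structs, the result codes, and the two function names are all
invented for illustration, NULL stands in for nil, and "panic" and
"sleep" are reduced to return values.

```c
#include <stddef.h>

/* Hypothetical stand-ins; NOT the Plan 9 kernel's real types. */
typedef struct Proc Proc;
struct Proc { int pid; };

typedef struct QLock QLock;
struct QLock { int locked; };

/* Outcomes of a lock attempt in this model (assumed names). */
enum { Acquired, WouldSleep, WouldPanic };

/* NULL models "no current process", i.e. up == 0. */
static Proc *up;

/*
 * Ordering as described in the discussion: set the "locked" condition
 * first, and test up only when the lock is already held and the
 * caller would have to queue.
 */
int
qlock_late_check(QLock *q)
{
	if(!q->locked){
		q->locked = 1;
		return Acquired;	/* works even with up == NULL */
	}
	if(up == NULL)
		return WouldPanic;	/* nothing to put on the wait queue */
	return WouldSleep;
}

/*
 * Alternative ordering: reject any caller without a proc up front.
 * The uncontended up == NULL case now fails as well.
 */
int
qlock_early_check(QLock *q)
{
	if(up == NULL)
		return WouldPanic;
	if(!q->locked){
		q->locked = 1;
		return Acquired;
	}
	return WouldSleep;
}
```

In this model, a caller with up == NULL gets Acquired from
qlock_late_check() on a free lock but WouldPanic from
qlock_early_check(), which is the behaviour change under discussion.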

But I'm convinced there's more here than meets the eye.  Unfortunately,
while I have a Plan 9 distribution at my fingertips, I'm not going to
try to fix this problem in a 9vx environment; I'll wait until I get home
to deal with the native stuff.  But one can speculate...

++L
