On Thu, Nov 18, 2010 at 10:20:33AM +0100, cinap_len...@gmx.de wrote:
>
> hm... thinking about it... does the kernel assume (maybe in early
> initialization) that calling qlock() without a proc is ok as long as
> it can make sure it will not be held by another proc?
>

That's a question for Bell Labs, I suppose, but that's precisely what I believe. There is no other explanation for the panic. Moving the up == 0 test earlier will invalidate this assumption and cause the panic we have already seen.
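
To make the ordering argument concrete, here is a rough user-space model of the two placements of the up == 0 test. It is only a sketch and not the kernel source: Proc, QLock, Outcome and both function names are stand-ins invented for illustration, the real qlock() also takes a spin lock and queues the caller, and a "panic" is modelled as a return value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the Plan 9 kernel definitions. */
typedef struct Proc { int pid; } Proc;
typedef struct QLock { int locked; } QLock;

enum Outcome { Acquired, WouldSleep, WouldPanic };

/*
 * Models the ordering discussed above: take the lock if it is free,
 * and look at up only when the caller would have to queue.
 */
enum Outcome
qlock_late_check(QLock *q, Proc *up)
{
	assert(q != NULL);
	if(!q->locked){
		q->locked = 1;
		return Acquired;	/* works even when up == NULL */
	}
	if(up == NULL)
		return WouldPanic;	/* nobody to put to sleep */
	return WouldSleep;
}

/*
 * Models the stricter variant: reject up == NULL before touching
 * the lock at all.
 */
enum Outcome
qlock_early_check(QLock *q, Proc *up)
{
	assert(q != NULL);
	if(up == NULL)
		return WouldPanic;	/* even an uncontended lock is refused */
	if(!q->locked){
		q->locked = 1;
		return Acquired;
	}
	return WouldSleep;
}
```

With up == NULL, qlock_late_check() succeeds on a free lock and "panics" only on a held one, whereas qlock_early_check() refuses both, which is the change in behaviour at stake here.
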
The issue here is whether there is a situation where qlock() is intentionally invoked with up == 0 (as suggested by the positioning of the up == 0 test _after_ setting the "locked" condition). This is improbable, though, and needs sorting out: whereas setting the lock can be done with up == 0, and we can also clear the lock, we cannot _fail_ to set the lock, because then the absence of up will trigger a panic.

Now, we know that qlock() is called with up == 0: we have seen a panic generated by such a call. Will it suffice to locate the invocation and somehow deal with it, or should we make qlock() more robust and cause it to reject a request from a space where up == 0?

Definitely, if qlock() no longer allows invocations with up == 0 there will be simplifications in its implementation. For example, the line

	if(up != nil && up->nlocks.ref)
		print("qlock: %#p: nlocks %lud\n", getcallerpc(&q), up->nlocks.ref);

will no longer need the up != nil test. But I'm convinced there's more here than meets the eye.

Unfortunately, while I have a Plan 9 distribution at my fingertips, I'm not going to try to fix this problem in a 9vx environment; I'll wait until I get home to deal with the native stuff. But one can speculate...

++L