On Sunday, November 7, 2010, Mark Kettenis <[email protected]> wrote:
>> Date: Fri, 5 Nov 2010 17:52:23 +0100
>> From: Mike Belopuhov <[email protected]>
>
> Mike, you might want to take a look at PR 6508. I think the
> "sched_lock" panic:
>
>> ddb{0}> show panic
>> kernel diagnostic assertion "__mp_lock_held(&sched_lock) == 0" failed:
>
> is actually a side effect of trapping in the middle of a context
> switch when we're doing the sched_lock/kernel_lock dance. In PR 6508
> it is almost certainly a page fault that happened because we
> overflowed the stack. That also might be the cause of your panic. At
> least judging from the traceback, your stack is seriously hosed:

Since i386 and amd64 put the kernel stack above the PCB, a stack
overrun means the PCB has already been overwritten. At that point,
trying to save the process context will probably blow up while
following some pointer therein.

I had chatted some with Theo about putting a guard page below the
kernel stack to catch this sort of thing. We would want to move the
PCB to above the stack at the same time to save most of a page.
Would the result help isolate these problems enough to be worth the
effort?

Philip Guenther
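
To make the difference between the two layouts concrete, here is a toy
user-space model in C. It is only a sketch, not kernel code: the sizes
(MODEL_PAGE_SIZE, MODEL_NPAGES, MODEL_PCB_SIZE), the struct, and the
function names are all hypothetical and chosen for illustration. It
treats the per-process kernel area as a range of byte offsets and asks
what a stack overrun of a given number of bytes lands on.

```c
#include <assert.h>
#include <stddef.h>

/* All sizes below are assumptions for the model, not real kernel values. */
#define MODEL_PAGE_SIZE 4096u
#define MODEL_NPAGES    2u     /* pages in the per-process kernel area */
#define MODEL_PCB_SIZE  512u   /* must be smaller than one page */

enum region { R_STACK, R_PCB, R_GUARD, R_OUTSIDE };

/* Half-open [lo, hi) byte offsets; the stack grows down toward stack_lo. */
struct layout {
	size_t pcb_lo, pcb_hi;
	size_t guard_lo, guard_hi;	/* lo == hi means "no guard page" */
	size_t stack_lo, stack_hi;
};

/* Layout described in the mail: PCB at the bottom, stack above it. */
struct layout
layout_pcb_below(void)
{
	struct layout l;
	size_t total = MODEL_NPAGES * MODEL_PAGE_SIZE;

	l.pcb_lo = 0;
	l.pcb_hi = MODEL_PCB_SIZE;
	l.guard_lo = l.guard_hi = 0;
	l.stack_lo = MODEL_PCB_SIZE;
	l.stack_hi = total;
	return l;
}

/* Proposed layout: unmapped guard page at the bottom, PCB at the top. */
struct layout
layout_guarded(void)
{
	struct layout l;
	size_t total = MODEL_NPAGES * MODEL_PAGE_SIZE;

	l.guard_lo = 0;
	l.guard_hi = MODEL_PAGE_SIZE;
	l.pcb_lo = total - MODEL_PCB_SIZE;
	l.pcb_hi = total;
	l.stack_lo = MODEL_PAGE_SIZE;
	l.stack_hi = total - MODEL_PCB_SIZE;
	return l;
}

/* What does a write `overrun` bytes below the bottom of the stack hit? */
enum region
overrun_hits(const struct layout *l, size_t overrun)
{
	size_t off;

	assert(l != NULL);
	if (overrun == 0)
		return R_STACK;		/* stack exactly full, still in bounds */
	if (overrun > l->stack_lo)
		return R_OUTSIDE;	/* ran off the bottom of the whole area */
	off = l->stack_lo - overrun;
	if (off >= l->guard_lo && off < l->guard_hi)
		return R_GUARD;		/* would fault immediately */
	if (off >= l->pcb_lo && off < l->pcb_hi)
		return R_PCB;		/* silent corruption of saved context */
	return R_OUTSIDE;
}
```

In this model, overrun_hits() on layout_pcb_below() reports R_PCB for a
one-byte overrun, while the same overrun on layout_guarded() reports
R_GUARD, which in a real kernel would mean a fault at the point of the
overflow rather than a corrupted PCB discovered later.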
