On Sunday, November 7, 2010, Mark Kettenis <[email protected]> wrote:
>> Date: Fri, 5 Nov 2010 17:52:23 +0100
>> From: Mike Belopuhov <[email protected]>
>
> Mike, you might want to take a look at PR 6508.  I think the
> "sched_lock" panic:
>
>> ddb{0}> show panic
>> kernel diagnostic assertion "__mp_lock_held(&sched_lock) == 0" failed:
>
> is actually a side effect of trapping in the middle of a context
> switch when we're doing the sched_lock/kernel_lock dance.  In PR 6508
> it is almost certainly a page fault that happened because we
> overflowed the stack.  That also might be the cause of your panic.  At
> least judging from the traceback, your stack is seriously hosed:

Since i386 and amd64 put the kernel stack above the PCB, a stack overrun
means the PCB has already been overwritten.  At that point, trying to
save the process context will probably blow up trying to follow some
pointer therein.

I had chatted some with Theo about putting a guard page below the
kernel stack to catch this sort of thing.  Would want to move the PCB
to above the stack at the same time to save most of a page.  Would the
result help isolate these problems enough to be worth the effort?
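
To make the two layouts concrete, here is a rough userland sketch.  It
only models offsets within the per-process kernel area; every size and
name in it (SK_PAGE, SK_AREA, SK_PCB, the sk_* functions) is made up for
illustration and is not taken from the i386 or amd64 code.

```c
#include <assert.h>

/* All sizes are illustrative assumptions, not values from any kernel. */
#define SK_PAGE  4096L            /* assumed page size */
#define SK_AREA  (2 * SK_PAGE)    /* assumed per-process kernel area */
#define SK_PCB   512L             /* hypothetical PCB size */

enum sk_region { SK_IN_PCB, SK_IN_STACK, SK_IN_GUARD, SK_OUTSIDE };

/* Current layout: PCB at the bottom, stack above it growing down. */
static enum sk_region
sk_classify_current(long off)
{
	if (off < 0 || off >= SK_AREA)
		return SK_OUTSIDE;
	return (off < SK_PCB) ? SK_IN_PCB : SK_IN_STACK;
}

/* Proposed layout: unmapped guard page, then stack, then PCB on top. */
static enum sk_region
sk_classify_proposed(long off)
{
	if (off < 0 || off >= SK_PAGE + SK_AREA)
		return SK_OUTSIDE;
	if (off < SK_PAGE)
		return SK_IN_GUARD;
	return (off < SK_PAGE + SK_AREA - SK_PCB) ? SK_IN_STACK : SK_IN_PCB;
}

/* Region hit when the stack overruns its lowest valid offset by n bytes. */
static enum sk_region
sk_overrun_current(long n)
{
	assert(n > 0);
	return sk_classify_current(SK_PCB - n);
}

static enum sk_region
sk_overrun_proposed(long n)
{
	assert(n > 0);
	return sk_classify_proposed(SK_PAGE - n);
}
```

In this model a small overrun lands in the PCB with the current layout,
so the damage is silent until something uses the PCB.  With the proposed
layout the same overrun lands in the guard page, which in a real kernel
would presumably fault at the offending access instead.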


Philip Guenther