Gary Allan wrote:
Hello,

I've been experiencing lock-ups using DragonFly HEAD SMP under kvm. Running "make -j8 buildworld" triggers a completely unresponsive state and 100.00% CPU usage on all four cores (as seen from the host OS).

I've managed to get gdb attached and get some information.

The kernel is getting caught in a while loop in lwkt_acquire. I can reliably trigger this with a "make -j8 buildworld" under an SMP kernel (otherwise identical to GENERIC, no optimisations). The OS is completely unresponsive and all four CPU cores are running at 100%.

I've included the debug information.

Program received signal SIGINT, Interrupt.
lwkt_acquire (td=0xc6a59e70) at /usr/src/sys/kern/lwkt_thread.c:1048
1048            while (td->td_flags & (TDF_RUNNING|TDF_PREEMPT_LOCK))
(gdb) l
1043        mygd = mycpu;
1044        if (gd != mycpu) {
1045            cpu_lfence();
1046            KKASSERT((td->td_flags & TDF_RUNQ) == 0);
1047            crit_enter_gd(mygd);
1048            while (td->td_flags & (TDF_RUNNING|TDF_PREEMPT_LOCK))
1049                cpu_lfence();
1050            td->td_gd = mygd;
1051            TAILQ_INSERT_TAIL(&mygd->gd_tdallq, td, td_allq);
1052            td->td_flags &= ~TDF_MIGRATING;
(gdb) p td->td_flags
$1 = 8390177
(gdb) p td
$2 = (thread_t) 0xc6a59e70
(gdb) bt
#0  lwkt_acquire (td=0xc6a59e70) at /usr/src/sys/kern/lwkt_thread.c:1048
#1  0xc02c66af in bsd4_select_curproc (gd=0xff800000) at /usr/src/sys/kern/usched_bsd4.c:358
#2  0xc02c6829 in bsd4_release_curproc (lp=0xea634c00) at /usr/src/sys/kern/usched_bsd4.c:322
#3  0xc04b8239 in passive_release (td=0xdfe8aba0) at /usr/src/sys/platform/pc32/i386/trap.c:212
#4  0xc02c870b in lwkt_switch () at /usr/src/sys/kern/lwkt_thread.c:491
#5  0xc02c8b3b in lwkt_mp_lock_contested () at /usr/src/sys/kern/lwkt_thread.c:1374
#6  0xc04b0751 in get_mplock () at /usr/src/sys/platform/pc32/i386/mplock.s:168
#7  0xe9ef6d34 in ?? ()
#8  0xc04b94a4 in syscall2 (frame=0xe9ef6d40) at /usr/src/sys/platform/pc32/i386/trap.c:1371
#9  0xc04a3396 in Xint0x80_syscall () at /usr/src/sys/platform/pc32/i386/exception.s:876
#10 0xe9ef6d40 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) jump 1050
Continuing at 0xc02c8bbb.

Continuing execution does not appear to cause any problems.
I can provide additional debugging info if required but I'm unsure of how to proceed with this myself.
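
A note on reading the td_flags value from the session above: gdb printed it in decimal, and 8390177 is 0x800621 in hex ("p/x td->td_flags" prints it that way directly). Below is a minimal C sketch for checking which bits are set and whether the loop at line 1048 would keep spinning. The TDF_* bit values are assumptions and are not taken from this report, so they should be checked against the kernel's thread header in the tree being debugged. The helper names acquire_would_spin and print_set_bits are hypothetical.

```c
#include <stdio.h>

/*
 * ASSUMED bit values, for illustration only. Verify them against the
 * TDF_* definitions in the kernel source being debugged.
 */
#define TDF_RUNNING       0x0001u
#define TDF_RUNQ          0x0002u
#define TDF_PREEMPT_LOCK  0x0004u
#define TDF_MIGRATING     0x0020u

/*
 * Hypothetical helper: returns nonzero if the test used by the
 * while loop at lwkt_thread.c:1048 is true for these flags.
 */
static int
acquire_would_spin(unsigned int flags)
{
    return (flags & (TDF_RUNNING | TDF_PREEMPT_LOCK)) != 0;
}

/*
 * Hypothetical helper: print the flags in hex, then the mask of
 * every bit that is set, lowest bit first.
 */
static void
print_set_bits(unsigned int flags)
{
    unsigned int bit;

    printf("td_flags = 0x%08x\n", flags);
    /* Unsigned shift wraps to 0 after the top bit, ending the loop. */
    for (bit = 1; bit != 0; bit <<= 1) {
        if (flags & bit)
            printf("  bit 0x%08x set\n", bit);
    }
}
```

Under those assumed values, acquire_would_spin(8390177) is nonzero because bit 0x1 is set, which would be consistent with the loop at line 1048 never exiting while that bit stays set.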

Regards

Gary
