On Tue, Jul 7, 2009 at 4:27 AM, Attilio Rao<[email protected]> wrote:
> 2009/7/7 Dan Naumov <[email protected]>:
>> On Tue, Jul 7, 2009 at 4:18 AM, Attilio Rao<[email protected]> wrote:
>>> 2009/7/7 Dan Naumov <[email protected]>:
>>>> I just got a panic following by a reboot a few seconds after running
>>>> "portsnap update", /var/log/messages shows the following:
>>>>
>>>> Jul 7 03:49:38 atom syslogd: kernel boot file is /boot/kernel/kernel
>>>> Jul 7 03:49:38 atom kernel: spin lock 0xffffffff80b3edc0 (sched lock
>>>> 1) held by 0xffffff00017d8370 (tid 100054) too long
>>>> Jul 7 03:49:38 atom kernel: panic: spin lock held too long
>>>
>>> That's a known bug, affecting -CURRENT as well.
>>> The cpustop IPI is handled though an NMI, which means it could
>>> interrupt a CPU in any moment, even while holding a spinlock,
>>> violating one well known FreeBSD rule.
>>> That means that the cpu can stop itself while the thread was holding
>>> the sched lock spinlock and not releasing it (there is no way, modulo
>>> highly hackish, to fix that).
>>> In the while hardclock() wants to schedule something else to run and
>>> got stuck on the thread lock.
>>>
>>> Ideal fix would involve not using a NMI for serving the cpustop while
>>> having a cheap way (not making the common path too hard) to tell
>>> hardclock() to avoid scheduling while cpustop is in flight.
>>>
>>> Thanks,
>>> Attilio
>>
>> Any idea if a fix is being worked on and how unlucky must one be to
>> run into this issue, should I expect it to happen again? Is it
>> basically completely random?
>
> I'd like to work on that issue before BETA3 (and backport to
> STABLE_7), I'm just time-constrained right now.
> it is completely random.
>
> Thanks,
> Attilio

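As a reading aid for the explanation quoted above, here is a rough
userland sketch in Python of the same shape of problem. It is only an
analogy and not kernel code: the names simulate, stopped_cpu and
spin_limit_s are made up for illustration, an ordinary mutex stands in
for the sched lock spinlock, and a timeout stands in for the kernel's
"held too long" check.

```python
import threading


def simulate(spin_limit_s=0.2):
    """Toy model: one 'CPU' is stopped while holding a lock, another gives up waiting."""
    sched_lock = threading.Lock()    # stands in for the sched lock spinlock
    lock_taken = threading.Event()   # set once the holder owns the lock
    resume = threading.Event()       # stands in for the stopped CPU being restarted

    def stopped_cpu():
        with sched_lock:
            lock_taken.set()
            # The "NMI" lands here: the holder stops without releasing the lock.
            resume.wait()

    holder = threading.Thread(target=stopped_cpu)
    holder.start()
    lock_taken.wait()

    # The other "CPU" (hardclock() in the quoted explanation) needs the same lock.
    if sched_lock.acquire(timeout=spin_limit_s):
        sched_lock.release()
        result = "acquired"
    else:
        result = "spin lock held too long"

    resume.set()   # let the holder finish so the program exits cleanly
    holder.join()
    return result


if __name__ == "__main__":
    print(simulate())
```

The waiter can never get the lock, because the holder only releases it
after being resumed. In the kernel the equivalent wait ends in the
panic instead of a returned string.
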
Ok, this is getting pretty bad. 23 hours later I got the same kind of
panic; the only difference is that instead of "portsnap update", this
one was triggered by "portsnap cron", which I have running between 3
and 4 am every day:

Jul 8 03:03:49 atom kernel: ssppiinn lloocckk 00xxffffffffffffffff8800bb33eeddc400 ((sscchheedd lloocck k1 )0 )h ehledl db yb y 0x0xfffffffffff0f00001081735339760e 0( t(itdi d 10100006070)5 )t otoo ol olnogng
Jul 8 03:03:49 atom kernel: p
Jul 8 03:03:49 atom kernel: anic: spin lock held too long
Jul 8 03:03:49 atom kernel: cpuid = 0
Jul 8 03:03:49 atom kernel: Uptime: 23h2m38s

- Sincerely,
Dan Naumov
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[email protected]"
