Quoth Attilio Rao on Thursday, 18 August 2011:
> 2011/8/18 Hiroki Sato <[email protected]>:
> > Hiroki Sato <[email protected]> wrote
> >   in <[email protected]>:
> >
> > hr> Attilio Rao <[email protected]> wrote
> > hr>   in <caj-fndcdow0_b2mv0lzeo-tpea9+7oanj7ihvkqsm4j4b0d...@mail.gmail.com>:
> > hr>
> > hr> at> 2011/8/17 Hiroki Sato <[email protected]>:
> > hr> at> > Hi,
> > hr> at> >
> > hr> at> > Mike Tancsa <[email protected]> wrote
> > hr> at> >   in <[email protected]>:
> > hr> at> >
> > hr> at> > mi> On 7/7/2011 7:32 AM, Mike Tancsa wrote:
> > hr> at> > mi> > On 7/7/2011 4:20 AM, Kostik Belousov wrote:
> > hr> at> > mi> >>
> > hr> at> > mi> >> BTW, we had a similar panic, "spinlock held too long", the spinlock
> > hr> at> > mi> >> is the sched lock N, on busy 8-core box recently upgraded to the
> > hr> at> > mi> >> stable/8. Unfortunately, machine hung dumping core, so the stack trace
> > hr> at> > mi> >> for the owner thread was not available.
> > hr> at> > mi> >>
> > hr> at> > mi> >> I was unable to make any conclusion from the data that was present.
> > hr> at> > mi> >> If the situation is reproducible, you could try to revert r221937. This
> > hr> at> > mi> >> is pure speculation, though.
> > hr> at> > mi> >
> > hr> at> > mi> > Another crash just now after 5hrs uptime. I will try and revert r221937
> > hr> at> > mi> > unless there is any extra debugging you want me to add to the kernel
> > hr> at> > mi> > instead ?
> > hr> at> >
> > hr> at> > I am also suffering from a reproducible panic on an 8-STABLE box, an
> > hr> at> > NFS server with heavy I/O load. I could not get a kernel dump
> > hr> at> > because this panic locked up the machine just after it occurred, but
> > hr> at> > according to the stack trace it was the same as posted one.
> > hr> at> > Switching to an 8.2R kernel can prevent this panic.
> > hr> at> >
> > hr> at> > Any progress on the investigation?
> > hr> at>
> > hr> at> Hiroki,
> > hr> at> how easily can you reproduce it?
> > hr>
> > hr> It takes 5-10 hours. I installed another kernel for debugging just
> > hr> now, so I think I will be able to collect more detailed information in
> > hr> a couple of days.
> > hr>
> > hr> at> It would be important to have a DDB textdump with this information:
> > hr> at> - bt
> > hr> at> - ps
> > hr> at> - show allpcpu
> > hr> at> - alltrace
> > hr> at>
> > hr> at> Alternatively, a coredump which has the stop cpu patch which Andryi can provide.
> > hr>
> > hr> Okay, I will post them once I can get another panic. Thanks!
> >
> > I got the panic with a crash dump this time. The result of bt, ps,
> > allpcpu, and traces can be found at the following URL:
> >
> > http://people.allbsd.org/~hrs/FreeBSD/pool-panic_20110818-1.txt
>
> Actually, I think I see the bug here.
>
> In callout_cpu_switch() if a low priority thread is migrating the
> callout and gets preempted after the outcoming cpu queue lock is left
> (and scheduled much later) we get this problem.
>
> In order to fix this bug it could be enough to use a critical section,
> but I think this should be really interrupt safe, thus I'd wrap them
> up with spinlock_enter()/spinlock_exit(). Fortunately
> callout_cpu_switch() should be called rarely and also we already do
> expensive locking operations in callout, thus we should not have
> problem performance-wise.
>
> Can the guys I also CC'ed here try the following patch, with all the
> initial kernel options that were leading you to the deadlock? (thus
> revert any debugging patch/option you added for the moment):
> http://www.freebsd.org/~attilio/callout-fixup.diff
>
> Please note that this patch is for STABLE_8, if you can confirm the
> good result I'll commit to -CURRENT and then backmerge as soon as
> possible.
>
> Thanks,
> Attilio
>
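
For anyone following the thread, the ordering described in the quoted
text above can be sketched in userspace C roughly as follows. This is
only an illustration of the idea, not the contents of
callout-fixup.diff. The spinlock_enter()/spinlock_exit() functions here
are plain counters standing in for the kernel primitives of the same
name, and cc_lock(), cc_unlock(), cc_table, CPU_IN_TRANSIT,
unsafe_handoffs and callout_cpu_switch_sketch() are names made up for
the example.

```c
#include <assert.h>

/* Hypothetical stand-ins: in the kernel these disable interrupts and
 * preemption; here they only maintain a nesting counter. */
static int preempt_disabled;
static void spinlock_enter(void) { preempt_disabled++; }
static void spinlock_exit(void)  { preempt_disabled--; }

/* Hypothetical marker meaning "callout is between two CPU queues". */
#define CPU_IN_TRANSIT (-1)

struct callout     { int c_cpu; };
struct callout_cpu { int locked; };

static struct callout_cpu cc_table[2];  /* one queue per simulated CPU */
static int unsafe_handoffs;             /* unlocks done while preemptible */

static void cc_lock(struct callout_cpu *cc)
{
    assert(!cc->locked);
    cc->locked = 1;
}

static void cc_unlock(struct callout_cpu *cc)
{
    assert(cc->locked);
    cc->locked = 0;
    /* The window described in the mail: the old queue lock is gone and
     * the new one is not yet held. Count it if preemption is possible. */
    if (preempt_disabled == 0)
        unsafe_handoffs++;
}

/* Move callout c from queue cc to the queue of new_cpu, keeping the
 * whole lock handoff inside spinlock_enter()/spinlock_exit(). */
static struct callout_cpu *
callout_cpu_switch_sketch(struct callout *c, struct callout_cpu *cc,
    int new_cpu)
{
    struct callout_cpu *new_cc;

    c->c_cpu = CPU_IN_TRANSIT;
    spinlock_enter();            /* no preemption from here ... */
    cc_unlock(cc);
    new_cc = &cc_table[new_cpu];
    cc_lock(new_cc);
    spinlock_exit();             /* ... until the new lock is held */
    c->c_cpu = new_cpu;
    return (new_cc);
}
```

To try it, lock cc_table[0], call callout_cpu_switch_sketch() with
new_cpu set to 1, and check that unsafe_handoffs stays at zero. Taking
out the spinlock_enter()/spinlock_exit() pair makes the counter go up,
which is the window a low priority thread could be preempted in.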
Thanks, Attilio. I've applied the patch and removed the extra debug
options I had added (though keeping debug symbols). I'll let you know if
I experience any more panics.

Regards,
-- 
.O. | Sterling (Chip) Camden | http://camdensoftware.com
..O | [email protected] | http://chipsquips.com
OOO | 2048R/D6DBAF91 | http://chipstips.com
