On Monday 19 September 2005 03:35 pm, Koen Martens wrote:
> Vinod Kashyap wrote:
> > You seem to be booting off of a 9000 (twa) controller and not
> > 7000/8000 (twe).
> > It could be because of a 9000 firmware bug that you are not being
> > able to get the dump. The firmware wrongly interprets physical
> > address 0x0 as invalid during dumps, and fails the operations.
> > This bug will be fixed in future firmware releases.
>
> Ok, it's been a while, here is an update on this.
>
> I ran a heavily instrumented kernel for two weeks on the server, it
> did not crash in that time. I then took out the witness and kdb/ddb
> stuff, because the decreased performance was a bit of a nuisance,
> however i retained the ability to obtain a crash dump. I had to
> limit physical memory, put it on 1.8GB in loader.conf:hw.physmem
> because swap and physmem are both 2GB. Tested with 'reboot -d' gave
> me a core dump.
>
> Without the debug stuff in the kernel, it crashed within 2 days,
> same story: postgresql process, function propagate_priority.
> However, no dump was written to disk :(
>
> Furthermore, i've been seeing the same crash (in propagate_priority)
> on another box in mysql processes. Both servers seem to panic every
> 2-3 days. I have another server of the exact same hardware
> configuration, but it is mainly idling most of the time. Haven't
> seen that one crash yet.
>
> I am thinking now that it is a bug in the twa driver, so i'll have
> to dig in to that. Furthermore, it seems to have to do with some
> sort of concurrency issue or otherwise timing-sensitive issue,
> because slowing the kernel down with debug code seems to avoid the
> panic. But, as i am completely new to the freebsd kernel and don't
> even know what turnstiles are, i imagine i will have a hard time. So
> if anyone can offer some help, please :)
>
> Ok, thanks for your attention,

This panic usually happens because a thread went to sleep while
holding a mutex (WITNESS will warn you about this when it happens, but
as you noted, it slows things down). It may also happen if a thread
exits while holding a lock, or if a thread is blocked on a mutex that
is destroyed after the thread has blocked on it.

--
John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org

