Re: random FreeBSD panics
On Sat, Apr 3, 2010 at 6:21 PM, Masoom Shaikh masoom.sha...@gmail.com wrote:
> On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras ivo...@freebsd.org wrote:
> > On 28 March 2010 16:42, Masoom Shaikh masoom.sha...@gmail.com wrote:
> > > lets assume if this is h/w problem, then how can other OSes overcome
> > > this ? is there a way to make FreeBSD ignore this as well, let it
> > > result in reasonable performance penalty.
> >
> > Very probably, if only we could detect where the problem is. Try adding
> >
> > options PRINTF_BUFR_SIZE=128
> >
> > to the kernel configuration file if you can, to see if you can get a
> > less mangled log output.
>
> ok, after few days of silence I am back with more questions
> this time system feels little better, it is able to sustain for more
> time than what 7.3-RELEASE could
>
> FreeBSD raptor 8.0-RELEASE-p2 FreeBSD 8.0-RELEASE-p2 #0: Thu Apr 1
> 01:20:45 UTC 2010 root@:/usr/obj/usr/src/sys/INSPIRON amd64
>
> I am using KDE4, and when OS freezes, well it freezes, means I cannot
> change to tty0 and see the panic text, if any it might possibly have
> spit. the stuck frozen GUI keeps staring there. So the question is how
> do I capture that panic text ?
>
> unfortunately I am not getting core files too, so there is nothing I
> can pick up hints from
>
> is there some option (KDB, DDB), so that on panic system drop to
> debugger ?
>
> Masoom Shaikh

I am having the very same problem: my AMD64 machine running i386 (both 7.3-REL and 8.0-REL) keeps crashing. The best part is, if I disable ACPI it crashes before it even boots up; the same is the case with safe mode and single-user mode. With ACPI it boots up but crashes after a while.

I have the vmcore files on the system. Whom do I contact in this regard?
___ freebsd-questi...@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: random FreeBSD panics
On Thu, Apr 8, 2010 at 4:45 PM, Anoop Kumar Narayanan anoop...@gmail.com wrote:
> On Sat, Apr 3, 2010 at 6:21 PM, Masoom Shaikh masoom.sha...@gmail.com wrote:
> > On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras ivo...@freebsd.org wrote:
> > > On 28 March 2010 16:42, Masoom Shaikh masoom.sha...@gmail.com wrote:
> > > > lets assume if this is h/w problem, then how can other OSes
> > > > overcome this ? is there a way to make FreeBSD ignore this as
> > > > well, let it result in reasonable performance penalty.
> > >
> > > Very probably, if only we could detect where the problem is. Try
> > > adding
> > >
> > > options PRINTF_BUFR_SIZE=128
> > >
> > > to the kernel configuration file if you can, to see if you can get
> > > a less mangled log output.
> >
> > ok, after few days of silence I am back with more questions
> > this time system feels little better, it is able to sustain for more
> > time than what 7.3-RELEASE could
> >
> > FreeBSD raptor 8.0-RELEASE-p2 FreeBSD 8.0-RELEASE-p2 #0: Thu Apr 1
> > 01:20:45 UTC 2010 root@:/usr/obj/usr/src/sys/INSPIRON amd64
> >
> > I am using KDE4, and when OS freezes, well it freezes, means I cannot
> > change to tty0 and see the panic text, if any it might possibly have
> > spit. the stuck frozen GUI keeps staring there. So the question is
> > how do I capture that panic text ?
> >
> > unfortunately I am not getting core files too, so there is nothing I
> > can pick up hints from
> >
> > is there some option (KDB, DDB), so that on panic system drop to
> > debugger ?
> >
> > Masoom Shaikh
>
> I am having the very same problem, with my AMD64 running i386 (both
> 7.3-REL and 8.0-REL) keeps crashing, The best part is, if I disable
> ACPI it crashes before it even boots up so is the case with safe-mode
> and single-user-mode. With ACPI it boots up but crashes after a while.
>
> I have the vmcore files on the system. Who do I contact on this regard ?

can you load that file in kgdb and get a backtrace ?
Re: random FreeBSD panics
On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras ivo...@freebsd.org wrote:
> On 28 March 2010 16:42, Masoom Shaikh masoom.sha...@gmail.com wrote:
> > lets assume if this is h/w problem, then how can other OSes overcome
> > this ? is there a way to make FreeBSD ignore this as well, let it
> > result in reasonable performance penalty.
>
> Very probably, if only we could detect where the problem is. Try adding
>
> options PRINTF_BUFR_SIZE=128
>
> to the kernel configuration file if you can, to see if you can get a
> less mangled log output.

ok, after few days of silence I am back with more questions.
This time the system feels a little better; it is able to stay up for more time than 7.3-RELEASE could.

FreeBSD raptor 8.0-RELEASE-p2 FreeBSD 8.0-RELEASE-p2 #0: Thu Apr 1 01:20:45 UTC 2010 root@:/usr/obj/usr/src/sys/INSPIRON amd64

I am using KDE4, and when the OS freezes, well, it freezes: I cannot change to tty0 and see the panic text, if it has printed any. The stuck, frozen GUI just keeps staring back. So the question is, how do I capture that panic text?

Unfortunately I am not getting core files either, so there is nothing I can pick up hints from.

Is there some option (KDB, DDB) so that on panic the system drops to the debugger?

Masoom Shaikh
Re: random FreeBSD panics
On Sat, 3 Apr 2010 12:51:46 + Masoom Shaikh masoom.sha...@gmail.com wrote:
> On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras ivo...@freebsd.org wrote:
> > On 28 March 2010 16:42, Masoom Shaikh masoom.sha...@gmail.com wrote:
> > > lets assume if this is h/w problem, then how can other OSes overcome
> > > this ? is there a way to make FreeBSD ignore this as well, let it
> > > result in reasonable performance penalty.
> >
> > Very probably, if only we could detect where the problem is. Try adding
> >
> > options PRINTF_BUFR_SIZE=128
> >
> > to the kernel configuration file if you can, to see if you can get a
> > less mangled log output.
>
> ok, after few days of silence I am back with more questions
> this time system feels little better, it is able to sustain for more
> time than what 7.3-RELEASE could
>
> FreeBSD raptor 8.0-RELEASE-p2 FreeBSD 8.0-RELEASE-p2 #0: Thu Apr 1
> 01:20:45 UTC 2010 root@:/usr/obj/usr/src/sys/INSPIRON amd64
>
> I am using KDE4, and when OS freezes, well it freezes, means I cannot
> change to tty0 and see the panic text, if any it might possibly have
> spit. the stuck frozen GUI keeps staring there. So the question is how
> do I capture that panic text ?
>
> unfortunately I am not getting core files too, so there is nothing I
> can pick up hints from
>
> is there some option (KDB, DDB), so that on panic system drop to
> debugger ?

[trimmed Cc - no need to send this to 3 MLs]

There's no code in the kernel to switch back out of graphics mode (i.e. what X uses) when a panic happens. You probably can switch to v0, but you won't be able to see it.

The only sure-fire way is to hook up a screen (terminal, laptop or another computer) to a serial port.

-- 
Gary Jennejohn
Re: random FreeBSD panics
On Sunday 28 March 2010 4:28:29 am Masoom Shaikh wrote:
> Hello List,
>
> I was a happy FreeBSD user, just before I installed FreeBSD8.0-RC1.
> Since then, system randomly just freezes, and there is no option other
> than hard boot. I guessed this will get solved in 8.0-RELEASE, but it
> was not :(
>
> Many times I get vmcore files, not always. I have dumpdev set to AUTO
> in my rc.conf. Almost every time it just fsck's the file-system on
> reboot. I have not lost any files though.
>
> This is a Dell Inspiron 1525 Laptop with 1GB ram, Intel Core2 Duo T5500
> with ATI Radeon X1400 card. The installation in question is KDE4 from
> ports, with radeon/ati driver.
>
> I felt the problem is with wpi driver, then suspected dri driver of X.
> Then I observed system freezes even if none of this is installed. e.g.
> if it is under some load, like building a port and simultaneously
> fetching something over network it hangs, and hangs hard. This
> persuaded me to think something is wrong in kernel scheduling itself.
> May be it is lost in some deadlock, etc...
>
> Thus last weekend I thought I would see how immediate previous version
> i.e. FreeBSD-7.3-RELEASE would behave. I reinstalled FreeBSD7.1 from
> iso images, svn up'ed FreeBSD7.3 source, did the normal buildworld,
> buildkernel, installkernel, installworld cycle. Unfortunately this
> kernel is naughty as well ;-), it also freezes with same stubbornness.
> But difference is this time I happen to catch something interesting.
> It panics on NMI, fatal trap 19 while in kernel mode. Loaded the vmcore
> file in kgdb and got the backtrace. I obtained vmcore files on two
> occasions. I have attached both the back traces.
>
> This error most likely suggests hardware error in RAM, but Windows 7
> and XP boot just fine and never caused any errors.

Yes, and note that the chipset has set a register to indicate a RAM parity error as well, so it is not a random NMI. Have you checked your BIOS' event log?

You may also want to try running with machine checks enabled (hw.mca.enabled=1 in loader.conf, but it would have to be on very recent 7/8-stable) to see if you get machine checks for ECC errors. OTOH, if you do not have ECC memory then this will probably not help.

> To verify if I have errors in my RAM I let run sysutils/memtest86+
> overnight, to double verify I also executed Windows Memory Diagnostic
> test for four times. None of them reported errors.
>
> Can anyone here suggest any solution.

You can still have bad RAM even if those do not fail.

-- 
John Baldwin
Re: random FreeBSD panics
On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras ivo...@freebsd.org wrote:
> On 28 March 2010 16:42, Masoom Shaikh masoom.sha...@gmail.com wrote:
> > lets assume if this is h/w problem, then how can other OSes overcome
> > this ? is there a way to make FreeBSD ignore this as well, let it
> > result in reasonable performance penalty.
>
> Very probably, if only we could detect where the problem is. Try adding
>
> options PRINTF_BUFR_SIZE=128
>
> to the kernel

this option is already there

> configuration file if you can, to see if you can get a less mangled log
> output.
Re: random FreeBSD panics
On Mon, Mar 29, 2010 at 05:01:02PM +, Masoom Shaikh wrote:
> On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras ivo...@freebsd.org wrote:
> > On 28 March 2010 16:42, Masoom Shaikh masoom.sha...@gmail.com wrote:
> > > lets assume if this is h/w problem, then how can other OSes overcome
> > > this ? is there a way to make FreeBSD ignore this as well, let it
> > > result in reasonable performance penalty.
> >
> > Very probably, if only we could detect where the problem is. Try adding
> >
> > options PRINTF_BUFR_SIZE=128
> >
> > to the kernel
>
> this option is already there

The key word in Ivan's phrase is "less mangled". Neither using nor increasing PRINTF_BUFR_SIZE solves the problem of interspersed console output. I've been ranting/raving about this problem for years now; it truly looks like a mutex lock issue (or lack of such lock), but I've been told numerous times that isn't the case.

To developers: what incentives would help get this issue much-needed attention? This problem makes kernel debugging, panic analysis, and other console-oriented viewing basically impossible.

-- 
| Jeremy Chadwick                                   j...@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
Re: random FreeBSD panics
On Monday 29 March 2010 1:30:38 pm Jeremy Chadwick wrote:
> On Mon, Mar 29, 2010 at 05:01:02PM +, Masoom Shaikh wrote:
> > On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras ivo...@freebsd.org wrote:
> > > On 28 March 2010 16:42, Masoom Shaikh masoom.sha...@gmail.com wrote:
> > > > lets assume if this is h/w problem, then how can other OSes
> > > > overcome this ? is there a way to make FreeBSD ignore this as
> > > > well, let it result in reasonable performance penalty.
> > >
> > > Very probably, if only we could detect where the problem is. Try
> > > adding
> > >
> > > options PRINTF_BUFR_SIZE=128
> > >
> > > to the kernel
> >
> > this option is already there
>
> The key word in Ivan's phrase is "less mangled". Neither use of or
> increasing PRINTF_BUFR_SIZE solves the problem of interspersed console
> output. I've been ranting/raving about this problem for years now; it
> truly looks like a mutex lock issue (or lack of such lock), but I've
> been told numerous times that isn't the case.
>
> To developers: what incentives would help get this issue well-needed
> attention? This problem makes kernel debugging, panic analysis, and
> other console-oriented viewing basically impossible.

I was recently going to look at it. The somewhat drastic approach I was going to take was to add a simple serializing lock around trap_fatal() and a few other places that do similar block prints (e.g. mca_log()).

One of the issues with fixing this in printf itself is that you'd probably want to serialize complete lines of text on a per-thread basis. You would want to be able to accumulate this line of text across multiple calls to printf (think of it as line-buffering a la stdio). However, some folks may be nervous about printf not printing things immediately.

The other issue is that lots of code assumes it can call printf from anywhere and everywhere. Mostly this just means that if you add locking and line-buffering to printf(9) you have to be very careful to make sure it works in odd places.

Probably a lot of this could be solved by deferring things like trap_fatal() until panic() has already been called (which is bde's preferred solution I think).

-- 
John Baldwin
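
[Editor's sketch] The per-thread line-buffering idea described above can be illustrated in userland C. This is only a rough sketch under stated assumptions, not FreeBSD kernel code and not anyone's proposed patch: `lb_printf`, `lb_flush`, `console_write`, and the in-memory `console` buffer are hypothetical names invented for illustration, and a pthread mutex stands in for whatever lock the kernel would actually need. Each thread accumulates text across calls and only takes the shared lock to emit a complete line.

```c
#include <assert.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LINE_BUF_LEN 128   /* hypothetical size, echoing PRINTF_BUFR_SIZE=128 */

/* Stand-in for the console: a memory buffer guarded by a single lock. */
static char console[4096];
static size_t console_len;
static pthread_mutex_t console_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread accumulation buffer: text waits here until a line is complete. */
static _Thread_local char line[LINE_BUF_LEN];
static _Thread_local size_t line_len;

/* Append one chunk to the console while holding the lock, so chunks from
 * different threads cannot interleave. Overflowing text is dropped. */
void console_write(const char *buf, size_t len)
{
    pthread_mutex_lock(&console_lock);
    if (len > sizeof(console) - 1 - console_len)
        len = sizeof(console) - 1 - console_len;
    memcpy(console + console_len, buf, len);
    console_len += len;
    console[console_len] = '\0';
    pthread_mutex_unlock(&console_lock);
}

/* Emit whatever the calling thread has accumulated so far. */
void lb_flush(void)
{
    if (line_len > 0) {
        console_write(line, line_len);
        line_len = 0;
    }
}

/* printf-like call: output is held back until a newline arrives or the
 * per-thread buffer fills up. */
void lb_printf(const char *fmt, ...)
{
    char tmp[LINE_BUF_LEN];
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(tmp, sizeof(tmp), fmt, ap);
    va_end(ap);
    if (n < 0)
        return;
    if ((size_t)n >= sizeof(tmp))
        n = (int)sizeof(tmp) - 1;          /* formatted text was truncated */

    for (int i = 0; i < n; i++) {
        assert(line_len < sizeof(line));
        line[line_len++] = tmp[i];
        if (tmp[i] == '\n' || line_len == sizeof(line))
            lb_flush();
    }
}
```

With this sketch, calling `lb_printf("fatal ")` followed by `lb_printf("trap %d\n", 19)` leaves the console untouched after the first call and appends the whole line after the second. It also shows the concern raised above: nothing is visible until the newline arrives, so a caller that never finishes its line would need an explicit `lb_flush()`.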
Re: random FreeBSD panics
On Mon, Mar 29, 2010 at 02:27:34PM -0400, John Baldwin wrote:
> On Monday 29 March 2010 1:30:38 pm Jeremy Chadwick wrote:
> > On Mon, Mar 29, 2010 at 05:01:02PM +, Masoom Shaikh wrote:
> > > On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras ivo...@freebsd.org wrote:
> > > > On 28 March 2010 16:42, Masoom Shaikh masoom.sha...@gmail.com wrote:
> > > > > lets assume if this is h/w problem, then how can other OSes
> > > > > overcome this ? is there a way to make FreeBSD ignore this as
> > > > > well, let it result in reasonable performance penalty.
> > > >
> > > > Very probably, if only we could detect where the problem is. Try
> > > > adding
> > > >
> > > > options PRINTF_BUFR_SIZE=128
> > > >
> > > > to the kernel
> > >
> > > this option is already there
> >
> > The key word in Ivan's phrase is "less mangled". Neither use of or
> > increasing PRINTF_BUFR_SIZE solves the problem of interspersed
> > console output. I've been ranting/raving about this problem for years
> > now; it truly looks like a mutex lock issue (or lack of such lock),
> > but I've been told numerous times that isn't the case.
> >
> > To developers: what incentives would help get this issue well-needed
> > attention? This problem makes kernel debugging, panic analysis, and
> > other console-oriented viewing basically impossible.
>
> I was recently going to look at it. The somewhat drastic approach I was
> going to take was to add a simple serializing lock around trap_fatal()
> and a few other places that do similar block prints (e.g. mca_log()).
>
> One of the issues with fixing this in printf itself is that you'd
> probably want to serialize complete lines of text on a per-thread
> basis. You would want to be able to accumulate this line of text across
> multiple calls to printf (think of it as line-buffering a la stdio).
> However, some folks may be nervous about printf not printing things
> immediately.
>
> The other issue is that lots of code assumes it can call printf from
> anywhere and everywhere. Mostly this just means that if you add locking
> and line-buffering to printf(9) you have to be very careful to make
> sure it works in odd places.
>
> Probably a lot of this could be solved by deferring things like
> trap_fatal() until panic() has already been called (which is bde's
> preferred solution I think).

John,

Thanks for the insights, they're greatly appreciated.

I went looking this morning to see how Linux addressed this issue (if at all), and it's been discussed a few times in the past. The longest lkml thread I could find that mentioned the problem was circa 2002. Probably not worth reading as there was work done in 2009 to solve the issue.

http://lkml.indiana.edu/hypermail/linux/kernel/0204.1/index.html#161

Work done by RedHat in 2009 details how they implemented a lockless version of their kernel ring buffer (similar to our system message buffer, but probably a lot more complex):

http://lwn.net/Articles/340400/
http://lwn.net/Articles/340443/

Supposedly having multiple writers to the ring is 100% safe; no interspersed output. Same goes for interrupt-generated stuff.

There are some comments in the technical document (2nd link) that imply there's an individual ring buffer for each CPU; possibly per-CPU kernel message buffers would solve our issue?

-- 
| Jeremy Chadwick                                   j...@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
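
[Editor's sketch] The per-CPU message buffer idea raised above can be pictured roughly as follows. This is a toy userland illustration under loud assumptions: it is neither the Linux ring buffer from the linked articles nor FreeBSD's message buffer, and `msg_write`, `msg_dump`, `NCPU`, `NREC`, and the record layout are hypothetical names made up here. Each "CPU" writes whole records only into its own buffer, so writers never share storage; a global sequence counter lets a reader merge the buffers back into one ordered log.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define NCPU     2    /* hypothetical number of CPUs */
#define NREC     8    /* records kept per CPU before the oldest is overwritten */
#define REC_LEN  64   /* maximum bytes per record, including the terminator */

struct record {
    unsigned long seq;        /* global ordering stamp, starts at 1 */
    char text[REC_LEN];
};

struct cpu_buf {
    struct record rec[NREC];
    unsigned int count;       /* records written; only the owning CPU updates it */
};

static struct cpu_buf bufs[NCPU];
static atomic_ulong next_seq = 1;

/* Writer side: stores one complete message in the buffer owned by `cpu`.
 * The sequence counter is the only state shared between writers. */
void msg_write(int cpu, const char *text)
{
    assert(cpu >= 0 && cpu < NCPU);
    struct cpu_buf *b = &bufs[cpu];
    struct record *r = &b->rec[b->count % NREC];

    r->seq = atomic_fetch_add(&next_seq, 1);
    snprintf(r->text, sizeof(r->text), "%s", text);
    b->count++;
}

/* Reader side: merges all per-CPU buffers into `out` in sequence order,
 * one message per line. Returns the number of bytes written. */
size_t msg_dump(char *out, size_t outlen)
{
    unsigned long last = 0;
    size_t used = 0;

    if (outlen == 0)
        return 0;
    out[0] = '\0';

    for (;;) {
        const struct record *best = NULL;

        /* Find the record with the smallest sequence number not yet emitted. */
        for (int c = 0; c < NCPU; c++) {
            unsigned int valid = bufs[c].count < NREC ? bufs[c].count : NREC;
            for (unsigned int i = 0; i < valid; i++) {
                const struct record *r = &bufs[c].rec[i];
                if (r->seq > last && (best == NULL || r->seq < best->seq))
                    best = r;
            }
        }
        if (best == NULL)
            break;

        int w = snprintf(out + used, outlen - used, "%s\n", best->text);
        if (w < 0 || (size_t)w >= outlen - used) {
            out[used] = '\0';              /* no room: drop the partial line */
            break;
        }
        used += (size_t)w;
        last = best->seq;
    }
    return used;
}
```

Because every record is written as a unit into storage owned by one CPU, lines from different CPUs cannot be interleaved character by character. The sketch ignores the hard parts a real kernel would face, such as a reader running concurrently with writers and messages generated from interrupt context.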
Re: random FreeBSD panics
On Sun, Mar 28, 2010 at 10:32 AM, Ivan Voras ivo...@freebsd.org wrote:
> Masoom Shaikh wrote:
> > Hello List,
> >
> > I was a happy FreeBSD user, just before I installed FreeBSD8.0-RC1.
> > Since then, system randomly just freezes, and there is no option
> > other than hard boot. I guessed this will get solved in 8.0-RELEASE,
> > but it was not :(
>
> A wild shot - did you try disabling superpages?

umm, how do I do that ?
Re: random FreeBSD panics
On Sun, 28 Mar 2010 11:18:59 + Masoom Shaikh masoom.sha...@gmail.com wrote:
> On Sun, Mar 28, 2010 at 10:32 AM, Ivan Voras ivo...@freebsd.org wrote:
> > Masoom Shaikh wrote:
> > > Hello List,
> > >
> > > I was a happy FreeBSD user, just before I installed FreeBSD8.0-RC1.
> > > Since then, system randomly just freezes, and there is no option
> > > other than hard boot. I guessed this will get solved in
> > > 8.0-RELEASE, but it was not :(
> >
> > A wild shot - did you try disabling superpages?
>
> umm, how do I do that ?

Add this to /boot/loader.conf:

vm.pmap.pg_ps_enabled=0

-- 
Gary Jennejohn
Re: random FreeBSD panics
On 28 March 2010 13:18, Masoom Shaikh masoom.sha...@gmail.com wrote:
> On Sun, Mar 28, 2010 at 10:32 AM, Ivan Voras ivo...@freebsd.org wrote:
> > Masoom Shaikh wrote:
> > > Hello List,
> > >
> > > I was a happy FreeBSD user, just before I installed FreeBSD8.0-RC1.
> > > Since then, system randomly just freezes, and there is no option
> > > other than hard boot. I guessed this will get solved in
> > > 8.0-RELEASE, but it was not :(
> >
> > A wild shot - did you try disabling superpages?
>
> umm, how do I do that ?

Set vm.pmap.pg_ps_enabled=0 in /boot/loader.conf and reboot. Report back if it helps or not.
Re: random FreeBSD panics
On Sun, Mar 28, 2010 at 12:03 PM, Ivan Voras ivo...@freebsd.org wrote:
> On 28 March 2010 13:18, Masoom Shaikh masoom.sha...@gmail.com wrote:
> > On Sun, Mar 28, 2010 at 10:32 AM, Ivan Voras ivo...@freebsd.org wrote:
> > > Masoom Shaikh wrote:
> > > > Hello List,
> > > >
> > > > I was a happy FreeBSD user, just before I installed
> > > > FreeBSD8.0-RC1. Since then, system randomly just freezes, and
> > > > there is no option other than hard boot. I guessed this will get
> > > > solved in 8.0-RELEASE, but it was not :(
> > >
> > > A wild shot - did you try disabling superpages?
> >
> > umm, how do I do that ?
>
> Set vm.pmap.pg_ps_enabled=0 in /boot/loader.conf and reboot. Report
> back if it helps or not.

nope, this didn't help either. The machine froze again after about 30 minutes of use; all it was doing was playing music in amarok, fetching sources from svn repos, and running firefox.

Let's assume this is a h/w problem: then how can other OSes overcome it? Is there a way to make FreeBSD ignore it as well, even at the cost of a reasonable performance penalty?
Re: random FreeBSD panics
On 28 March 2010 16:42, Masoom Shaikh masoom.sha...@gmail.com wrote:
> lets assume if this is h/w problem, then how can other OSes overcome
> this ? is there a way to make FreeBSD ignore this as well, let it
> result in reasonable performance penalty.

Very probably, if only we could detect where the problem is. Try adding

options PRINTF_BUFR_SIZE=128

to the kernel configuration file if you can, to see if you can get a less mangled log output.
Re: random FreeBSD panics
On Sun, Mar 28, 2010 at 8:42 AM, Masoom Shaikh masoom.sha...@gmail.com wrote:
> nopes, this didn't help too, machine freezed again after using for 30
> minutes or so
> all it was doing is playing amarok, fetching sources from svn repos,
> and using firefox
>
> lets assume if this is h/w problem, then how can other OSes overcome
> this ? is there a way to make FreeBSD ignore this as well, let it
> result in reasonable performance penalty.

They would remove or replace the bad hardware. I've seen more than one DIMM which passed every memory checker I could find in its most extensive testing mode. The only consistently effective option is to replace it with a known good piece of memory.

-- 
Adam Vande More