SOLVED - Re: Simple question re: oops
On Sun, 2005-07-31 at 10:40 +1000, Dave Airlie wrote:
> > panic_on_oops has no effect, a bunch of stuff flies past and the last
> > thing I see is "gam_server: scheduling while atomic" then a stack trace
> > of the core dump path then "Aiee, killing interrupt handler".
> >
> > I am starting to suspect the hard drive, does that sound plausible?
> > It's as if it locks up when it hits a certain disk block.
>
> run memtest on it... you might have bad RAM..

This was some kind of (ACPI related?) kernel bug. I upgraded from Hoary
(2.6.11) to Breezy (2.6.12) and the problem which had been 100%
reproducible went away.

One strange thing I noticed was some strange APM/ACPI related messages in
the logs when starting X (APM: overridden by ACPI or something). Now I
don't get these and the X log just says /dev/apm_bios: No such device.

Oh well, it's working now.

Lee

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Simple question re: oops
On Sun, 2005-07-31 at 10:40 +1000, Dave Airlie wrote:
> > panic_on_oops has no effect, a bunch of stuff flies past and the last
> > thing I see is "gam_server: scheduling while atomic" then a stack trace
> > of the core dump path then "Aiee, killing interrupt handler".
> >
> > I am starting to suspect the hard drive, does that sound plausible?
> > It's as if it locks up when it hits a certain disk block.
>
> run memtest on it... you might have bad RAM..
>

Already swapped it out, but I'll try memtest.

Any idea why printk_ratelimit does not work? I set it to 1000 (per the
docs this should limit to 1 printk per second) and burst to 1 but I still
get screenfuls of text flying by.

Lee
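For readers unfamiliar with interval-plus-burst rate limiting, here is a minimal sketch of the general idea as a token bucket. It is an illustration only, not the kernel's code: the class name, the function names and the abstract time units are all assumptions made for this example. Note also that, as far as I know, the kernel's limit is honoured only by call sites that explicitly check printk_ratelimit() before printing, so messages emitted with a plain printk() would not be throttled by these settings.

```python
class RateLimiter:
    """Token-bucket limiter with an interval and a burst (hypothetical, illustrative)."""

    def __init__(self, interval, burst):
        self.interval = interval          # cost of one message, in time units
        self.burst = burst                # messages allowed back to back
        self.tokens = interval * burst    # start with a full bucket
        self.last = None                  # time of the previous call

    def allow(self, now):
        """Return True if a message at time `now` may be printed."""
        if self.last is not None:
            self.tokens += now - self.last    # earn credit for elapsed time
        self.last = now
        cap = self.interval * self.burst
        if self.tokens > cap:
            self.tokens = cap                 # never bank more than one burst
        if self.tokens >= self.interval:
            self.tokens -= self.interval      # spend one message's worth
            return True
        return False


def count_allowed(interval, burst, timestamps):
    """Count how many of the given message timestamps get through."""
    limiter = RateLimiter(interval, burst)
    return sum(1 for t in timestamps if limiter.allow(t))
```

With an interval of 10 units and a burst of 1, a message arriving every unit gets through only once per interval; a larger burst lets a few extra messages through at the start before the same steady rate applies.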
Re: Simple question re: oops
> panic_on_oops has no effect, a bunch of stuff flies past and the last
> thing I see is "gam_server: scheduling while atomic" then a stack trace
> of the core dump path then "Aiee, killing interrupt handler".
>
> I am starting to suspect the hard drive, does that sound plausible?
> It's as if it locks up when it hits a certain disk block.

run memtest on it... you might have bad RAM..

Dave.
Re: Simple question re: oops
On Sun, 2005-07-31 at 02:11 +0200, Alexander Nyberg wrote:
> On Sat, Jul 30, 2005 at 07:48:11PM -0400 Lee Revell wrote:
>
> > I have a machine here that oopses reliably when I start X, but the
> > interesting stuff scrolls away too fast, and a bunch more Oopses get
> > printed ending with "Aieee, killing interrupt handler".
> >
> > How do I get the output to stop after the first Oops?
> >
>
> set /proc/sys/kernel/panic_on_oops to 1
>
> What version of the kernel is that? It shouldn't do recursive oopses
> (of the same task) any more.
>

panic_on_oops has no effect, a bunch of stuff flies past and the last
thing I see is "gam_server: scheduling while atomic" then a stack trace
of the core dump path then "Aiee, killing interrupt handler".

I am starting to suspect the hard drive, does that sound plausible?
It's as if it locks up when it hits a certain disk block.

Lee
Re: Simple question re: oops
On Sun, 2005-07-31 at 02:11 +0200, Alexander Nyberg wrote:
> On Sat, Jul 30, 2005 at 07:48:11PM -0400 Lee Revell wrote:
>
> > I have a machine here that oopses reliably when I start X, but the
> > interesting stuff scrolls away too fast, and a bunch more Oopses get
> > printed ending with "Aieee, killing interrupt handler".
> >
> > How do I get the output to stop after the first Oops?
> >
>
> set /proc/sys/kernel/panic_on_oops to 1
>
> What version of the kernel is that? It shouldn't do recursive oopses
> (of the same task) any more.
>

2.6.10 (whatever comes with Ubuntu Hoary). It's a demo install for a
client on cobbled together hardware. First I suspected the bleeding edge
GeForce video card, then we swapped it which didn't help. Now I suspect
the hard drive (or a kernel bug).

And I was wrong, it wasn't more Oopses, it was "scheduling while atomic"
messages that forced the interesting stuff offscreen.

Lee
Re: Simple question re: oops
On Sat, Jul 30, 2005 at 07:48:11PM -0400 Lee Revell wrote:
> I have a machine here that oopses reliably when I start X, but the
> interesting stuff scrolls away too fast, and a bunch more Oopses get
> printed ending with "Aieee, killing interrupt handler".
>
> How do I get the output to stop after the first Oops?
>

set /proc/sys/kernel/panic_on_oops to 1

What version of the kernel is that? It shouldn't do recursive oopses
(of the same task) any more.
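Setting the value amounts to writing "1" into that file as root. Here is a minimal sketch, assuming the usual /proc/sys layout in which a dotted sysctl name maps to a path; the helper names set_sysctl and get_sysctl and the root parameter are hypothetical and exist only for this example.

```python
import os


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted name such as 'kernel.panic_on_oops' to its file path."""
    return os.path.join(root, *name.split("."))


def set_sysctl(name, value, root="/proc/sys"):
    """Write a value to a sysctl file (needs root on a real system)."""
    path = sysctl_path(name, root)
    with open(path, "w") as f:
        f.write(f"{value}\n")
    return path


def get_sysctl(name, root="/proc/sys"):
    """Read back the current value as a stripped string."""
    with open(sysctl_path(name, root)) as f:
        return f.read().strip()
```

On a live machine the call would be set_sysctl("kernel.panic_on_oops", 1). The root parameter is there only so the helpers can be tried against a scratch directory without touching the running kernel.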
Re: Simple question re: oops
On Sat, 2005-07-30 at 19:48 -0400, Lee Revell wrote:
> I have a machine here that oopses reliably when I start X, but the
> interesting stuff scrolls away too fast, and a bunch more Oopses get
> printed ending with "Aieee, killing interrupt handler".
>
> How do I get the output to stop after the first Oops?
>

Never mind, /proc/sys/kernel/panic_on_oops should do it.

Lee