Jim,
I tried to compile 2.2.15 and it locked up the box badly. A typical BP6
lockup with everything dying.
My bad. It's 192MB (typo) 128+64
2.2.14 official Linus version. I don't trust any others.
It's definitely NOT overclocked. 2 brand new 500MHz Celerons (about a month
and a half old)
I haven't gotten any of those errors before the 12th and I haven't gotten
any since I rebooted.
- Jon
----- Original Message -----
From: Jim Troutman <[EMAIL PROTECTED]>
To: Discussion List for Linux on Abit Motherboards <[EMAIL PROTECTED]>
Sent: Saturday, May 13, 2000 4:57 PM
Subject: Re: [LINUX-ABIT] Wierd on my remote server (BP6)
On Sat, 13 May 2000, Jonathan Buset wrote:
> The server specs:
>
> 196MB PC100 RAM
isn't this really 192? (128+64, or 64+64+64). Or do you actually have a
4MB DIMM in there?
> kernel-2.2.14 (compiled from source before I shipped it)
was it the official Linus version, or RedHat's version (which has other
patches)?
> Since these CODE messages are all the same, you said it's probably software
> related. Is there any way
> to track what software is conflicting? I can't run memtest86 because I
> cannot physically access the machine. I don't think it's faulty ram though.
Interesting data. I am not a kernel hacker, only a sysadmin who has lived
through this stuff several times. My ASM is 15 years rusty and for a
different platform, so I am not much help there.
Based on my own past experience, if you are getting OOPSes in exactly the
same routine every time, it is likely to be software, unless that routine
deals with hardware and is OOPSing because of a hardware problem (a bad
SCSI controller, for example).
Since the reboot, have the OOPSes gone away, or are they still happening?
Do you have a list of all of the daemons you were getting an OOPS in? Can
it be reproduced?
Of course, I am also assuming that you are not overclocking the BP6. If
you are, all bets are off.
This doesn't seem like a BP6-specific issue -- those usually just lock the
machine up hard without even a kernel oops first (at least that's what mine
does every 3-4 weeks).
I'd suggest you post a summary on linux-kernel, using the OOPS report
format found in Appendix B of:
http://www.tux.org/lkml/old-faq.txt
You probably want to read the linux-kernel mailing list FAQ at
http://www.tux.org/lkml/
and in particular the section of how to capture and report an OOPS
http://www.tux.org/lkml/#s4-2
I try to make it a matter of course to set up a serial console on all
"production" linux boxes I maintain now, with logging of all output to a
really stable separate machine (like a 486 with a 2.0.x kernel -- a good
use for old laptops, actually.) Just in case of a kernel oops... a lot
easier than getting a panicked on-site person to copy down an oops from the
screen by hand.
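
For the logging side, here is a minimal sketch of what the separate machine
could run. It is only an illustration, not a description of my actual setup:
the function name log_console, the timestamp format, and the check for the
word "Oops" are all assumptions. It copies each line from a console stream to
a log with a UTC timestamp in front, and counts lines that look like an oops.

```python
import io
import time


def log_console(stream, log, clock=time.time):
    """Copy lines from a binary console stream to a text log.

    Each line gets a UTC timestamp prefix. Returns the number of lines
    containing "Oops" (an assumed marker; adjust it to match your kernel).
    """
    oops_count = 0
    # readline() returns b"" at end of stream, which ends the loop.
    for raw in iter(stream.readline, b""):
        line = raw.decode("ascii", errors="replace").rstrip("\r\n")
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(clock()))
        log.write(f"{stamp} {line}\n")
        log.flush()  # keep the log current in case the logger itself dies
        if "Oops" in line:
            oops_count += 1
    return oops_count
```

To use it against a real port you would pass something like
open("/dev/ttyS0", "rb", buffering=0) as the stream and an append-mode text
file as the log. The device name is an assumption, and the port speed and
line settings have to be configured separately (with stty, for example) to
match whatever the server's console is set to.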
Lastly, not to sound like a M$ person, but why not upgrade to the latest
stable kernel? 2.2.15 is out now - release notes at:
http://www.linux.org.uk/VERSION/relnotes.2215.html
Upgrading a remote box can be a little scary, but it can be
done. Especially helpful is having a working boot disk of the previous
kernel image (appropriately rdev'ed) and the ability for someone to insert
it onsite. I don't know if that is an option for your co-location place.
--
James Troutman, Troutman & Associates - telecommunications consulting
93 Main Street, Waterville, Maine 04901 - 207-861-7067
--
=- To unsubscribe, email [EMAIL PROTECTED] with the -=
=- body of "unsubscribe linux-abit". -=