On Tue, Aug 12, 2003 at 03:52:34PM -0400, Nick Fisher wrote:
> I have a machine that I cannot compile a stable 2.4.20 kernel for, yet the
> one off of the 1.4_rc2 liveCD works fine. I'm guessing there is an option
> or a patch that is/isn't set/applied. Apart from good old trial and error
> how the heck do I work out what is giving me the problem?

I've often found the NMI watchdog timer extremely helpful with
unexplained kernel lockups. Documentation is in
/usr/src/linux/Documentation/nmi_watchdog.txt. Basically, you append
"nmi_watchdog=1" to the kernel command line in LILO or GRUB - in a few
rare cases, you need a value other than 1. When the kernel locks up, the
watchdog detects it and dumps a traceback to the console. I've been able
to correlate the addresses in that traceback with symbols in /proc/ksyms
and identify malfunctioning drivers.
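
If you'd rather not do the address-to-symbol lookup by hand, here is a
rough Python sketch. It assumes each symbol line starts with a hex
address followed by the symbol name, which is how I remember
/proc/ksyms looking on 2.4; the function names load_symbols and resolve
are my own invention, not part of any kernel tool.

```python
import bisect


def load_symbols(lines):
    """Parse 'address name [module]' lines into a sorted list of (addr, name).

    The line layout is an assumption about /proc/ksyms; lines that do not
    start with a hex address are skipped.
    """
    symbols = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            addr = int(fields[0], 16)
        except ValueError:
            continue
        symbols.append((addr, fields[1]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the nearest symbol at or below addr.

    Returns None when addr lies below every known symbol.
    """
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base
```

You would call it as load_symbols(open("/proc/ksyms")) and then
resolve(table, int("c01072ab", 16)), where the address is whatever you
copied from the dump. The nearest symbol below an address is only a
hint: /proc/ksyms lists exported symbols only, so the real function may
be a static one that sits in between.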

It works best if you can set up a remote console over the serial port.
If you can't, don't run X, or you won't see the dump when it happens.
Then grab a pen and start writing down addresses :-).
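
If you do get the serial console working and capture the dump to a file,
a small sketch like the following can pull the addresses out for you.
It assumes the trace prints each address as eight hex digits inside
"[<...>]" brackets on a 32-bit machine; check that against your own
dump, since the exact layout is my assumption. The name
extract_addresses is made up for the example.

```python
import re

# Assumed trace format: addresses printed as [<c0123456>] (8 hex digits).
ADDR_RE = re.compile(r"\[<([0-9a-fA-F]{8})>\]")


def extract_addresses(text):
    """Return the addresses found in text, in order, without duplicates."""
    seen = set()
    result = []
    for match in ADDR_RE.finditer(text):
        addr = int(match.group(1), 16)
        if addr not in seen:
            seen.add(addr)
            result.append(addr)
    return result
```

Feed it the contents of the captured log, e.g.
extract_addresses(open("console.log").read()), where console.log is a
hypothetical file name for whatever your terminal program saved.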

Nathan Meyers
[EMAIL PROTECTED]

> Every kernel I have compiled for this one machine has failed. My general
> test is to recompile the kernel multiple times (as recommended by
> DRobbins). One of my kernels once made it through three compiles. If I
> start from the liveCD and chroot into my gentoo install and compile, it
> goes for days. So (unless I have missed something) the problem appears to
> be the kernel. I did think it was hardware for some time but after a few
> days of CPUburn and Memtest86 I have basically discounted that. If it were
> the hardware I couldn't continuously compile the kernel for over a day
> chrooted from the liveCD.
> 
> So from what I can tell there is something *wrong* with the way I'm
> compiling this kernel or the options I'm setting. When the machine crashes
> it just stops. No errors, no nothing. It just stops. I have scoured the
> logs for any kernel panics or segfaults and have found nothing. I even
> remembered to stop caching in Metalog. As an experiment I tried using the
> config from the new and confusing (I find) genkernel. Same result.
> 
> The machine is based on a SuperMicro P6DBU (Rev 1.1), BIOS r3.1 (latest).
> It has an Adaptec 29160 (SCSI wide 160 card, BIOS 2.57.2) that I have been
> compiling in the aic7xxx driver for and a tulip-driven network card. It
> has 1.5GB of RAM and 2xPIII 500 (Katmai).
> 
> I'm basically in a rut now. I'm trying any wacky kernel configuration that I
> can think of with no real plan. I have in the past managed to build stable
> kernels for various machines but I just can't get a handle on this one.
> If anyone has any ideas, troubleshooting tips or dumb ideas... I would
> love to hear them.
> 
>   Nick
> 
> --
> [EMAIL PROTECTED] mailing list
