Hey all,

I had a kernel panic on a RHEL4 box the other day, and since then I've been
digging into postmortem analysis.  Is anyone else doing this?  After reading
up on 'crash' and related tools, I've decided I'm going to be adding this to
my list of standard stuff I always do when setting up a new box.  This looks
like a good way to build kernel knowledge, and since I want to do embedded
Linux development as my master's project, that's something I need.

On RHEL4 there is netdump and diskdump.  The idea is that you want to
capture the contents of your RAM so you can debug the kernel with GDB.  The
netdump docs say it's supposed to be the better of the two, since the kernel
can put the NIC in a mode that doesn't require it to respond to IRQs.  I
guess this is so that when the capture occurs, there are fewer things that
could stomp on your RAM.

What's supposed to happen is the netdump module fires up when the panic
occurs, authenticates (via ssh if you desire) against a server, then starts
feeding it UDP packets in cleartext on port 6666.  The dump should then show
up in /var/crash on the server.
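One way to check whether UDP traffic can reach port 6666 at all is to run
a throwaway listener on the server side and send it something from the
client.  This is only a minimal sketch: it does not speak the netdump
protocol, the function names are made up for illustration, and it assumes
the real netdump server is stopped so the port is free.

```python
import socket


def open_udp_listener(host="0.0.0.0", port=6666):
    """Bind a plain UDP socket (6666 is the netdump port from the post)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def wait_for_datagram(sock, timeout=30.0, bufsize=65535):
    """Return (payload, sender) for the first datagram, or None on timeout."""
    sock.settimeout(timeout)
    try:
        return sock.recvfrom(bufsize)
    except socket.timeout:
        # Nothing arrived: a firewall may be dropping the packets.
        return None
```

If wait_for_datagram(open_udp_listener()) returns None while a client is
sending to the port, a firewall rule in between is a likely suspect.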

But for me it didn't work.  The docs don't tell you to make sure UDP port
6666 is open in iptables, though it's pretty obvious that it'll be closed
out of the box on pretty much any modern distro.  They also say you need to
be using a supported NIC, which I'm not.

Diskdump worked fine, though.  All I had to do was specify I wanted to dump
to the swap partition and then start the tool.

Testing this setup, I had a great time crashing the box with the Magic SysRq
Key: Alt-SysRq-C.  First time in 10 years I've wanted to crash a box on
purpose!
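For boxes without a keyboard attached, the same request can usually be made
through /proc.  The sketch below assumes the kernel exposes
/proc/sys/kernel/sysrq and /proc/sysrq-trigger and that writing 'c' there
behaves like Alt-SysRq-C; the function name is just for illustration.  It
needs root, and the machine goes down as soon as the second write lands.

```python
def trigger_sysrq_crash(enable_path="/proc/sys/kernel/sysrq",
                        trigger_path="/proc/sysrq-trigger"):
    """Enable SysRq, then ask the kernel to crash (the 'c' key)."""
    # Writing 1 turns on the SysRq functions.
    with open(enable_path, "w") as f:
        f.write("1\n")
    # Assumption: 'c' here is equivalent to pressing Alt-SysRq-C.
    with open(trigger_path, "w") as f:
        f.write("c\n")
```

The paths are parameters so the function can be pointed at scratch files
for a dry run before trying it for real.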

Then on bootup I was pleased to see the raw crash dump file being copied
back out of the swap partition and into /var/crash.

Pass the /var/crash/.../vmcore to 'crash' along with the kernel you built
with debug info, or the one from the kernel-debuginfo rpm, and you're off.
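A small helper can save some typing here.  This is a rough sketch with
made-up function names: it walks a crash directory looking for files named
vmcore and builds the 'crash' command line from a vmlinux path that you
supply yourself (the location of the debug kernel is not assumed).

```python
import os


def find_vmcores(crash_dir="/var/crash"):
    """Return every file named 'vmcore' under crash_dir, oldest first."""
    found = []
    for root, _dirs, files in os.walk(crash_dir):
        if "vmcore" in files:
            found.append(os.path.join(root, "vmcore"))
    return sorted(found, key=os.path.getmtime)


def crash_command(vmlinux, vmcore):
    """Build the argument list: crash <debug kernel> <dump file>."""
    return ["crash", vmlinux, vmcore]
```

The list returned by crash_command() can be handed to subprocess.call() to
start an interactive session on the newest dump, i.e. find_vmcores()[-1].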

Now I can run 'ps', 'fuser', look at the task_struct, interrupts, pretty
much anything!  I've got 4 books on the kernel, and now I'm finally getting
some test data.

I was wondering if anyone on this list is already doing this regularly and
wants to provide insight.  Or if anyone has a system that crashes, I
volunteer to look into it (contact me directly at dave _AT_
peoplemerge.com).

Dave

--
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-list