I am using netdump for this purpose, but I don't always get a complete core image on a crash. Tweaking the wait time before reboot doesn't seem to give the full core image time to transfer, so YMMV. I also don't think the pre-built kernels include any debugging symbols, so you'd have to compile a debugging kernel in order to dissect crashes.

hth,
Klaus
________________________________
From: [EMAIL PROTECTED] on behalf of Johann Lombardi
Sent: Mon 11/26/2007 7:37 AM
To: Somsak Sriprayoonsakul
Cc: [email protected]
Subject: Re: [Lustre-discuss] Node randomly panic

On Mon, Nov 26, 2007 at 08:49:56PM +0700, Somsak Sriprayoonsakul wrote:
> Could you tell me how to dump the whole crash log to file? It's not
> appear in /var/log/messages. I only seen it once actually. That's why I
> don't know the function name :) But the whole screen are something
> related to lustre for sure.

You should set up serial consoles (or netconsole). A crash dump utility
(netdump, LKCD, ...) is also very useful.

> Note that, the dump log is longer than a screen size, so taking photo
> wouldn't help ( I think ).

If /proc/sys/kernel/panic_on_oops is set to 1 on the OSS, you could try to set
it to 0 and to log onto the node to get the stack trace via dmesg before
rebooting it.

Johann

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
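For anyone finding this thread later, Johann's panic_on_oops suggestion might be scripted roughly as below. This is only a sketch: the helper names (read_flag, disable_panic_on_oops) are made up for illustration, it assumes root on the OSS, and the change does not survive a reboot.

```python
from pathlib import Path

# Path named in the message above; everything else here is illustrative.
PANIC_ON_OOPS = Path("/proc/sys/kernel/panic_on_oops")


def read_flag(path=PANIC_ON_OOPS):
    """Return the current integer value stored in the sysctl file."""
    return int(Path(path).read_text().strip())


def disable_panic_on_oops(path=PANIC_ON_OOPS):
    """Set the flag to 0 if it is non-zero and return the previous value.

    With the flag at 0 an oops no longer forces a panic, so the node may
    stay up long enough to log in and read the stack trace with dmesg.
    """
    previous = read_flag(path)
    if previous != 0:
        # Writing under /proc/sys requires root.
        Path(path).write_text("0\n")
    return previous


if __name__ == "__main__":
    old = disable_panic_on_oops()
    print(f"panic_on_oops was {old}, now {read_flag()}")
```

After the next oops, log onto the node and save the dmesg output to a file before rebooting.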
