Hi Derek, I got this data from the server's BMC using ipmitool just after a crash this afternoon (about 3 minutes after the reboot).
I will be heading to the NOC this afternoon to copy the hard drive to another machine I have been using for about a year and a half. Anyway, here is the sensor data:

Temp             | 38 degrees C | ok
Temp             | 50 degrees C | ok
Ambient Temp     | 30 degrees C | ok
Planar Temp      | 35 degrees C | ok
Riser Temp       | 34 degrees C | ok
Temp             | 40 degrees C | ok
Temp             | 40 degrees C | ok
CMOS Battery     | 3.15 Volts   | ok
ROMB Battery     | Not Readable | ns
VCORE            | 0x01         | ok
VCORE            | Not Readable | ns
PROC VTT         | 0x01         | ok
1.5V PG          | 0x01         | ok
1.8V PG          | 0x01         | ok
3.3V PG          | 0x01         | ok
5V PG            | 0x01         | ok
5V Riser PG      | 0x01         | ok
Riser PG         | 0x01         | ok
PFault Fail Safe | Not Readable | ns
Presence         | 0x01         | ok
Presence         | 0x02         | ok
Presence         | 0x01         | ok
Presence         | 0x02         | ok
ROMB Presence    | 0x02         | ok
FAN 1A RPM       | 9600 RPM     | ok
FAN 1B RPM       | 6900 RPM     | ok
FAN 2A RPM       | 9900 RPM     | ok
FAN 2B RPM       | 6825 RPM     | ok
FAN 3A RPM       | 9825 RPM     | ok
FAN 3B RPM       | 6825 RPM     | ok
FAN 4A RPM       | 10200 RPM    | ok
FAN 4B RPM       | 6675 RPM     | ok
Status           | 0x80         | ok
Status           | Not Readable | ns
Status           | 0x01         | ok
Status           | Not Readable | ns
VRM              | 0x01         | ok
VRM              | 0x01         | ok
OS Watchdog      | 0x00         | ok
SEL              | Not Readable | ns
Intrusion        | 0x00         | ok
PS Redundancy    | Not Readable | ns
Fan Redundancy   | 0x01         | ok
SCSI Connector A | Not Readable | ns
Drive            | 0xc0         | ok
ECC Corr Err     | 0xc0         | ok
ECC Uncorr Err   | Not Readable | ns
I/O Channel Chk  | 0xc0         | ok
PCI Parity Err   | 0xc0         | ok
PCI System Err   | 0xc0         | ok
SBE Log Disabled | Not Readable | ns
Logging Disabled | Not Readable | ns
Unknown          | Not Readable | ns
PROC Protocol    | Not Readable | ns
PROC Bus PERR    | Not Readable | ns
PROC Init Err    | Not Readable | ns
PROC Machine Chk | Not Readable | ns
Memory Spared    | Not Readable | ns
Memory Mirrored  | 0x01         | ok
Memory RAID      | Not Readable | ns
Memory Added     | 0x01         | ok
Memory Removed   | 0x01         | ok
PCIE Fatal Err   | 0x01         | ok
Chipset Err      | 0x01         | ok
Err Reg Pointer  | 0x01         | ok
root on s1#

----- Original Message -----
From: Derek Ragona
To: Grant Peel ;
freebsd-questions@freebsd.org
Sent: Thursday, March 16, 2006 5:45 PM
Subject: Re: More Server Crash Saga

Grant,

That is a one-unit rack-mount server, which makes it prone to heat problems, particularly under any load. You might want to check the ambient heat and the internal heat sensors as well. That server uses an Intel chipset (and probably an Intel motherboard), which should allow "out-of-band" monitoring. You should see what you can use to monitor the system and see what the system is reporting prior to a lockup.

It may be time to just call Dell and have them send a replacement MB or entire unit.

-Derek

At 03:47 PM 3/16/2006, Grant Peel wrote:

Hi all,

Still getting crashes today ...

FreeBSD 6.0
PE 1850

Does the output of vmstat -i over five seconds show a problem? An interrupt storm? I have been searching, trying to find out what the 'rate' means and what it should be.

interrupt                   total   rate
irq0: clk                 3277223    999
irq5: em1                    8877      2
irq6: ehci0 atapci0            85      0
irq7: mpt0 uhci2            56401     17
irq8: rtc                  419429    127
irq11: em0 uhci0            85684     26
irq13: npx0                     1      0
irq14: ata0                    48      0
Total                     3847748   1173

root on s1# vmstat -i
interrupt                   total   rate
irq0: clk                 3278793    999
irq5: em1                    8883      2
irq6: ehci0 atapci0            85      0
irq7: mpt0 uhci2            56408     17
irq8: rtc                  419630    127
irq11: em0 uhci0            85752     26
irq13: npx0                     1      0
irq14: ata0                    48      0
Total                     3849600   1174

root on s1# vmstat -i
interrupt                   total   rate
irq0: clk                 3280691    999
irq5: em1                    8889      2
irq6: ehci0 atapci0            85      0
irq7: mpt0 uhci2            56408     17
irq8: rtc                  419873    127
irq11: em0 uhci0            85843     26
irq13: npx0                     1      0
irq14: ata0                    48      0
Total                     3851838   1173

root on s1# vmstat -i
interrupt                   total   rate
irq0: clk                 3282850    999
irq5: em1                    8891      2
irq6: ehci0 atapci0            85      0
irq7: mpt0 uhci2            56408     17
irq8: rtc                  420149    127
irq11: em0 uhci0            86153     26
irq13: npx0                     1      0
irq14: ata0                    48      0
Total                     3854585   1174

_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
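On the 'rate' question in the quoted message: the numbers posted are consistent with 'rate' being the total count divided by uptime in seconds (3277223 / 999 is roughly 3280 s, and 419429 / 3280 is about 127), so it looks like an average since boot rather than a current rate. To see what is happening right now, it is more useful to compare two samples. The Python sketch below does that with the first and last vmstat -i samples from this thread. It makes two assumptions that are not stated anywhere in the thread: that the clock interrupt (irq0: clk) fires at 1000 Hz, which is only inferred from its reported rate of 999, and that it is therefore safe to use the clk delta as a stopwatch because the samples were not timestamped. The function names are made up for illustration.

```python
FIRST_SAMPLE = """
interrupt total rate
irq0: clk 3277223 999
irq5: em1 8877 2
irq6: ehci0 atapci0 85 0
irq7: mpt0 uhci2 56401 17
irq8: rtc 419429 127
irq11: em0 uhci0 85684 26
irq13: npx0 1 0
irq14: ata0 48 0
Total 3847748 1173
"""

LAST_SAMPLE = """
interrupt total rate
irq0: clk 3282850 999
irq5: em1 8891 2
irq6: ehci0 atapci0 85 0
irq7: mpt0 uhci2 56408 17
irq8: rtc 420149 127
irq11: em0 uhci0 86153 26
irq13: npx0 1 0
irq14: ata0 48 0
Total 3854585 1174
"""


def parse_vmstat_i(text):
    """Return {source name: total count} from `vmstat -i` style output."""
    totals = {}
    for line in text.strip().splitlines():
        fields = line.split()
        # Data rows end in two integers (total, rate); the header row does not.
        if len(fields) < 3 or not (fields[-2].isdigit() and fields[-1].isdigit()):
            continue
        name = " ".join(fields[:-2])
        totals[name] = int(fields[-2])
    return totals


def per_second_rates(before, after, clock="irq0: clk", hz=1000):
    """Interrupts per second between two samples.

    Elapsed time is estimated from the clock interrupt delta, ASSUMING the
    clock ticks at `hz` per second (an assumption, not a measured value).
    """
    elapsed = (after[clock] - before[clock]) / hz
    return {
        name: (after[name] - before[name]) / elapsed
        for name in after
        if name in before
    }


first = parse_vmstat_i(FIRST_SAMPLE)
last = parse_vmstat_i(LAST_SAMPLE)
rates = per_second_rates(first, last)

for name, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{name:22s} {rate:10.1f} /s")
```

With timestamps taken alongside each vmstat -i run, the clock-based estimate could be replaced by the real elapsed time. Either way, a single source whose per-second figure is far above the others would be the thing to look for.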