Hi Derek,

I got this data using ipmitool from the server's BMC just after a crash this 
afternoon (about 3 minutes after the reboot).

I will be heading to the NOC this afternoon to copy the hard drive to another 
machine I have been using for about a year and a half.

Anyway, here is the sensor data:

Temp             | 38 degrees C      | ok
Temp             | 50 degrees C      | ok
Ambient Temp     | 30 degrees C      | ok
Planar Temp      | 35 degrees C      | ok
Riser Temp       | 34 degrees C      | ok
Temp             | 40 degrees C      | ok
Temp             | 40 degrees C      | ok
CMOS Battery     | 3.15 Volts        | ok
ROMB Battery     | Not Readable      | ns
VCORE            | 0x01              | ok
VCORE            | Not Readable      | ns
PROC VTT         | 0x01              | ok
1.5V PG          | 0x01              | ok
1.8V PG          | 0x01              | ok
3.3V PG          | 0x01              | ok
5V PG            | 0x01              | ok
5V Riser PG      | 0x01              | ok
Riser PG         | 0x01              | ok
PFault Fail Safe | Not Readable      | ns
Presence         | 0x01              | ok
Presence         | 0x02              | ok
Presence         | 0x01              | ok
Presence         | 0x02              | ok
ROMB Presence    | 0x02              | ok
FAN 1A RPM       | 9600 RPM          | ok
FAN 1B RPM       | 6900 RPM          | ok
FAN 2A RPM       | 9900 RPM          | ok
FAN 2B RPM       | 6825 RPM          | ok
FAN 3A RPM       | 9825 RPM          | ok
FAN 3B RPM       | 6825 RPM          | ok
FAN 4A RPM       | 10200 RPM         | ok
FAN 4B RPM       | 6675 RPM          | ok
Status           | 0x80              | ok
Status           | Not Readable      | ns
Status           | 0x01              | ok
Status           | Not Readable      | ns
VRM              | 0x01              | ok
VRM              | 0x01              | ok
OS Watchdog      | 0x00              | ok
SEL              | Not Readable      | ns
Intrusion        | 0x00              | ok
PS Redundancy    | Not Readable      | ns
Fan Redundancy   | 0x01              | ok
SCSI Connector A | Not Readable      | ns
Drive            | 0xc0              | ok
ECC Corr Err     | 0xc0              | ok
ECC Uncorr Err   | Not Readable      | ns
I/O Channel Chk  | 0xc0              | ok
PCI Parity Err   | 0xc0              | ok
PCI System Err   | 0xc0              | ok
SBE Log Disabled | Not Readable      | ns
Logging Disabled | Not Readable      | ns
Unknown          | Not Readable      | ns
PROC Protocol    | Not Readable      | ns
PROC Bus PERR    | Not Readable      | ns
PROC Init Err    | Not Readable      | ns
PROC Machine Chk | Not Readable      | ns
Memory Spared    | Not Readable      | ns
Memory Mirrored  | 0x01              | ok
Memory RAID      | Not Readable      | ns
Memory Added     | 0x01              | ok
Memory Removed   | 0x01              | ok
PCIE Fatal Err   | 0x01              | ok
Chipset Err      | 0x01              | ok
Err Reg Pointer  | 0x01              | ok
root on s1#
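
For anyone who would rather scan a listing like this than read it by eye, here is a minimal Python sketch. It assumes the three-column "name | reading | status" layout shown above; the function names are made up for illustration, and the sample rows are copied from the listing. It reports the hottest temperature sensor and the sensors whose status column is "ns" (the ones shown as "Not Readable").

```python
def parse_sensor_lines(text):
    """Split 'name | reading | status' rows into tuples; skip anything else."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            rows.append(tuple(parts))
    return rows


def max_temperature(rows):
    """Return (degrees C, sensor name) for the hottest sensor, or None."""
    temps = []
    for name, reading, _status in rows:
        if reading.endswith("degrees C"):
            temps.append((float(reading.split()[0]), name))
    return max(temps) if temps else None


def unreadable(rows):
    """Names of sensors whose status column is 'ns'."""
    return [name for name, _reading, status in rows if status == "ns"]


# A few rows copied from the listing above.
SAMPLE = """\
Temp             | 38 degrees C      | ok
Temp             | 50 degrees C      | ok
Ambient Temp     | 30 degrees C      | ok
ROMB Battery     | Not Readable      | ns
FAN 1A RPM       | 9600 RPM          | ok
"""

if __name__ == "__main__":
    rows = parse_sensor_lines(SAMPLE)
    print("hottest:", max_temperature(rows))
    print("unreadable:", unreadable(rows))
```

Several sensors share the name "Temp", so the name alone does not say which probe is the hot one; the position in the listing would have to be kept as well if that matters.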
  ----- Original Message ----- 
  From: Derek Ragona 
  To: Grant Peel ; freebsd-questions@freebsd.org 
  Sent: Thursday, March 16, 2006 5:45 PM
  Subject: Re: More Server Crash Saga


  Grant,

  That is a one-unit (1U) rack mount server, which makes it prone to heat 
problems, particularly under any load.  You might want to check the ambient 
temperature and the internal temperature sensors as well.

  That server uses an Intel chipset (and probably an Intel motherboard) which 
should allow "out-of-band" monitoring.  You should see what you can use to 
monitor the system and see what the system is reporting prior to a lockup.

  It may be time to just call Dell and have them send a replacement MB or 
entire unit.

          -Derek


  At 03:47 PM 3/16/2006, Grant Peel wrote:

    Hi all,

    Still getting crashes today ... FreeBSD 6.0 on a PE 1850

    Does the output of vmstat -i over about five seconds show a problem? An 
interrupt storm?

    I have been searching, trying to find out what the 'rate' column means 
and what it should be.

    interrupt                          total       rate
    irq0: clk                        3277223        999
    irq5: em1                           8877          2
    irq6: ehci0 atapci0                   85          0
    irq7: mpt0 uhci2                   56401         17
    irq8: rtc                         419429        127
    irq11: em0 uhci0                   85684         26
    irq13: npx0                            1          0
    irq14: ata0                           48          0
    Total                            3847748       1173
    root on s1# vmstat -i
    interrupt                          total       rate
    irq0: clk                        3278793        999
    irq5: em1                           8883          2
    irq6: ehci0 atapci0                   85          0
    irq7: mpt0 uhci2                   56408         17
    irq8: rtc                         419630        127
    irq11: em0 uhci0                   85752         26
    irq13: npx0                            1          0
    irq14: ata0                           48          0
    Total                            3849600       1174
    root on s1# vmstat -i
    interrupt                          total       rate
    irq0: clk                        3280691        999
    irq5: em1                           8889          2
    irq6: ehci0 atapci0                   85          0
    irq7: mpt0 uhci2                   56408         17
    irq8: rtc                         419873        127
    irq11: em0 uhci0                   85843         26
    irq13: npx0                            1          0
    irq14: ata0                           48          0
    Total                            3851838       1173
    root on s1# vmstat -i
    interrupt                          total       rate
    irq0: clk                        3282850        999
    irq5: em1                           8891          2
    irq6: ehci0 atapci0                   85          0
    irq7: mpt0 uhci2                   56408         17
    irq8: rtc                         420149        127
    irq11: em0 uhci0                   86153         26
    irq13: npx0                            1          0
    irq14: ata0                           48          0
    Total                            3854585       1174 
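
On the 'rate' question: as far as I understand FreeBSD's vmstat, the rate column is the total count divided by seconds of uptime, so it is an average since boot and a short burst would barely move it. The differences between successive snapshots say more about what is happening right now. Below is a minimal Python sketch of that idea. The function names are hypothetical, and it assumes the clock interrupt (irq0: clk) fires 1000 times per second, which the reported rate of 999 suggests but does not prove. The two samples are the first and last snapshots above.

```python
def parse_vmstat_i(text):
    """Return {interrupt name: total count} from `vmstat -i` output."""
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows end with two integers: the total and the rate.
        if len(fields) >= 3 and fields[-1].isdigit() and fields[-2].isdigit():
            totals[" ".join(fields[:-2])] = int(fields[-2])
    return totals


def interval_rates(before, after, clock="irq0: clk", hz=1000):
    """Interrupts per second between two snapshots.

    Elapsed time is estimated from the clock interrupt count, assuming
    (not verified) that it fires `hz` times per second.
    """
    elapsed = (after[clock] - before[clock]) / hz
    if elapsed <= 0:
        raise ValueError("snapshots are not in chronological order")
    return {name: (after[name] - before[name]) / elapsed
            for name in after if name in before}


FIRST = """\
interrupt                          total       rate
irq0: clk                        3277223        999
irq5: em1                           8877          2
irq7: mpt0 uhci2                   56401         17
irq8: rtc                         419429        127
irq11: em0 uhci0                   85684         26
"""

LAST = """\
interrupt                          total       rate
irq0: clk                        3282850        999
irq5: em1                           8891          2
irq7: mpt0 uhci2                   56408         17
irq8: rtc                         420149        127
irq11: em0 uhci0                   86153         26
"""

if __name__ == "__main__":
    rates = interval_rates(parse_vmstat_i(FIRST), parse_vmstat_i(LAST))
    for name, per_second in sorted(rates.items()):
        print(f"{name:20s} {per_second:10.1f}/s")
```

A device whose per-interval figure is far above its since-boot rate would be the one to look at; whether a given figure counts as a storm depends on the device and the load.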

    _______________________________________________
    freebsd-questions@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-questions
    To unsubscribe, send any mail to "[EMAIL PROTECTED]"