Re: [Freeipmi-devel] ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error
Hey David,

On another note. There are a lot of Unknown sensors listed below. Typically Unknown is shown b/c there are sensor reading errors on the remote motherboard. But the below is just way too many to look reasonable. Could you also send the --debug output of ipmi-sensors to me?

Al

On Tue, 2008-08-26 at 11:43 -0700, Al Chu wrote:
> Hey David,
>
> > scanner2 ~ # ipmimonitoring
> > ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error
>
> This is odd. Could you run ipmimonitoring with --debug and let me know the output? I suppose I could have some weird corner case bug in ipmimonitoring that doesn't exist in ipmi-sensors.
>
> Al
>
> On Tue, 2008-08-26 at 11:31 -0700, David Sparks wrote:
> > Hi all,
> >
> > Servers: Dell 1850, 1950.
> >
> > I'm trying to get ipmimonitoring running and have hit a snag. The manpage suggests that ipmimonitoring provides different but similar output to ipmi-sensors. However ipmi-sensors works and ipmimonitoring doesn't.
> >
> > I also noticed I always get an error running ipmiutil getevt or bmchealth.
> >
> > I don't see any Dell specific workarounds on the manpage (or the error either). I'm using this for reference:
> >
> > http://www.gnu.org/software/freeipmi/manpages/man8/ipmimonitoring.8.html
> >
> > What else should I check?
> >
> > Thanks!
> >
> > scanner2 ~ # ipmimonitoring
> > ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error
> >
> > scanner2 ~ # ipmiutil getevt
> > ipmiutil ver 2.11
> > getevent ver 2.11
> > -- BMC version 1.7, IPMI version 1.5
> > event receiver sa = 20 lun = 00
> > bmc enables = 08
> > Event Message Buffers not enabled.
> > set_bmc_enables error 0xcc
> > getevent: Invalid data field in request
> >
> > scanner2 ~ # bmchealth
> > ipmiutil ver 2.11
> > bmchealth ver 2.11
> > BMC version 1.7, IPMI version 1.5
> > BMC manufacturer = 0002a2 (Dell), product =
> > Chassis Status = 01 (on, restore_policy=stay_off)
> > Power State = 00 (S0: working)
> > Selftest status = 0055 (OK)
> > get_chan_auth error: ret = cc
> > bmchealth: Invalid data field in request
> >
> > scanner2 ~ # ipmi-sensors
> > 1: Temp (Temperature): 50.00 C (NA/90.00): [OK]
> > 2: Temp (Temperature): 47.00 C (NA/90.00): [OK]
> > 3: Ambient Temp (Temperature): 16.00 C (5.00/47.00): [OK]
> > 4: Planar Temp (Temperature): 37.00 C (5.00/72.00): [OK]
> > 5: Riser Temp (Temperature): 40.00 C (5.00/67.00): [OK]
> > 6: Temp (Temperature): 40.00 C (NA/NA): [OK]
> > 7: Temp (Temperature): 40.00 C (NA/NA): [OK]
> > 8: CMOS Battery (Voltage): 3.15 V (2.64/NA): [OK]
> > 9: ROMB Battery (Voltage): [State Deasserted]
> > 10: VCORE (Voltage): [State Deasserted]
> > 11: VCORE (Voltage): [State Deasserted]
> > 12: PROC VTT (Voltage): [Unknown]
> > 13: 1.5V PG (Voltage): [Unknown]
> > 14: 1.8V PG (Voltage): [Unknown]
> > 15: 3.3V PG (Voltage): [Unknown]
> > 16: 5V PG (Voltage): [Unknown]
> > 17: 5V Riser PG (Voltage): [Unknown]
> > 18: Riser PG (Voltage): [Unknown]
> > 19: Presence (Entity Presence): [Entity Present]
> > 20: Presence (Entity Presence): [Entity Present]
> > 21: Presence (Entity Presence): [Entity Present]
> > 22: Presence (Entity Presence): [Entity Present]
> > 23: ROMB Presence (Entity Presence): [Entity Present]
> > 24: FAN 1A RPM (Fan): 7425.00 RPM (2175.00/NA): [OK]
> > 25: FAN 1B RPM (Fan): 5700.00 RPM (2175.00/NA): [OK]
> > 26: FAN 2A RPM (Fan): 7725.00 RPM (2175.00/NA): [OK]
> > 27: FAN 2B RPM (Fan): 5175.00 RPM (2175.00/NA): [OK]
> > 28: FAN 3A RPM (Fan): 7725.00 RPM (2175.00/NA): [OK]
> > 29: FAN 3B RPM (Fan): 5325.00 RPM (2175.00/NA): [OK]
> > 30: FAN 4A RPM (Fan): 8025.00 RPM (2175.00/NA): [OK]
> > 31: FAN 4B RPM (Fan): 5475.00 RPM (2175.00/NA): [OK]
> > 32: Status (Processor): [Processor Presence detected]
> > 33: Status (Processor): [Processor Presence detected]
> > 34: Status (Power Supply): [Presence detected]
> > 35: Status (Power Supply): [Presence detected]
> > 36: VRM (Power Supply): [Presence detected]
> > 37: VRM (Power Supply): [Presence detected]
> > 38: OS Watchdog (Watchdog 2): [OK]
> > 39: SEL (Event Logging Disabled): [Unknown]
> > 40: Intrusion (Platform Chassis Intrusion): [OK]
> > 41: PS Redundancy (Power Supply): [Unknown]
> > 42: Fan Redundancy (Fan): [Unknown]
> > 55: SCSI Connector (Cable Interconnect): [Unknown]
> > 56: Drive (Slot Connector): [Unknown]
> > 57: ECC Corr Err (Memory): [Unknown]
> > 58: ECC Uncorr Err (Memory): [Unknown]
> > 59: I/O Channel Chk (Critical Interrupt): [Unknown]
> > 60: PCI Parity Err (Critical Interrupt): [Unknown]
> > 61: PCI System Err (Critical Interrupt): [Unknown]
> > 62: SBE Log Disable (Event Logging Disabled): [Unknown]
> > 63: Logging Disable (Event Logging Disabled): [Unknown]
> > 64: Unknown (System Event): [Unknown]
> > 65: CPU Protocol Er (Processor): [Unknown]
> > 66: CPU Bus PERR (Processor): [Unknown]
> > 67: CPU Init Err (Processor): [Unknown]
> > 68: CPU Machine Chk (Processor): [Unknown]
> > 69: Memory Spared (Memory): [Unknown]
> > 70: Memory Mirrored (Memory): [Unknown]
> > 71: Memory RAID (Memory): [Unknown]
> > 72: Memory Added (Memory): [Unknown]
> > 73: Memory Removed (Memory): [Unknown]
> > 74: PCIE Fatal Err (Critical
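
To put a number on the "way too many Unknown" observation, the bracketed states in the ipmi-sensors listing can be tallied. This is only a rough Python sketch, assuming the older `N: Name (Group): ... [State]` line format shown in this thread; `tally_states` and `SAMPLE` are illustrative names, not part of FreeIPMI, and newer releases may print a different layout.

```python
import re
from collections import Counter

# Assumed line format (taken from the listing in this thread), e.g.
#   "12: PROC VTT (Voltage): [Unknown]"
#   "1: Temp (Temperature): 50.00 C (NA/90.00): [OK]"
LINE_RE = re.compile(r"^\s*(\d+):\s+(.+?)\s+\(([^()]+)\):.*\[([^\]]*)\]\s*$")


def tally_states(text):
    """Count how often each bracketed state appears in ipmi-sensors output."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # lines that do not fit the assumed format are skipped
            counts[match.group(4)] += 1
    return counts


# A few lines copied from the listing above, used only as sample input.
SAMPLE = """\
8: CMOS Battery (Voltage): 3.15 V (2.64/NA): [OK]
9: ROMB Battery (Voltage): [State Deasserted]
12: PROC VTT (Voltage): [Unknown]
13: 1.5V PG (Voltage): [Unknown]
"""

if __name__ == "__main__":
    for state, count in tally_states(SAMPLE).most_common():
        print(f"{count:3d}  {state}")
```

Feeding it the full listing (for example, piping the command output into a file and reading that file) would show what share of the records come back as Unknown.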
Re: [Freeipmi-devel] ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error
Al Chu wrote:
> Hey David,
>
> Are you using the newest FreeIPMI available? 0.6.5 from the FreeIPMI homepage? The debug dump data you've given me seems pretty out of date. I've changed the formatting + the amount of information that gets dumped.
>
> > [ VALUE TAG NAME:LENGTH ]
> > [ 2Dh] = cmd[ 8b]
> > [ CBh] = comp_code[ 8b]
> > (ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 404): bad completion code: 0x51
> > ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error
>
> Another user w/ a Dell machine hit sensors that returned 0xCB == Requested Sensor, data, or record not present. This fix seems to be in the FreeIPMI 0.6.5 release.
>
> On another note, there are several sensors on your motherboard that ipmimonitoring currently does not interpret. I will add those into ipmimonitoring.

I've updated to 0.6.5 from 0.5.6. What is strange is the debug output changes from run to run:

# ipmimonitoring --debug
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-scanner2.localhost
= Get SDR Repository Info Request =
[ 20h] = cmd[ 8b]
= Get SDR Repository Info Response =
ipmi_sdr_cache_create: internal IPMI error

# ipmimonitoring --debug
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-scanner2.localhost
= Get SDR Repository Info Request =
[ 20h] = cmd[ 8b]
= Get SDR Repository Info Response =
[ 20h] = cmd[ 8b]
[ 0h] = comp_code[ 8b]
[ 1h] = sdr_version_major[ 4b]
[ 5h] = sdr_version_minor[ 4b]
[ 4Ch] = record_count[16b]
[ EE4h] = free_space[16b]
[h] = most_recent_addition_timestamp[32b]
[h] = most_recent_erase_timestamp[32b]
[ 0h] = get_sdr_repository_allocation_info_command_supported[ 1b]
[ 1h] = reserve_sdr_repository_command_supported[ 1b]
[ 0h] = partial_add_sdr_command_supported[ 1b]
[ 0h] = delete_sdr_command_supported[ 1b]
[ 0h] = reserved[ 1b]
[ 2h] = modal_non_modal_sdr_repository_update_operation_supported[ 2b]
[ 0h] = overflow_flag[ 1b]
= Reserve SDR Repository Request =
[ 22h] = cmd[ 8b]
= Reserve SDR Repository Response =
ipmi_sdr_cache_create: internal IPMI error

# ipmimonitoring --debug
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-scanner2.localhost
= Get SDR Repository Info Request =
[ 20h] = cmd[ 8b]
= Get SDR Repository Info Response =
[ 20h] = cmd[ 8b]
[ 0h] = comp_code[ 8b]
[ 1h] = sdr_version_major[ 4b]
[ 5h] = sdr_version_minor[ 4b]
[ 4Ch] = record_count[16b]
[ EE4h] = free_space[16b]
[h] = most_recent_addition_timestamp[32b]
[h] = most_recent_erase_timestamp[32b]
[ 0h] = get_sdr_repository_allocation_info_command_supported[ 1b]
[ 1h] = reserve_sdr_repository_command_supported[ 1b]
[ 0h] = partial_add_sdr_command_supported[ 1b]
[ 0h] = delete_sdr_command_supported[ 1b]
[ 0h] = reserved[ 1b]
[ 2h] = modal_non_modal_sdr_repository_update_operation_supported[ 2b]
[ 0h] = overflow_flag[ 1b]
= Reserve SDR Repository Request =
[ 22h] = cmd[ 8b]
= Reserve SDR Repository Response =
[ 22h] = cmd[ 8b]
[ 0h] = comp_code[ 8b]
[ 383h] = reservation_id[16b]
= Get SDR Request =
[ 23h] = cmd[ 8b]
[ 383h] =
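
When comparing runs like the ones above, it can help to pull every `comp_code` field out of a --debug dump and translate it. The following is a small Python sketch, assuming the `[ XXh] = comp_code[ 8b]` field layout seen in this thread. The table holds only a handful of generic IPMI completion codes (including the 0xCB and 0xCC values that show up in this thread), and the function names are made up for illustration.

```python
import re

# A few generic IPMI completion codes; deliberately not a complete table.
COMPLETION_CODES = {
    0x00: "Command completed normally",
    0xC3: "Timeout while processing command",
    0xCB: "Requested sensor, data, or record not present",
    0xCC: "Invalid data field in request",
}

# Assumed dump field layout, e.g. "[ 0h] = comp_code[ 8b]".
COMP_CODE_RE = re.compile(r"\[\s*([0-9A-Fa-f]+)h\]\s*=\s*comp_code\[")


def completion_codes(dump):
    """Return every comp_code value found in a debug dump, in order."""
    return [int(m.group(1), 16) for m in COMP_CODE_RE.finditer(dump)]


def describe(code):
    """Map a completion code to a short description, if it is in the table."""
    return COMPLETION_CODES.get(code, "not in this table (0x%02X)" % code)


if __name__ == "__main__":
    dump = "[ 22h] = cmd[ 8b] [ 0h] = comp_code[ 8b] [ 383h] = reservation_id[16b]"
    for code in completion_codes(dump):
        print("0x%02X: %s" % (code, describe(code)))
```

A response section with no fields at all, like the first run above, yields an empty list, which at least separates "the BMC answered with an error code" from "nothing came back".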
Re: [Freeipmi-devel] ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error
Hey David,

On Tue, 2008-08-26 at 17:36 -0700, David Sparks wrote:
> Al Chu wrote:
> > Hey David,
> >
> > Hmmm. What you're showing me is indicative of timeouts. As if the inband communication is too busy to respond or just isn't responding. Did you load the openipmi kernel driver? Or are you using the freeipmi KCS driver?
>
> If I understand how I'm doing things I'm using the in-kernel driver (via /dev/ipmi0). I don't know if the freeipmi-KCS driver works.

You may want to give it a shot. Just force it by specifying --driver-type=KCS on the command line. Otherwise, I'm sort of stumped. If the IPMI packets don't come back from /dev/ipmi0, I'm not sure what else can be done. I could maybe add a retry mechanism into the in-band communication code. Never thought I'd have to support retransmissions on in-band communication :P

> I tried the 0.5.6 LAN driver without success but now that I've upgraded to 0.6.5 the LAN driver works!

That's good.

Al

> # ipmimonitoring -h d-scanner1 -u root -p XXX
> 1 | Temp | Temperature | Nominal | C | 50.00
> 2 | Temp | Temperature | Nominal | C | 48.00
> 3 | Ambient Temp | Temperature | Nominal | C | 17.00
> 4 | Planar Temp | Temperature | Nominal | C | 37.00
> 5 | Riser Temp | Temperature | Nominal | C | 40.00
> 6 | Temp | Temperature | Nominal | C | 40.00
> 7 | Temp | Temperature | Nominal | C | 40.00
> 8 | CMOS Battery | Voltage | Nominal | V | 3.134700
> 19 | Presence | Entity Presence | Nominal | N/A | 'Entity Present'
> 20 | Presence | Entity Presence | Nominal | N/A | 'Entity Present'
> 21 | Presence | Entity Presence | Nominal | N/A | 'Entity Present'
> 22 | Presence | Entity Presence | Nominal | N/A | 'Entity Present'
> 23 | ROMB Presence | Entity Presence | Nominal | N/A | 'Entity Present'
> 24 | FAN 1A RPM | Fan | Nominal | RPM | 7350.00
> 25 | FAN 1B RPM | Fan | Nominal | RPM | 5700.00
> 26 | FAN 2A RPM | Fan | Nominal | RPM | 7800.00
> 27 | FAN 2B RPM | Fan | Nominal | RPM | 5175.00
> 28 | FAN 3A RPM | Fan | Nominal | RPM | 7575.00
> 29 | FAN 3B RPM | Fan | Nominal | RPM | 5250.00
> 30 | FAN 4A RPM | Fan | Nominal | RPM | 7875.00
> 31 | FAN 4B RPM | Fan | Nominal | RPM | 5400.00
> 32 | Status | Group Processor | Nominal | N/A | 'Processor Presence detected'
> 33 | Status | Group Processor | Nominal | N/A | 'Processor Presence detected'
> 34 | Status | Power Supply | Nominal | N/A | 'Presence detected'
> 35 | Status | Power Supply | Nominal | N/A | 'Presence detected'
> 36 | VRM | Power Supply | Nominal | N/A | 'Presence detected'
> 37 | VRM | Power Supply | Nominal | N/A | 'Presence detected'
> 38 | OS Watchdog | Watchdog2 | Nominal | N/A | ''
> 40 | Intrusion | Physical Security | Nominal | N/A | ''
> 56 | Drive | Slot Connector | Warning | N/A | 'Slot/Connector Device Removal Request'
> 57 | ECC Corr Err | Memory | Critical | N/A | 'Presence detected'
> 58 | ECC Uncorr Err | Memory | Critical | N/A | 'Presence detected'
> 59 | I/O Channel Chk | Critical Interrupt | Critical | N/A | 'EISA Fail Safe Timeout'
> 60 | PCI Parity Err | Critical Interrupt | Critical | N/A | 'EISA Fail Safe Timeout'
> 61 | PCI System Err | Critical Interrupt | Critical | N/A | 'EISA Fail Safe Timeout'
> 62 | SBE Log Disabled | Event Logging Disabled | Nominal | N/A | ''
> 63 | Logging Disabled | Event Logging Disabled | Nominal | N/A | ''
> 72 | Memory Added | Memory | Warning | N/A | 'Correctable ECC/other correctable memory error'
> 73 | Memory Removed | Memory | Warning | N/A | 'Correctable ECC/other correctable memory error'
> 74 | PCIE Fatal Err | Critical Interrupt | Critical | N/A | 'Front Panel NMI/Diagnostic Interrupt'
> 75 | Chipset Err | Critical Interrupt | Critical | N/A | 'Front Panel NMI/Diagnostic Interrupt'

-- 
Albert Chu
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
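
For anyone wanting to watch only the records that are not Nominal in output like the above, here is a minimal Python sketch. It assumes six pipe-separated columns per row, read here as record ID, sensor name, group, monitoring status, units, and reading; that interpretation is inferred from the listing in this thread rather than taken from the manpage, and `parse_rows` and `not_nominal` are hypothetical helper names.

```python
def parse_rows(text):
    """Split ipmimonitoring-style rows into dicts; skip lines that do not fit."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Assumed layout: exactly six columns, starting with a numeric record ID.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        record_id, name, group, status, units, reading = fields
        rows.append({
            "record_id": int(record_id),
            "name": name,
            "group": group,
            "status": status,
            "units": units,
            "reading": reading.strip("'"),  # event strings arrive single-quoted
        })
    return rows


def not_nominal(rows):
    """Return only the rows whose monitoring status is something other than Nominal."""
    return [row for row in rows if row["status"] != "Nominal"]


# Three rows copied from the listing above, used only as sample input.
SAMPLE = """\
8 | CMOS Battery | Voltage | Nominal | V | 3.134700
56 | Drive | Slot Connector | Warning | N/A | 'Slot/Connector Device Removal Request'
59 | I/O Channel Chk | Critical Interrupt | Critical | N/A | 'EISA Fail Safe Timeout'
"""

if __name__ == "__main__":
    for row in not_nominal(parse_rows(SAMPLE)):
        print(row["record_id"], row["name"], row["status"], row["reading"])
```

Given that several Critical rows above carry event strings that look unrelated to the sensor name, a filter like this is probably more useful for spotting which records need a closer look than for alerting.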