Re: [Freeipmi-devel] ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error

2008-08-26 Thread Al Chu
Hey David,

On another note.  There are a lot of Unknown sensors listed below.
Typically Unknown is shown b/c there are sensor reading errors on the
remote motherboard.  But the below is just way too many to look
reasonable.  Could you also send the --debug output of ipmi-sensors to
me?

Al

On Tue, 2008-08-26 at 11:43 -0700, Al Chu wrote:
 Hey David,
 
  scanner2 ~ #  ipmimonitoring
  ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error
 
 This is odd.  Could you run ipmimonitoring with --debug and let me know
 the output?  I suppose I could have some weird corner case bug in
 ipmimonitoring that doesn't exist in ipmi-sensors.
 
 Al
 
 On Tue, 2008-08-26 at 11:31 -0700, David Sparks wrote:
  Hi all,
  
  Servers: Dell 1850, 1950.
  
  I'm trying to get ipmimonitoring running and have hit a snag.  The manpage 
  suggests that ipmimonitoring provides different but similar output to 
  ipmi-sensors.  However ipmi-sensors works and ipmimonitoring doesn't.
  
  I also noticed I always get an error running ipmiutil getevt or bmchealth.
  I don't see any Dell-specific workarounds on the manpage (or the error
  either).  I'm using this for reference:
  
  http://www.gnu.org/software/freeipmi/manpages/man8/ipmimonitoring.8.html
  
  What else should I check?
  
  Thanks!
  
  
  scanner2 ~ #  ipmimonitoring
  ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error
  
  scanner2 ~ # ipmiutil getevt
  ipmiutil ver 2.11
  getevent ver 2.11
  -- BMC version 1.7, IPMI version 1.5
  event receiver sa = 20 lun = 00
  bmc enables = 08
  Event Message Buffers not enabled.
  set_bmc_enables error 0xcc
  getevent: Invalid data field in request
  
  scanner2 ~ # bmchealth
  ipmiutil ver 2.11
  bmchealth ver 2.11
  BMC version 1.7, IPMI version 1.5
  BMC manufacturer = 0002a2 (Dell), product = 
  Chassis Status   = 01   (on, restore_policy=stay_off)
  Power State  = 00   (S0: working)
  Selftest status  = 0055 (OK)
  get_chan_auth error: ret = cc
  bmchealth: Invalid data field in request
  
  scanner2 ~ #  ipmi-sensors
  1: Temp (Temperature): 50.00 C (NA/90.00): [OK]
  2: Temp (Temperature): 47.00 C (NA/90.00): [OK]
  3: Ambient Temp (Temperature): 16.00 C (5.00/47.00): [OK]
  4: Planar Temp (Temperature): 37.00 C (5.00/72.00): [OK]
  5: Riser Temp (Temperature): 40.00 C (5.00/67.00): [OK]
  6: Temp (Temperature): 40.00 C (NA/NA): [OK]
  7: Temp (Temperature): 40.00 C (NA/NA): [OK]
  8: CMOS Battery (Voltage): 3.15 V (2.64/NA): [OK]
  9: ROMB Battery (Voltage): [State Deasserted]
  10: VCORE (Voltage): [State Deasserted]
  11: VCORE (Voltage): [State Deasserted]
  12: PROC VTT (Voltage): [Unknown]
  13: 1.5V PG (Voltage): [Unknown]
  14: 1.8V PG (Voltage): [Unknown]
  15: 3.3V PG (Voltage): [Unknown]
  16: 5V PG (Voltage): [Unknown]
  17: 5V Riser PG (Voltage): [Unknown]
  18: Riser PG (Voltage): [Unknown]
  19: Presence (Entity Presence): [Entity Present]
  20: Presence (Entity Presence): [Entity Present]
  21: Presence (Entity Presence): [Entity Present]
  22: Presence (Entity Presence): [Entity Present]
  23: ROMB Presence (Entity Presence): [Entity Present]
  24: FAN 1A RPM (Fan): 7425.00 RPM (2175.00/NA): [OK]
  25: FAN 1B RPM (Fan): 5700.00 RPM (2175.00/NA): [OK]
  26: FAN 2A RPM (Fan): 7725.00 RPM (2175.00/NA): [OK]
  27: FAN 2B RPM (Fan): 5175.00 RPM (2175.00/NA): [OK]
  28: FAN 3A RPM (Fan): 7725.00 RPM (2175.00/NA): [OK]
  29: FAN 3B RPM (Fan): 5325.00 RPM (2175.00/NA): [OK]
  30: FAN 4A RPM (Fan): 8025.00 RPM (2175.00/NA): [OK]
  31: FAN 4B RPM (Fan): 5475.00 RPM (2175.00/NA): [OK]
  32: Status (Processor): [Processor Presence detected]
  33: Status (Processor): [Processor Presence detected]
  34: Status (Power Supply): [Presence detected]
  35: Status (Power Supply): [Presence detected]
  36: VRM (Power Supply): [Presence detected]
  37: VRM (Power Supply): [Presence detected]
  38: OS Watchdog (Watchdog 2): [OK]
  39: SEL (Event Logging Disabled): [Unknown]
  40: Intrusion (Platform Chassis Intrusion): [OK]
  41: PS Redundancy (Power Supply): [Unknown]
  42: Fan Redundancy (Fan): [Unknown]
  55: SCSI Connector  (Cable Interconnect): [Unknown]
  56: Drive (Slot Connector): [Unknown]
  57: ECC Corr Err (Memory): [Unknown]
  58: ECC Uncorr Err (Memory): [Unknown]
  59: I/O Channel Chk (Critical Interrupt): [Unknown]
  60: PCI Parity Err (Critical Interrupt): [Unknown]
  61: PCI System Err (Critical Interrupt): [Unknown]
  62: SBE Log Disable (Event Logging Disabled): [Unknown]
  63: Logging Disable (Event Logging Disabled): [Unknown]
  64: Unknown (System Event): [Unknown]
  65: CPU Protocol Er (Processor): [Unknown]
  66: CPU Bus PERR (Processor): [Unknown]
  67: CPU Init Err (Processor): [Unknown]
  68: CPU Machine Chk (Processor): [Unknown]
  69: Memory Spared (Memory): [Unknown]
  70: Memory Mirrored (Memory): [Unknown]
  71: Memory RAID (Memory): [Unknown]
  72: Memory Added (Memory): [Unknown]
  73: Memory Removed (Memory): [Unknown]
  74: PCIE Fatal Err (Critical 

Re: [Freeipmi-devel] ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error

2008-08-26 Thread David Sparks

Al Chu wrote:

Hey David,

Are you using the newest FreeIPMI available?  0.6.5 from the FreeIPMI
homepage?  The debug dump data you've given me seems pretty out of date.
I've changed the formatting + the amount of information that gets
dumped.


[ VALUE   TAG NAME:LENGTH  ]

[  2Dh] = cmd[ 8b]
[  CBh] = comp_code[ 8b]

(ipmi_monitoring_sensor_reading.c, _get_sensor_reading, 404): bad completion code: 0x51
ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error

Another user w/ a Dell machine hit sensors that returned 0xCB ==
Requested Sensor, data, or record not present.  This fix seems to be
in the FreeIPMI 0.6.5 release.
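
For readers decoding the completion codes in this thread (0xCB above, 0xcc in the ipmiutil output), here is a minimal Python sketch of a lookup table. The table is only a subset of the generic completion codes in the IPMI specification, and the function name describe_completion_code is hypothetical; it is not part of FreeIPMI or ipmiutil.

```python
# Subset of the generic completion codes from the IPMI specification.
# The helper name below is an illustrative assumption, not a FreeIPMI API.
GENERIC_COMPLETION_CODES = {
    0x00: "Command completed normally",
    0xC0: "Node busy",
    0xC1: "Invalid command",
    0xC3: "Timeout while processing command",
    0xCB: "Requested sensor, data, or record not present",
    0xCC: "Invalid data field in request",
    0xFF: "Unspecified error",
}


def describe_completion_code(code: int) -> str:
    """Return a human-readable description for an IPMI completion code."""
    if code in GENERIC_COMPLETION_CODES:
        return GENERIC_COMPLETION_CODES[code]
    # The specification sets aside these ranges rather than naming each code.
    if 0x01 <= code <= 0x7E:
        return "Device-specific (OEM) completion code"
    if 0x80 <= code <= 0xBE:
        return "Command-specific completion code"
    return "Reserved or unlisted completion code"


if __name__ == "__main__":
    for code in (0xCB, 0xCC, 0x51):
        print(f"0x{code:02X}: {describe_completion_code(code)}")
```

Under this table a value such as 0x51 lands in the device-specific range, so its meaning would depend on the BMC vendor rather than on the generic list.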

On another note, there are several sensors on your motherboard that
ipmimonitoring currently does not interpret.  I will add those into
ipmimonitoring.


I've updated to 0.6.5 from 0.5.6.  What is strange is the debug output changes 
from run to run:


# ipmimonitoring --debug
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-scanner2.localhost

=
Get SDR Repository Info Request
=
[  20h] = cmd[ 8b]
=
Get SDR Repository Info Response
=
ipmi_sdr_cache_create: internal IPMI error

# ipmimonitoring --debug
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-scanner2.localhost

=
Get SDR Repository Info Request
=
[  20h] = cmd[ 8b]
=
Get SDR Repository Info Response
=
[  20h] = cmd[ 8b]
[   0h] = comp_code[ 8b]
[   1h] = sdr_version_major[ 4b]
[   5h] = sdr_version_minor[ 4b]
[  4Ch] = record_count[16b]
[ EE4h] = free_space[16b]
[h] = most_recent_addition_timestamp[32b]
[h] = most_recent_erase_timestamp[32b]
[   0h] = get_sdr_repository_allocation_info_command_supported[ 1b]
[   1h] = reserve_sdr_repository_command_supported[ 1b]
[   0h] = partial_add_sdr_command_supported[ 1b]
[   0h] = delete_sdr_command_supported[ 1b]
[   0h] = reserved[ 1b]
[   2h] = modal_non_modal_sdr_repository_update_operation_supported[ 2b]
[   0h] = overflow_flag[ 1b]
=
Reserve SDR Repository Request
=
[  22h] = cmd[ 8b]
=
Reserve SDR Repository Response
=
ipmi_sdr_cache_create: internal IPMI error

# ipmimonitoring --debug
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-scanner2.localhost

=
Get SDR Repository Info Request
=
[  20h] = cmd[ 8b]
=
Get SDR Repository Info Response
=
[  20h] = cmd[ 8b]
[   0h] = comp_code[ 8b]
[   1h] = sdr_version_major[ 4b]
[   5h] = sdr_version_minor[ 4b]
[  4Ch] = record_count[16b]
[ EE4h] = free_space[16b]
[h] = most_recent_addition_timestamp[32b]
[h] = most_recent_erase_timestamp[32b]
[   0h] = get_sdr_repository_allocation_info_command_supported[ 1b]
[   1h] = reserve_sdr_repository_command_supported[ 1b]
[   0h] = partial_add_sdr_command_supported[ 1b]
[   0h] = delete_sdr_command_supported[ 1b]
[   0h] = reserved[ 1b]
[   2h] = modal_non_modal_sdr_repository_update_operation_supported[ 2b]
[   0h] = overflow_flag[ 1b]
=
Reserve SDR Repository Request
=
[  22h] = cmd[ 8b]
=
Reserve SDR Repository Response
=
[  22h] = cmd[ 8b]
[   0h] = comp_code[ 8b]
[ 383h] = reservation_id[16b]
=
Get SDR Request
=
[  23h] = cmd[ 8b]
[ 383h] = 

Re: [Freeipmi-devel] ipmi_monitoring_sensor_readings_by_record_id: internal IPMI error

2008-08-26 Thread Al Chu
Hey David,

On Tue, 2008-08-26 at 17:36 -0700, David Sparks wrote:
 Al Chu wrote:
  Hey David,
  
  Hmmm.  What you're showing me is indicative of timeouts.  As if the
  inband communication is too busy to respond or just isn't responding.
  Did you load the openipmi kernel driver?  Or are you using the freeipmi
  KCS driver?
 
 If I understand how I'm doing things I'm using the in-kernel driver (via 
 /dev/ipmi0).  

I don't know if the freeipmi-KCS driver works.  You may want to give it
a shot.  Just force it by specifying --driver-type=KCS on the command line.
Otherwise, I'm sort of stumped.  If the IPMI packets don't come back
from /dev/ipmi0, I'm not sure what else can be done.  I could maybe add
a retry-mechanism into the in-band communication code.  Never thought
I'd have to support retransmissions on in-band communication :P
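
Until something like that exists in the library, a caller could wrap the flaky operation in a retry loop of its own. The following is a rough Python sketch of that idea, not FreeIPMI code: call_with_retries, its parameters, and the choice of RuntimeError as the "transient failure" signal are all assumptions made for illustration.

```python
import time


def call_with_retries(func, attempts=3, delay=0.5, sleep=time.sleep):
    """Call func() until it succeeds or the attempts are used up.

    func is assumed to raise RuntimeError on a transient failure
    (for example, an in-band request that never got a response).
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except RuntimeError as err:
            last_error = err
            if attempt < attempts:
                # Linear backoff: wait a little longer after each failure.
                sleep(delay * attempt)
    raise last_error


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_read():
        # Stand-in for an in-band request that fails on the first two tries.
        calls["count"] += 1
        if calls["count"] < 3:
            raise RuntimeError("internal IPMI error")
        return "sensor data"

    print(call_with_retries(flaky_read, delay=0.01))
```

Passing the sleep function as a parameter keeps the helper easy to exercise without real delays.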

 I tried the 0.5.6 LAN driver without success but now that I've 
 upgraded to 0.6.5 the LAN driver works!

That's good.

Al

 
 # ipmimonitoring -h d-scanner1 -u root -p XXX
 1 | Temp | Temperature | Nominal | C | 50.00
 2 | Temp | Temperature | Nominal | C | 48.00
 3 | Ambient Temp | Temperature | Nominal | C | 17.00
 4 | Planar Temp | Temperature | Nominal | C | 37.00
 5 | Riser Temp | Temperature | Nominal | C | 40.00
 6 | Temp | Temperature | Nominal | C | 40.00
 7 | Temp | Temperature | Nominal | C | 40.00
 8 | CMOS Battery | Voltage | Nominal | V | 3.134700
 19 | Presence  | Entity Presence | Nominal | N/A | 'Entity Present'
 20 | Presence  | Entity Presence | Nominal | N/A | 'Entity Present'
 21 | Presence  | Entity Presence | Nominal | N/A | 'Entity Present'
 22 | Presence  | Entity Presence | Nominal | N/A | 'Entity Present'
 23 | ROMB Presence | Entity Presence | Nominal | N/A | 'Entity Present'
 24 | FAN 1A RPM | Fan | Nominal | RPM | 7350.00
 25 | FAN 1B RPM | Fan | Nominal | RPM | 5700.00
 26 | FAN 2A RPM | Fan | Nominal | RPM | 7800.00
 27 | FAN 2B RPM | Fan | Nominal | RPM | 5175.00
 28 | FAN 3A RPM | Fan | Nominal | RPM | 7575.00
 29 | FAN 3B RPM | Fan | Nominal | RPM | 5250.00
 30 | FAN 4A RPM | Fan | Nominal | RPM | 7875.00
 31 | FAN 4B RPM | Fan | Nominal | RPM | 5400.00
 32 | Status  | Group Processor | Nominal | N/A | 'Processor Presence detected'
 33 | Status  | Group Processor | Nominal | N/A | 'Processor Presence detected'
 34 | Status  | Power Supply | Nominal | N/A | 'Presence detected'
 35 | Status  | Power Supply | Nominal | N/A | 'Presence detected'
 36 | VRM  | Power Supply | Nominal | N/A | 'Presence detected'
 37 | VRM  | Power Supply | Nominal | N/A | 'Presence detected'
 38 | OS Watchdog | Watchdog2 | Nominal | N/A | ''
 40 | Intrusion | Physical Security | Nominal | N/A | ''
 56 | Drive | Slot Connector | Warning | N/A | 'Slot/Connector Device Removal Request'
 57 | ECC Corr Err | Memory | Critical | N/A | 'Presence detected'
 58 | ECC Uncorr Err | Memory | Critical | N/A | 'Presence detected'
 59 | I/O Channel Chk | Critical Interrupt | Critical | N/A | 'EISA Fail Safe Timeout'
 60 | PCI Parity Err | Critical Interrupt | Critical | N/A | 'EISA Fail Safe Timeout'
 61 | PCI System Err | Critical Interrupt | Critical | N/A | 'EISA Fail Safe Timeout'
 62 | SBE Log Disabled | Event Logging Disabled | Nominal | N/A | ''
 63 | Logging Disabled | Event Logging Disabled | Nominal | N/A | ''
 72 | Memory Added | Memory | Warning | N/A | 'Correctable ECC/other correctable memory error'
 73 | Memory Removed | Memory | Warning | N/A | 'Correctable ECC/other correctable memory error'
 74 | PCIE Fatal Err | Critical Interrupt | Critical | N/A | 'Front Panel NMI/Diagnostic Interrupt'
 75 | Chipset Err | Critical Interrupt | Critical | N/A | 'Front Panel NMI/Diagnostic Interrupt'
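
Since the output above is pipe-delimited, the Warning and Critical rows are easy to pick out with a script. Here is a small Python sketch, assuming each sensor sits on one unwrapped line with six fields. The field names (record_id, name, group, state, units, reading) are labels chosen here for illustration and do not come from the tool's output.

```python
def parse_monitoring_line(line):
    """Split one pipe-delimited sensor line into a dict, or return None."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 6:
        return None
    record_id, name, group, state, units, reading = fields
    if not record_id.isdigit():
        return None
    return {
        "record_id": int(record_id),
        "name": name,
        "group": group,
        "state": state,
        "units": units,
        # Discrete readings are wrapped in single quotes; drop them.
        "reading": reading.strip("'"),
    }


def non_nominal(lines):
    """Return the parsed sensors whose state is anything but Nominal."""
    parsed = (parse_monitoring_line(line) for line in lines)
    return [sensor for sensor in parsed if sensor and sensor["state"] != "Nominal"]


if __name__ == "__main__":
    sample = [
        "1 | Temp | Temperature | Nominal | C | 50.00",
        "57 | ECC Corr Err | Memory | Critical | N/A | 'Presence detected'",
        "72 | Memory Added | Memory | Warning | N/A | "
        "'Correctable ECC/other correctable memory error'",
    ]
    for sensor in non_nominal(sample):
        print(sensor["record_id"], sensor["name"], sensor["state"], sensor["reading"])
```

Lines that do not have exactly six fields or a numeric first field are skipped, so banner or error lines mixed into the output would be ignored rather than misparsed.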
 
 
 
-- 
Albert Chu
[EMAIL PROTECTED]
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory


