All:

  This is more of a general IPMI question.  Sorry, there isn't
  a -users@ list.

  There's an old Nagios monitoring script that would look
  through 'ipmitool sdr list', and search the status column
  for values != "ok".

  It turns out that this simple logic may be insufficient for
  checking discrete sensors such as 'PS Redundancy'.

  For example:

  Consider sensor 7.1 on a PowerEdge 2950r1:

  Sensor ID              : PS Redundancy (0x74)
   Entity ID             : 7.1 (System Board)
   Sensor Type (Discrete): Power Supply
   States Asserted       : Redundancy State
                           [Fully Redundant]
   Assertion Events      : Redundancy State
                           [Fully Redundant]
   Assertions Enabled    : Redundancy State
                           [Fully Redundant]
                           [Redundancy Lost]


  ---------------------------------------------------


  When the primary power supply is missing or unplugged, the
  SDR listing still reports the sensor with status 'ok':

   % sudo ipmitool -U foo -H system-lom sdr elist all
   PS Redundancy | 74h | ok | 7.1 | Redundancy Lost

   Note how the sensor status reads 'ok' in almost all
   conditions (except possibly when both power supplies
   are 'not present' or 'failed', which would be hard
   to test! >:}  )

   I'm a bit confused about the data structures, but I understand
   that thresholds for assertion and de-assertion can be programmed
   using OpenIPMI.  (A co-worker had to do this for a broken Dell
   DRAC card in an r710 or 2950r3 that was reading the upper warning
   state threshold wrong.)

   So is there a way to program 7.1 or 10.1/10.2 to set the status
   to NOT OK during: 1) Predictive Failure 2) Power loss 3) Absence?

   As an alternative, I can have the script start doing additional
   string matching for keywords on specific sensor categories:

   For example, sdr type "Power Supply"

----------------------------

$ ipmitool -P XX -U netadmin -H system-lom sdr entity 10
Presence | 54h | ok | 10.1 | Absent
Presence | 55h | ok | 10.2 | Present
Status | 64h | ok | 10.1 | Failure detected, Power Supply AC lost
Status | 65h | ok | 10.2 | Presence detected



With the power cable pulled:

% ipmitool -P XX -U netadmin -H system-lom sdr entity 10
Presence | 54h | ok | 10.1 | Present
Presence | 55h | ok | 10.2 | Present
Status | 64h | ok | 10.1 | Presence detected,
                            Failure detected,
                            Power Supply AC lost
Status | 65h | ok | 10.2 | Presence detected
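
A rough sketch of that string-matching fallback, in Python, might look
like the following.  It assumes the five-column 'sdr elist' / 'sdr entity'
layout shown above (name | number | status | entity | reading) and is fed
only the command's raw output.  The function names and the keyword list
are hypothetical; the first four keywords are copied from the output
quoted in this message, and "predictive failure" is an assumed spelling
that should be checked against real ipmitool output.

```python
# Hypothetical keyword list: the first four strings come from the ipmitool
# output quoted in this message; "predictive failure" is an assumed spelling.
BAD_KEYWORDS = (
    "redundancy lost",
    "failure detected",
    "ac lost",
    "absent",
    "predictive failure",
)


def parse_sdr_elist(text):
    """Return (name, number, status, entity, reading) tuples, one per sensor."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = [f.strip() for f in line.split("|")]
        if len(fields) == 5:
            rows.append(fields)
        elif len(fields) == 1 and rows:
            # A line with no '|' is treated as a wrapped reading column
            # and is appended to the previous sensor's reading.
            rows[-1][4] = rows[-1][4] + " " + fields[0]
    return [tuple(r) for r in rows]


def find_problems(text):
    """Return (name, entity, reading) for sensors that look unhealthy."""
    problems = []
    for name, number, status, entity, reading in parse_sdr_elist(text):
        lowered = reading.lower()
        keyword_hit = any(k in lowered for k in BAD_KEYWORDS)
        # Keep the old check (status != "ok") and add the keyword check.
        if status.lower() != "ok" or keyword_hit:
            problems.append((name, entity, reading))
    return problems


def nagios_result(text):
    """Map the findings to a Nagios exit code (0 = OK, 2 = CRITICAL)."""
    problems = find_problems(text)
    if problems:
        detail = "; ".join(f"{n} {e}: {r}" for n, e, r in problems)
        return 2, "CRITICAL - " + detail
    return 0, "OK - no power supply problems found"
```

Note that matching on "absent" would also flag a slot that is empty on
purpose, so the keyword list would likely need tuning per chassis.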


Thanks, ~BAS


