Duncan,
That paradigm makes server management equivalent to the old pre-IPMI days
of raw lm_sensors, polling them every 10 seconds or so, which also
impacts CPU performance.
Iteratively gathering sensors may work sometimes, but there are many
events that do not persist long enough to catch them when running the
sensor command. And then you have IPMI Event-Only sensors (which record an
event but have no meaningful sensor reading), and so on.
Waiting for events is part of the elegance of IPMI. You do not have to
poll the sensors, since the firmware does that already several times a
second, without impacting CPU performance. And waiting for events does
not have to mean reading the entire IPMI SEL; reading only the events
logged since the last one read is fine.
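
To illustrate reading only the events since the last one read, here is a minimal Python sketch. It assumes the SEL text comes from "ipmitool sel elist", that fields are separated by "|", and that the first column is the record ID in hexadecimal; the sample text and the function names are hypothetical, so check them against real output from your BMC.

```python
import subprocess

# Hypothetical sample of "ipmitool sel elist" output; real output varies by BMC.
SAMPLE_SEL = """\
   1 | 04/11/2012 | 16:02:11 | Memory #0x01 | Correctable ECC | Asserted
   2 | 04/11/2012 | 16:05:40 | Memory #0x01 | Uncorrectable ECC | Asserted
   a | 04/12/2012 | 07:30:02 | Power Supply #0x02 | Failure detected | Asserted
"""


def parse_sel_line(line):
    """Return (record_id, remaining_fields) for one SEL line, or None."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) < 2 or not parts[0]:
        return None
    try:
        record_id = int(parts[0], 16)  # assumption: ID column is hexadecimal
    except ValueError:
        return None  # header or unparseable line
    return record_id, parts[1:]


def new_events(sel_text, last_seen_id):
    """Return only the events whose record ID is above last_seen_id."""
    events = []
    for line in sel_text.splitlines():
        parsed = parse_sel_line(line)
        if parsed is not None and parsed[0] > last_seen_id:
            events.append(parsed)
    return events


def read_sel():
    """Fetch the SEL from the local BMC (requires ipmitool and privileges)."""
    result = subprocess.run(
        ["ipmitool", "sel", "elist"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A caller would store the highest record ID it has handled (in a state file, for example) and pass it as last_seen_id on the next run, e.g. new_events(read_sel(), last_seen_id). This sketch does not handle a cleared SEL, where record IDs start over; the stored ID would need to be reset in that case.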
That's how ipmitool's ipmievd works, and the ipmiutil_evt service also.
FreeIPMI has an event service as well.
Parsing of SEL events should be the first option. And there are event
daemons already available to do most of the work.
Andy
From: Duncan Idaho [mailto:dune.id...@gmail.com]
Sent: Thursday, April 12, 2012 2:31 PM
To: Ryan Cox
Cc: ipmitool-devel@lists.sourceforge.net
Subject: Re: [Ipmitool-devel] generic test for system component failure
In my opinion, the locate LED serves for chassis (server) identification, not
for determining whether the server requires service. But I have to admit I
wouldn't be surprised if some vendors used it that way.
Anyway, you want to use # ipmitool sdr list full and look up the indicator
that's supposed to signal the health status of the chassis (server). And
perhaps check the other sdr list commands later, since it seems pointless
to fetch all records just for one indicator.
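
As a rough illustration of scanning that output, the Python sketch below flags every sensor whose status column is not healthy. It assumes three "|"-separated columns (name, reading, status) and treats "ok" and "ns" (no reading) as healthy; the sample text, sensor names, and function name are hypothetical, and which sensor acts as a health indicator is vendor-specific.

```python
# Hypothetical sample of "ipmitool sdr list full" output; names vary by vendor.
SAMPLE_SDR = """\
Ambient Temp     | 23 degrees C      | ok
FAN 1 RPM        | 0 RPM             | cr
PS Redundancy    | 0x01              | ok
CPU Temp         | no reading        | ns
"""


def failing_sensors(sdr_text, healthy=("ok", "ns")):
    """Return (name, reading, status) for sensors not in a healthy state."""
    bad = []
    for line in sdr_text.splitlines():
        cols = [c.strip() for c in line.split("|")]
        if len(cols) != 3:
            continue  # skip anything that is not a name|reading|status row
        name, reading, status = cols
        if status not in healthy:
            bad.append((name, reading, status))
    return bad
```

With the sample above, only the fan row is reported. Event-only sensors carry no reading, so a check like this would not see them.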
There is also the # ipmitool chassis status command, which gives you health
information; you might want to check it out.
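
For a Nagios-style check, a minimal Python sketch along these lines could turn that output into an exit code (0 = OK, 2 = CRITICAL, 3 = UNKNOWN). The "Field : value" layout and the fault field names are assumptions based on typical ipmitool output, and the sample text and function names are hypothetical.

```python
# Hypothetical sample of "ipmitool chassis status" output.
SAMPLE_CHASSIS = """\
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : previous
Last Power Event     :
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : true
"""

# Assumption: these fields read "true" when the chassis reports a fault.
FAULT_FIELDS = (
    "Power Overload",
    "Main Power Fault",
    "Power Control Fault",
    "Drive Fault",
    "Cooling/Fan Fault",
)


def parse_chassis_status(text):
    """Parse 'Field : value' lines into a dict."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


def nagios_result(status):
    """Map parsed chassis status to a (Nagios exit code, message) pair."""
    if not status:
        return 3, "UNKNOWN - no chassis status available"
    faults = [f for f in FAULT_FIELDS if status.get(f, "false").lower() == "true"]
    if faults:
        return 2, "CRITICAL - " + ", ".join(faults)
    return 0, "OK - no chassis faults reported"
```

A plugin would print the message and exit with the code. These fields cover only power, drive, and fan faults, so a memory failure of the kind described in this thread would probably not show up here.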
I would leave parsing of SEL entries as the very last option.
My $0.02 USD.
--Duncan
On Thu, Apr 12, 2012 at 5:55 PM, Ryan Cox <ryan_...@byu.edu> wrote:
Not sure about ipmitool, but ipmi-oem from freeipmi has this for Dell
systems:
ipmi-oem dell get-chassis-identify-status
On 04/12/2012 10:47 AM, Timothy Gelter wrote:
Hello Andy,
I appreciate your quick response!
The systems I'm currently targeting are all Dell (C1100, C6220,
and R620) but I'd also want to be able to monitor a variety of HP
servers (DL360, DL185, BL460, & more) in this same way.
I was hoping not to have to parse the SEL but that's what I'll
do if that's my only option.
Thanks,
- Tim
From: Andy Cress <andy.cr...@us.kontron.com>
Date: Thu, 12 Apr 2012 07:34:48 -0700
To: Timothy Gelter <timo...@gelter.com>,
<ipmitool-devel@lists.sourceforge.net>
Subject: RE: [Ipmitool-devel] generic test for system component
failure
Tim,
The 'system health light' will be different for each chassis
vendor. Which chassis vendor is this?
In any case, parsing the IPMI SEL (waiting for IPMI events) is
the surest way to detect faults on an IPMI-capable system. That's the
trigger for the firmware to turn on the system health light.
Andy
From: Timothy Gelter [mailto:timo...@gelter.com]
Sent: Wednesday, April 11, 2012 5:03 PM
To: ipmitool-devel@lists.sourceforge.net
Subject: [Ipmitool-devel] generic test for system component
failure
Hello list members!
I'm trying to come up with a generic Nagios script which can
indicate a failure condition for any of our systems.
We've come across a couple of different memory failures which
resulted in system failure that only turned on the system health light
and added an entry to the SEL.
Unfortunately, until a field tech saw the light, we didn't have
any indication of the failure because we didn't have a Nagios check in
place to alert us to those specific conditions.
What I'd like to be able to do is query the system to determine
whether or not the health light is active.
If there isn't a way to find out this information, I guess we're
stuck parsing SEL output.
I've looked through the ipmitool manpage and so far haven't
found what I need.
Thanks!
- Tim
WARNING - This e-mail or its attachments may contain controlled
technical data or controlled technology within the definition of the
International Traffic in Arms Regulations (ITAR) or Export
Administration Regulations (EAR), and are subject to the export control
laws of the U.S. Government. Transfer of this data or technology by any
means to a foreign person, whether in the United States or abroad,
without an export license or other approval from the U.S. Government, is
prohibited. The information contained in this document is CONFIDENTIAL
and property of Kontron. Any unauthorized review, use, disclosure or
distribution is prohibited without express written consent of Kontron.
If you are not the intended recipient, please contact the sender and
destroy all copies of the original message and enclosed attachments.
--
------------------------------------------------------------------------------
For Developers, A Lot Can Happen In A Second.
Boundary is the first to Know...and Tell You.
Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
http://p.sf.net/sfu/Boundary-d2dvs2
_______________________________________________
Ipmitool-devel mailing list
Ipmitool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ipmitool-devel
--
Ryan Cox
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University