Rob Johnston wrote:
> Hello and happy new year!
>
> What follows is a proposal that Eric Schrock and I have been working on 
> to define a mechanism to enumerate power supplies and fans on platforms 
> that support IPMI.  This is a first step in the larger Sensor 
> Abstraction Layer project.
>
> Comments and questions are encouraged.
>
> --------------------------------------------------------------------
>
> 1. DESCRIPTION
>
> The Solaris FMA framework is designed to diagnose failures in system
> components.  Currently these components are discovered by probing the
> hardware visible to Solaris via standard OS paths (I/O, CPU, DIMMs,
> etc).  However, there exists a set of components that are crucial to the
> ongoing health of the system that have no connection visible to Solaris.
> The most common components, and the most likely to encounter failures,
> are power supplies and fans.
>
> On low-end hardware, these components are often not observable, and it
> is the responsibility of the user to manually detect component failure,
> or run custom (Windows) software to observe the system.  Higher end
> systems (such as the x4000 series shipped by Sun) have a service
> processor that manages the physical components and sensors in the
> system.  Some systems (such as SPARC) have a custom communications
> mechanism between the OS and the SP, but the industry standard is IPMI
> (Intelligent Platform Management Interface).  Solaris already has the
> ability to communicate with the SP over the baseboard management
> controller (/dev/bmc), and a basic library (libipmi) already exists.
>
> Integrating support for power supplies and fans within FMA is an
> important step in bringing all hardware topology enumeration and
> diagnosis under a single infrastructure.  Without this ability, users
> must manage a separate OS instance (on the SP) with different
> configuration, separate management, and separate notification
> mechanisms.
>
> This proposal adds basic enumeration support for power supplies and fans
> on platforms supporting IPMI.  It does not include the ability to
> diagnose PSU or fan failures, nor does it provide a way to read
> environmental sensors (fan speed, etc) for these components.  This
> functionality will be provided by a future project.
>
>
> 2. TOPOLOGY CHANGES
>
> On x86 systems, the root of the hc topology tree is hc:///motherboard=0
> (though bay nodes can exist at the root level as well).  It doesn't make
> sense to have physical components like fans underneath the motherboard,
> nor does it make sense to have them directly at the root level.  Future
> projects will add sensors that monitor the chassis itself, and the
> components are contained within the chassis, so a new root hc node is
> created:
>
>       hc:///chassis=0
>
> There is only ever a single chassis.

I have some heartburn over this statement.  I think it will be confusing
in those cases where a set of domains may share the same physical chassis
(e.g., blade servers).  Perhaps we need a different name, other than
"chassis"?
 -- richard

>   Within IPMI, fans and PSUs can be
> grouped together into domains that represent a logical unit (typically a
> FRU).  While uncommon for power supplies, this is quite common for fan
> modules or fan trays that contain multiple fans.  Therefore a
> multi-level topology will be created of the form:
>
>       hc:///chassis=0/psu=0
>       hc:///chassis=0/psu=1
>       hc:///chassis=0/powermodule=0
>       hc:///chassis=0/powermodule=0/psu=0
>       hc:///chassis=0/powermodule=0/psu=1
>
>       hc:///chassis=0/fan=0
>       hc:///chassis=0/fan=1
>       hc:///chassis=0/fanmodule=0
>       hc:///chassis=0/fanmodule=0/fan=0
>       hc:///chassis=0/fanmodule=0/fan=1
>
> The IPMI components are technically 'cooling' elements, not fans. For
> the systems which currently support Solaris and IPMI, only fans are
> supported.  In the future, we may be able to detect non-fan cooling
> elements by examining the set of associated sensors (such as a
> tachometer) and inferring the type of cooling element.
>
> With IPMI, we know all components, even if a component is not currently
> present.  To allow management software to detect empty component slots,
> the FMRIs will always be enumerated, but the is_present method will
> return false if the component is not currently present.
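
To make the "always enumerated, possibly absent" idea concrete, here is a
minimal sketch of how the FMRI strings above could be formed and how a
consumer might count empty slots.  The slot_t type and both functions are
hypothetical illustrations, not libtopo or libipmi interfaces; the
'present' field merely stands in for whatever the is_present method would
report.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical slot description; not a libtopo or libipmi type. */
typedef struct slot {
	const char *parent;	/* e.g. "fanmodule"; NULL if directly under chassis */
	int parent_inst;	/* instance of the parent, ignored if parent is NULL */
	const char *name;	/* "fan" or "psu" */
	int inst;		/* instance number of this component */
	int present;		/* stand-in for the is_present method result */
} slot_t;

/* Format the hc FMRI for a slot; returns the snprintf() result. */
int
slot_fmri(const slot_t *s, char *buf, size_t len)
{
	if (s->parent == NULL) {
		return (snprintf(buf, len, "hc:///chassis=0/%s=%d",
		    s->name, s->inst));
	}
	return (snprintf(buf, len, "hc:///chassis=0/%s=%d/%s=%d",
	    s->parent, s->parent_inst, s->name, s->inst));
}

/* Count slots that are enumerated but currently empty. */
size_t
count_empty(const slot_t *slots, size_t n)
{
	size_t i, empty = 0;

	for (i = 0; i < n; i++) {
		if (!slots[i].present)
			empty++;
	}
	return (empty);
}
```

The point of the sketch is only that an absent component still has a
stable FMRI; presence is a property queried on the node rather than a
reason to omit it.
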
>
>
> 3. DYNAMIC ENUMERATION
>
> A new common libtopo module, ipmi, will be provided that will do dynamic
> enumeration of IPMI components.  While currently only supported on x86
> systems, any system supporting IPMI should work, so the module will be
> present on all architectures.  If future SPARC platforms support IPMI
> over /dev/bmc, then everything should "just work".
>
> IPMI has the unusual property that the world is defined solely by
> 'sensor data records' (SDRs, which may describe sensors, FRUs, etc).  Instead
> of iterating over entities (the IPMI term for components), one instead
> iterates over all SDR records and infers an entity's existence based on
> the sensor records that refer to it.  The logic to handle this will be
> kept within libipmi, and the ipmi enumerator will iterate over all
> discovered entities for any 'power domain', 'power supply', 'cooling
> domain', or 'cooling unit' entities.  Using IPMI entity association
> records, libipmi will have already organized these into the appropriate
> hierarchy.
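
The inference step described above can be sketched roughly as follows.
This is an illustration only: sdr_ref_t, entity_t, and infer_entities()
are made-up names, and a real SDR carries far more than the entity id and
instance it refers to.  The sketch simply walks every record and collects
the distinct (id, instance) pairs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of an SDR: only the entity it refers to. */
typedef struct sdr_ref {
	uint8_t entity_id;
	uint8_t entity_instance;
} sdr_ref_t;

/* Hypothetical inferred entity, with a count of records that mention it. */
typedef struct entity {
	uint8_t id;
	uint8_t instance;
	size_t nsdrs;
} entity_t;

/*
 * Walk every record and collect the distinct (id, instance) pairs.
 * Returns the number of entities found, or (size_t)-1 if 'out' is too
 * small to hold them all.
 */
size_t
infer_entities(const sdr_ref_t *sdrs, size_t nsdrs, entity_t *out, size_t max)
{
	size_t i, j, n = 0;

	for (i = 0; i < nsdrs; i++) {
		/* Look for an entity we have already inferred. */
		for (j = 0; j < n; j++) {
			if (out[j].id == sdrs[i].entity_id &&
			    out[j].instance == sdrs[i].entity_instance)
				break;
		}
		if (j == n) {
			/* First record that mentions this entity. */
			if (n == max)
				return ((size_t)-1);
			out[n].id = sdrs[i].entity_id;
			out[n].instance = sdrs[i].entity_instance;
			out[n].nsdrs = 0;
			n++;
		}
		out[j].nsdrs++;
	}
	return (n);
}
```

An enumerator built on top of this would then filter the resulting list
by entity id to pick out the power and cooling entities it cares about.
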
>
> The default label for each entity will be based on the entity id and the
> entity instance number (which is globally unique).  These labels may or
> may not correspond to the labels on the chassis, but under a correct
> IPMI implementation they will be roughly correct, and there will be a
> means to override them on a per-platform basis (see below).  For
> components with a FRU locator record, it may be possible to assign a
> label matching the FRU name, such as 'ft0.fm1.fru', though it's unclear
> if this is any better (the naming is entirely up to the SP, and the
> '.fru' extension is just a convention used by the current SP
> firmware).
>
> Each component that is directly under the chassis will be assigned a FRU
> matching its resource.  Components within an association will default to
> the FRU of their parent, unless they have associated FRU locator
> records, in which case they will have a distinct FRU matching their
> resource.
>
> The sensors associated with the entity will be used to determine
> presence as described in the IPMI specification.
>
> 4. STATIC ENUMERATION
>
> It would be nice if dynamic enumeration were enough to model any system
> supporting IPMI.  Unfortunately, as is the case with most platform
> technologies (such as SMBIOS), complete support for enumeration is
> hampered by limitations of the specification as well as the
> implementation.  With a proper implementation of the IPMI spec, it is
> possible to enumerate all the components, though attaching semantic
> meaning to them (labels, failure sensors, etc) is only possible in some
> cases.
>
> On top of this, most platforms have an IPMI implementation that leaves
> something to be desired.  A common problem is the lack of entity
> association records, so fans that should be part of a logical module
> (even if correctly represented via SDR records) are not associated with
> one another.  Other problems include presence sensors that reference
> incorrect entities, missing or incorrect FRU locator records, etc.
>
> To compensate for both of these problems, libtopo will support dynamic
> enumeration, static enumeration, and static assignment of sensors and
> properties to dynamically discovered entities.
>
>
> 5. LIBIPMI DETAILS
>
> As part of this work, libipmi will be expanded in several different
> capacities, mostly related to parsing SDR records and representing
> entities.
>
> The SDR infrastructure will be expanded to support all possible SDR
> record types (compact sensor, full sensor, entity association, etc).
> The code will also be simplified to separate out the SDR name (when
> available) from the record, since constructing this value is non-trivial
> and should not be left to the consumer.
>
> New interfaces for gathering sensor readings based on a compact or full
> SDR record will be introduced.  This consists mainly of a large number
> of #defines, code to transform readings based on the linearization
> function, and parsing the sensor units.  Some of this infrastructure
> will not be fully used until future sensor work is complete, but enough
> of it is needed at this point (namely parsing sensor-specific state
> masks) to warrant its inclusion as part of this project.
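
As a rough illustration of the "transform readings" step: to my
understanding, the IPMI specification converts a raw full-sensor reading
x as y = L[(M*x + B*10^K1) * 10^K2], where M, B, and the two exponents
come from the SDR and L is the linearization function.  The sketch below
covers only the linear case (L is the identity).  The function names are
invented for this example and are not the proposed libipmi interfaces.

```c
/* Raise 10 to a (possibly negative) integer power without needing libm. */
static double
pow10i(int exp)
{
	double r = 1.0;
	int i, n = exp < 0 ? -exp : exp;

	for (i = 0; i < n; i++)
		r *= 10.0;
	return (exp < 0 ? 1.0 / r : r);
}

/*
 * Convert a raw reading for a sensor whose linearization is "linear".
 * m and b are the multiplier and offset from the SDR; b_exp and r_exp
 * are the exponents called K1 and K2 above.
 */
double
sensor_convert_linear(int raw, int m, int b, int b_exp, int r_exp)
{
	return (((double)m * raw + (double)b * pow10i(b_exp)) *
	    pow10i(r_exp));
}
```

Non-linear sensors would apply a further function (log, exp, and so on)
to the result, which is part of why this logic belongs in the library
rather than in each consumer.
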
>
> Based on this new infrastructure, libipmi will be enhanced to have a
> native notion of entities, even though these do not exist as such in the
> IPMI specification.  The library will scan the SDR records, detect
> referenced entities, group sensors with associated entities, and parse
> entity association records to create a hierarchy of entities.  This will
> also include a function to detect entity presence.
>
> This isolates the details of IPMI entities (of which there are many) to
> within libipmi, simplifying the topo enumerator and allowing other
> software to be developed on top of it.  One of these pieces of software
> will be a private utility under /usr/lib/fm, 'ipmitopo', which will
> display all IPMI entities (id, type, presence) and sensors associated
> with each entity (reading, state, type, etc).  This tool is not designed
> to replace the open source 'ipmitool' and exists solely to debug the
> IPMI topo implementation by leveraging the same code used by libtopo.
>
>
> 6. LIBTOPO ENHANCEMENTS
>
> To make the implementation of this project possible, a handful of
> extensions to both the libtopo enumerator module API and XML schema are
> necessary.
>
> Currently it is not possible to register module methods on nodes that
> are statically enumerated via XML map files.  Typically, node methods
> are registered onto a node by the enumerator module after the node is
> bound to the topology.  However, since statically enumerated nodes
> aren't created by the enumerator module, this registration doesn't occur.
>
> While there will be cases where we will be forced to statically define
> psu and fan topologies via XML, these nodes still need to support the
> node methods that are implemented by the ipmi enumerator module.  In
> order to allow these methods to be registered on statically defined
> nodes, the topo_modops_t struct will be extended with a new operation
> (tmo_meth_reg) as shown below:
>
> typedef int topo_meth_reg_f(topo_mod_t *, tnode_t *);
>
> typedef struct topo_modops {
>       topo_enum_f *tmo_enum;          /* enumeration op */
>       topo_release_f *tmo_release;    /* resource release op */
>       topo_meth_reg_f *tmo_meth_reg;  /* method registration op */
> } topo_modops_t;
>
> The tmo_meth_reg operation will be optional.  Enumerator modules
> which implement this operation will register the appropriate set of
> methods on the topo node that is passed in.
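
A sketch of how a caller might honor the optional nature of this
operation is shown below.  It uses opaque stand-ins for topo_mod_t and
tnode_t so that it compiles on its own; modops_sketch_t,
register_static_node(), and example_meth_reg() are hypothetical names
for illustration and do not describe the actual libtopo implementation.

```c
#include <stddef.h>

/* Opaque stand-ins for the real libtopo types. */
typedef struct topo_mod topo_mod_t;
typedef struct tnode tnode_t;

typedef int topo_meth_reg_f(topo_mod_t *, tnode_t *);

/* Only the proposed member is modeled here. */
typedef struct modops_sketch {
	topo_meth_reg_f *tmo_meth_reg;	/* may be NULL: the op is optional */
} modops_sketch_t;

/* Counts calls so the behavior of the example can be observed. */
static int nregistered;

/* Example operation: a real module would register its node methods here. */
int
example_meth_reg(topo_mod_t *mod, tnode_t *node)
{
	(void) mod;
	(void) node;
	nregistered++;
	return (0);
}

/*
 * Hypothetical helper: what the map file parser might do for a
 * statically defined node.  Returns 0 when there is nothing to register.
 */
int
register_static_node(const modops_sketch_t *ops, topo_mod_t *mod,
    tnode_t *node)
{
	if (ops == NULL || ops->tmo_meth_reg == NULL)
		return (0);
	return (ops->tmo_meth_reg(mod, node));
}
```

Treating a NULL pointer as "nothing to do" is one way to keep existing
enumerator modules working without modification.
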
>
> To provide a connection between this new operation and nodes that are
> statically defined in XML, the syntax of the <node> element will be
> extended to include a new optional "mod" attribute.  The value of this
> attribute should be set to the name of an enumerator module, whose methods
> should be registered on that node.  Below is an example usage of this
> new attribute:
>
>    <range name='fan' min='0' max='2'>
>          <node instance='0' mod='ipmi'>
>              . . .
>          </node>
>    </range>
>
> Additionally, the syntax of the <range> element will also be extended to
> allow a new "set" attribute.  The intention is to allow for conditional
> enumeration of a range of nodes based on the platform type.  This is
> analogous to the conditional specification of properties which is
> currently supported via the <propset> element.  Below is an example
> usage of this new attribute:
>
>    <range name='fanmodule' min='0' max='4'
>           set='Sun-Fire-X4500|Sun-Fire-X4540'>
>        . . .
>    </range>
>
> In the example above, the <range> element (and all children elements
> within) will only be parsed and evaluated if the machine's platform type
> matches one of the platforms specified by the "set" attribute's value.
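
The matching rule could be sketched as below, assuming that the "set"
value is a list of platform names separated by '|' and that a name must
match an entry exactly.  Both are assumptions drawn from the example
above; platform_in_set() is a hypothetical helper, not part of libtopo.

```c
#include <string.h>

/* Return 1 if 'platform' exactly matches one '|'-separated entry in 'set'. */
int
platform_in_set(const char *platform, const char *set)
{
	size_t plen = strlen(platform);
	const char *p = set;

	while (*p != '\0') {
		/* Length of the current entry, up to the next '|' or the end. */
		size_t len = strcspn(p, "|");

		if (len == plen && strncmp(p, platform, len) == 0)
			return (1);
		p += len;
		if (*p == '|')
			p++;
	}
	return (0);
}
```

Comparing lengths before comparing bytes keeps a prefix such as
"Sun-Fire-X45" from matching "Sun-Fire-X4500".
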
>
> All of the above extensions will be backwards compatible with any
> existing map files and enumerator modules.
>
>
> 7. FUTURE WORK
>
> This proposal lays the groundwork for a variety of future work under the
> auspices of the FMA Sensor Framework.
>
> The next step will be to include fan and PSU diagnosis.  This requires
> representing failure sensors within libtopo using the facility nodes
> proposed as part of the sensor framework.  These sensors are then read
> by a sensor-transport module that has a 1:1 correspondence between
> ereports and faults.
>
> This will serve as a proof of concept for facility nodes and prepare
> the way for the larger sensor and alert framework, while providing the
> greatest immediate benefit.  Future work will include representing
> analog sensors in libtopo, developing an environmental monitor,
> detecting fan and PSU hotplug, and creating a persistent alert
> framework.
>
>
> 8. REFERENCES
>
> "IPMI v2.0 rev. 1.0 specification markup for IPMI v2.0/v1.5 errata
>   revision 3"
>  
> http://www.intel.com/design/servers/ipmi/pdf/IPMIv2_0_rev1_0_E3_markup.pdf
>
> Sensor Abstraction Layer OpenSolaris Project
>
> http://www.opensolaris.org/os/project/sensors/
>
> Libtopo documentation: FMD Programmer's Reference, Chapter 9
>
> http://www.opensolaris.org/os/community/fm/FMDPRM.pdf
>
> _______________________________________________
> fm-discuss mailing list
> fm-discuss@opensolaris.org
>   
