Thanks, Cindi. You have seconds. I'll contact you offline to get you
set up.

Eric

On Tue, 27 Feb 2007, cindi wrote:


This project will export the fault management registry of event specifications
and diagnosis article content.  The initial delivery to the OpenSolaris
community will include the registry contents, a set of CLIs, and web-based
browser tools to access event class and payload specifications, diagnosis
messages, and article details.

The initial target audience of this project is system administrators and
developers who want a listing of the possible fault diagnosis messages
and the event class and payload specifications.

For example, the ermsg command may be used to list the diagnosis message IDs
for all AMD Opteron and Athlon 64 processor diagnoses:

   # ermsg -a AMD
   Dictionary   Entry No.    ID
   AMD          1            AMD-8000-1W
   AMD          2            AMD-8000-2F
   AMD          3            AMD-8000-3K
   AMD          4            AMD-8000-48
   AMD          5            AMD-8000-5M
   AMD          6            AMD-8000-67
   AMD          7            AMD-8000-7U
   AMD          8            AMD-8000-8L
   AMD          9            AMD-8000-9G
   AMD          10           AMD-8000-AV
   AMD          11           AMD-8000-C0
   AMD          12           AMD-8000-DT
   AMD          13           AMD-8000-E6
   AMD          14           AMD-8000-FN
   AMD          15           AMD-8000-G9
   ...
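
For anyone who wants to script against a listing like the one above, here is
a minimal sketch in Python. It assumes the output keeps the three
whitespace-separated columns shown (dictionary, entry number, ID). The names
DictEntry and parse_ermsg_listing are hypothetical helpers for illustration
and are not part of the project.

```python
from typing import List, NamedTuple


class DictEntry(NamedTuple):
    """One row of the listing: dictionary name, entry number, message ID."""
    dictionary: str
    entry_no: int
    msg_id: str


def parse_ermsg_listing(text: str) -> List[DictEntry]:
    """Parse listing text in the three-column layout shown above.

    Assumption: data rows have exactly three whitespace-separated fields
    and a numeric second field. The header, blank lines, and the "..."
    elision do not match that shape and are skipped.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1].isdigit():
            entries.append(DictEntry(fields[0], int(fields[1]), fields[2]))
    return entries


# Sample input copied from the listing above (abbreviated).
SAMPLE_LISTING = """\
Dictionary   Entry No.    ID
AMD          1            AMD-8000-1W
AMD          15           AMD-8000-G9
...
"""

if __name__ == "__main__":
    for entry in parse_ermsg_listing(SAMPLE_LISTING):
        print(entry.dictionary, entry.entry_no, entry.msg_id)
```

The sketch works only on captured text, so it makes no assumption about how
the command itself is invoked.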

Specific message and article detail content may also be displayed:

   # ermsg -a AMD-8000-G9
   Dictionary   Entry No.    ID
   AMD          15           AMD-8000-G9

   CPU errors exceeded acceptable levels

   Type
   Fault

   Severity
   Major

   Description
   The number of errors associated with this CPU has exceeded acceptable levels.

   Automated Response
   An attempt will be made to remove this CPU from service.

   Impact
   Performance of this system may be affected.

   Suggested Action for System Administrator
   Schedule a repair procedure to replace the affected CPU.  Use fmdump -v -u
   <EVENT_ID> to identify the module.

   Details
   This message indicates that the Solaris Fault Manager has received a report
   from a CPU indicating that an uncorrectable Level 1 Data Translation
   Look-aside Buffer error has occurred, and a CPU fault has been diagnosed.
   System performance may have been affected. Faults of this nature typically
   result in a system reset and reboot.
   ...

Similarly, FMA event class and payload specifications may be displayed:

   # erevent -L "ereport.io.pci.*"
   ereport.io.pci.dpe -- Detected data parity error
   ereport.io.pci.dto -- Master never reissued read
   ...

   # erevent -a "ereport.io.pci.dpe"
   ereport.io.pci.dpe -- Detected data parity error

   Event Payload
   Name         Type            Description
   ENA          uint64_t        Error Numeric Association
   class        string          The event class
   detector     fmri            The resource that detected the error
   version      uint8_t         The major version of this event class
   pci-bdg-cntl uint16_t        PCI bridge control register
   pci-command  uint16_t        PCI Local Bus configuration command register
   pci-pa       uint64_t        PCI errant physical address
   pci-status   uint16_t        PCI Local Bus configuration status register
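
The payload table above can be turned into structured records for tooling.
The following is a rough Python sketch that assumes the Name/Type/Description
layout shown, with the description as free text running to the end of the
line. PayloadMember and parse_payload_table are hypothetical names used only
for illustration.

```python
from typing import List, NamedTuple


class PayloadMember(NamedTuple):
    """One payload member: its name, its type, and a free-text description."""
    name: str
    type: str
    description: str


def parse_payload_table(text: str) -> List[PayloadMember]:
    """Parse the rows that follow a 'Name  Type  Description' header.

    Assumption: names and types contain no whitespace, so splitting on the
    first two runs of whitespace leaves the description intact.
    """
    members = []
    in_table = False
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        if parts[:2] == ["Name", "Type"]:
            in_table = True  # header row found; data rows follow
            continue
        if in_table and len(parts) == 3:
            members.append(PayloadMember(*parts))
    return members


# Sample input copied from the payload table above (abbreviated).
SAMPLE_PAYLOAD = """\
Event Payload
Name         Type            Description
ENA          uint64_t        Error Numeric Association
detector     fmri            The resource that detected the error
pci-bdg-cntl uint16_t        PCI bridge control register
"""

if __name__ == "__main__":
    for member in parse_payload_table(SAMPLE_PAYLOAD):
        print(f"{member.name}: {member.type} ({member.description})")
```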

The OpenSolaris event registry source will be regularly updated to coincide
with updates to message IDs at http://sun.com/msg.  Community contributions
to the event registry source will be permitted and sponsored for developers
contributing fault management error handling and diagnosis software for
hardware and software components that are FMA capable and aware.

