+1
rob
cindi wrote:
This project will export the fault management registry of event specifications
and diagnosis article content. The initial delivery to the OpenSolaris
community will include the registry contents and a set of CLIs and
web-based browser tools for accessing event class and payload specifications,
diagnosis messages, and article details.
The initial target audience of this project is system administrators and
developers who want a listing of the possible fault diagnosis messages
and the event class and payload specifications.
For example, the ermsg command may be used to list the diagnosis message IDs
for all AMD Opteron and Athlon 64 processor diagnoses:
# ermsg -a AMD
Dictionary   Entry No.   ID
AMD          1           AMD-8000-1W
AMD          2           AMD-8000-2F
AMD          3           AMD-8000-3K
AMD          4           AMD-8000-48
AMD          5           AMD-8000-5M
AMD          6           AMD-8000-67
AMD          7           AMD-8000-7U
AMD          8           AMD-8000-8L
AMD          9           AMD-8000-9G
AMD          10          AMD-8000-AV
AMD          11          AMD-8000-C0
AMD          12          AMD-8000-DT
AMD          13          AMD-8000-E6
AMD          14          AMD-8000-FN
AMD          15          AMD-8000-G9
...
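For anyone who wants to post-process a listing like the one above, here is a minimal sketch that turns it into structured records. It assumes the output is whitespace-separated columns exactly as shown; parse_listing and DictEntry are hypothetical names for illustration and are not part of the proposed tools.

```python
from typing import List, NamedTuple


class DictEntry(NamedTuple):
    dictionary: str   # dictionary name, e.g. "AMD"
    entry: int        # entry number within the dictionary
    msg_id: str       # diagnosis message ID, e.g. "AMD-8000-1W"


def parse_listing(text: str) -> List[DictEntry]:
    """Parse listing output (assumed format) into DictEntry records."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly three fields with a numeric entry number.
        # The header row and the "..." elision line do not, so they are skipped.
        if len(fields) == 3 and fields[1].isdigit():
            entries.append(DictEntry(fields[0], int(fields[1]), fields[2]))
    return entries


sample = """\
Dictionary Entry No. ID
AMD 1 AMD-8000-1W
AMD 2 AMD-8000-2F
...
"""

for e in parse_listing(sample):
    print(e.dictionary, e.entry, e.msg_id)
```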
Specific message and article detail content may also be displayed:
# ermsg -a AMD-8000-G9
Dictionary   Entry No.   ID
AMD          15          AMD-8000-G9
CPU errors exceeded acceptable levels
Type
    Fault
Severity
    Major
Description
    The number of errors associated with this CPU has exceeded acceptable levels.
Automated Response
    An attempt will be made to remove this CPU from service.
Impact
    Performance of this system may be affected.
Suggested Action for System Administrator
    Schedule a repair procedure to replace the affected CPU. Use fmdump -v -u
    <EVENT_ID> to identify the module.
Details
    This message indicates that the Solaris Fault Manager has received a report
    from a CPU indicating that an uncorrectable Level 1 Data Translation
    Look-aside Buffer error has occurred, and a CPU fault has been diagnosed.
    System performance may have been affected. Faults of this nature typically
    result in a system reset and reboot.
...
FMA event class and payload specifications may be displayed in the same way:
# erevent -L "ereport.io.pci.*"
ereport.io.pci.dpe -- Detected data parity error
ereport.io.pci.dto -- Master never reissued read
...
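The class pattern above looks like a shell-style wildcard. As a rough sketch of that kind of selection, assuming glob semantics (how erevent actually matches patterns is not stated here), Python's fnmatch module behaves as follows. select_classes is a hypothetical helper, and "ereport.example.other" is a made-up class name used only as a non-matching case.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_classes(classes: Iterable[str], pattern: str) -> List[str]:
    """Return the event class names that match a shell-style pattern."""
    # fnmatchcase is case-sensitive on every platform. Note that "*" also
    # matches dots, so "ereport.io.pci.*" would match deeper subclasses too.
    return [c for c in classes if fnmatchcase(c, pattern)]


known = [
    "ereport.io.pci.dpe",
    "ereport.io.pci.dto",
    "ereport.example.other",  # hypothetical, for illustration only
]

print(select_classes(known, "ereport.io.pci.*"))
```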
# erevent -a "ereport.io.pci.dpe"
ereport.io.pci.dpe -- Detected data parity error
Event Payload
Name           Type       Description
ENA            uint64_t   Error Numeric Association
class          string     The event class
detector       fmri       The resource that detected the error
version        uint8_t    The major version of this event class
pci-bdg-cntl   uint16_t   PCI bridge control register
pci-command    uint16_t   PCI Local Bus configuration command register
pci-pa         uint64_t   PCI errant physical address
pci-status     uint16_t   PCI Local Bus configuration status register
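A payload specification like this one maps naturally onto a small data structure. The sketch below records the members listed above and checks a candidate event against them. It is only an illustration: PayloadMember and missing_members are hypothetical names, and treating every listed member as required is an assumption, not something the registry states.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PayloadMember:
    name: str
    type: str
    description: str


# Members copied from the ereport.io.pci.dpe payload listing above.
PCI_DPE_PAYLOAD = [
    PayloadMember("ENA", "uint64_t", "Error Numeric Association"),
    PayloadMember("class", "string", "The event class"),
    PayloadMember("detector", "fmri", "The resource that detected the error"),
    PayloadMember("version", "uint8_t", "The major version of this event class"),
    PayloadMember("pci-bdg-cntl", "uint16_t", "PCI bridge control register"),
    PayloadMember("pci-command", "uint16_t",
                  "PCI Local Bus configuration command register"),
    PayloadMember("pci-pa", "uint64_t", "PCI errant physical address"),
    PayloadMember("pci-status", "uint16_t",
                  "PCI Local Bus configuration status register"),
]


def missing_members(event: Dict[str, object],
                    spec: List[PayloadMember]) -> List[str]:
    """List spec member names absent from an event (assumes all are required)."""
    return [m.name for m in spec if m.name not in event]


# A deliberately incomplete event, to show what the check reports.
partial_event = {"class": "ereport.io.pci.dpe", "ENA": 1}
print(missing_members(partial_event, PCI_DPE_PAYLOAD))
```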
The OpenSolaris event registry source will be regularly updated to coincide
with updates to message IDs at http://sun.com/msg. Community contributions
to the event registry source will be permitted and sponsored for developers
contributing fault management error handling and diagnosis software for
hardware and software components that are FMA capable and aware.
Cynthia McGuire
Senior Staff Engineer
Solaris Kernel RAS
[EMAIL PROTECTED]
------------------------------------------------------------------------
_______________________________________________
opensolaris-discuss mailing list
[email protected]