Hi,
I have written a comprehensive Dell server probe that I wanted to share with
the group.
I am not sure whether this list supports attachments. If you don't get it, please
let me know and I will send it to you directly.
This probe works very well with any Dell server with OpenManage 4.0 and
above installed. There are notes in the probe file to adjust it for
Blade servers as well.
It only works with IM 4.5, as it makes heavy use of table views. Please
read the comments in the probe file for details.
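For anyone new to table views, here is a rough Python sketch of how I understand a column reference in the probe (for example ChassisInfoTable.9) to line up with a full OID: the table's entry OID with the column number appended. The TABLES dictionary and the column_oid helper are made up for illustration and are not part of the probe or of IM; the two OIDs are copied from the probe file.

```python
# Hypothetical helper for illustration only; not part of the probe or of IM.
# The OIDs below are copied from the probe's on-demand table definitions.
TABLES = {
    "ChassisInfoTable": "1.3.6.1.4.1.674.10892.1.300.10.1",
    "PowerSupplyTable": "1.3.6.1.4.1.674.10892.1.600.12.1",
}


def column_oid(reference: str) -> str:
    """Expand 'TableName.N' to '<table entry OID>.N'."""
    table, sep, column = reference.rpartition(".")
    if not sep or not column.isdigit():
        raise ValueError(f"expected 'TableName.N', got {reference!r}")
    if table not in TABLES:
        raise KeyError(f"unknown table: {table}")
    return f"{TABLES[table]}.{column}"


if __name__ == "__main__":
    # Column 9 of the chassis table is the model name in the probe.
    print(column_oid("ChassisInfoTable.9"))
```

This can be handy when checking a column by hand with an SNMP walk before adding it to a table definition.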
I have found that with this probe I no longer need to run a Dell IT
Assistant box.
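As a rough illustration of what the probe does with each status value, here is a minimal Python sketch of the mapping from Dell status codes to IM alert levels, based on the calculations and thresholds in the probe file. The names status_text and im_severity are hypothetical, and treating any value with no matching threshold (such as "Other" or an unlisted code) as OK is my assumption about how IM behaves.

```python
# Illustrative sketch only; names are hypothetical and not part of the probe.
# Codes 1-5 mirror the value-to-text calculations in the probe file.
DELL_STATUS_TEXT = {
    1: "Other",
    2: "Unknown",
    3: "Normal",
    4: "Non-Critical",
    5: "Critical",
}

# Mirrors the threshold section: only these three texts raise an alert.
IM_SEVERITY = {
    "Unknown": "Warning",
    "Non-Critical": "Alarm",
    "Critical": "Critical",
}


def status_text(value: int) -> str:
    """Return the status text, or an empty string for an unlisted code."""
    return DELL_STATUS_TEXT.get(value, "")


def im_severity(value: int) -> str:
    """Return the IM alert level; assumed to be OK when no threshold matches."""
    return IM_SEVERITY.get(status_text(value), "OK")


if __name__ == "__main__":
    for code in (2, 3, 4, 5):
        print(code, status_text(code), im_severity(code))
```

The same mapping is applied to every subsystem the probe polls (power, cooling, voltage, temperature, processor, memory, storage, and the global system status).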
I am also working on similar probes for HP and IBM gear. Please let me
know if you find this of value or interest; I would be happy to share.
Cheers
Joe Vogt
Systems Engineer
Electronic Arts Canada
[EMAIL PROTECTED]
<header>
type = "custom-snmp"
package = "com.private.snmp.dell.poweredge"
probe_name = "snmp.dell.poweredge"
human_name = "Dell PowerEdge Server Probe"
version = "1.0"
address_type = "IP"
port_number = "161"
FLAGS=MINIMAL,NOLINKS
</header>
<!--This is a comprehensive system state monitor probe for Dell PowerEdge
servers. -->
<!--This probe references the following MIB files:
-->
<!--
-->
<!-- MIB-Dell-10892 (10892.mib)
-->
<!-- StorageManagement-MIB (dcstorage.mib)
-->
<!-- DCS3RMT-MIB (dcs3rmt.mib)
-->
<!--
-->
<!--These MIB files must be imported into IM 4.5 in order for the OID references
to work.-->
<!--These files are available in the Openmanage\support\mibs folder on any Dell
host -->
<!--with SA installed.
-->
<!--In addition, the Dell Instrumentation drivers must be installed on the
host. -->
<!--
-->
<!--This probe has been tested on various Dell 1855, 1955, 2650, 2850, 2950,
1750, 1850 -->
<!--and 6650 servers.
-->
<!--It is helpful to have the latest version of Dell OpenManage installed on
the host -->
<!--and all firmware updated to latest levels.
-->
<!--
-->
<!--Note regarding 1855 and 1955 Blade Servers
-->
<!--
-->
<!--You may want to comment out the Power and Cooling lines in the
snmp-device-display section -->
<!--as blades do not report these devices.
-->
<!--The sections below contain a few other commented-out lines that enable more
-->
<!--functionality for Dell blade servers.
-->
<!--
-->
<!--Done in Notepad, please forgive my formatting
-->
<description>
\GB++\Dell PowerEdge Server Probe\P\
\B-\This is a comprehensive system state monitor probe for Dell PowerEdge
servers.\P\
\B-\Probe Details:\p\
Polls the overall System Status values for the following subsystems:
- Global System
- Chassis
- Power
- Cooling
- Voltage
- Temperature
- Storage
Status severity levels translate to IM as follows:
\BM\Dell Status - IM Status\P\
\M\Unknown - Warn
Normal - OK
Non-Critical - Alarm
Critical - Critical \P\
\G\Table views for all subsystems are available, allowing you to 'drill down'
and see detailed status and configuration information.
Handy links to the monitored host's Dell OpenManage Server Administrator web
interface as well as the host's DRAC remote access controller are presented in
the status display for convenience.
Further functionality for 1855 and 1955 blade servers is available in the
custom probe file.
</description>
<parameters>
<!--
-->
<!--"Blade Enclosure Name" = "Name of Blade Enclosure"
-->
<!--"Blade Enclosure Mgmt Address" = "IP address or DNS name"
-->
<!--
-->
<!--Uncomment the lines above to enable capturing the Dell Blade Enclosure
-->
<!--name and mgmt URL. This information is displayed in the status window
-->
<!--if the lines in snmp-device-display are enabled
-->
<!--
-->
</parameters>
<snmp-device-properties>
nomib2="true"
</snmp-device-properties>
<snmp-device-variables>
<!--This section defines the values we want to poll against, -->
<!--alert on, and display on the status page as live data. -->
SystemModel, MIB-Dell-10892::chassisModelName+, DEFAULT, " "
SystemName, MIB-Dell-10892::chassisSystemName+, DEFAULT, " "
DRACIP, DCS3RMT-MIB::remoteAccessNICCurrentIPAddress+, DEFAULT, " "
OSname, MIB-Dell-10892::operatingSystemOperatingSystemName+, DEFAULT, " "
SystemStatus, MIB-Dell-10892::systemStateGlobalSystemStatus+, DEFAULT, " "
PowerUnitStatus, MIB-Dell-10892::powerUnitStatus+, DEFAULT, " "
CoolingUnitStatus, MIB-Dell-10892::coolingUnitStatus+, DEFAULT, " "
VoltageStatus, MIB-Dell-10892::systemStateVoltageStatusCombined+, DEFAULT, " "
TempuratureStatus, MIB-Dell-10892::systemStateTemperatureStatusCombined+, DEFAULT, " "
ProcessorStatus, MIB-Dell-10892::systemStateProcessorDeviceStatusCombined+, DEFAULT, " "
MemoryStatus, MIB-Dell-10892::systemStateMemoryDeviceStatusCombined+, DEFAULT, " "
StorageStatus, StorageManagement-MIB::agentGlobalSystemStatus+, DEFAULT, " "
<!--
-->
<!--BladeSlotNumber, MIB-Dell-10892::baseBoardLocationName+, DEFAULT, " " -->
<!--
-->
<!--Uncomment the line above to add BladeSlotNumber, which is helpful with 1855s
and 1955s. -->
<!--You may also find it helpful to uncomment the "Blade Enclosure Name" and
"Blade Enclosure Mgmt Address" lines in the parameters section. -->
<!--It is not possible to probe the enclosure name or management interface
address from the blade; however, -->
<!--it is very convenient to have this information displayed on the status
page. -->
<!--
-->
<!--This section performs value to text conversions
-->
SystemStatusText, ($SystemStatus=1)?"Other":($SystemStatus=2)?"Unknown":($SystemStatus=3)?"Normal":($SystemStatus=4)?"Non-Critical":($SystemStatus=5)?"Critical":"", CALCULATION
PowerUnitStatusText, ($PowerUnitStatus=1)?"Other":($PowerUnitStatus=2)?"Unknown":($PowerUnitStatus=3)?"Normal":($PowerUnitStatus=4)?"Non-Critical":($PowerUnitStatus=5)?"Critical":"", CALCULATION
CoolingUnitStatusText, ($CoolingUnitStatus=1)?"Other":($CoolingUnitStatus=2)?"Unknown":($CoolingUnitStatus=3)?"Normal":($CoolingUnitStatus=4)?"Non-Critical":($CoolingUnitStatus=5)?"Critical":"", CALCULATION
VoltageStatusText, ($VoltageStatus=1)?"Other":($VoltageStatus=2)?"Unknown":($VoltageStatus=3)?"Normal":($VoltageStatus=4)?"Non-Critical":($VoltageStatus=5)?"Critical":"", CALCULATION
TempuratureStatusText, ($TempuratureStatus=1)?"Other":($TempuratureStatus=2)?"Unknown":($TempuratureStatus=3)?"Normal":($TempuratureStatus=4)?"Non-Critical":($TempuratureStatus=5)?"Critical":"", CALCULATION
ProcessorStatusText, ($ProcessorStatus=1)?"Other":($ProcessorStatus=2)?"Unknown":($ProcessorStatus=3)?"Normal":($ProcessorStatus=4)?"Non-Critical":($ProcessorStatus=5)?"Critical":"", CALCULATION
MemoryStatusText, ($MemoryStatus=1)?"Other":($MemoryStatus=2)?"Unknown":($MemoryStatus=3)?"Normal":($MemoryStatus=4)?"Non-Critical":($MemoryStatus=5)?"Critical":"", CALCULATION
StorageStatusText, ($StorageStatus=1)?"Other":($StorageStatus=2)?"Unknown":($StorageStatus=3)?"Normal":($StorageStatus=4)?"Non-Critical":($StorageStatus=5)?"Critical":"", CALCULATION
</snmp-device-variables>
<!--Now we list out all the tables -->
<snmp-device-variables-ondemand>
ChassisInfoTable, 1.3.6.1.4.1.674.10892.1.300.10.1, TABLE, "Chassis Information Table"
ChassisInfoTable/Type, ChassisInfoTable.6, DEFAULT, "Chassis Type"
ChassisInfoTable/Model, ChassisInfoTable.9, DEFAULT, "Model"
ChassisInfoTable/Status, ChassisInfoTable.4, DEFAULT, "Status"
ChassisInfoTable/Name, ChassisInfoTable.7, DEFAULT, "Name"
ChassisInfoTable/AssetTag, ChassisInfoTable.10, DEFAULT, "Asset Tag"
ChassisInfoTable/ServiceTag, ChassisInfoTable.11, DEFAULT, "Dell Service Tag"
ChassisInfoTable/BootTime, ChassisInfoTable.16, DEFAULT, "Boot Time"
ChassisInfoTable/SysDate, ChassisInfoTable.17, DEFAULT, "System Date"
ESMEventLogEntry, 1.3.6.1.4.1.674.10892.1.300.40.1, TABLE, "ESM Event Log Table"
ESMEventLogEntry/EventLogDateName, ESMEventLogEntry.8, DEFAULT, "Log Date and Time"
ESMEventLogEntry/EventLogSeverityStatus, ESMEventLogEntry.7, DEFAULT, "Entry Severity"
ESMEventLogEntry/EventLogRecord, ESMEventLogEntry.5, STRING, "Log Entry Record"
systemStateTable, 1.3.6.1.4.1.674.10892.1.200.10.1, TABLE, "System State Table"
systemStateTable/GlobalSystemStatus, systemStateTable.2, DEFAULT, "Global System Status"
systemStateTable/ChassisStatus, systemStateTable.4, DEFAULT, "Chassis Status"
systemStateTable/PowerUnitRedundancyStatus, systemStateTable.6, DEFAULT, "Power Redundancy Status"
systemStateTable/VoltageStatus, systemStateTable.12, DEFAULT, "Global Voltage Status"
systemStateTable/CoolingRedundancyUnitStatus, systemStateTable.18, DEFAULT, "Cooling Redundancy Status"
systemStateTable/TempuratureStatus, systemStateTable.24, DEFAULT, "Global Temperature Status"
systemStateTable/MemoryStatus, systemStateTable.27, DEFAULT, "Global Memory Status"
systemStateTable/ChassisIntrusionStatus, systemStateTable.30, DEFAULT, "Chassis Intrusion Status"
PowerSupplyTable, 1.3.6.1.4.1.674.10892.1.600.12.1, TABLE, "Power Supply Table"
PowerSupplyTable/LocationName, PowerSupplyTable.8, DEFAULT, "Name"
PowerSupplyTable/SupplyType, PowerSupplyTable.7, DEFAULT, "Power Supply Type"
PowerSupplyTable/Status, PowerSupplyTable.5, DEFAULT, "Status"
PowerSupplyTable/OutputWatts, PowerSupplyTable.6, DEFAULT, "Output in 10ths of Watts"
VoltageProbeTable, 1.3.6.1.4.1.674.10892.1.600.20.1, TABLE, "Voltage Probe Table"
VoltageProbeTable/LocationName, VoltageProbeTable.8, DEFAULT, "Probe Location"
VoltageProbeTable/Type, VoltageProbeTable.7, DEFAULT, "Probe Type"
VoltageProbeTable/Status, VoltageProbeTable.5, DEFAULT, "Probe Status"
VoltageProbeTable/Reading, VoltageProbeTable.6, DEFAULT, "Probe Reading"
VoltageProbeTable/UpperCriticalThresh, VoltageProbeTable.10, DEFAULT, "Upper Critical Threshold"
VoltageProbeTable/UpperNonCriticalThresh, VoltageProbeTable.11, DEFAULT, "Upper NonCritical Threshold"
VoltageProbeTable/LowerNonCriticalThresh, VoltageProbeTable.12, DEFAULT, "Lower NonCritical Threshold"
VoltageProbeTable/LowerCritcalThresh, VoltageProbeTable.13, DEFAULT, "Lower Critical Threshold"
CoolingFanTable, 1.3.6.1.4.1.674.10892.1.700.12.1, TABLE, "Cooling Fan Table"
CoolingFanTable/LocationName, CoolingFanTable.8, DEFAULT, "Fan Location"
CoolingFanTable/Type, CoolingFanTable.7, DEFAULT, "Fan Type"
CoolingFanTable/Status, CoolingFanTable.5, DEFAULT, "Fan Status"
CoolingFanTable/Reading, CoolingFanTable.6, DEFAULT, "Fan Reading"
CoolingFanTable/UpperCriticalThresh, CoolingFanTable.10, DEFAULT, "Upper Critical Threshold"
CoolingFanTable/UpperNonCriticalThresh, CoolingFanTable.11, DEFAULT, "Upper NonCritical Threshold"
CoolingFanTable/LowerNonCriticalThresh, CoolingFanTable.12, DEFAULT, "Lower NonCritical Threshold"
CoolingFanTable/LowerCriticalThresh, CoolingFanTable.13, DEFAULT, "Lower Critical Threshold"
TempuratureProbeTable, 1.3.6.1.4.1.674.10892.1.700.20.1, TABLE, "Temperature Probe Table"
TempuratureProbeTable/LocationName, TempuratureProbeTable.8, DEFAULT, "Probe Location"
TempuratureProbeTable/Type, TempuratureProbeTable.7, DEFAULT, "Probe Type"
TempuratureProbeTable/Status, TempuratureProbeTable.5, DEFAULT, "Probe Status"
TempuratureProbeTable/Reading, TempuratureProbeTable.6, DEFAULT, "Probe Reading in 10ths of Degrees C"
TempuratureProbeTable/UpperCriticalThresh, TempuratureProbeTable.10, DEFAULT, "Upper Critical Threshold"
TempuratureProbeTable/UpperNonCriticalThresh, TempuratureProbeTable.11, DEFAULT, "Upper NonCritical Threshold"
TempuratureProbeTable/LowerNonCriticalThresh, TempuratureProbeTable.12, DEFAULT, "Lower NonCritical Threshold"
TempuratureProbeTable/LowerCriticalThresh, TempuratureProbeTable.13, DEFAULT, "Lower Critical Threshold"
ProcessorDeviceTable, 1.3.6.1.4.1.674.10892.1.1100.30.1, TABLE, "Processor Device Table"
ProcessorDeviceTable/ManufacturerName, ProcessorDeviceTable.8, DEFAULT, "Manufacturer Name"
ProcessorDeviceTable/DeviceFamily, ProcessorDeviceTable.10, DEFAULT, "Processor Family"
ProcessorDeviceTable/CurrentSpeed, ProcessorDeviceTable.12, DEFAULT, "Current Speed"
ProcessorDeviceTable/CoreCount, ProcessorDeviceTable.17, DEFAULT, "Core Count"
ProcessorDeviceTable/Voltage, ProcessorDeviceTable.14, DEFAULT, "Voltage"
ProcessorDeviceTable/Status, ProcessorDeviceTable.5, DEFAULT, "Status"
MemoryDeviceTable, 1.3.6.1.4.1.674.10892.1.1100.50.1, TABLE, "Memory Device Table"
MemoryDeviceTable/LocationName, MemoryDeviceTable.8, DEFAULT, "Location Name"
MemoryDeviceTable/BankLocationName, MemoryDeviceTable.10, DEFAULT, "Bank Location Name"
MemoryDeviceTable/FormFactor, MemoryDeviceTable.12, DEFAULT, "Form Factor"
MemoryDeviceTable/Type, MemoryDeviceTable.7, DEFAULT, "Type"
MemoryDeviceTable/Details, MemoryDeviceTable.11, DEFAULT, "Details"
MemoryDeviceTable/Size, MemoryDeviceTable.14, DEFAULT, "Size"
MemoryDeviceTable/Speed, MemoryDeviceTable.15, DEFAULT, "Speed"
MemoryDeviceTable/Status, MemoryDeviceTable.5, DEFAULT, "Status"
MemoryDeviceTable/ErrorCount, MemoryDeviceTable.9, DEFAULT, "ECC Error Count"
PCIDeviceTable, 1.3.6.1.4.1.674.10892.1.1100.80.1, TABLE, "PCI Device Table"
PCIDeviceTable/PCISlot, PCIDeviceTable.6, DEFAULT, "PCI Slot"
PCIDeviceTable/Status, PCIDeviceTable.5, DEFAULT, "Status"
PCIDeviceTable/Manufacturer, PCIDeviceTable.8, DEFAULT, "Manufacturer"
PCIDeviceTable/Description, PCIDeviceTable.9, DEFAULT, "Description"
StorageControllerTable, 1.3.6.1.4.1.674.10893.1.20.130.1.1, TABLE, "Storage Controller Table"
StorageControllerTable/Instance, StorageControllerTable.1, DEFAULT, "Controller Instance"
StorageControllerTable/Name, StorageControllerTable.2, DEFAULT, "Controller Name"
StorageControllerTable/State, StorageControllerTable.5, DEFAULT, "Controller State"
StorageControllerTable/Firmware, StorageControllerTable.8, DEFAULT, "Controller Firmware Version"
StorageControllerTable/PhysicalDevices, StorageControllerTable.11, DEFAULT, "Number of Physical Devices including Controller and Disks"
StorageControllerTable/VirtualDisk, StorageControllerTable.12, DEFAULT, "Number of Virtual Disks"
StorageDiskArrayTable, 1.3.6.1.4.1.674.10893.1.20.130.4.1, TABLE, "Disk Array Table"
StorageDiskArrayTable/DiskNumber, StorageDiskArrayTable.1, DEFAULT, "Instance Number"
StorageDiskArrayTable/DiskName, StorageDiskArrayTable.2, DEFAULT, "Disk Name"
StorageDiskArrayTable/DiskVendor, StorageDiskArrayTable.3, DEFAULT, "Vendor"
StorageDiskArrayTable/DiskSeverity, StorageDiskArrayTable.4, DEFAULT, "Status"
StorageDiskArrayTable/DiskChannel, StorageDiskArrayTable.10, DEFAULT, "Channel"
StorageDiskArrayTable/ArraySize, StorageDiskArrayTable.11, DEFAULT, "Disk Size in MB"
StorageDiskArrayTable/DiskBusType, StorageDiskArrayTable.21, DEFAULT, "Bus Type"
VirtualDiskTable, 1.3.6.1.4.1.674.10893.1.20.140.1.1, TABLE, "Virtual Disk Table"
VirtualDiskTable/DiskNumber, VirtualDiskTable.1, DEFAULT, "Disk Number"
VirtualDiskTable/DiskName, VirtualDiskTable.2, DEFAULT, "Disk Name"
VirtualDiskTable/DiskState, VirtualDiskTable.4, DEFAULT, "Disk State"
VirtualDiskTable/SizeInMB, VirtualDiskTable.6, DEFAULT, "Disk Size in MB"
VirtualDiskTable/DiskLayout, VirtualDiskTable.13, DEFAULT, "Disk Layout"
FirmwareTable, 1.3.6.1.4.1.674.10892.1.300.60.1, TABLE, "Firmware Table"
FirmwareTable/Type, FirmwareTable.7, DEFAULT, "Firmware Type"
FirmwareTable/TypeName, FirmwareTable.8, DEFAULT, "Firmware Type Name"
FirmwareTable/VersionName, FirmwareTable.11, DEFAULT, "Version"
FirmwareTable/DateName, FirmwareTable.10, DEFAULT, "Version Date"
FirmwareTable/Status, FirmwareTable.5, DEFAULT, "Firmware Status"
DracTable, 1.3.6.1.4.1.674.10892.1.1700.10.1, TABLE, "DRAC Table"
DracTable/InfoName, DracTable.7, DEFAULT, "Name"
DracTable/Description, DracTable.8, DEFAULT, "Description"
DracTable/Version, DracTable.9, DEFAULT, "Version"
DracTable/Status, DracTable.6, DEFAULT, "Status"
DracTable/CurrentIP, DracTable.30, DEFAULT, "Current IP"
DracTable/RAUrl, DracTable.34, DEFAULT, "URL"
OSTable, 1.3.6.1.4.1.674.10892.1.400.10.1, TABLE, "OS Table"
OSTable/OSName, OSTable.6, DEFAULT, "OS Name"
OSTable/OSVersion, OSTable.7, DEFAULT, "OS Version"
OSTable/OSStatus, OSTable.4, DEFAULT, "OS Status"
</snmp-device-variables-ondemand>
<snmp-device-thresholds>
warning: ${PowerUnitStatusText} = "Unknown"
Alarm: ${PowerUnitStatusText} = "Non-Critical"
Critical: ${PowerUnitStatusText} = "Critical"
warning: ${CoolingUnitStatusText} = "Unknown"
Alarm: ${CoolingUnitStatusText} = "Non-Critical"
Critical: ${CoolingUnitStatusText} = "Critical"
warning: ${VoltageStatusText} = "Unknown"
Alarm: ${VoltageStatusText} = "Non-Critical"
Critical: ${VoltageStatusText} = "Critical"
warning: ${TempuratureStatusText} = "Unknown"
Alarm: ${TempuratureStatusText} = "Non-Critical"
Critical: ${TempuratureStatusText} = "Critical"
warning: ${ProcessorStatusText} = "Unknown"
Alarm: ${ProcessorStatusText} = "Non-Critical"
Critical: ${ProcessorStatusText} = "Critical"
warning: ${MemoryStatusText} = "Unknown"
Alarm: ${MemoryStatusText} = "Non-Critical"
Critical: ${MemoryStatusText} = "Critical"
warning: ${StorageStatusText} = "Unknown"
Alarm: ${StorageStatusText} = "Non-Critical"
Critical: ${StorageStatusText} = "Critical"
warning: ${SystemStatusText} = "Unknown"
Alarm: ${SystemStatusText} = "Non-Critical"
Critical: ${SystemStatusText} = "Critical"
</snmp-device-thresholds>
<snmp-device-display>
\p\\GM4\
\p\\GMB5+\ Dell OpenManage Information
\p\\pM4-\ Model: \p\\G4B\${SystemModel}
\p\\pM4\ OS: \p\\G4B\${OSname}
\p\\pM4\ Information: \p\\G4\${ChassisInfoTable:Chassis} / \G4\${ESMEventLogEntry:ESM Log} / \G4\${PCIDeviceTable:PCI Devices} / \G4\${ProcessorDeviceTable:Processors} / \G4\${FirmwareTable:Firmware} / \G4\${OSTable:OS} / \G4\${DracTable:DRAC}
\p\\pM4\Global System: \p\\pM0\${SystemStatusText} \G4\table: ${systemStateTable:Global Status}
\p\\pM4\ Power: \p\\pM0\${PowerUnitStatusText} \G4\table: ${PowerSupplyTable:Power Supplies}
\p\\pM4\ Voltage: \p\\pM0\${VoltageStatusText} \G4\table: ${VoltageProbeTable:Voltage Sensors}
\p\\pM4\ Cooling: \p\\pM0\${CoolingUnitStatusText} \G4\table: ${CoolingFanTable:Cooling Fans}
\p\\pM4\ Temperature: \p\\pM0\${TempuratureStatusText} \G4\table: ${TempuratureProbeTable:Temperature Sensors}
\p\\pM4\ Memory: \p\\pM0\${MemoryStatusText} \G4\table: ${MemoryDeviceTable:Memory Probes}
\p\\pM4\ Disk Array: \p\\pM0\${StorageStatusText} \G4\table: ${StorageControllerTable:Controllers} / \G4\${StorageDiskArrayTable:Array Disks} / \G4\${VirtualDiskTable:Volumes}
\p\\pM4\OpenManage Link: \p\\pGB4U\https://${SystemName}:1311\P0\
\p\\pM4\ DRAC Link: \p\\pGB4U\https://${DRACIP}\P0\
<!-- \p\\pM4\ Enclosure Mgmt: \p\\pGB4U\https://${Blade Enclosure Mgmt Address}\P0\ -->
<!-- Uncomment the line above to enable the link to the enclosure mgmt interface. -->
<!-- You may also want to comment out the DRAC Link line, as it will not report on a blade. -->
<!-- \p\\pM4\ Enclosure: \p\\G4\${Blade Enclosure Name} \G4\${BladeSlotNumber} -->
<!-- Uncomment the line above for blade information display. You may want to place -->
<!-- this line higher up in the status window. -->
</snmp-device-display>