Re: [Nagios-users] check_Openmanage trouble
Weberskirch, Timo timo.weberski...@offlimits-it.com writes:

> the check_openmanage --no-storage option works (surely without any physical disk... :( ). I was on the phone with the Dell Pro Support. They told me that the MD3 only shows the RAID disk information (not the physical disk information) to external devices. They also told me that there is no way to filter out the SAS card in OMSA. I have to live with the "--no-storage" option...

Hmm.. Ok, so this particular server doesn't have any storage other than the SAS card (connected to the MD3xxx), which OMSA can't manage? If so, that is exactly what the '--no-storage' option is for :)

You should use the '--no-storage' option if

  1. The server has no storage, which is entirely possible; or
  2. The only storage present is something that OMSA doesn't recognize

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo

--
Get 100% visibility into Java/.NET code with AppDynamics Lite! It's a free troubleshooting tool designed for production. Get down to code-level detail for bottlenecks, with 2% overhead. Download for free and get started troubleshooting in minutes.
http://pubads.g.doubleclick.net/gampad/clk?id=48897031iu=/4140/ostg.clktrk
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] check_Openmanage trouble
Weberskirch, Timo timo.weberski...@offlimits-it.com writes:

> thank you all for your fast and helpful response. Unfortunately the problem persists. Is there a way to filter out the (in my opinion faulty) SAS card?

Storage components are tightly interconnected, so from the plugin side your only option is to not check storage at all:

  check_openmanage --no-storage

But I still believe that this is a software problem, i.e. in OMSA.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_Openmanage trouble
Rich rerc...@pha.jhu.edu writes:

> Usually, when I've seen this, it's been after doing an upgrade of an existing OMSA install (= 6.x to 7.x). In general, I haven't found a good way to resolve it other than automating a complete uninstall of OMSA prior to installing the newer version.

Yes, I think the logical next step in this case is to do a complete uninstall, then reinstall of OMSA on the host. The problem is in OMSA and must be fixed there. The plugin is simply complaining that OMSA isn't responding as expected.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_Openmanage trouble
Weberskirch, Timo timo.weberski...@offlimits-it.com writes:

> maybe one of you has the same problem with the check_openmanage plugin... Last week we installed two new Dell PowerEdge R720 with OMSA v 7.3.0 (check_openmanage version: 3.7.10). Every time I try to check my server I get this error message:
>
>   "SNMP ERROR [storage / pdisk]: Requested entries are empty or do not exist."

Hello Timo,

There seems to be some sort of issue with the Openmanage installation on this server. First thing to do is double-check that everything is installed properly. On a RHEL6 system, the following storage related RPM packages should be installed:

  # rpm -qa|grep srvadmin-storage
  srvadmin-storageservices-7.3.0-4.4.1.el6.x86_64
  srvadmin-storage-7.3.0-4.93.2.el6.x86_64
  srvadmin-storage-cli-7.3.0-4.93.2.el6.x86_64
  srvadmin-storageservices-snmp-7.3.0-4.4.1.el6.x86_64
  srvadmin-storage-snmp-7.3.0-4.93.2.el6.x86_64
  srvadmin-storageservices-cli-7.3.0-4.4.1.el6.x86_64

Do you see any physical disks in the Openmanage Web Console? (point your browser to https://server-ip:1311/ and log in as root)

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
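The package check in the message above can be scripted. This is only a minimal sketch, assuming the six package names from the RHEL6 listing also apply to your OMSA version; the function name missing_storage_pkgs is made up for illustration. It reads `rpm -qa` output on stdin and prints every expected srvadmin storage package that is absent:

```sh
#!/bin/sh
# Hypothetical helper (not part of OMSA or check_openmanage): report which of
# the six srvadmin storage packages listed in this thread are missing from an
# `rpm -qa` listing supplied on stdin.
missing_storage_pkgs() {
    installed=$(cat)
    for pkg in srvadmin-storageservices srvadmin-storage srvadmin-storage-cli \
               srvadmin-storageservices-snmp srvadmin-storage-snmp \
               srvadmin-storageservices-cli; do
        # Require "<name>-<digit>" at the start of a line, so that
        # srvadmin-storage does not also match srvadmin-storage-cli.
        if ! printf '%s\n' "$installed" | grep -q "^${pkg}-[0-9]"; then
            printf '%s\n' "$pkg"
        fi
    done
}
```

On the monitored host it would be used as `rpm -qa | missing_storage_pkgs`; empty output means all six packages were found.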
Re: [Nagios-users] rpmbuild nagios-3.5.0
alexus ale...@gmail.com writes:

> I'm unable to build RPM w/ nagios 3.5.0, last one that worked for me was 3.2.3. any ideas/suggestions?

I'd recommend using the already prebuilt package for rhel6 which is available from EPEL[1]. Add the EPEL repo and you can simply do

  yum install nagios

and be done :)

[1] http://fedoraproject.org/wiki/EPEL

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage improvement request
John Skarbek john.skar...@nextcentury.com writes:

> I've recently deployed the check_openmanage script and it works very well. Except for hosts that run esxi. Unless I'm doing something wrong.

You're not doing anything wrong. Openmanage, when deployed on ESXi, doesn't have the necessary capabilities for it to work.

> I've discovered that Open Manage doesn't broadcast its OIDs through ESXi like it would if it were a linux or windows host. However I did find that the iDRAC7 does have similar snmp responses that I'd like to capture. However when pointing check_openmanage to the drac interface, I get the message indicating that OMSA must not be installed correctly. However, looking into the script I found:
>
>   my $chassisModelName = '1.3.6.1.4.1.674.10892.1.300.10.1.9.1';
>
> Which does indeed NOT exist. However, a similar OID with the same information we are looking for is located here:
>
>   $chassisModelName = '1.3.6.1.4.1.674.10892.5.1.3.12.0';

Actually, the OID is 1.3.6.1.4.1.674.10892.5.4.300.10.1.9.1. I've toyed around with this a bit, and for the most part you can simply replace 1.3.6.1.4.1.674.10892.1 with 1.3.6.1.4.1.674.10892.5.4. Same goes for storage OIDs, to a degree.

> After modifying the script a little bit I was able to get past that, but now check_openmanage is complaining, "SNMP ERROR [memory]: The requested entries are empty or do not exist." I presume the entire set of OIDs is in a different spot when being checked through the drac versus the standard windows snmp service. I would love to assist in enhancing this script, but I'm not sure how I should start. Let me know who I should contact, or feel free to reach out to me to assist with this awesome plugin.

I have a modified prealpha version for testing, available in the test branch in git:

  http://git.uio.no/git/?p=check_openmanage.git;a=shortlog;h=refs/heads/test

Note that it's NOT production ready, I have only done some very limited testing. I had to simplify some stuff:

  * Storage: The storage OIDs from the iDRAC7 are somewhat different, compared to Openmanage. Some information that the plugin needs is not available, such as numbered identifiers for components (used in blacklisting). There are even some OIDs that aren't present in Openmanage. In short, it's a mess, and the storage bit is very simplistic. Perhaps the missing info will be added in a later firmware release, we can only hope.

  * ESM health OIDs are missing completely, so ESM health check is omitted. Same for SD card check.

To use the new feature you have to specify '--idrac', like this:

  check_openmanage --idrac -H idrac-ip

Test it, break it and tell me what you think :)

I've noticed that neither the rollup-status nor the component-status for controllers catches that the controller is actually degraded from out-of-date firmware. Hopefully it's an anomaly that doesn't apply to other aspects of controllers, or other components.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
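The prefix swap described in the message above (1.3.6.1.4.1.674.10892.1 becoming 1.3.6.1.4.1.674.10892.5.4) can be tried out with a small helper. This is a sketch only: the function and variable names are invented for illustration, and as the message says, storage OIDs follow the rule only to a degree, so treat the output as a starting point for an snmpwalk rather than a guaranteed mapping.

```sh
#!/bin/sh
# Hypothetical helper: rewrite an OMSA OID to the iDRAC7 subtree by swapping
# the prefix, as described in this thread. Not taken from the plugin itself.
OMSA_PREFIX='1.3.6.1.4.1.674.10892.1'
IDRAC_PREFIX='1.3.6.1.4.1.674.10892.5.4'

omsa_to_idrac_oid() {
    case "$1" in
        "$OMSA_PREFIX".*)
            # Strip the OMSA prefix and prepend the iDRAC one.
            rest=${1#"$OMSA_PREFIX"}
            printf '%s\n' "${IDRAC_PREFIX}${rest}"
            ;;
        *)
            # Not under the OMSA subtree: print unchanged.
            printf '%s\n' "$1"
            ;;
    esac
}
```

For example, `omsa_to_idrac_oid 1.3.6.1.4.1.674.10892.1.300.10.1.9.1` gives the chassis model name OID quoted in the reply.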
Re: [Nagios-users] Problem with check_openmanage plugin and storage
Nic Bernstein n...@onlight.com writes:

> > Regarding the non-certified disks problem... There is a special blacklisting keyword to suppress the message about non-certified disks:
> >
> >   check_openmanage -b pdisk_cert=all
> >
> > Please try this and see if it resolves your issue. Using blacklisting should also disable the global health check.
>
> Ah, that's just what we need. Much appreciated... No, that doesn't seem to be in my version (3.7.9, downloaded yesterday)
>
>   onlight@monitor:~$ perl check_openmanage -H host -C secret -b pdisk_cert=all
>   Physical Disk 0:1:0 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online
>   Physical Disk 0:1:1 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online
>   onlight@monitor:~$ echo $?
>   1
>
> I guess I'll wait for a patch.

Are you sure you didn't test this with the 7.1.0 workaround manually removed?

> Say Trond, I sent you some notes last week about enhancements we made to your check_linux_bonding plugin. Would you prefer I re-post those to the list instead?

Sorry for being non-responsive of late. I've been swamped at work lately and have attained somewhat of an email backlog. No need to resend :)

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Nagios openmanage ERROR: XML transformation failed
Lorenz, Stephan stephan.lor...@medizin.uni-leipzig.de writes:

> since installing libxml2, libxml2-devel and curl, the Nagios installation on our Dell R720xd server reports XML errors.
>
>   Problem running 'omreport storage controller': Error! XML Transformation failed
>   Problem running 'omreport chassis memory': Error! XML Transformation failed
>   Problem running 'omreport chassis fans': Error! XML Transformation failed
>   Problem running 'omreport chassis pwrsupplies': Error! XML Transformation failed
>   Problem running 'omreport chassis temps': Error! XML Transformation failed
>   Problem running 'omreport chassis processors': Error! XML Transformation failed
>   Problem running 'omreport chassis volts': Error! XML Transformation failed
>   Problem running 'omreport chassis batteries': Error! XML Transformation failed
>   Problem running 'omreport chassis pwrmonitoring': Error! XML Transformation failed
>   Problem running 'omreport chassis intrusion': Error! XML Transformation failed
>   Problem running 'omreport chassis removableflashmedia': Error! XML Transformation failed
>   Chassis Service Tag is bogus: 'N/A'
>
> I am using Nagios 3.5.1, check_openmanage 3.7.9, Openmanage 7.2.0 on Centos 6.4 2.6.32-358.11.1.el6.centos.plus.x86_64. When I run check_openmanage or omreport manually everything is fine. I tried to reinstall nagios-plugins-openmanage and php-xml for a start, but that did not help. I cannot remove libxml2 and the rest since it is needed elsewhere. Does anyone have a suggestion of how to fix this error?

Given that it works when you run the commands manually I'm suspecting some sort of permission issue. Try running the commands as the NRPE user, and also try running it from Nagios with SELinux in permissive mode (needs to be run by the NRPE daemon with the correct SELinux domain).

Check out this link about using check_openmanage with SELinux in enforcing mode:

  http://folk.uio.no/trondham/software/check_openmanage.html#selinux-considerations

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage plugin and storage
Nic Bernstein n...@onlight.com writes:

> We've recently been experimenting with Trond Hasle Amundsen's check_openmanage on a large network with about a hundred Dell servers of various ages, capabilities, etc. Mostly PE-2950, R210, R410 and R720. Much thanks to Trond for all his great work on Nagios plugins and other projects, by the way.
>
> We've hit a wall, however, with the storage monitoring aspects of this plugin. For example, here's a quite specific case. This is a new PE R720, in debug:
>
> onlight@monitor:~$ check_openmanage -H host -C secret -d
>    System:      PowerEdge R720           OMSA version:    7.1.0
>    ServiceTag:  ###                      Plugin version:  3.7.9
>    BIOS/date:   1.2.6 05/10/2012         Checking mode:   SNMPv2c UDP/IPv4
> -----------------------------------------------------------------------------
>    Storage Components
> =============================================================================
>   STATE  |    ID    |  MESSAGE TEXT
> ---------+----------+--------------------------------------------------------
>       OK |        0 | Controller 0 [PERC H310 Mini] is Ready
>  WARNING |  0:0:1:0 | Physical Disk 0:1:0 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online, Not Certified
>  WARNING |  0:0:1:1 | Physical Disk 0:1:1 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online, Not Certified
>       OK |      0:0 | Logical Drive '/dev/sda' [RAID-1, 1862.50 GB] is Ready
>       OK |      0:0 | Connector 0 [SAS] on controller 0 is Ready
>       OK |      0:1 | Connector 1 [SAS] on controller 0 is Ready
>       OK |    0:0:1 | Enclosure 0:0:1 [Backplane] on controller 0 is Ready
> -----------------------------------------------------------------------------
>    Chassis Components
> =============================================================================
>   STATE  |  ID  |  MESSAGE TEXT
> ---------+------+------------------------------------------------------------
>       OK |    0 | Memory module 0 [DIMM_A1, 4096 MB] is Ok
>       OK |    1 | Memory module 1 [DIMM_A2, 4096 MB] is Ok
>       OK |    2 | Memory module 2 [DIMM_A3, 4096 MB] is Ok
>       OK |    3 | Memory module 3 [DIMM_A4, 4096 MB] is Ok
>       OK |    0 | Chassis fan 0 [System Board Fan1 RPM] reading: 1200 RPM
>       OK |    1 | Chassis fan 1 [System Board Fan2 RPM] reading: 1080 RPM
>       OK |    2 | Chassis fan 2 [System Board Fan3 RPM] reading: 1200 RPM
>       OK |    3 | Chassis fan 3 [System Board Fan4 RPM] reading: 1080 RPM
>       OK |    4 | Chassis fan 4 [System Board Fan5 RPM] reading: 1080 RPM
>       OK |    5 | Chassis fan 5 [System Board Fan6 RPM] reading: 1080 RPM
>       OK |    0 | Power Supply 0 [AC]: Presence detected
>       OK |    0 | Temperature Probe 0 [System Board Inlet Temp] reads 26 C (min=3/-7, max=42/47)
>       OK |    1 | Temperature Probe 1 [System Board Exhaust Temp] reads 33 C (min=8/3, max=70/75)
>       OK |    2 | Temperature Probe 2 [CPU1 Temp] reads 49 C (min=8/3, max=83/88)
>       OK |    0 | Processor 0 [Intel Xeon E5-2603 0 1.80GHz] is Present
>       OK |    0 | Voltage sensor 0 [CPU1 VCORE PG] is Good
>       OK |    1 | Voltage sensor 1 [System Board 3.3V PG] is Good
>       OK |    2 | Voltage sensor 2 [System Board 5V PG] is Good
>       OK |    3 | Voltage sensor 3 [CPU1 PLL PG] is Good
>       OK |    4 | Voltage sensor 4 [System Board 1.1V PG] is Good
>       OK |    5 | Voltage sensor 5 [CPU1 M23 VDDQ PG] is Good
>       OK |    6 | Voltage sensor 6 [CPU1 M23 VTT PG] is Good
>       OK |    7 | Voltage sensor 7 [System Board FETDRV PG] is Good
>       OK |    8 | Voltage sensor 8 [CPU1 VSA PG] is Good
>       OK |    9 | Voltage sensor 9 [CPU1 M01 VDDQ PG] is Good
>       OK |   10 | Voltage sensor 10 [System Board NDC PG] is Good
>       OK |   11 | Voltage sensor 11 [CPU1 VTT PG] is Good
>       OK |   12 | Voltage sensor 12 [System Board 1.5V PG] is Good
>       OK |   13 | Voltage sensor 13 [PS2 PG Fail] is Good
>       OK |   14 | Voltage sensor 14 [System Board PS1 PG Fail] is Good
>       OK |   15 | Voltage sensor 15 [System Board BP1 5V PG] is Good
>       OK |   16 | Voltage sensor 16 [CPU1 M01 VTT PG] is Good
>       OK |   17 | Voltage sensor 17 [PS1 Voltage 1] reads 114 V
>       OK |    0 | Battery probe 0 [System Board CMOS Battery] is Presence Detected
>       OK |    0 | Amperage probe 0 [PS1 Current 1] reads 0.6 A
>       OK |    1 | Amperage probe 1 [System Board Pwr Consumption] reads 56 W
>       OK |    0 | Chassis intrusion 0 detection: Ok (Not Breached)
>       OK |    0 | SD Card 0 [vFlash] is Absent
> -----------------------------------------------------------------------------
>    Other messages
> =============================================================================
>   STATE  |
Re: [Nagios-users] Check_Openmanage not ignoring non-certified drives
Bob The Junkie bob_the_jun...@hotmail.com writes:

> I'm using Nagios and Check_Openmanage to keep an eye on some Dell R710 servers we've recently acquired, and I'm having problems trying to stop warnings with non-dell certified drives appearing in the alert log.
>
> I've separated out the different components on the servers to check into their own nagios checks so my config files appear as such:
>
> In nagios:
>
> SERVICES.CFG
>   host
>   Check_command check_dell_components!memory
>
>   host
>   Check_command check_dell_components!alertlog
>
> COMMANDS.CFG
>   Command_name Check_dell_components
>   Command_line check_nrpe -H $HOSTADDRESS$ -p 5666 -t 30 -c Check_OpenManage -a --only $ARG1$
>
> On each Server in nsclient.ini:
>
>   Check_OpenManage = scripts\\check_openmanage.exe $ARG1$ --perfdata
>
> The problem I'm having is that in one of my checks that checks the health of the alert log, I'm getting a consistent warning message (Alert log content: 0 critical, 6 non-critical, 36 ok). I've traced this down to the 6 non-dell certified drives in the server, and I can indeed see within OMSA that the only 6 warnings all state "Controller event log: PD 04(e0x20/s4) is not a certified drive: Controller 0 (PERC 6/i Integrated)". So far, so good.
>
> Reading through the documentation I can see that Check_Openmanage includes a blacklisting option specifically for this event:
>
>   pdisk_cert - Suppress warning message about non-certified physical disk
>
> but no matter what I try, I can't seem to get Check_Openmanage to ignore these problems. An example of the command I'm running on the command line is:
>
>   check_openmanage.exe -s -a -b pdisk_cert=all
>
> Which returns:
>
>   WARNING: Alert log content: 0 critical, 6 non-critical, 36 ok
>
> Now I'm assuming the problem here is being caused by the Alert Log generating the errors, and not the physical disk directly causing the errors, which is why blacklisting the certificate problem on the physical disk isn't doing me any good.
>
> Which leads me onto my question: is there anything I can do to ignore these errors (and thus stop Nagios from complaining) apart from excluding the alert log when I do my checks?

Hi,

Your analysis is correct. The check_openmanage plugin's check of the log content is limited to counting the number of critical, warning and ok messages. It doesn't do any log parsing.

The intended usage of the log checking is as a precaution, if you're concerned about missing some temporary problem. After all, the plugin does active checking and will only report the state of the hardware right now.

In your case I think that the easiest solution would be to stop using the log checking with check_openmanage, and either use a fully fledged log parsing plugin (such as check_logfiles) or write your own simple plugin where you just filter out the certificate stuff.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
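A "simple plugin where you just filter out the certificate stuff" could look roughly like the sketch below. It is not from the thread and makes one loud assumption: the alert log arrives on stdin with one entry per line, semicolon-separated, and the severity (Ok, Non-Critical, Critical) in the first field. The function name is invented, and the parsing has to be adapted to whatever your log source really emits.

```sh
#!/bin/sh
# Hypothetical alert-log filter, NOT part of check_openmanage.
# ASSUMPTION: stdin has one entry per line, semicolon-separated, with the
# severity (Ok, Non-Critical, Critical) in the first field.
check_alertlog_filtered() {
    # Drop the non-certified drive entries, then count the rest by severity.
    grep -v 'is not a certified drive' | awk -F';' '
        $1 == "Critical"     { crit++ }
        $1 == "Non-Critical" { warn++ }
        $1 == "Ok"           { ok++ }
        END {
            printf "Alert log content: %d critical, %d non-critical, %d ok\n", crit, warn, ok
            if (crit > 0) exit 2    # Nagios CRITICAL
            if (warn > 0) exit 1    # Nagios WARNING
            exit 0                  # Nagios OK
        }'
}
```

The exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL), so the function's return status can be passed straight back to NRPE.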
Re: [Nagios-users] New check_openmanage error after updating to OMSA 7.2.0-4
Steve Jenkins stevejenk...@gmail.com writes:

> And... to answer my own question, yes - 3.7.9 does indeed fix this. New version is probably already in the repos, waiting out the testing period.

Not sure which repos you're referring to, but I'll assume Fedora and/or Fedora EPEL. I didn't get around to submitting updates until today. They should arrive in the testing repos in a couple of days. The updates need to stay in testing for a week for Fedora and two weeks for EPEL before they can be pushed to stable.

If you can't wait, you can download the RPMs via the Fedora build system, you'll find links here:

  https://admin.fedoraproject.org/updates/search/nagios-plugins-openmanage

When it has arrived in testing (and in your local mirror), you can install it with (example for EPEL):

  yum --enablerepo=epel-testing update nagios-plugins-openmanage

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: timeout vs. SNMP timeout
Andrew Daugherity adaugher...@tamu.edu writes:

> > Please try this version (named 3.7.8-beta2) and let me know if it works around your problem. Usage:
> >
> >   check_openmanage --snmp-timeout integer
>
> I think I fixed my problem (for the time being at least) by restarting OMSA on that server. Restarting snmpd didn't solve anything, nor did my timeout hack (which just gave me an UNKNOWN status - plugin timeout instead of SNMP CRITICAL when it randomly failed). Whenever the check failed, it would hang indefinitely, so it was not a case of slow SNMP.
>
> Thanks for the added option, though; I think someone may find it useful.

Yes, I agree. I'll keep it.

> Regarding your fix: The timeout option does appear to get passed to SNMP, however the actual timeout is twice what is specified. E.g. --snmp-timeout=1, get SNMP critical message after 2 seconds; --snmp-timeout=14, SNMP critical at 28 seconds; --snmp-timeout=15 or higher, get UNKNOWN: PLUGIN TIMEOUT message at 30 seconds. (I used a host without snmpd running for the timeout tests.)
>
> I can't see anything obviously wrong with your code, but it behaves this way on both SLES 11 SP1 (Perl 5.10, net-snmp 5.4.2.1, Net::SNMP 6.0.1) and OS X 10.8 (Perl 5.12.4, net-snmp 5.6, Net::SNMP 6.1 [from CPAN]).

Hmm.. kind of confusing. It is due to the fact that Net::SNMP does one retry (with the same timeout) before it bails out. This is adjustable with the '-retries' parameter to the SNMP object. The default is 1. If I set it to 0, the plugin times out in the SNMP object at the specified time as you would expect.

Thanks for pointing this out, I should make a note of it in the manual page.

> You probably also want to add this option to the help/usage message.

It won't make the help output, as that only covers the most popular options, but I'll add it to the manual page.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
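The doubling reported above follows directly from the retry behaviour: the timeout applies to the first attempt and again to each retry, so the worst-case wait is timeout x (retries + 1). A small sketch of that arithmetic, with a made-up function name, assuming the default of one retry mentioned in the reply:

```sh
#!/bin/sh
# Worst-case wait before the SNMP object gives up: the timeout is applied to
# the first attempt and again to every retry. Function name is illustrative.
snmp_worst_case_wait() {
    timeout=$1
    retries=${2:-1}    # default of one retry, as stated in the thread
    echo $(( $timeout * ($retries + 1) ))
}
```

With the numbers from the thread, `snmp_worst_case_wait 14` gives 28, and `snmp_worst_case_wait 15` gives 30, which is where the 30-second plugin timeout takes over. With retries set to 0, `snmp_worst_case_wait 14 0` gives 14.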
Re: [Nagios-users] check_openmanage: timeout vs. SNMP timeout
Trond Hasle Amundsen t.h.amund...@usit.uio.no writes:

> A new option to specify the SNMP object timeout would be easy to add, and is in my opinion a cleaner approach than just passing the plugin timeout.

Such an option is now implemented in the Git version:

  http://git.uio.no/git/?p=check_openmanage.git;a=commit;h=32564b44c2631eeac03a920f0c180fb12e4b29c8

Please try this version (named 3.7.8-beta2) and let me know if it works around your problem. Usage:

  check_openmanage --snmp-timeout integer

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: timeout vs. SNMP timeout
Andrew Daugherity adaugher...@tamu.edu writes:

> I'm troubleshooting an issue where one server is occasionally not responding (I think it's a firewall or snmpd issue, not this plugin), and I noticed that changing the timeout option to check_openmanage did not affect how long it took before receiving the "SNMP CRITICAL: No response from remote host A.B.C.D" message.
>
> Looking at the code I see the timeout option is _not_ passed to the Net::SNMP session object, so the SNMP connection timeout uses the default value (5 seconds according to the Net::SNMP man page, but 10 seconds in my testing). If I pass the timeout option to the Net::SNMP->session object like so:
>
> diff --git a/check_openmanage b/check_openmanage
> index b6abec5..3558ed4 100755
> --- a/check_openmanage
> +++ b/check_openmanage
> @@ -860,6 +860,7 @@ sub snmp_initialize {
>          '-port'     => $opt{port},
>          '-hostname' => $opt{hostname},
>          '-version'  => $opt{protocol},
> +        '-timeout'  => $opt{timeout},
>      );
>
>      # Setting the domain (IP version and transport protocol)
>
> Then it does obey the timeout option and I instead get the "PLUGIN TIMEOUT: check_openmanage timed out after 30 seconds" message.
>
> This might be by design though, to have a shorter SNMP timeout and different error messages, but it was perplexing to me why the timeout option was seemingly not working. Perhaps a different option for the SNMP timeout, or a documentation clarification, is a better way?

Hello Andrew,

Your analysis of this problem is correct, you're hitting the Net::SNMP timeout which is default 5 seconds. There are two reasons why the --timeout parameter isn't passed to the SNMP object:

  1. I never saw any reason to :) This is the first time I've heard of problems relating to it.

  2. The SNMP object timeout has limitations, it can only be between 1 and 60 seconds. I don't know how Net::SNMP reacts if the specified value is outside of this range.

The documentation is lacking on this, as you pointed out, and I'll fix that. A new option to specify the SNMP object timeout would be easy to add, and is in my opinion a cleaner approach than just passing the plugin timeout.

PS. I'm going away for the weekend and I'm leaving in a few minutes, so I'll get back to you on this early next week.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage
Jens Hyllegaard (Soft Design A/S) jens.hyllega...@softdesign.dk writes:

> I am using version 3.7.6 of check_openmanage. I have disabled notifications for battery charge events in the call to check_openmanage but I still get notifications from Nagios.
>
> This is the command line I use:
>
>   $USER1$/check_openmanage -s -p -H $HOSTADDRESS$ -b ps=all -b bat_charge
>
> This is the current output from check_openmanage for one of the servers.
>
>   WARNING: Cache Battery 0 in controller 0 is Charging (Ready) [probably harmless]

Hello Jens,

There is a slight typo in your command definition: the blacklist keyword needs a value. Replace with:

  $USER1$/check_openmanage -s -p -H $HOSTADDRESS$ -b ps=all -b bat_charge=all

..and you should be fine :)

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
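A typo like the bare `bat_charge` above is easy to catch before it reaches a command definition. The check below is only a sketch with an invented function name: it verifies the "keyword=ids" shape seen in this thread (ps=all, bat_charge=all, pdisk_cert=all) and knows nothing about which keywords the plugin actually accepts.

```sh
#!/bin/sh
# Hypothetical sanity check for a single blacklist argument. It only tests
# for the "keyword=ids" shape used in this thread; it is not the plugin's
# own parser and does not validate the keyword itself.
blacklist_arg_ok() {
    case "$1" in
        =*|*=) return 1 ;;   # empty keyword or empty id list
        *=*)   return 0 ;;   # looks like keyword=ids
        *)     return 1 ;;   # no '=' at all, e.g. bare bat_charge
    esac
}
```

For example, `blacklist_arg_ok bat_charge || echo "missing =ids"` would flag the original command line, while `blacklist_arg_ok bat_charge=all` succeeds.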
Re: [Nagios-users] check_openmanage: fix build on SUSE (docbook pkg name)
Andrew Daugherity adaugher...@tamu.edu writes:

> Simple fix -- the package is named 'docbook-xsl-stylesheets' instead of 'docbook-style-xsl'. I added a variable for this to the global if suse section.

Thanks Andrew, applied and pushed to master.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Dell Openmanage
Sven Dohmen s...@o4s.nl writes:

> For several months we have been using the Dell Openmanage plugin from
> http://folk.uio.no/trondham/software/check_openmanage.html. This has
> been working fine until the last couple of weeks. For some servers we
> are getting the following results back:
>
>   W: Controller 0 [PERC 6/i Integrated]: Firmware '6.2.0-0013' is out of date -- SYSTEM: PowerEdge R710, SN:
>   INTERNAL ERROR: Use of uninitialized value within %fw_type in string eq at (eval 1) line 4976.
>   INTERNAL ERROR: Use of uninitialized value within %fw_type in pattern match (m/ /) at (eval 1) line 4980.
>
> I noticed this only happens when one of the drivers is out of date.
> Is there a solution for this without directly updating the firmware
> (which is already planned over several weeks)?

In case anyone else has this issue..

Sven and I worked on this off-list, and we identified this to be an
error related to using the '-o' option over SNMP, on servers equipped
with iDRAC6 or iDRAC7 management cards. The check_openmanage plugin
has been fixed and a new release (version 3.7.6) is available:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

Notice for RHEL and Fedora users: The new release has been submitted
as an update for Fedora and Fedora EPEL. It is currently in testing,
and can be updated with:

  yum --enablerepo=\*testing update nagios-plugins-openmanage

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Warning alert isn't working
Leonardo Bacha Abrantes leona...@lbasolutions.com writes:

> Hi everybody!
>
> I'm using the check_openmanage plugin in Nagios to monitor the
> temperature of my Dell servers. It's working, however, the warning
> and critical alerts that I configure are not working.
>
>   [root@monitor:/etc/openmanage]# /usr/lib/nagios/plugins/check_openmanage -w 25 -c 30 -H 10.11.12.1 -C Test --only temp
>   TEMPERATURES OK - 1 temperature probes checked:
>   Temperature Probe 0 [System Board Ambient Temp] reads 30 C (min=8/3, max=42/47)
>
> The temperature is 30 and the check should appear as WARNING because
> I used -w 25.

Hello Leonardo,

The syntax you're using with the '-w' and '-c' options is wrong. From
the manual page:

  -w, --warning STRING or FILE

      Override the machine-default temperature warning thresholds.
      Syntax is id1=max[/min],id2=max[/min],... The following example
      sets warning limits to max 50C for probe 0, and max 45C and
      min 10C for probe 1:

          check_openmanage -w 0=50,1=45/10

      The minimum limit can be omitted, if desired. Most often, you
      are only interested in setting the maximum thresholds.

      This parameter can be either a string with the limits, or a
      file containing the limits string. The option can be specified
      multiple times.

      NOTE: This option should only be used to narrow the field of OK
      temperatures wrt. the OMSA defaults. To expand the field of OK
      temperatures, increase the OMSA thresholds. See the plugin web
      page for more information.

  -c, --critical STRING or FILE

      Override the machine-default temperature critical thresholds.
      Syntax and behaviour is the same as for warning thresholds
      described above.

The reason that you need to specify the ID of the temperature probes
is that there may be more than one, each with its own thresholds. In
your case there is only one probe and its ID is 0, so replace your
command above with:

  check_openmanage -w 0=25 -c 0=30 -H 10.11.12.1 -C Test --only temp

That should do the trick.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
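To make the 'id=max[/min]' syntax from the manual page concrete, here is a minimal sketch of a parser for such a limits string. It is an illustration only, not the plugin's implementation; the function name and the returned dictionary shape are assumptions:

```python
def parse_thresholds(spec):
    """Parse 'id1=max[/min],id2=max[/min]' into {probe_id: (max, min)}.

    The minimum is None when it is omitted. Illustrative only; the
    plugin itself may validate or store the limits differently.
    """
    limits = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        probe, sep, values = item.partition("=")
        if not sep or not probe.strip().isdigit():
            raise ValueError(f"expected 'id=max[/min]', got {item!r}")
        max_text, _, min_text = values.partition("/")
        maximum = float(max_text)
        minimum = float(min_text) if min_text else None
        limits[int(probe)] = (maximum, minimum)
    return limits


print(parse_thresholds("0=50,1=45/10"))

try:
    parse_thresholds("25")  # the form used in the original question
except ValueError as err:
    print("rejected:", err)
```

A bare '25' names no probe ID, so this sketch rejects it, whereas '0=25' is read as "probe 0, maximum 25".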
Re: [Nagios-users] check_openmanage: Physical Disk ... Undefined value 4096
Helmut Wollmersdorfer helmut.wollmersdor...@fixpunkt.de writes:

>   Physical Disk 0:0:0 [Dell WDC WD1003FBYX-18Y7B0, 1.0TB] on ctrl 0 needs attention: Undefined value 4096

Hello Helmut,

The state value for physical disks via SNMP is an integer, which is
translated by the plugin. There are a few defined values, and 4096 is
not one of them.

> On the console of the server:
>
>   # /opt/dell/srvadmin/bin/omreport storage pdisk controller=0 vdisk=0
>   List of Physical Disks belonging to VD10A
>
>   Controller PERC H700 Integrated (Slot 4)
>   Span 0
>   ID                        : 0:0:0
>   Status                    : Unknown
>   Name                      : Physical Disk 0:0:0
>   State                     : Unknown
>   Power Status              : Spun Up
>   Bus Protocol              : SATA
>   Media                     : HDD
>   Revision                  : 01.01V02
>   Failure Predicted         : No
>   Certified                 : Yes
>   Encryption Capable        : No
>   Encrypted                 : Not Applicable
>   Progress                  : Not Applicable
>   Mirror Set ID             : 0
>   Capacity                  : 931.00 GB (999653638144 bytes)
>   Used RAID Disk Space      : 931.00 GB (999653638144 bytes)
>   Available RAID Disk Space : 0.00 GB (0 bytes)
>   Hot Spare                 : No
>   Vendor ID                 : DELL
>   Product ID                : WDC WD1003FBYX-18Y7B0
>   Serial No.                : WD-WCAW3145836558365
>   Part Number               : TH0V8FCR1255213BC4RGA00
>   Negotiated Speed          : 3.00 Gbps
>   Capable Speed             : 3.00 Gbps
>   Manufacture Day           : Not Available
>   Manufacture Week          : Not Available
>   Manufacture Year          : Not Available
>   SAS Address               : 443322110700
>
> [same for all 4 disks of the array]
>
> Thus it seems that check_openmanage works correctly. Also the disk
> array seems to work correctly (no error messages in the logs).
>
> Is this some sort of wrong diagnostic from the firmware/controller?

No, this is not normal behaviour. I've seen this only on disks that
were so damaged that Openmanage failed miserably when attempting to
get info from them. Clearly this is not the case here, as you get the
same error on multiple disks and they otherwise work fine.

If you haven't already, you should try upgrading all BIOS and firmware
on the server, especially the controller firmware. You should also
upgrade Openmanage if you're not running the latest version (6.5.0).
If all else fails, I would contact Dell support and have them look at
it.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
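The "Undefined value 4096" text is what you get when an integer from SNMP has no entry in a translation table. A minimal sketch of that kind of lookup follows. The table here is a deliberately tiny, hypothetical stand-in; the real list of defined states lives in the plugin and in Dell's MIB, and the names below are made up for illustration:

```python
# Hypothetical, incomplete example table. It exists only to show the
# lookup-with-fallback pattern and is NOT the plugin's actual table.
EXAMPLE_STATES = {
    1: "Ready",
    2: "Failed",
    3: "Online",
}


def describe_state(value, table=EXAMPLE_STATES):
    """Translate an integer state, falling back to 'Undefined value N'."""
    return table.get(value, f"Undefined value {value}")


for raw in (3, 4096):
    print(raw, "->", describe_state(raw))
```

Any integer missing from the table, such as 4096 here, falls through to the "Undefined value" text instead of raising an error.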
Re: [Nagios-users] SELinux and RHEL6.2 preventing disk checks via NRPE
Dennis Kuhlmeier kuhlme...@riege.com writes:

> Geez, there are a lot more contexts set than I thought. I should
> probably remove duplicate entries, right?

The labels in /etc/selinux/targeted/contexts/files/file_contexts are
there by default and should not be touched. The file
/etc/selinux/targeted/contexts/files/file_contexts.local contains
local additions or adjustments. If there are entries there that you
think ought to be removed, you should remove them with:

  semanage fcontext -d 'entry'

Don't edit the file directly :)

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage spec file fixes for SUSE
Daugherity, Andrew W adaugher...@tamu.edu writes:

> First of all, thanks for making this plugin. It works well and is
> very handy. As requested in the documentation, I am sending this to
> the nagios-users list rather than directly to the author.

Hello Andrew,

Excellent :) Usually a public forum is better, where everybody can
participate and share their insight.

> With some minor modifications, the package builds properly on SUSE.
> There are two main Nagios packaging differences from RedHat:
>
> 1) All Nagios plugins are installed to /usr/lib/nagios/plugins, even
>    on 64-bit (there is no /usr/lib64/nagios directory). This may not
>    make the most sense, but it is what it is, and being consistent
>    with other Nagios packages is good.
>
> 2) Non-binary plugin RPMs (e.g. Perl scripts only) use noarch, while
>    binary plugins use the corresponding arch. For examples of both,
>    browse the build service repo at
>    http://download.opensuse.org/repositories/server:/monitoring/SLE_11.1/
>    Being a Perl script, check_openmanage falls under the former.
>
> This is easily solved with an %if block to make a universal RPM spec:
>
> BEGIN PATCH
> --- nagios-plugins-openmanage.spec.orig 2011-10-05 10:00:18.0 -0500
> +++ nagios-plugins-openmanage.spec      2011-12-01 15:02:10.0 -0600
> @@ -5,6 +5,16 @@
>  # No binaries here, do not build a debuginfo package
>  %global debug_package %{nil}
>
> +# SUSE installs Nagios plugins under /usr/lib, even on 64-bit
> +# It also uses noarch for non-binary Nagios plugins
> +%if %{defined suse_version}
> +%global nagiospluginsdir /usr/lib/nagios/plugins
> +BuildArch:      noarch
> +%else
> +%global nagiospluginsdir %{_libdir}/nagios/plugins
> +%endif
> +
> +
>  Name:          nagios-plugins-openmanage
>  Version:       3.7.3
>  Release:       1%{?dist}
> END PATCH
>
> I also tested building on CentOS 5 to make sure nothing broke there,
> and indeed, nothing changed there.

Thanks for the patch, applied. However, there have been some changes
to the spec file lately. Among them is an added Requires on the
nagios-plugins package, which owns the /usr/lib(64)?/nagios/plugins
directory. Hopefully SUSE does the same in this respect. The updated
spec file is available here:

  http://folk.uio.no/trondham/software/tmp/nagios-plugins-openmanage.spec

PS. check_openmanage has been added to Fedora and EPEL, but there are
some SELinux issues. Until these are resolved I'll hold off pushing it
to stable, but it is available in testing.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] SELinux and RHEL6.2 preventing disk checks via NRPE
Dennis Kuhlmeier kuhlme...@riege.com writes:

> Hello,
>
> after upgrading to RHEL6.2 I have problems checking some filesystems.
> Always the same three FS on all hosts, others work fine.
>
>   /boot
>   /home
>   /var/log/audit
>
>   $ ./check_nrpe -H backup -c check_fs_boot
>   DISK CRITICAL - /boot is not accessible: Permission denied
>
> Now I disable SELinux and it works!
>
>   $ ./check_nrpe -H backup -c check_fs_boot
>   DISK OK - free space: /boot 36 MB (39% inode=99%);| /boot=55MB;96;;0;96
>
> This happens although not a single line is logged on the monitored
> host, neither in messages nor in audit.log.
>
> I already had a local policy created for the nrpe daemon when RHEL6
> was introduced, as somehow many checks failed: although the user nrpe
> was running as was allowed to perform all checks, the nrpe daemon
> itself couldn't. I'll attach the policy, although at one point I gave
> up and just set the entire process to permissive mode. (Note that I
> tried to extend rights on the boot filesystem in this policy already,
> although it would seem to be unnecessary.)
>
> Is anybody experiencing something similar, or are there any
> suggestions about how to handle nrpe and RHEL6(.2) in a better way
> than I am?

RHEL6 has the following labels for use with Nagios plugins:

  # grep nagios /etc/selinux/targeted/contexts/files/file_contexts | grep plugin_exec | cut -d: -f3 | sort -u
  nagios_admin_plugin_exec_t
  nagios_checkdisk_plugin_exec_t
  nagios_mail_plugin_exec_t
  nagios_services_plugin_exec_t
  nagios_system_plugin_exec_t
  nagios_unconfined_plugin_exec_t

Try setting the confined types first, e.g.:

  chcon -t nagios_checkdisk_plugin_exec_t /path/to/check_fs_boot

If none of them works properly, you have
nagios_unconfined_plugin_exec_t as a last resort. When you find one
that works, make it permanent with:

  semanage fcontext -a -t type '/path/to/check_fs_boot'

You may also have to set proper labels on the path leading up to the
actual plugin.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
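For anyone who wants the same list of labels from a script, here is a minimal Python sketch of what the grep/cut/sort pipeline above does. The sample lines are made up to resemble file_contexts entries and are an assumption, as is the function name; on a real host you would read the file named in the command:

```python
def plugin_exec_types(file_contexts_text):
    """Mimic: grep nagios | grep plugin_exec | cut -d: -f3 | sort -u."""
    types = set()
    for line in file_contexts_text.splitlines():
        if "nagios" in line and "plugin_exec" in line:
            fields = line.split(":")
            # cut -f3 takes the third colon-separated field
            if len(fields) >= 3:
                types.add(fields[2])
    return sorted(types)


# Hypothetical sample content, for illustration only.
SAMPLE = """\
/usr/lib(64)?/nagios/plugins/check_disk -- system_u:object_r:nagios_checkdisk_plugin_exec_t:s0
/usr/lib(64)?/nagios/plugins/check_load -- system_u:object_r:nagios_system_plugin_exec_t:s0
/usr/lib(64)?/nagios/plugins/check_swap -- system_u:object_r:nagios_system_plugin_exec_t:s0
/etc/nagios(/.*)? system_u:object_r:nagios_etc_t:s0
"""

for name in plugin_exec_types(SAMPLE):
    print(name)
```

Like 'cut -d: -f3', this assumes the path part of the line contains no colon.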
Re: [Nagios-users] check_openmanage plugin: Couldn't run command ...
Corcoran Smith corco...@flair4it.co.uk writes:

> First message, so please excuse any failures in format, etc!
>
> Got two issues with two boxes (out of 160!) using check_openmanage:
>
> 1) Couldn't run command 'c:\pro... ' etc
>
> 2) Unrecognized character xA8: marked by -- HERE after -- HERE near
>    column 1 at /loader/HASH(0xa7c42c)/UNIVERSAL.pm line 1.
>
> Both are using the Windows exe.

Hi Corcoran,

I'll need more data to debug the first issue, e.g. the full error
message from the plugin. Unless they appear on the same server(?), in
which case issue 1 is probably caused by issue 2.

Regarding issue 2, I've seen this once before. A disk was so damaged
that OMSA failed while getting info from it, and gave an error message
like the one above: unrecognized character

This output is not something that the plugin expects or could possibly
prepare for, so it throws an error. You need to identify the failed
component; it probably needs to be replaced. Try running 'omreport'
commands to find it. Start with 'omreport storage pdisk controller=0'.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage on CentOS 5.6 Hosts
the entrox ent...@stoleyour.com writes:

> i've been using the check_openmanage script to monitor about two
> dozen Dell servers without a hitch (all Windows based) and we just
> set up about 15 or so new servers, but this time running CentOS. i of
> course installed OMSA via Dell's repository and also enabled SNMP,
> but i can't seem to get the command to work on those hosts. i am
> trying to run the debug command to look at the entire output like
> this:
>
>   [root@MONITOR02 plugins]# ./check_openmanage -H HOSTIP -C COMMUNITY -d
>   ERROR: (SNMP) OpenManage is not installed or is not working correctly

This error means that the SNMP service on the monitored host is
working and we get a reply, but the OIDs for OMSA are not present.

> i of course checked where the omreport binary was at and it's where
> the script is looking for it:
>
>   [root@mvarutestvmbase01 ~]# find / -name omreport
>   /opt/dell/srvadmin/sbin/omreport
>   /opt/dell/srvadmin/bin/omreport
>   [root@mvarutestvmbase01 ~]#

When using SNMP, the plugin doesn't utilize the omreport binary in any
way. It doesn't care where it is installed. BTW, the above location is
correct and is the default.

> just to double check i went ahead and looked if OMSA was working. i
> went via web and the console shows up no problem at all; if i
> authenticate it shows all the information that it should be showing.
> i also restarted all the services on the OMSA just to see if
> something was up but nothing, it still claims it's not working:
>
>   http://pics.entrox.me/983ygh426g.png

This is interesting. The SNMP service wasn't started. You should see
something like this:

  Starting dsm_sa_snmpd: [ OK ]

The dsm_sa_snmpd service is started by /etc/init.d/dataeng. This
script is also responsible for starting other components such as
dsm_sa_datamgrd, and that seems to work fine. You should also see
dsm_sa_snmpd in the process list if it's running:

  # ps axww | grep dsm_sa_snmpd
  4967 ?  Ssl  0:00 /opt/dell/srvadmin/sbin/dsm_sa_snmpd

From what I can gather from the dataeng init script, it won't start
dsm_sa_snmpd if this file exists:

  /opt/dell/srvadmin/var/lib/srvadmin-deng/dcsnmp.off

If it exists on your system, try removing it and restart OMSA. Also
verify that your /etc/snmp/snmpd.conf contains the following at the
very end:

  # Allow Systems Management Data Engine SNMP to connect to snmpd using SMUX
  smuxpeer .1.3.6.1.4.1.674.10892.1

This should have been added by OMSA at install time.

> i also read on the man page of the script
> (http://folk.uio.no/trondham/software/check_openmanage.html) that i
> could use the --omreport option but no dice with that, even trying
> the bin and sbin omreport binary file i got the exact same message:

This option allows you to specify the location of the omreport
command. It has no effect when using SNMP, and is only really usable
on Windows systems, where OMSA can be installed on drives other than
C:.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
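The two file checks suggested above can be scripted. This is a minimal sketch, assuming the paths and the smuxpeer OID exactly as quoted in this message; the function names are made up for illustration, and the sample configuration text is hypothetical:

```python
import os

# Values quoted in this thread.
SMUX_OID = ".1.3.6.1.4.1.674.10892.1"
DCSNMP_OFF = "/opt/dell/srvadmin/var/lib/srvadmin-deng/dcsnmp.off"


def has_smuxpeer(conf_text, oid=SMUX_OID):
    """True if an uncommented 'smuxpeer <oid>' line is present."""
    for line in conf_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "smuxpeer" and fields[1] == oid:
            return True
    return False


def snmp_disabled_by_flag(path=DCSNMP_OFF):
    """True if the flag file that keeps dsm_sa_snmpd from starting exists."""
    return os.path.exists(path)


# Hypothetical snmpd.conf fragments, for illustration only.
good_conf = "rocommunity public\nsmuxpeer .1.3.6.1.4.1.674.10892.1\n"
bad_conf = "rocommunity public\n# smuxpeer .1.3.6.1.4.1.674.10892.1\n"
print(has_smuxpeer(good_conf), has_smuxpeer(bad_conf))
```

On a monitored host you would pass the contents of /etc/snmp/snmpd.conf to has_smuxpeer() and call snmp_disabled_by_flag() with its default path.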
Re: [Nagios-users] check_openmanage: OOPS! Something is wrong...
Lois Garcia l...@rockyou.com writes:

> This is the output from omreport chassis pwrsupplies -fmt ssv:
>
>   C:\Users\Administrator>omreport chassis pwrsupplies -fmt ssv
>   Power Supplies Information
>
>   Power Supply Redundancy
>   Redundancy Status;Lost
>
>   Individual Power Supply Elements
>   Index;Status;Location;Type;Rated Input Wattage;Maximum Output Wattage;Online Status;Power Monitoring Capable
>   0;Ok;PS 1 Status;AC;[No Value];[No Value];Presence Detected;Yes
>   1;Ok;PS 2 Status;AC;1080 W;870 W;Presence Detected;Yes

Thanks. This shows that the plugin's behaviour was correct in my
opinion. OMSA states that both PSUs are OK, which is what the plugin
reports. There is a bug somewhere, but it is probably in OMSA. My
guess is that there is a rare and unknown error condition in PSU1,
which OMSA doesn't handle correctly.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
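The semicolon-separated output quoted above is easy to pick apart in a script. Here is a minimal sketch that reads the per-supply rows into dictionaries, using the exact output from this message as sample data. It is an illustration, not how the plugin itself parses omreport output, and the function name is an assumption:

```python
import csv
import io


def parse_power_supplies(ssv_text):
    """Return one dict per power supply row from '-fmt ssv' output."""
    supplies = []
    header = None
    for row in csv.reader(io.StringIO(ssv_text), delimiter=";"):
        if not row:
            continue  # blank line
        if row[0] == "Index":
            header = row
        elif header and len(row) == len(header):
            supplies.append(dict(zip(header, row)))
    return supplies


SAMPLE = """\
Power Supplies Information

Power Supply Redundancy
Redundancy Status;Lost

Individual Power Supply Elements
Index;Status;Location;Type;Rated Input Wattage;Maximum Output Wattage;Online Status;Power Monitoring Capable
0;Ok;PS 1 Status;AC;[No Value];[No Value];Presence Detected;Yes
1;Ok;PS 2 Status;AC;1080 W;870 W;Presence Detected;Yes
"""

for psu in parse_power_supplies(SAMPLE):
    print(psu["Index"], psu["Status"], psu["Rated Input Wattage"])
```

Both rows report Status "Ok", while the wattage fields for PS 1 hold "[No Value]".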
Re: [Nagios-users] check_openmanage: OOPS! Something is wrong...
Lois Garcia l...@rockyou.com writes:

> Thank you, Trond! It looks like a power supply problem. I will take
> the issue to Dell:
>
>   C:\Users\Administrator>omreport system
>   Health
>
>   SEVERITY : COMPONENT
>   Critical : Main System Chassis
>
>   C:\Users\Administrator>omreport chassis
>   Health
>
>   Main System Chassis
>
>   SEVERITY : COMPONENT
>   Ok       : Fans
>   Ok       : Intrusion
>   Ok       : Memory
>   Critical : Power Supplies
>   Ok       : Power Management
>   Ok       : Processors
>   Ok       : Temperatures
>   Ok       : Voltages
>   Ok       : Hardware Log
>   Ok       : Batteries

Hmm... there is obviously something amiss with the power supplies, but
the plugin didn't catch it. I'd like to know why. Can you provide the
output from:

  omreport chassis pwrsupplies -fmt ssv

This is the command that the plugin runs to get the status of the
power supplies.

> Thank you also for putting such a great plugin into the community.
> Without it, monitoring the few Windows machines in our all Linux
> environment would have been a chore I don't care to contemplate.

Thank you, glad you like it :)

> I don't see a donation link on your website at
> http://folk.uio.no/trondham/software/check_openmanage.html - ?

No, there is no donation link, the thought never crossed my mind. I
have benefitted enormously (personally and professionally) from free
and open source software for many years. This is just my way of giving
back. Besides, I've found that creating and maintaining open source
software is by itself rewarding, in many different ways.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
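When comparing what OMSA says with what the plugin reports, it helps to pull the non-OK rows out of the 'SEVERITY : COMPONENT' listing. Here is a minimal sketch using the 'omreport chassis' output quoted in this message as sample data; the function name is an assumption and this is not the plugin's own code:

```python
def failing_components(health_text):
    """Return (component, severity) for every row that is not 'Ok'."""
    failing = []
    for line in health_text.splitlines():
        severity, sep, component = line.partition(" : ")
        severity, component = severity.strip(), component.strip()
        if not sep or severity == "SEVERITY":
            continue  # headings and the column header row
        if severity.lower() != "ok":
            failing.append((component, severity))
    return failing


SAMPLE = """\
Health

Main System Chassis

SEVERITY : COMPONENT
Ok       : Fans
Ok       : Intrusion
Ok       : Memory
Critical : Power Supplies
Ok       : Power Management
Ok       : Processors
Ok       : Temperatures
Ok       : Voltages
Ok       : Hardware Log
Ok       : Batteries
"""

print(failing_components(SAMPLE))
```

For this sample the only row returned is the "Power Supplies" one, with severity "Critical".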
Re: [Nagios-users] check_openmanage: OOPS! Something is wrong...
lois garcia l...@rockyou.com writes:

> I have check_openmanage running successfully on 13 out of 16 Dell
> R710s. I am really puzzled at what is going wrong, as it seems
> different on each machine. I have tried different versions of
> check_openmanage and reinstalling the same version of Dell OMSA.
>
> The first eight servers were built from the same Ghost image, and
> last month, one of those servers started showing the
> check_openmanage error:
>
>   UNKNOWN 09-13-2011 17:04:23 7d 1h 7m 54s 4/4
>   UNKNOWN: Storage Error! No controllers found
>   UNKNOWN: Problem running 'omreport chassis memory': Error: Memory object not found
>   UNKNOWN: Problem running 'omreport chassis fans': Error! No fan probes found on this system.
>   UNKNOWN: Problem running 'omreport chassis temps': Error! No temperature probes found on this system.
>   UNKNOWN: Problem running 'omreport chassis volts': Error! No voltage probes found on this system.
>
> I reinstalled the Dell software, fixing the UNKNOWN error, and now
> have this error:
>
>   OOPS! Something is wrong with this server, but I don't know what.
>   The global system health status is CRITICAL, but every component
>   check is OK. This may be a bug in the Nagios plugin, please file a
>   bug report.
>
> The server is a Dell R710, running Windows Server 2008 R2 Enterprise.

Hello Lois,

(I shortened the subject)

When the plugin is used in local mode, as in your case, the plugin
checks the global health status using this command:

  # omreport system
  Health

  SEVERITY : COMPONENT
  Ok       : Main System Chassis

  For further help, type the command followed by -?

If everything is OK you'll get the output above. What do you get when
running this command on the troubled server? Does the ESM log contain
any clues? Try running 'omreport system esmlog' and see. Try running
'omreport chassis' as well.

There are two possible causes for the oops error. Either Openmanage
isn't behaving properly, or your server has an error that the plugin
doesn't catch.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] home made php script
Erik Olsen nag...@elitdata.no writes:

> I've been trying to make my own script now for a few hours but I'm
> not getting it to work with Nagios. I'm most familiar with PHP so I
> used that to make the script.
>
> My setup:
>   Ubuntu 11.4 server
>   Nagios 3.2.3
>
> The host, command, and service are all in the same .cfg file.
>
>   define command{
>           command_name    check_ups_temprature2
>           command_line    $USER$/check_ups_temp.php
>           }
>
>   define service{
>           use                     generic-service
>           host_name               ups1
>           service_description     Temp ups env sensor
>           check_command           eaton_ups_temp
>           }
>
> Status Information (Return code of 127 is out of bounds - plugin may
> be missing)

Hi Erik,

There is a typo on the command_line line. The $USER$ macro doesn't
exist. There are 32 possible user macros, named $USER1$ through
$USER32$. Try replacing $USER$ with $USER1$, or simply the actual path
leading up to the plugin.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
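A quick way to catch this kind of typo before reloading Nagios is to scan command_line values for $USER...$ macros outside the $USER1$ to $USER32$ range named above. This is a minimal sketch; the function name is an assumption and it checks only this one macro family:

```python
import re

# Matches $USER$, $USER1$, $USER33$ and so on; the digits are optional
# so that the bare (invalid) $USER$ form is caught too.
USER_MACRO = re.compile(r"\$USER(\d*)\$")


def invalid_user_macros(command_line):
    """Return $USER...$ macros that are not $USER1$ through $USER32$."""
    bad = []
    for match in USER_MACRO.finditer(command_line):
        number = match.group(1)
        if not number or not 1 <= int(number) <= 32:
            bad.append(match.group(0))
    return bad


print(invalid_user_macros("$USER$/check_ups_temp.php"))
print(invalid_user_macros("$USER1$/check_ups_temp.php"))
```

The first call reports the bare $USER$ macro; the second returns an empty list.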
Re: [Nagios-users] omreport and check_openmanage
Emilio Bruna emilio.br...@heliman.it writes:

> Thanks a lot for your hints Trond, check_openmanage is already at
> the latest version. We will try with an OMSA update first and then
> (if the issue persists) we will update the BIOS too.

If all else fails, you have the option of disabling the power
management check completely, by using '--check amperage=0':

  check_openmanage --check amperage=0

By using this option you're telling the plugin that it shouldn't even
attempt to run 'omreport chassis pwrmonitoring'.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] omreport and check_openmanage
Emilio Bruna emilio.br...@heliman.it writes:

> Omsa version is 6.2.0.1
> so: windows 2008 storage server SP2
> Hardware is Dell NX 300 Storage server (a derivative of R410 or R310
> i think)

This combination should be ok. I don't know the NX300, but if it's
based on the R310 or R410 it shouldn't be a problem. There was a bug
in check_openmanage related to power monitoring on the R410, but this
was fixed in version 3.6.5 of the plugin. Are you using the latest
version of check_openmanage, which is 3.6.8? Also, would it be
possible for you to upgrade OMSA to the latest version, 6.5.0?

This really is an OMSA issue. If the power supplies don't support
power monitoring, omreport should just say that and check_openmanage
is happy. But in your case, OMSA is responding with an error.

One last tip. In some cases I've seen that certain capabilities in
OMSA depend on BIOS and/or firmware versions. You should verify that
the BIOS and firmware are relatively up-to-date.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage error on W2k8r2 Dell R900
Jay Wahl j...@firewahl.com writes:

> Love check_openmanage plugin for Nagios! It has been a great help for monitoring our Dell hardware. I recently built 3 Dell 900s with W2K8r2 with check_openmanage (v 3.6.8) and Dell OMSA (v 6.5.0).

Hi Jay,

Are you completely sure that you're using version 3.6.8? My reason for asking is that the errors you get don't make sense (details below).

> I am getting the following errors:
>
> C:\Program Files\NSClient++\scripts>check_openmanage
> Problem running 'omreport chassis memory': Error Correction;Multibit ECC

This was fixed a while back (version 3.6.3 IIRC). The "Error Correction" field appeared in OMSA 6.4.0, and check_openmanage triggers on strings containing "Error". The particular string above obviously does not indicate an actual error, and it was put in the whitelist for errors shortly after OMSA 6.4.0 was released.

> INTERNAL ERROR: Use of uninitialized value in concatenation (.) or string at script/check_openmanage line 1650.
> INTERNAL ERROR: Use of uninitialized value in concatenation (.) or string at script/check_openmanage line 1650.

These two don't make any sense, since line 1650 only contains a comment. They are also probably not related to the memory check.

Please verify the version of check_openmanage. The plugin will output its version number with either of these options:

check_openmanage -V
check_openmanage -d

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
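The whitelist behaviour described in this reply can be sketched roughly as follows. This is only a minimal Python illustration of the idea, not the plugin's actual Perl code: the function name `is_real_error` and the contents of `HARMLESS` are assumptions made up for the example, and the two sample strings are taken from messages in this archive.

```python
# Hypothetical sketch: flag omreport output that contains "Error",
# unless it matches a known harmless string.
HARMLESS = ("Error Correction",)  # assumed whitelist entry, for illustration


def is_real_error(line: str) -> bool:
    """Return True if the line mentions 'Error' and is not whitelisted."""
    if "Error" not in line:
        return False
    return not any(harmless in line for harmless in HARMLESS)


# The memory field name is whitelisted, a genuine omreport failure is not.
print(is_real_error("Error Correction;Multibit ECC"))
print(is_real_error("Error: Current probes not found"))
```

A plain substring trigger like this is cheap but needs exactly the kind of whitelist maintenance mentioned above whenever a new OMSA release adds a field with "Error" in its name.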
Re: [Nagios-users] omreport and check_openmanage
Emilio Bruna emilio.br...@heliman.it writes:

> Hello all, I'm monitoring several Dell Windows servers with Nagios and NSClient++ and OMSA + check_openmanage. On one of these, I'm getting a problem monitoring the redundant power supplies. Running the command below LOCALLY on the machine being monitored I got the right data from omreport.exe:
>
> c:\Program Files (x86)\Dell\SysMgt\oma\bin>omreport.exe chassis pwrsupplies
> Power Supplies Information
> ---
> Main System Chassis Power Supplies : Ok
> ---
> Power Supply Redundancy : Ok
> Attribute : Redundancy Status
> Value : Full
>
> Individual Power Supply Elements
> Index : 0
> Status : Ok
> Location : PS 1 Status
> Type : AC
> Rated Input Wattage : 680 W
> Maximum Output Wattage : 500 W
> Online Status : Presence Detected
> Power Monitoring Capable : Yes
>
> Index : 1
> Status : Ok
> Location : PS 2 Status
> Type : AC
> Rated Input Wattage : 680 W
> Maximum Output Wattage : 500 W
> Online Status : Presence Detected
> Power Monitoring Capable : Yes
>
> Running the command below (the one needed for check_openmanage):
>
> c:\Program Files (x86)\Dell\SysMgt\oma\bin>c:\Users\administrator.CMVC\Desktop\ check_openmanage.exe --omreport c:\Program Files (x86)\Dell\SysMg mreport.exe
> Problem running 'omreport chassis pwrmonitoring': Error: Current probes not found
>
> I've noticed that the switches coming from check_openmanage are slightly different from the ones passed to omreport.exe (omreport chassis pwrmonitoring instead of omreport chassis pwrsupplies), so it seems that check_openmanage has the wrong switches with regard to the power monitoring check; or maybe the OMSA version I'm using is not at the correct version to work in the right way with check_openmanage.

Hi Emilio,

Don't confuse the two arguments 'pwrsupplies' and 'pwrmonitoring'. They do different things, and check_openmanage uses both of them. It runs 'omreport chassis pwrsupplies' to get the status of the power supplies, and it runs 'omreport chassis pwrmonitoring' to get the status and value of the amperage probes. The latter includes the overall power consumption of the server.

In your case, it's the 'pwrmonitoring' command that fails. This is a known problem with some older versions of OMSA. Which version of OMSA are you running, and on what kind of PowerEdge server?

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
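To make the 'Key : Value' layout of the omreport output quoted in this message easier to follow, here is a minimal Python sketch that groups such lines into one record per 'Index' entry. It is an illustration only and makes no claim about how check_openmanage itself parses omreport; the function name `parse_omreport` is an assumption, and `SAMPLE` is a shortened copy of the output shown in the message.

```python
def parse_omreport(text: str) -> list[dict[str, str]]:
    """Group 'Key : Value' lines into one dict per 'Index' block."""
    records: list[dict[str, str]] = []
    for raw in text.splitlines():
        key, sep, value = raw.partition(" : ")
        if not sep:
            continue  # headings and separators carry no key/value pair
        key, value = key.strip(), value.strip()
        if key == "Index":
            records.append({})  # each element starts with its index
        if records:
            records[-1][key] = value  # lines before the first Index are skipped
    return records


SAMPLE = """\
Individual Power Supply Elements
Index : 0
Status : Ok
Location : PS 1 Status
Index : 1
Status : Ok
Location : PS 2 Status
"""

for psu in parse_omreport(SAMPLE):
    print(psu["Index"], psu["Location"], psu["Status"])
```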
Re: [Nagios-users] Check_Openmanage configuration question
Daniel Ceola dce...@twgi.net writes:

> Hello all!

Hi Daniel,

> I have a question regarding the initial configuration of check_openmanage. I downloaded the version of the script dated Feb 9 (I don't see a version number in the script) and am attempting to use the script through SNMP.

Tip: Run the plugin with the '-V' or '--version' switch to view the version number.

> I'm attempting to begin using check_openmanage with our Dell servers. I have installed the Dell OMSA software on one server and it seems to be working just fine. I configured my command definition in a simple fashion, according to the installation guide:
>
> # Dell Check openmanage
> define command{
>     command_name    check_openmanage
>     command_line    $USER1$/check_openmanage -H $HOSTADDRESS$
> }
>
> I also configured my service definition in a simple fashion, according to the installation guide:
>
> define service{
>     use                  generic-service
>     host_name            Server_Name
>     service_description  Dell OMSA
>     check_command        check_openmanage
> }

This looks correct to me.

> However, my Nagios console is reporting the status as (null). Also, when I attempt to run the script from the command line (note the file is saved as check_openmanage with no file extension, I also tried check_openmanage.pl and receive the same results), I receive a few errors:
>
> nagios@UbuntuTest:/usr/local/nagios/libexec$ ./check_openmanage 192.168.1.5
> ./check_openmanage: line 27: require: command not found
> ./check_openmanage: line 28: use: command not found
> ./check_openmanage: line 29: use: command not found
> ./check_openmanage: line 30: syntax error near unexpected token `('
> ./check_openmanage: line 30: `use POSIX qw(isatty ceil);'

Weird. Your system seems to be running the plugin through a shell. The output above is exactly what you'll get if you run

sh ./check_openmanage

To specify perl as interpreter, run:

perl ./check_openmanage

However, this should not be needed. The system should identify it as a perl script and use perl to execute it by default. Have you edited the plugin in some way? Check that the md5sum is correct:

$ md5sum check_openmanage
5281718fe9e5c4b9570fe76f0fb424ec check_openmanage

The above sum is correct for version 3.6.6. You should verify that you get the same (if running 3.6.6). The latest version and its md5sum are available here:

http://folk.uio.no/trondham/software/check_openmanage.html#download

PS. In your example above you have forgotten the '-H' switch.

PPS. The file extension (or the name itself) is unimportant.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
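The two checks suggested in this reply (is the file unmodified, and does the system have a way to recognise it as a perl script) can be sketched in Python as below. On Unix-like systems the interpreter is normally picked from the `#!` line at the top of an executable script, so a missing or damaged first line is one plausible cause of the shell errors quoted above; that diagnosis is an assumption, not something confirmed in the thread. The helper names `md5_hex`, `interpreter_line` and `report` are made up for the example.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """MD5 digest as hex, comparable with the output of `md5sum`."""
    return hashlib.md5(data).hexdigest()


def interpreter_line(data: bytes) -> str:
    """Return the first line if it is a '#!' interpreter line, else ''."""
    first = data.split(b"\n", 1)[0].decode("ascii", "replace").strip()
    return first if first.startswith("#!") else ""


def report(path: str) -> None:
    """Print both facts for a file on disk."""
    with open(path, "rb") as handle:
        blob = handle.read()
    print("md5:", md5_hex(blob))
    print("interpreter line:", interpreter_line(blob) or "(none found)")
```

Calling `report("check_openmanage")` in the plugin directory prints a digest to compare with the published one, plus the interpreter line if there is one.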
Re: [Nagios-users] check_openmanage errors
Steve Glasser sglas...@visp.net writes:

>> That combination should work just fine. Please try either of the beta versions, as I suggested in my previous email. The issue you're having may very well be fixed in the betas.
>
> Tried check_openmanage-3.7.0-beta2.0-beta2, problem solved.

Excellent, thanks for testing and reporting back. I've just released version 3.6.6, which contains the same bugfixes as the 3.7 beta, but not the (unfinished) new features :)

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
[Nagios-users] check_openmanage PNP template (Was: check_openmanage errors)
Randal, Phil pran...@herefordshire.gov.uk writes:

> Is the beta of check_openmanage.php available for testing?

Sure, I put it here:

http://folk.uio.no/trondham/software/beta/

Highlights of the template are:

- works with the plugin's new perfdata API
- removed unnecessary dependence on PHP >= 5.2 (good for rhel/centos 5 users)
- calculate power usage for the selected time period, in Watt hours and BTU

> I'm currently using a slightly modified version of the one in the latest PNP release. Two cosmetic issues came to mind:
>
> 1: Temperature is measured in Celsius, not Celcius

Yep, I know. That typo was the first thing I fixed :)

> 2: Formatting when reporting multiple sensors in one graph is irksome - the values don't align in a nice column (e.g. temperatures). I 'solve' this by a judicious use of substr() and str_pad() to normalise the length of reported sensor names.

Hm... this could be tricky to do in a consistent and general manner (at least the substr() part). The sensor names are as reported by OMSA. Perhaps this could be accomplished with some RRD magic instead? Tips and hints are welcome, since I'm neither a PHP expert nor an RRD ninja :)

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
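The substr()/str_pad() approach mentioned for aligning sensor names boils down to truncating and padding every label to one fixed width. The sketch below shows the idea in Python rather than PHP; it is not taken from the PNP template, the function name `align_labels` and the default width are assumptions, and the sensor names are invented examples.

```python
def align_labels(names: list[str], width: int = 20) -> list[str]:
    """Truncate or right-pad each name so that all labels have equal length."""
    return [name[:width].ljust(width) for name in names]


# Invented sensor names, for illustration only.
for label in align_labels(["CPU1 Temp", "System Board Ambient Temp"], width=12):
    print(f"[{label}]")
```

The truncation step is where this gets tricky in general: two long sensor names that differ only near the end would collapse into the same label.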
Re: [Nagios-users] check_openmanage
Ashcor Technologies ash...@optonline.net writes:

> ok, talked to dell, there is no hardware on the T105 that will allow monitoring of the fan, voltage, etc.. basically the only thing you can monitor is the raid array which is fine as that's all I really want to check with nagios.

Ok. I don't know the 100 series, but from what I understand they are entry-level servers with limited capabilities and a low price tag.

The plugin will barf at servers that don't have the basic monitoring probes, unless they are absent for obvious reasons (e.g. blades don't have fans). I still think this is a good idea, as I've seen plenty of instances where OMSA malfunctions in such a way that it will say a probe doesn't exist when it actually does. I'm reluctant to change that policy, so users of the 100 series will have to exclude certain checks in the plugin. It is not ideal, but I believe the problem to be limited since most would go for servers with better monitoring capabilities (i.e. 200 series and beyond).

> Still have that pesky timeout after 30 seconds error though. tried with --timeout 60 and with -t 60 and nothing seems to change the behavior.

Still weird. Did you try running the plugin manually with the timeout option? Try 'check_openmanage.exe -t 60 [other options]'

Perhaps OMSA on the T105 hangs on some probe that doesn't exist. If you're only interested in monitoring storage, you could try:

check_openmanage.exe --only storage

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check-openmanage errors after upgrade of openmanage
Trond Hasle Amundsen t.h.amund...@usit.uio.no writes:

> Are you using check_openmanage with NRPE or similar in local mode, or checking via SNMP?

I have an idea of what the problem might be. Can you try either of the development versions of check_openmanage available here:

http://folk.uio.no/trondham/software/check_openmanage.html#download

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check-openmanage errors after upgrade of openmanage
Ashcor Technologies ash...@optonline.net writes:

> I ran check_openmanage.exe --only storage locally and it worked fine. I then changed the NSC.ini to have:
>
> command[check_openmanage]=check_openmanage.exe --only storage
>
> and restarted the NSClient++ (x64) service in test mode. the results:
>
> d NSClient++.cpp(1106) Injecting: check_openmanage:
> d NSClient++.cpp(1142) Injected Result: WARNING 'Problem running 'omreport chassis fans': Error! No fan probes found on this system.<br/>Problem running 'omreport chassis temps': Error! No temperature probes found on this system.<br/>Problem running 'omreport chassis volts': Error! No voltage probes found on this system.'

Ok, this actually clarifies things. Clearly, NSClient++ ignores everything after 'check_openmanage.exe' in your NSC.ini. There is no way that check_openmanage would complain about fans etc. when the option '--only storage' is specified. Since it works from command line we can safely assume that NSClient++ is the problem. This explains your issues with the timeout option as well.

> I've looked on your site for the dev versions and am happy to try them but don't see a zip with the .exe. Is there an .exe available for the dev? also, which dev version would you prefer I try, 3.6 or 3.7?

I could make a PE32 executable for the dev versions, but in your case it won't help, so there is really no point. Your problem is that NSClient++ ignores the plugin options. Since I don't use NSClient++ I can't offer any insight into how it should be configured, but my first attempt at a fix would be to put the entire command in quotes in NSC.ini.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage errors
Steve Glasser sglas...@visp.net writes:

> D'oh. We are using check_openmanage with NRPE. The host o/s is CentOS release 5.5. Perl is perl-5.8.8 (from rpm).

That combination should work just fine. Please try either of the beta versions, as I suggested in my previous email. The issue you're having may very well be fixed in the betas.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage
Ashcor Technologies ash...@optonline.net writes:

> on two of my dell servers check_openmanage (via nsclient++ and nrpe) return the same error:
>
> Use of uninitialized value in concatenation (.) or string at script/check_openmanage.pl line 1386.
>
> both dell systems are running the latest OpenManage version 6.5.0.

Hi Jeff,

Which version of check_openmanage is this?

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage
Ashcor Technologies ash...@optonline.net writes:

> Thanks for the reply. I just realized from your question that I'm using a pre-compiled .exe version of your check_openmanage from here:
>
> https://www.monitoringexchange.org/inventory/Check-Plugins/Hardware/check_openmanage-exe
>
> which was probably created from an older version...

Yeah I think it's pretty old. A PE32 executable for Windows is available in the zip and tar.gz archives, and as a single file download:

http://folk.uio.no/trondham/software/check_openmanage.html#download

Upgrading to the latest version will probably solve your problem. Let me know if it doesn't.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage
Ashcor Technologies ash...@optonline.net writes:

> now my problem is this...
>
> Problem running 'omreport chassis fans': Error! No fan probes found on this system.<br/>Problem running 'omreport chassis temps': Error! No temperature probes found on this system.<br/>Problem running 'omreport chassis volts': Error! No voltage probes found on this system.
>
> on the NSC.ini i have the following line added and I restarted the NSClient++ service
>
> command[check_openmanage]=check_openmanage.exe -b fan=all
>
> even tried
>
> command[check_openmanage]=check_openmanage.exe -b fan=0
>
> however it still tries to check the fan. I suppose i have a syntax error?

No, that is the correct syntax. Blacklisting won't prevent the component class from being checked in the first place; it will only suppress any info about blacklisted components in the output and in the plugin return value. To skip fans altogether use the '--check' option like this: '--check fans=0'.

However, unless this is a blade system and the plugin is unable to identify it as such for some reason, your server HAS fan probes and you're having an OMSA problem. The fact that you get errors for other probes such as temperature and voltage confirms this. You need to recheck that OMSA works, that all relevant OMSA components are installed and running etc. It may be as simple as restarting OMSA, but it could also be more complex (e.g. BIOS/firmware upgrade needed). These errors are pretty generic, but the problem is that OMSA isn't working properly on that server.

PS. See this URL about configuring Nagios to not escape HTML code in the plugin output (to avoid the literal '<br/>'):

http://folk.uio.no/trondham/software/check_openmanage.html#multiline-output

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
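The difference between blacklisting a component and disabling a whole check class, as explained in this reply, can be modelled with a small Python sketch. This is a toy model under stated assumptions, not the plugin's implementation: `run_checks`, `broken_fans` and the dictionary layout are all hypothetical, and the class name "fans" is used for both mechanisms only to keep the example short.

```python
from typing import Callable

Finding = tuple[str, str]  # (component id, message)


def run_checks(queries: dict[str, Callable[[], list[Finding]]],
               enabled: dict[str, bool],
               blacklist: dict[str, set[str]]) -> list[str]:
    """Run every enabled check class and filter blacklisted components."""
    output: list[str] = []
    for cls, query in queries.items():
        if not enabled.get(cls, True):
            continue  # class disabled: the query is never run at all
        try:
            findings = query()  # still runs when components are blacklisted
        except RuntimeError as err:
            output.append(f"Problem running '{cls}': {err}")
            continue
        hidden = blacklist.get(cls, set())
        for comp_id, message in findings:
            if "all" in hidden or comp_id in hidden:
                continue  # blacklisted: only suppressed in the output
            output.append(message)
    return output


def broken_fans() -> list[Finding]:
    """Stand-in for a query that fails before any component is listed."""
    raise RuntimeError("No fan probes found on this system.")


queries = {"fans": broken_fans}
print(run_checks(queries, {}, {"fans": {"all"}}))  # the failure is still reported
print(run_checks(queries, {"fans": False}, {}))    # the query is skipped entirely
```

In this model a failing query surfaces even with everything blacklisted, which matches the behaviour the poster observed.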
Re: [Nagios-users] check_openmanage
Ashcor Technologies ash...@optonline.net writes:

> Ok, now new and exciting changes... no matter what I do I get:
>
> WARNING PLUGIN TIMEOUT: check_openmanage timed out after 30 seconds.
>
> I have -t 60 set on the check_openmanage command and also on the NRPE check command line and in the NSC.ini. nothing seems to change the timeout beyond 30 seconds.

I forgot to mention that since you get that particular error it's the plugin that times out, not NRPE or NSClient++. The fact that you're unable to change that behaviour with the '-t' or '--timeout' option is strange, but it would usually indicate a configuration error on your part. You'll have to post the command definition etc. for me (and others on this list) to be able to spot the error.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage
Ashcor Technologies ash...@optonline.net writes:

> the server is a PowerEdge T105. It IS running slow but I'll be damned if I can figure out why, I'm beginning to suspect bad ram as the performance meter reports minimal load.

One thing to check is the power management setting in the BIOS. We set up a few blade servers recently that had set this to "Active Power Controller", and this caused the server to be extremely sluggish. Setting this to "OS Control" or "Maximum Performance" solved the issue. Try:

# omreport chassis pwrmanagement config=profile
Power Profiles
Maximum Performance     : Not Selected
Active Power Controller : Not Selected
OS Control              : Selected
Custom                  : Not Selected

You can set the profile to max performance with:

omconfig chassis pwrmanagement config=profile profile=maxperformance

Just a tip, but worth checking.

> here is the command line in the NSC.ini
>
> [modules]
> command[check_openmanage]=check_openmanage.exe -t 60 --check fans=0,volt=0
>
> on the nagios server:
>
> /usr/lib/nagios/plugins/check_nrpe -H $hostname$ -p 5666 -c check_openmanage -t 60
>
> I'm pretty sure it's not the Check_nrpe command line as this works fine on several other servers. it's def something on the client server itself so this points to the NSClient++ setup.

Can't see anything wrong with these definitions.

> note I have been testing by running NSClient++.exe /test so i can watch the client server and it is getting the injection command and reporting the timeout locally.

Good. But it's still weird that you get a timeout after 30 seconds even when you specify a 60 sec timeout. Try running check_openmanage.exe manually on the server with the same options and see if it then behaves in the same way. If so there is some sort of bug in the plugin that only affects the .exe version.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
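When testing a command by hand as suggested in this reply, it helps to measure how long it really runs and to put an outer limit on it that is independent of the plugin's own '-t' handling. The following Python sketch does that for an arbitrary command line; the function name `timed_run` and the status strings are assumptions for illustration, and nothing here reproduces how the plugin, NRPE or NSClient++ implement their timeouts.

```python
import subprocess
import sys
import time


def timed_run(argv: list[str], limit: float) -> tuple[str, float]:
    """Run argv, kill it after `limit` seconds, return (status, elapsed seconds)."""
    start = time.monotonic()
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=limit)
        status = f"exit {proc.returncode}"
    except subprocess.TimeoutExpired:
        status = "killed by outer limit"
    return status, time.monotonic() - start


# Demo with the Python interpreter itself standing in for the plugin.
print(timed_run([sys.executable, "-c", "pass"], limit=10.0))
```

For a real test, `argv` would be the plugin command with the same options as in NSC.ini, and `limit` a value somewhat above the plugin timeout being tested.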
Re: [Nagios-users] check-openmanage errors after upgrade of openmanage
Steve Glasser sglas...@visp.net writes:

> Since upgrading dell openmanage from v 6.3 to 6.5 we have errors using the check-openmanage plugin. The errors are:
>
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in length at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in concatenation (.) or string at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4601.
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4601.
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in length at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in concatenation (.) or string at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4601.
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4601.
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in length at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in concatenation (.) or string at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4601.
> INTERNAL ERROR: Use of uninitialized value in hash element at /usr/lib64/nagios/plugins/check_openmanage line 4601.
>
> The plugin reports status unknown. Openmanage is version check-openmanage-3.6.5-1.el5 installed from rpm. The host is an dell 2950. Please let me know if I can provide any additional information.

Hi Steve,

Are you using check_openmanage with NRPE or similar in local mode, or checking via SNMP?

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Why is check_openmanage so slow on PowerEdge R510?
Helmut Wollmersdorfer helmut.wollmersdor...@fixpunkt.de writes:

> Another question: I always get on all of the R510s (few days old):
>
> root@xen11:~# /usr/lib/nagios/plugins/check_openmanage
> Cache Battery 0 in controller 0 is Charging (Ready) [probably harmless]
> root@xen11:~# uptime
> 12:08:35 up 2 days, 1:22, 1 user, load average: 0.00, 0.00, 0.00
>
> I wonder a little bit that the batteries are not full after some days powered, or if the information is wrong.

The plugin is simply reporting what OMSA says, so if the info is wrong it would have to be at the hardware or OMSA level. However, I don't think that this is the case. Batteries take a long time to charge on new servers, i.e. if the battery is brand new and hasn't been charged before.

At one time we had a battery that didn't finish charging for a week; we called Dell and got a replacement battery. This was during a regular charge cycle. In your case I would give it a few more days.

> Also I tried to '--blacklist bat_charge=0,0' (and other combinations), but blacklisting does not work.

Look in the debug output for the battery ID, which consists of the controller number and battery number with colon as delimiter. In your case it would be

--blacklist bat_charge=0:0

or simply use 'all':

--blacklist bat_charge=all

But, as we in fact did experience a case where the battery never finished charging, I would advise against this. We just ignore the battery charge warnings unless they persist for days. It can be annoying, but we decided that we can live with it :)

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
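The battery ID format described in this reply (controller number, colon, battery number, or the keyword 'all') explains why 'bat_charge=0,0' had no effect. The Python sketch below checks a single blacklist argument against one battery. It is a simplification for illustration: the function name `is_blacklisted` is made up, and the handling of several IDs or several component types in one argument is deliberately left out, since the exact syntax for that is not given in this thread.

```python
def is_blacklisted(spec: str, controller: int, battery: int) -> bool:
    """Match one 'bat_charge=<id>' argument against a controller/battery pair."""
    component, _, wanted = spec.partition("=")
    if component != "bat_charge":
        return False
    # The ID is '<controller>:<battery>'; 'all' matches every battery.
    return wanted == "all" or wanted == f"{controller}:{battery}"


print(is_blacklisted("bat_charge=0:0", 0, 0))  # colon-delimited ID matches
print(is_blacklisted("bat_charge=0,0", 0, 0))  # comma form does not match
```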
Re: [Nagios-users] Why is check_openmanage so slow on PowerEdge R510?
C. Bensend be...@bennyvision.com writes:

> Is there anything in OMSA that tells how *long* a battery has been charging? I simply got so tired of the charging warnings that I blacklisted the bat_charge totally, but I'd still like to detect that type of error - where the battery never finishes charging. If OMSA has it, it would be great to have the option within check_openmanage to specify a length of time threshold for battery charging. :)

Hi Benny,

Unfortunately OMSA has no info on when the charge cycle is expected to be finished, or how long it has been in its current learn/charge state:

    # omreport storage battery controller=1
    Battery 0 on Controller PERC 6/E Adapter (Slot 1)

    Controller PERC 6/E Adapter (Slot 1)
    ID                        : 0
    Status                    : Non-Critical
    Name                      : Battery 0
    State                     : Charging
    Recharge Count            : Not Applicable
    Max Recharge Count        : Not Applicable
    Predicted Capacity Status : Ready
    Learn State               : Requested
    Next Learn Time           : 0 hours
    Maximum Learn Delay       : 7 days 0 hours
    Learn Mode                : Auto

I could make the plugin record it, but then I would violate my principle that the plugin should be stateless... Introducing state in the plugin complicates things.

There is another reason that you would want to know that the battery is charging, and I suspect that this is also why Dell has OMSA report it as a non-critical (warning) status. During (some of) the charge cycle, write-back for vdisks (i.e. use of the cache) is disabled. This means that the RAID performance is degraded, and depending on the nature of your disk usage you'll want to know about this when it happens. OMSA also lets you delay the charge cycle for up to seven days.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
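For anyone who still wants a "charging for too long" alert, the state would have to live outside the plugin. What follows is only a rough sketch of such a wrapper, not anything check_openmanage does: the function name, the JSON state file, and the battery id format are all made up for illustration, and the caller is assumed to have already extracted the battery's State field from omreport.

```python
import json
import time
from pathlib import Path


def charging_seconds(state_file, battery_id, state, now=None):
    """Return how long battery_id has been continuously 'Charging'.

    Hypothetical helper: the first time a battery is seen charging, the
    timestamp is stored in state_file; any other state clears it.
    """
    now = time.time() if now is None else now
    path = Path(state_file)
    seen = json.loads(path.read_text()) if path.exists() else {}

    if state == "Charging":
        # setdefault keeps the timestamp from the first observation.
        started = seen.setdefault(battery_id, now)
        elapsed = now - started
    else:
        # The charge cycle is over (or never started); forget the timestamp.
        seen.pop(battery_id, None)
        elapsed = 0

    path.write_text(json.dumps(seen))
    return elapsed
```

A wrapper could compare the returned value with a threshold of its own choosing and raise a warning only when the battery has been charging longer than that. The cost is exactly the one mentioned above: there is now a state file to create, protect and clean up.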
Re: [Nagios-users] Why is check_openmanage so slow on PowerEdge R510?
Helmut Wollmersdorfer helmut.wollmersdor...@fixpunkt.de writes:

> new to this architecture I installed the monitoring plugin check-openmanage and was surprised about the performance:
>
> root@xen10:~# time perl /usr/lib/nagios/plugins/check_openmanage -d | head -n 3
> sh: /bin/rpm: not found
>    System:      PowerEdge R510 II        OMSA version:    6.5.0
>    ServiceTag:  1Z7215J                  Plugin version:  3.6.5
>    BIOS/date:   1.6.3 02/01/2011         Checking mode:   local
>
> real 0m3.426s
> user 0m2.456s
> sys  0m0.544s
>
> OS: Debian
>
> root@xen10:~# uname -a
> Linux xen10 2.6.32-5-xen-amd64 #1 SMP Tue Mar 8 00:01:30 UTC 2011 x86_64 GNU/Linux
>
> Most calls of check_openmanage (from the shell) take 3 - 4 seconds, some with '--only' are faster, but not as fast as omreport:
>
> root@xen10:~# time perl /usr/lib/nagios/plugins/check_openmanage --only fans
> FANS OK - 5 fan probes checked
>
> real 0m0.716s
>
> root@xen10:~# time /opt/dell/srvadmin/bin/omreport chassis fans
> Fan Probes Information
> Fan Redundancy
> Redundancy Status : Full
> [...]
>
> real 0m0.037s
>
> In comparison called with the option --help (does nearly nothing) the execution time is as expected for loading the perl interpreter and compiling the source:
>
> root@xen10:~# time perl /usr/lib/nagios/plugins/check_openmanage -h
> [...]
>
> real 0m0.064s
>
> What can be the reason?

Hi Helmut,

The simple answer is that omreport commands take time. They represent the vast majority of the plugin execution time.

The reason that 'check_openmanage --only fans' takes significantly more time than the corresponding omreport command is that the plugin first runs 'omreport -?' to determine if this is a blade or not. If you add the time it takes to run 'omreport -?', the omreport fans command and perl interpreter time you should arrive at about the time it takes 'check_openmanage --only fans' to finish.

Note that storage takes time to check, since the omreport commands for storage are slow. This is especially true if you have a lot of storage (e.g. an R510).

Also note that if you use the '-d' option, check_openmanage will run 'omreport about' to determine the OMSA version. This is a slow command and adds to the overall execution time.

The plugin is much faster if used in SNMP mode, especially if you have lots of storage. Example from a 2950 with a couple of MD1000 shelves of extra storage:

    $ time ./check_openmanage -H foo
    OK - System: 'PowerEdge 2950 III', SN: 'XXX', 16 GB ram (8 dimms), 3 logical drives, 32 physical drives

    real 0m1.725s
    user 0m0.397s
    sys  0m0.013s

    foo /# time /usr/lib64/nagios/plugins/check_openmanage
    OK - System: 'PowerEdge 2950 III', SN: 'XXX', 16 GB ram (8 dimms), 3 logical drives, 32 physical drives

    real 0m4.188s
    user 0m2.997s
    sys  0m0.821s

As you can see the footprint is significantly smaller with SNMP, so if this is a concern then SNMP should be your weapon of choice :)

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
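To check the "add up the pieces" explanation on your own box, each piece can be timed separately. This is a small sketch, not part of the plugin; the helper names are invented here, and the paths in COMMANDS are simply the ones quoted in this thread, so adjust them for your installation.

```python
import subprocess
import sys
import time

# Paths as quoted in the thread; they are assumptions for any other system.
OMREPORT = "/opt/dell/srvadmin/bin/omreport"
PLUGIN = "/usr/lib/nagios/plugins/check_openmanage"

COMMANDS = {
    "omreport -?": [OMREPORT, "-?"],
    "omreport chassis fans": [OMREPORT, "chassis", "fans"],
    "perl startup (plugin -h)": ["perl", PLUGIN, "-h"],
    "plugin --only fans": ["perl", PLUGIN, "--only", "fans"],
}


def time_command(argv):
    """Run argv once, discard its output, return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


def breakdown(commands):
    """Time every command in a {label: argv} mapping."""
    return {label: time_command(argv) for label, argv in commands.items()}
```

Calling breakdown(COMMANDS) on a server with OMSA installed and printing the result should show the first three entries summing to roughly the fourth, if the explanation above holds for your hardware. A single run is noisy, so repeat it a few times before drawing conclusions.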
Re: [Nagios-users] check_openmanage internal error
Adam Caines acai...@lab.icc.edu writes:

> Looks like it's reporting path health. The 6e has both sas ports connected to redundant controllers in the MD1120. It's strange on another server, I also have a PERC H700 connect to a MD1220 with redundant links and it does not output the path health section.
>
> [snip]
> ID             : 0
> Status         : Ok
> Name           : Logical Connector
> State          : Ready
> Connector Type : SAS Port RAID Mode
> Termination    : Not Applicable
> SCSI Rate      : Not Applicable
>
> Path Health
> Status         : Ok
> Name           : Connector 0
> State          : Available
>
> Status         : Ok
> Name           : Connector 1
> State          : Available

Yes, so this is the culprit... check_openmanage did not expect this output. It looks like the controller is connected to the enclosure in redundant path mode, according to the OMSA documentation[1].

I really need to see how this looks with SSV format, can you provide the output from this command:

    omreport storage connector controller=1 -fmt ssv

In case of redundant path mode, the plugin should check the path health and report on it, in addition to the connector health. This functionality must be added to the plugin.

Is it possible for you to check how check_openmanage handles this when checking via SNMP as well?

[1] http://support.euro.dell.com/support/edocs/software/svradmin/6.4/en/CLI/HTML/reportst.htm#wp1077100

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
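To make the parsing problem concrete, here is a sketch of a parser that treats the "Path Health" block as a list of sub-entries belonging to the preceding connector. It works on the human-readable output quoted above, not on the SSV format the plugin actually uses (that output was not posted in this message), and it is not the plugin's code. The function names are invented, and the line layout of SAMPLE is reconstructed from the quoted text.

```python
SAMPLE = """\
ID : 0
Status : Ok
Name : Logical Connector
State : Ready
Connector Type : SAS Port RAID Mode
Termination : Not Applicable
SCSI Rate : Not Applicable
Path Health
Status : Ok
Name : Connector 0
State : Available
Status : Ok
Name : Connector 1
State : Available
"""


def parse_connector_report(text):
    """Parse 'Key : Value' lines into connectors, each with a 'paths' list."""
    connectors = []
    target = None        # dict currently receiving key/value pairs
    in_paths = False     # True once a 'Path Health' heading has been seen
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Path Health":
            in_paths = True
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue     # blank line or heading without a value
        key, value = key.strip(), value.strip()
        if key == "ID":
            # An ID field starts a new connector record.
            target = {"paths": []}
            connectors.append(target)
            in_paths = False
        elif key == "Status" and in_paths and connectors:
            # Inside Path Health, each Status field starts a new path entry.
            target = {}
            connectors[-1]["paths"].append(target)
        if target is not None:
            target[key] = value
    return connectors


def unhealthy(connectors):
    """Names of connectors or paths whose Status is anything but 'Ok'."""
    bad = []
    for conn in connectors:
        for item in [conn] + conn["paths"]:
            if item.get("Status") != "Ok":
                bad.append(item.get("Name", "unknown"))
    return bad
```

With SAMPLE this yields one connector ("Logical Connector") carrying two path entries, instead of three pseudo-connectors with shifted fields. A parser that expects every record to start with ID will misread the path entries, which is consistent with the garbled debug lines reported earlier in the thread.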
Re: [Nagios-users] check_openmanage internal error
Adam Caines acai...@lab.icc.edu writes:

> Having a strange problem with check_openmanage. Use it without error on many other systems. Any help would be appreciated.
>
> check_openmanage version: 3.6.5 (.exe version)
> Dell OMSA version: 6.4.0
> OS: Windows Server 2008 R2
> Hardware: Poweredge 1950 with PERC 6/i and PERC 6/e connected to MD1120
>
> check_openmanage output:
>
> OK - System: 'PowerEdge 1950 III', SN: 'XXX', 8 GB ram (4 dimms), 2 logical drives, 28 physical drives
> INTERNAL ERROR: Use of uninitialized value in numeric lt () at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value in numeric lt () at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value in numeric lt () at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value in numeric lt () at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value in numeric lt () at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value in numeric lt () at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value $level in numeric eq (==) at script/check_openmanage line 4637.
> INTERNAL ERROR: Use of uninitialized value $level in numeric eq (==) at script/check_openmanage line 4637.
> INTERNAL ERROR: Use of uninitialized value $level in numeric eq (==) at
>
> If I run check_openmanage --no-storage the errors are not present:

Hi Adam,

Interesting. This is the status of the device (as reported by omreport) that is garbled somehow. The plugin will set the status to 'Unknown' if the field is missing or empty, so this means that omreport is reporting the status as something new that check_openmanage doesn't recognize.

That you're getting so many of them (and you have established that it's a storage issue), makes me think that it is related to physical disks. We need to see what omreport says about storage, in particular the disk drives. Can you send the output from

    omreport storage pdisk controller=X

where 'X' is the controller number (0,1), for each of the controllers. If the Status field is 'Ok' for all the disks, we need to look further.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
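The failure mode described here is a status string that is neither empty nor one of the known values, so the lookup for its alert level comes back undefined. A defensive lookup sidesteps that. The sketch below is illustrative only and is not taken from the plugin: the table lists just the status strings that appear in or are implied by this thread, and the names are invented.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed mapping for illustration; the plugin's real table may differ.
STATUS_TO_LEVEL = {
    "Ok": OK,
    "Non-Critical": WARNING,
    "Critical": CRITICAL,
}


def nagios_level(status):
    """Map an OMSA status string to a Nagios level, never returning None."""
    if status is None:
        return UNKNOWN
    # Unrecognized strings (for example a mis-parsed field) become UNKNOWN
    # instead of an undefined value that later breaks a numeric comparison.
    return STATUS_TO_LEVEL.get(status.strip(), UNKNOWN)
```

With a fallback like this, a new or mis-parsed status shows up as a visible UNKNOWN result rather than as a string of "uninitialized value" warnings.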
Re: [Nagios-users] check_openmanage internal error
Adam Caines acai...@lab.icc.edu writes:

> Looks like some strange output on the lines for controller 1? The formatting is breaking there. I checked omreport storage controller and didn't see anything that stood out as being strange.
>
> [snip]
>   OK |      0:0 | Connector 0 [SAS Port RAID Mode] on controller 0 is Ready
>   OK |      0:1 | Connector 1 [SAS Port RAID Mode] on controller 0 is Ready
>   OK |      1:0 | Logical Connector [SAS Port RAID Mode] on controller 1 is Ready
>      | 1:Status | State [Name] on controller 1 is Status
>      |     1:Ok | Available [Unknown type] on controller 1 is Unknown state
>      |     1:Ok | Available [Unknown type] on controller 1 is Unknown state

Ok, something strange going on here. This seems to be a parsing error in the plugin, related to the connectors. As I don't have any MD1120 enclosures, I'm curious if these errors are related to the MD1120 being different somehow. Can you send the output from these commands:

    omreport storage connector controller=0
    omreport storage connector controller=1

and also:

    omreport storage connector controller=0 -fmt ssv
    omreport storage connector controller=1 -fmt ssv

The latter is what the plugin is using as it is easier to parse.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Error in performance-data-output
Lichterfeld, Dirk dirk.lichterf...@enercity.de writes:

> I compare the response time of the nagios check and I see, that the DELL Server R710 needs over 10 seconds to answer. Another server (DELL R310) answer in 8 seconds (the check of this server is ok.) The response time depends on various Dell hardware.

Yes, this is expected when using the win32 binary file. It contains a perl interpreter and is slow to start up and execute. When monitoring windows machines, SNMP is preferable unless your security policies prohibit this.

> What I do? I expanded the check-command of the check_openmanage from
>
>   check_nrpe -H $HOSTADDRESS$ -c Check_Openmanage
>
> with the parameter -t 30 to extend the time for this check.

30 seconds is the default timeout for check_openmanage. I would set the check_nrpe timeout to slightly more than the check_openmanage timeout. If you do that, you'll get a meaningful error message from check_openmanage instead of a cryptic one from NSClient++, if check_openmanage times out for some reason. Anyway, the '-t 30' parameter to check_nrpe should work...

> Is there another way to set the timeout?

I'm not familiar with NSClient++, perhaps it has its own timeout?

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
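The timeout advice boils down to "outer timeout = inner timeout plus a little". A tiny sketch of how one might generate the check_nrpe command line with that rule; the helper name and the 5 second margin are arbitrary choices for illustration, and only the -H, -c and -t options already used in this thread are relied on.

```python
def nrpe_command(host, command, plugin_timeout=30, margin=5):
    """Build a check_nrpe argv whose timeout outlasts the remote plugin.

    plugin_timeout is the timeout check_openmanage itself runs with
    (30 seconds by default, per the thread); margin is an assumed cushion.
    """
    if margin <= 0:
        raise ValueError("margin must be positive so the plugin times out first")
    return ["check_nrpe", "-H", host, "-c", command,
            "-t", str(plugin_timeout + margin)]
```

For the check in question, nrpe_command("$HOSTADDRESS$", "Check_Openmanage") gives a command ending in "-t 35", so check_openmanage gets the chance to report its own timeout before check_nrpe gives up. Whether NSClient++ enforces a further timeout of its own is the open question raised above.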
Re: [Nagios-users] Error in performance-data-output
Lichterfeld, Dirk dirk.lichterf...@enercity.de writes:

> Hi Trond,
>
> I'm sorry, at my company we use Outlook, so the highlighted text is distinctly and visibly. I will try to specify the problem I mean. If I run NSClient++ in testmode I will get the follow output:
>
> d NSClient++.cpp(1106) Injecting: Check_OpenManage:
> d NSClient++.cpp(1142) Injected Result: OK 'OK - System: 'PowerEdge R710 II', SN: 'XXX', 4 GB ram (2 dimms), 1 logical drives, 4 physical drives'
> d NSClient++.cpp(1143) Injected Performance Result: 'fan_0_system_board_fan_1_rpm=3600;0;0 fan_1_system_board_fan_2_rpm=3600;0;0 fan_2_system_board_fan_3_rpm=3600;0;0 fan_3_system_board_fan_4_rpm=3600;0;0 fan_4_system_board_fan_5_rpm=3600;0;0 pwr_mon_0_ps_1_current=0.4;0;0 pwr_mon_1_ps_2_current=0.4;0;0 pwr_mon_2_system_board_system_level=175;917;966 temp_0_system_board_ambient=20;42;47 '
>
> You can see, the injected perfomance result beginns and ends with a '.

Yes, but I think that NSClient++ is responsible for that, putting everything inside single quotes. As you can see it does that for the plugin output as well.

> 1. I mean, that every description and only the description must be inside of the signs '
>
>    our output: fan_2_system_board_fan_3_rpm
>    must be:    'fan_2_system_board_fan_3_rpm'
>
> 2. At the end is no special sign approved.
>
> You can read this in chapter 2.6 Performance data at http://nagiosplug.sourceforge.net/developer-guidelines.html
>
> I hope I could describe the problem well enough.

Yes, thank you, this was much clearer :) However, the quotes are not needed according to the guidelines for performance data[1]:

    3. the single quotes for the label are optional. Required if
       spaces, = or ' are in the label

The perfdata labels don't contain any of the offending characters. Could it be that this is a Windows issue, or perhaps NSClient++? Any NSClient++ users here who can confirm if this is the case? I'm thinking that perhaps the underscore character '_' is throwing off Windows or NSClient++.

[1] http://nagiosplug.sourceforge.net/developer-guidelines.html#AEN201

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
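The quoting rule cited from the guidelines is easy to express in code. The sketch below applies it as quoted in this thread (quote only when the label contains a space, '=' or a single quote). Doubling an embedded single quote follows my reading of the same guidelines and should be treated as an assumption, as should the function names.

```python
def quote_label(label):
    """Quote a perfdata label only when the guidelines require it."""
    if any(ch in label for ch in " ='"):
        # Assumed escape rule: an embedded single quote is written twice.
        return "'" + label.replace("'", "''") + "'"
    return label


def perfdata(label, value, warn="", crit=""):
    """Format one label=value;warn;crit perfdata item."""
    return f"{quote_label(label)}={value};{warn};{crit}"
```

Applied to the labels in the NSClient++ output, nothing gets quoted: perfdata("temp_0_system_board_ambient", 20, 42, 47) returns temp_0_system_board_ambient=20;42;47, which is what the plugin already emits. A label such as "fan 2 rpm" would come out as 'fan 2 rpm'.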
Re: [Nagios-users] Check_openmanage-- Current probes not found
Joe Beck jb...@urbn.com writes:

> Yes, just after sending this post I did the things you identified. Verifed model vs others where this issue was not happening. We have several r610's, this is only one with the issue. Then I went looked at the omsa version, found this one was running 5.9 where the others had 6.4. I removed, installed 6.4 but same result. I also had some question/confusion about best way to identify the version; in fact it may have already been running 6.4. I'm grep'ing for version; tried running cmds with -v --version, etc but no luck in seeing which version via the cmds

This command will tell you which version of OMSA you're running:

    omreport about

There are other ways as well:

    http://folk.uio.no/trondham/software/check_openmanage.html#how-can-i-find-out-which-version-of-omsa-my-server-is-running

I'm not sure if you understood my question about the servers being identical. I didn't mean the model (I assumed the model would be the same), but hardware-wise. Specifically, are they alike with respect to number of power supplies?

In any case, the next step will be to examine the installed OMSA software components. On RHEL and derivatives such as CentOS, you can do this by comparing the output from 'rpm -qa|grep srvadmin' from healthy boxes versus the failing one. Also check that the running OMSA services are the same.

Since this is happening on only one server, and you have probably installed OMSA in exactly the same way on all the servers, you may have a real hardware problem. If all else fails, you should contact Dell support and have them look at it.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
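Comparing the 'rpm -qa|grep srvadmin' output from two hosts by eye gets tedious. A set difference does the same job, as in this sketch. The function name is invented, and the inputs are assumed to be the saved command output, one package per line.

```python
def package_diff(healthy_output, failing_output):
    """Compare two package listings, one package per line.

    Returns (missing_on_failing, extra_on_failing), both sorted.
    """
    healthy = {ln.strip() for ln in healthy_output.splitlines() if ln.strip()}
    failing = {ln.strip() for ln in failing_output.splitlines() if ln.strip()}
    return sorted(healthy - failing), sorted(failing - healthy)
```

Anything in the first list is installed on the healthy server but absent on the failing one, which would point to an incomplete install. If both lists come back empty, the package sets match and attention shifts to the running services or the hardware, as suggested above.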
Re: [Nagios-users] Check_openmanage-- Current probes not found
Joe Beck jb...@urbn.com writes:

> I have a couple R610's. Some run omreport chassis pwrmonitoring, return output. I also have 1 that returns:
>
> # omreport chassis pwrmonitoring
> Power Consumption Information
>
> Error : Current probes not found
>
> Does this mean that this module just isn't installed or ??? At this point, do I just alter the nagios service to exclude pwrmonitoring?

Hi Joe,

I think the next point should be to investigate why OMSA behaves like this. I've seen this error before, but on older servers with old OMSA versions (5.4.0).

A simple restart of OMSA (srvadmin-services.sh restart) may be the solution and should be attempted first. The next step would be to reinstall OMSA and verify that everything gets installed.

Usually, if power monitoring information is not available, OMSA should say something else and more informative. Is the problematic machine identical to the ones that work?

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage SNMP Error
Shawn Green sgr...@dotomi.com writes:

> I'm in the process of rolling out check_openmanage to monitor a variety of hardware including R510s, M600s, and M610s. I'm running into an interesting issue where the alert is reporting back:
>
> SNMP ERROR [cooling]: The requested entries are empty or do not exist.
>
> I understand this is an SNMP error (not check_openmanage), but what's baffling me is how to work around it. My Net::SNMP module is up to date (v6.0.1) as are net-snmp packages on all hosts. A good majority of hosts that are getting this error are M600/M610 blades, yet other blades in the same chassis do not get this error. I'm also seeing these on several R510s, yet other R510s have no problems. All hosts are Centos 5.5 64 bit with OMSA 6.2.0.

Hi Shawn,

One thing that is really peculiar is that you're getting this error from blade servers. The plugin should identify blades and ignore the fact that they don't have cooling devices (i.e. fans). You should never get this error from blades. Are you really sure that the error from your blades is with cooling and not something else? (If so, we'll need to investigate why the plugin doesn't identify the blade servers correctly).

Your Net::SNMP version is fine and not to blame. The error lies with OMSA and/or the SNMP service. Try running on the servers:

    omreport chassis fans

On the blades, you should get an error saying that no fan probes were found, which is normal. But the R510s should display fan info. If they don't, the problem is not SNMP related but with OMSA itself.

If you haven't already done so, try restarting OMSA (i.e. run 'srvadmin-services.sh restart') on the servers. Reinstalling OMSA (or better yet: reinstall with version 6.4.0) is the logical next step. Make sure that there are no errors during installation and that everything gets installed.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage and linebreaks
Bryan O'Shea bryanosh...@gmail.com writes:

> check_openmanage and linebreaks not working in $SERVICEOUTPUT$ emails. When using the either of the following options the linebreaks seem to be broken: -e or --postmsg
>
> This is what i get in my service notification emails instead of the desired output of seperate lines.
>
> Power Supply 1 [AC] needs attention: Presence detected, Failure detected, AC lost<br/>NOTE: PowerEdge 2950 III 437RQH1 - 555-1212
>
> It puts a <br/> in instead of a \n.

Hi Bryan,

The default behaviour of check_openmanage is to use HTML linebreaks when run from Nagios, NRPE etc., and regular linebreaks in a console which has a TTY. The reason for this is that the plugin monitors several things, and in case of multiple alerts it's practical to display them each on a different line.

However, since this behaviour doesn't fit everyone you can modify it with the '--linebreak' switch. To switch to regular (\n) linebreaks:

    check_openmanage --linebreak=REG

You can also specify any string as a custom linebreak:

    check_openmanage --linebreak=' -- '

If you choose regular linebreaks, the first line will be put in the SERVICEOUTPUT macro, while any subsequent lines will be put in the LONGSERVICEOUTPUT macro. This is how Nagios 3.x handles multiline output from plugins.

PS. In order for the default HTML linebreaks to work as intended in the web frontend, you should set escape_html_tags=0 in the Nagios config.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
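The linebreak behaviour described in this reply can be modelled in a few lines. This is a sketch of the described behaviour only, not the plugin's implementation; the function names are invented, and only the 'REG' keyword and custom strings mentioned in the reply are handled.

```python
def join_alerts(alerts, linebreak=None, is_tty=False):
    """Join alert lines the way the reply describes.

    linebreak=None models the default: newline on a TTY, HTML otherwise.
    'REG' forces regular newlines; any other string is used verbatim.
    """
    if linebreak is None:
        sep = "\n" if is_tty else "<br/>"
    elif linebreak == "REG":
        sep = "\n"
    else:
        sep = linebreak
    return sep.join(alerts)


def split_for_nagios(output):
    """Split plugin output the way Nagios 3.x fills its macros.

    Returns (SERVICEOUTPUT, LONGSERVICEOUTPUT): first line, then the rest.
    """
    first, _, rest = output.partition("\n")
    return first, rest
```

This shows why the notification email contained a literal <br/>: with the default separator the whole output is a single line, so everything lands in SERVICEOUTPUT. With --linebreak=REG only the first alert stays there and the remainder moves to LONGSERVICEOUTPUT, so a notification command that prints only $SERVICEOUTPUT$ would need to include $LONGSERVICEOUTPUT$ as well to show every line.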
Re: [Nagios-users] check_openmanage: Amperage probe 0 [System Board System Level] reads 0 W
Tom Sommer m...@tomsommer.dk writes:

> After upgrading OpenManage to version 6.4.0 on a DELL R410, check_openmanage 3.6.4 returns
>
> CRITICAL: Amperage probe 0 [System Board System Level] reads 0 W
>
> Is this due to OpenManage changing behavior (bug), or is the hardware really faulty? (doubtful) :)

Hi Tom,

Most likely this is some sort of bug in OpenManage, or something went wrong during upgrade. You should confirm the fault by running

    omreport chassis pwrmonitoring

Investigate the Status field. The only accepted value is Ok.

> I know I could just disable amperage checks, but I'd like not to. Anyone else seen this?

Sorry, no. Very often these problems are resolved simply by restarting OpenManage on the monitored server, or a reboot. The next step is to re-install OpenManage in case something was missed during install/upgrade. If all else fails, contact Dell support.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: 'Amperage probe 0 [System Board System Level] reads 0 W'
Tom Sommer m...@tomsommer.dk writes:

>>> After upgrading OpenManage to version 6.4.0 on a DELL R410, check_openmanage 3.6.4 returns
>>>
>>> CRITICAL: Amperage probe 0 [System Board System Level] reads 0 W
>>>
>>> Is this due to OpenManage changing behavior (bug), or is the hardware really faulty? (doubtful) :)
>>
>> Most likely this is some sort of bug in OpenManage, or something went wrong during upgrade. You should confirm the fault by running
>>
>>   omreport chassis pwrmonitoring
>
> # omreport chassis pwrmonitoring
> Power Consumption Information is not available on this system because all the Power Supply units on your system do not support PMBus or the firmware on your system does not support power monitoring.

Strange.. if the system doesn't support power monitoring, the plugin shouldn't complain about it. Are you using check_openmanage via SNMP or locally? (I'm guessing SNMP, and if so there are obvious inconsistencies between what OMSA displays through omreport and what is available via SNMP.) Did power monitoring work at all before upgrading OMSA?

>>> Anyone else seen this?
>>
>> Sorry, no. Very often these problems are resolved simply by restarting OpenManage on the monitored server, or a reboot. The next step is to re-install OpenManage in case something was missed during install/upgrade. If all else fails, contact Dell support.
>
> Tried all but the latter - guess it's a DELL bug.

I forgot one other possible cause: old BIOS and/or firmware. Newer versions of OMSA often need relatively up-to-date BIOS and firmware versions to function normally. You should upgrade all BIOS and firmware on the server.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage showing 0 logical drives with OMSA 6.4 and PERC4
Steve Jenkins stevejenk...@gmail.com writes:

> After upgrading three of the 1850s to Dell OMSA 6.4 today, I noticed something strange. The three of them now display in Nagios:
>
> OK - System: 'PowerEdge 1850', SN: '', 3 GB ram (6 dimms), 0 logical drives, 2 physical drives
> OK - System: 'PowerEdge 1850', SN: 'XXX', 12 GB ram (6 dimms), 0 logical drives, 2 physical drives
> OK - System: 'PowerEdge 1850', SN: 'XXX', 4 GB ram (6 dimms), 0 logical drives, 2 physical drives
>
> All three display 0 logical drives, even though they all have a working RAID array.
>
> [snip]
>
> The strange part is that OMSA 6.4 on the 1850s is clearly aware that there's a logical drive, because the GUI shows Virtual Disk 0 RAID-1 in the Storage Dashboard.

Hi Steve,

Interesting.. OMSA is obviously aware of the logical drives, but what does omreport actually say about them? Try running 'omreport storage vdisk controller=<number>'. You seem to be running check_openmanage in local mode, so the output from omreport is what matters.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage showing 0 logical drives with OMSA 6.4 and PERC4
Steve Jenkins stevejenk...@gmail.com writes:

On Tue, Jan 25, 2011 at 3:41 AM, Trond Hasle Amundsen t.h.amund...@usit.uio.no wrote:

Interesting. OMSA is obviously aware of the logical drives, but what does omreport actually say about them? Try running 'omreport storage vdisk controller=number'.

Looks like omreport sees the controller, but not the VDisk:

# omreport storage vdisk controller=0
No virtual disks found

OK, so that is the reason check_openmanage doesn't display any virtual disks. It relies on OMSA for the information, specifically on omreport when used in local mode. Based on the issue at hand and your reports about OMSA 6.4 and PERC4 controllers on the linux-poweredge list, it seems that the latest OMSA has serious issues with 8th gen Dell servers.

PS. You may have noticed that the plugin doesn't issue an alert when virtual disks are missing. The reason is that it's perfectly legal and plausible for a system to have no virtual disks. This is the downside of a plugin that discovers the components and monitors them at the same time: it can't alert on a missing component unless that component should be present in every server. A notable exception is controllers. Since being unable to display controllers is a common OMSA problem, check_openmanage will complain about missing controllers even though controller-less systems are possible.

Regards,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
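For a server that is known to have a RAID array, a small site-specific wrapper can cover the gap described in the PS above. The following is only a sketch under stated assumptions: `check_vdisk_output` is a hypothetical helper, not part of check_openmanage, and it assumes omreport prints the exact text "No virtual disks found" as shown in this thread.

```shell
#!/bin/bash
# Sketch: turn omreport's "No virtual disks found" into a Nagios CRITICAL.
# check_vdisk_output is a hypothetical helper, not part of check_openmanage.

# Reads the text printed by 'omreport storage vdisk controller=N' on stdin,
# prints a Nagios-style status line and returns the matching exit code
# (0 = OK, 2 = CRITICAL, 3 = UNKNOWN).
check_vdisk_output() {
    local output
    output=$(cat)
    if [ -z "$output" ]; then
        echo "UNKNOWN - no output from omreport"
        return 3
    fi
    if printf '%s\n' "$output" | grep -q 'No virtual disks found'; then
        echo "CRITICAL - omreport reports no virtual disks"
        return 2
    fi
    echo "OK - omreport did not report missing virtual disks"
    return 0
}

# Possible use on a server that should always have a virtual disk:
#   omreport storage vdisk controller=0 | check_vdisk_output
```

Such a wrapper only makes sense per host or per host group, for exactly the reason given above: a generic plugin cannot know which servers are supposed to have virtual disks.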
Re: [Nagios-users] Problem with check_openmanage
Jeffrey Watts jeffrey.w.wa...@gmail.com writes:

Hello, I'm using Mr. Amundsen's excellent check_openmanage plugin, and I'm getting an odd error:

$ check_openmanage -H myserver -C public
Power Supply 0 [AC] needs attention: Presence detected, Failure detected, AC lost
Voltage sensor 14 [PS 2 Voltage 2] is
INTERNAL ERROR: Use of uninitialized value $reading in sprintf at /usr/lib/nagios/plugins/check_openmanage line 3565.

Has anyone else seen this error? I'm running version 3.6.4. Please let me know what additional information is needed.

Hi Jeffrey,

This shouldn't happen, and I think I see where the problem is. Please try the version available here, and let me know if it performs any better:

http://folk.uio.no/trondham/software/test/

Cheers,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage
Jeffrey Watts jeffrey.w.wa...@gmail.com writes:

Thanks Trond! That seems to have fixed it. Here's what I see now:

./check_openmanage -H pkc-search28 -C tomgeco
Power Supply 0 [AC] needs attention: Presence detected, Failure detected, AC lost
Voltage sensor 14 [PS 2 Voltage 2] is Unknown reading

It comes up correctly now as a CRIT, too.

Good, thanks for reporting back. I'll include this fix in the next release.

The problem was that when the reading is not available, the plugin assumes that the reading is discrete (i.e. not a number but good, bad etc.). This assumption is wrong in cases where the reading is NOT discrete and is simply not available via SNMP. The fixed version will set the reading to "Unknown reading" when the reading can't be obtained.

(However, this situation shouldn't occur at all if OMSA is behaving as it should. Pulling the cable on one power supply would normally lead to a reading of 0 volts for that voltage probe.)

Cheers,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_OpenManage error
Jeffrey C. Veatch jeffrey.vea...@knowyourrights.com writes:

To whom it may concern: I have been trying to use check_openmanage in my Nagios configuration, but no matter what I do I get a list of Internal Errors at the end of the returned text. The only way I can avoid it is by using the debug mode and only returning the first 80 lines. This however does not warn me of any issues the server is having.

Here are some details. The server running OMSA is an R710 running VMware ESX 4.0.0 Update 2. OMSA version is 6.4. The Nagios server is in a virtual machine running OpenSUSE 11.3. The Nagios version is 3.2.3. If there are other packages whose versions you need to know, let me know. The following is an example of the results that I get. Oh, and in Nagios this ends up being an unknown state for the check.

VLinux:/usr/local/nagios/libexec # ./check_openmanage -H 192.168.10.21
OK - System: 'PowerEdge R710', SN: '5QTMZK1', 72 GB ram (18 dimms), 1 logical drives, 2 physical drives
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 588.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 655.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 708.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 764.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 869.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 952.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1028.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1103.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1168.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1325.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1531.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1549.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1563.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1577.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1591.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1613.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1633.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1653.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1674.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1702.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1737.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1846.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1968.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1973.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1978.
INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/5.12.1/Net/SNMP.pm line 1983.

Thanks for any help you can give me.

Hi Jeffrey,

Interesting error, never seen this one before :)

check_openmanage will print any perl warnings that occur during execution as internal errors. This is done to avoid situations where the plugin stops working due to perl incompatibilities etc. without your knowledge, as Nagios completely ignores any plugin output to STDERR.

Which version of Net::SNMP are you using? Try 'rpm -q perl-Net-SNMP' to find out. Perl 5.12 deprecated the locked attribute, and this was fixed in Net::SNMP version 6.0.1, i.e. the latest release. The changelog for Net::SNMP 6.0.1 has the following:

- Removed all occurrences of the locked attribute that was deprecated in Perl 5.12.0.

I believe this to be a problem with your distribution shipping an old/incompatible version of Net::SNMP. It seems that for perl 5.12.x you need Net::SNMP 6.0.1 (or any later version).

PS. I found this in the OpenSUSE bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=629698

Cheers,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
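The version rule stated above (perl 5.12.x needs Net::SNMP 6.0.1 or later) can be written down as a quick check. This is a rough sketch, not something from the plugin: `version_ge` and `net_snmp_ok` are hypothetical helper names, the version strings are passed in by hand, and the comparison assumes `sort -V` from GNU coreutils is available.

```shell
#!/bin/bash
# Sketch: check a perl / Net::SNMP version pair against the rule
# "perl 5.12.x needs Net::SNMP 6.0.1 or later".
# Both function names are hypothetical; 'sort -V' assumes GNU coreutils.

# version_ge A B: succeeds when version A is greater than or equal to B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# net_snmp_ok PERL_VERSION NET_SNMP_VERSION
# Prints a verdict; returns 1 for the known-bad combination, 0 otherwise.
net_snmp_ok() {
    local perl_version=$1 net_snmp_version=$2
    if version_ge "$perl_version" 5.12.0 && ! version_ge "$net_snmp_version" 6.0.1; then
        echo "Net::SNMP $net_snmp_version is too old for perl $perl_version; 6.0.1 or later is needed"
        return 1
    fi
    echo "Net::SNMP $net_snmp_version should be fine with perl $perl_version"
    return 0
}

# Example, with the versions read off 'perl -v' and 'rpm -q perl-Net-SNMP':
#   net_snmp_ok 5.12.1 6.0.0
```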
Re: [Nagios-users] check_openmanage plugin reporting Firmware out of date
Trond Hasle Amundsen t.h.amund...@usit.uio.no writes:

Surangiwala, Asif asif.surangiw...@firstdata.com writes:

Can we update the check_openmanage script to parse the Minimum Required Firmware Version and compare it with the current Firmware Version to overcome the OMSA bug?

It is entirely possible to mitigate this bug within the plugin, but I don't think that it's a good idea to let the plugin do all version parsing and ignore OMSA on a general basis. I have created a version that works around this particular bug (version 3.6.2-p1) and made it available here:

http://folk.uio.no/trondham/software/omsa-fw-bug/

It simply ignores out-of-date firmware if the firmware and minimum firmware versions match those in question. But in order for this to work, I also had to turn off checking the global health status, which inherits the non-critical status of the controller.

DISCLAIMER: This version is only intended as a temporary solution for users of OMSA 6.3.0 who struggle with the recent firmware bug and don't want to use blacklisting as a workaround. When OMSA 6.4.0 becomes available, you should upgrade OMSA and revert to a regular release of check_openmanage.

Hi Asif,

Dell has released OMSA 6.4.0, which fixes the firmware version parsing issue. I have also released a new version of check_openmanage that contains a few compatibility fixes for OMSA 6.4.0:

http://folk.uio.no/trondham/software/check_openmanage.html#download

Cheers,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage plugin reporting Firmware out of date
Surangiwala, Asif asif.surangiw...@firstdata.com writes:

I have Dell Open Manage Server Administrator 6.3.0 installed on some Dell R710’s with PERC H700 controller. When I run the Nagios plugin check_openmanage, it reports the following:

Controller 0 [PERC H700 Integrated]: Firmware '12.10.0-0025' is out of date

The H700 is running the latest firmware 12.10.0-0025, check_openmanage plugin is v3.6.2 by Trond H. Amundsen. OMSA is running fine and is not complaining about any firmware issues. The same ‘Firmware out of date’ warning is also given for H800 controllers on the R710’s having it. Is there an issue with the plugin’s interaction with OMSA?

Hi Asif,

This is a bug in OMSA, not check_openmanage. OMSA is reporting that the firmware is too old while clearly it is not. Dell has stated that the bug will be fixed in the next version of OMSA. For more information, see the following thread on the Linux-Poweredge mailing list:

http://lists.us.dell.com/pipermail/linux-poweredge/2010-December/043713.html

As a workaround, I suggest using blacklisting to suppress the false warnings until OMSA 6.4.0 is released and deployed on your systems:

check_openmanage -b ctrl_fw=all [..other options..]

Regards,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage plugin reporting Firmware out of date
Surangiwala, Asif asif.surangiw...@firstdata.com writes:

Can we update the check_openmanage script to parse the Minimum Required Firmware Version and compare it with the current Firmware Version to overcome the OMSA bug?

It is entirely possible to mitigate this bug within the plugin, but I don't think that it's a good idea to let the plugin do all version parsing and ignore OMSA on a general basis. I have created a version that works around this particular bug (version 3.6.2-p1) and made it available here:

http://folk.uio.no/trondham/software/omsa-fw-bug/

It simply ignores out-of-date firmware if the firmware and minimum firmware versions match those in question. But in order for this to work, I also had to turn off checking the global health status, which inherits the non-critical status of the controller.

DISCLAIMER: This version is only intended as a temporary solution for users of OMSA 6.3.0 who struggle with the recent firmware bug and don't want to use blacklisting as a workaround. When OMSA 6.4.0 becomes available, you should upgrade OMSA and revert to a regular release of check_openmanage.

Regards,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_OpenManage INTERNAL ERROR
Benny Somali benny.som...@firstnational.ca writes:

Works fine now.

Good, thanks for testing.

By the way, the Status Information field is blank, is it related to the max length of 1023 chars?

Probably not. You shouldn't run into problems with the silly nrpe limit except on large servers with lots of performance data, and then only the perfdata should be affected. My guess is that the State field is also empty for the failed disk. I have an updated beta for you here:

http://folk.uio.no/trondham/software/beta/

It should now report that the disk is Unknown State.

Regards,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_OpenManage INTERNAL ERROR
Benny Somali benny.som...@firstnational.ca writes:

Ignore my previous question.

Too late, but no problem. My one-line patch is easily reversed :)

It works fine now. I used a batch script and didn't add a line to turn the echo off, so it returned special characters. So I added @echo off and the Status Information displayed.

Good. Thanks again for reporting this and for testing the beta version.

Regards,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_OpenManage INTERNAL ERROR
Benny Somali benny.som...@firstnational.ca writes:

INTERNAL ERROR: substr outside of string at script/check_openmanage line 1502.
INTERNAL ERROR: Use of uninitialized value in lc at script/check_openmanage line 1502.

Hi Benny,

Thanks for reporting this. The error is related to the vendor of physical disks as reported by omreport. What does 'omreport storage pdisk controller=0' say? I'm guessing that the Vendor field is empty or missing for one of the disks.

Finding the root cause would be interesting. Can you tell if the disk in question is an original disk supplied by Dell? If it isn't, this could be the reason that the vendor field is empty/missing, i.e. Openmanage doesn't recognize it. If it is a Dell drive, we're probably dealing with a rare Openmanage oddity.

In any case, check_openmanage should handle this situation more gracefully. I'll provide a patched version for you to test on Monday.

Cheers,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_OpenManage INTERNAL ERROR
Benny Somali benny.som...@firstnational.ca writes:

Yes, you are right. Pdisk #1 has an empty vendor ID field. The disk in question was an original Dell disk, but it seems to have gone bad. We have an open trouble ticket with Dell and expect to get a replacement disk.

Ah, it makes sense that in some circumstances, if the disk is sufficiently bad, Openmanage can't report the vendor. I went ahead and patched this in the plugin. There is a beta version (win32 binary) available here:

http://folk.uio.no/trondham/software/beta/check_openmanage.exe

Please give it a try and let me know if it resolves this issue.

Cheers,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Question on setting up my own check
Marc Powell li...@xodus.org writes:

On Oct 19, 2010, at 2:20 PM, steve f wrote:

Hello All, I have the following script created to check free space on a remote legacy box via rsh.

used=`sudo rsh $1 df -v |grep starlite6 | head -1 | awk '{print $4}'`
free=`sudo rsh $1 df -v |grep starlite6 | head -1 | awk '{print $5}'`

Beyond just good programming practice, always use full paths to external programs within your scripts. $PATH may not be what you expect it to be, especially when being run by the nagios daemon, which has a more restrictive environment.

# (paths may be different on your system)
used=`/usr/bin/sudo /usr/bin/rsh $1 /bin/df -v | /bin/grep starlite | /usr/bin/head -1 | /usr/bin/awk '{print $4}'`

Or... set PATH before doing anything else, e.g.

#!/bin/bash
PATH=/bin:/sbin:/usr/bin:/usr/sbin
export PATH
[...rest of script...]

This is more readable than using full paths everywhere.

Cheers,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
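To round off the PATH advice, here is a minimal sketch of how the rest of such a check script might turn the two numbers into a Nagios state. The original script's thresholds and output were not posted, so `disk_state`, the warning/critical percentages and the message wording are all assumptions made for illustration.

```shell
#!/bin/bash
# Sketch of a check script that sets PATH first, as suggested above.
# disk_state and its thresholds are hypothetical; the original script's
# logic beyond the 'used' and 'free' assignments is not known.
PATH=/bin:/sbin:/usr/bin:/usr/sbin
export PATH

# disk_state USED FREE WARN_PCT CRIT_PCT
# Prints a Nagios-style status line and returns 0/1/2/3
# (OK/WARNING/CRITICAL/UNKNOWN) based on the percentage used.
disk_state() {
    local used=$1 free=$2 warn=$3 crit=$4
    local total=$((used + free))
    # Empty values (e.g. rsh failed) evaluate to 0 and end up here.
    if [ "$total" -le 0 ]; then
        echo "UNKNOWN - could not read disk usage"
        return 3
    fi
    local pct=$((100 * used / total))
    if [ "$pct" -ge "$crit" ]; then
        echo "CRITICAL - ${pct}% used"
        return 2
    elif [ "$pct" -ge "$warn" ]; then
        echo "WARNING - ${pct}% used"
        return 1
    fi
    echo "OK - ${pct}% used"
    return 0
}

# With 'used' and 'free' filled in by the rsh/df pipeline from the thread:
#   disk_state "$used" "$free" 80 90; exit $?
```

Returning UNKNOWN when the values are empty matters here because a failed rsh call would otherwise look like a healthy, empty disk.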
Re: [Nagios-users] Bug in check_openmanage ?
rb...@free.fr writes:

omreport chassis pwrmanagement
Power Budget Information is not available on this system.

In fact, I solved the problem by updating/resetting the iDRAC.

Ok, good to know. I'm still a little concerned that there was a hardware problem that check_openmanage didn't identify properly. Please let me know if this happens again.

But the Nagios plugin still fails and I don't know why ...

./tmp/check_openmanage -H 10.1.19.193
SNMP ERROR [cooling]: Requested entries are empty or do not exist.

This is a completely different problem. Cooling devices (i.e. fans) should exist in all servers except blades. Which type of server is this, and do you know if it has fans or not?

The error above is from the Net::SNMP perl module. If the plugin doesn't get the data it expects when polling via SNMP, it will forward the error message from Net::SNMP.

Regards,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Bug in check_openmanage ?
rb...@free.fr writes:

OOPS! Something is wrong with this server, but I don't know what. The global system health status is CRITICAL, but every component check is OK. This may be a bug in the Nagios plugin, please file a bug report.

The status changed from OK to Unknown... Can anybody help me debug this?

Hi Rémi,

Thanks for reporting this. As an extra precaution, check_openmanage will check the global health status in addition to each of the components, provided you don't use blacklisting and/or check control in a way that could make the global check a false positive.

This case seems to be a real issue where a component is bad and the global health status reflects this. The component in question is not checked by the plugin for some reason. I'd like to narrow down the suspect pool. If you have login access to this server, can you send the output from the following command:

omreport chassis

If this command reports that everything is OK, we're probably dealing with a storage problem. Just to rule out blacklisting bugs etc., what is the command definition for check_openmanage in your Nagios config?

Regards,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Bug in check_openmanage ?
rb...@free.fr writes:

Hi Trond, You are right ...

# omreport chassis
Health
Main System Chassis
SEVERITY : COMPONENT
Ok : Memory
Critical : Power Management
Ok : Processors
Ok : Temperatures
Ok : Voltages
Ok : Hardware Log
Ok : Batteries
For further help, type the command followed by -?

On the iDRAC I have the message "System Board Current Latch".

This is interesting. Have you configured power budgeting on this server? What does this command say:

omreport chassis pwrmanagement

On a regular R805 here it just says:

Power Budget Information is not available on this system.

but we've never configured or used this feature, so I don't know anything about it. I'm thinking that perhaps check_openmanage should support these and similar configurable OMSA features.

Regards,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
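When the plugin reports a bad global status with every component OK, the 'omreport chassis' summary quoted in this message is the quickest way to spot the culprit. The sketch below pulls out the non-Ok rows; `non_ok_components` is a hypothetical helper, and it assumes the "SEVERITY : COMPONENT" layout shown in this thread.

```shell
#!/bin/bash
# Sketch: list chassis components whose severity is not "Ok".
# non_ok_components is a hypothetical helper; it assumes the
# "SEVERITY : COMPONENT" layout of 'omreport chassis' shown in this thread.

# Reads 'omreport chassis' output on stdin and prints one line per
# component that is not Ok, in the form "Component (Severity)".
non_ok_components() {
    awk -F':' '
        NF == 2 {
            sev = $1; comp = $2
            # Trim padding around both fields.
            gsub(/^[ \t]+|[ \t]+$/, "", sev)
            gsub(/^[ \t]+|[ \t]+$/, "", comp)
            # Skip the header row and healthy components.
            if (sev != "" && sev != "SEVERITY" && sev != "Ok")
                print comp " (" sev ")"
        }
    '
}

# Possible use on the monitored server:
#   omreport chassis | non_ok_components
```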
Re: [Nagios-users] Problem with check_openmanage and Open Manage 6.3
Luca Olivotto lolivo...@gmail.com writes:

Hello, I have a problem with the check_openmanage plugin. If I use this command:

./check_openmanage -H xx.xx.xx.xx

I get this result:

OOPS! Something is wrong with this server, but I don't know what. The global system health status is WARNING, but every component check is OK. This may be a bug in the Nagios plugin, please file a bug report.

The server that I'm checking is a PowerEdge 2950, and I suppose that the problem is the version of OpenManage installed on the server. The version is 6.3, and the only warnings shown in the web interface are about the old versions of the firmware/driver/storport driver of the controller. If I try this command:

check_openmanage -H 10.10.10.6 -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all -s -e

the output is:

OK - System: 'PowerEdge 2950', SN: 'xx', 16 GB ram (4 dimms), 0 logical drives, 0 physical drives

As you can see, the disks are not checked (that server has a broken mirror). The version of check_openmanage is 3.6.0.

Hi Luca,

Your analysis is correct. OMSA doesn't display storage info via SNMP, but there is something wrong with a storage component. For some reason, OMSA senses the storage failure and the global health status inherits this failure status, but OMSA doesn't display the storage. This condition will trigger the behaviour you are seeing.

The plugin searches for storage controllers. If it doesn't find any controllers, it concludes that there is no storage altogether and will skip subsequent checks of disk drives etc. Do you see any controllers when running this command on the server:

omreport storage controller

Regards,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage and Open Manage 6.3
Luca Olivotto lolivo...@gmail.com writes:

Yes, I see the PERC 6/i controller.

Ok, thanks. I then suspect that the problem lies with the SNMP part of OMSA. Can you run the following command from your Nagios server to confirm:

snmpwalk -v2c -c community hostname/ip 1.3.6.1.4.1.674.10893.1.20.130.1

The result should look something like this:

$ snmpwalk -v2c -c public foobar 1.3.6.1.4.1.674.10893.1.20.130.1
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.1.1 = INTEGER: 1
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.2.1 = STRING: PERC 6/i Integrated
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.3.1 = STRING: DELL
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.4.1 = INTEGER: 6
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.5.1 = INTEGER: 1
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.7.1 = INTEGER: 30
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.8.1 = STRING: 6.2.0-0013
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.9.1 = INTEGER: 256
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.10.1 = INTEGER: 0
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.11.1 = INTEGER: 6
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.12.1 = INTEGER: 2
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.37.1 = INTEGER: 3
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.38.1 = INTEGER: 3
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.39.1 = STRING: \\0
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.40.1 = INTEGER: 3
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.41.1 = STRING: 00.00.04.17-RH1
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.42.1 = STRING: embedded
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.43.1 = INTEGER: 99
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.47.1 = INTEGER: 2
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.48.1 = INTEGER: 30
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.49.1 = INTEGER: 30
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.50.1 = INTEGER: 30
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.51.1 = INTEGER: 30
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.52.1 = INTEGER: 1
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.53.1 = INTEGER: 1
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.54.1 = INTEGER: 32
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.57.1 = INTEGER: 99
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.58.1 = INTEGER: 99

Regards,
-- Trond H. Amundsen t.h.amund...@usit.uio.no Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage and Open Manage 6.3
Luca Olivotto lolivo...@gmail.com writes:

> that is the output:
>
>     SNMPv2-SMI::enterprises.674.10893.1.20.130.1 = No Such Object available on this agent at this OID

Ok, this confirms that the problem lies with OMSA, specifically the SNMP functionality. I'm afraid that I can't offer many clues about how to fix this. I would try restarting the OMSA and SNMP services, and if that doesn't work, reinstall OMSA completely.

Best of luck,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
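The diagnosis above rests on telling a populated controller table apart from a "No Such Object" reply. A minimal sketch of that check in Python, working on captured snmpwalk text; the function name is hypothetical and is not part of check_openmanage:

```python
import re


def storage_snmp_available(snmpwalk_output: str) -> bool:
    """Return True if the captured walk contains at least one data row.

    Hypothetical helper for illustration; it only inspects text that was
    already produced by snmpwalk and does not talk to any agent itself.
    """
    rows = 0
    for line in snmpwalk_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The agent answers like this when the subtree is not registered.
        if "No Such Object" in line or "No Such Instance" in line:
            return False
        # A data row looks like "<oid> = <TYPE>: <value>".
        if re.match(r"^\S+ = \w+: ", line):
            rows += 1
    return rows > 0
```

If this returns False for the storage subtree while the chassis subtree answers normally, the plugin has nothing to read for storage over SNMP, which matches the symptom in this thread.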
Re: [Nagios-users] check_openmanage ignores blacklist directive
C. Bensend be...@bennyvision.com writes:

>> Despite of giving it the parameter to ignore Warnings about the controller firmware, it still gives a Warning Status:
>>
>>     /usr/lib/nagios/plugins/check_openmanage -b ctrl_fw -s -H 192.168.2.137
>
> 'ctrl_fw' isn't the complete option you need to give there - you also need to specify the ID per:
>
> http://folk.uio.no/trondham/software/check_openmanage.8.html
>
> Try 'ctrl_fw=0,1'

Yes, or:

    ctrl_fw=all

..if you wish to blacklist this for all controllers and aren't interested in specifying controller IDs.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
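To make the blacklist syntax concrete, here is a rough Python sketch of how a spec such as 'ctrl_fw=0,1' or 'ctrl_fw=all' could be interpreted, including the slash-separated form seen elsewhere on this list ('ctrl_fw=all/ctrl_driver=all'). The function names are hypothetical and this is not the plugin's own Perl code; rejecting a bare 'ctrl_fw' with an error is a choice made for this sketch only.

```python
def parse_blacklist(spec: str) -> dict:
    """Parse 'ctrl_fw=0,1/ctrl_driver=all' into {'ctrl_fw': ['0', '1'], ...}.

    Hypothetical helper for illustration only.
    """
    result = {}
    for part in spec.split("/"):
        if not part:
            continue
        if "=" not in part:
            # A bare component name carries no IDs, so nothing would match.
            raise ValueError(f"{part!r} has no IDs; try {part}=all")
        component, ids = part.split("=", 1)
        result.setdefault(component, []).extend(i for i in ids.split(",") if i)
    return result


def is_blacklisted(blacklist: dict, component: str, comp_id) -> bool:
    """True if the given component ID is covered by the parsed blacklist."""
    ids = blacklist.get(component, [])
    return "all" in ids or str(comp_id) in ids
```

With this reading, 'ctrl_fw=0,1' silences firmware warnings for controllers 0 and 1 only, while 'ctrl_fw=all' covers any controller ID.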
Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage
Max Williams max.willi...@mflow.com writes:

> Here is the output, the inactive temperature probe is sorted but the missing EMM still produces an alert:
>
>     OK | 1:1:0:1 | Temperature Probe 1 in enclosure 3 [MD1000] is Inactive

This one works as expected :)

>     OK | 1:1:0:2 | Temperature Probe 2 in enclosure 3 [MD1000]: C ( max)
>     OK | 1:1:0:3 | Temperature Probe 3 in enclosure 3 [MD1000]: C ( max)

Hmm... something strange going on here. I wonder why this happens, in the SNMP output you attached previously the values are there. Anyway, I've added some extra checking in the code to make it report better if the reading is unavailable for some reason. It should now report simply:

    Temperature Probe 0 in enclosure 2:0:0 [MD1000] is Ready

if the temp reading is not an integer and OMSA reports the status as OK.

>     CRITICAL | 1:1:0:1 | EMM 1 in enclosure 3 [MD1000] needs attention: Not Installed

Ah.. I misread the SNMP output.. The status is Unknown when reported by omreport, but Other when reported with SNMP. One little annoying difference between the two.. The output should be:

    EMM 0 in enclosure 2:0:0 [MD1000] is Not Installed

with an OK state. I've created a second test version:

http://folk.uio.no/trondham/software/beta/check_openmanage

Please give this one a try and see if it performs better.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage
Max Williams max.willi...@mflow.com writes:

> Excellent, sorted, everything reports as OK now.

Good. I'll try to make a release with these changes in the next couple of days.

> Thanks so much Trond, amazing support and an amazingly useful plugin!

Glad you like it, Max. Thanks for reporting this issue :)

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage
Max Williams max.willi...@mflow.com writes:

> Hi, After adding more storage to a couple of our servers we are getting this error:
>
>     [r...@host ~]# /usr/lib64/nagios/plugins/check_openmanage -C password -b ctrl_driver=0,1,2 -b ctrl_fw=0,1,2 -b intr=0 -H host2
>     Temperature Probe 1 in enclosure 3 [MD1000] is Inactive C at ( max)
>     EMM 1 in enclosure 3 [MD1000] needs attention: Not Installed
>     INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage line 2312.
>     INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage line 2312.
>     INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage line 2318.
>     INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage line 2318.
>     INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage line 2318.
>     INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage line 2318.
>     [r...@host ~]#
>
> We didn't get this error before adding a new cabinet of disks which now brings the total up to 47 (2x internal disk and 3x full MD1000s). Has any one else come across this error? I am not perl literate so not sure how to debug or fix this.

Hi Max,

This is interesting. I've never seen Inactive temperature sensors in external enclosures. Also, that the plugin reports missing EMMs seems like a misfeature. Can you post the output from the following commands:

On the monitored host:

    omreport storage enclosure controller=id enclosure=id info=temps
    omreport storage enclosure controller=id enclosure=id info=emms

Replace id with controller/enclosure pairs. You'll get the enclosure and controller IDs with the commands

    omreport storage controller
    omreport storage enclosure

Also, since you're checking with SNMP, I'll need the output from an snmpwalk of the enclosures wrt. temperatures and EMMs.

From the Nagios server:

    snmpwalk -v2c -c community hostname 1.3.6.1.4.1.674.10893.1.20.130.11
    snmpwalk -v2c -c community hostname 1.3.6.1.4.1.674.10893.1.20.130.13

If you are uncomfortable with posting this information on the mailing list, feel free to email me directly. Debug output from the plugin could also be useful:

    check_openmanage -H hostname -C community -d

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage
Max Williams max.willi...@mflow.com writes:

> Both of the new enclosures show the same output so perhaps these just have a different configuration to the others we have here.

Yes. I suspect that this is related to one EMM not being installed. My guess is that the inactive temperature sensor is located in the EMM, but there is no way to tell since neither the omreport output nor the SNMP output reveals the location of the temperature sensors. Or perhaps the EMM is needed to activate the sensor. We always order our MD1000s with 2 EMMs, so this is something that I haven't had the opportunity to test.

I have created a test version for you to try. This version should:

  * report inactive temperature sensors as OK
  * report EMMs with state Not Installed as OK

In addition, it checks that the readings from the sensors are in fact digits before attempting to print the values.

The test version is located here:

http://folk.uio.no/trondham/software/beta/

Try it with the '-d' option to see that it reports these things properly.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
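The "check that the reading is digits before printing" idea can be sketched as follows. This is a Python illustration of the guard, not the plugin's Perl code; the function name and the exact "reads N C" wording are assumptions made for the example.

```python
import re


def temperature_message(probe: str, reading, status_text: str = "Ready") -> str:
    """Format a probe line, falling back to the status if there is no usable reading.

    Hypothetical helper: 'reading' is whatever the data source returned,
    which may be None, an empty string, or a numeric string.
    """
    if reading is not None and re.fullmatch(r"-?\d+", str(reading).strip()):
        return f"{probe} reads {int(reading)} C"
    # No integer reading available: report only the status text.
    return f"{probe} is {status_text}"
```

Guarding the value before formatting is what avoids the "uninitialized value in sprintf" class of warnings: the formatting branch is only reached when there is an integer to format.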
Re: [Nagios-users] check_openmanage plugin error
Andrea Ballarati ballar...@interfree.it writes:

> Nagios reports error from the plugin in subject, we have another Dell PowerEdge 1950 for which no errors are reported. This is the output of check_openmanage -d
>
>     System:          PowerEdge 1800
>     ServiceTag:
>     OMSA version:    4.5.0
>     BIOS/date:       A05 09/21/2005
>     Plugin version:  3.5.7
>
>     Storage Components
>      STATE   | ID    | MESSAGE TEXT
>     ---------+-------+--------------------------------------------------
>      WARNING | 0     | Controller 0 [CERC SATA 1.5/2s] needs attention: Degraded
>      OK      | 0:0:0 | Array Disk 0:0 [1.0TB] on ctrl 0 is Online
>      OK      | 0:0:1 | Array Disk 0:1 [1.0TB] on ctrl 0 is Online
>      OK      | 0:0   | Logical drive 0 'Windows Disk 0' [RAID-1, 931.48 GB] on ctrl 0 is Ready
>      OK      | 0:0   | Channel 0 [] on controller 0 is Ready
>
>     Chassis Components
>      STATE   | ID    | MESSAGE TEXT
>     ---------+-------+--------------------------------------------------
>      OK      | 1     | Memory module 1 [DIMM1_A, 512 MB] is Ok
>      OK      | 2     | Memory module 2 [DIMM1_B, 512 MB] is Ok
>      OK      | 1     | Chassis fan 1 [BMC Fan 1]: 1500
>      OK      | 2     | Chassis fan 2 [BMC Fan 2]: 1500
>      OK      | 0     | Power Supply 0 [VRM]: Presence detected
>      OK      | 1     | Power Supply 1 [VRM]: Presence detected
>      OK      | 0     | Temperature Probe 0 [PROC_1 Temp] reads 38 C (max=120/125)
>      OK      | 1     | Temperature Probe 1 [BMC Ambient Temp] reads 22 C (min=8/3, max=40/45)
>      OK      | 2     | Temperature Probe 2 [BMC Planar Temp] reads 33 C (min=8/3, max=62/67)
>      OK      | 3     | Temperature Probe 3 [BMC VRD 0 Temp] reads 31 C (min=8/3, max=70/75)
>      OK      | 4     | Temperature Probe 4 [BMC VRD 1 Temp] reads 27 C (min=8/3, max=70/75)
>      OK      | 0     | Processor 0 [Intel Xeon 3.00GHz] is Present
>      OK      | 0     | Voltage sensor 0 [BMC CMOS Battery] is 3.070 V
>      OK      | 1     | Voltage sensor 1 [PROC_1 VCORE] is Good
>      OK      | 2     | Voltage sensor 2 [BMC PROC VTT] is Good
>      OK      | 3     | Voltage sensor 3 [BMC 1.5V PG] is Good
>      OK      | 4     | Voltage sensor 4 [BMC 1.8V PG] is Good
>      OK      | 5     | Voltage sensor 5 [BMC 3.3V PG] is Good
>      OK      | 6     | Voltage sensor 6 [BMC 5V PG] is Good
>      OK      | 0     | Chassis intrusion 0 detection: Ok (Not Breached)
>
>     Other messages
>      STATE   | MESSAGE TEXT
>     ---------+--------------------------------------------------
>      OK      | ESM log health is Ok (less than 80% full)
>
>     INTERNAL ERROR: Use of uninitialized value in numeric eq (==) at /usr/lib/nagios/plugins/check_openmanage line 1380.
>     INTERNAL ERROR: Use of uninitialized value in numeric eq (==) at /usr/lib/nagios/plugins/check_openmanage line 1380.
>     INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib/nagios/plugins

Hi Andrea,

check_openmanage is designed to work with relatively recent OMSA versions. You are using OMSA version 4.5.0, which is very old. The server in question (PowerEdge 1800) is supported by newer OMSA, so the solution is an OMSA upgrade to the latest version (6.2.0).

OMSA versions 5.3.0 and later are OK to use with check_openmanage, and I've had reports that 5.1.0 and 5.2.0 work as well (but no guarantee). Anything older will yield strange results or will simply not work.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
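The version guidance above can be turned into a small pre-flight check. The sketch below is a Python illustration with hypothetical function names; the thresholds come from this message, and treating everything between 5.1.0 and 5.3.0 as "reported working" is a simplification (only 5.1.0 and 5.2.0 were actually mentioned).

```python
def parse_version(text: str) -> tuple:
    """Turn '6.2.0' into (6, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def omsa_support_level(version: str) -> str:
    """Classify an OMSA version string against the guidance in this thread."""
    v = parse_version(version)
    if v >= (5, 3, 0):
        return "ok"
    if v >= (5, 1, 0):
        return "reported working, no guarantee"
    return "too old"
```

Comparing tuples of integers rather than raw strings matters here: as strings, "10.0.0" would sort before "5.3.0".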
Re: [Nagios-users] check_openmanage weirdness
Greg Etling getl...@stern.nyu.edu writes:

> Trond, thanks for your quick reply. Unfortunately it does appear we have a disconnect between OMSA and SNMP:
>
> [snip]
>
>     [r...@nagios ~]# snmpwalk -v2c -c * testserver 1.3.6.1.4.1.674.10893.1.20.130.1
>     SNMPv2-SMI::enterprises.674.10893.1.20.130.1 = No Such Object available on this agent at this OID

Hmm.. you should see output like:

    $ snmpwalk -v2c -c community hostname 1.3.6.1.4.1.674.10893.1.20.130.1
    SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.1.1 = INTEGER: 1
    SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.2.1 = STRING: PERC 6/i Integrated
    SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.3.1 = STRING: DELL
    SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.4.1 = INTEGER: 6
    SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.5.1 = INTEGER: 1
    SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.7.1 = INTEGER: 30
    SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.8.1 = STRING: 6.2.0-0013
    [...]

> It appears to only have data under the 1.3.6.1.4.1.674.10892 and 1.3.6.1.4.1.674.10899 trees. Thoughts?

Unfortunately my Windows knowledge is rather limited. I have never installed OMSA on Windows, but I suspect that there are options to choose from during the install. The first thing I would do is to re-install OMSA step by step and try to figure out what I might have missed.

On Linux, the install procedure and packaging of the OMSA components changed with version 6.2.0. This may very well be the case with the Windows version as well.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Internal error
Richard Hagen r.ha...@qlict.nl writes:

> I recently installed a new DELL Poweredge 2970 with W2k8 and installed also DELL OMSA. When i read the status from nagios i get the following error:
>
>     Amperage probe 0 [PS 1 Current 1] reads 0 A
>     Amperage probe 1 [PS 2 Current 2] reads 0 A
>     INTERNAL ERROR: Use of uninitialized value in division (/) at /usr/lib/nagios/plugins/check_openmanage line 3536.
>     INTERNAL ERROR: Use of uninitialized value in division (/) at /usr/lib/nagios/plugins/check_openmanage line 3536.
>     INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib/nagios/plugins/check_openmanage line 3562.

Hi Richard,

This happens because the values (i.e. the readings from the amperage probes) are not reported by SNMP, while the rest of the data about the probes is reported (status, type, name etc.). There is something wrong with Openmanage on this server. What is the output from this command:

    omreport chassis pwrmonitoring

That being said, the plugin could handle this better. Please try the beta version available here:

http://folk.uio.no/trondham/tmp/

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_multipath
Brian O'Mahony brian.omah...@curamsoftware.com writes:

> It works locally though, and I have
>
>     Cmnd_Alias MULTIPATH=/sbin/multipath -l
>     nagios ALL= NOPASSWD: MULTIPATH

My money is on requiretty. Locally you have a TTY, while NRPE does not. The requiretty setting in /etc/sudoers must be turned off. Comment out this line in /etc/sudoers:

    Defaults    requiretty

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
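For anyone wanting to check many hosts for this setting, a rough Python sketch follows. It only inspects global "Defaults" lines in a sudoers text that has already been read; the function name is hypothetical, and per-user or per-host Defaults entries and included files are deliberately ignored in this simplified version.

```python
import re


def requiretty_enabled(sudoers_text: str) -> bool:
    """Return True if a global 'Defaults requiretty' is in effect.

    Simplified illustration: later lines override earlier ones, and
    commented-out lines are skipped.
    """
    enabled = False
    for raw in sudoers_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # commented out, as suggested in the message above
        match = re.match(r"Defaults\s+(.+)$", line)
        if not match:
            continue
        for flag in (f.strip() for f in match.group(1).split(",")):
            if flag == "requiretty":
                enabled = True
            elif flag == "!requiretty":
                enabled = False
    return enabled
```

A True result on a host where sudo-based NRPE checks fail while the same command works in an interactive shell points at the TTY requirement described above.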
Re: [Nagios-users] Problem with check_openmanage 3.5.6
Nicole Hähnel m...@nicole-haehnel.de writes:

>     CRITICAL: [xxx] Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0 needs attention:
>     -- SYSTEM: PowerEdge 830, SN: xxx
>     INTERNAL ERROR: Use of uninitialized value in string eq at /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1432.
>     INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1445.

Mostly for the list archive: We took this off the list to do some back-and-forth debugging and testing, and the issue is now resolved. A new version of check_openmanage is released, which will print the above correctly as:

    CRITICAL: Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0 needs attention: Undefined value 4096

This relates to SNMP returning values which are not defined in the MIBs. Such values are now reported as "Undefined value" followed by the number.

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
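The fallback described above amounts to a table lookup with a default. A minimal Python sketch, assuming a small placeholder table; the numbers and names in DISK_STATE are for illustration only and are not copied from Dell's MIB, and the function is not the plugin's code.

```python
# Placeholder mapping for illustration; the real tables are defined in the MIBs.
DISK_STATE = {
    1: "Ready",
    2: "Failed",
    3: "Online",
    4: "Offline",
}


def state_text(value: int, table: dict = DISK_STATE) -> str:
    """Translate a numeric state, or say so explicitly when it is not in the table."""
    return table.get(value, f"Undefined value {value}")
```

With a default in place, an unexpected value such as 4096 produces a readable message instead of an undefined string reaching the output formatting.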
Re: [Nagios-users] Keeping the Nagios Configuration Sane
David Wallis wal...@aps.anl.gov writes:

> Matt Simmons wrote:
>
>> Hi All,
>>
>> I'm attending the 2010 Professional IT Community Conference (http://www.picconf.org) being held in New Brunswick, NJ, and I'm giving a talk about staying sane while working with the Nagios configuration. The talk will be 45 minutes long, and will primarily be an outshoot from this article that I wrote on my blog:
>>
>> http://www.standalone-sysadmin.com/blog/2009/07/nagios-config/
>>
>> I could talk about that and some other things that I've been figuring out, but I was wondering if anyone had any tricks or tips for dealing with the Nagios config? Is there anything special that you do to keep things straight? I'm going to be putting my slides and any additional material online following the conference, so hopefully someone else can get some use from it.
>>
>> By the way, if anyone on this list is in the north east of the US, you should come visit the conference. Without training, it's only $275 for 2 days. With a full day and a half of training, it's still only $400 for the whole shebang.
>>
>> Anyway, this isn't a sales email. I'm looking forward to any tips you would want to share. Thanks in advance!
>>
>> --Matt
>
> I manage the Nagios installation for 3 different domains at work, each domain with several hundred servers and clients. I quickly reached the "There's got to be a better way!" point when trying to maintain configuration files that were getting pretty big. I was using all the tricks listed in the Nagios docs, but it was still pretty crazy.
>
> The approach I took was to write a configuration generator program that uses a meta-config file to generate the hosts.cfg, hostgroups.cfg and services.cfg config files. The meta-config file allows one to set up cascading configuration variables, and then has one line per monitored host, that includes things like host groups, parents, etc, and then a list of services to monitor.
>
> I also created the idea of meta-services that allow the program to generate configuration data for any number of related services with a single service name in the meta-config file. For instance, including the service weball will cause the configuration generator to create service entries for every plumbed interface on the web server, checks for every virtual server (http and https), and checks for every SSL cert that it finds.
>
> In one domain, a 400 line meta-config file generates a 20,000 line services.cfg file. Rather than updating individual config files, I just update the meta-config file and then regenerate all of the *.cfg files. I've been using this for several years with very good results.

That's an interesting approach, and we do something similar. It goes without saying that when the number of hosts grows to several hundred, maintaining the Nagios config for hosts and hostgroups etc. the regular way becomes an arduous task. This is especially true if your environment is largely heterogeneous.

We have a list of our servers maintained in a homegrown application using a topic map as base. Large parts of the Nagios config are generated from this. I think this is an important point. Usually, you already have a list of your servers, and you can use this list as a base for the Nagios config as well. The format of the host list is not important, but deciding that this is the starting point for the Nagios hosts config is. When a host is added/removed in the list, it is added/removed in Nagios. This is very much like David's approach, i.e. a list of hosts in a format that is easier to handle and maintain.

In addition, we have defined several roles that a server may have, such as dell-hardware, hp-hardware, mail-mx-server, web-server, dns-server etc. A simple perl script runs every day on each host and determines its roles. This information is collected and kept centrally. Parts of the Nagios config (hostgroups, servicegroups) are generated based on these roles.

NRPE config is the same on all hosts. It is maintained centrally and distributed to each host daily. Adding stuff in the sudoers file (needed for some plugins) is done automatically based on the host's roles.

Another point: We generally don't use plugins that require us to configure the plugin and tailor it for each individual host. For example, for filesystem monitoring we have created a custom plugin that monitors all partitions by default. It has an optional configuration file locally on each host where we can set individual thresholds if needed.

Thinking like this should come easy to system administrators that are used to dealing with large installations. It's all about automation :)

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
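As a rough illustration of generating hosts and hostgroups from an existing host list with roles, here is a minimal Python sketch. It is not either poster's actual generator: the HOSTS data, the function names, and the 'generic-host' template name are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical inventory; in practice this would be exported from whatever
# list of servers already exists.
HOSTS = [
    {"name": "web01", "address": "192.0.2.10", "roles": ["web-server", "dell-hardware"]},
    {"name": "mx01", "address": "192.0.2.20", "roles": ["mail-mx-server", "dell-hardware"]},
]


def render_hosts(hosts) -> str:
    """Emit one 'define host' block per inventory entry."""
    blocks = []
    for h in sorted(hosts, key=lambda h: h["name"]):
        blocks.append(
            "define host {\n"
            "    use        generic-host\n"  # assumed template name
            f"    host_name  {h['name']}\n"
            f"    address    {h['address']}\n"
            "}\n"
        )
    return "\n".join(blocks)


def render_hostgroups(hosts) -> str:
    """Emit one 'define hostgroup' block per role, listing its member hosts."""
    members = defaultdict(list)
    for h in hosts:
        for role in h["roles"]:
            members[role].append(h["name"])
    blocks = []
    for role in sorted(members):
        blocks.append(
            "define hostgroup {\n"
            f"    hostgroup_name  {role}\n"
            f"    alias           {role}\n"
            f"    members         {','.join(sorted(members[role]))}\n"
            "}\n"
        )
    return "\n".join(blocks)
```

Writing the two strings to hosts.cfg and hostgroups.cfg and regenerating them whenever the inventory changes gives the "add it to the list, and it appears in Nagios" behaviour described in the thread. Sorting the output keeps regenerated files stable, which makes diffs between runs easy to review.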
Re: [Nagios-users] Problem with check_openmanage 3.5.6
Nicole Hähnel m...@nicole-haehnel.de writes:

> it's a windows server. So I'm using check_openmanage with snmp.
>
>     check_openmanage -s -C $ARG1$ -H $HOSTADDRESS$ -e -i -p --state --check intrusion=1,alertlog=1,esmlog=1 -o 3 --htmlinfo de
>
>     List of Physical Disks on Controller CERC SATA 1.5/6ch (Slot 4)
>
>     Controller CERC SATA 1.5/6ch (Slot 4)
>     ID                        : 0:0
>     Status                    : Unknown
>     Name                      : Physical Disk 0:0
>     State                     : Unknown
>     Failure Predicted         : No
>     Progress                  : Not Applicable
>     Bus Protocol              : SATA
>     Media                     : HDD
>     Capacity                  : 149.05 GB (160040681472 bytes)
>     Used RAID Disk Space      : 0.00 GB (0 bytes)
>     Available RAID Disk Space : 0.00 GB (0 bytes)
>     Hot Spare                 : No
>     Vendor ID                 : WDC
>     Product ID                : WD1600JS-55MHB0
>     Revision                  : 02.0
>     Serial No.                : WD-WCANM3083963
>     Negotiated Speed          : Not Available
>     Capable Speed             : Not Available
>     Manufacture Day           : Not Available
>     Manufacture Week          : Not Available
>     Manufacture Year          : Not Available
>     SAS Address               : Not Available

Ok, so the status and state are both Unknown. I'm guessing that these values are completely missing in the SNMP output, which is why perl chokes on it. I've added some robustness in the code that should handle this case properly. Please try the beta version (3.5.7-beta1) available here:

http://folk.uio.no/trondham/tmp/check_openmanage-3.5.7-beta1

The plugin will give an alert on the drive, which in my opinion is the correct thing to do. You can always blacklist the drive. The cause of the error is obviously that this is a non-Dell drive, which Openmanage doesn't know how to handle.

BTW, you can reduce your command definition to this:

    check_openmanage -s -C $ARG1$ -H $HOSTADDRESS$ -e -i -p -a -o 3 --htmlinfo de

The effect will be the same. You probably defined the command a while ago, and there have been some changes to options since then.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage 3.5.6
Nicole Hähnel m...@nicole-haehnel.de writes:

> I tested the new version:
>
>     CRITICAL: [xxx] Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0 needs attention:
>     -- SYSTEM: PowerEdge 830, SN: xxx
>     INTERNAL ERROR: Use of uninitialized value in string eq at /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1432.
>     INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1445.

Hmm.. OK, new test:

http://folk.uio.no/trondham/tmp/check_openmanage-3.5.7-beta2

Regards,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage 3.5.6
Nicole Hähnel m...@nicole-haehnel.de writes:

> Hi
>
> I get this message on one pe830 (OM 6.1.0):
>
>     CRITICAL: [ xxx] Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0 needs attention:
>     -- SYSTEM: PowerEdge 830, SN: xxx
>     INTERNAL ERROR: Use of uninitialized value in string eq at /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1428.
>     INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1441.
>
> Is this a problem of check_openmanage or the disk? It's a non dell sata disk.

Hi Nicole,

Can you provide the output of the following command, executed on the monitored host:

    omreport storage pdisk controller=0

Also, are you using check_openmanage in SNMP or local context?

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage and net-snmp v3
Hi all,

Just to bring this thread to a conclusion... I have released a new version of check_openmanage that adds a new option '--use-get_table', which is to be used as a workaround for issues with SNMPv3 on Windows using net-snmp. There are a few other minor fixes and feature enhancements as well.

Downloads and changelog:

http://folk.uio.no/trondham/software/check_openmanage.html#download

(Also available on Nagios Exchange and Monitoring Exchange.)

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage and net-snmp v3
Verhaeghe, Koen koen.verhae...@meucci-solutions.com writes:

> The script is working, at least, it does not give any errors anymore. I even get
>
>     Physical Disk 0:1 [Ata WDC WD800JD-75MSA3, 0GB] on ctrl 0 needs attention: Failure Predicted
>
> as expected. I was expecting also an errormessage from the Virtual disks, as they are degraded, but that's not there.

If the error is just Failure Predicted, it means that the disk is working fine for the time being and the virtual drive status is not affected. When/if the drive eventually fails the virtual drive will be degraded.

> Moreover, I know some of our servers have problems with power supplies or memory, so I changed a section in the below mentioned script like you did for the disks and others, just to test:
>
>     #my $result = $snmp_session->get_entries(-columns => [keys %ps_oid]);
>     ##
>     # SNMPv3 test
>     ##
>     my $result = q{};
>     if ($opt{protocol} == 3) {
>         my $powerDeviceTable = '1.3.6.1.4.1.674.10892.1.600.12.1';
>         $result = $snmp_session->get_table(-baseoid => $powerDeviceTable);
>     }
>     else {
>         $result = $snmp_session->get_entries(-columns => [keys %ps_oid]);
>     }
>     ##
>     ##
>
> And now I do get the expected error:
>
>     Power Supply 1 [AC] needs attention: Presence detected, Failure detected, AC lost
>
> I think it is safe to say that, when using net-snmp v3, the get_entries method is not giving the expected result.

The complete picture is still a little unclear to me. Do these problems occur only when you use net-snmp instead of Windows' native snmp agent? (I'm assuming that net-snmp refers to http://freshmeat.net/projects/net-snmp). I would be interested in any test results you might have using the native Windows snmp agent with SNMPv3.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage and net-snmp v3
Verhaeghe, Koen koen.verhae...@meucci-solutions.com writes:

> Thanks for your reply and the new script. These are the results:
>
> With Windows SNMP (v2) it works:

Yep, that was expected :)

> With net-snmp v3 (version 5.4.2.1) on the same server, disabling the Windows SNMP, I get:
>
>   ./check_openmanagetest -H xx.xx.xx.xx -P 3 --authprotocol md5 -U xx --authpassword xxx --privpassword xx --privprotocol des -p multiline -t 120 -o 3 -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all
>   SNMP ERROR [processors]: Received genError(5) error-status at error-index 3.

Hmm.. was this on one of the servers that previously had problems fetching the cooling OIDs? I believe it would be better to make this work with the standard Windows SNMP service, which is what most people would use. Were the results any different without net-snmp?

> This normally indicates a too low version of OMSA, but I am using 6.2.0.

With SNMPv2 on Windows, that usually is the case, yes. I have a new test version for you:

  http://folk.uio.no/trondham/tmp/check_openmanage-snmpv3test2

This version uses get_table() for fetching OIDs for CPUs and physical drives as well as cooling devices.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage and net-snmp v3
Verhaeghe, Koen koen.verhae...@meucci-solutions.com writes:

> Hi All,
>
> does anyone have an explanation for this: when using check_openmanage with SNMP v3, the script exits because some OIDs do not exist for a type of server (e.g. '1.3.6.1.4.1.674.10893.1.20.130.4.1.9' => 'arrayDiskEnclosureID' for PowerEdge 860).
>
> output:
>
>   ./check_openmanage -H xx.xx.xx.xx -P 3 --authprotocol md5 -U --authpassword x --privpassword x --privprotocol des -p multiline -t 120 -o 3 -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all
>   SNMP ERROR [storage / pdisk]: The requested entries are empty or do not exist.
>
> When enabling the Windows SNMP service again and disabling net-snmp v3, I get the correct output:
>
>   ./check_openmanage -H xx.xx.xx.xx -P 2 -C xx -p multiline -t 120 -o 3 -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all
>   Physical Disk 0:1 [Ata WDC WD800JD-75MSA3, 0GB] on ctrl 0 needs attention: Failed
>   Logical drive 0 'Windows Disk 0' [RAID-1, 73.57 GB] on ctrl 0 needs attention: Degraded|'fan_1_bmc_cpu#fan'=3225RPM;0;0 'fan_2_bmc_dimm_fan'=3150RPM;0;0 'temp_0_bmc_planar'=31C;48;53
>
> tested with:
>
>   OMSA version: 5.1 and 6.2
>   Net-snmp (x86) versions 5.4.2.1 and 5.5
>   NET::SNMP 6.0.0 on the nagios server
>
> Any ideas? I've tried commenting out the OIDs that do not exist (and all related script steps) but then the output gives 'OK', but I know there is a degraded disk...
>
>   ./check_openmanage -H xx.xx.xx.xx -P 3 --authprotocol md5 -U --authpassword x --privpassword x --privprotocol des -p multiline -t 120 -o 3 -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all
>   OK - System: 'PowerEdge 860', SN: 'J478F3J', hardware working fine, 1 logical drives, 2 physical drives - BIOS='A05 10/04/2007', DRAC4='1.60', BMC='1.75' - Ctrl 0 [SAS 5/iR Adapter]: Fw='00.10.51.00.06.12.05.00', Dr='1.21.08.00' - OpenManage Server Administrator (OMSA) version: '5.1.0'|'temp_0_bmc_planar'=30C;48;53
>
> On other types of servers I get a similar error for [cooling] (e.g. on a 2950)

Hi Koen,

I'm the author of that plugin. To be honest, I've never actually tested the SNMPv3 stuff. I just pass the options to Net::SNMP and let it handle it, and hope that it works. You are the first to report SNMPv3 troubles, and I assume that the SNMPv3 users are a minority.

I'm always interested in fixing bugs, but I'm unable to reproduce this problem. I see that you're checking a Windows box. I have none of those to play with, but I have set up SNMPv3 on a RHEL5 box. Checking the RHEL5 host via SNMPv3 works just fine:

  $ ./check_openmanage -H myhost -P 3 --authprotocol md5 -U \
      --authpassword --privpassword --privprotocol des
  Controller 0 [SAS 6/iR Integrated]: Driver '3.04.07rh' is out of date

Windows + OMSA + SNMP has had some problems in the past, but at least for SNMPv2c and SNMPv1 these issues should be resolved with OMSA 5.5.0.1 and later versions. It seems there are still issues with SNMPv3.

In the past, there have been problems with SNMP and using the Net::SNMP function get_entries() vs. get_table(). The former is preferred because it is faster: it fetches only the columns we ask for, whereas get_table() fetches every OID in the table, including the ones we're not interested in. The difference is especially noticeable on servers with many physical disks.

I have created a test version that fetches the cooling OIDs with get_table() instead of get_entries() if SNMPv3 is used. This version is available here:

  http://folk.uio.no/trondham/tmp/check_openmanage-snmpv3test

Can you try this version on the servers where checking the cooling devices fails? (It's a bit more complicated for physical drives.)

PS. Please upgrade to OMSA version 5.5.0.1 or later. Previous versions are known to perform badly with SNMP on Windows.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
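To illustrate the get_entries() vs. get_table() trade-off discussed in this message: a full-table fetch returns every OID under the table, so the caller has to throw away the columns it does not need. The following is only a minimal sketch of that filtering step, not code from the plugin. It works on a mock hash instead of a live Net::SNMP session; the helper name filter_columns, the column sub-identifiers (.2, .4, .20) and all values are made up for illustration. Only the table prefix is taken from the arrayDiskEnclosureID OID quoted above.

```perl
use strict;
use warnings;

# Keep only the entries of a full-table result that belong to wanted columns.
#   $table:   hash ref of OID => value, shaped like the result of a table fetch
#   $columns: hash ref of column OID => column name
sub filter_columns {
    my ($table, $columns) = @_;
    my %wanted;
    for my $oid (keys %{$table}) {
        for my $col (keys %{$columns}) {
            # An entry belongs to a column if its OID is the column OID
            # followed by a dot and the row index.
            if (index($oid, "$col.") == 0) {
                $wanted{$oid} = $table->{$oid};
                last;
            }
        }
    }
    return \%wanted;
}

# Mock data: sub-identifiers and values are hypothetical.
my $base  = '1.3.6.1.4.1.674.10893.1.20.130.4.1';
my %table = (
    "$base.2.1"  => 'Physical Disk 0:0',
    "$base.2.2"  => 'Physical Disk 0:1',
    "$base.4.1"  => 3,
    "$base.4.2"  => 3,
    "$base.20.1" => 'unrelated column',
    "$base.20.2" => 'unrelated column',
);
my %columns = ("$base.2" => 'name', "$base.4" => 'state');

my $filtered = filter_columns(\%table, \%columns);
printf "%s = %s\n", $_, $filtered->{$_} for sort keys %{$filtered};
```

Note the trailing dot in the prefix comparison: without it, column .2 would also match entries from column .20. On a server with many physical disks the unfiltered table grows with every disk and every column, which is why fetching only the wanted columns is normally the faster choice.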
Re: [Nagios-users] check_openmanage not using my custom temperature thresholds
C. Bensend be...@bennyvision.com writes:

> Hey folks,
>
> I am trying to use custom temperature thresholds with one of my servers, and it doesn't seem to take them into account. The full command (as defined in NSC.ini for NSClient++):
>
>   command[check_openmanage]=check_openmanage.exe -e -p -w 0=50 -c 0=54 -b bat_charge=ALL/ctrl_fw=ALL/ctrl_driver=ALL --omreport F:\dellopenmanage\oma\bin\omreport.exe
>
> Per http://folk.uio.no/trondham/software/check_openmanage.html and the man page, I'm pretty sure that's supposed to set the temperature probe 0's warning threshold to 50C and critical to 54C. However, I'm still getting a non-OK for temp probe 0:
>
>   Temperature Probe 0 [System Board Ambient Temp] is too high at 43 C -- SYSTEM: PowerEdge 2900, SN: 4PVXSK1
>
> Just to be sure this wasn't a glitch with the display of the temp probe #, I've tried 0=50,1=50,2=50 and 0=54,1=54,2=54 but it still complains.
>
> Am I missing something? I've looked at this until I'm crosseyed, and I'm pretty sure I'm using it correctly. Is there a hardcoded threshold in there that I'm not aware of?

Hi Benny,

Openmanage has its own limits. From a random M600 server here, the limits for ambient temperature are:

  # omreport chassis temps
  Temperature Probes Information

  Main System Chassis Temperatures: Ok

  Index                     : 0
  Status                    : Ok
  Probe Name                : System Board Ambient Temp
  Reading                   : 16.0 C
  Minimum Warning Threshold : 8.0 C
  Maximum Warning Threshold : 42.0 C
  Minimum Failure Threshold : 3.0 C
  Maximum Failure Threshold : 47.0 C

To be honest, I've never considered the possibility of anyone wanting to set custom temperatures *higher* than the OMSA maximum. I always assumed that people wanted to use the custom limits to set the max temperature *lower* than the default limits. Clearly I was wrong :)

What happens in your case is that the OMSA limits kick in: the plugin reports the probe status that OMSA itself assigns, so a custom limit above the OMSA limit never gets a chance to apply. It is possible to adjust the OMSA warning limits, e.g.

  # omconfig chassis temps index=0 maxwarnthresh=45
  Temperature probe warning threshold(s) set successfully.

It is not possible to adjust the critical (failure) limits like this; only the warning limits can be set manually. Also, I believe that when a server hits the critical limit, in the interest of self-preservation it shuts itself down.

The plugin could be made to ignore the OMSA warning limit if the custom limit is set beyond it, but I'm not sure that we want this in general. What do you think?

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
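The interaction between the OMSA limits and the custom limits can be sketched as below. This is an illustration only, not the plugin's actual code: the function name temp_state and the hash keys are hypothetical, the comparison is assumed to be "strictly greater than", and the OMSA limits are borrowed from the M600 output above (Benny's PowerEdge 2900 may report different values). It shows why a reading of 43 C still warns with -w 0=50, and why raising the OMSA warning limit makes the warning go away.

```perl
use strict;
use warnings;

# Return 'CRITICAL', 'WARNING' or 'OK' for a temperature reading.
# $omsa holds the probe's own limits as reported by OMSA, $custom holds
# the optional user-supplied limits. A limit that is undef is not checked.
# Whichever limit is exceeded first decides the state, so a custom limit
# can only tighten the OMSA limit, never loosen it.
sub temp_state {
    my ($reading, $omsa, $custom) = @_;
    for my $level (qw(crit warn)) {
        for my $limits ($omsa, $custom) {
            my $max = $limits->{"max_$level"};
            if (defined $max && $reading > $max) {
                return $level eq 'crit' ? 'CRITICAL' : 'WARNING';
            }
        }
    }
    return 'OK';
}

my %omsa   = (max_warn => 42, max_crit => 47);   # assumed: M600 values from omreport
my %custom = (max_warn => 50, max_crit => 54);   # corresponds to -w 0=50 -c 0=54

# 43 C exceeds the OMSA warning limit even though the custom limit is 50.
print temp_state(43, \%omsa, \%custom), "\n";    # prints "WARNING"

# After 'omconfig chassis temps index=0 maxwarnthresh=45' the same reading is fine.
my %raised = (max_warn => 45, max_crit => 47);
print temp_state(43, \%raised, \%custom), "\n";  # prints "OK"
```

Under this model the only way to get a warning limit above the OMSA default is to change it in OMSA itself with omconfig, which matches the advice given in the message.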
Re: [Nagios-users] check_openmanage not using my custom temperature thresholds
C. Bensend be...@bennyvision.com writes:

> Now that I know what's going on (and how to adjust the OMSA threshold if need be), I'd say keep it where it is. However, if these details were mentioned on the page:
>
>   http://folk.uio.no/trondham/software/check_openmanage.html
>
> it would have saved me a lot of time, hair, and such. Could this be added?

Yes, I have updated the documentation:

  http://folk.uio.no/trondham/software/check_openmanage.html#custom-temperature-thresholds

Hopefully this will clarify things for other users. BTW, thanks for reporting this, the documentation was ambiguous and in need of an update :)

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage having issues with OMSA 6.2.0
C. Bensend be...@bennyvision.com writes:

> Hey folks,
>
> During this past weekend's maintenance window, we upgraded several hosts to OMSA v6.2.0. They were previously at v5.0.0, and so check_openmanage wasn't able to poll them.
>
> Now, they are still showing as UNKNOWN, giving the following error:
>
>   Problem running 'omreport storage controller': Error! Invalid name=value pair: controller
>
> This is running on a Dell PowerEdge 1950, with the following command via NSClient++:
>
>   check_openmanage.exe -e -b bat_charge=ALL/ctrl_fw=ALL/ctrl_driver=ALL --omreport E:\OpenManage\oma\bin\omreport.exe
>
> I would have thought 6.2.0 would be OK - is anyone else seeing issues, or does anyone know of incompatibilities? I checked the check_openmanage FAQ, but didn't see anything...

Hi Benny,

The command 'omreport storage controller' is pretty basic and should never fail like that. You should check if OMSA is correctly installed, specifically the storage stuff. OMSA consists of many different components, and I'm guessing that the storage component(s) are missing on your server.

If you run the command manually, you get the same error message, right?

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage 3.5.5-beta6 snmp_detect_blade bug
McKinlay, Ken ken.mckin...@curtisswright.com writes:

> Trond,
>
> Another little bug for your next release. Using check_openmanage 3.5.5-beta6 on a server loaded with OMSA 5.1.0 (a different box this time), the snmp_detect_blade function returned:
>
>   INTERNAL ERROR: Use of uninitialized value in string eq at ./check_openmanage-3.5.5-beta6 line 599.
>
> Looking at the line and then doing my own SNMP query, that OID is missing in OMSA 5.1.0. However, by changing line 599 to first make sure a result has been set, the uninitialized value error is bypassed in the if statement:
>
>   if ( $result->{$DellBaseBoardType} && $result->{$DellBaseBoardType} eq '3') {

Thank you, the patch is applied. Note that check_openmanage is not designed to work with really old OMSA versions (5.2 and earlier). This is more of a problem when checking locally, since omreport commands are different. I generally won't add support for old OMSA if it has a noticeable speed or complexity impact, but that is not the case here. Besides, checking that the value exists is good practice anyway :)

An updated version is available here:

  http://folk.uio.no/trondham/tmp/check_openmanage-3.5.5-beta7

If you confirm that this beta works for you, and I don't get any more bug reports in the next few days, this will eventually become 3.5.5.

Cheers,
--
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo
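For readers who hit the same "Use of uninitialized value" warning elsewhere, here is a small standalone sketch of the guard, under a few assumptions. The key stored in $DellBaseBoardType is a placeholder (the real OID is not shown in this thread), is_blade is a hypothetical helper name, and the sketch uses defined() where the applied patch uses a plain truth test. For a comparison against '3' the two behave the same.

```perl
use strict;
use warnings;

# Placeholder key: the plugin stores a real OID in this variable,
# but its value does not appear in the thread.
my $DellBaseBoardType = 'baseboard-type-oid';

# Return 1 only if the agent returned a value for the OID and it is '3'.
sub is_blade {
    my ($result) = @_;
    return 0 unless defined $result;            # the whole request failed
    my $type = $result->{$DellBaseBoardType};
    return (defined $type && $type eq '3') ? 1 : 0;
}

my %found   = ($DellBaseBoardType => '3');      # agent returned the OID
my %missing = ();                               # OID absent, as with OMSA 5.1.0

print is_blade(\%found),   "\n";
print is_blade(\%missing), "\n";
print is_blade(undef),     "\n";
```

Without the defined() check, the second call would compare undef against '3' with eq, which is exactly what triggers the warning under 'use warnings'.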