Re: [Nagios-users] check_Openmanage trouble

2013-08-15 Thread Trond Hasle Amundsen
Weberskirch, Timo timo.weberski...@offlimits-it.com writes:

 the check_openmanage --no-storage option works (of course without any physical
 disk information… :( ).

 I was on the phone with the Dell Pro Support. They told me that the MD3
 only shows the RAID disk information (not the physical disk information)
 to external devices.

 Also they told me that there is no way to filter out the SAS card in OMSA.

 I have to live with the '--no-storage' option…

Hmm.. Ok, so this particular server doesn't have any storage other than
the SAS card (connected to the MD3xxx), which OMSA can't manage? If so,
that is exactly what the '--no-storage' option is for :)

You should use the '--no-storage' option if

  1. The server has no storage, which is entirely possible; or
  2. The only storage present is something that OMSA doesn't recognize
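
The two cases above amount to a small decision rule. A minimal Python
sketch, assuming a hypothetical helper build_check_command and made-up
inputs describing the host; only the '-H' and '--no-storage' options are
taken from this thread:

```python
def build_check_command(host, has_storage, omsa_recognizes_storage):
    """Return an argument list for check_openmanage (hypothetical helper)."""
    cmd = ["check_openmanage", "-H", host]
    # Skip the storage checks when there is nothing OMSA can report on:
    # either no storage at all, or only storage OMSA does not recognize.
    if not has_storage or not omsa_recognizes_storage:
        cmd.append("--no-storage")
    return cmd


# A server whose only storage sits behind a SAS card that OMSA can't manage
print(build_check_command("server1", has_storage=True,
                          omsa_recognizes_storage=False))
```

Whether OMSA recognizes the storage has to be established by hand, for
example by looking in the Openmanage web console; the sketch does not
detect it.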

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] check_Openmanage trouble

2013-08-14 Thread Trond Hasle Amundsen
Weberskirch, Timo timo.weberski...@offlimits-it.com writes:

 thank you all for your fast and helpful response.  Unfortunately the problem
 persists.

 Is there a way to filter out the  (in my opinion faulty) SAS card?

Storage components are tightly interconnected, so from the plugin side
your only option is to not check storage at all:

   check_openmanage --no-storage

But I still believe that this is a software problem, i.e. in OMSA.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo




Re: [Nagios-users] check_Openmanage trouble

2013-08-07 Thread Trond Hasle Amundsen
Rich rerc...@pha.jhu.edu writes:

 Usually, when I've seen this, it's been after doing an upgrade of an
 existing OMSA install (<= 6.x to 7.x).

 In general, I haven't found a good way to resolve it other than
 automating a complete uninstall of OMSA prior to installing the newer
 version.

Yes, I think the logical next step in this case is to do a complete
uninstall, then reinstall of OMSA on the host. The problem is in OMSA
and must be fixed there. The plugin is simply complaining that OMSA
isn't responding as expected.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_Openmanage trouble

2013-08-06 Thread Trond Hasle Amundsen
Weberskirch, Timo timo.weberski...@offlimits-it.com writes:

 maybe one of you has the same problem with the check_openmanage plugin…

 Last week we installed two new Dell PowerEdge R720 servers with OMSA v7.3.0
 (check_openmanage version: 3.7.10).

 Every time I try to check my server I get this error message:

 "SNMP ERROR [storage / pdisk]: Requested entries are empty or do not exist."

Hello Timo,

There seems to be some sort of issue with the Openmanage installation on
this server. First thing to do is double-check that everything is
installed properly. On a RHEL6 system, the following storage related RPM
packages should be installed:

  # rpm -qa|grep srvadmin-storage
  srvadmin-storageservices-7.3.0-4.4.1.el6.x86_64
  srvadmin-storage-7.3.0-4.93.2.el6.x86_64
  srvadmin-storage-cli-7.3.0-4.93.2.el6.x86_64
  srvadmin-storageservices-snmp-7.3.0-4.4.1.el6.x86_64
  srvadmin-storage-snmp-7.3.0-4.93.2.el6.x86_64
  srvadmin-storageservices-cli-7.3.0-4.4.1.el6.x86_64

Do you see any physical disks in the Openmanage Web Console? (point your
browser to https://server-ip:1311/ and log in as root)
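
The package check can be automated. A minimal Python sketch, assuming the
output of 'rpm -qa' has already been captured as text; the helper name
missing_packages is hypothetical, and the package list is simply the six
names shown above for OMSA 7.3.0 on RHEL6:

```python
# Package names taken from the rpm listing above (OMSA 7.3.0, RHEL6).
EXPECTED = [
    "srvadmin-storageservices",
    "srvadmin-storage",
    "srvadmin-storage-cli",
    "srvadmin-storageservices-snmp",
    "srvadmin-storage-snmp",
    "srvadmin-storageservices-cli",
]


def missing_packages(rpm_qa_output, version="7.3.0"):
    """Return the expected package names that are absent from `rpm -qa` output."""
    marker = "-" + version + "-"
    installed = set()
    for line in rpm_qa_output.splitlines():
        line = line.strip()
        if marker in line:
            # The package name is everything before "-<version>-".
            installed.add(line.split(marker, 1)[0])
    return [name for name in EXPECTED if name not in installed]


sample = """\
srvadmin-storageservices-7.3.0-4.4.1.el6.x86_64
srvadmin-storage-7.3.0-4.93.2.el6.x86_64
srvadmin-storage-cli-7.3.0-4.93.2.el6.x86_64
"""
print(missing_packages(sample))
```

An empty list means all six packages are present. It says nothing about
whether the OMSA services are actually running.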

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] rpmbuild nagios-3.5.0

2013-07-24 Thread Trond Hasle Amundsen
alexus ale...@gmail.com writes:

 I'm unable to build RPM w/ nagios 3.5.0, last one that worked for me was 
 3.2.3.
 any ideas/suggestions?

I'd recommend using the already prebuilt package for rhel6 which is
available from EPEL[1]. Add the EPEL repo and you can simply do yum
install nagios and be done :)

[1] http://fedoraproject.org/wiki/EPEL

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage improvement request

2013-07-23 Thread Trond Hasle Amundsen
John Skarbek john.skar...@nextcentury.com writes:

 I've recently deployed the check_openmanage script and it works very well,
 except for hosts that run ESXi.  Unless I'm doing something wrong.

You're not doing anything wrong. Openmanage, when deployed on ESXi,
doesn't have the necessary capabilities for it to work.

 I've discovered that OpenManage doesn't expose its OIDs through ESXi like
 it would if it were a Linux or Windows host.  However, I did find that the
 iDRAC7 does have similar SNMP responses that I'd like to capture.  When
 pointing check_openmanage to the DRAC interface, I get the message indicating
 that OMSA must not be installed correctly.  Looking into the script I
 found:

 my $chassisModelName = '1.3.6.1.4.1.674.10892.1.300.10.1.9.1';

 Which does indeed NOT exist.  However, a similar OID with the same information
 we are looking for is located here:

 $chassisModelName = '1.3.6.1.4.1.674.10892.5.1.3.12.0';

Actually, the OID is 1.3.6.1.4.1.674.10892.5.4.300.10.1.9.1. I've toyed
around with this a bit, and for the most part you can simply replace
1.3.6.1.4.1.674.10892.1 with 1.3.6.1.4.1.674.10892.5.4. Same goes for
storage OIDs, to a degree.
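
As an illustration of that prefix swap, here is a minimal Python sketch.
The function name to_idrac_oid is hypothetical and this is not how the
plugin itself does it; as noted above, the mapping only holds "for the
most part", so treat the result as a candidate OID to verify:

```python
OMSA_PREFIX = "1.3.6.1.4.1.674.10892.1"
IDRAC_PREFIX = "1.3.6.1.4.1.674.10892.5.4"


def to_idrac_oid(oid):
    """Map an OMSA OID to its likely iDRAC7 counterpart by swapping the prefix."""
    # Match on a whole sub-identifier so that e.g. "...10892.10" is left alone.
    if oid == OMSA_PREFIX or oid.startswith(OMSA_PREFIX + "."):
        return IDRAC_PREFIX + oid[len(OMSA_PREFIX):]
    return oid


# chassisModelName as defined in the plugin
print(to_idrac_oid("1.3.6.1.4.1.674.10892.1.300.10.1.9.1"))
# → 1.3.6.1.4.1.674.10892.5.4.300.10.1.9.1
```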

 After modifying the script a little bit I was able to get past that, but now
 check_openmanage is complaining, "SNMP ERROR [memory]: The requested entries
 are empty or do not exist."

 I presume the entire set of OIDs is in a different spot when being checked
 through the DRAC versus the standard Windows SNMP service.  I would love to
 assist in enhancing this script, but I'm not sure how I should start.  Let me
 know who I should contact, or feel free to reach out to me to assist with this
 awesome plugin.

I have a modified prealpha version for testing, available in the test
branch in git:

  http://git.uio.no/git/?p=check_openmanage.git;a=shortlog;h=refs/heads/test

Note that it's NOT production ready, I have only done some very limited
testing.

I had to simplify some stuff:

  * Storage: The storage OIDs from the iDRAC7 are somewhat different,
    compared to Openmanage. Some information that the plugin needs is
    not available, such as numbered identifiers for components (used in
    blacklisting). There are even some OIDs that aren't present in
    Openmanage. In short, it's a mess, and the storage bit is very
    simplistic. Perhaps the missing info will be added in a later
    firmware release, we can only hope.

  * ESM health OIDs are missing completely, so ESM health check is
    omitted. Same for SD card check.

To use the new feature you have to specify '--idrac', like this:

  check_openmanage --idrac -H idrac-ip

Test it, break it and tell me what you think :)

I've noticed that neither the rollup status nor the component status for
controllers catches that the controller is actually degraded from
out-of-date firmware. Hopefully it's an anomaly that doesn't apply to
other aspects of controllers, or other components.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Problem with check_openmanage plugin and storage

2013-06-19 Thread Trond Hasle Amundsen
Nic Bernstein n...@onlight.com writes:

 Regarding the non-certified disks problem... There is a special
 blacklisting keyword to suppress the message about non-certified disks:

   check_openmanage -b pdisk_cert=all

 Please try this and see if it resolves your issue. Using blacklisting
 should also disable the global health check.


 Ah, that's just what we need.  Much appreciated...

 No, that doesn't seem to be in my version (3.7.9, downloaded yesterday)

 onlight@monitor:~$ perl check_openmanage -H host -C secret -b pdisk_cert=all
 Physical Disk 0:1:0 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online
 Physical Disk 0:1:1 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online
 onlight@monitor:~$ echo $?
 1

 I guess I'll wait for a patch.

Are you sure you didn't test this with the 7.1.0 workaround manually
removed?

 Say Trond, I sent you some notes last week about enhancements we made to your
 check_linux_bonding plugin.  Would you prefer I re-post those to the list
 instead?

Sorry for being unresponsive of late. I've been swamped at work and have
built up something of an email backlog. No need to resend :)

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Nagios openmanage ERROR: XML transformation failed

2013-06-19 Thread Trond Hasle Amundsen
Lorenz, Stephan stephan.lor...@medizin.uni-leipzig.de writes:

 since installing libxml2, libxml2-devel and curl, the Nagios installation on
 our Dell R720xd server reports XML errors.

  

 Problem running 'omreport storage controller': Error! XML Transformation failed
 Problem running 'omreport chassis memory': Error! XML Transformation failed
 Problem running 'omreport chassis fans': Error! XML Transformation failed
 Problem running 'omreport chassis pwrsupplies': Error! XML Transformation failed
 Problem running 'omreport chassis temps': Error! XML Transformation failed
 Problem running 'omreport chassis processors': Error! XML Transformation failed
 Problem running 'omreport chassis volts': Error! XML Transformation failed
 Problem running 'omreport chassis batteries': Error! XML Transformation failed
 Problem running 'omreport chassis pwrmonitoring': Error! XML Transformation failed
 Problem running 'omreport chassis intrusion': Error! XML Transformation failed
 Problem running 'omreport chassis removableflashmedia': Error! XML Transformation failed
 Chassis Service Tag is bogus: 'N/A'

  

 I am using Nagios 3.5.1, check_openmanage 3.7.9, Openmanage 7.2.0 on CentOS
 6.4 2.6.32-358.11.1.el6.centos.plus.x86_64.

  

 When I run check_openmanage or omreport manually everything is fine. I tried
 to reinstall nagios-plugins-openmanage and php-xml for a start, but that did
 not help. I cannot remove libxml2 and the rest since it is needed elsewhere.

  

 Does anyone have a suggestion of how to fix this error?

Given that it works when you run the commands manually I'm suspecting
some sort of permission issue. Try running the commands as the NRPE
user, and also try running it from Nagios with SELinux in permissive
mode (needs to be run by the NRPE daemon with the correct SELinux
domain).

Check out this link about using check_openmanage with SELinux in
enforcing mode:

  
http://folk.uio.no/trondham/software/check_openmanage.html#selinux-considerations

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Problem with check_openmanage plugin and storage

2013-06-18 Thread Trond Hasle Amundsen
Nic Bernstein n...@onlight.com writes:

 We've recently been experimenting with Trond Hasle Amundsen's check_openmanage
 on a large network with about a hundred Dell servers of various ages,
 capabilities, etc.  Mostly PE-2950, R210, R410 and R720.  Much thanks to Trond
 for all his great work on Nagios plugins and other projects, by the way.

 We've hit a wall, however, with the storage monitoring aspects of this plugin.

 For example, here's a quite specific case.  This is a new PE R720, in debug:

 onlight@monitor:~$ check_openmanage -H host -C secret -d
    System:      PowerEdge R720           OMSA version:    7.1.0
    ServiceTag:  ###                      Plugin version:  3.7.9
    BIOS/date:   1.2.6 05/10/2012         Checking mode:   SNMPv2c UDP/IPv4
 -----------------------------------------------------------------------------
    Storage Components
 =============================================================================
   STATE  |    ID    |  MESSAGE TEXT
 ---------+----------+--------------------------------------------------------
       OK |        0 | Controller 0 [PERC H310 Mini] is Ready
  WARNING |  0:0:1:0 | Physical Disk 0:1:0 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online, Not Certified
  WARNING |  0:0:1:1 | Physical Disk 0:1:1 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online, Not Certified
       OK |      0:0 | Logical Drive '/dev/sda' [RAID-1, 1862.50 GB] is Ready
       OK |      0:0 | Connector 0 [SAS] on controller 0 is Ready
       OK |      0:1 | Connector 1 [SAS] on controller 0 is Ready
       OK |    0:0:1 | Enclosure 0:0:1 [Backplane] on controller 0 is Ready
 -----------------------------------------------------------------------------
    Chassis Components
 =============================================================================
   STATE  |  ID  |  MESSAGE TEXT
 ---------+------+------------------------------------------------------------
       OK |    0 | Memory module 0 [DIMM_A1, 4096 MB] is Ok
       OK |    1 | Memory module 1 [DIMM_A2, 4096 MB] is Ok
       OK |    2 | Memory module 2 [DIMM_A3, 4096 MB] is Ok
       OK |    3 | Memory module 3 [DIMM_A4, 4096 MB] is Ok
       OK |    0 | Chassis fan 0 [System Board Fan1 RPM] reading: 1200 RPM
       OK |    1 | Chassis fan 1 [System Board Fan2 RPM] reading: 1080 RPM
       OK |    2 | Chassis fan 2 [System Board Fan3 RPM] reading: 1200 RPM
       OK |    3 | Chassis fan 3 [System Board Fan4 RPM] reading: 1080 RPM
       OK |    4 | Chassis fan 4 [System Board Fan5 RPM] reading: 1080 RPM
       OK |    5 | Chassis fan 5 [System Board Fan6 RPM] reading: 1080 RPM
       OK |    0 | Power Supply 0 [AC]: Presence detected
       OK |    0 | Temperature Probe 0 [System Board Inlet Temp] reads 26 C (min=3/-7, max=42/47)
       OK |    1 | Temperature Probe 1 [System Board Exhaust Temp] reads 33 C (min=8/3, max=70/75)
       OK |    2 | Temperature Probe 2 [CPU1 Temp] reads 49 C (min=8/3, max=83/88)
       OK |    0 | Processor 0 [Intel Xeon E5-2603 0 1.80GHz] is Present
       OK |    0 | Voltage sensor 0 [CPU1 VCORE PG] is Good
       OK |    1 | Voltage sensor 1 [System Board 3.3V PG] is Good
       OK |    2 | Voltage sensor 2 [System Board 5V PG] is Good
       OK |    3 | Voltage sensor 3 [CPU1 PLL PG] is Good
       OK |    4 | Voltage sensor 4 [System Board 1.1V PG] is Good
       OK |    5 | Voltage sensor 5 [CPU1 M23 VDDQ PG] is Good
       OK |    6 | Voltage sensor 6 [CPU1 M23 VTT PG] is Good
       OK |    7 | Voltage sensor 7 [System Board FETDRV PG] is Good
       OK |    8 | Voltage sensor 8 [CPU1 VSA PG] is Good
       OK |    9 | Voltage sensor 9 [CPU1 M01 VDDQ PG] is Good
       OK |   10 | Voltage sensor 10 [System Board NDC PG] is Good
       OK |   11 | Voltage sensor 11 [CPU1 VTT PG] is Good
       OK |   12 | Voltage sensor 12 [System Board 1.5V PG] is Good
       OK |   13 | Voltage sensor 13 [PS2 PG Fail] is Good
       OK |   14 | Voltage sensor 14 [System Board PS1 PG Fail] is Good
       OK |   15 | Voltage sensor 15 [System Board BP1 5V PG] is Good
       OK |   16 | Voltage sensor 16 [CPU1 M01 VTT PG] is Good
       OK |   17 | Voltage sensor 17 [PS1 Voltage 1] reads 114 V
       OK |    0 | Battery probe 0 [System Board CMOS Battery] is Presence Detected
       OK |    0 | Amperage probe 0 [PS1 Current 1] reads 0.6 A
       OK |    1 | Amperage probe 1 [System Board Pwr Consumption] reads 56 W
       OK |    0 | Chassis intrusion 0 detection: Ok (Not Breached)
       OK |    0 | SD Card 0 [vFlash] is Absent
 -----------------------------------------------------------------------------
    Other messages
 =============================================================================
   STATE  |

Re: [Nagios-users] Check_Openmanage not ignoring non-certified drives

2013-01-14 Thread Trond Hasle Amundsen
Bob The Junkie bob_the_jun...@hotmail.com writes:

 I'm using Nagios and Check_Openmanage to keep an eye on some Dell R710 servers
 we've recently acquired, and I'm having problems trying to stop warnings about
 non-Dell certified drives appearing in the alert log.

 I've separated out the different components on the servers to check into their
 own Nagios checks, so my config files appear as such:

 In nagios:

 SERVICES.CFG

 host

 Check_command check_dell_components!memory

 host

 Check_command check_dell_components!alertlog

 COMMANDS.CFG

 Command_name Check_dell_components

 Command_line check_nrpe -H $HOSTADDRESS$ -p 5666 -t 30 -c Check_OpenManage -a --only $ARG1$

 On each Server in nsclient.ini:

 Check_OpenManage = scripts\\check_openmanage.exe $ARG1$ --perfdata

 The problem I'm having is that in one of my checks that checks the health of
 the alert log, I'm getting a consistent warning message ("Alert log content: 0
 critical, 6 non-critical, 36 ok"). I've traced this down to the 6 non-Dell
 certified drives in the server, and I can indeed see within OMSA that the only
 6 warnings all state "Controller event log: PD 04(e0x20/s4) is not a certified
 drive: Controller 0 (PERC 6/i Integrated)".

 So far, so good. Reading through the documentation I can see that
 Check_Openmanage includes a blacklisting option specifically for this event
 ("pdisk_cert - Suppress warning message about non-certified physical disk"),
 but no matter what I try, I can't seem to get Check_Openmanage to ignore these
 problems. An example of the command I'm running on the command line is:

 check_openmanage.exe -s -a -b pdisk_cert=all

 Which returns:

 WARNING: Alert log content: 0 critical, 6 non-critical, 36 ok

 Now I'm assuming the problem here is being caused by the alert log generating
 the errors, and not the physical disk directly causing the errors, which is
 why blacklisting the certificate problem on the physical disk isn't doing me
 any good.

 Which leads me on to my question: is there anything I can do to ignore these
 errors (and thus stop Nagios from complaining) apart from excluding the alert
 log when I do my checks?

Hi,

Your analysis is correct. The check_openmanage plugin's check of the log
content is limited to counting the number of critical, warning and ok
messages. It doesn't do any log parsing. The intended use of the log
check is as a precaution, if you're concerned about missing some
temporary problem. After all, the plugin does active checking and will
only report the state of the hardware right now.

In your case I think that the easiest solution would be to stop using
the log checking with check_openmanage, and either use a fully fledged
log parsing plugin (such as check_logfiles) or write your own simple
plugin where you just filter out the certificate stuff.
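
A "simple plugin" of that kind might look roughly like the Python sketch
below. It assumes the alert log entries have already been fetched and
parsed into (severity, message) pairs, which is the part left out here.
The function name and the severity labels are hypothetical; the exit
codes follow the usual Nagios convention (0 OK, 1 WARNING, 2 CRITICAL):

```python
OK, WARNING, CRITICAL = 0, 1, 2


def check_alert_log(entries, ignore=("is not a certified drive",)):
    """Count alert log entries by severity, skipping ignored messages.

    entries: iterable of (severity, message) pairs, where severity is one
    of "critical", "non-critical" or "ok" (labels assumed for this sketch).
    Returns (nagios_state, status_text).
    """
    counts = {"critical": 0, "non-critical": 0, "ok": 0}
    for severity, message in entries:
        if any(pattern in message for pattern in ignore):
            continue  # filter out the certificate noise
        counts[severity] += 1

    if counts["critical"]:
        state, label = CRITICAL, "CRITICAL"
    elif counts["non-critical"]:
        state, label = WARNING, "WARNING"
    else:
        state, label = OK, "OK"
    text = "%s: Alert log content: %d critical, %d non-critical, %d ok" % (
        label, counts["critical"], counts["non-critical"], counts["ok"])
    return state, text


# The situation from this thread: 6 certificate warnings and 36 ok entries
cert = ("non-critical", "Controller event log: PD 04(e0x20/s4) is not a "
                        "certified drive: Controller 0 (PERC 6/i Integrated)")
entries = [cert] * 6 + [("ok", "Some informational event")] * 36
state, text = check_alert_log(entries)
print(state, text)
```

A real plugin would print the text and exit with the state via
sys.exit(state).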

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] New check_openmanage error after updating to OMSA 7.2.0-4

2013-01-10 Thread Trond Hasle Amundsen
Steve Jenkins stevejenk...@gmail.com writes:

 And... to answer my own question, yes - 3.7.9 does indeed fix
 this. New version is probably already in the repos, waiting out the
 testing period.

Not sure which repos you're referring to, but I'll assume Fedora and/or
Fedora EPEL.

I didn't get around to submitting updates until today. They should
arrive in the testing repos in a couple of days. The updates need to
stay in testing for a week for Fedora and two weeks for EPEL before they
can be pushed to stable. If you can't wait, you can download the RPMs
via the Fedora build system, you'll find links here:

  https://admin.fedoraproject.org/updates/search/nagios-plugins-openmanage

When it has arrived in testing (and in your local mirror), you can
install it with (example for EPEL):

  yum --enablerepo=epel-testing update nagios-plugins-openmanage

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: timeout vs. SNMP timeout

2012-12-11 Thread Trond Hasle Amundsen
Andrew Daugherity adaugher...@tamu.edu writes:

 Please try this version (named 3.7.8-beta2) and let me know if it works
 around your problem. Usage:
 
   check_openmanage --snmp-timeout <integer>

 I think I fixed my problem (for the time being at least) by restarting
 OMSA on that server.  Restarting snmpd didn't solve anything, nor did
 my timeout hack (which just gave me an UNKNOWN status - plugin timeout
 instead of SNMP CRITICAL when it randomly failed).  Whenever the check
 failed, it would hang indefinitely, so it was not a case of slow SNMP.
 Thanks for the added option, though; I think someone may find it
 useful.

Yes, I agree. I'll keep it.

 Regarding your fix:
 The timeout option does appear to get passed to SNMP, however the
 actual timeout is twice what is specified.  E.g. --snmp-timeout=1, get
 SNMP critical message after 2 seconds; --snmp-timeout=14, SNMP
 critical at 28 seconds; --snmp-timeout=15 or higher, get UNKNOWN:
 PLUGIN TIMEOUT message at 30 seconds.  (I used a host without snmpd
 running for the timeout tests.)  I can't see anything obviously wrong
 with your code, but it behaves this way on both SLES 11 SP1 (Perl
 5.10, net-snmp 5.4.2.1, Net::SNMP 6.0.1) and OS X 10.8 (Perl 5.12.4,
 net-snmp 5.6, Net::SNMP 6.1 [from CPAN]).

Hmm.. kind of confusing. It is due to the fact that Net::SNMP does one
retry (with the same timeout) before it bails out. This is adjustable
with the '-retries' parameter to the SNMP object. The default is 1. If I
set it to 0, the plugin times out in the SNMP object at the specified
time as you would expect. Thanks for pointing this out, I should make a
note of it in the manual page.
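
The arithmetic behind the doubling can be sketched in a few lines of
Python. This only models the numbers reported in this thread (one retry
by default, plugin timeout of 30 seconds); the function names are
hypothetical and nothing here talks to SNMP:

```python
def snmp_wait_seconds(snmp_timeout, retries=1):
    """Total time before SNMP gives up: the first try plus each retry."""
    return snmp_timeout * (retries + 1)


def outcome(snmp_timeout, plugin_timeout=30, retries=1):
    """Which timeout fires first for an unresponsive host."""
    wait = snmp_wait_seconds(snmp_timeout, retries)
    if wait >= plugin_timeout:
        return "PLUGIN TIMEOUT after %d s" % plugin_timeout
    return "SNMP CRITICAL after %d s" % wait


for t in (1, 14, 15):
    print(t, "->", outcome(t))
# 1 -> SNMP CRITICAL after 2 s
# 14 -> SNMP CRITICAL after 28 s
# 15 -> PLUGIN TIMEOUT after 30 s
```

With retries set to 0 the wait equals the specified timeout, matching the
behaviour described above.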

 You probably also want to add this option to the help/usage message.

It won't make the help output, as that only covers the most popular
options, but I'll add it to the manual page.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: timeout vs. SNMP timeout

2012-12-10 Thread Trond Hasle Amundsen
Trond Hasle Amundsen t.h.amund...@usit.uio.no writes:

 A new option to specify the SNMP object timeout would be easy to add,
 and is in my opinion a cleaner approach than just passing the plugin
 timeout.

Such an option is now implemented in the Git version:

  
http://git.uio.no/git/?p=check_openmanage.git;a=commit;h=32564b44c2631eeac03a920f0c180fb12e4b29c8

Please try this version (named 3.7.8-beta2) and let me know if it works
around your problem. Usage:

  check_openmanage --snmp-timeout <integer>

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: timeout vs. SNMP timeout

2012-12-07 Thread Trond Hasle Amundsen
Andrew Daugherity adaugher...@tamu.edu writes:

 I'm troubleshooting an issue where one server is occasionally not responding 
 (I think it's a firewall or snmpd issue, not this plugin), and I noticed that 
 changing the timeout option to check_openmanage did not affect how long it 
 took before receiving the
   SNMP CRITICAL: No response from remote host A.B.C.D

 message.  Looking at the code I see the timeout option is _not_ passed to the 
 Net::SNMP session object, so the SNMP connection timeout uses the default 
 value (5 seconds according to the Net::SNMP man page, but 10 seconds in my 
 testing).

 If I pass the timeout option to the Net::SNMP->session object like so:
 
 diff --git a/check_openmanage b/check_openmanage
 index b6abec5..3558ed4 100755
 --- a/check_openmanage
 +++ b/check_openmanage
 @@ -860,6 +860,7 @@ sub snmp_initialize {
  '-port' => $opt{port},
  '-hostname' => $opt{hostname},
  '-version'  => $opt{protocol},
 +'-timeout'  => $opt{timeout},
 );
  
  # Setting the domain (IP version and transport protocol)
 
 Then it does obey the timeout option and I instead get the
   PLUGIN TIMEOUT: check_openmanage timed out after 30 seconds

 message.  This might be by design though, to have a shorter SNMP timeout and 
 different error messages, but it was perplexing to me why the timeout option 
 was seemingly not working.  Perhaps a different option for the SNMP timeout, 
 or a documentation clarification, is a better way?

Hello Andrew,

Your analysis of this problem is correct: you're hitting the Net::SNMP
timeout, which defaults to 5 seconds. There are two reasons why the
--timeout parameter isn't passed to the SNMP object:

  1. I never saw any reason to :) This is the first time I've heard of
 problems relating to it.

  2. The SNMP object timeout has limitations, it can only be between 1
 and 60 seconds. I don't know how Net::SNMP reacts if the specified
 value is outside of this range.

The documentation is lacking on this, as you pointed out, and I'll fix
that. A new option to specify the SNMP object timeout would be easy to
add, and is in my opinion a cleaner approach than just passing the
plugin timeout.
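
Because of the 1 to 60 second limit mentioned in point 2, a separate
option would need its value checked before it is handed to the SNMP
object. A minimal Python sketch of such a check, with a hypothetical
function name (the plugin itself is written in Perl and may well handle
this differently):

```python
def validate_snmp_timeout(value, low=1, high=60):
    """Return the timeout as an int, or raise if it is outside low..high."""
    seconds = int(value)  # raises ValueError for non-numeric input
    if not low <= seconds <= high:
        raise ValueError(
            "SNMP timeout must be between %d and %d seconds, got %d"
            % (low, high, seconds))
    return seconds


print(validate_snmp_timeout("15"))
```

Rejecting an out-of-range value up front avoids depending on how the SNMP
library reacts to it, which is unknown according to the note above.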

PS. I'm going away for the weekend and I'm leaving in a few minutes, so
I'll get back to you on this early next week.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Problem with check_openmanage

2012-10-12 Thread Trond Hasle Amundsen
Jens Hyllegaard (Soft Design A/S) jens.hyllega...@softdesign.dk
writes:

 I am using version 3.7.6 of check_openmanage.

 I have disabled notifications for battery charge events in the call to
 check_openmanage but I still get notifications from Nagios.

  

 This is command line I use:

 $USER1$/check_openmanage -s -p -H $HOSTADDRESS$ -b ps=all -b bat_charge

  

 This is the current output from check_openmanage for one of the servers.

 WARNING: Cache Battery 0 in controller 0 is Charging (Ready) [probably
 harmless]

Hello Jens,

There is a slight typo in your command definition. Replace with:

  $USER1$/check_openmanage -s -p -H $HOSTADDRESS$ -b ps=all -b bat_charge=all

..and you should be fine :)
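
The difference is easy to miss: each blacklist entry needs the
component=id form (with "all" as a possible id), and a bare keyword does
not do what one might expect. A small pre-flight check along these lines
could catch it. The function name is hypothetical and the check only
looks at the shape of the entry, not at whether the component name is
one the plugin knows.

```shell
#!/bin/sh
# Hypothetical pre-flight check: a blacklist entry must look like
# component=id (for example ps=all). A bare keyword such as
# "bat_charge" is reported as malformed.
check_blacklist_entry() {
    case $1 in
        ?*=?*)
            echo "ok: $1"
            ;;
        *)
            echo "malformed: $1 (expected component=id or component=all)"
            return 1
            ;;
    esac
}
```

With the command from the question, "ps=all" passes and "bat_charge" is
flagged, while "bat_charge=all" passes.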

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: fix build on SUSE (docbook pkg name)

2012-08-06 Thread Trond Hasle Amundsen
Andrew Daugherity adaugher...@tamu.edu writes:

 Simple fix -- the package is named 'docbook-xsl-stylesheets' instead
 of 'docbook-style-xsl'.  I added a variable for this to the global if
 suse section.

Thanks Andrew, applied and pushed to master.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Dell Openmanage

2012-07-01 Thread Trond Hasle Amundsen
Sven Dohmen s...@o4s.nl writes:

 Since several months we are using the Dell Openmanage plugin from
 http://folk.uio.no/trondham/software/check_openmanage.html. This has been
 working fine until the last couple of weeks.

 For some servers we are getting the following results back:

 W: Controller 0 [PERC 6/i Integrated]: Firmware '6.2.0-0013' is out of date
 -- SYSTEM: PowerEdge R710, SN:
 INTERNAL ERROR: Use of uninitialized value within %fw_type in string eq at
 (eval 1) line 4976.
 INTERNAL ERROR: Use of uninitialized value within %fw_type in pattern match
 (m//) at (eval 1) line 4980.

 I noticed this only happens when 1 of the drivers is out of date. Is there a
 solution for without directly updating the firmware (which is already planned
 over several weeks).

In case anyone else has this issue.. Sven and I worked on this off-list,
and we identified this to be an error related to using the '-o' option
over SNMP, on servers equipped with iDRAC6 or iDRAC7 management
cards. The plugin check_openmanage has been fixed and a new release
(version 3.7.6) is available:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

Notice for RHEL and Fedora users: The new release has been submitted
as an update for Fedora and Fedora EPEL. It is currently in testing, and
can be updated with:

  yum --enablerepo=\*testing update nagios-plugins-openmanage

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Warning alert isn't working

2012-04-13 Thread Trond Hasle Amundsen
Leonardo Bacha Abrantes leona...@lbasolutions.com writes:

 Hi everybody!

 I'm using the check_openmanage plugin in nagios to monitor the temperature
 of my dell servers.
 It's working, however, the warning and critical alerts that I configure are
 not working.

 [root@monitor:/etc/openmanage]# /usr/lib/nagios/plugins/check_openmanage -w 25
 -c 30 -H 10.11.12.1 -C Test --only temp
 TEMPERATURES OK - 1 temperature probes checked:BRTemperature Probe 0 [System
 Board Ambient Temp] reads 30 C (min=8/3, max=42/47)

 The temperature is 30 and the check should appear WARNING because I used
 -w 25.

Hello Leonardo,

The syntax you're using with the '-w' and '-c' options is wrong. From
the manual page:

   -w, --warning STRING or FILE
   Override the machine-default temperature warning
   thresholds. Syntax is id1=max[/min],id2=max[/min],... The
   following example sets warning limits to max 50C for probe 0,
   and max 45C and min 10C for probe 1:

   check_openmanage -w 0=50,1=45/10

   The minimum limit can be omitted, if desired. Most often, you
   are only interested in setting the maximum thresholds.

   This parameter can be either a string with the limits, or a
   file containing the limits string. The option can be
   specified multiple times.

   NOTE: This option should only be used to narrow the field of
   OK temperatures wrt. the OMSA defaults. To expand the field
   of OK temperatures, increase the OMSA thresholds. See the
   plugin web page for more information.

   -c, --critical STRING or FILE
   Override the machine-default temperature critical
   thresholds. Syntax and behaviour is the same as for warning
   thresholds described above.

The reason that you need to specify the ID of the temperature probes is
that there may be more than one, each with its own thresholds. In your
case there is only one probe and its ID is 0, so replace your command
above with:

  check_openmanage -w 0=25 -c 0=30 -H 10.11.12.1 -C Test --only temp

That should do the trick.
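
To make the id=max[/min] syntax concrete, here is a small sketch that
splits a threshold string into its parts. It is not the plugin's own
parser, and the function name and output format are made up for
illustration.

```shell
#!/bin/sh
# Hypothetical helper: split a threshold string such as "0=50,1=45/10"
# into one line per probe. A missing minimum is printed as "-".
parse_thresholds() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS='=' read -r probe limits; do
        max=${limits%%/*}
        case $limits in
            */*) min=${limits#*/} ;;
            *)   min="-" ;;
        esac
        echo "probe=$probe max=$max min=$min"
    done
}
```

For the example from the manual page, parse_thresholds "0=50,1=45/10"
lists probe 0 with max 50 and no minimum, and probe 1 with max 45 and
min 10.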

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: Physical Disk ... Undefined value 4096

2012-02-01 Thread Trond Hasle Amundsen
Helmut Wollmersdorfer helmut.wollmersdor...@fixpunkt.de writes:

 Physical Disk 0:0:0 [Dell WDC WD1003FBYX-18Y7B0, 1.0TB] on ctrl 0  
 needs attention: Undefined value 4096

Hello Helmut,

The state value for physical disks via SNMP is an integer, which is
translated by the plugin. There are a few defined values, and 4096 is
not one of them.
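
The translation works roughly like a lookup with a fallback, as in the
sketch below. The table here is a small illustrative subset and should
be treated as an assumption, not as the plugin's actual list; the Dell
storage MIB is the place to check the real values.

```shell
#!/bin/sh
# Hypothetical lookup: translate a numeric disk state into text.
# The numbers below are an illustrative subset (assumption); anything
# not in the table falls through to "Undefined value N".
pdisk_state_text() {
    case $1 in
        1) echo "Ready" ;;
        2) echo "Failed" ;;
        3) echo "Online" ;;
        4) echo "Offline" ;;
        *) echo "Undefined value $1" ;;
    esac
}
```

Calling pdisk_state_text 4096 takes the fallback branch, which is the
kind of message seen in the output above.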

 On the console of the server:

 # /opt/dell/srvadmin/bin/omreport storage pdisk controller=0 vdisk=0
 List of Physical Disks belonging to VD10A

 Controller PERC H700 Integrated (Slot 4)

 Span 0
 ID: 0:0:0
 Status: Unknown
 Name  : Physical Disk 0:0:0
 State : Unknown
 Power Status  : Spun Up
 Bus Protocol  : SATA
 Media : HDD
 Revision  : 01.01V02
 Failure Predicted : No
 Certified : Yes
 Encryption Capable: No
 Encrypted : Not Applicable
 Progress  : Not Applicable
 Mirror Set ID : 0
 Capacity  : 931.00 GB (999653638144 bytes)
 Used RAID Disk Space  : 931.00 GB (999653638144 bytes)
 Available RAID Disk Space : 0.00 GB (0 bytes)
 Hot Spare : No
 Vendor ID : DELL
 Product ID: WDC WD1003FBYX-18Y7B0
 Serial No.: WD-WCAW3145836558365
 Part Number   : TH0V8FCR1255213BC4RGA00
 Negotiated Speed  : 3.00 Gbps
 Capable Speed : 3.00 Gbps
 Manufacture Day   : Not Available
 Manufacture Week  : Not Available
 Manufacture Year  : Not Available
 SAS Address   : 443322110700

 [same for all 4 disks of the array]

 Thus it seems that check_openmanage works correctly. Also the disk-array
 seems to work correctly (no error messages in the logs).

 Is this some sort of wrong diagnostic from the firmware/controller?

No, this is not normal behaviour. I've seen this only on disks that were
so damaged that Openmanage failed miserably when attempting to get info
from them. Clearly this is not the case here, as you get the same error
on multiple disks and they otherwise work fine.

If you haven't already, you should try upgrading all BIOS and firmware
on the server, especially the controller firmware. You should also
upgrade Openmanage if you're not running the latest version (6.5.0).

If all else fails, I would contact Dell support and have them look at
it.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] SELinux and RHEL6.2 preventing disk checks via NRPE

2011-12-13 Thread Trond Hasle Amundsen
Dennis Kuhlmeier kuhlme...@riege.com writes:

 Geez, there are a lot more contexts set than I thought. I should
 probably remove duplicate entries, right?

The labels in

  /etc/selinux/targeted/contexts/files/file_contexts

are there by default and should not be touched. The file

  /etc/selinux/targeted/contexts/files/file_contexts.local

contains local additions or adjustments. If there are entries there that
you think ought to be removed, you should remove them with:

  semanage fcontext -d 'entry'

Don't edit the file directly :)

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage spec file fixes for SUSE

2011-12-12 Thread Trond Hasle Amundsen
Daugherity, Andrew W adaugher...@tamu.edu writes:

 First of all, thanks for making this plugin.  It works well and is
 very handy.  As requested in the documentation, I am sending this to
 the nagios-users list rather than directly to the author.

Hello Andrew,

Excellent :) Usually a public forum is better, where everybody can
participate and share their insight.

 With some minor modifications, the package builds properly on SUSE.
 There are two main Nagios packaging differences from RedHat:

 1) All Nagios plugins are installed to /usr/lib/nagios/plugins, even
 on 64-bit (there is no /usr/lib64/nagios directory).  This may not
 make the most sense, but it is what it is, and being consistent with
 other Nagios packages is good.

 2) Non-binary plugin RPMs (e.g. Perl scripts only) use noarch, while
 binary plugins use the corresponding arch.  For examples of both,
 browse the build service repo at
 http://download.opensuse.org/repositories/server:/monitoring/SLE_11.1/
 Being a Perl script, check_openmanage falls under the former.

 This is easily solved with an %if block to make a universal RPM spec:
  BEGIN PATCH 
 --- nagios-plugins-openmanage.spec.orig   2011-10-05 10:00:18.0 -0500
 +++ nagios-plugins-openmanage.spec   2011-12-01 15:02:10.0 -0600
 @@ -5,6 +5,16 @@
 # No binaries here, do not build a debuginfo package
 %global debug_package %{nil}

 +# SUSE installs Nagios plugins under /usr/lib, even on 64-bit
 +# It also uses noarch for non-binary Nagios plugins
 +%if %{defined suse_version}
 +%global nagiospluginsdir /usr/lib/nagios/plugins
 +BuildArch:   noarch
 +%else
 +%global nagiospluginsdir %{_libdir}/nagios/plugins
 +%endif
 +
 +
 Name:  nagios-plugins-openmanage
 Version:   3.7.3
 Release:   1%{?dist}
  END PATCH 

 I also tested building on CentOS 5 to make sure nothing broke there,
 and indeed, nothing changed there.

Thanks for the patch, applied. However, there have been some changes to
the spec file lately. Among them is an added Requires to the nagios-plugins
package, which owns the /usr/lib(64)?/nagios/plugins directory.
Hopefully SUSE does the same in this respect. The updated spec file is
available here:

  http://folk.uio.no/trondham/software/tmp/nagios-plugins-openmanage.spec

PS. check_openmanage has been added to Fedora and EPEL, but there are
some SELinux issues. Until these are resolved I'll hold off pushing it
to stable, but it is available in testing.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] SELinux and RHEL6.2 preventing disk checks via NRPE

2011-12-09 Thread Trond Hasle Amundsen
Dennis Kuhlmeier kuhlme...@riege.com writes:

 Hello,

 after upgrading to RHEL6.2 I have problems checking some
 filesystems. Always the same three FS on all hosts, others work fine.

 /boot
 /home
 /var/log/audit

 $ ./check_nrpe -H backup -c check_fs_boot
 DISK CRITICAL - /boot is not accessible: Permission denied

 Now I disable SELinux and it works!
 $ ./check_nrpe -H backup -c check_fs_boot
 DISK OK - free space: /boot 36 MB (39% inode=99%);| /boot=55MB;96;;0;96

 Although not a single line is logged on the monitored host, neither
 in messages nor in audit.log

 I already had a local policy created for the nrpe daemon when RHEL6
 was introduced, as somehow many checks failed, although the user
 nrpe was running in was allowed to perform all checks, the nrpe
 daemon itself couldn't. I'll attach the policy, although at one
 point I gave up and just set the entire process to permissive mode.
 (note that I tried to extend rights on boot filesystem in this
 policy already, although it would seem to be unnecessary)

 Anybody experiencing something alike or any suggestions about how to
 handle nrpe and RHEL6(.2) in a better way than I am?

RHEL6 has the following labels for use with Nagios plugins:

  # grep nagios /etc/selinux/targeted/contexts/files/file_contexts \
      | grep plugin_exec | cut -d: -f3 | sort -u
  nagios_admin_plugin_exec_t
  nagios_checkdisk_plugin_exec_t
  nagios_mail_plugin_exec_t
  nagios_services_plugin_exec_t
  nagios_system_plugin_exec_t
  nagios_unconfined_plugin_exec_t

Try setting the confined types first, e.g.:

  chcon -t nagios_checkdisk_plugin_exec_t /path/to/check_fs_boot

If none of them works properly, you have nagios_unconfined_plugin_exec_t
as a last resort.

When you find one that works, make it permanent with:

  semanage fcontext -a -t type '/path/to/check_fs_boot'

You may also have to set proper labels on the path leading up to the
actual plugin.
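
The procedure above can be sketched as a dry run that only prints the
commands, so they can be reviewed before anything is relabelled. The
function names are made up, the order of the confined types is an
arbitrary choice, and the restorecon step is an addition not mentioned
above (it re-applies the recorded context to the file).

```shell
#!/bin/sh
# Hypothetical dry run: print (do not execute) a chcon command for each
# plugin type, confined types first and the unconfined type last.
print_label_attempts() {
    plugin=$1
    for type in nagios_checkdisk_plugin_exec_t \
                nagios_system_plugin_exec_t \
                nagios_services_plugin_exec_t \
                nagios_admin_plugin_exec_t \
                nagios_mail_plugin_exec_t \
                nagios_unconfined_plugin_exec_t; do
        echo "chcon -t $type $plugin"
    done
}

# Print the commands that would make a working type permanent.
# $1 = type that worked, $2 = path to the plugin.
print_label_permanent() {
    echo "semanage fcontext -a -t $1 '$2'"
    echo "restorecon -v '$2'"
}
```

Run the printed chcon lines one at a time, test the check through NRPE
after each, and stop at the first type that works.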

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo




Re: [Nagios-users] check_openmanage plugin: Couldn't run command ...

2011-11-17 Thread Trond Hasle Amundsen
Corcoran Smith corco...@flair4it.co.uk writes:

 First message, so please excuse any failures in format, etc!

 Got two issues with two boxes (out of 160!) using check_openmanage:

 1) Couldn't run command 'c:\pro... ' etc
 2) Unrecognized character xA8: marked by -- HERE after -- HERE near column
 1 at /loader/HASH(0xa7c42c)/UNIVERSAL.pm line 1.

 both are using the windows exe

Hi Corcoran,

I'll need more data to debug the first issue, e.g. the full error
message from the plugin. Unless they appear on the same server(?), in
which case issue 1 is probably caused by issue 2.

Regarding issue 2, I've seen this once before. A disk was so damaged
that OMSA failed while getting info from it, and gave an error message
like the one above: unrecognized character... This output is something
that the plugin doesn't expect and couldn't possibly prepare for, so it
throws an error.

You need to identify the failed component, it probably needs to be
replaced. Try running 'omreport' commands to find it. Start with
'omreport storage pdisk controller=0'.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage on CentOS 5.6 Hosts

2011-11-05 Thread Trond Hasle Amundsen
the entrox ent...@stoleyour.com writes:

 i've been using the check_openmanage script to monitor about two dozen dell
 servers without a hitch (all Windows based) and we just set up about 15 or so
 new servers but this time running CentOS, i of course installed the OMSA via
 Dell's repository and also enabled SNMP but i cant seem to get the command to
 work on those hosts.

 i am trying to run the debug command to look at the entire output like this:

 [root@MONITOR02 plugins]# ./check_openmanage -H HOSTIP -C COMMUNITY -d
 ERROR: (SNMP) OpenManage is not installed or is not working correctly

This error means that the SNMP service on the monitored host is working
and we get a reply, but the OIDs for OMSA are not present.

 i of course checked where the omreport binary was at and its where the script
 is looking for it:

 [root@mvarutestvmbase01 ~]# find / -name omreport
 /opt/dell/srvadmin/sbin/omreport
 /opt/dell/srvadmin/bin/omreport
 [root@mvarutestvmbase01 ~]#

When using SNMP, the plugin doesn't utilize the omreport binary in any
way. It doesn't care where it is installed. BTW, the above location is
correct and is the default.

 just to double check i went ahead and looked if the OMSA was working, i went
 via web and the console shows up no problem at all, if i authenticate it shows
 all the information that it should be showing, i also restarted all the
 services on the OMSA just to see if something was up but nothing, it still
 claims its not working:

 http://pics.entrox.me/983ygh426g.png

This is interesting. The SNMP service wasn't started. You should see
something like this:

  Starting dsm_sa_snmpd: [  OK  ]

The dsm_sa_snmpd service is started by /etc/init.d/dataeng. This script
is also responsible for starting other components such as
dsm_sa_datamgrd, and that seems to work fine.

You should also see dsm_sa_snmpd in the process list if it's running:

  # ps axww | grep dsm_sa_snmpd 
   4967 ?Ssl0:00 /opt/dell/srvadmin/sbin/dsm_sa_snmpd

From what I can gather from the dataeng init script, it won't start
dsm_sa_snmpd if this file exists:

  /opt/dell/srvadmin/var/lib/srvadmin-deng/dcsnmp.off

If it exists on your system, try removing it and restart OMSA.

Also verify that your /etc/snmp/snmpd.conf contains the following at the
very end:

  # Allow Systems Management Data Engine SNMP to connect to snmpd using SMUX
  smuxpeer .1.3.6.1.4.1.674.10892.1

This should have been added by OMSA at install time.
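
The two file checks above can be scripted. The sketch below takes both
paths as parameters so it can be tried against copies of the files; the
function name and messages are made up for illustration, and it does not
check whether the dsm_sa_snmpd process is actually running.

```shell
#!/bin/sh
# Hypothetical check. $1 = path to snmpd.conf, $2 = path to the
# dcsnmp.off flag file. Returns non-zero if either check fails.
check_omsa_snmp() {
    conf=$1
    flag=$2
    status=0
    if [ -e "$flag" ]; then
        echo "flag file $flag exists: dsm_sa_snmpd will not be started"
        status=1
    fi
    if ! grep -Eq '^[[:space:]]*smuxpeer[[:space:]]+\.1\.3\.6\.1\.4\.1\.674\.10892\.1' "$conf" 2>/dev/null; then
        echo "smuxpeer line for OMSA not found in $conf"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "flag file absent and smuxpeer line present"
    fi
    return "$status"
}
```

On a default install it would be called as:

  check_omsa_snmp /etc/snmp/snmpd.conf \
      /opt/dell/srvadmin/var/lib/srvadmin-deng/dcsnmp.off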

 i also read on the man page of the script
 (http://folk.uio.no/trondham/software/check_openmanage.html) that i could use
 the --omreport option but no dice with that, even trying the bin and sbin
 omreport binary file i got the exact same message:

This option allows you to specify the location of the omreport
command. It has no effect when using SNMP, and is only really usable on
Windows systems, where OMSA can be installed on drives other than C:.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: OOPS! Something is wrong...

2011-09-30 Thread Trond Hasle Amundsen
Lois Garcia l...@rockyou.com writes:

 This is the output from omreport chassis pwrsupplies -fmt ssv:

 C:\Users\Administrator>omreport chassis pwrsupplies -fmt ssv
 Power Supplies Information

 Power Supply Redundancy
 Redundancy Status;Lost

 Individual Power Supply Elements

 Index;Status;Location;Type;Rated Input Wattage;Maximum Output Wattage;Online Status;Power Monitoring Capable
 0;Ok;PS 1 Status;AC;[No Value];[No Value];Presence Detected;Yes
 1;Ok;PS 2 Status;AC;1080 W;870 W;Presence Detected;Yes

Thanks. This shows that the plugin's behaviour was correct in my
opinion. OMSA states that both PSUs are OK, which is what the plugin
reports. There is a bug somewhere, but it is probably in OMSA. My guess
is that there is a rare and unknown error condition in PSU1, which OMSA
doesn't handle correctly.
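
For anyone who wants to read the ssv output the same way, the sketch
below pulls the index and status out of each power supply row. It
assumes the output has been saved or piped on a host with awk available
and that data rows start with a numeric index, as in the output above;
it is not the plugin's own parsing code.

```shell
#!/bin/sh
# Hypothetical helper: read "omreport chassis pwrsupplies -fmt ssv"
# output on stdin and print "index status" for every row whose first
# field is a number. Header and redundancy lines are skipped.
psu_status() {
    awk -F';' '$1 ~ /^[0-9]+$/ { print $1, $2 }'
}
```

Fed the output above, both rows come out with status Ok, even though
the redundancy line says Lost and PS 1 shows no wattage values.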

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: OOPS! Something is wrong...

2011-09-28 Thread Trond Hasle Amundsen
Lois Garcia l...@rockyou.com writes:

 Thank you, Trond! It looks like a power supply problem. I will take the issue
 to Dell:

 C:\Users\Administrator>omreport system
 Health

 SEVERITY : COMPONENT
 Critical : Main System Chassis


 C:\Users\Administrator>omreport chassis
 Health

 Main System Chassis

 SEVERITY : COMPONENT
 Ok   : Fans
 Ok   : Intrusion
 Ok   : Memory
 Critical : Power Supplies
 Ok   : Power Management
 Ok   : Processors
 Ok   : Temperatures
 Ok   : Voltages
 Ok   : Hardware Log
 Ok   : Batteries

Hmm... there is obviously something amiss with the power supplies, but
the plugin didn't catch it. I'd like to know why. Can you provide the
output from:

  omreport chassis pwrsupplies -fmt ssv

This is the command that the plugin runs to get the status of the power
supplies.
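
When comparing outputs like the ones above, it can help to filter the
severity table down to the components that are not Ok. A minimal sketch,
assuming the "SEVERITY : COMPONENT" layout shown above and a host with
awk; the function name is made up:

```shell
#!/bin/sh
# Hypothetical filter: read "omreport chassis" style output on stdin and
# print every component whose severity is something other than "Ok".
non_ok_components() {
    awk -F' +: +' 'NF == 2 {
        sev = $1
        sub(/^[ \t]+/, "", sev)   # tolerate leading whitespace
        if (sev != "SEVERITY" && sev != "Ok")
            print $2 " (" sev ")"
    }'
}
```

For the chassis output above this leaves a single line for the power
supplies.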

 Thank you also for putting such a great plugin into the
 community. Without it, monitoring the few Windows machines in our all
 Linux environment would have been a chore I don't care to contemplate.

Thank you, glad you like it :)

 I don't see a donation link on your website at
 http://folk.uio.no/trondham/software/check_openmanage.html - ?

No, there is no donation link, the thought never crossed my mind. I have
benefitted enormously (personally and professionally) from free and open
source software for many years. This is just my way of giving back.
Besides, I've found that creating and maintaining open source software
is by itself rewarding, in many different ways.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] check_openmanage: OOPS! Something is wrong...

2011-09-27 Thread Trond Hasle Amundsen
lois garcia l...@rockyou.com writes:

 I have check_openmanage running successfully on 13 out of 16 Dell R710s. 
 I am really puzzled at what is going wrong, as it seems different on each
 machine. I have tried different versions of check_openmanage and 
 reinstalling the same version of Dell OMSA.

 The first eight servers were built from the same Ghost image, and last 
 month, one of those servers started showing the check_openmanage error:

 UNKNOWN 09-13-2011 17:04:23 7d 1h 7m 54s 4/4
 UNKNOWN: Storage Error! No controllers found
 UNKNOWN: Problem running 'omreport chassis memory': Error: Memory object not found
 UNKNOWN: Problem running 'omreport chassis fans': Error! No fan probes found on this system.
 UNKNOWN: Problem running 'omreport chassis temps': Error! No temperature probes found on this system.
 UNKNOWN: Problem running 'omreport chassis volts': Error! No voltage probes found on this system.

 I reinstalled the Dell software, fixing the UNKNOWN error, and now have 
 this error:

 OOPS! Something is wrong with this server, but I don't know what. The 
 global system health status is CRITICAL, but every component check is 
 OK. This may be a bug in the Nagios plugin, please file a bug report. 

 The server is a Dell R710, running Windows Server 2008 R2 Enterprise.

Hello Lois,

(I shortened the subject)

When the plugin is used in local mode, as in your case, the plugin
checks the global health status using this command:

  # omreport system
  Health
  
  SEVERITY : COMPONENT
  Ok   : Main System Chassis
  
  For further help, type the command followed by -?

If everything is OK you'll get the output above. What do you get when
running this command on the troubled server?

Does the ESM log contain any clues? Try running 'omreport system esmlog'
and see. Try running 'omreport chassis' as well.

There are two possible causes for the oops error. Either Openmanage
isn't behaving properly, or your server has an error that the plugin
doesn't catch.
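
The reasoning behind the oops message can be written down as a small
decision function. This is a sketch of the logic as described here, not
the plugin's code; the function name and messages are made up.

```shell
#!/bin/sh
# Hypothetical sketch. $1 = global health status, remaining arguments =
# the status of each individual component check.
classify_health() {
    global=$1
    shift
    for status in "$@"; do
        if [ "$status" != "Ok" ]; then
            echo "component problem found"
            return 0
        fi
    done
    if [ "$global" = "Ok" ]; then
        echo "all fine"
    else
        echo "oops: global status is $global but every component check is Ok"
    fi
}
```

The oops branch is only reached when no component check explains a
non-Ok global status, which is why it points at either Openmanage or a
gap in the plugin's checks.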

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] home made php script

2011-08-14 Thread Trond Hasle Amundsen
Erik Olsen nag...@elitdata.no writes:

 I've been trying to make my own script now for a few hours but im not
 getting it to work with nagios.
 Im most familiar with php so I used that to make the script.

 My setup:
 Ubuntu 11.4 server
 Nagios 3.2.3

 The host/command/and service are all in the same .cfg file.

 define command{
         command_name    check_ups_temprature2
         command_line    $USER$/check_ups_temp.php
 }

 define service{
         use                     generic-service
         host_name               ups1
         service_description     Temp ups env sensor
         check_command           eaton_ups_temp
 }

 Status Information  (Return code of 127 is out of bounds - plugin may be
 missing)

Hi Erik,

There is a typo on the command_line line. The $USER$ macro doesn't
exist. There are 32 possible user macros, named $USER1$ through
$USER32$. Try replacing $USER$ with $USER1$, or simply the actual path
leading up to the plugin.
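
The return code 127 in the status information is what a shell reports
when it cannot find the command it was asked to run. The sketch below
reproduces that outside Nagios. The path is deliberately bogus, and how
exactly Nagios treats the undefined macro is not shown here; the point
is only that the resulting path does not lead to the plugin.

```shell
#!/bin/sh
# Hypothetical demonstration: run a command whose path does not exist
# and print the exit status the shell reports for it.
run_missing_plugin() {
    sh -c '/no/such/dir/check_ups_temp.php' 2>/dev/null
    echo $?
}
```

Running the plugin by hand with its full path, as the nagios user, is a
quick way to confirm the corrected command line before reloading Nagios.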

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] omreport and check_openmanage

2011-07-06 Thread Trond Hasle Amundsen
Emilio Bruna emilio.br...@heliman.it writes:

 Thanks a lot for your hints Trond,
 check_openmanage is already at latest version.

 We will try with an OMSA update first and then (if the issue persist)
 we will update BIOS too.

If all else fails, you have the option of disabling the power management
check completely, by using '--check amperage=0':

  check_openmanage --check amperage=0

By using this option you're telling the plugin that it shouldn't even
attempt to run 'omreport chassis pwrmonitoring'.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] omreport and check_openmanage

2011-07-04 Thread Trond Hasle Amundsen
Emilio Bruna emilio.br...@heliman.it writes:

 Omsa version is 6.2.0.1
 so: windows 2008 storage server SP2
 Hardware is Dell NX 300 Storage server (a derivate of R410 or R310 i think)

This combination should be ok. I don't know the NX300, but if it's based
on the R310 or R410 it shouldn't be a problem. There was a bug in
check_openmanage related to power monitoring on the R410, but this was
fixed in version 3.6.5 of the plugin. Are you using the latest version
of check_openmanage, which is 3.6.8?

Also, would it be possible for you to upgrade OMSA to the latest
version, 6.5.0?

This really is an OMSA issue. If the power supplies don't support power
monitoring, omreport should just say so, and check_openmanage will be
happy. But in your case, OMSA is responding with an error.

One last tip. In some cases I've seen that certain capabilities in OMSA
depend on BIOS and/or firmware versions. You should verify that the
BIOS and firmware are relatively up-to-date.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo




Re: [Nagios-users] check_openmanage error on W2k8r2 Dell R900

2011-07-03 Thread Trond Hasle Amundsen
Jay Wahl j...@firewahl.com writes:

 Love check_openmanage plugin for Nagios! It has been a great help for
 monitoring our Dell hardware. I recently built 3 Dell 900s with W2K8r2 with
 check_openmanage (v 3.6.8) and Dell OMSA (v 6.5.0).

Hi Jay,

Are you completely sure that you're using version 3.6.8? My reason for
asking is that the errors you get don't make sense (details below).

 I am getting the following errors:
 C:\Program Files\NSClient++\scripts>check_openmanage
 Problem running 'omreport chassis memory': Error Correction;Multibit ECC

This was fixed a while back (version 3.6.3 IIRC).

The 'Error Correction' field appeared in OMSA 6.4.0 and check_openmanage
triggers on strings containing 'Error'. The particular string above
obviously does not indicate an actual error and was put in the whitelist
for errors shortly after OMSA 6.4.0 was released.
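
The idea of matching on 'Error' with a whitelist can be sketched in a
few lines of shell. This is only an illustration, not the plugin's
actual code (the plugin is written in perl), and the function name
flag_errors is made up. The whitelist here holds just the one string
mentioned above:

```shell
# Print the lines that look like real errors. Anything containing
# "Error" is flagged unless it starts with a whitelisted field name.
flag_errors() {
    grep 'Error' | grep -v '^Error Correction' || true
}

# The first line is the harmless memory field, the second a real error.
printf '%s\n' 'Error Correction;Multibit ECC' \
              'Error: Current probes not found' | flag_errors
```

Only the second line should survive the filter.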

 INTERNAL ERROR: Use of uninitialized value in concatenation (.) or string at
 script/check_openmanage line 1650.
 INTERNAL ERROR: Use of uninitialized value in concatenation (.) or string at
 script/check_openmanage line 1650.

These two don't make any sense, since line 1650 only contains a comment.
They are also probably not related to the memory check.

Please verify the version of check_openmanage. The plugin will output
its version number with either of these options:

  check_openmanage -V
  check_openmanage -d

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] omreport and check_openmanage

2011-07-01 Thread Trond Hasle Amundsen
Emilio Bruna emilio.br...@heliman.it writes:

 Hello all,
 i'm monitoring several Dell windows servers with nagios and NSClient++
 and OMSA + check_openmanage.  On one of these, i'm getting a problem
 monitoring the redundant power supplies.

 Running the command below LOCALLY on the machine being monitored i got
 the right data from omreport.exe:

 c:\Program Files (x86)\Dell\SysMgt\oma\bin>omreport.exe chassis pwrsupplies
 Power Supplies Information

 ---
 Main System Chassis Power Supplies : Ok
 ---

 Power Supply Redundancy : Ok
 Attribute : Redundancy Status
 Value : Full
 Individual Power Supply Elements
 Index    : 0
 Status   : Ok
 Location : PS 1 Status
 Type : AC
 Rated Input Wattage  : 680 W
 Maximum Output Wattage   : 500 W
 Online Status    : Presence Detected
 Power Monitoring Capable : Yes

 Index    : 1
 Status   : Ok
 Location : PS 2 Status
 Type : AC
 Rated Input Wattage  : 680 W
 Maximum Output Wattage   : 500 W
 Online Status    : Presence Detected
 Power Monitoring Capable : Yes

 running the below command (the ones needed to check_openmanage):

 c:\Program Files 
 (x86)\Dell\SysMgt\oma\binc:\Users\administrator.CMVC\Desktop\
 check_openmanage.exe --omreport c:\Program Files (x86)\Dell\SysMg
 mreport.exe
 Problem running 'omreport chassis pwrmonitoring': Error: Current probes not
 found

 i've noticed that the switches coming from check_openmanage are
 slightly different from the ones passed from omreport.exe (omreport
 chassis pwrmonitoring instead of omreport chassis pwrsupplies)

 so it seems that check_openmanage has the wrong switches regard to the
 powermonitoring check status; or maybe the omsa version i'm using is
 not at the correct version to work in the right way with
 check_openmanage.

Hi Emilio,

Don't confuse the two arguments 'pwrsupplies' and 'pwrmonitoring'. They
do different things, and check_openmanage uses both of them. It runs
'omreport chassis pwrsupplies' to get the status of the power supplies,
and it runs 'omreport chassis pwrmonitoring' to get the status and value
of the amperage probes. The latter includes the overall power
consumption of the server.

In your case, it's the 'pwrmonitoring' command that fails. This is a
known problem with some older versions of OMSA. Which version of OMSA
are you running, and on what kind of PowerEdge server?
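
To see quickly which of the two omreport calls is the one failing, a
small wrapper like the following may help. It is a rough sketch for a
Unix-like shell (on Windows you would simply run the two omreport
commands by hand). The names OMREPORT and probe_chassis are made up,
and it assumes a failure shows up either as a non-zero exit status or
as the string 'Error' in the output; only the latter is confirmed by
the output quoted above:

```shell
# OMREPORT is a hypothetical override so the sketch can be pointed at a
# different binary; by default it runs the real omreport.
OMREPORT=${OMREPORT:-omreport}

# Run 'omreport chassis <subcommand>' and say whether it answered with
# an error, without looking at the actual values.
probe_chassis() {
    if out=$("$OMREPORT" chassis "$1" 2>&1); then
        case $out in
            *Error*) echo "$1: FAILED" ;;
            *)       echo "$1: ok" ;;
        esac
    else
        echo "$1: FAILED"
    fi
}

probe_chassis pwrsupplies
probe_chassis pwrmonitoring
```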

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo

Re: [Nagios-users] Check_Openmanage configuration question

2011-05-10 Thread Trond Hasle Amundsen
Daniel Ceola dce...@twgi.net writes:

 Hello all!

Hi Daniel,

 I have a question regarding the initial configuration of
 check_openmanage.  I downloaded the version of the script dated Feb 9
 (I don't see a version number in the script) and am attempting to use
 the script through SNMP.

Tip: Run the plugin with the '-V' or '--version' switch to view the
version number.

 I'm attempting to begin using check_openmanage with our Dell servers.
 I have installed the Dell OMSA software on one server and it seems to
 be working just fine.  I configured my command definition in a simple
 fashion, according to the installation guide:

 #  Dell Check openmanage

 define command{
 command_name    check_openmanage
 command_line    $USER1$/check_openmanage -H $HOSTADDRESS$
 }

 I also configured my service definition in a simple fashion, according
 to the installation guide:

 define service{
 use generic-service
 host_name   Server_Name
 service_description Dell OMSA
 check_command   check_openmanage
 }

This looks correct to me.

 However, my Nagios console is reporting the status as (null).  Also,
 when I attempt to run the script from the command line (note the file
 is saved as check_openmanage with no file extension, I also tried
 check_openmanage.pl and receive the same results), I receive a few
 errors

 nagios@UbuntuTest:/usr/local/nagios/libexec$ ./check_openmanage 192.168.1.5
 ./check_openmanage: line 27: require: command not found
 ./check_openmanage: line 28: use: command not found
 ./check_openmanage: line 29: use: command not found
 ./check_openmanage: line 30: syntax error near unexpected token `('
 ./check_openmanage: line 30: `use POSIX qw(isatty ceil);'

Weird. Your system seems to be running the plugin through a shell. The
output above is exactly what you'll get if you run

  sh ./check_openmanage

To specify perl as interpreter, run:

  perl ./check_openmanage

However, this should not be needed. The system should identify it as a
perl script and use perl to execute it by default. Have you edited the
plugin in some way? Check that the md5sum is correct:

  $ md5sum check_openmanage
  5281718fe9e5c4b9570fe76f0fb424ec  check_openmanage

The above sum is correct for version 3.6.6. You should verify that you
get the same (if running 3.6.6). The latest version and its md5sum are
available here:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

PS. In your example above you have forgotten the '-H' switch.

PPS. The file extension (or the name itself) is unimportant.
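
One quick thing to look at is the first line of the file, since that is
what tells the system which interpreter to use. If the "#!" line is
missing or not the very first line (a stray blank line is enough), the
calling shell may fall back to running the file itself, which gives
errors like the ones above. A minimal sketch; the function name
shebang_of is made up, and the demo files only assume that the plugin
starts with a perl "#!" line:

```shell
# Print the interpreter named on the first line of a script, or
# "none" when the file does not start with "#!".
shebang_of() {
    first=$(head -n 1 "$1")
    case $first in
        '#!'*) printf '%s\n' "${first#??}" ;;
        *)     echo none ;;
    esac
}

# Two throwaway demo files: one intact, one with a blank line on top.
tmp=$(mktemp -d)
printf '#!/usr/bin/perl\nuse strict;\n' > "$tmp/good"
printf '\n#!/usr/bin/perl\nuse strict;\n' > "$tmp/bad"
shebang_of "$tmp/good"
shebang_of "$tmp/bad"
rm -r "$tmp"
```

Run against the real file it would be 'shebang_of ./check_openmanage'.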

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage errors

2011-04-28 Thread Trond Hasle Amundsen
Steve Glasser sglas...@visp.net writes:

 That combination should work just fine. Please try either of the beta
 versions, as I suggested in my previous email. The issue you're having
 may very well be fixed in the betas.

 Tried check_openmanage-3.7.0-beta2.0-beta2, problem solved.

Excellent, thanks for testing and reporting back. I've just released
version 3.6.6, which contains the same bugfixes as the 3.7 beta, but not
the (unfinished) new features :)

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



[Nagios-users] check_openmanage PNP template (Was: check_openmanage errors)

2011-04-28 Thread Trond Hasle Amundsen
Randal, Phil pran...@herefordshire.gov.uk writes:

 Is the beta of check_openmanage.php available for testing?

Sure, I put it here:

  http://folk.uio.no/trondham/software/beta/

Highlights of the template are:

  - works with the plugin's new perfdata API
  - removed unnecessary dependence on PHP >= 5.2 (good for rhel/centos 5
    users)
  - calculate power usage for the selected time period, in Watt hours
    and BTU
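
For the power usage figure, the arithmetic is just average power times
the length of the period, with 1 Wh being roughly 3.412 BTU. A small
sketch of that calculation, not the template's actual PHP code; the
function name energy and the sample numbers are made up:

```shell
# energy AVG_WATTS SECONDS
# Print the energy used over the period in Watt hours and BTU.
# Conversion factor: 1 Wh is about 3.412142 BTU.
energy() {
    awk -v w="$1" -v s="$2" 'BEGIN {
        wh = w * s / 3600
        printf "%.1f Wh, %.1f BTU\n", wh, wh * 3.412142
    }'
}

# An average draw of 250 W over 24 hours (86400 seconds).
energy 250 86400
```

That example works out to 6000.0 Wh, or about 20472.9 BTU.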

 I'm currently using a slightly modified version of the one in the latest PNP 
 release.

 Two cosmetic issues came to mind:

 1: Temperature is measured in Celsius, not Celcius

Yep, I know. That typo was the first thing I fixed :)

 2: Formatting when reporting multiple sensors in one graph is irksome
 - the values don't align in a nice column (e.g. temperatures).  I
 'solve' this by a judicious use of substr() and str_pad() to normalise
 the length of reported sensor names.

Hm... this could be tricky to do in a consistent and general manner (at
least the substr() part). The sensor names are as reported by
OMSA. Perhaps this could be accomplished with some RRD magic instead?
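
The substr() plus str_pad() idea boils down to cutting every name to a
fixed width and padding the short ones. Sketched in shell rather than
PHP, with made-up sensor names and a made-up function name pad_name:

```shell
# pad_name NAME WIDTH
# Truncate NAME to WIDTH characters and pad shorter names with spaces,
# so that whatever is printed after it lines up in a column.
pad_name() {
    printf "%-${2}.${2}s\n" "$1"
}

printf '%s %s\n' "$(pad_name 'System Board Ambient Temp' 12)" '23 C'
printf '%s %s\n' "$(pad_name 'CPU1 Temp' 12)" '41 C'
```

The catch is visible in the first line: blind truncation turns the name
into 'System Board', which is why doing this in a general way is
tricky.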

Tips and hints are welcome, since I'm neither a PHP expert nor an RRD
ninja :)

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage

2011-04-27 Thread Trond Hasle Amundsen
Ashcor Technologies ash...@optonline.net writes:

 ok, talked to dell, there is no hardware on the T105 that will allow
 monitoring of the fan, voltage, etc.. basically the only thing you can
 monitor is the raid array which is fine as that's all I really want to
 check with nagios.

Ok. I don't know the 100 series, but from what I understand they are
entry-level servers with limited capabilities and a low price tag. The
plugin will barf at servers that don't have the basic monitoring probes,
unless they are absent for obvious reasons (e.g. blades don't have
fans). I still think this is a good idea, as I've seen plenty of
instances where OMSA malfunctions in such a way that it will say a probe
doesn't exist when it actually does.

I'm reluctant to change that policy, so users of the 100 series will
have to exclude certain checks in the plugin. It is not ideal, but I
believe the problem to be limited since most would go for servers with
better monitoring capabilities (i.e. 200 series and beyond).

 Still have that pesky timeout after 30 seconds error though.  tried
 with --timeout 60 and with -t 60 and nothing seems to change the
 behavior.

Still weird. Did you try running the plugin manually with the timeout
option? Try 'check_openmanage.exe -t 60 [other options]'

Perhaps OMSA on the T105 hangs on some probe that doesn't exist. If
you're only interested in monitoring storage, you could try:

  check_openmanage.exe --only storage

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check-openmanage errors after upgrade of openmanage

2011-04-27 Thread Trond Hasle Amundsen
Trond Hasle Amundsen t.h.amund...@usit.uio.no writes:

 Are you using check_openmanage with NRPE or similar in local mode, or
 checking via SNMP?

I have an idea of what the problem might be. Can you try either of the
development versions of check_openmanage available here:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check-openmanage errors after upgrade of openmanage

2011-04-27 Thread Trond Hasle Amundsen
Ashcor Technologies ash...@optonline.net writes:

 I ran check_openmanage.exe --only storage locally and it worked fine.

 I then changed the NSC.ini to have:

 command[check_openmanage]=check_openmanage.exe --only storage

 and restarted the NSCLient++ (x64) service in test mode.

 the results:

 d NSClient++.cpp(1106) Injecting: check_openmanage:
 d NSClient++.cpp(1142) Injected Result: WARNING 'Problem running 
 'omreport chassis fans': Error! No fan probes found on this 
 system.<br/>Problem running 'omreport chassis temps': Error! No 
 temperature probes found on this system.<br/>Problem running 'omreport 
 chassis volts': Error! No voltage probes found on this system.'

Ok, this actually clarifies things. Clearly, NSClient++ ignores
everything after 'check_openmanage.exe' in your NSC.ini. There is no way
that check_openmanage would complain about fans etc. when the option
'--only storage' is specified. Since it works from command line we can
safely assume that NSClient++ is the problem. This explains your issues
with the timeout option as well.

 I've looked on your site for the dev versions and am happy to try them 
 but don't see a zip with the .exe.  Is there an .exe available for the 
 dev?  also, which dev version would you prefer I try, 3.6 or 3.7?

I could make a PE32 executable for the dev versions, but in your case it
won't help, so there is really no point. Your problem is that NSClient++
ignores the plugin options.

Since I don't use NSClient++ I can't offer any insight into how it
should be configured, but my first attempt at a fix would be to put the
entire command in quotes in NSC.ini.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage errors

2011-04-27 Thread Trond Hasle Amundsen
Steve Glasser sglas...@visp.net writes:

 D'oh.  We are using check_openmanage with NRPE.  The host o/s is CentOS 
 release 5.5.  Perl is perl-5.8.8 (from rpm).

That combination should work just fine. Please try either of the beta
versions, as I suggested in my previous email. The issue you're having
may very well be fixed in the betas.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage

2011-04-26 Thread Trond Hasle Amundsen
Ashcor Technologies ash...@optonline.net writes:

 on two of my dell servers check_openmanage (via nsclient++ and nrpe) 
 return the same error:

 Use of uninitialized value in concatenation (.) or string at 
 script/check_openmanage.pl line 1386.

 both dell systems are running the latest OpenManage version 6.5.0.

Hi Jeff,

Which version of check_openmanage is this?

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage

2011-04-26 Thread Trond Hasle Amundsen
Ashcor Technologies ash...@optonline.net writes:

 Thanks for the reply.  I just realized from your question that I'm using 
 a pre-compiled .exe version of your check_openmanage from here:
  
 https://www.monitoringexchange.org/inventory/Check-Plugins/Hardware/check_openmanage-exe

 which was probably created from an older version...

Yeah I think it's pretty old. A PE32 executable for Windows is available
in the zip and tar.gz archives, and as a single file download:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

Upgrading to the latest version will probably solve your problem. Let me
know if it doesn't.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage

2011-04-26 Thread Trond Hasle Amundsen
Ashcor Technologies ash...@optonline.net writes:

 now my problem is this...

 Problem running 'omreport chassis fans': Error! No fan probes found on 
 this system.<br/>Problem running 'omreport chassis temps': Error! No 
 temperature probes found on this system.<br/>Problem running 'omreport 
 chassis volts': Error! No voltage probes found on this system.

 on the NSC.ini i have the following line added and I restarted the 
 NSClient++ service

 command[check_openmanage]=check_openmanage.exe -b fan=all

 even tried

 command[check_openmanage]=check_openmanage.exe -b fan=0

 however it still tries to check the fan.  I suppose i have a syntax 
 error?

No, that is the correct syntax. Blacklisting won't prevent the component
class from being checked in the first place; it will only suppress any
info about blacklisted components in the output and plugin return
value. To skip fans altogether, use the '--check' option like this:
'--check fans=0'.

However, unless this is a blade system and the plugin is unable to
identify it as such for some reason, your server HAS fan probes and
you're having an OMSA problem. The fact that you get errors for other
probes such as temperature and voltage confirms this.

You need to recheck that OMSA works, that all relevant OMSA components
are installed and running etc. It may be as simple as restarting OMSA,
but it could also be more complex (e.g. BIOS/firmware upgrade needed).
These errors are pretty generic, but the problem is that OMSA isn't
working properly on that server.

PS. See this URL about configuring Nagios to not escape HTML code in the
plugin output (to avoid the literal '<br/>'):

  http://folk.uio.no/trondham/software/check_openmanage.html#multiline-output

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage

2011-04-26 Thread Trond Hasle Amundsen
Ashcor Technologies ash...@optonline.net writes:

 Ok, now new and exciting changes... no matter what I do I get: WARNING 
 PLUGIN TIMEOUT: check_openmanage timed out after 30 seconds.

 I have -t 60 set on the check_openmanage command and also on the NRPE 
 check command line and in the NSC.ini.  nothing seems to change the 
 timeout beyond 30 seconds.

I forgot to mention that since you get that particular error it's the
plugin that times out, not NRPE or NSClient++. The fact that you're
unable to change that behaviour with the '-t' or '--timeout' option is
strange, but it would usually indicate a configuration error on your
part. You'll have to post the command definition etc. for me (and others
on this list) to be able to spot the error.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage

2011-04-26 Thread Trond Hasle Amundsen
Ashcor Technologies ash...@optonline.net writes:

 the server is a PowerEdge T105.  It IS running slow but I'll be damned 
 if I can figure out why, I'm beginning to suspect bad ram as the 
 performance meter reports minimal load.

One thing to check is the power management setting in the BIOS. We set
up a few blade servers recently that had this set to 'Active Power
Controller', and this caused the servers to be extremely
sluggish. Setting it to 'OS Control' or 'Maximum Performance' solved
the issue. Try:

  # omreport chassis pwrmanagement config=profile
  Power Profiles
  
  Maximum Performance : Not Selected
  Active Power Controller : Not Selected
  OS Control  : Selected
  Custom  : Not Selected

You can set the profile to max performance with:

  omconfig chassis pwrmanagement config=profile profile=maxperformance

Just a tip, but worth checking.

 here is the command line in the NSC.ini

 [modules]
 command[check_openmanage]=check_openmanage.exe -t 60 --check 
 fans=0,volt=0

 on the nagios server:

 /usr/lib/nagios/plugins/check_nrpe -H $hostname$ -p 5666 -c 
 check_openmanage -t 60

 I'm pretty sure it's not the Check_nrpe command line as this works fine 
 on several other servers.  it's def something on the client server 
 itself so this points to the NSClient++ setup.

Can't see anything wrong with these definitions..

 note I have been testing by running NSClient++.exe /test so i can watch 
 the client server and it is getting the injection command and reporting 
 the timeout locally.

Good. But it's still weird that you get a timeout after 30 seconds even
when you specify a 60 sec timeout. Try running check_openmanage.exe
manually on the server with the same options and see if it then behaves
in the same way. If so there is some sort of bug in the plugin that only
affects the .exe version.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check-openmanage errors after upgrade of openmanage

2011-04-26 Thread Trond Hasle Amundsen
Steve Glasser sglas...@visp.net writes:

 Since upgrading dell openmanage from v 6.3 to 6.5 we have errors using 
 the check-openmanage plugin.  The errors are:

 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in length at 
 /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in concatenation (.) or 
 string at /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4601.
 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4601.
 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in length at 
 /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in concatenation (.) or 
 string at /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4601.
 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4601.
 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in length at 
 /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in concatenation (.) or 
 string at /usr/lib64/nagios/plugins/check_openmanage line 4599.
 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4601.
 INTERNAL ERROR: Use of uninitialized value in hash element at 
 /usr/lib64/nagios/plugins/check_openmanage line 4601.

 The plugin reports status unknown.

 The plugin is version check-openmanage-3.6.5-1.el5, installed from RPM.
 The host is a Dell 2950.  Please let me know if I can provide any
 additional information.

Hi Steve,

Are you using check_openmanage with NRPE or similar in local mode, or
checking via SNMP?

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Why is check_openmanage so slow on PowerEdge R510?

2011-04-14 Thread Trond Hasle Amundsen
Helmut Wollmersdorfer helmut.wollmersdor...@fixpunkt.de writes:

 Another question:

 I always get on all of the R510s (few days old):

 root@xen11:~# /usr/lib/nagios/plugins/check_openmanage
 Cache Battery 0 in controller 0 is Charging (Ready) [probably harmless]
 root@xen11:~# uptime
  12:08:35 up 2 days,  1:22,  1 user,  load average: 0.00, 0.00, 0.00

 I am a little surprised that the batteries are not full after several days
 powered on, or perhaps the information is wrong.

The plugin is simply reporting what OMSA says, so if the info is wrong
it would have to be in the hardware or OMSA level. However I don't think
that this is the case. Batteries take a long time to charge for new
servers, i.e. if the battery is brand new and hasn't been charged
before.

At one time we had a battery that didn't finish charging for a week; we
called Dell and got a replacement battery. This was during a regular
charge cycle. In your case I would give it a few more days.

 I also tried '--blacklist bat_charge=0,0' (and other combinations), but
 blacklisting does not work.

Look in the debug output for the battery ID, which consists of the
controller number and battery number with colon as delimiter. In your
case it would be

  --blacklist bat_charge=0:0

or simply use 'all':

  --blacklist bat_charge=all

But, as we did in fact experience a case where the battery never
finished charging, I would advise against this. We just ignore the
battery charge warnings unless they persist for days. It can be
annoying, but we decided that we can live with it :)
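
To make the ID format concrete, here is a minimal Python sketch that builds
the blacklist value from a controller number and a battery number. The helper
name `battery_blacklist_arg` is hypothetical and not part of check_openmanage;
only the `bat_charge=<controller>:<battery>` and `bat_charge=all` forms are
taken from the message above.

```python
def battery_blacklist_arg(controller=None, battery=None):
    """Return the value to pass to --blacklist for battery charge warnings.

    With no arguments every battery is blacklisted ('all'); otherwise the ID
    is the controller number and the battery number joined by a colon.
    """
    if controller is None and battery is None:
        return "bat_charge=all"
    if controller is None or battery is None:
        raise ValueError("give both controller and battery, or neither")
    return f"bat_charge={int(controller)}:{int(battery)}"


print(battery_blacklist_arg(0, 0))
print(battery_blacklist_arg())
```

The first call yields the value for battery 0 on controller 0, the second the
catch-all form.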

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Why is check_openmanage so slow on PowerEdge R510?

2011-04-14 Thread Trond Hasle Amundsen
C. Bensend be...@bennyvision.com writes:

 Is there anything in OMSA that tells how *long* a battery has
 been charging?  I simply got so tired of the charging warnings
 that I blacklisted the bat_charge totally, but I'd still like to
 detect that type of error - where the battery never finishes
 charging.

 If OMSA has it, it would be great to have the option within
 check_openmanage to specify a length of time threshold for battery
 charging.  :)

Hi Benny,

Unfortunately OMSA has no info on when the charge cycle is expected to
be finished, or how long it has been in its current learn/charge state:

  # omreport storage battery controller=1
  Battery 0 on Controller PERC 6/E Adapter (Slot 1)
  
  Controller PERC 6/E Adapter (Slot 1)
  ID: 0
  Status: Non-Critical
  Name  : Battery 0
  State : Charging
  Recharge Count: Not Applicable
  Max Recharge Count: Not Applicable
  Predicted Capacity Status : Ready
  Learn State   : Requested
  Next Learn Time   : 0 hours
  Maximum Learn Delay   : 7 days 0 hours
  Learn Mode: Auto

I could make the plugin record it, but then I would violate my principle
that the plugin should be stateless... Introducing state in the plugin
complicates things.
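
For anyone who wants the time threshold anyway, the state could live outside
the plugin. The following Python sketch is only an illustration under stated
assumptions: `charging_seconds`, `charge_alert`, the `store` dictionary and
the three-day default are all hypothetical, and nothing here is part of
check_openmanage or OMSA. The caller would have to persist `store` between
runs (for example in a small file) and feed in the battery state read from
omreport.

```python
import time


def charging_seconds(state, store, key, now=None):
    """Return how long `key` has been seen in the 'Charging' state, in seconds.

    `store` is any dict-like object that the caller persists between runs.
    """
    now = time.time() if now is None else now
    if state != "Charging":
        store.pop(key, None)              # charge finished: forget the start
        return 0.0
    first_seen = store.setdefault(key, now)   # remember the first sighting
    return now - first_seen


def charge_alert(state, store, key, max_days=3, now=None):
    """True if the battery has been charging for longer than `max_days`."""
    return charging_seconds(state, store, key, now) > max_days * 86400


store = {}
print(charge_alert("Charging", store, "0:0", now=0))
print(charge_alert("Charging", store, "0:0", now=4 * 86400))
print(charge_alert("Ready", store, "0:0", now=5 * 86400))
```

Keeping the bookkeeping in a wrapper like this leaves the plugin itself
stateless, at the cost of one more moving part to maintain.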

There is another reason that you would want to know that the battery is
charging, and I suspect that this is also why Dell has OMSA report it as
a non-critical (warning) status. During (some of) the charge cycle,
write-back for vdisks (i.e. use of the cache) is disabled. This means
that the RAID performance is degraded, and depending on the nature of
your disk usage you'll want to know about this when it happens. OMSA
also lets you delay the charge cycle for up to seven days.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Why is check_openmanage so slow on PowerEdge R510?

2011-04-13 Thread Trond Hasle Amundsen
Helmut Wollmersdorfer helmut.wollmersdor...@fixpunkt.de writes:

 New to this architecture, I installed the monitoring plugin
 check-openmanage and was surprised by the performance:

 root@xen10:~# time perl /usr/lib/nagios/plugins/check_openmanage -d | head -n 3
 sh: /bin/rpm: not found
 System:      PowerEdge R510 II    OMSA version:    6.5.0
 ServiceTag:  1Z7215J              Plugin version:  3.6.5
 BIOS/date:   1.6.3 02/01/2011     Checking mode:   local

 real  0m3.426s
 user  0m2.456s
 sys   0m0.544s

 OS: Debian
 root@xen10:~# uname -a
 Linux xen10 2.6.32-5-xen-amd64 #1 SMP Tue Mar 8 00:01:30 UTC 2011 x86_64 GNU/Linux

 Most calls of check_openmanage (from the shell) take 3 - 4 seconds,
 some with '--only' are faster, but not as fast as omreport:

 root@xen10:~# time perl /usr/lib/nagios/plugins/check_openmanage --only fans
 FANS OK - 5 fan probes checked

 real  0m0.716s


 root@xen10:~# time /opt/dell/srvadmin/bin/omreport chassis fans
 Fan Probes Information

 Fan Redundancy
 Redundancy Status : Full
 [...]

 real  0m0.037s

 In comparison called with the option --help (does nearly nothing) the  
 execution time is as expected for loading the perl interpreter and  
 compiling the source:

 root@xen10:~# time perl /usr/lib/nagios/plugins/check_openmanage  -h
 [...]
 real  0m0.064s

 What can be the reason?

Hi Helmut,

The simple answer is that omreport commands take time. They represent
the vast majority of the plugin execution time.

The reason that 'check_openmanage --only fans' takes significantly more
time than the corresponding omreport command is that the plugin first
runs 'omreport -?' to determine if this is a blade or not. If you add
the time it takes to run 'omreport -?', the omreport fans command and
perl interpreter time you should arrive at about the time it takes
'check_openmanage --only fans' to finish.
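
As a rough worked example of that breakdown, the Python sketch below subtracts
the two parts Helmut measured separately from the total. The helper name
`unaccounted_time` is hypothetical, and the remainder is only an estimate: it
covers 'omreport -?' plus whatever else the plugin does, and nothing in the
thread measures those pieces individually.

```python
def unaccounted_time(total, *known_parts):
    """Seconds of `total` not explained by the separately measured parts."""
    return total - sum(known_parts)


# Wall-clock figures from Helmut's message, in seconds.
only_fans = 0.716      # check_openmanage --only fans
omreport_fans = 0.037  # omreport chassis fans
perl_startup = 0.064   # check_openmanage -h (interpreter start and compile)

rest = unaccounted_time(only_fans, omreport_fans, perl_startup)
print(f"{rest:.3f} s left for 'omreport -?' and other work")
```

With these numbers roughly 0.6 seconds remain, which is consistent with the
blade detection call dominating the '--only fans' run.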

Note that storage takes time to check, since the omreport commands for
storage are slow. This is especially true if you have a lot of storage
(e.g. an R510).

Also note that if you use the '-d' option, check_openmanage will run
'omreport about' to determine the OMSA version. This is a slow command
and adds to the overall execution time.

The plugin is much faster if used in SNMP mode, especially if you have lots
of storage. Example from a 2950 with a couple of MD1000 shelves of extra
storage:

  $ time ./check_openmanage -H foo
  OK - System: 'PowerEdge 2950 III', SN: 'XXX', 16 GB ram (8 dimms), 3
  logical drives, 32 physical drives
  
  real    0m1.725s
  user    0m0.397s
  sys     0m0.013s
  
  foo /# time /usr/lib64/nagios/plugins/check_openmanage
  OK - System: 'PowerEdge 2950 III', SN: 'XXX', 16 GB ram (8 dimms), 3
  logical drives, 32 physical drives
  
  real    0m4.188s
  user    0m2.997s
  sys     0m0.821s

As you can see the footprint is significantly smaller with SNMP, so if
this is a concern then SNMP should be your weapon of choice :)

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage internal error

2011-03-09 Thread Trond Hasle Amundsen
Adam Caines acai...@lab.icc.edu writes:

 Looks like it's reporting path health.  The 6/e has both SAS ports
 connected to redundant controllers in the MD1120.  Strangely, on
 another server I also have a PERC H700 connected to an MD1220 with
 redundant links, and it does not output the path health section.

[snip]

 ID             : 0
 Status         : Ok
 Name           : Logical Connector
 State          : Ready
 Connector Type : SAS Port RAID Mode
 Termination    : Not Applicable
 SCSI Rate      : Not Applicable

 Path Health
 Status : Ok
 Name   : Connector 0
 State  : Available

 Status : Ok
 Name   : Connector 1
 State  : Available

Yes, so this is the culprit... check_openmanage did not expect this
output. It looks like the controller is connected to the enclosure in
redundant path mode, according to the OMSA documentation[1]. I really
need to see how this looks in SSV format. Can you provide the output
from this command:

  omreport storage connector controller=1 -fmt ssv

In case of redundant path mode, the plugin should check the path health
and report on it, in addition to the connector health. This
functionality must be added to the plugin.

Is it possible for you to check how check_openmanage handles this when
checking via SNMP as well?

[1] http://support.euro.dell.com/support/edocs/software/svradmin/6.4/en/CLI/HTML/reportst.htm#wp1077100

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] check_openmanage internal error

2011-03-08 Thread Trond Hasle Amundsen
Adam Caines acai...@lab.icc.edu writes:

 Having a strange problem with check_openmanage.  Use it without error on many
 other systems.  Any help would be appreciated.

 check_openmanage version: 3.6.5 (.exe version)
 Dell OMSA version: 6.4.0
 OS: Windows Server 2008 R2
 Hardware: Poweredge 1950 with PERC 6/i and PERC 6/e connected to MD1120

 

 check_openmanage output:

 OK - System: 'PowerEdge 1950 III', SN: 'XXX', 8 GB ram (4 dimms), 2 logical drives, 28 physical drives
 INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
 INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
 INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
 INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
 INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
 INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
 INTERNAL ERROR: Use of uninitialized value $level in numeric eq (==) at script/check_openmanage line 4637.
 INTERNAL ERROR: Use of uninitialized value $level in numeric eq (==) at script/check_openmanage line 4637.
 INTERNAL ERROR: Use of uninitialized value $level in numeric eq (==) at

 

 If I run check_openmanage --no-storage the errors are not present:

Hi Adam,

Interesting. This is the status of the device (as reported by omreport)
that is garbled somehow. The plugin will set the status to 'Unknown' if
the field is missing or empty, so this means that omreport is reporting
the status as something new that check_openmanage doesn't recognize.
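
The fallback idea can be sketched in a few lines of Python. This is not the
plugin's code (check_openmanage is written in Perl); the table
`STATUS_LEVEL`, the function `status_to_level` and the exact set of status
strings are assumptions made for illustration. The point is only that a
lookup with an explicit default never leaves the level undefined, whatever
string omreport returns.

```python
# Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed mapping from omreport status strings to Nagios levels.
STATUS_LEVEL = {
    "Ok": OK,
    "Non-Critical": WARNING,
    "Critical": CRITICAL,
    "Unknown": UNKNOWN,
}


def status_to_level(status):
    """Map a status string to a Nagios level, never returning None."""
    if status is None or not status.strip():
        status = "Unknown"                # missing or empty field
    # Any string that is not in the table also falls back to UNKNOWN.
    return STATUS_LEVEL.get(status.strip(), UNKNOWN)


print(status_to_level("Ok"))
print(status_to_level(""))
print(status_to_level("Something New"))
```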

That you're getting so many of them (and you have established that it's
a storage issue), makes me think that it is related to physical disks.

We need to see what omreport says about storage, in particular the disk
drives. Can you send the output from

  omreport storage pdisk controller=X

where 'X' is the controller number (0 or 1), for each of the controllers.
If the Status field is 'Ok' for all the disks, we need to look further.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] check_openmanage internal error

2011-03-08 Thread Trond Hasle Amundsen
Adam Caines acai...@lab.icc.edu writes:

 Looks like some strange output on the lines for controller 1?  The
 formatting is breaking there.  I checked omreport storage controller
 and didn't see anything that stood out as being strange.

[snip]

       OK |      0:0 | Connector 0 [SAS Port RAID Mode] on controller 0 is Ready
       OK |      0:1 | Connector 1 [SAS Port RAID Mode] on controller 0 is Ready
       OK |      1:0 | Logical Connector  [SAS Port RAID Mode] on controller 1 is Ready
          | 1:Status | State [Name] on controller 1 is Status
          |     1:Ok | Available  [Unknown type] on controller 1 is Unknown state
          |     1:Ok | Available  [Unknown type] on controller 1 is Unknown state

Ok, something strange going on here. This seems to be a parsing error in
the plugin, related to the connectors. As I don't have any MD1120
enclosures, I'm curious if these errors are related to the MD1120 being
different somehow.

Can you send the output from these commands:

  omreport storage connector controller=0
  omreport storage connector controller=1

and also:

  omreport storage connector controller=0 -fmt ssv
  omreport storage connector controller=1 -fmt ssv

The latter is what the plugin is using as it is easier to parse.
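
To illustrate why the semicolon-separated form is easier to handle, here is a
minimal Python sketch of a parser that treats the first semicolon line of
each block as that block's header. Everything in it is an assumption: the
function `parse_ssv` is hypothetical, the plugin itself is written in Perl,
and the sample text is made up from the plain-text connector report (with its
"Path Health" section) quoted elsewhere in this thread, since the real SSV
output for that controller has not been posted.

```python
def parse_ssv(text):
    """Parse semicolon-separated report text into a list of dicts.

    A new block starts after any line without semicolons (blank line or
    title), so each block may carry its own column names.
    """
    rows, header = [], None
    for line in text.splitlines():
        line = line.strip()
        if ";" not in line:
            header = None                # next semicolon line is a header
            continue
        fields = [field.strip() for field in line.split(";")]
        if header is None:
            header = fields
        else:
            rows.append(dict(zip(header, fields)))
    return rows


# Hypothetical sample; the real layout may differ.
SAMPLE = """\
ID;Status;Name;State;Connector Type;Termination;SCSI Rate
0;Ok;Logical Connector;Ready;SAS Port RAID Mode;Not Applicable;Not Applicable

Path Health
Status;Name;State
Ok;Connector 0;Available
Ok;Connector 1;Available
"""

for row in parse_ssv(SAMPLE):
    print(row.get("Name"), "->", row.get("State"))
```

A parser that assumes one fixed set of columns would instead read the second
header as data, which would look much like the "1:Status | State [Name]" line
in the debug output.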

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] Error in performance-data-output

2011-03-07 Thread Trond Hasle Amundsen
Lichterfeld, Dirk dirk.lichterf...@enercity.de writes:

 I compared the response times of the Nagios check and I see that the Dell
 R710 server needs over 10 seconds to answer. Another server (Dell R310)
 answers in 8 seconds (the check of this server is OK).

 The response time depends on the Dell hardware.

Yes, this is expected when using the win32 binary file. It contains a
perl interpreter and is slow to start up and execute. When monitoring
Windows machines, SNMP is preferable unless your security policies
prohibit this.

 What did I do? I extended the check command for check_openmanage from
 check_nrpe -H $HOSTADDRESS$ -c Check_Openmanage by adding the parameter
 -t 30 to allow more time for this check.

30 seconds is the default timeout for check_openmanage. I would set the
check_nrpe timeout to slightly more than the check_openmanage timeout. If you do
that, you'll get a meaningful error message from check_openmanage
instead of a cryptic one from NSClient++, if check_openmanage times out
for some reason.
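
The layering of the two timeouts can be sketched as follows. This Python
helper is hypothetical (`nrpe_check_argv` and the five second margin are
assumptions chosen for illustration); it only shows the idea that the outer
check_nrpe timeout should be the plugin timeout plus a small margin, so that
the plugin gets to report its own timeout first.

```python
def nrpe_check_argv(host, command, plugin_timeout=30, margin=5):
    """Build a check_nrpe command line whose timeout exceeds the plugin's."""
    if margin <= 0:
        raise ValueError("margin must be positive so the plugin times out first")
    return [
        "check_nrpe",
        "-H", host,
        "-c", command,
        "-t", str(plugin_timeout + margin),
    ]


print(" ".join(nrpe_check_argv("$HOSTADDRESS$", "Check_Openmanage")))
```

With the default plugin timeout of 30 seconds this yields '-t 35' for
check_nrpe.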

Anyway, the '-t 30' parameter to check_nrpe should work...

 Is there another way to set the timeout?

I'm not familiar with NSClient++, perhaps it has its own timeout?

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Error in performance-data-output

2011-03-02 Thread Trond Hasle Amundsen
Lichterfeld, Dirk dirk.lichterf...@enercity.de writes:

 Hi Trond,

 I'm sorry, at my company we use Outlook, where the highlighted text is
 distinct and visible.

 I will try to specify the problem I mean.

 If I run NSClient++ in test mode I get the following output:

   d NSClient++.cpp(1106) Injecting: Check_OpenManage:
   d NSClient++.cpp(1142) Injected Result: OK 'OK - System: 'PowerEdge R710 II', SN: 'XXX', 4 GB ram (2 dimms), 1 logical drives, 4 physical drives'
   d NSClient++.cpp(1143) Injected Performance Result: 'fan_0_system_board_fan_1_rpm=3600;0;0 fan_1_system_board_fan_2_rpm=3600;0;0 fan_2_system_board_fan_3_rpm=3600;0;0 fan_3_system_board_fan_4_rpm=3600;0;0 fan_4_system_board_fan_5_rpm=3600;0;0 pwr_mon_0_ps_1_current=0.4;0;0 pwr_mon_1_ps_2_current=0.4;0;0 pwr_mon_2_system_board_system_level=175;917;966 temp_0_system_board_ambient=20;42;47
   '

 As you can see, the injected performance result begins and ends with a '.

Yes, but I think that NSClient++ is responsible for that, putting
everything inside single quotes. As you can see it does that for the
plugin output as well.

 1. I mean that every label, and only the label, must be enclosed in
 single quotes (')
   our output:  fan_2_system_board_fan_3_rpm
   must be:     'fan_2_system_board_fan_3_rpm'
 2. No special character is allowed at the end.

 You can read this in chapter 2.6 Performance data at 
 http://nagiosplug.sourceforge.net/developer-guidelines.html

 I hope I could describe the problem well enough.

Yes, thank you, this was much clearer :) However, the quotes are not
needed according to the guidelines for performance data[1]:

  3. the single quotes for the label are optional. Required if spaces, =
 or ' are in the label

The perfdata labels don't contain any of the offending characters.
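
The quoting rule cited above can be expressed as a short Python sketch. The
function names `perfdata_label` and `perfdata_item` are hypothetical, and the
doubling of an embedded single quote is my reading of the same guidelines
rather than something stated in this thread, so treat that part as an
assumption.

```python
def perfdata_label(label):
    """Return `label`, quoted only when it contains a space, '=' or a quote."""
    if any(ch in label for ch in " ='"):
        # Assumption: an embedded single quote is written as two quotes.
        return "'" + label.replace("'", "''") + "'"
    return label


def perfdata_item(label, value, warn="", crit=""):
    """Format one label=value;warn;crit performance data item."""
    return f"{perfdata_label(label)}={value};{warn};{crit}"


print(perfdata_item("fan_2_system_board_fan_3_rpm", 3600, 0, 0))
print(perfdata_item("System Board Fan 3", 3600, 0, 0))
```

The first label consists only of letters, digits and underscores and stays
unquoted; the second contains spaces and is therefore wrapped in single
quotes.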

Could it be that this is a Windows issue, or perhaps NSClient++?

Any NSClient++ users here who can confirm if this is the case? I'm
thinking that perhaps the underscore character '_' is throwing off
Windows or NSClient++.

[1] http://nagiosplug.sourceforge.net/developer-guidelines.html#AEN201

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo

--
Free Software Download: Index, Search  Analyze Logs and other IT data in 
Real-Time with Splunk. Collect, index and harness all the fast moving IT data 
generated by your applications, servers and devices whether physical, virtual
or in the cloud. Deliver compliance at lower cost and gain new business 
insights. http://p.sf.net/sfu/splunk-dev2dev 
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] Check_openmanage-- Current probes not found

2011-02-24 Thread Trond Hasle Amundsen
Joe Beck jb...@urbn.com writes:

 Yes, just after sending this post I did the things you identified.
 Verified model vs others where this issue was not happening.
 We have several R610s & this is the only one with the issue.
 Then I went & looked at the OMSA version & found this one was running 5.9
 where the others had 6.4.
 I removed & installed 6.4 but same result.
 I also had some question/confusion about the best way to identify the version;
 in fact it may have already been running 6.4.

 I'm grep'ing for version; tried running cmds with -v & --version, etc. but no
 luck in seeing which version via the cmds

This command will tell you which version of OMSA you're running:

  omreport about

There are other ways as well:

  http://folk.uio.no/trondham/software/check_openmanage.html#how-can-i-find-out-which-version-of-omsa-my-server-is-running

I'm not sure if you understood my question about the servers being
identical. I didn't mean the model (I assumed the model would be the
same), but hardware-wise. Specifically, are they alike with respect to
number of power supplies?

In any case, the next step will be to examine the installed OMSA
software components. On RHEL and derivatives such as CentOS, you can do
this by comparing the output from 'rpm -qa|grep srvadmin' from healthy
boxes versus the failing one. Also check that the running OMSA services
are the same.
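
If the package lists are long, the comparison can be scripted. The Python
sketch below is a minimal illustration under stated assumptions: the helper
`package_diff` is hypothetical, and the sample package names and versions are
made up; in practice the two strings would be the saved output of
'rpm -qa|grep srvadmin' from a healthy box and from the failing one.

```python
def package_diff(healthy, failing):
    """Compare two package listings that hold one package name per line."""
    good = set(healthy.split())
    bad = set(failing.split())
    return {
        "missing_on_failing": sorted(good - bad),
        "extra_on_failing": sorted(bad - good),
    }


# Hypothetical listings for illustration only.
healthy = "srvadmin-omacore-6.4.0-1\nsrvadmin-storage-6.4.0-1\n"
failing = "srvadmin-omacore-6.4.0-1\n"

print(package_diff(healthy, failing))
```

An empty result in both lists would suggest that the installed components
match and that the difference lies elsewhere.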

Since this is happening on only one server, and you have probably
installed OMSA in exactly the same way on all the servers, you may have
a real hardware problem. If all else fails, you should contact Dell
support and have them look at it.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Check_openmanage-- Current probes not found

2011-02-23 Thread Trond Hasle Amundsen
Joe Beck jb...@urbn.com writes:

 I have a couple of R610s.
 Some run omreport chassis pwrmonitoring & return output.
 I also have one that returns:
 # omreport chassis pwrmonitoring
 Power Consumption Information

 Error : Current probes not found

 Does this mean that this module just isn’t installed or ???

 At this point, do I just alter the nagios service to exclude pwrmonitoring?

Hi Joe,

I think the next step should be to investigate why OMSA behaves like
this. I've seen this error before, but on older servers with old OMSA
versions (5.4.0). A simple restart of OMSA (srvadmin-services.sh
restart) may be the solution and should be attempted first. The next
step would be to reinstall OMSA and verify that everything gets
installed.

Usually, if power monitoring information is not available, OMSA should
say something else and more informative.

Is the problematic machine identical to the ones that work?

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] check_openmanage SNMP Error

2011-02-17 Thread Trond Hasle Amundsen
Shawn Green sgr...@dotomi.com writes:

 I'm in the process of rolling out check_openmanage to monitor a variety of
 hardware including R510s, M600s, and M610s.  I'm running into an interesting
 issue where the alert is reporting back:

 SNMP ERROR [cooling]: The requested entries are empty or do not exist.

 I understand this is an SNMP error (not check_openmanage), but what's baffling
 me is how to work around it.  My Net::SNMP module is up to date (v6.0.1) as are
 net-snmp packages on all hosts.

 A good majority of hosts that are getting this error are M600/M610 blades, yet
 other blades in the same chassis do not get this error.  I'm also seeing these
 on several R510s, yet other R510s have no problems.

 All hosts are Centos 5.5 64 bit with OMSA 6.2.0.

Hi Shawn,

One thing that is really peculiar is that you're getting this error from
blade servers. The plugin should identify blades and ignore the fact
that they don't have cooling devices (i.e. fans). You should never get
this error from blades. Are you really sure that the errors from your
blades are about cooling and not something else?

(If so, we'll need to investigate why the plugin doesn't identify the
blade servers correctly).

Your Net::SNMP version is fine and not to blame. The error lies with
OMSA and/or the SNMP service. Try running on the servers:

  omreport chassis fans

On the blades, you should get an error saying that no fan probes were
found, which is normal. But the R510s should display fan info. If they
don't, the problem is not SNMP related but with OMSA itself.

If you haven't already done so, try restarting OMSA (i.e. run
'srvadmin-services.sh restart') on the servers. Reinstalling OMSA (or
better yet: reinstall with version 6.4.0) is the logical next step. Make
sure that there are no errors during installation and that everything
gets installed.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage and linebreaks

2011-02-11 Thread Trond Hasle Amundsen
Bryan O'Shea bryanosh...@gmail.com writes:

 check_openmanage and linebreaks not working in $SERVICEOUTPUT$ emails.

 When using either of the following options the linebreaks seem to be
 broken:
 -e or --postmsg

 This is what I get in my service notification emails instead of the
 desired output of separate lines.

 Power Supply 1 [AC] needs attention: Presence detected, Failure
 detected, AC lost<br/>NOTE: PowerEdge 2950 III 437RQH1 -
 555-1212

 It puts a <br/> in instead of a \n.

Hi Bryan,

The default behaviour of check_openmanage is to use HTML linebreaks when
run from Nagios, NRPE etc., and regular linebreaks in a console which
has a TTY. The reason for this is that the plugin monitors several
things, and in case of multiple alerts it's practical to display them
each on a different line.

However, since this behaviour doesn't fit everyone you can modify it
with the '--linebreak' switch. To switch to regular (\n) linebreaks:

  check_openmanage --linebreak=REG

You can also specify any string as a custom linebreak:

  check_openmanage --linebreak=' -- '

If you choose regular linebreaks, the first line will be put in the
SERVICEOUTPUT macro, while any subsequent lines will be put in the
LONGSERVICEOUTPUT macro. This is how Nagios 3.x handles multiline output
from plugins.
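
The split described above can be pictured with a small Python sketch. It is a
simplification under stated assumptions: `split_plugin_output` is a
hypothetical helper, the sample text is shortened from Bryan's message, and
performance data (anything after a '|') is ignored here, although Nagios
handles it separately.

```python
def split_plugin_output(text):
    """Split plugin output into (first line, remaining lines)."""
    lines = text.rstrip("\n").split("\n")
    return lines[0], "\n".join(lines[1:])


output = (
    "Power Supply 1 [AC] needs attention\n"
    "NOTE: PowerEdge 2950 III\n"
)
short, long_ = split_plugin_output(output)
print("SERVICEOUTPUT:    ", short)
print("LONGSERVICEOUTPUT:", repr(long_))
```

A notification command that only uses $SERVICEOUTPUT$ would therefore show
just the first alert; adding $LONGSERVICEOUTPUT$ to the mail body brings in
the rest.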

PS. In order for the default HTML linebreaks to work as intended in the
web frontend, you should set escape_html_tags=0 in the Nagios config.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: Amperage probe 0 [System Board System Level] reads 0 W

2011-01-27 Thread Trond Hasle Amundsen
Tom Sommer m...@tomsommer.dk writes:

 After upgrading OpenManage to version 6.4.0 on a DELL R410,
 check_openmanage 3.6.4 returns

 CRITICAL: Amperage probe 0 [System Board System Level] reads 0 W

 Is this due to OpenManage changing behavior (bug), or is the hardware
 really faulty? (doubtful) :)

Hi Tom,

Most likely this is some sort of bug in OpenManage, or something went
wrong during upgrade. You should confirm the fault by running

  omreport chassis pwrmonitoring

Investigate the Status field. The only accepted value is Ok.

 I know I could just disable amperage checks, but I'd like not to.

 Anyone else seen this?

Sorry, no. Very often these problems are resolved simply by restarting
OpenManage on the monitored server, or a reboot. The next step is to
re-install OpenManage in case something was missed during
install/upgrade. If all else fails, contact Dell support.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: 'Amperage probe 0 [System Board System Level] reads 0 W'

2011-01-27 Thread Trond Hasle Amundsen
Tom Sommer m...@tomsommer.dk writes:

 After upgrading OpenManage to version 6.4.0 on a DELL R410,
 check_openmanage 3.6.4 returns

 CRITICAL: Amperage probe 0 [System Board System Level] reads 0 W


 Is this due to OpenManage changing behavior (bug), or is the hardware
 really faulty? (doubtful) :)

 Most likely this is some sort of bug in OpenManage, or something went
 wrong during upgrade. You should confirm the fault by running

 omreport chassis pwrmonitoring

 # omreport chassis pwrmonitoring

 Power Consumption Information is not available on this system because all
 the Power Supply units on your system do not support PMBus or the firmware
 on your system does not support power monitoring.

Strange.. if the system doesn't support power monitoring, the plugin
shouldn't complain about it. Are you using check_openmanage via SNMP or
locally?

(I'm guessing SNMP, and if so there are obvious inconsistencies
between what OMSA displays through omreport and what is available via
SNMP.)

Did power monitoring work at all before upgrading OMSA?

 Anyone else seen this?

 Sorry, no. Very often these problems are resolved simply by restarting
 OpenManage on the monitored server, or a reboot. The next step is to
 re-install OpenManage in case something was missed during install/upgrade.
 If all else fails, contact Dell support.

 Tried all but the latter - guess it's a DELL bug.

I forgot one other possible cause: old BIOS and/or firmware. Newer
versions of OMSA often need relatively up-to-date BIOS and firmware
versions to function normally. You should upgrade all BIOS and firmware
on the server.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage showing 0 logical drives with OMSA 6.4 and PERC4

2011-01-25 Thread Trond Hasle Amundsen
Steve Jenkins stevejenk...@gmail.com writes:

 After upgrading three of the 1850s to Dell OMSA 6.4 today, I noticed
 something strange. The three of them now display in Nagios:

 OK - System: 'PowerEdge 1850', SN: '', 3 GB ram (6 dimms), 0
 logical drives, 2 physical drives

 OK - System: 'PowerEdge 1850', SN: 'XXX', 12 GB ram (6 dimms), 0
 logical drives, 2 physical drives

 OK - System: 'PowerEdge 1850', SN: 'XXX', 4 GB ram (6 dimms), 0
 logical drives, 2 physical drives

 All three display 0 logical drives, even though they all have a
 working RAID array.

[snip]

 The strange part is that OMSA 6.4 on the 1850s is clearly aware that
 there's a logical drive, because the GUI shows Virtual Disk 0 RAID-1
 in the Storage Dashboard.

Hi Steve,

Interesting.. OMSA is obviously aware of the logical drives, but what
does omreport actually say about them? Try running 'omreport storage
vdisk controller=number'.

You seem to be running check_openmanage in local mode, so the output
from omreport is what matters.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage showing 0 logical drives with OMSA 6.4 and PERC4

2011-01-25 Thread Trond Hasle Amundsen
Steve Jenkins stevejenk...@gmail.com writes:

 On Tue, Jan 25, 2011 at 3:41 AM, Trond Hasle Amundsen
 t.h.amund...@usit.uio.no wrote:
 Interesting.. OMSA is obviously aware of the logical drives, but what
 does omreport actually say about them? Try running 'omreport storage
 vdisk controller=number'.

 Looks like omreport sees the controller, but not the VDisk:

 # omreport storage vdisk controller=0
 No virtual disks found

Ok, so there is the reason that check_openmanage doesn't display any
virtual disks. It relies on OMSA for the information, specifically
omreport when used in local mode.

Based on the issue at hand and your reports about OMSA 6.4 and PERC4
controllers on the linux poweredge list, it seems that the latest OMSA
has serious issues with 8th gen Dell servers.

PS. You may have noticed that the plugin doesn't issue an alert when
virtual disks are missing. The reason for this is that it's perfectly
legal and plausible for systems to have no virtual disks. This is the
downside of a plugin that both discovers the components and monitors
them at the same time. It can't give alerts on missing components unless
they should always be present in all servers. A notable exception is
controllers, since being unable to display controllers is a common OMSA
problem. check_openmanage will complain about missing controllers even
though controller-less systems are possible.
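The trade-off described above can be sketched roughly like this. It is only an illustration of the idea, not the plugin's code: the function name, the component type names and the message text are all assumptions.

```shell
#!/bin/sh
# Sketch only. missing_component_alert is a made-up helper, and the
# message text is invented for illustration.

# $1 = component type, $2 = number of components that were discovered
missing_component_alert() {
    if [ "$2" -gt 0 ]; then
        return 0    # something was found, nothing to report here
    fi
    case "$1" in
        controller)
            # Treated as mandatory: zero controllers usually points to
            # an OMSA problem, so it is worth an alert.
            echo "No controllers found"
            ;;
        *)
            # Optional types such as vdisk: zero is a legal setup, so a
            # self-discovering check cannot tell "missing" from "none".
            :
            ;;
    esac
}

missing_component_alert controller 0
missing_component_alert vdisk 0
```

Only the first call prints anything. A check that was told in advance how many virtual disks to expect could alert in the second case too, but that would mean per-host configuration.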

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Problem with check_openmanage

2011-01-24 Thread Trond Hasle Amundsen
Jeffrey Watts jeffrey.w.wa...@gmail.com writes:

 Hello, I'm using Mr. Amundsen's excellent check_openmanage plugin, and I'm
 getting an odd error:

 $ check_openmanage -H myserver -C public
 Power Supply 0 [AC] needs attention: Presence detected, Failure detected, AC
 lost
 Voltage sensor 14 [PS 2 Voltage 2] is 
 INTERNAL ERROR: Use of uninitialized value $reading in sprintf at /usr/lib/
 nagios/plugins/check_openmanage line 3565.

 Has anyone else seen this error?  I'm running version 3.6.4.  Please let me
 know what additional information is needed.

Hi Jeffrey,

This shouldn't happen, and I think I see where the problem is. Please
try the version available here, and let me know if it performs any
better:

  http://folk.uio.no/trondham/software/test/

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] Problem with check_openmanage

2011-01-24 Thread Trond Hasle Amundsen
Jeffrey Watts jeffrey.w.wa...@gmail.com writes:

 Thanks Trond!  That seems to have fixed it.  Here's what I see now:

 ./check_openmanage -H pkc-search28 -C tomgeco
 Power Supply 0 [AC] needs attention: Presence detected, Failure detected, AC 
 lost
 Voltage sensor 14 [PS 2 Voltage 2] is Unknown reading

 It comes up correctly now as a CRIT, too.

Good, thanks for reporting back. I'll include this fix in the next
release. The problem was that where the reading is not available, the
plugin assumes that the reading is discrete (i.e. not a number but
good, bad etc.). This assumption is wrong in cases where the reading
is NOT discrete and simply not available via SNMP. The fixed version
will set the reading to Unknown reading when the reading can't be
obtained.
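The idea behind the fix can be sketched as follows. This is not the actual patch (the plugin is written in Perl); format_reading is a made-up helper, and the 12.1 V value is an invented example.

```shell
#!/bin/sh
# Sketch only, not the real fix. format_reading is a made-up helper.

# $1 = raw reading as obtained via SNMP (may be empty), $2 = unit
format_reading() {
    if [ -z "$1" ]; then
        # Nothing came back: say so instead of formatting an empty value.
        echo "Unknown reading"
    else
        echo "$1 $2"
    fi
}

echo "Voltage sensor 14 [PS 2 Voltage 2] is $(format_reading "" V)"
echo "Voltage sensor 14 [PS 2 Voltage 2] is $(format_reading "12.1" V)"
```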

(However, this situation shouldn't occur at all if OMSA is behaving as
it should. Pulling the cable on one power supply would normally lead to
a reading of 0 volts for that voltage probe.)

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] Check_OpenManage error

2011-01-14 Thread Trond Hasle Amundsen
Jeffrey C. Veatch jeffrey.vea...@knowyourrights.com writes:

 To whom it may concern:
  
 I have been trying to use check_openmanage in my Nagios configuration, but no
 matter what I do I get a list of Internal Errors at the end of the returned
 test.  The only way I can avoid it is by using the debug mode and only
 returning the first 80 lines.  This however does not warn me of any issues the
 server is having.
  
 Here are some details.  The server running OMSA is an R710 running VMware ESX
 4.0.0 Update 2.  OMSA version is 6.4.
 The nagios server is in a virtual machine running OpenSUSE 11.3.  The Nagios
 version is 3.2.3.
  
 If there are other packages that you need to know the version, let me know. 
 The following is an example of the results that I get.  Oh, and in nagios this
 ends up being an unknown state for the check.
  
 VLinux:/usr/local/nagios/libexec # ./check_openmanage -H 192.168.10.21
 OK - System: 'PowerEdge R710', SN: '5QTMZK1', 72 GB ram (18 dimms), 1 logical
 drives, 2 physical drives
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 588.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 655.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 708.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 764.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 869.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 952.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1028.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1103.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1168.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1325.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1531.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1549.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1563.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1577.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1591.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1613.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1633.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1653.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1674.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1702.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1737.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1846.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1968.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1973.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1978.
 INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/
 5.12.1/Net/SNMP.pm line 1983.
 Thanks for any help you can give me.

Hi Jeffrey,

Interesting error, never seen this one before :)

check_openmanage will print any perl warnings that occur during
execution as internal errors. This is done to avoid situations where the
plugin stops working due to perl incompatibilities etc. without your
knowledge, as Nagios completely ignores any plugin output to STDERR.
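The same idea can be shown with a small shell wrapper. This is only a sketch of the principle, not how the plugin does it internally: surface_stderr and noisy_check are made-up names, and the warning text is shortened from the report above.

```shell
#!/bin/sh
# Sketch only. surface_stderr and noisy_check are made-up names.

# Run a command; pass its STDOUT through untouched and re-emit every
# STDERR line on STDOUT with an "INTERNAL ERROR: " prefix, so that the
# message is not lost when STDERR is ignored.
surface_stderr() {
    { "$@" 2>&1 1>&3 | sed 's/^/INTERNAL ERROR: /'; } 3>&1
}

# Stand-in for a check that prints a result and also triggers a warning.
noisy_check() {
    echo "OK - everything looks fine"
    echo "Use of :locked is deprecated" >&2
}

surface_stderr noisy_check
```

The redirections are applied left to right: STDERR is first pointed at the pipe, then STDOUT is pointed back at the original output via file descriptor 3.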

Which version of Net::SNMP are you using? Try 'rpm -q perl-Net-SNMP' to
find out. Perl 5.12 deprecated the locked attribute, and this was
fixed in Net::SNMP version 6.0.1, i.e. the latest release. The changelog
for Net::SNMP 6.0.1 has the following:

  - Removed all occurrences of the locked attribute that was
deprecated in Perl 5.12.0.

I believe this to be a problem with your distribution using an
old/incompatible version of Net::SNMP. It seems that for perl 5.12.x you
need Net::SNMP 6.0.1 (or any later version).
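If you want to script the comparison, something along these lines should do. It is a rough sketch: version_at_least is a made-up helper that only understands purely numeric dotted versions, and the 5.2.0 value is just an example input, not necessarily what your system has.

```shell
#!/bin/sh
# Sketch only. version_at_least is a made-up helper: it succeeds when
# dotted version $1 is greater than or equal to dotted version $2.
# Only purely numeric components (such as 6.0.1) are handled.
version_at_least() {
    awk -v have="$1" -v need="$2" 'BEGIN {
        n = split(have, h, ".")
        m = split(need, r, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            # Missing components count as 0, so 6.0 equals 6.0.0.
            a = (i <= n) ? h[i] + 0 : 0
            b = (i <= m) ? r[i] + 0 : 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}

installed="5.2.0"    # example value only
if version_at_least "$installed" "6.0.1"; then
    echo "Net::SNMP $installed should be fine with perl 5.12"
else
    echo "Net::SNMP $installed is older than 6.0.1"
fi
```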

PS. I found this in the OpenSUSE bugzilla:

  https://bugzilla.novell.com/show_bug.cgi?id=629698

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] check_openmanage plugin reporting Firmware out of date

2010-12-13 Thread Trond Hasle Amundsen
Trond Hasle Amundsen t.h.amund...@usit.uio.no writes:

 Surangiwala, Asif  asif.surangiw...@firstdata.com writes:

 Can we update the check_openmanage script to parse the Minimum
 Required Firmware Version and compare it with the current Firmware
 Version to overcome the OMSA bug?

 It is entirely possible to mitigate this bug within the plugin, but I
 don't think that it's a good idea to let the plugin do all version
 parsings and ignore OMSA on a general basis. I have created a version
 that works around this particular bug (version 3.6.2-p1) and made it
 available here:

   http://folk.uio.no/trondham/software/omsa-fw-bug/

 It simply ignores out-of-date firmware if the firmware and minimum
 firmware versions match those in question. But in order for this to
 work, I also had to turn off checking the global health status, which
 inherits the non-critical status of the controller.

 DISCLAIMER: This version is only intended as a temporary solution for
 users of OMSA 6.3.0 who struggle with the recent firmware bug and
 don't want to use blacklisting as a workaround. When OMSA 6.4.0 becomes
 available, you should upgrade OMSA and revert to a regular release of
 check_openmanage.

Hi Asif,

Dell has released OMSA 6.4.0, which fixes the firmware version parsing
issue. I have also released a new version of check_openmanage that
contains a few compatibility fixes for OMSA 6.4.0:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage plugin reporting Firmware out of date

2010-12-01 Thread Trond Hasle Amundsen
Surangiwala, Asif  asif.surangiw...@firstdata.com writes:

 I have Dell Open Manage Server Administrator 6.3.0 installed on some Dell
 R710’s with PERC H700 controller. When I run the Nagios plugin
 check_openmanage, it reports the following:

 Controller 0 [PERC H700 Integrated]: Firmware '12.10.0-0025' is out of date

 The H700 is running the latest firmware 12.10.0-0025, check_openmanage plugin
 is v3.6.2 by Trond H. Amundsen. OMSA is running fine and is not complaining
 about any firmware issues.

 The same ‘Firmware out of date’ warning is also given for H800 controllers on
 the R710’s having it.

 Is there an issue with the plugin’s interaction with OMSA?

Hi Asif,

This is a bug in OMSA, not check_openmanage. OMSA is reporting that the
firmware is too old while clearly it is not. Dell has stated that the
bug will be fixed in the next version of OMSA. For more information, see
the following thread on the Linux-Poweredge mailing list:

  http://lists.us.dell.com/pipermail/linux-poweredge/2010-December/043713.html

As a workaround, I suggest using blacklisting to suppress the false
warnings until OMSA 6.4.0 is released and deployed on your systems:

  check_openmanage -b ctrl_fw=all [..other options..]

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] check_openmanage plugin reporting Firmware out of date

2010-12-01 Thread Trond Hasle Amundsen
Surangiwala, Asif  asif.surangiw...@firstdata.com writes:

 Can we update the check_openmanage script to parse the Minimum
 Required Firmware Version and compare it with the current Firmware
 Version to overcome the OMSA bug?

It is entirely possible to mitigate this bug within the plugin, but I
don't think that it's a good idea to let the plugin do all version
parsings and ignore OMSA on a general basis. I have created a version
that works around this particular bug (version 3.6.2-p1) and made it
available here:

  http://folk.uio.no/trondham/software/omsa-fw-bug/

It simply ignores out-of-date firmware if the firmware and minimum
firmware versions match those in question. But in order for this to
work, I also had to turn off checking the global health status, which
inherits the non-critical status of the controller.

DISCLAIMER: This version is only intended as a temporary solution for
users of OMSA 6.3.0 who struggle with the recent firmware bug and
don't want to use blacklisting as a workaround. When OMSA 6.4.0 becomes
available, you should upgrade OMSA and revert to a regular release of
check_openmanage.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Check_OpenManage INTERNAL ERROR

2010-10-25 Thread Trond Hasle Amundsen
Benny Somali benny.som...@firstnational.ca writes:

 Works fine now.

Good, thanks for testing.

 By the way, the Status Information field is blank, is it related to
 the max length of 1023 chars?

Probably not. You shouldn't run into problems with the silly NRPE limit
except on large servers with lots of performance data, and even then
only the perfdata should be affected.
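To see why only the perfdata suffers, here is a small sketch that cuts a long single-line output at 1023 characters, the limit mentioned above. nrpe_truncate is a made-up helper that merely imitates the effect, and the perfdata labels and values are invented.

```shell
#!/bin/sh
# Sketch only. nrpe_truncate is a made-up helper that imitates the
# effect of the length limit on a single line of plugin output.
nrpe_truncate() {
    printf '%s\n' "$1" | cut -c1-1023
}

# Build an invented output line with lots of perfdata after the "|".
perf=""
i=0
while [ "$i" -lt 100 ]; do
    perf="$perf temp_$i=30C;40;45"
    i=$((i + 1))
done
output="OK - all good |$perf"

truncated=$(nrpe_truncate "$output")
echo "original length:  ${#output}"
echo "truncated length: ${#truncated}"
```

The status text sits at the start of the line, so it survives the cut; what gets lost is the tail end of the perfdata.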

My guess is that the State field is also empty for the failed
disk. I have an updated beta for you here:

  http://folk.uio.no/trondham/software/beta/

It should now report that the disk is Unknown State.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Check_OpenManage INTERNAL ERROR

2010-10-25 Thread Trond Hasle Amundsen
Benny Somali benny.som...@firstnational.ca writes:

 Ignore my previous question.

Too late, but no problem. My one-line patch is easily reversed :)

 It worked fine now.
 I used a batch script and didn't add a line to turn the echo off so it
 returned special characters.
 So I added @echo off and the Status Information displayed.

Good. Thanks again for reporting this and for testing the beta version.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Check_OpenManage INTERNAL ERROR

2010-10-22 Thread Trond Hasle Amundsen
Benny Somali benny.som...@firstnational.ca writes:

 INTERNAL ERROR: substr outside of string at script/check_openmanage line 1502.
 INTERNAL ERROR: Use of uninitialized value in lc at script/check_openmanage 
 line 1502.

Hi Benny,

Thanks for reporting this. The error is related to the vendor of
physical disks as reported by omreport. What does 'omreport storage
pdisk controller=0' say? I'm guessing that the Vendor field is empty or
missing for one of the disks.

Finding the root cause would be interesting. Can you tell if the disk in
question is an original disk supplied by Dell? If it isn't, this could
be the reason that the vendor field is empty/missing, i.e. Openmanage
doesn't recognize it. If it is a Dell drive, we're probably dealing with
a rare Openmanage oddity.

In any case, check_openmanage should handle this situation more
gracefully. I'll provide a patched version for you to test on Monday.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Check_OpenManage INTERNAL ERROR

2010-10-22 Thread Trond Hasle Amundsen
Benny Somali benny.som...@firstnational.ca writes:

 Yes, you are right.
 Pdisk #1 has an empty vendor ID field.
 The disk in question was an original Dell disk; however, it seems to be
 bad now.
 We have an open trouble ticket with Dell and expect to get a
 replacement disk.

Ah.. it makes sense that in some circumstances, if the disk is
sufficiently bad, Openmanage can't report the vendor.

I went ahead and patched this in the plugin. There is a beta version
(win32 binary) available here:

  http://folk.uio.no/trondham/software/beta/check_openmanage.exe

Please give it a try and let me know if it resolved this issue.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Question on setting up my own check

2010-10-20 Thread Trond Hasle Amundsen
Marc Powell li...@xodus.org writes:

 On Oct 19, 2010, at 2:20 PM, steve f wrote:

 Hello All,
 
 I have the following script created to check free space on a remote legacy 
 box via rsh. 
 
 used=`sudo rsh $1 df -v |grep starlite6 | head -1 | awk '{print $4}'`
 free=`sudo rsh $1 df -v |grep starlite6 | head -1 | awk '{print $5}'`

 Beyond just good programming practice, always use full paths to external 
 programs within your scripts. $PATH may not be what you expect it to be, 
 especially when being run by the nagios daemon which has a more restrictive 
 environment.

 # (paths may be different on your system)
 used=`/usr/bin/sudo /usr/bin/rsh $1 /bin/df -v | /bin/grep starlite | 
 /usr/bin/head -1 | /usr/bin/awk '{print $4}'`

Or... set PATH before doing anything else, e.g.

  #!/bin/bash
  PATH=/bin:/sbin:/usr/bin:/usr/sbin
  export PATH
  [...rest of script...]

This improves readability compared to using full paths everywhere.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Bug in check_openmanage ?

2010-09-24 Thread Trond Hasle Amundsen
rb...@free.fr writes:

 omreport chassis pwrmanagement
 Power Budget Information is not available on this system.

 In fact, I solved the problem by updating/resetting the iDRAC.

Ok, good to know. I'm still a little concerned that there was a hardware
problem that check_openmanage didn't identify properly. Please let me
know if this happens again.

 But the Nagios plugin is still failing and I don't know why ...

 ./tmp/check_openmanage -H 10.1.19.193
  SNMP ERROR [cooling]: Requested entries are empty or do not exist.

This is a completely different problem. Cooling devices (i.e. fans)
should exist in all servers except blades. Which type of server is this,
and do you know if it has fans or not?

The error above is from the Net::SNMP perl module. If the plugin doesn't
get the data it expects when polling via SNMP, it will forward the error
message from Net::SNMP.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Bug in check_openmanage ?

2010-09-23 Thread Trond Hasle Amundsen
rb...@free.fr writes:

 OOPS! Something is wrong with this server, but I don't know what. The
 global system health status is CRITICAL, but every component check is
 OK. This may be a bug in the Nagios plugin, please file a bug report.

 The status changed from OK to Unknown...

 Can anybody help me debug this?

Hi Rémi,

Thanks for reporting this.

As an extra precaution, check_openmanage checks the global health
status in addition to each of the components, provided that you don't
use blacklisting and/or check control, since with those options the
global check could produce a false positive.
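
The comparison described above can be sketched roughly as follows. This is
an illustrative Python sketch, not the plugin's actual Perl code: the
function names, the severity ordering and the `filtered` flag (standing in
for blacklisting/check control) are assumptions made for the example.

```python
# Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Severity ordering assumed for this sketch (CRITICAL is worst).
SEVERITY = {OK: 0, UNKNOWN: 1, WARNING: 2, CRITICAL: 3}


def worst_state(states):
    """Return the most severe state in the list, or OK if it is empty."""
    return max(states, key=SEVERITY.__getitem__, default=OK)


def evaluate(global_state, component_states, filtered=False):
    """Combine per-component results with the global health status.

    filtered=True means blacklisting/check control is in use, so a bad
    global status may come from a component that was skipped on purpose.
    """
    worst = worst_state(component_states)
    if worst != OK:
        return worst, "at least one component check is not OK"
    if global_state != OK and not filtered:
        # The "OOPS!" case: something is wrong, but no check found it.
        return UNKNOWN, "global health is not OK, but every component check is OK"
    return OK, "all checks OK"
```

With a CRITICAL global status and only OK components, `evaluate` returns
UNKNOWN, which matches the OK to Unknown change reported in this thread.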

This case seems to be a real issue where a component is bad and the
global health status reflects this. The component in question is not
checked by the plugin for some reason. I'd like to narrow down the
suspect pool. If you have login access to this server, can you send the
output from the following command:

  omreport chassis

If this command reports that everything is OK, we're probably dealing
with a storage problem.

Just to rule out blacklisting bugs etc., what is the command definition
for check_openmanage in your Nagios config?

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] Bug in check_openmanage ?

2010-09-23 Thread Trond Hasle Amundsen
rb...@free.fr writes:

 Hi Trond,

 You are right ...

 --
 # omreport chassis

 Health

 Main System Chassis

 SEVERITY : COMPONENT
 Ok   : Memory
 Critical : Power Management
 Ok   : Processors
 Ok   : Temperatures
 Ok   : Voltages
 Ok   : Hardware Log
 Ok   : Batteries

 For further help, type the command followed by -?
 

 On the IDRAC i have the message System Board Current Latch

This is interesting.. Have you configured power budgeting on this
server? What does this command say:

  omreport chassis pwrmanagement

On a regular R805 here it just says:

  Power Budget Information is not available on this system.

but we've never configured or used this feature, so I don't know
anything about it.

I'm thinking that perhaps check_openmanage should support these and
similar configurable OMSA features.

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Problem with check_openmanage and Open Manage 6.3

2010-09-16 Thread Trond Hasle Amundsen
Luca Olivotto lolivo...@gmail.com writes:

 Hello,
 i have a problem with the plugin check_openmanage .

 if i use this command:
 ./check_openmanage -H xx.xx.xx.xx

 i get this result:
 OOPS! Something is wrong with this server, but I don't know what. The global
 system health status is WARNING, but every component check is OK. This may be
 a bug in the Nagios plugin, please file a bug report.

 The server that i'm checking is a PowerEdge 2950 and i suppose that the
 problem is the version of OpenManage installed on the server. The version is
 6.3 and the only warning shown via the webinterface are  the old version of
 the firmware/driver/storeDriver of the controller.
 If i try that command

 check_openmanage -H 10.10.10.6 -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all -s
 -e
 the output is:
 OK - System: 'PowerEdge 2950', SN: 'xx', 16 GB ram (4 dimms), 0 logical
 drives, 0 physical drives

 as you can see the disks are not checked (that server has a broken mirror).

 the version of check_openmanage is 3.6.0

Hi Luca,

Your analysis is correct. OMSA doesn't display storage info via SNMP,
but there is something wrong with a storage component. For some reason,
OMSA senses the storage failure and the global health status inherits
this failure status, but OMSA doesn't display the storage. This
condition will trigger the behaviour you are seeing.

The plugin searches for storage controllers. If it doesn't find any
controllers, it concludes that there is no storage altogether and will
skip subsequent checks of disk drives etc.

Do you see any controllers by running this command on the server:

  omreport storage controller

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] Problem with check_openmanage and Open Manage 6.3

2010-09-16 Thread Trond Hasle Amundsen
Luca Olivotto lolivo...@gmail.com writes:

 yes, i see the perc 6i controller.

Ok, thanks. I then suspect that the problem lies with the SNMP part of
OMSA. Can you run the following command from your Nagios server to
confirm:

  snmpwalk -v2c -c community hostname/ip 1.3.6.1.4.1.674.10893.1.20.130.1

The result should look something like this:

  $ snmpwalk -v2c -c public foobar 1.3.6.1.4.1.674.10893.1.20.130.1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.1.1 = INTEGER: 1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.2.1 = STRING: PERC 6/i Integrated
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.3.1 = STRING: DELL
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.4.1 = INTEGER: 6
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.5.1 = INTEGER: 1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.7.1 = INTEGER: 30
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.8.1 = STRING: 6.2.0-0013
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.9.1 = INTEGER: 256
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.10.1 = INTEGER: 0
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.11.1 = INTEGER: 6
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.12.1 = INTEGER: 2
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.37.1 = INTEGER: 3
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.38.1 = INTEGER: 3
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.39.1 = STRING: \\0
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.40.1 = INTEGER: 3
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.41.1 = STRING: 00.00.04.17-RH1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.42.1 = STRING: embedded
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.43.1 = INTEGER: 99
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.47.1 = INTEGER: 2
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.48.1 = INTEGER: 30
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.49.1 = INTEGER: 30
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.50.1 = INTEGER: 30
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.51.1 = INTEGER: 30
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.52.1 = INTEGER: 1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.53.1 = INTEGER: 1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.54.1 = INTEGER: 32
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.57.1 = INTEGER: 99
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.58.1 = INTEGER: 99
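
If you want to check such output from a script, a rough Python sketch
could look like the one below. It only parses text in the format shown
above; the function names are made up for this example, and treating
column 1 of the table as "one row per controller" is an assumption based
on the sample output.

```python
import re

# Matches lines such as "<oid> = INTEGER: 1" or "<oid> = STRING: DELL".
LINE_RE = re.compile(r"^\s*(\S+) = ([A-Za-z0-9-]+): ?(.*)$")

# First column of the controller table in the sample output above.
FIRST_COLUMN = "SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.1."


def parse_snmpwalk(text):
    """Return {oid: value} for every 'oid = TYPE: value' line.

    Lines without a TYPE (for example "No Such Object available on this
    agent at this OID") and wrapped continuation lines are skipped.
    """
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            result[match.group(1)] = match.group(3).strip()
    return result


def controller_count(text):
    """Count rows in the controller table; 0 means no controllers via SNMP."""
    return sum(1 for oid in parse_snmpwalk(text) if oid.startswith(FIRST_COLUMN))
```

A count of 0 while `omreport storage controller` lists a controller on the
server points at the SNMP side of OMSA rather than at the plugin.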

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Problem with check_openmanage and Open Manage 6.3

2010-09-16 Thread Trond Hasle Amundsen
Luca Olivotto lolivo...@gmail.com writes:

 that is the output:
 SNMPv2-SMI::enterprises.674.10893.1.20.130.1 = No Such Object available on
 this agent at this OID

Ok, this confirms that the problem lies with OMSA, specifically the SNMP
functionality. I'm afraid that I can't offer many clues about how to fix
this. I would try restarting the OMSA and SNMP services, and if that
doesn't work, reinstall OMSA completely.

Best of luck,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage ignores blacklist directive

2010-08-10 Thread Trond Hasle Amundsen
C. Bensend be...@bennyvision.com writes:

 Despite of giving it the parameter to ignore Warnings about the
 controller firmware, it still gives a Warning Status:

 /usr/lib/nagios/plugins/check_openmanage -b ctrl_fw -s -H 192.168.2.137

 'ctrl_fw' isn't the complete option you need to give there - you
 also need to specify the ID per:

 http://folk.uio.no/trondham/software/check_openmanage.8.html

 Try 'ctrl_fw=0,1'

Yes, or:

  ctrl_fw=all

..if you wish to blacklist this for all controllers and aren't
interested in specifying controller IDs.
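
To illustrate the syntax (keyword, equals sign, then either a
comma-separated ID list or "all", with several items joined by "/"), here
is a small Python sketch. It is not taken from the plugin; the function
names are invented, and rejecting a bare keyword with an error is an
assumption made for the example.

```python
def parse_blacklist(spec):
    """Parse e.g. 'ctrl_fw=all/ctrl_driver=0,1' into a dict of ID lists."""
    result = {}
    for item in spec.split("/"):
        if not item:
            continue
        keyword, sep, ids = item.partition("=")
        if not sep or not ids:
            # A bare keyword such as 'ctrl_fw' names no component at all.
            raise ValueError(f"blacklist item {item!r} needs an ID list or 'all'")
        result.setdefault(keyword, []).extend(ids.split(","))
    return result


def is_blacklisted(blacklist, keyword, component_id):
    """True if the component ID is listed for the keyword, or 'all' is."""
    ids = blacklist.get(keyword, [])
    return "all" in ids or str(component_id) in ids
```

For example, `parse_blacklist("ctrl_fw=0,1")` covers controllers 0 and 1
only, while `parse_blacklist("ctrl_fw=all")` covers every controller.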

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage

2010-06-28 Thread Trond Hasle Amundsen
Max Williams max.willi...@mflow.com writes:

 Here is the output, the inactive temperature probe is sorted but the
 missing EMM still produces an alert:

   OK |  1:1:0:1 | Temperature Probe 1 in enclosure 3 [MD1000] is Inactive

This one works as expected :)

   OK |  1:1:0:2 | Temperature Probe 2 in enclosure 3 [MD1000]:  C ( max)
   OK |  1:1:0:3 | Temperature Probe 3 in enclosure 3 [MD1000]:  C ( max)

Hmm... something strange going on here. I wonder why this happens, in
the SNMP output you attached previously the values are there. Anyway,
I've added some extra checking in the code to make it report better if
the reading is unavailable for some reason. It should now report simply:

  Temperature Probe 0 in enclosure 2:0:0 [MD1000] is Ready

if the temp reading is not an integer and OMSA reports the status as OK.
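
The extra check amounts to something like the following Python sketch.
The plugin itself is written in Perl, so this is only an illustration;
the function names and the "reads N C" wording for the normal case are
assumptions.

```python
import re


def is_integer(value):
    """True if value looks like a whole number (None and '' do not)."""
    return value is not None and re.fullmatch(r"-?\d+", str(value).strip()) is not None


def temp_message(probe, enclosure, reading, status):
    """Build the message for one enclosure temperature probe."""
    label = f"Temperature Probe {probe} in enclosure {enclosure}"
    if is_integer(reading):
        return f"{label} reads {int(reading)} C"
    # No usable reading: report only the status instead of printing
    # an empty or undefined value.
    return f"{label} is {status}"
```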

 CRITICAL |  1:1:0:1 | EMM 1 in enclosure 3 [MD1000] needs attention: Not Installed

Ah.. I misread the SNMP output.. The status is Unknown when reported
by omreport, but Other when reported with SNMP. One little annoying
difference between the two.. The output should be:

  EMM 0 in enclosure 2:0:0 [MD1000] is Not Installed

with an OK state.

I've created a second test version:

  http://folk.uio.no/trondham/software/beta/check_openmanage

Please give this one a try and see if it performs better.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage

2010-06-28 Thread Trond Hasle Amundsen
Max Williams max.willi...@mflow.com writes:

 Excellent, sorted, everything reports as OK now. 

Good. I'll try to make a release with these changes in the next couple
of days.

 Thanks so much Trond, amazing support and an amazingly useful plugin!

Glad you like it, Max. Thanks for reporting this issue :)

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage

2010-06-25 Thread Trond Hasle Amundsen
Max Williams max.willi...@mflow.com writes:

 Hi,

 After adding more storage to a couple of our servers we are getting this 
 error:

  

 [r...@host  ~]# /usr/lib64/nagios/plugins/check_openmanage -C password -b
 ctrl_driver=0,1,2 -b ctrl_fw=0,1,2 -b intr=0 -H host2

 Temperature Probe 1 in enclosure 3 [MD1000] is Inactive C at  ( max)

 EMM 1 in enclosure 3 [MD1000] needs attention: Not Installed

 INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
 plugins/check_openmanage line 2312.

 INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
 plugins/check_openmanage line 2312.

 INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
 plugins/check_openmanage line 2318.

 INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
 plugins/check_openmanage line 2318.

 INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
 plugins/check_openmanage line 2318.

 INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
 plugins/check_openmanage line 2318.

 [r...@host  ~]#

  

 We didn't get this error before adding a new cabinet of disks which now brings
 the total up to 47 (2x internal disk and 3x full MD1000s).

 Has anyone else come across this error? I am not perl literate so not sure
 how to debug or fix this.

Hi Max,

This is interesting. I've never seen Inactive temperature sensors in
external enclosures. Also, that the plugin reports missing EMMs seems
like a misfeature. Can you post the output from the following commands:

On the monitored host:

  omreport storage enclosure controller=id enclosure=id info=temps
  omreport storage enclosure controller=id enclosure=id info=emms

Replace id with controller/enclosure pairs. You'll get the
enclosure and controller IDs with commands

  omreport storage controller
  omreport storage enclosure

Also, since you're checking with SNMP, I'll need the output from an
snmpwalk of the enclosures wrt. temperatures and EMMs. From the Nagios
server:

  snmpwalk -v2c -c community hostname 1.3.6.1.4.1.674.10893.1.20.130.11
  snmpwalk -v2c -c community hostname 1.3.6.1.4.1.674.10893.1.20.130.13

If you are uncomfortable with posting this information on the
mailing list, feel free to email me directly.

Debug output from the plugin could also be useful:

  check_openmanage -H hostname -C community -d

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage

2010-06-25 Thread Trond Hasle Amundsen
Max Williams max.willi...@mflow.com writes:

 Both of the new enclosures show the same output so perhaps these just
 have a different configuration to the others we have here.

Yes. I suspect that this is related to one EMM not being installed. My
guess is that the inactive temperature sensor is located in the EMM, but
there is no way to tell since neither the omreport output nor the SNMP
output reveals the location of the temperature sensors. Or perhaps the
EMM is needed to activate the sensor. We always order our MD1000s with 2
EMMs, so this is something that I haven't had the opportunity to test.

I have created a test version for you to try. This version should:

  * report inactive temperature sensors as OK
  * report EMMs with state Not Installed as OK

In addition it checks that the readings from the sensors are in fact
digits before attempting to print the values.

The test version is located here:

  http://folk.uio.no/trondham/software/beta/

Try it with the '-d' option to see that it reports these things
properly.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage plugin error

2010-05-26 Thread Trond Hasle Amundsen
Andrea Ballarati ballar...@interfree.it writes:

 Nagios reports error from the plugin in subject, we have another Dell
 PowerEdge 1950 for which no errors are reported.
 This is the output of check_openmanage -d

 System:      PowerEdge 1800
 ServiceTag:                   OMSA version:    4.5.0
 BIOS/date:   A05 09/21/2005   Plugin version:  3.5.7
 -----------------------------------------------------------------------------
    Storage Components
 =============================================================================
   STATE  |   ID   |  MESSAGE TEXT
 ---------+--------+----------------------------------------------------------
  WARNING |      0 | Controller 0 [CERC SATA 1.5/2s] needs attention: Degraded
       OK |  0:0:0 | Array Disk 0:0 [1.0TB] on ctrl 0 is Online
       OK |  0:0:1 | Array Disk 0:1 [1.0TB] on ctrl 0 is Online
       OK |    0:0 | Logical drive 0 'Windows Disk 0' [RAID-1, 931.48 GB] on ctrl 0 is Ready
       OK |    0:0 | Channel 0 [] on controller 0 is Ready
 -----------------------------------------------------------------------------
    Chassis Components
 =============================================================================
   STATE  |   ID   |  MESSAGE TEXT
 ---------+--------+----------------------------------------------------------
       OK |      1 | Memory module 1 [DIMM1_A, 512 MB] is Ok
       OK |      2 | Memory module 2 [DIMM1_B, 512 MB] is Ok
       OK |      1 | Chassis fan 1 [BMC Fan 1]: 1500
       OK |      2 | Chassis fan 2 [BMC Fan 2]: 1500
       OK |      0 | Power Supply 0 [VRM]: Presence detected
       OK |      1 | Power Supply 1 [VRM]: Presence detected
       OK |      0 | Temperature Probe 0 [PROC_1 Temp] reads 38 C (max=120/125)
       OK |      1 | Temperature Probe 1 [BMC Ambient Temp] reads 22 C (min=8/3, max=40/45)
       OK |      2 | Temperature Probe 2 [BMC Planar Temp] reads 33 C (min=8/3, max=62/67)
       OK |      3 | Temperature Probe 3 [BMC VRD 0 Temp] reads 31 C (min=8/3, max=70/75)
       OK |      4 | Temperature Probe 4 [BMC VRD 1 Temp] reads 27 C (min=8/3, max=70/75)
       OK |      0 | Processor 0 [Intel Xeon 3.00GHz] is Present
       OK |      0 | Voltage sensor 0 [BMC CMOS Battery] is 3.070 V
       OK |      1 | Voltage sensor 1 [PROC_1 VCORE] is Good
       OK |      2 | Voltage sensor 2 [BMC PROC VTT] is Good
       OK |      3 | Voltage sensor 3 [BMC 1.5V PG] is Good
       OK |      4 | Voltage sensor 4 [BMC 1.8V PG] is Good
       OK |      5 | Voltage sensor 5 [BMC 3.3V PG] is Good
       OK |      6 | Voltage sensor 6 [BMC 5V PG] is Good
       OK |      0 | Chassis intrusion 0 detection: Ok (Not Breached)
 -----------------------------------------------------------------------------
    Other messages
 =============================================================================
   STATE  |  MESSAGE TEXT
 ---------+-------------------------------------------------------------------
       OK | ESM log health is Ok (less than 80% full)

 INTERNAL ERROR: Use of uninitialized value in numeric eq (==) at
 /usr/lib/nagios/plugins/check_openmanage line 1380.
 INTERNAL ERROR: Use of uninitialized value in numeric eq (==) at
 /usr/lib/nagios/plugins/check_openmanage line 1380.
 INTERNAL ERROR: Use of uninitialized value in sprintf at
 /usr/lib/nagios/plugins

Hi Andrea,

check_openmanage is designed to work with relatively recent OMSA
versions. You are using OMSA version 4.5.0, which is very old. The
server in question (PowerEdge 1800) is supported by newer OMSA, so the
solution is an OMSA upgrade to the latest version (6.2.0).

OMSA versions 5.3.0 and later are OK to use with check_openmanage, and
I've had reports that 5.1.0 and 5.2.0 work as well (but no
guarantee). Anything older will yield strange results or will simply not
work.
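
The version rule above can be written down as a small Python sketch. The
function names and the three labels are made up for illustration, and
treating everything from 5.1.0 up to (but not including) 5.3.0 as
"reported to work" is an assumption that generalizes the two versions
named above.

```python
def parse_version(text):
    """Turn '5.3.0' into (5, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def omsa_support(version):
    """Classify an OMSA version string for use with check_openmanage."""
    parsed = parse_version(version)
    if parsed >= (5, 3, 0):
        return "supported"
    if parsed >= (5, 1, 0):
        return "reported to work, no guarantee"
    return "too old"
```

Comparing tuples rather than strings avoids mistakes such as "10.0.0"
sorting before "5.3.0".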

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage weirdness

2010-05-21 Thread Trond Hasle Amundsen
Greg Etling getl...@stern.nyu.edu writes:

 Trond, thanks for your quick reply. Unfortunately it does appear we have 
 a disconnect between OMSA and SNMP:

[snip]

 [r...@nagios ~]# snmpwalk -v2c -c * testserver 
 1.3.6.1.4.1.674.10893.1.20.130.1
 SNMPv2-SMI::enterprises.674.10893.1.20.130.1 = No Such Object available 
 on this agent at this OID

Hmm.. you should see output like:

$ snmpwalk -v2c -c community hostname 1.3.6.1.4.1.674.10893.1.20.130.1
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.1.1 = INTEGER: 1
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.2.1 = STRING: PERC 6/i Integrated
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.3.1 = STRING: DELL
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.4.1 = INTEGER: 6
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.5.1 = INTEGER: 1
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.7.1 = INTEGER: 30
SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.8.1 = STRING: 6.2.0-0013
[...]

 It appears to only have data under the 1.3.6.1.4.1.674.10892 and 
 1.3.6.1.4.1.674.10899 trees. Thoughts?

Unfortunately my Windows knowledge is rather limited. I have never
installed OMSA on Windows, but I suspect that there are options to
choose from during the install. The first thing I would do is to
re-install OMSA step by step and try to figure out what I might have
missed. On Linux, the install procedure and packaging of the OMSA
components changed with version 6.2.0. This may very well be the case
with the Windows version as well.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Internal error

2010-04-13 Thread Trond Hasle Amundsen
Richard Hagen r.ha...@qlict.nl writes:

 I recently installed a new DELL Poweredge 2970 with W2k8 and installed also
 DELL OMSA.

 When i read the status from nagios i get the following error:

 Amperage probe 0 [PS 1 Current 1] reads 0 A
 Amperage probe 1 [PS 2 Current 2] reads 0 A
 INTERNAL ERROR: Use of uninitialized value in division (/) at /usr/lib/nagios/
 plugins/check_openmanage line 3536.
 INTERNAL ERROR: Use of uninitialized value in division (/) at /usr/lib/nagios/
 plugins/check_openmanage line 3536.
 INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib/nagios/
 plugins/check_openmanage line 3562.

Hi Richard,

This happens because the values (i.e. the readings from the amperage
probes) are not reported by SNMP, while the rest of the data about the
probes is reported (status, type, name etc.). There is something wrong with
Openmanage on this server. What is the output from this command:

  omreport chassis pwrmonitoring

That being said, the plugin could handle this better. Please try the
beta version available here:

  http://folk.uio.no/trondham/tmp/

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Check_multipath

2010-03-25 Thread Trond Hasle Amundsen
Brian O'Mahony brian.omah...@curamsoftware.com writes:

 It works locally though, and I have

  

 Cmnd_Alias MULTIPATH=/sbin/multipath -l

 nagios  ALL= NOPASSWD: MULTIPATH

My money is on requiretty. Locally you have a TTY, while NRPE does
not. The requiretty setting in /etc/sudoers must be turned
off. Comment out this line in /etc/sudoers:

  Defaults    requiretty

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Problem with check_openmanage 3.5.6

2010-03-22 Thread Trond Hasle Amundsen
Nicole Hähnel m...@nicole-haehnel.de writes:

 CRITICAL: [xxx] Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0 
 needs
 attention:
 -- SYSTEM:  PowerEdge 830, SN: xxx
 INTERNAL ERROR: Use of uninitialized value in string eq at 
 /usr/lib64/nagios/
 plugins/grontmij/check_openmanage line 1432.
 INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
 plugins/grontmij/check_openmanage line 1445.

Mostly for the list archive:

We took this off the list to do some back-and-forth debugging and
testing, and the issue is now resolved. A new version of
check_openmanage is released, which will print the above correctly as:

  CRITICAL: Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0 needs attention: Undefined value 4096

This relates to SNMP returning values which are not defined in the
MIBs. Such values are now reported as Undefined value number.
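
The idea is a table lookup with a fallback, roughly as in this Python
sketch. The table contents are only illustrative and are not copied from
the Dell MIB, and the names are invented for the example.

```python
# Illustrative subset of a state table; the real mapping lives in the MIB.
DISK_STATE = {
    1: "Ready",
    2: "Failed",
    3: "Online",
}


def describe_state(value, table=DISK_STATE):
    """Return the text for a known value, or a clear marker otherwise."""
    return table.get(value, f"Undefined value {value}")
```

`describe_state(4096)` returns "Undefined value 4096" instead of an
undefined string that would later break a comparison or sprintf.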

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] Keeping the Nagios Configuration Sane

2010-03-10 Thread Trond Hasle Amundsen
David Wallis wal...@aps.anl.gov writes:

 Matt Simmons wrote:
 Hi All,

 I'm attending the 2010 Professional IT Community Conference
 (http://www.picconf.org) being held in New Brunswick, NJ, and I'm
 giving a talk about staying sane while working with the Nagios
 configuration.

 The talk will be 45 minutes long, and will primarily be an outshoot
 from this article that I wrote on my blog:
 http://www.standalone-sysadmin.com/blog/2009/07/nagios-config/

 I could talk about that and some other things that I've been figuring
 out, but I was wondering if anyone had any tricks or tips for dealing
 with the Nagios config? Is there anything special that you do to keep
 things straight?

 I'm going to be putting my slides and any additional material online
 following the conference, so hopefully someone else can get some use
 from it.

 By the way, if anyone on this list is in the north east of the US, you
 should come visit the conference. Without training, it's only $275 for
 2 days. With a full day and a half of training, it's still only $400
 for the whole shebang. Anyway, this isn't a sales email.

 I'm looking forward to any tips you would want to share. Thanks in advance!

 --Matt
   

 I manage the Nagios installation for 3 different domains at work, each 
 domain with several hundred servers and clients. I quickly reached the 
 There's got to be a better way! point when trying to maintain 
 configuration files that were getting pretty big. I was using all the 
 tricks listed in the Nagios docs, but it was still pretty crazy.

 The approach I took was to write a configuration generator program that 
 uses a meta-config file to generate the hosts.cfg, hostgroups.cfg and 
 services.cfg config files. The meta-config file allows one to set up 
 cascading configuration variables, and then has one line per monitored 
 host, that includes things like host groups, parents, etc, and then a 
 list of services to monitor.

 I also created the idea of meta-services that allow the program to 
 generate configuration data for any number of related services with a 
 single service name in the meta-config file. For instance, including the 
 service weball will cause the configuration generator to create 
 service entries for every plumbed interface on the web server, checks 
 for every virtual server (http and https), and checks for every SSL cert 
 that it finds. In one domain, a 400 line meta-config file generates a 
 20,000 line services.cfg file.

 Rather than updating individual config files, I just update the 
 meta-config file and then regenerate all of the *.cfg files. I've been 
 using this for several years with very good results.

That's an interesting approach, and we do something similar. It goes
without saying that when the number of hosts grows to several hundred,
maintaining the Nagios config for hosts and hostgroups etc. the regular
way becomes an arduous task. This is especially true if your environment
is largely heterogeneous.

We have a list of our servers maintained in a homegrown application
using a topic map as base. Large parts of the Nagios config are
generated from this. I think this is an important point. Usually, you
already have a list of your servers, and you can use this list as a base
for Nagios config as well. The format of the host list is not important,
but deciding that this is the starting point for Nagios hosts config
is. When a host is added/removed in the list, it is added/removed in
Nagios. This is very much like David's approach, i.e. a list of hosts in
a format that is easier to handle and maintain.
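
As a minimal sketch of this idea (not the generator either of us actually
uses), the following renders hosts.cfg from a plain host list. The HOSTS
records, the addresses and the "generic-host" template name are assumptions
made for illustration; only the Nagios directives themselves (use,
host_name, address, parents) are standard.

```python
# Hypothetical host inventory; in practice this would come from your
# existing server list (database, CSV export, topic map, ...).
HOSTS = [
    {"name": "web01", "address": "192.0.2.10", "parents": ["core-sw1"]},
    {"name": "mail01", "address": "192.0.2.20", "parents": []},
]


def render_host(host, template="generic-host"):
    """Return one Nagios 'define host' block for a host record."""
    lines = [
        "define host {",
        f"    use        {template}",
        f"    host_name  {host['name']}",
        f"    address    {host['address']}",
    ]
    # Only emit 'parents' when the record actually lists some.
    if host.get("parents"):
        lines.append(f"    parents    {','.join(host['parents'])}")
    lines.append("}")
    return "\n".join(lines)


def render_hosts_cfg(hosts):
    """Render the whole hosts.cfg, sorted by name for stable diffs."""
    ordered = sorted(hosts, key=lambda h: h["name"])
    return "\n\n".join(render_host(h) for h in ordered) + "\n"


if __name__ == "__main__":
    print(render_hosts_cfg(HOSTS), end="")
```

Because the output is regenerated from the list every time, removing a
host from the list removes it from Nagios on the next run.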

In addition, we have defined several roles that a server may have,
such as dell-hardware, hp-hardware, mail-mx-server, web-server,
dns-server etc. A simple perl script runs every day on each host and
determines its roles. This information is collected and kept
centrally. Parts of the Nagios config (hostgroups, servicegroups) are
generated based on these roles.
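
A rough sketch of how role data could be turned into hostgroups, assuming
the collected roles end up in a simple host-to-roles mapping. The
HOST_ROLES data and the function names are hypothetical; our own scripts
are not shown here.

```python
from collections import defaultdict

# Hypothetical result of the daily role-detection run: host -> roles.
HOST_ROLES = {
    "web01": ["dell-hardware", "web-server"],
    "mail01": ["hp-hardware", "mail-mx-server"],
    "web02": ["dell-hardware", "web-server"],
}


def hostgroups_from_roles(host_roles):
    """Invert host -> roles into role -> sorted list of member hosts."""
    groups = defaultdict(list)
    for host, roles in host_roles.items():
        for role in roles:
            groups[role].append(host)
    return {role: sorted(members) for role, members in groups.items()}


def render_hostgroups_cfg(host_roles):
    """Render one 'define hostgroup' block per role."""
    blocks = []
    for role, members in sorted(hostgroups_from_roles(host_roles).items()):
        blocks.append(
            "define hostgroup {\n"
            f"    hostgroup_name  {role}\n"
            f"    members         {','.join(members)}\n"
            "}"
        )
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(render_hostgroups_cfg(HOST_ROLES), end="")
```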

NRPE config is the same on all hosts. It is maintained centrally and
distributed to each host daily. Adding stuff in the sudoers file (needed
for some plugins) is done automatically based on the host's roles.

Another point: We generally don't use plugins that require us to
configure the plugin and tailor it for each individual host. For
example, for filesystem monitoring we have created a custom plugin that
monitors all partitions by default. It has an optional configuration file
locally on each host where we can set individual thresholds if needed.
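
To illustrate the "sane defaults, optional local overrides" pattern, here
is a small sketch. It is not our actual plugin: the default percentages,
the override file format (one "<mountpoint> <warn%> <crit%>" per line) and
the function names are all assumptions.

```python
DEFAULT_WARN, DEFAULT_CRIT = 90, 95  # percent used; illustrative defaults


def parse_overrides(text):
    """Parse lines of '<mountpoint> <warn%> <crit%>'; '#' starts a comment."""
    overrides = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        mount, warn, crit = line.split()
        overrides[mount] = (int(warn), int(crit))
    return overrides


def check_filesystems(usage, overrides=None):
    """Return (nagios_exit_code, message) for {mountpoint: percent_used}."""
    overrides = overrides or {}
    worst, problems = 0, []
    for mount, used in sorted(usage.items()):
        # Fall back to the global defaults when no override exists.
        warn, crit = overrides.get(mount, (DEFAULT_WARN, DEFAULT_CRIT))
        if used >= crit:
            state, label = 2, "CRITICAL"
        elif used >= warn:
            state, label = 1, "WARNING"
        else:
            continue
        worst = max(worst, state)
        problems.append(f"{mount} {used}% used ({label})")
    if not problems:
        return 0, f"OK - {len(usage)} filesystems checked"
    return worst, "; ".join(problems)
```

A host with no override file is checked against the defaults, so a new
partition is monitored without anyone touching the Nagios config.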

Thinking like this should come easily to system administrators who are
used to dealing with large installations. It's all about automation :)

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] Problem with check_openmanage 3.5.6

2010-02-25 Thread Trond Hasle Amundsen
Nicole Hähnel m...@nicole-haehnel.de writes:

 it's a windows server.
 So I'm using check_openmanage with snmp.

 check_openmanage -s -C $ARG1$ -H $HOSTADDRESS$ -e -i -p --state --check 
 intrusion=1,alertlog=1,esmlog=1 -o 3 --htmlinfo de

 List of Physical Disks on Controller CERC SATA 1.5/6ch (Slot 4)

 Controller CERC SATA 1.5/6ch (Slot 4)
 ID                        : 0:0
 Status                    : Unknown
 Name                      : Physical Disk 0:0
 State                     : Unknown
 Failure Predicted         : No
 Progress                  : Not Applicable
 Bus Protocol              : SATA
 Media                     : HDD
 Capacity                  : 149.05 GB (160040681472 bytes)
 Used RAID Disk Space      : 0.00 GB (0 bytes)
 Available RAID Disk Space : 0.00 GB (0 bytes)
 Hot Spare                 : No
 Vendor ID                 : WDC
 Product ID                : WD1600JS-55MHB0
 Revision                  : 02.0
 Serial No.                : WD-WCANM3083963
 Negotiated Speed          : Not Available
 Capable Speed             : Not Available
 Manufacture Day           : Not Available
 Manufacture Week          : Not Available
 Manufacture Year          : Not Available
 SAS Address               : Not Available

Ok, so the status and state are both Unknown. I'm guessing that these
values are completely missing in the SNMP output, which is why Perl
chokes on it. I've added some robustness in the code that should handle
this case properly. Please try the beta version (3.5.7-beta1) available
here:

  http://folk.uio.no/trondham/tmp/check_openmanage-3.5.7-beta1

The plugin will give an alert on the drive, which in my opinion is the
correct thing to do. You can always blacklist the drive. The cause of
the error is obviously that this is a non-Dell drive, which Openmanage
doesn't know how to handle.

BTW, you can reduce your command definition to this:

  check_openmanage -s -C $ARG1$ -H $HOSTADDRESS$ -e -i -p -a -o 3 --htmlinfo de

The effect will be the same. You probably defined the command a while
ago, and there have been some changes to options since then.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] Problem with check_openmanage 3.5.6

2010-02-25 Thread Trond Hasle Amundsen
Nicole Hähnel m...@nicole-haehnel.de writes:

 I tested the new version:

 CRITICAL: [xxx] Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0 needs
 attention:
 -- SYSTEM:  PowerEdge 830, SN: xxx
 INTERNAL ERROR: Use of uninitialized value in string eq at /usr/lib64/nagios/
 plugins/grontmij/check_openmanage line 1432.
 INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
 plugins/grontmij/check_openmanage line 1445.

Hmm.. OK, new test:

  http://folk.uio.no/trondham/tmp/check_openmanage-3.5.7-beta2

Regards,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] Problem with check_openmanage 3.5.6

2010-02-24 Thread Trond Hasle Amundsen
Nicole Hähnel m...@nicole-haehnel.de writes:

 Hi

 I get this message on one pe830 (OM 6.1.0) :

 CRITICAL: [ xxx] Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0 
 needs
 attention:
 -- SYSTEM: PowerEdge 830, SN: xxx
 INTERNAL ERROR: Use of uninitialized value in string eq at /usr/lib64/nagios/
 plugins/grontmij/check_openmanage line 1428.
 INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
 plugins/grontmij/check_openmanage line 1441.


 Is this a problem of check_openmanage or the disk?
 It's a non dell sata disk.

Hi Nicole,

Can you provide the output of the following command, executed on the
monitored host:

  omreport storage pdisk controller=0

Also, are you using check_openmanage in SNMP or local context?

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


Re: [Nagios-users] check_openmanage and net-snmp v3

2010-02-23 Thread Trond Hasle Amundsen

Hi all,

Just to bring this thread to a conclusion... I have released a new
version of check_openmanage that adds a new option '--use-get_table',
which is to be used as a workaround for issues with SNMPv3 on Windows
using net-snmp. There are a few other minor fixes and feature
enhancements as well.

Downloads and changelog:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

(Also available on Nagios Exchange and Monitoring Exchange.)

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage and net-snmp v3

2010-02-15 Thread Trond Hasle Amundsen
Verhaeghe, Koen koen.verhae...@meucci-solutions.com writes:

 The script is working, at least, it does not give any errors anymore.
 I even get Physical Disk 0:1 [Ata WDC WD800JD-75MSA3, 0GB] on ctrl 0
 needs attention: Failure Predicted as expected. I was also expecting an
 error message from the virtual disks, as they are degraded, but that's
 not there.

If the error is just Failure Predicted, it means that the disk is
working fine for the time being and the virtual drive status is not
affected. When/if the drive eventually fails the virtual drive will be
degraded.

 Moreover, I know some of our servers have problems with power supplies
 or memory, so I changed a section in the below mentioned script like you
 did for the disks and others, just to test:

   #my $result = $snmp_session->get_entries(-columns => [keys %ps_oid]);

   ##
   # SNMPv3 test
   ##
   my $result = q{};
   if ($opt{protocol} == 3) {
       my $powerDeviceTable = '1.3.6.1.4.1.674.10892.1.600.12.1';
       $result = $snmp_session->get_table(-baseoid => $powerDeviceTable);
   }
   else {
       $result = $snmp_session->get_entries(-columns => [keys %ps_oid]);
   }

 And now I do get the expected error:
 Power Supply 1 [AC] needs attention: Presence detected, Failure
 detected, AC lost

 I think it is safe to say that, when using net-snmp v3, the get_entries
 method is not giving the expected result.

The complete picture is still a little unclear to me. Do these problems
occur only when you use net-snmp instead of Windows' native snmp agent?
(I'm assuming that net-snmp refers to
http://freshmeat.net/projects/net-snmp).

I would be interested in any test results you might have using the
native Windows snmp agent with SNMPv3.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage and net-snmp v3

2010-02-12 Thread Trond Hasle Amundsen
Verhaeghe, Koen koen.verhae...@meucci-solutions.com writes:

 Thanks for your reply and the new script.
 These are the results:

 With windows SNMP (v2) it works:

Yep, that was expected :)

 With net-snmp v3 (version 5.4.2.1) on the same server, disabling the
 windows snmp, I get:
 ./check_openmanagetest -H xx.xx.xx.xx -P 3 --authprotocol md5 -U xx
 --authpassword xxx --privpassword xx --privprotocol des  -p
 multiline  -t 120 -o 3 -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all
 SNMP ERROR [processors]: Received genError(5) error-status at
 error-index 3.

Hmm.. was this on one of the servers that previously had problems
fetching the cooling OIDs?

I believe it would be better to make this work with the standard Windows
SNMP service, which is what most people would use. Were the results any
different without net-snmp?

 This normally indicates a too low version of OMSA, but I am using 6.2.0.

With SNMPv2 on Windows, that usually is the case, yes.

I have a new test version for you:

  http://folk.uio.no/trondham/tmp/check_openmanage-snmpv3test2

This version uses get_table() for fetching OIDs for CPUs and physical
drives as well as cooling devices.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage and net-snmp v3

2010-02-11 Thread Trond Hasle Amundsen
Verhaeghe, Koen koen.verhae...@meucci-solutions.com writes:

 Hi All,
  
 does anyone have an explanation for this: 
 when using check_openmanage with snmp v3, the script exits because some
 OIDs do not exist for a type of server.
 (e.g. '1.3.6.1.4.1.674.10893.1.20.130.4.1.9'  = 'arrayDiskEnclosureID'
 for PowerEdge 860).

 output:
 ./check_openmanage  -H xx.xx.xx.xx -P 3 --authprotocol md5 -U 
 --authpassword x --privpassword x --privprotocol des  -p
 multiline  -t 120 -o 3 -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all
 SNMP ERROR [storage / pdisk]: The requested entries are empty or do not
 exist.
  
 When enabling the windows snmp service again and disabling the net-snmp
 v3, I get the correct output:
  
 ./check_openmanage  -H xx.xx.xx.xx -P 2 -C xx  -p multiline  -t 120
 -o 3 -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all
 Physical Disk 0:1 [Ata WDC WD800JD-75MSA3, 0GB] on ctrl 0 needs
 attention: Failed
 Logical drive 0 'Windows Disk 0' [RAID-1, 73.57 GB] on ctrl 0 needs
 attention: Degraded|'fan_1_bmc_cpu#fan'=3225RPM;0;0
 'fan_2_bmc_dimm_fan'=3150RPM;0;0
 'temp_0_bmc_planar'=31C;48;53
  
 tested with:
 OMSA version: 5.1 and 6.2
 Net-snmp (x86) versions 5.4.2.1 and 5.5
 NET::SNMP 6.0.0 on the nagios server
  
 Any ideas? 
 I've tried commenting out the OIDs that do not exist (and all related
 script steps) but then the output gives 'OK', but I know there is a
 degraded disk...

 ./check_openmanage  -H xx.xx.xx.xx -P 3 --authprotocol md5 -U 
 --authpassword x --privpassword x --privprotocol des  -p
 multiline  -t 120 -o 3 -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all
 OK - System: 'PowerEdge 860', SN: 'J478F3J', hardware working fine, 1
 logical drives, 2 physical drives
 - BIOS='A05 10/04/2007', DRAC4='1.60', BMC='1.75'
 - Ctrl 0 [SAS 5/iR Adapter]: Fw='00.10.51.00.06.12.05.00',
 Dr='1.21.08.00'
 - OpenManage Server Administrator (OMSA) version:
 '5.1.0'|'temp_0_bmc_planar'=30C;48;53

 On other types of servers I get a similar error for [cooling] (e.g on a
 2950)

Hi Koen,

I'm the author of that plugin. To be honest, I've never actually tested
the SNMPv3 stuff. I just pass the options to Net::SNMP and let it handle
it, and hope that it works. You are the first to report SNMPv3 troubles,
and I assume that the SNMPv3 users are a minority.

I'm always interested in fixing bugs, but I'm unable to reproduce this
problem. I see that you're checking a Windows box. I have none of those
to play with, but I have set up SNMPv3 on a RHEL5 box. Checking the
RHEL5 host via SNMPv3 works just fine:

  $ ./check_openmanage -H myhost -P 3 --authprotocol md5 -U  \
  --authpassword  --privpassword  --privprotocol des
  Controller 0 [SAS 6/iR Integrated]: Driver '3.04.07rh' is out of date

Windows + OMSA + SNMP has had some problems in the past, but at least
for SNMPv2c and SNMPv1 these issues should be resolved with OMSA 5.5.0.1
and later versions. It seems there are still issues with SNMPv3.

In the past, there have been problems with SNMP and using the Net::SNMP
function get_entries() vs. get_table(). The former is preferred because
it is faster, since we're not interested in all the OIDs. This is
especially true for servers with many physical disks.
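
To make the difference concrete: get_table() walks the whole table, while
get_entries() with -columns fetches only the listed columns. The sketch
below is not the plugin's code. It only shows, on a plain dictionary, how
the result of a full table walk can be narrowed down to the wanted
columns afterwards. The table OID comes from the arrayDiskEnclosureID
example quoted above; the row values and the function name are made up.

```python
def filter_columns(table, columns):
    """Keep only the rows of `table` whose OID lies under one of `columns`.

    `table` maps full OID strings to values, as a whole-table walk would
    return them; `columns` is an iterable of column OIDs.
    """
    # A trailing dot keeps column ...1.9 from also matching ...1.90.
    prefixes = tuple(col.rstrip(".") + "." for col in columns)
    return {oid: val for oid, val in table.items() if oid.startswith(prefixes)}


# Illustrative data only: the row values below are invented.
PDISK_TABLE = "1.3.6.1.4.1.674.10893.1.20.130.4.1"
full_walk = {
    f"{PDISK_TABLE}.9.1": "0",
    f"{PDISK_TABLE}.9.2": "0",
    f"{PDISK_TABLE}.90.1": "unrelated column",
}
wanted = filter_columns(full_walk, [f"{PDISK_TABLE}.9"])
```

The cost of the walk grows with the number of rows and columns, which is
why fetching only the needed columns is preferable when it works.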

I have created a test version that fetches the cooling OIDs with
get_table() instead of get_entries() if SNMPv3 is used. This version is
available here:

  http://folk.uio.no/trondham/tmp/check_openmanage-snmpv3test

Can you try this version on the servers where checking the cooling
devices fail?

(It's a bit more complicated for physical drives).

PS. Please upgrade to OMSA version 5.5.0.1 or later. Previous versions
are known to perform badly with SNMP on Windows.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage not using my custom temperature thresholds

2010-02-09 Thread Trond Hasle Amundsen
C. Bensend be...@bennyvision.com writes:

 Hey folks,

 I am trying to use custom temperature thresholds with one of my
 servers, and it doesn't seem to take them into account.

 The full command (as defined in NSC.ini for NSClient++):

 command[check_openmanage]=check_openmanage.exe -e -p -w 0=50 -c 0=54 -b
 bat_charge=ALL/ctrl_fw=ALL/ctrl_driver=ALL --omreport
 F:\dellopenmanage\oma\bin\omreport.exe

 Per http://folk.uio.no/trondham/software/check_openmanage.html and
 the man page, I'm pretty sure that's supposed to set the temperature
 probe 0's warning threshold to 50C and critical to 54C.  However, I'm
 still getting a non-OK for temp probe 0:

 Temperature Probe 0 [System Board Ambient Temp] is too high at 43 C
 -- SYSTEM: PowerEdge 2900, SN: 4PVXSK1

 Just to be sure this wasn't a glitch with the display of the temp
 probe #, I've tried 0=50,1=50,2=50 and 0=54,1=54,2=54 but it still
 complains.

 Am I missing something?  I've looked at this until I'm crosseyed,
 and I'm pretty sure I'm using it correctly.  Is there a hardcoded
 threshold in there that I'm not aware of?

Hi Benny,

Openmanage has its own limits. From a random M600 server here, the
limits for ambient temperature are:

  # omreport chassis temps
  Temperature Probes Information
  
  
  Main System Chassis Temperatures: Ok
  
  
  Index                     : 0
  Status                    : Ok
  Probe Name                : System Board Ambient Temp
  Reading                   : 16.0 C
  Minimum Warning Threshold : 8.0 C
  Maximum Warning Threshold : 42.0 C
  Minimum Failure Threshold : 3.0 C
  Maximum Failure Threshold : 47.0 C

To be honest, I've never considered the possibility of anyone wanting to
set custom temperatures *higher* than the OMSA maximum. I always
assumed that people wanted to use the custom limits to set the max
temperature *lower* than the default limits. Clearly I was wrong :)

What happens in your case is that the OMSA limits kick in. It is
possible to adjust the OMSA warning limits, e.g.

  # omconfig chassis temps index=0 maxwarnthresh=45
  Temperature probe warning threshold(s) set successfully.

It is not possible to adjust the critical (failure) limits like this,
only the warning limits can be set manually. Also, I believe that when a
server hits the critical limit, in the interest of self preservation it
shuts itself down.
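
The interplay can be modelled roughly like this. It is a simplified
sketch, not the plugin's actual logic: it assumes both the custom and the
OMSA limits are checked, so whichever is lower trips first, and that a
reading must strictly exceed a limit to trip it. The function name is
hypothetical, and the 42/47 limits are those of the M600 shown above,
which may differ from the limits on your 2900.

```python
def temp_state(reading, custom_warn=None, custom_crit=None,
               omsa_warn=None, omsa_crit=None):
    """Return 'OK', 'WARNING' or 'CRITICAL' for one temperature probe."""

    def effective(custom, omsa):
        # The lower of the limits that are set wins; None means "not set".
        limits = [v for v in (custom, omsa) if v is not None]
        return min(limits) if limits else None

    crit = effective(custom_crit, omsa_crit)
    warn = effective(custom_warn, omsa_warn)
    if crit is not None and reading > crit:
        return "CRITICAL"
    if warn is not None and reading > warn:
        return "WARNING"
    return "OK"


# Custom limits 50/54 but OMSA limits 42/47: the OMSA warning limit
# still trips at 43 C.
before = temp_state(43, custom_warn=50, custom_crit=54,
                    omsa_warn=42, omsa_crit=47)
# After raising the OMSA warning limit to 45 with omconfig.
after = temp_state(43, custom_warn=50, custom_crit=54,
                   omsa_warn=45, omsa_crit=47)
```

Under these assumptions the first call yields a warning and the second
does not, which matches what you are seeing.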

The plugin could be made to ignore the OMSA warning limit if the custom
limit is set beyond it, but I'm not sure that we want this in general.
What do you think?

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage not using my custom temperature thresholds

2010-02-09 Thread Trond Hasle Amundsen
C. Bensend be...@bennyvision.com writes:

 Now that I know what's going on (and how to adjust the OMSA threshold
 if need be), I'd say keep it where it is.  However, if these details
 were mentioned on the page:

 http://folk.uio.no/trondham/software/check_openmanage.html

 it would have saved me a lot of time, hair, and such.  Could this be
 added?

Yes, I have updated the documentation:

  
  http://folk.uio.no/trondham/software/check_openmanage.html#custom-temperature-thresholds

Hopefully this will clarify things for other users. BTW, thanks for
reporting this, the documentation was ambiguous and in need of an
update :)

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage having issues with OMSA 6.2.0

2010-02-01 Thread Trond Hasle Amundsen
C. Bensend be...@bennyvision.com writes:

 Hey folks,

 During this past weekend's maintenance window, we upgraded several
 hosts to OMSA v6.2.0.  They were previously at v5.0.0, and so
 check_openmanage wasn't able to poll them.

 Now, they are still showing as UNKNOWN, giving the following
 error:


 Problem running 'omreport storage controller': Error!  Invalid
 name=value pair: controller


 This is running on a Dell PowerEdge 1950, with the following
 command via NSClient++:


 check_openmanage.exe -e -b bat_charge=ALL/ctrl_fw=ALL/ctrl_driver=ALL
 --omreport E:\OpenManage\oma\bin\omreport.exe


 I would have thought 6.2.0 would be OK - is anyone else seeing
 issues, or does anyone know of incompatibilities?  I checked the
 check_openmanage FAQ, but didn't see anything...

Hi Benny,

The command 'omreport storage controller' is pretty basic and should
never fail like that. You should check if OMSA is correctly installed,
specifically the storage stuff. OMSA consists of many different
components, and I'm guessing that the storage component(s) are missing
on your server.

If you run the command manually, you get the same error message, right?

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo



Re: [Nagios-users] check_openmanage 3.5.5-beta6 snmp_detect_blade bug

2010-01-20 Thread Trond Hasle Amundsen
McKinlay, Ken ken.mckin...@curtisswright.com writes:

 Trond,

 Another little bug for your next release. Using check_openmanage
 3.5.5-beta6 on a server loaded with OMSA 5.1.0 (a different box this
 time), in the snmp_detect_blade function it returned: INTERNAL ERROR:
 Use of uninitialized value in string eq at
 ./check_openmanage-3.5.5-beta6 line 599.

 Looking at the line and then doing my own SNMP query, that OID is
 missing in OMSA 5.1.0. However, by changing line 599 to first make sure
 a result has been set then the uninitialized value error is bypassed in
 the if statement:

 if ( $result->{$DellBaseBoardType} && $result->{$DellBaseBoardType} eq
 '3') {

Thank you, the patch is applied. Note that check_openmanage is not
designed to work with really old OMSA versions (5.2 and earlier). This
is more of a problem when checking locally, since omreport commands are
different. I generally won't add support for old OMSA if it has a
noticeable speed or complexity impact, but that is not the case here.
Besides, checking that the value exists is good practice anyway :)
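
The same guard, sketched in Python purely as an illustration of the
pattern (the plugin itself is Perl): look the value up in a way that
tolerates a missing key, then compare. The function name and the
dictionary-shaped result are assumptions; the value '3' is the one
compared against in the patch above.

```python
BLADE_TYPE = "3"  # value compared against in the quoted patch


def is_blade(result, board_type_oid):
    """True only if the OID was returned and equals the blade type."""
    value = result.get(board_type_oid)  # None when the agent omits the OID
    return value is not None and value == BLADE_TYPE
```

An agent that does not return the OID at all is then simply treated as
"not a blade" instead of triggering an error.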

An updated version is available here:

  http://folk.uio.no/trondham/tmp/check_openmanage-3.5.5-beta7

If you confirm that this beta works for you, and I don't get any more
bug reports in the next few days, this will eventually become 3.5.5.

Cheers,
-- 
Trond H. Amundsen t.h.amund...@usit.uio.no
Center for Information Technology Services, University of Oslo


