Hi, I haven't researched this or anything, but is there a -v option to 
check_by_ssh to get the exact error thrown? I'm simply wondering if you have 
a bad/mismatched key in ~/.ssh/known_hosts or authorized_keys (sorry, I've been 
too busy to be much help on Nagios lately, guys)...
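
One other thing that might be worth ruling out: 127 is the exit status a POSIX 
shell gives when it cannot find the command it was asked to run, so the 127 
may be coming from a shell rather than from your script. A minimal sketch of 
the difference (the command name no_such_plugin_xyz is made up purely for 
illustration, and nothing here touches Nagios or ssh):

```shell
# A shell that cannot find the command exits 127 ("command not found").
# no_such_plugin_xyz is a hypothetical name assumed not to exist on the PATH.
sh -c 'no_such_plugin_xyz' 2>/dev/null
missing_rc=$?
echo "missing command: $missing_rc"

# A command that runs and reports a critical state exits 2,
# which is inside the 0-3 range Nagios accepts from a plugin.
sh -c 'exit 2'
critical_rc=$?
echo "critical result: $critical_rc"
```

If that is what is happening, the thing to compare is the exact command line 
Nagios ends up running against the one you typed by hand.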

Cheers!
Jamie

From: MAHONEY, DANIEL [mailto:dm5...@att.com]
Sent: Monday, June 10, 2013 5:27 PM
To: nagios-users@lists.sourceforge.net; MAHONEY, DANIEL
Subject: [Nagios-users] Return code of 127 is out of bounds - plugin may be 
missing

Send to nagios-users@lists.sourceforge.net

Greetings, all. I've googled the subject above and evaluated the answers I've 
found but haven't yet found info that pinpoints my issue.

I'm running Nagios Core 3.2.1 on RedHat 5.8. This installation has been running 
for a few years; I just inherited its care and maintenance recently. On one of 
my monitored servers I wrote a script "checkRAID.sh" that calls another piece 
of code, looks at the results, and returns either a 0 or a 2 (the result will 
always be either good or critical, depending on whether the RAID controller is 
unhappy).
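
The actual script is not reproduced here. As a rough sketch of a check with 
that contract only, assuming a hypothetical helper get_raid_state in place of 
the real piece of code it calls, and an invented "Check passed" message for 
the good case, it could be shaped like this:

```shell
#!/bin/sh
# Hypothetical stand-in for the real RAID query tool: it simply echoes the
# RAID_STATE variable, defaulting to "Optimal" when that variable is unset.
get_raid_state() {
    echo "${RAID_STATE:-Optimal}"
}

# Print one status line and return 0 (OK) or 2 (CRITICAL), nothing else.
check_raid() {
    state=$(get_raid_state)
    if [ "$state" = "Optimal" ]; then
        echo "Check passed"
        return 0
    fi
    echo "Check failed"
    return 2
}
```

A real script would finish with something like "check_raid; exit $?" so that 
the function's return value becomes the exit code the caller sees.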

Nagios runs as user "nagios". The remote machine is configured to allow user 
"nagios" to log in without a password, using a key pair. This works.

In /usr/local/nagios/etc/checkcommands.cfg I have :
define command{
    command_name    check_raid
    command_line    /usr/local/nagios/libexec/check_by_ssh -H $HOSTNAME -l nagios -i /home/nagios/.ssh/id_rsa -E -o StrictHostKeyChecking=no -C /home/nagios/checkRAID.sh
}

When I become nagios ("su - nagios") and run that command by hand, I get:
[nagios@nagios ~]$ /usr/local/nagios/libexec/check_by_ssh -H <remote server IP> -l nagios -i /home/nagios/.ssh/id_rsa -E -o StrictHostKeyChecking=no -C /home/nagios/checkRAID.sh
Check failed
[nagios@nagios ~]$ echo $?
2
[nagios@nagios ~]$

That "Check failed" line is what's written to stdout just before returning an 
exit code of 2. This shows me that the remote script is working fine, and that 
the local nagios user is able to execute it with no problems.  However, once I 
add an entry to services.cfg to tie this service check to my remote host and 
give it time to run the command, when I look at nagios' "Services" page it 
shows :

check_raid    CRITICAL    06-10-2013 21:17:25    0d 6h 14m 29s    3/3    (Return code of 127 is out of bounds - plugin may be missing)

This has me baffled. The return code is quite clearly 2.

I recently set debug_level to -1 and restarted. I'm hoping that the debug log 
will

Daniel Mahoney
dm5...@att.com
