Hi all,

I posted on the nagios-portal 
(http://www.nagios-portal.org/wbb/index.php?page=Thread&threadID=21919) and was 
referred here for a wider audience. I hope to find some wise minds to aid me 
with this conundrum...



I have an issue with distributed monitoring at the moment. Our Master server is 
accepting commands without a hitch. We've manually run the defined 
submit_check_result script created through the NSCA/Distributed Monitoring 
guide on the webpage.

However, the icinga-process on the slave-icinga doesn't seem to actually 
execute it.

I've added an extra line to log each use (and what it sends) to an extra 
log-file and it's obvious it's never being used by anyone but me, manually.
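
For reference, the sketch below shows the general shape of such a wrapper with the extra logging line. It is not my exact script: the log file path and the central host name are hypothetical placeholders, and the send_nsca binary and config paths are assumed to be the defaults used in the distributed monitoring guide.

```shell
#!/bin/sh
# Sketch of a submit_check_result-style wrapper with an added log line.
# Assumed (hypothetical) values: LOG_FILE and CENTRAL_HOST.
# SEND_NSCA and NSCA_CFG are assumed to be at the guide's default locations.
LOG_FILE="${LOG_FILE:-/tmp/submit_check_result.log}"
SEND_NSCA="${SEND_NSCA:-/usr/local/nagios/bin/send_nsca}"
NSCA_CFG="${NSCA_CFG:-/usr/local/nagios/etc/send_nsca.cfg}"
CENTRAL_HOST="${CENTRAL_HOST:-central.example.org}"

# Map a state name to the numeric return code sent with the passive result.
state_to_code() {
    case "$1" in
        OK)       echo 0 ;;
        WARNING)  echo 1 ;;
        CRITICAL) echo 2 ;;
        *)        echo 3 ;;   # UNKNOWN and anything unrecognised
    esac
}

# Build the tab-separated result line: host, service, return code, output.
build_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$(state_to_code "$3")" "$4"
}

# Log what was passed in, then pipe the result line to send_nsca.
# The output of send_nsca is appended to the same log file.
submit() {
    echo "$1 $2 $(state_to_code "$3") $4" >> "$LOG_FILE"
    build_result "$1" "$2" "$3" "$4" \
        | "$SEND_NSCA" -H "$CENTRAL_HOST" -c "$NSCA_CFG" >> "$LOG_FILE" 2>&1
}

# Only submit when called with the four expected arguments.
if [ "$#" -eq 4 ]; then
    submit "$@"
fi
```

Because the log line is written before send_nsca is called, any invocation that reaches the script body should leave a trace in the log file, even if the send itself fails.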

When turning on debug-levels, I can see it expands the ocsp_command variables 
and the service-variables and sends it.

This is an excerpt from the debug-file;



[1299142675.061622] [016.2] [pid=15506] Found a check result (#1) to handle...
[1299142675.061679] [016.1] [pid=15506] Handling check result for service 
'Current Users' on host 'KD-OPS02'...
[1299142675.061718] [016.0] [pid=15506] ** Handling check result for service 
'Current Users' on host 'KD-OPS02'...
[1299142675.061733] [016.1] [pid=15506] HOST: KD-OPS02, SERVICE: Current Users, 
CHECK TYPE: Active, OPTIONS: 0, SCHEDULED: Yes, RESCHEDULE: Yes, EXITED OK: 
Yes, RETURN CODE: 0, OUTPUT: USERS OK - 1 users currently logged in 
|users=1;20;50;0\n
[1299142675.061883] [016.2] [pid=15506] Parsing check output...
[1299142675.061907] [016.2] [pid=15506] Short Output: USERS OK - 1 users 
currently logged in
[1299142675.061917] [016.2] [pid=15506] Long Output: NULL
[1299142675.061924] [016.2] [pid=15506] Perf Data: users=1;20;50;0
[1299142675.061936] [016.2] [pid=15506] ST: HARD CA: 1 MA: 5 CS: 0 LS: 0 LHS: 0
[1299142675.061959] [016.1] [pid=15506] Service is OK.
[1299142675.061968] [016.1] [pid=15506] Service did not change state.
[1299142675.062020] [2320.2] [pid=15506] Raw Command Input: 
/usr/local/nagios/libexec/submit_check_result $HOSTNAME$ '$SERVICEDESC$' 
$SERVICESTATE$ '$SERVICEOUTPUT$'
[1299142675.062049] [2320.2] [pid=15506] Expanded Command Output: 
/usr/local/nagios/libexec/submit_check_result $HOSTNAME$ '$SERVICEDESC$' 
$SERVICESTATE$ '$SERVICEOUTPUT$'
[1299142675.062059] [016.2] [pid=15506] Raw obsessive compulsive service 
processor command line: /usr/local/nagios/libexec/submit_check_result 
$HOSTNAME$ '$SERVICEDESC$' $SERVICESTATE$ '$SERVICEOUTPUT$'
[1299142675.062115] [016.2] [pid=15506] Processed obsessive compulsive service 
processor command line: /usr/local/nagios/libexec/submit_check_result KD-OPS02 
'Current Users' OK 'USERS OK - 1 users currently logged in'
[1299142675.062140] [256.1] [pid=15506] Running command 
'/usr/local/nagios/libexec/submit_check_result KD-OPS02 'Current Users' OK 
'USERS OK - 1 users currently logged in''...
[1299142675.062332] [064.1] [pid=15506] Making callbacks (type 10)...
[1299142675.077264] [256.1] [pid=15506] Execution time=0.014 sec, early 
timeout=0, result=2, output=(null)



This claims that it's sending the command and with audit control I can see that 
it's actually poking in the file;

aureport -f gives;
1783. 03/03/2011 10:07:55 /usr/local/nagios/libexec/submit_check_result 2 yes 
/usr/sbin/icinga -1 1786


The permissions on the submit_check_result are;

-rwxr-xr-x 1 nagios nagios 1088 2011-03-03 08:47 
/usr/local/nagios/libexec/submit_check_result

And running the script with the nagios user (the same as the icinga process is 
running as);
root@kd-ops02:/usr/src# ps aux | grep icinga
nagios 15506 0.0 0.1 37396 2020 ? SNsl 09:57 0:01 /usr/sbin/icinga -d 
/etc/icinga/icinga.cfg


su - nagios
/usr/local/nagios/libexec/submit_check_result KD-OPS02 'Disk Space' OK 
'Manually Sent OK #5'

The separate logging I added in the submit_check_result script writes;

KD-OPS02 Disk Space 0 Manually Sent OK #5
1 data packet(s) sent to host successfully.

And the Master Icinga server receives it and handles it properly when I send it 
manually.

What is going wrong?





The slave-system is running on Ubuntu Server 10.10 Maverick with icinga 
installed from the Ubuntu repository, package v. 1.0.2-1, and with Nagios 
Plugins 1.4.15



dpkg -l | grep icinga
ii icinga 1.0.2-1 monitoring and host and network monitoring system - 
metapackage
ii icinga-cgi 1.0.2-1 host and network monitoring system - CGI scripts
ii icinga-common 1.0.2-1 host and network monitoring system - support files
ii icinga-core 1.0.2-1 host and network monitoring system - core files