I had success with the first service event handler we implemented, and am now trying to duplicate it for a second service (with a slightly modified event handler script), but it is failing. The failing event handler is for the "Argus Daemon" service on host quagmire. Although the logs show the event handler being called with the expected arguments, the command executed by the event handler script (an SSH connection to the target host) is never observed. If we call the event handler script manually with the same arguments, it operates properly (the SSH connection occurs and the remote service is started).
Running Nagios 2.10 (nagios-2.10-3.fc7) on Fedora 7 GNU/Linux. SELinux enabled but set to not enforce.

# /etc/nagios/nagios.cfg
log_event_handlers=1
event_handler_timeout=30
enable_event_handlers=1

# /etc/nagios/definitions.cfg (object configuration file)
define service{
        name                    generic-service ; The 'name' of this service template
        ...
        event_handler_enabled   1               ; Service event handler is enabled
        register                0
        }

define service{
        use                     ti-service
        host_name               quagmire
        service_description     Argus Daemon
        check_command           check_nrpe!check_proc_argus
        event_handler           handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
        }

# /etc/nagios/commands.cfg
define command{
        command_name    handler_restart_service_openbsd
        command_line    $USER1$/eventhandlers/service-restart-openbsd.sh $HOSTADDRESS$ $ARG2$ $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $ARG1$
        }

# service-restart-openbsd.sh:
$ ls -lZ /usr/lib64/nagios/plugins/eventhandlers/
-rwxr-xr-x  root root system_u:object_r:bin_t  service-restart-linux.sh
-rwxr-xr-x  root root system_u:object_r:bin_t  service-restart-openbsd.sh

----- snip -----
#!/bin/sh
#
# $Id$
#
# Event handler script for restarting a service. The idea of a "service" on
# OpenBSD doesn't really work as it doesn't use a SysV init but a monolithic
# rc. For this reason we call a script on the remote server and don't
# parameterize paths to an init script in this handler.
#
# [Attribution] taken from example in Nagios documentation at:
# http://nagios.sourceforge.net/docs/2_0/eventhandlers.html
#
# Note: This script will only restart the service if the service is
# retried 3 times (in a "soft" state) or if the service somehow
# manages to fall into a "hard" error state.
#

# Host to connect to
DST_HOST="$1"
# User to connect as via SSH
DST_USER="$2"
# Service state (OK, WARNING, etc.)
SVC_STATE="$3"
# Service type (SOFT, HARD, etc.)
SVC_STATE_TYPE="$4"
# Service attempt (3, 4, etc.)
SVC_STATE_ATTEMPT="$5"
# Script name (full path.)
SVC_NAME="$6"

case "$SVC_STATE" in
# Only deal with services that have dropped to CRITICAL state.
CRITICAL)
    case "$SVC_STATE_TYPE" in
    # SOFT failures we deal with once it becomes apparent that
    # the failure is definite (on the third failure, before
    # notifications are sent out.)
    SOFT)
        case "$SVC_STATE_ATTEMPT" in
        3)
            /usr/bin/ssh -tt -l $DST_USER $DST_HOST sudo $SVC_NAME
            ;;
        esac
        ;;
    HARD)
        # If we hit a HARD failure, attempt to deal with it one
        # last time.
        /usr/bin/ssh -tt -l $DST_USER $DST_HOST sudo $SVC_NAME
        ;;
    esac
    ;;
esac

# Eventhandlers should always exit successfully, apparently.
exit 0
----- /snip -----

Here are the logs showing detection of the service reaching the two states the event handler script should act on (SOFT/3 and HARD):

[1226710816] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;SOFT;1;PROCS CRITICAL: 0 processes with command name 'argus'
[1226710816] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;SOFT;1;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
[1226710876] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;SOFT;2;PROCS CRITICAL: 0 processes with command name 'argus'
[1226710876] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;SOFT;2;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
[1226710936] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;SOFT;3;PROCS CRITICAL: 0 processes with command name 'argus'
[1226710936] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;SOFT;3;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
[1226710996] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;HARD;4;PROCS CRITICAL: 0 processes with command name 'argus'
[1226710996] SERVICE NOTIFICATION: ti;quagmire;Argus Daemon;CRITICAL;notify-by-email;PROCS CRITICAL: 0 processes with command name argus
[1226710996] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;HARD;4;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe

I can manually invoke it and have it succeed:

$ su - nagios
$ sh -x /usr/lib64/nagios/plugins/eventhandlers/service-restart-openbsd.sh quagmire.local _nrpe CRITICAL HARD 4 start-argus
+ DST_HOST=quagmire.local
+ DST_USER=_nrpe
+ SVC_STATE=CRITICAL
+ SVC_STATE_TYPE=HARD
+ SVC_STATE_ATTEMPT=4
+ SVC_NAME=start-argus
+ case "$SVC_STATE" in
+ case "$SVC_STATE_TYPE" in
+ /usr/bin/ssh -tt -l _nrpe quagmire.local sudo /usr/local/bin/start-argus
Connection to quagmire.local closed.
+ exit 0

When invoked manually, I see the sshd log entry for the connection on the remote end and a sudo log entry for the command execution. When Nagios kicks off the event handler, neither of these log entries appears on the remote side.

Any clue where else to look?

-- 
Darren Spruell
[EMAIL PROTECTED]
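
One way to narrow this down might be to record how the handler is actually started when Nagios runs it and compare that with a manual run. What follows is a minimal diagnostic sketch, not part of the original script: the function names should_restart and log_context and the log path /tmp/handler-debug.log are hypothetical. It restates the handler's case logic as a function that can be checked on its own, and it logs the invoking user, HOME, and whether stdin is a terminal (possibly relevant because the script uses "ssh -tt"). It only records context and does not by itself explain the failure.

```shell
#!/bin/sh
# Diagnostic sketch; helper names and log path are hypothetical.

# Return 0 when the handler's case logic would attempt a restart for the
# given state, state type, and attempt number; return 1 otherwise.
should_restart() {
    state="$1"; state_type="$2"; attempt="$3"
    case "$state" in
        CRITICAL)
            case "$state_type" in
                # SOFT failures are acted on only at the third attempt.
                SOFT) [ "$attempt" = "3" ] && return 0 ;;
                # HARD failures are always acted on.
                HARD) return 0 ;;
            esac
            ;;
    esac
    return 1
}

# Append one line describing the invocation context to the log file named
# in the first argument; remaining arguments are logged as the handler args.
log_context() {
    logfile="$1"; shift
    if [ -t 0 ]; then tty_state="tty"; else tty_state="no-tty"; fi
    printf '%s user=%s stdin=%s home=%s args=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$(id -un)" "$tty_state" \
        "${HOME:-unset}" "$*" >> "$logfile"
}
```

Calling log_context /tmp/handler-debug.log "$@" at the top of the handler, and appending ">> /tmp/handler-debug.log 2>&1" to the two ssh lines, would capture any error output from ssh that might otherwise go unseen when Nagios starts the script.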