From back in January: there has been other discussion of similar issues, but no discussion of, or traction on, this bug (and of course there is no medium to report it, track it, document it, or submit bugs/patches).

The problem is that the exact same fork/exec code is used for:

 - Service/Host Checks
 - Event Handlers
 - Notify Commands
 - OCHP/OCSP Handler
 - Performance Data Handlers

Result codes are explicitly registered with the API. 126 and 127 are also checked for explicitly and warned about/logged (but only in recent versions). Of course, 0, 1, 2, 3 are evaluated as Service/Host check API values.

    /* check for possibly missing scripts/binaries/etc */
    if(result==126 || result==127){

The problem is that 0, 1, 2, 3 and anything other than 126/127 can have different connotations for forks that are not host/service checks, but the function called, my_system(), has no way of distinguishing its caller in order to change its logging behavior, which it should.

The problem is further complicated by rampant use of pipes and other exotic Bourne-style expressions in command_line variables within Nagios (one book in particular set this in motion), which, depending on how compliant a given Bourne shell is, can behave differently on various systems.

Examples are below. They are by no means meant to be definitive, as how bash(1) forks may be entirely different from how exec() behaves. Embedded Perl could also further complicate things (but of course).

Solution 1:
 - Teach my_system() to behave differently for non-healthcheck forks

Solution 2:
 - Call a shell script wrapper for OCSP/OCHP/Perf/Event/Notify

Solution 3 (added begrudgingly):
 - Tell send_nsca and other builtins to use error codes > 3. 4->125 and 129->231 are available, but this doesn't fix the problems with pipes outlined below.

~BAS

$ echo test | /doesntexist
-bash: /doesntexist: No such file or directory
$ echo $?
127
$ /doesntexist | echo foo
foo
-bash: /doesntexist: No such file or directory
$ echo $?
0
$ echo > test.sh
$ chmod -x test.sh
$ /home/seklecki/test.sh
-bash: /home/seklecki/test.sh: Permission denied
$ echo $?
126
$ echo test | ./test.sh
-bash: ./test.sh: Permission denied
$ echo $?
126
$ ./test.sh | echo foo
-bash: ./test.sh: Permission denied
foo
$ echo $?
0
$ echo fuck shit ass | /usr/local/sbin/send_nsca -H cock.gobbling.asshat
Could not open config file 'send_nsca.cfg' for reading.
Error: Config file 'send_nsca.cfg' contained errors...
$ echo $?
2

On Wed, 2008-01-02 at 15:06 -0500, Brian A. Seklecki wrote:
> What happens if ocsp/ohcp commands return non-zero status?
>
> # send_nsca -H doesnt.fucking.exist -c foo/etc/nagios/send_nsca.cfg
> Invalid host name 'doesnt.fucking.exist'
> Error: Could not connect to host doesnt.fucking.exist on port 5667
> # echo $?
> 2
>
> When this happens, its a very serious problem. Nothing is logged. This
> results in a silent failure.
>
> Obviously, send_nsca should transmit to a hostname in hosts(5) and/or to
> an IP address that is highly available resolving any dependency on DNS.
>
> But even with that in mind, this exec()/fork() model behavior is
> pragmatically incorrect.
>
> The code should be checking result code for return values != 0, and
> printing a critical error to the logs.
>
> Moreover, even with debug_level=99999999999999999999999999
>
> No warning / error / notice occurs:
>
>
> [1199302511.076169] [001.0] [pid=75615] handle_host_state()
> [1199302511.076189] [001.0] [pid=75615]
> obsessive_compulsive_host_check_processor()
> [1199302511.076229] [001.0] [pid=75615] get_raw_command_line()
> [1199302511.076261] [2320.2] [pid=75615] Raw Command Input: /bin/echo
> $HOSTNAME$//$HOSTSTATEID$//'$HOSTOUTPUT$' | /usr/local/sbin/send_nsca -H
> fbsd01.cfi.biz -c /usr/local/etc/nagios/send_nsca.cfg -d "//"
> [1199302511.076284] [2320.2] [pid=75615] Expanded Command
> Output: /bin/echo $HOSTNAME$//$HOSTSTATEID$//'$HOSTOUTPUT$'
> | /usr/local/sbin/send_nsca -H fbsd01.cfi.biz
> -c /usr/local/etc/nagios/send_nsca.cfg -d "//"
> [1199302511.076289] [016.2] [pid=75615] Raw obsessive compulsive host
> processor command line: /bin/echo $HOSTNAME$//$HOSTSTATEID
> $//'$HOSTOUTPUT$' | /usr/local/sbin/send_nsca -H fbsd01.cfi.biz
> -c /usr/local/etc/nagios/send_nsca.cfg -d "//"
> [1199302511.076664] [001.0] [pid=75615] process_macros()
> [1199302511.076683] [2048.1] [pid=75615] **** BEGIN MACRO PROCESSING
> ***********
> [1199302511.076700] [2048.1] [pid=75615] Processing: '/bin/echo
> $HOSTNAME$//$HOSTSTATEID$//'$HOSTOUTPUT$' | /usr/local/sbin/send_nsca -H
> fbsd01.cfi.biz -c /usr/local/etc/nagios/send_nsca.cfg -d "//"'
> [1199302511.076720] [2048.2] [pid=75615] Processing part: '/bin/echo '
> [1199302511.076739] [2048.2] [pid=75615] Not currently in macro.
> Running output (10): '/bin/echo '
> [1199302511.076758] [2048.2] [pid=75615] Processing part: 'HOSTNAME'
> [1199302511.076780] [2048.2] [pid=75615] Uncleaned macro. Running
> output (16): '/bin/echo fbsd01'
> [1199302511.077138] [2048.2] [pid=75615] Just finished macro. Running
> output (16): '/bin/echo fbsd01'
> [1199302511.077157] [2048.2] [pid=75615] Processing part: '//'
> [1199302511.077176] [2048.2] [pid=75615] Not currently in macro.
> Running output (18): '/bin/echo fbsd01//'
> [1199302511.077194] [2048.2] [pid=75615] Processing part:
> 'HOSTSTATEID'
> [1199302511.077216] [2048.2] [pid=75615] Uncleaned macro. Running
> output (19): '/bin/echo fbsd01//0'
> [1199302511.077235] [2048.2] [pid=75615] Just finished macro. Running
> output (19): '/bin/echo fbsd01//0'
> [1199302511.077254] [2048.2] [pid=75615] Processing part: '//''
> [1199302511.077273] [2048.2] [pid=75615] Not currently in macro.
> Running output (22): '/bin/echo fbsd01//0//''
> [1199302511.077291] [2048.2] [pid=75615] Processing part: 'HOSTOUTPUT'
> [1199302511.079235] [2048.2] [pid=75615] Uncleaned macro. Running
> output (63): '/bin/echo fbsd01//0//'PING OK - Packet loss = 0%, RTA =
> 0.97 ms'
> [1199302511.079337] [2048.2] [pid=75615] Just finished macro. Running
> output (63): '/bin/echo fbsd01//0//'PING OK - Packet loss = 0%, RTA =
> 0.97 ms'
> [1199302511.079805] [2048.2] [pid=75615] Processing part: ''
> | /usr/local/sbin/send_nsca -H fbsd01.cfi.biz
> -c /usr/local/etc/nagios/send_nsca.cfg -d "//"'
> [1199302511.080427] [2048.2] [pid=75615] Not currently in macro.
> Running output (157): '/bin/echo fbsd01//0//'PING OK - Packet loss = 0%,
> RTA = 0.97 ms' | /usr/local/sbin/send_nsca -H fbsd01.cfi.biz
> -c /usr/local/etc/nagios/send_nsca.cfg -d "//"'
> [1199302511.081348] [2048.1] [pid=75615] Done. Final output:
> '/bin/echo fbsd01//0//'PING OK - Packet loss = 0%, RTA = 0.97 ms'
> | /usr/local/sbin/send_nsca -H fbsd01.cfi.biz
> -c /usr/local/etc/nagios/send_nsca.cfg -d "//"'
> [1199302511.081823] [2048.1] [pid=75615] **** END MACRO PROCESSING
> *************
> [1199302511.082308] [016.2] [pid=75615] Processed obsessive compulsive
> host processor command line: /bin/echo fbsd01//0//'PING OK - Packet loss
> = 0%, RTA = 0.97 ms' | /usr/local/sbin/send_nsca -H fbsd01.cfi.biz
> -c /usr/local/etc/nagios/send_nsca.cfg -d "//"
> [1199302511.083217] [001.0] [pid=75615] my_system()
> [1199302511.084280] [256.1] [pid=75615] Running command '/bin/echo
> fbsd01//0//'PING OK - Packet loss = 0%, RTA = 0.97 ms'
> | /usr/local/sbin/send_nsca -H fbsd01.cfi.biz
> -c /usr/local/etc/nagios/send_nsca.cfg -d "//"'...
> [1199302511.091760] [001.0] [pid=80369] process_macros()
> [1199302511.092248] [001.0] [pid=80369] process_macros()
> [1199302511.093349] [001.0] [pid=80369] process_macros()
> [1199302511.094769] [001.0] [pid=80369] process_macros()
> [1199302511.095734] [001.0] [pid=80369] process_macros()
> [1199302511.096702] [001.0] [pid=80369] process_macros()

_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
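
For what it is worth, Solution 2 (the wrapper script) could look roughly like the sketch below. This is only an illustration under stated assumptions: the function name submit_passive is made up for this example and is not part of Nagios or NSCA, and writing the failure message to stderr is a placeholder (whether Nagios records stderr from these commands is not established in this thread; logger(1) may be the better target). The point is to keep the pipe out of command_line, so that the status being checked always belongs to the command that consumes the data.

```shell
#!/bin/sh
# Hypothetical wrapper for non-check commands (OCSP/OCHP, perfdata,
# event handlers, notifications). Not part of Nagios; names are made up.

# submit_passive PAYLOAD CMD [ARGS...]
# Feeds PAYLOAD to CMD on stdin. The left side of the pipe is the shell's
# own printf, so the pipeline's exit status is CMD's exit status
# (including 126/127 when CMD cannot be executed or does not exist).
submit_passive() {
    payload=$1
    shift
    printf '%s\n' "$payload" | "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        # Assumption: stderr is a placeholder; swap in logger(1) or a
        # log file as appropriate for the site.
        echo "CRITICAL: $1 exited with status $status" >&2
    fi
    return "$status"
}
```

A script containing this function would then be called from command_line with the already-expanded macro string as its first argument, and would run something like submit_passive "$1" /usr/local/sbin/send_nsca -H fbsd01.cfi.biz -c /usr/local/etc/nagios/send_nsca.cfg -d "//". Any non-zero status from send_nsca, such as the 2 seen above, would at least produce a message instead of failing silently.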