Hey, I don’t know what your servers or networks are busy with when monit fails. But if I were you I would try to find out why monit reports failure and nmap does not. See my suggestions in my last mail.
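For the parallel test, a rough sketch like the one below might be a starting point. It is only an illustration, not something monit ships: the function names (probe_port, format_result, check_ports) and the log path in the commented loop are made up, and it assumes bash's /dev/tcp redirection and the coreutils timeout command are available. It makes a plain TCP connect with an explicit timeout and logs in roughly the same shape as your nmap log, so the timestamps can be compared with monit's alerts.

```bash
#!/bin/bash
# Hypothetical helper (not part of monit): probe TCP ports with an explicit
# connect timeout and print one timestamped block per run.

# probe_port HOST PORT TIMEOUT_SECONDS
# Returns 0 if a TCP connection could be opened within the timeout.
probe_port() {
    local host=$1 port=$2 secs=$3
    # Inside the child shell, $0 is the host and $1 is the port.
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# format_result PORT EXIT_STATUS
# Prints "PORT/tcp<TAB>open" for status 0, "PORT/tcp<TAB>failed" otherwise.
format_result() {
    local port=$1 status=$2
    if [ "$status" -eq 0 ]; then
        printf '%s/tcp\topen\n' "$port"
    else
        printf '%s/tcp\tfailed\n' "$port"
    fi
}

# check_ports HOST TIMEOUT_SECONDS PORT...
# Prints a UTC timestamp followed by one result line per port.
check_ports() {
    local host=$1 secs=$2 port status
    shift 2
    date -u
    for port in "$@"; do
        status=0
        probe_port "$host" "$port" "$secs" || status=$?
        format_result "$port" "$status"
    done
}

# Example loop, commented out so the file can be sourced safely
# (the log path is an assumption):
# while true; do
#     check_ports somehost.com 5 80 843 2121 8080 >> /tmp/portprobe.log
#     sleep 30
# done
```

Running it with the same timeout monit uses (5 seconds, going by your question) should make the two comparable. If this log shows "failed" at the same times as monit's alerts, the cause is more likely the network or the server than monit itself.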
> On 29.08.2017 at 06:08, Rizal Muttaqin <[email protected]> wrote:
>
> So basically the default timeout (5 seconds) is too short?
>
> Right now the failed-succeeded interval is back to 5 minutes again:
> http://imgur.com/a/FithN
>
>
> On 28/08/17 17:09, Tino Hendricks wrote:
>> Then I think the only possibility is you’re running into timeouts.
>> I’d play around with some values here
>> https://mmonit.com/monit/documentation/monit.html#CONNECTION-TESTS
>>
>> And maybe let a ping, traceroute or recurring nmap run in parallel for
>> testing purposes so you can be _sure_ that it’s a monit problem (and not a
>> temporary network problem that went away by the time your checkscript kicks
>> in).
>>
>> Tino
>>
>>
>> On 28 August 2017 at 12:05:32, Rizal Muttaqin
>> ([email protected]) wrote:
>>
>>> Yep, the Nmap script was made for double-checking functionality.
>>> Normally, I have to check the network port status manually after monit
>>> sends an alert. I need to automate it with the Nmap script, especially
>>> when I'm not online. What I'm not sure about is why the monit alert and
>>> the Nmap output file report different things.
>>> Nmap never reports anything other than open status:
>>> ####################
>>> Thu Aug 24 16:07:23 UTC 2017
>>> 80/tcp open
>>> 843/tcp open
>>> 2121/tcp open
>>> 8080/tcp open
>>> Thu Aug 24 20:46:23 UTC 2017
>>> 80/tcp open
>>> 843/tcp open
>>> 2121/tcp open
>>> 8080/tcp open
>>> Fri Aug 25 13:04:42 UTC 2017
>>> 80/tcp open
>>> 843/tcp open
>>> 2121/tcp open
>>> 8080/tcp open
>>> Fri Aug 25 21:11:23 UTC 2017
>>> 80/tcp open
>>> 843/tcp open
>>> 2121/tcp open
>>> 8080/tcp open
>>> Sun Aug 27 07:49:19 UTC 2017
>>> 80/tcp open
>>> 843/tcp open
>>> 2121/tcp open
>>> 8080/tcp open
>>> Mon Aug 28 00:33:10 UTC 2017
>>> 80/tcp open
>>> 843/tcp open
>>> 2121/tcp open
>>> 8080/tcp open
>>> #######################
>>>
>>> On 28/08/17 16:37, Tino Hendricks wrote:
>>>> Rizal,
>>>>
>>>> looking at your script I think you mixed up functionality:
>>>> The check whether a network port is open or not is done with the
>>>> "if failed port…" statements. The "start/stop program" is meant for the
>>>> case where these checks fail and you tell monit to "restart".
>>>> So if you’re not happy with the built-in checks that monit offers
>>>> https://mmonit.com/monit/documentation/monit.html#CONNECTION-TESTS
>>>> you need to put your checkport.sh in the test part of monit’s config.
>>>>
>>>> Something like
>>>>
>>>> check program checkport.sh with path /opt/monit/scripts/checkport.sh
>>>>   if status != 0 then alert
>>>>
>>>> more examples:
>>>> https://mmonit.com/monit/documentation/monit.html#PROGRAM-STATUS-TEST
>>>>
>>>> HTH
>>>>
>>>> Tino
>>>>
>>>>
>>>> On 28 August 2017 at 05:25:49, Rizal Muttaqin
>>>> ([email protected]) wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> This is my first time using monit. Basically, I have two servers with
>>>>> several services running on them, and I want monit to check whether
>>>>> some ports (belonging to those services) are listening or not.
>>>>> The configuration is relatively simple: monit checks the port status,
>>>>> and when a check fails monit starts an nmap bash script that writes
>>>>> the status to a file. In addition, in the same check, monit sends an
>>>>> alert after 5 cycles.
>>>>>
>>>>> The problem is that monit sends a connection failed alert and then,
>>>>> 5 minutes later, a connection succeeded alert, yet when I check the
>>>>> nmap script's log there is no failed port (filtered or closed); the
>>>>> port status is always open. I've also checked manually with nmap when
>>>>> monit sends a failed alert, but the result is always the same: the
>>>>> port status is open.
>>>>>
>>>>> Why does monit always send a failure alert when the port is open, and
>>>>> why do I see connection succeeded at the next 5-minute interval? I
>>>>> changed set daemon to 30, and the alert interval became 1.5 minutes;
>>>>> I then reverted set daemon to 300, but now the alert interval is
>>>>> always 1.5 minutes.
>>>>>
>>>>> This is my /etc/monitrc configuration for the first server (the other
>>>>> server's configuration is exactly the same):
>>>>>
>>>>> #####################################################
>>>>>
>>>>> set daemon 300 # check services at 300-second (5-minute) intervals
>>>>>
>>>>> check host somehost with address somehost.com
>>>>>   start program = "/opt/monit/scripts/checkport.sh start"
>>>>>   stop program = "/opt/monit/scripts/checkport.sh stop"
>>>>>   if failed port 80 then restart
>>>>>   if failed port 843 then restart
>>>>>   if failed port 2121 then restart
>>>>>   if failed port 8080 then restart
>>>>>   if failed port 80 for 5 cycles then alert
>>>>>   if failed port 843 for 5 cycles then alert
>>>>>   if failed port 2121 for 5 cycles then alert
>>>>>   if failed port 8080 for 5 cycles then alert
>>>>>   alert [email protected] with reminder on 5 cycles
>>>>>
>>>>> ########################################################
>>>>>
>>>>> and this is my /opt/monit/checkport.sh script
>>>>>
>>>>> ########################################################
>>>>>
>>>>> #!/bin/bash
>>>>>
>>>>> case $1 in
>>>>>   start)
>>>>>     nmap -p 80,843,2121,8080 -P0 somehost.com -oG-| awk 'NR>=6 && NR<=9 {print $1 "\t" $2}' |
>>>>>       cat >> /opt/monit/log/checkedport |
>>>>>       date >> /opt/monit/log/checkedport & echo $! > /var/run/checkport.pid ;
>>>>>     ;;
>>>>>   stop)
>>>>>     pkill -F /var/run/checkport.pid ;;
>>>>>   *)
>>>>>     echo "usage: checkport {start|stop}" ;;
>>>>> esac
>>>>> exit 0
>>>>> #########################################################
>>>>>
>>>>>
>>>>> --
>>>>> To unsubscribe:
>>>>> https://lists.nongnu.org/mailman/listinfo/monit-general
