Re: [Nagios-users] Dynamic warning/critical thresholds
On 22/06/12 15:11, Jonathan Gazeley wrote:
> I've got a bunch of Nagios plugins that monitor things like DNS/HTTP/RADIUS hits per second. I've set what I believe to be sensible max/min warning thresholds but what I really want is dynamic thresholds. If some quantity suddenly doubles or halves, I'd like an alert. For example, if I usually serve 10 DNS lookups per second, and suddenly it is doing 20 per second, that isn't a fault but I would like to know about it, because it might mean there is a problem with the network in general.
>
> Is there a way of doing this?

Any ideas?

Thanks,
Jonathan

--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats.
http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] Dynamic warning/critical thresholds
On 22/06/12 15:11, Jonathan Gazeley wrote:
> I've got a bunch of Nagios plugins that monitor things like DNS/HTTP/RADIUS hits per second. I've set what I believe to be sensible max/min warning thresholds but what I really want is dynamic thresholds. If some quantity suddenly doubles or halves, I'd like an alert. For example, if I usually serve 10 DNS lookups per second, and suddenly it is doing 20 per second, that isn't a fault but I would like to know about it, because it might mean there is a problem with the network in general.
>
> Is there a way of doing this?
>
> Any ideas?

You've already received two replies, both stating that you'll likely have to write some code to do it. I'm not aware of any common plugins out there that calculate rates of change and alert appropriately. Maybe they exist, but I don't recall seeing any of them. Have you tried any of the plugin sites?

--
Death rays, advanced technology or not, no creature wants to be stabbed in their hoo-hoo. -- Seen on zombiehunters.org
Re: [Nagios-users] Dynamic warning/critical thresholds
On 10/07/12 14:47, C. Bensend wrote:
> On 22/06/12 15:11, Jonathan Gazeley wrote:
>> I've got a bunch of Nagios plugins that monitor things like DNS/HTTP/RADIUS hits per second. I've set what I believe to be sensible max/min warning thresholds but what I really want is dynamic thresholds. If some quantity suddenly doubles or halves, I'd like an alert. For example, if I usually serve 10 DNS lookups per second, and suddenly it is doing 20 per second, that isn't a fault but I would like to know about it, because it might mean there is a problem with the network in general.
>>
>> Is there a way of doing this?
>>
>> Any ideas?
>
> You've already received two replies, both stating that you'll likely have to write some code to do it. I'm not aware of any common plugins out there that calculate rates of change and alert appropriately. Maybe they exist, but I don't recall seeing any of them. Have you tried any of the plugin sites?

Oh, I didn't receive any replies. Presumably the mails got lost in the ether. I'm happy to write code - I just wondered if there was a built-in way of doing this.

Thanks for your response,
Jonathan
Re: [Nagios-users] Dynamic warning/critical thresholds
>> You've already received two replies, both stating that you'll likely have to write some code to do it. I'm not aware of any common plugins out there that calculate rates of change and alert appropriately. Maybe they exist, but I don't recall seeing any of them. Have you tried any of the plugin sites?
>
> Oh, I didn't receive any replies. Presumably the mails got lost in the ether. I'm happy to write code - I just wondered if there was a built-in way of doing this.

Not to my knowledge, no - the standard Nagios plugins don't know about rate of change, and I haven't run across many (any?) third-party plugins that do. The difficult part is retaining state - yes, it's simple to use a state file, but if you have a lot of services you could end up with thousands of state files. It can become pretty ugly to deal with them.

Your original message (and consequently, the replies you missed) can be found here:

http://marc.info/?l=nagios-users&m=134037453807273&w=2
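For readers wondering what the state-file approach mentioned above might look like, here is a minimal sketch. It assumes integer rates and one state file per service; the function names `rate_status` and `check_with_state` and the state-file layout are hypothetical, not taken from any stock plugin.

```shell
#!/bin/sh
# Hypothetical helpers; function names and the state-file layout are
# assumptions for illustration, not part of any stock Nagios plugin.

# rate_status PREV CUR
# Prints a status line and returns a Nagios exit code (0 = OK, 1 = WARNING).
# Both arguments are non-negative integer rates.
rate_status() {
    prev=$1
    cur=$2
    if [ "$prev" -le 0 ]; then
        echo "OK: no baseline yet (current=$cur)"
        return 0
    fi
    if [ "$cur" -ge $((prev * 2)) ]; then
        echo "WARNING: rate at least doubled ($prev -> $cur)"
        return 1
    fi
    if [ $((cur * 2)) -le "$prev" ]; then
        echo "WARNING: rate at least halved ($prev -> $cur)"
        return 1
    fi
    echo "OK: rate $cur (previous $prev)"
    return 0
}

# check_with_state STATEFILE CUR
# Loads the previous rate from STATEFILE, stores CUR for the next run,
# then compares the two.
check_with_state() {
    statefile=$1
    new=$2
    old=0
    [ -r "$statefile" ] && old=$(cat "$statefile")
    # Treat an empty or non-numeric state file as "no baseline".
    case $old in
        ''|*[!0-9]*) old=0 ;;
    esac
    echo "$new" > "$statefile"
    rate_status "$old" "$new"
}
```

Because the baseline is simply the previous sample, a sustained doubling would warn once and then settle as the new normal. A plugin built this way still needs one state file per host and service, which is exactly the housekeeping problem described above.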
Re: [Nagios-users] Dynamic warning/critical thresholds
As indicated, make a plugin that gets the info, and set the thresholds so high that they're never likely to ring the red bell. In addition, I'd set it up with something like nagiosgraph to generate graphs that you can watch. This will save and show historical data and show you the norm, so you may wish to set a threshold at a later date based on this empirical data.

If you don't want to go to the trouble of nagiosgraph, your plugin can still email you when the rate reaches a threshold you define (again, without setting off Nagios warning/critical alarms).

On Tue, Jul 10, 2012 at 10:28 AM, Jonathan Gazeley jonathan.gaze...@bristol.ac.uk wrote:
> On 10/07/12 14:47, C. Bensend wrote:
>> On 22/06/12 15:11, Jonathan Gazeley wrote:
>>> I've got a bunch of Nagios plugins that monitor things like DNS/HTTP/RADIUS hits per second. I've set what I believe to be sensible max/min warning thresholds but what I really want is dynamic thresholds. If some quantity suddenly doubles or halves, I'd like an alert. For example, if I usually serve 10 DNS lookups per second, and suddenly it is doing 20 per second, that isn't a fault but I would like to know about it, because it might mean there is a problem with the network in general.
>>>
>>> Is there a way of doing this?
>>>
>>> Any ideas?
>>
>> You've already received two replies, both stating that you'll likely have to write some code to do it. I'm not aware of any common plugins out there that calculate rates of change and alert appropriately. Maybe they exist, but I don't recall seeing any of them. Have you tried any of the plugin sites?
>
> Oh, I didn't receive any replies. Presumably the mails got lost in the ether. I'm happy to write code - I just wondered if there was a built-in way of doing this.
>
> Thanks for your response,
> Jonathan
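The "never alarm, just graph it" idea can be sketched as a check that always exits OK and hands the value over as performance data, which a graphing add-on can then pick up. This is only an illustration: `report_rate` and the `dns_qps` label are hypothetical names, and how the graphing tool maps the label is left to its own configuration.

```shell
#!/bin/sh
# Hypothetical helper; the label is whatever name you want on the graph.

# report_rate LABEL VALUE
# Always returns OK (0) so Nagios never alarms; the value after the "|"
# is performance data in the standard label=value form.
report_rate() {
    echo "OK: $1 is $2 | $1=$2"
    return 0
}
```

For example, `report_rate dns_qps 12` prints `OK: dns_qps is 12 | dns_qps=12`. Once the graphs show what normal looks like, real warning and critical values can be added later.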
[Nagios-users] Dynamic warning/critical thresholds
I've got a bunch of Nagios plugins that monitor things like DNS/HTTP/RADIUS hits per second. I've set what I believe to be sensible max/min warning thresholds but what I really want is dynamic thresholds.

If some quantity suddenly doubles or halves, I'd like an alert. For example, if I usually serve 10 DNS lookups per second, and suddenly it is doing 20 per second, that isn't a fault but I would like to know about it, because it might mean there is a problem with the network in general.

Is there a way of doing this?

Thanks,
Jonathan
Re: [Nagios-users] Dynamic warning/critical thresholds
I do something similar in some places, but I think they're all custom checks. For example, the number of DNS queries per second in BIND query.log. Rather than set some (static) threshold, I warn if one host has more than 2x the queries of the next ranked host.

If it were me, I would write a shell wrapper around the existing Nagios check to determine the dynamic thresholds, then exec the stock plugin (perhaps with a longer check_interval since it will be somewhat more expensive). For DNS, it might look like (untested):

    f=/var/cache/bind/query.log
    # Timestamps (date and time fields) of the first and last log entries
    x=`awk '{print $1,$2; exit}' $f`
    t0=`date -d "$x" +%s`
    x=`tail -n 1 $f | awk '{print $1,$2}'`
    t1=`date -d "$x" +%s`
    d=$(($t1-$t0))      # seconds covered by the log
    n=`wc -l < $f`      # number of queries logged
    [ $d -gt 0 ] || { echo "OK: log covers no time yet"; exit 0; }
    r=$(($n/$d))        # average queries per second
    [ $r -eq 0 ] && { echo "OK: very low $n/$d"; exit 0; }
    exec /usr/lib/nagios/plugins/check_ -w $((2*$r)) -c $((4*$r)) -f $f -t 60

That is kind of contrived; I don't know if there's actually a check for DNS query rate (?) The execed check above is doing essentially nothing that the shell isn't already doing, so you could also write code to test the query rate for just the most recent N minutes, and test if it is above 2*$r or 4*$r.

Justin

On Fri, Jun 22, 2012 at 03:11:05PM +0100, Jonathan Gazeley wrote:
> I've got a bunch of Nagios plugins that monitor things like DNS/HTTP/RADIUS hits per second. I've set what I believe to be sensible max/min warning thresholds but what I really want is dynamic thresholds. If some quantity suddenly doubles or halves, I'd like an alert. For example, if I usually serve 10 DNS lookups per second, and suddenly it is doing 20 per second, that isn't a fault but I would like to know about it, because it might mean there is a problem with the network in general.
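The "more than 2x the next ranked host" comparison described in this message could be sketched roughly as follows. It assumes the per-host query counts have already been extracted into "host count" lines, since the log format varies; `top_host_status` is a hypothetical name, not an existing plugin.

```shell
#!/bin/sh
# Hypothetical helper; expects pre-aggregated "host count" lines on stdin.

# top_host_status
# Ranks hosts by count and returns 1 (WARNING) when the busiest host has
# more than twice the count of the runner-up, otherwise 0 (OK).
top_host_status() {
    sort -k2,2nr | awk '
        NR == 1 { top_host = $1; top = $2 }
        NR == 2 { next_count = $2 }
        END {
            if (NR < 2) {
                print "OK: fewer than two hosts seen"
                exit 0
            }
            if (top > 2 * next_count) {
                printf "WARNING: %s has %d queries, next host has %d\n", top_host, top, next_count
                exit 1
            }
            printf "OK: top host %s has %d queries, next host has %d\n", top_host, top, next_count
            exit 0
        }'
}
```

The threshold here is relative to the other hosts rather than to a fixed number, so it keeps working as overall traffic grows.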
Re: [Nagios-users] Dynamic warning/critical thresholds
> I've got a bunch of Nagios plugins that monitor things like DNS/HTTP/RADIUS hits per second. I've set what I believe to be sensible max/min warning thresholds but what I really want is dynamic thresholds. If some quantity suddenly doubles or halves, I'd like an alert. For example, if I usually serve 10 DNS lookups per second, and suddenly it is doing 20 per second, that isn't a fault but I would like to know about it, because it might mean there is a problem with the network in general.
>
> Is there a way of doing this?

There's always a way. :) However, in this case, you're probably going to have to write a plugin to do it. You're asking to alert on a rate of change, and I can't think of any of the stock plugins that do that. Keeping state between polling runs is something that can get a bit ugly.

Do some rooting around the plugin community (the Nagios Exchange and/or the Monitoring Exchange) to see if you can find some examples of rate-aware plugins. While it's not rate that it's tracking, I know the check_iptraf*.pl plugins will at least keep state between polling cycles, so that might be somewhere to start.

Benny
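As a rough illustration of why state has to be kept between polling runs: if the thing being polled is a running counter, a rate only exists as the difference between two samples. The sketch below assumes epoch timestamps and integer counters saved from the previous run; `counter_rate` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper; timestamps are epoch seconds, counters are integers.

# counter_rate PREV_TIME PREV_COUNT NOW_TIME NOW_COUNT
# Prints the integer rate per second. Returns 3 (Nagios UNKNOWN) when no
# time has passed or the counter went backwards (e.g. after a restart).
counter_rate() {
    elapsed=$(( $3 - $1 ))
    delta=$(( $4 - $2 ))
    if [ "$elapsed" -le 0 ] || [ "$delta" -lt 0 ]; then
        echo "UNKNOWN: cannot compute rate (elapsed=$elapsed delta=$delta)"
        return 3
    fi
    echo $(( delta / elapsed ))
    return 0
}
```

For example, `counter_rate 1000 500 1060 1100` prints `10`: 600 new queries over 60 seconds.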