Does anyone have a good guide to how the check_interval setting affects uptime and availability figures calculated from Nagios logs?
For example, if your check_interval is set to 10 minutes, a service could be down for 9 minutes and never register in Nagios. At that point your availability numbers can't claim anything better than 99.99%, since the cutoff for "five nines" (99.999%) is 5.26 minutes of outage a year and a single missed 9-minute outage already exceeds it. While unlikely, six such outages (54 minutes) would exceed the roughly 52.6-minute yearly budget for 99.99% and push you into 99.9% territory, and an SLA report generated from Nagios log files would still report 100%.

If check_interval is set to 30 minutes, the problem is amplified: Nagios is more likely to miss events, which makes me even less comfortable with the resulting statistics.

Are there SLA packages for Nagios that account for this, or does Nagios's built-in reporting engine account for it in some way? Or is there a statistician among us who can show me that the math actually works out, and that I'm just being overly paranoid?

--
Breandan Dezendorf
brean...@dezendorf.com
bwdez...@gmail.com
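
To make the arithmetic behind the 9-minute example concrete, here is a rough Python sketch. It is not anything Nagios provides: the function names are my own, it assumes a 365-day year, and the miss-probability estimate assumes instantaneous checks on a fixed schedule with an outage that starts at a uniformly random point between checks. It ignores retry_interval and soft/hard state handling, so treat it as a back-of-the-envelope model only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 365-day year, 525,600 minutes


def downtime_budget_minutes(target_pct):
    """Yearly downtime (minutes) allowed at a target availability percentage."""
    return MINUTES_PER_YEAR * (1 - target_pct / 100)


def availability_pct(downtime_minutes):
    """Availability percentage over one year for a given total downtime."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


def miss_probability(outage_minutes, check_interval_minutes):
    """Chance that an outage falls entirely between two consecutive checks.

    Simplified model (my assumption, not Nagios behaviour): checks are
    instantaneous, evenly spaced, and the outage start is uniformly random
    relative to the check schedule.
    """
    return max(0.0, 1 - outage_minutes / check_interval_minutes)


# Budgets for "five nines" and "four nines"
print(downtime_budget_minutes(99.999))
print(downtime_budget_minutes(99.99))

# True availability after six undetected 9-minute outages
print(availability_pct(6 * 9))

# Chance of missing a 9-minute outage at 10- and 30-minute check intervals
print(miss_probability(9, 10))
print(miss_probability(9, 30))
```

Under this model, an outage longer than the check interval is always seen by at least one check, while shorter outages are missed with a probability that grows as the interval gets longer, which is the effect I am worried about with a 30-minute check_interval.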