Thanks for the reply. As for normalizing the data, I suspected that was what was going on, so thank you for clarifying it. I have started looking at the link you posted and will go over it in more detail; it will help. I can easily see how a CDP's result will be different than an actual outage, given the way I have it set up.
I'll take a look at count. It seems to be a nicer way of doing this, but it will take me some time to get it set up.

The ultimate goal of what I am doing is to show a graph of the response time with our SLA times marked, and with outages inside and outside of SLA times shown (which is what I have set up now). The next step will be to figure out the total SLA availability and the total availability of the service that I am monitoring. I was hoping to use the flags for this, realizing that it is not a true measure of uptime: checking every 5 minutes can show a 10-minute outage even though the service was only down for the moment it took to do the check.

Again, thanks for your time.

-mathew

On Fri, Mar 4, 2011 at 2:58 AM, Simon Hobson <[email protected]> wrote:

> mathew anderson wrote:
>
> >I have a single RRD file that has the value 100 in places. Whenever
> >my monitoring sees an error (probes every 5 mins), it pushes a 100
> >into an rrd file. I am trying to figure out how many times this
> >value is in a given time range.
>
> Are you aware that except under certain very strict conditions, what
> is stored is NOT what you entered ?
> ALL input data is normalised, and then consolidated. If your data
> entry times don't exactly match step boundaries then normalisation
> will alter it. Suppose your step time was 1 minute (60 seconds),
> you'd been entering zeros, then at 20s past the minute you enter 100,
> and 20s past the next minute you enter zero again, and continue
> entering zeros. The normalisation means that for your one minute with
> a value of 100, 2/3 (ie 66.6) will go into one step period, and the
> rest (33.3) will go into the next. So you'd get out 0, 0, 66.6, 33.3,
> 0, 0
>
> Then say you had a consolidation for 10 minute periods. The
> consolidated average for that would then be 10 (assuming both the
> non-zero normalised values fall into the same consolidated time
> period).
>
> See : http://www.vandenbogaerdt.nl/rrdtool/
> In particular the one on Rates, normalizing and consolidating
>
> Also, note that all time periods are referenced to unix epoch
> (midnight, 1st Jan 1970). So with a step time of 300, step periods
> start on the hour, 5 minutes past the hour, etc. If you consolidate 6
> PDPs to a CDP (ie 1/2 hour) then these consolidated periods will be
> on the hour and half hour.
>
>
> Given that you seem to be logging errors, it may be better to log the
> error count rather than a flag. If the errors are reported as a
> count, then use a counter data type and rrd will take care of
> converting that to a rate. You can then get rrd graph to do logic
> such as "if rate > some_threshold then draw it in red".
>
> --
> Simon Hobson
>
> Visit http://www.magpiesnestpublishing.co.uk/ for books by acclaimed
> author Gladys Hobson. Novels - poetry - short stories - ideal as
> Christmas stocking fillers. Some available as e-books.
>
> _______________________________________________
> rrd-users mailing list
> [email protected]
> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>
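To convince myself of the arithmetic in the quoted example, here is a rough Python sketch of the normalisation step. It is not how rrdtool is implemented, only a simplified model: it assumes a GAUGE-style value that holds over the interval since the previous update, it ignores heartbeat and unknown handling, and the function name normalize and the small timestamps are made up for illustration.

```python
def normalize(updates, step):
    """Spread each update's value over the step periods its interval touches.

    updates: list of (timestamp, value) sorted by time. Each value is assumed
    to hold from the previous update's timestamp up to its own timestamp.
    Returns {step_start: average} for fully covered step periods only.
    """
    totals = {}   # step_start -> sum of value * seconds
    covered = {}  # step_start -> seconds of that step seen so far
    prev_t = updates[0][0]  # first update only sets the starting time
    for t, value in updates[1:]:
        cur = prev_t
        while cur < t:
            start = cur - cur % step          # step periods align to time 0
            seg_end = min(t, start + step)    # stop at the step boundary
            totals[start] = totals.get(start, 0.0) + value * (seg_end - cur)
            covered[start] = covered.get(start, 0) + (seg_end - cur)
            cur = seg_end
        prev_t = t
    return {s: totals[s] / step for s in sorted(totals) if covered[s] == step}


# Updates arrive at 20s past each minute; a single 100 is entered at t=260.
updates = [(20, 0), (80, 0), (140, 0), (200, 0),
           (260, 100), (320, 0), (380, 0), (440, 0)]
pdps = normalize(updates, step=60)
for start, value in pdps.items():
    print(start, round(value, 1))

# A 10-step average that contains both non-zero values (padded with zeros).
ten_step_average = (sum(pdps.values()) + 0.0 * (10 - len(pdps))) / 10
print(round(ten_step_average, 1))
```

In this model the single 100 comes out as roughly 66.7 in one step and 33.3 in the next, and a 10-step average containing both is 10, which agrees with the figures in the quoted explanation.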
