On Mon, 16 Aug 2010, Dave Plonka wrote:

> This would be easier for you to understand (why it's doing what it
> does) if you plot the confidence band - i.e., the line above and below
> the hwpredict value that the observations must exceed to be considered
> a violation.

I feel stupid for asking this, but how do I define the confidence band
and how do I get rrdgraph to print it? The rrdcreate page mentions the
confidence band several times, but besides "defining a matching set of
several RRDs" I can't find instructions in there on how to set my
confidence band to a certain width. It also references rrdgraph, where
supposedly there is an example of a printed confidence band, but
searching for "confidence" on the rrdgraph page doesn't yield any
results.

I'll go through the references you've listed (thanks!) as soon as I get
a sec, but if you have a snippet of rrdtool code that uses/prints
confidence bands, I'd really appreciate it!

Thanks much, you've been a big help!

-- Mike

Mike Schilli
[email protected]

>
>> The data is from a temperature sensor, which has a resolution of .5
>> degrees Celsius. The data covers 7 days [1] and the rrdtool commands
>> I've used are available at [2]. For this example, I've used alpha=0.5,
>> beta=0.5, gamma=0.5, with a seasonal period of 60*24 (one day in
>> one-minute steps).
>>
>> What I've noticed so far:
>>
>> * The green line (rrdtool's prediction) is only available after the 3rd
>>   day. What's the reason for that?
>
> Prediction, i.e., the "hwpredict" value, is based on past observations;
> the algorithm needs prior data points to predict, therefore there is
> some time to bootstrap it for operations. Once the HWPREDICT RRA is
> populated though, you won't have to wait again (as long as you don't
> have gaps in your data points/observations.)
>
>> * There's a clear jump in the middle of the graph which goes undetected.
>
> This can happen (by design) if you have the H-W RRD attributes set to
> only consider it errant if `n' samples fall outside the expected range
> within the configured window of points - since this is a very short
> duration anomaly (perhaps only one data point), it is not reported
> as an error. That's configurable - see the "threshold" value you
> set in the FAILURES RRA. The default is that 7 observations of 9
> must be out of the confidence band before it is reported as a failure
> (vs. the prediction).
>
>> * There's a high number of false positives, starting after the spike,
>>   and continuing until the end of the graph. I've tried various
>>   combinations of alpha, beta, and gamma to get rid of them but without
>>   success.
>
> This would be easier to understand if you plot the confidence band.
> It looks to me like your band is way too tight.
>
> If you haven't already, I suggest reading Jake Brutlag's original
> paper, available online from the LISA 2000 Conference:
>
>   "Aberrant Behavior Detection in Time Series for Network Service Monitoring"
>   http://www.usenix.org/events/lisa00/brutlag.html
>
> I've also done some work in which we used this H-W implementation
> for evaluation of our method; might be helpful:
>
>   "A Signal Analysis of Network Traffic Anomalies"
>   http://pages.cs.wisc.edu/~pb/paper_imw_02.pdf
>   (sample parameters page 11 - 300 second step, IIRC)
>
>   "Traffic Anomaly Detection at Fine Timescales with Bayes Nets"
>   http://pages.cs.wisc.edu/~pb/icimp08_final.pdf
>   (sample parameters page 8 - 1 second step)
>
> Note that the HW parameters can be very sensitive to your "step" value.
> So, don't expect defaults to work if they were meant for a 300 second
> step, and you're using a 60 second step... as usual, it's best to
> understand them completely to choose reasonable values.
>
> Dave
>
> --
> [email protected] http://net.doit.wisc.edu/~plonka/ Madison, WI
>
_______________________________________________
rrd-users mailing list
[email protected]
https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
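
For anyone following the thread, here is a minimal Python sketch (not rrdtool code) of the two ideas discussed above: the confidence band around the hwpredict value, and the "7 of 9" rule for reporting a failure. It assumes the band is the prediction plus or minus a scaling factor times the predicted deviation, with a scaling factor of 2 as in rrdtool's documented default, and that a failure means at least `threshold` violations within the last `window` observations. The function names `confidence_band` and `failures`, the parameter `delta`, and the sample temperatures are hypothetical and only for illustration.

```python
from collections import deque


def confidence_band(pred, dev, delta=2.0):
    """Return (lower, upper) bounds: prediction -/+ delta * predicted deviation."""
    lower = [p - delta * d for p, d in zip(pred, dev)]
    upper = [p + delta * d for p, d in zip(pred, dev)]
    return lower, upper


def failures(obs, pred, dev, delta=2.0, threshold=7, window=9):
    """Flag a failure when at least `threshold` of the last `window`
    observations fall outside the confidence band."""
    lower, upper = confidence_band(pred, dev, delta)
    recent = deque(maxlen=window)  # sliding window of violation flags
    flags = []
    for o, lo, hi in zip(obs, lower, upper):
        recent.append(o < lo or o > hi)  # True if this point is a violation
        flags.append(sum(recent) >= threshold)
    return flags


if __name__ == "__main__":
    # Hypothetical data: a flat 20.0 degree prediction with deviation 0.5,
    # so the band is 19.0 .. 21.0.
    pred = [20.0] * 12
    dev = [0.5] * 12

    # A single-point spike is a violation but never becomes a failure.
    spike = [20.0] * 12
    spike[5] = 25.0
    print(any(failures(spike, pred, dev)))

    # A sustained shift becomes a failure once 7 of the last 9 points are out.
    shift = [20.0] * 2 + [25.0] * 10
    print(failures(shift, pred, dev))
```

This mirrors the behaviour described in the thread: a one-sample jump goes unreported, while a band that is too tight (a small deviation relative to the sensor's .5 degree resolution) turns ordinary readings into violations and, soon after, failures.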
