Yesterday BAARDA, Don wrote:
|
| G'day,
it seems that some code reviewing in this area would be good, at least to
document how it really works :-) pushed it to my todo list ...

thanks
tobi

| > -----Original Message-----
| > From: Blaise Lepeuple [mailto:[EMAIL PROTECTED]
| > Sent: Tuesday, August 07, 2001 2:10 PM
| > To: [EMAIL PROTECTED]
| > Subject: rddtool Heartbeat & Step
| >
| >
| > I'm sorry if you are the wrong person to ask this, but you had your email
| > address on the online man page for "rrd create".
| >
| > If I should talk to somebody else, please redirect me to him.
| [...]
|
| It's been a while since I examined and understood the internals of RRD. I
| did go through it a while ago and satisfied myself that it worked, and since
| then have been content that it does what it is supposed to.
|
| > So here is the creation of the rrd :
| >
| > rrdtool create test.rrd -s 10 --start 997147699 DS:tik:COUNTER:10:0:U
| > RRA:AVERAGE:0:1:10
| [...]
|
| I've just had a look at the man page that now includes my description. It
| includes verbatim my stuff about heartbeat and step, but excluded the bit I
| had on the end of that email about xff. I'll add it here for your info,
| because it will help when you start making RRAs with steps > 1:
|
| You are right that "xff" has little effect if you have few "unknown"
| PDPs, and setting "heartbeat" high is one way of reducing the number of
| "unknown" PDPs. However, it is worth remembering that "unknowns" can happen
| for other reasons. When setting "xff", you are deciding how many
| "unknown" PDPs are acceptable when accumulating into an RRA, and an
| "unknown" really means that rrd has no idea what the rate for the PDP was.
| So "xff" is a "garbage threshold" for how much missing input data you can
| tolerate when accumulating your data into "coarse grain", large "steps",
| RRAs.
|
| When setting "heartbeat", you are specifying a requirement on your
| samples.
| Remember that a long "heartbeat" means that you are happy for
| multiple PDPs to be estimated from a single sample, which means the
| individual PDPs are not really accurate. The nice thing about this though is
| that these not-quite-accurate PDPs accumulate accurately. The individual
| PDPs are estimated from the average rate over a longer period, hence when
| you accumulate these PDPs into a single period, the average rate is correct
| for that period. So "heartbeat" is a "garbage threshold" for how much
| inaccuracy you can tolerate in your "fine grain", small "steps", RRAs.
|
| Note that the xff for your RRA is 0. This has no effect since steps=1 for
| this RRA, and as I remember it xff only comes into effect when accumulating
| multiple PDPs into an RRA.
|
| > Now if I do the measure for 20 a bit early or a bit late, I would expect
| > this PDP to have an unknown value since the interval for that pdp exceeded
| > the heartbeat.
| > If it is late, I am getting the expected result :
| >
| > rrdtool update test.rrd 997147700:0 997147710:10 997147721:21
| > 997147730:30 997147740:40
| >
| > rrdtool fetch test.rrd AVERAGE --start 997147710 --end 997147740 :
| > tik
| >
| > 997147710: 1.0000000000e+00
| > 997147720: nan
| > 997147730: 1.0000000000e+00
| > 997147740: 1.0000000000e+00
| [...]
|
| This is fine. The PDP for 997147711 -> 997147720 includes no known values,
| and is hence unknown. The PDP for 997147721 -> 997147730 includes 1 sec <
| heartbeat unknown, and hence the PDP is known.
|
| > rrdtool update test.rrd 997147700:0 997147710:10 997147719:19
| > 997147730:30 997147740:40
| >
| > rrdtool fetch test.rrd AVERAGE --start 997147710 --end 997147740 :
| > tik
| >
| > 997147710: 1.0000000000e+00
| > 997147720: nan
| > 997147730: nan
| > 997147740: 1.0000000000e+00
| [...]
|
| This looks wrong. You may have tripped up a bug in RRD.
| From my understanding the last time I looked at RRD, the 997147720: output
| should not be nan since the period 997147711->997147720 has only 1 sec
| unknown, and since 1 sec is less than heartbeat, that PDP should be OK.
|
| > rrdtool update test.rrd 997147700:0 997147710:10 997147719:19
| > 997147729:29 997147740:40
| >
| > rrdtool fetch test.rrd AVERAGE --start 997147710 --end 997147740 :
| > tik
| >
| > 997147710: 1.0000000000e+00
| > 997147720: 1.0000000000e+00
| > 997147730: nan
| > 997147740: nan
|
| This looks wrong too. 997147711->997147720 has 1 sec unknown, hence PDP OK.
| 997147721->997147730 has only 1 sec unknown too, so should be known.
| 997147731->997147740 is all unknown, so unknown is correct.
|
| > On the other hand, I can stretch up to 18 seconds some readings without
| > affecting anything :
| >
| > rrdtool update test.rrd 997147700:0 997147710:10 997147711:11
| > 997147729:29 997147730:30 997147740:40
| >
| > rrdtool fetch test.rrd AVERAGE --start 997147710 --end 997147740 :
| > tik
| >
| > 997147710: 1.0000000000e+00
| > 997147720: 1.0000000000e+00
| > 997147730: 1.0000000000e+00
| > 997147740: 1.0000000000e+00
|
| Surprisingly, this is actually correct. 997147711->997147720 has 9 secs
| unknown < step so known. 997147721->997147730 also has 9 secs unknown < step
| so known. The large unknown period between 997147712->997147729 still leaves
| enough known values in the PDPs on each side for them both to be known.
|
| I've Cc'd this to the rrd-users list in case someone else can comment on the
| presence/absence of a bug. Note that you are floating in the area of a
| possible "off by one" bug, and I recall seeing that one of these was fixed
| at some point. What version of rrd are you running?
|
| ABO
|
| --
| Unsubscribe mailto:[EMAIL PROTECTED]
| Help        mailto:[EMAIL PROTECTED]
| Archive     http://www.ee.ethz.ch/~slist/rrd-users
| WebAdmin    http://www.ee.ethz.ch/~slist/lsg2.cgi
|
|

--
 ______    __   _
/_  __/_  / /  (_) Oetiker, ETZ J97, ETH, 8092 Zurich, Switzerland
 / // _ \/ _ \/ /  phoneto:+41(0)1-632-5286  faxto:+41(0)1-632-1517
/_/ \.__/_.__/_/   mailto:[EMAIL PROTECTED]  http://people.ee.ethz.ch/~oetiker
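
For anyone who wants to poke at the two "garbage thresholds" discussed in this thread without an rrd file at hand, here is a minimal Python sketch. It is not rrdtool's code, and it does not try to reproduce how rrdtool splits unknown seconds across PDPs. The function names are hypothetical. The first function only flags update intervals longer than "heartbeat". The second applies "xff" as a limit on the fraction of unknown PDPs in one consolidated value; treating the limit as inclusive (a fraction equal to xff is still acceptable) is an assumption of this sketch.

```python
import math


def gaps_exceeding_heartbeat(timestamps, heartbeat):
    """Return (start, end) pairs of consecutive updates that are more
    than `heartbeat` seconds apart."""
    return [
        (a, b)
        for a, b in zip(timestamps, timestamps[1:])
        if b - a > heartbeat
    ]


def consolidate_average(pdps, xff):
    """Average the known PDPs of one consolidation interval.

    Unknown PDPs are represented as NaN. The result is NaN when the
    fraction of unknown PDPs is larger than `xff`.
    Assumption: a fraction exactly equal to `xff` is still accepted.
    """
    unknown = sum(1 for p in pdps if math.isnan(p))
    if unknown == len(pdps) or unknown > xff * len(pdps):
        return math.nan
    known = [p for p in pdps if not math.isnan(p)]
    return sum(known) / len(known)


# Update times from the first and last tests in the thread, heartbeat = 10.
late = [997147700, 997147710, 997147721, 997147730, 997147740]
stretched = [997147700, 997147710, 997147711,
             997147729, 997147730, 997147740]
print(gaps_exceeding_heartbeat(late, 10))
print(gaps_exceeding_heartbeat(stretched, 10))

# With steps=1 the xff value makes no difference for a known PDP.
print(consolidate_average([1.0], 0.0))
# With steps=4 and xff=0.5, two unknown PDPs out of four are tolerated.
print(consolidate_average([1.0, math.nan, 1.0, math.nan], 0.5))
```

The first helper only tells you where a gap between samples is longer than heartbeat; which fetch rows end up as nan also depends on how the unknown seconds fall across step boundaries, which is exactly the part under discussion above.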
