Hi Kennedy, I'm quoting the longish mail; see the comments below. (BTW, your mailer is broken, it's using 8-bit quote characters but claiming it's 7-bit ASCII.)
In article <[EMAIL PROTECTED]>, [EMAIL PROTECTED] wrote:
> # Create a simplified Smokeping-like RRD
> rrdtool create test_01.rrd --start 1000000000 --step 300 \
> DS:loss:GAUGE:600:0:20 DS:ping1:GAUGE:600:0:180 \
> RRA:AVERAGE:0.5:1:1008 RRA:AVERAGE:0.5:12:4320 \
> RRA:MIN:0.5:12:4320 RRA:MAX:0.5:12:4320 \
> RRA:AVERAGE:0.5:144:720 RRA:MAX:0.5:144:720 \
> RRA:MIN:0.5:144:720
> #
> # Load in some dummy data
> rrdtool update test_01.rrd 1000000200:4:5
> rrdtool update test_01.rrd 1000000521:4:5
> rrdtool update test_01.rrd 1000000821:8:9
> rrdtool update test_01.rrd 1000001121:U:5
> rrdtool update test_01.rrd 1000001421:U:5
> # Dump
> rrdtool dump test_01.rrd > test_01.xml
>
> Note that it's putting in 3 values followed by 2 "U's"
> for the DS "loss".
>
> The dump shows:
> Time        loss               ping1
> 1000000200  4.0000000000e+000  5.0000000000e+000
> 1000000500  4.0000000000e+000  5.0000000000e+000
> 1000000800  7.7200000000e+000  8.7200000000e+000
> 1000001100  8.0000000000e+000  5.2800000000e+000
> 1000001400  NaN                5.0000000000e+000
>
> So, the loss value in the "1000000800" (3rd) row is
> the "equi-spaced points on an interpolated curve"
> issue, right? I'm OK with that yeah, as you point
> out it's a little "weird" for ping data, but it
> doesn't fundamentally change the results. However,
> the "1000001100" (4th) row is a little different now
> rather than storing a "NaN" it's taking the
> "un-interpolated value" from the previous timeslot.

The RRD logic goes like this: in the interval 800-1100, there's 21 seconds of value 4 and the rest (279s) of value 8. This leads to (21*4+279*8)/300 = 7.72. This is the 'equi-spaced points on an interpolated curve', yes.

The next interval has 21 seconds of known value 8 and the rest unknown. RRDtool discards the unknown and assumes the rest of the interval equals the known part. This is hard-coded into RRDtool, you can't change it with the database parameters.
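To make the arithmetic concrete, here is a small Python sketch of the averaging rule as described above. It is not RRDtool's actual code: the function name pdp_average and the (seconds, value) segment layout are made up for illustration, and it only models the behaviour seen in this dump (unknown seconds are dropped, the mean is taken over the known seconds).

```python
import math

U = math.nan  # stands in for an "U" (unknown) update value


def pdp_average(segments):
    """Time-weighted mean over the known part of one 300 s step.

    segments is a list of (seconds, value) pairs that together cover
    the step; value is NaN where the data is unknown. Hypothetical
    helper, only meant to mirror the rule described in this mail.
    """
    known = [(s, v) for s, v in segments if not math.isnan(v)]
    known_seconds = sum(s for s, _ in known)
    if known_seconds == 0:
        return math.nan  # nothing known at all: the row stays NaN
    return sum(s * v for s, v in known) / known_seconds


# DS "loss": 21 s of value 4, then 279 s of value 8 (the 7.72 row)
loss_mixed = pdp_average([(21, 4), (279, 8)])
# DS "loss": 21 s of known value 8, then 279 s unknown (the 8.0 row)
loss_partly_unknown = pdp_average([(21, 8), (279, U)])
# DS "loss": the whole step unknown (the NaN row)
loss_all_unknown = pdp_average([(300, U)])
# DS "ping1": 21 s of value 9, then 279 s of value 5 (the 5.28 row)
ping1_mixed = pdp_average([(21, 9), (279, 5)])

print(loss_mixed, loss_partly_unknown, loss_all_unknown, ping1_mixed)
```

The second call is the interesting one: 21 known seconds are enough to fill the whole row with 8, which is exactly the value that shows up where a NaN was expected.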
The core of the problem is thus that Smokeping is using NaN as round-trip-time for the missing pings, but RRDtool considers this as missing data and prefers the known data over it, so we get the false smoke.

One way to fix this might be to store the RTT of the missing pings as the average of the others instead of NaN. That should clean up the smoke. I haven't tried this, so I could be missing something obvious. I'm not sure what should be done if all the pings were lost.

Tobi, any ideas?

Cheers,
--
niko
