The answer is simple: if your RRDs use the AVERAGE compacting function, then
the data point(s) stored for a fixed period of time are calculated as the
average value (as opposed to the MAX value which would be closer to the
truth). This is not a bug; it is just the way standard RRDs work.
The attached picture shows:
* data for 1 day with 5 minute resolution
* data for 1 day with 30 minute resolution
* data for 1 day with 2 hour resolution
* data for 1 day with 1 day resolution

Although I admit the graphs are for different sets of data (but are close in
the long run), you can see that from 30M peak traffic, you get to about 5M
if you average it over the whole day.
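
To make the flattening effect concrete, here is a minimal sketch in Python. It
is not nfsen or RRDtool code: the consolidate() and total() helpers and the
traffic numbers are made up for illustration, and real RRDs also handle
unknown values and partial intervals, which this ignores. It collapses one day
of 5-minute samples into coarser rows with AVERAGE and with MAX, and it also
sums the averaged rows weighted by their step length.

```python
from statistics import mean


def consolidate(samples, factor, cf):
    """Collapse every `factor` consecutive samples into one row using cf."""
    return [cf(samples[i:i + factor]) for i in range(0, len(samples), factor)]


def total(rows, step_seconds):
    """Sum of rate * step, i.e. each row weighted by the time it covers."""
    return sum(rate * step_seconds for rate in rows)


# Hypothetical day of 5-minute rates in Mbit/s: 288 samples,
# 2 Mbit/s baseline with a one-hour burst at 30 Mbit/s.
five_min = [2.0] * 288
five_min[120:132] = [30.0] * 12

# (label, number of 5-minute samples per consolidated row)
resolutions = [("5 min", 1), ("30 min", 6), ("2 hours", 24), ("1 day", 288)]

for label, factor in resolutions:
    avg_rows = consolidate(five_min, factor, mean)
    max_rows = consolidate(five_min, factor, max)
    print(
        f"{label:>8}: highest AVERAGE row = {max(avg_rows):6.2f}  "
        f"highest MAX row = {max(max_rows):6.2f}  "
        f"step-weighted total = {total(avg_rows, 300 * factor):10.1f} Mbit"
    )
```

With these made-up numbers the 30 Mbit/s burst shows up as 16 Mbit/s in the
2-hour AVERAGE rows and as roughly 3.2 Mbit/s in the 1-day row, while the MAX
rows stay at 30. The step-weighted total is the same at every resolution,
which is why summing coarse rows without multiplying by their step would
understate the volume.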

Maybe Peter could allow us to globally select which compacting function
(RRDtool calls it the consolidation function, CF) we want to use in the RRDs
(if it's not too complicated).

Cheers,
Adrian



On Thu, Jan 22, 2009 at 4:13 AM, Steve Foley <sfo...@ucsd.edu> wrote:

> I have been successfully running nfsen 1.3 since September of last year. In
> digging around for some statistics the last day or so, I found that nfsen
> doesn't seem to compute stats well for older data. In digging through the
> code, it looks like all of my continuous/shadow profiles are getting their
> Detail tab data from the RRD files via a fetch. Comparing recent RRD data
> (still in the freshest, most detailed, most sampled period of the RRD) to
> the raw nfcapd output (yes, I still have a bunch of it), the traffic bytes
> values seem to line up reasonably well. If one channel shows 1.3 GB of usage
> in the table/RRD, it really is about 1.3 GB in the nfcapd raw data.
>
> However, when I start digging into older data (3 months back) that has been
> dithered by RRD, I find that those numbers are more like 100 MB. That's an
> order of magnitude! I think what might be happening is that calculations
> from the RRD file are not taking into account fewer data points that need to
> provide more weight in the output. If I do a fetch for RRD data, the new
> stuff has 1000's of values for a month, while older data may only have 30
> values for that month. Summing up 1000's of points is good, but summing of
> 30 gives a lower value if they are not weighted to represent a whole day
> each instead of just 5 minutes.
>
> I will admit that I didn't spend a lot of time looking through the code, but
> I think this is what might be going on in the ReadRRDStatInfo() routine. Am
> I just confused/crazy, or is there another problem here? If all is indeed
> well, how do I go about getting accurate stats from older data? I'd be
> interested to get the right numbers, even if I had to wait for the data to
> come straight from nfdump in some cases. Maybe there is another/easy way to
> get summary flows/packets/bytes for a given channel for a given period of
> time other than setting up the details tab timeslot?
>
> -Steve
>
> -----
> Steve Foley
> Scripps Institution of Oceanography
> sfo...@ucsd.edu, (858) 822-3356
>
>
>
>
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by:
> SourcForge Community
> SourceForge wants to tell your story.
> http://p.sf.net/sfu/sf-spreadtheword
> _______________________________________________
> Nfsen-discuss mailing list
> Nfsen-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfsen-discuss
>
>
