Hi Steve,
As soon as you average over a longer period of time, values become
inaccurate by the nature of RRD. This type of calculation is only done
for shadow profiles anyway. Real profiles are accurate all the time, as
their data is read from the flow files. Shadow profiles have no data
associated with them other than the RRD files for the graphs. If you
need more accurate information for shadow profiles, you would need to
extend the RRD layout in NfSenRRd.pm => SetupRRD.
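
For illustration, here is a rough sketch of what such an extension could
look like. This is not the actual NfSenRRd.pm code; the data source name,
type and row counts are invented. The idea is simply to keep more primary
5-minute data points, so older values are not consolidated into coarse
averages:

    use strict;
    use warnings;
    use RRDs;

    # Hypothetical layout, not the shipped SetupRRD(); names and row counts
    # are made up. The point: keep more primary 5-minute data points so
    # older values stay at full resolution.
    my $rrdfile = 'shadow-channel.rrd';     # placeholder path
    RRDs::create(
        $rrdfile,
        '--step', 300,                      # one primary data point per 5 min
        'DS:traffic:ABSOLUTE:600:0:U',      # example data source
        'RRA:AVERAGE:0.5:1:105120',         # 5-min resolution, kept ~1 year
        'RRA:AVERAGE:0.5:288:1830',         # 1-day averages, kept ~5 years
    );
    my $err = RRDs::error;
    warn "RRD create failed: $err\n" if defined $err;

The obvious trade-off is file size: 105120 rows per RRA instead of a few
hundred.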

Shadow profiles come at the price of having no data assigned to them.
That is both the good news and the bad news.

        - Peter

Steve Foley wrote:
> Yes, I know the RRDs will dither data over time and average it down as
> the data ages, but I think being off by two orders of magnitude when I
> jump between new data and old data (a difference of one day) is a bit
> much.
> 
> Digging through the code, I see the following in the ReadRRDStatInfo()
> function in NfProfile.pm:
> 
>         foreach my $line (@$data) {
>                 my $i = 0;
>                 foreach my $val (@$line) {
>                         if ( defined $val ) {
>                                 $$statinfo{$$names[$i++]} += int(300 * $val);
>                         }
>                 }
>         }
> 
> I think that grabs all the stored values at 5-minute intervals and
> multiplies them by 300 seconds to get a reasonable total for each
> 5-minute block. I don't see any code that adjusts that 300 multiplier
> upwards when the RRD fetch dips into older data. If the data returned is
> suddenly averaged over 1-day blocks instead of 5-minute ones, the
> calculation will be off by a large amount. Could that be what is
> happening here, or is there another routine where older RRD data is
> calculated more correctly (but probably still with some RRD averaging
> down over time)?
> 
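For what it is worth, a step-aware variant of that summing loop could look
like the sketch below. It is only a sketch built around what RRDs::fetch()
already returns (start, step, names, data); the file path, time range and
other details are placeholders, and the real ReadRRDStatInfo() may differ:

    use strict;
    use warnings;
    use RRDs;

    my $rrdfile = 'channel.rrd';                      # placeholder path
    my ( $tstart, $tend ) = ( time - 7 * 86400, time );

    my ( $start, $step, $names, $data ) =
        RRDs::fetch( $rrdfile, 'AVERAGE', '--start', $tstart, '--end', $tend );
    die RRDs::error if RRDs::error;

    my %statinfo;
    foreach my $line (@$data) {
        my $i = 0;
        foreach my $val (@$line) {
            if ( defined $val ) {
                # $val is a per-second rate; scale it by the step the fetch
                # actually returned, so a 1-day consolidated point counts as
                # a whole day and not as 5 minutes.
                $statinfo{ $$names[$i] } += int( $step * $val );
            }
            $i++;    # advance the column index even for undefined values
        }
    }
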
> BTW, my setup has continuous shadow profiles, so it seems that all the
> data has to come from RRDs, right? Is there any way to look at the real
> nfdump summary stats directly via the web (in the table or through the
> Netflow Processing section) for this? Sometimes I want to offer more
> precise stats that aren't RRD-averaged over time, as long as I still
> have the nfdump files around.
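
While the nfcapd files are still on disk, one way to get exact totals
outside the web frontend is to run nfdump over the channel's data directory
for the time window in question and read the summary it prints at the end.
A rough sketch only: the path, the time window and the exact wording of
nfdump's summary line are assumptions that may need adjusting for your
install and nfdump version.

    use strict;
    use warnings;

    # Hypothetical channel directory and time window; adjust to your setup.
    my $channel = '/data/nfsen/profiles-data/live/upstream1';
    my $twin    = '2008/10/01.00:00:00-2008/10/31.23:59:59';

    # A top-1 statistic keeps the listing short; the totals we care about
    # are in the summary nfdump prints at the bottom of its output.
    my @out = `nfdump -R '$channel' -t '$twin' -s srcip -n 1 2>/dev/null`;

    foreach my $line (@out) {
        print $line if $line =~ /^Summary/;   # total flows / bytes / packets
    }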
> 
> -Steve
> 
> On Jan 22, 2009, at 5:18 AM, Adrian Popa wrote:
> 
>> The answer is simple: if your RRDs use the AVERAGE consolidation
>> function, then the data points stored for a fixed period of time are
>> calculated as the average value (as opposed to the MAX value, which
>> would be closer to the truth for peaks). This is not a bug, it is just
>> the way standard RRDs work.
>> The attached picture shows:
>> * data for 1 day with 5 minute resolution
>> * data for 1 day with 30 minute resolution
>> * data for 1 day with 2 hour resolution
>> * data for 1 day with 1 day resolution
>>
>> Although I admit the graphs are for different sets of data (but they
>> are close in the long run), you can see that from 30M of peak traffic
>> you get down to about 5M if you average it over the whole day.
>>
>> Maybe Peter could allow us to globally select which consolidation
>> function we want to use in the RRDs (if it's not too complicated).
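
As a rough sketch of what that could look like (invented RRA geometry and
data source, not NfSen's actual layout): parallel MAX archives can be
defined next to the AVERAGE ones when the RRD is created, and a later fetch
or graph can then ask for the MAX series, so peaks survive consolidation:

    use strict;
    use warnings;
    use RRDs;

    RRDs::create(
        'example.rrd',
        '--step', 300,
        'DS:traffic:GAUGE:600:0:U',   # example data source
        'RRA:AVERAGE:0.5:1:2016',     # 5-min averages, one week
        'RRA:AVERAGE:0.5:288:400',    # 1-day averages, ~400 days
        'RRA:MAX:0.5:288:400',        # 1-day maxima, ~400 days
    );
    warn RRDs::error if RRDs::error;

    # Later, ask for the MAX consolidated series instead of AVERAGE:
    my ( $start, $step, $names, $data ) =
        RRDs::fetch( 'example.rrd', 'MAX', '--start', 'end-30d', '--end', 'now' );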
>>
>> Cheers,
>> Adrian
>>
>>
>>
>>
>> On Thu, Jan 22, 2009 at 4:13 AM, Steve Foley <sfo...@ucsd.edu> wrote:
>> I have been successfully running nfsen 1.3 since September of last
>> year. While digging around for some statistics over the last day or so,
>> I found that nfsen doesn't seem to compute stats well for older data.
>> Looking through the code, it appears that all of my continuous/shadow
>> profiles get their Details tab data from the RRD files via a fetch.
>> Comparing recent RRD data (still in the freshest, most detailed, most
>> sampled period of the RRD) to the raw nfcapd output (yes, I still have
>> a bunch of it), the traffic byte values line up reasonably well. If one
>> channel shows 1.3 GB of usage in the table/RRD, it really is about
>> 1.3 GB in the nfcapd raw data.
>>
>> However, when I start digging into older data (3 months back) that has
>> been dithered by RRD, I find that those numbers are more like 100 MB.
>> That's an order of magnitude off! I think what might be happening is
>> that the calculations from the RRD file do not take into account that
>> the fewer, older data points each need to carry more weight in the
>> output. If I do a fetch for RRD data, the new stuff has thousands of
>> values for a month, while older data may only have 30 values for that
>> month. Summing thousands of points works out fine, but summing 30 gives
>> a much lower value if they are not weighted to represent a whole day
>> each instead of just 5 minutes.
>>
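That intuition can be put into numbers: each fetched value is treated as a
per-second rate (that is what the fixed 300-second multiplier in
ReadRRDStatInfo() implies), so any consolidated point is undercounted by a
factor of (real step / 300). A throwaway snippet to show the scale of the
error, using the usual RRD consolidation intervals (nothing NfSen-specific):

    use strict;
    use warnings;

    my $assumed_step = 300;                        # the hard-coded multiplier
    for my $real_step ( 300, 1800, 7200, 86400 ) { # 5 min, 30 min, 2 h, 1 day
        printf "step %6d s -> totals low by a factor of %3d\n",
            $real_step, $real_step / $assumed_step;
    }
    # 86400 / 300 = 288, i.e. nearly two and a half orders of magnitude.
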
>> I will admit that I didn't spend a lot of time looking through the
>> code, but I think this is what might be going on in the
>> ReadRRDStatInfo() routine. Am I just confused/crazy, or is there
>> another problem here? If all is indeed well, how do I go about getting
>> accurate stats for older data? I'd be interested in getting the right
>> numbers, even if I had to wait for the data to come straight from
>> nfdump in some cases. Is there perhaps another, easier way to get
>> summary flows/packets/bytes for a given channel and period of time,
>> other than setting up the Details tab timeslot?
>>
>> -Steve
>>
>> -----
>> Steve Foley
>> Scripps Institution of Oceanography
>> sfo...@ucsd.edu, (858) 822-3356
> 
> -----
> Steve Foley
> Scripps Institution of Oceanography
> sfo...@ucsd.edu, (858) 822-3356
> 

--
_______ SWITCH - The Swiss Education and Research Network ______
Peter Haag,  Security Engineer,  Member of SWITCH CERT
PGP fingerprint: D9 31 D5 83 03 95 68 BA  FB 84 CA 94 AB FC 5D D7
SWITCH, Werdstrasse 2, P.O. Box,  CH-8021   Zurich, Switzerland
E-mail: peter.h...@switch.ch Web: http://www.switch.ch/
