Tony
Thu, 24 Sep 2009 21:46:02 -0700
Hi, Replying to myself, but it's only when you start talking to yourself that it's an issue, right ? RIGHT ?
--- On Tue, 22/9/09, Tony <td_mi...@yahoo.com> wrote:
>
--snip--
>
> I've upgraded to the latest 12.0rc2 version and the results
> are a lot better. An example of the data is:
>
--snip--
>
> Now that adjb seems to be doing what it is supposed to do I
> will accumulate a few more days/weeks of data and compare
> the values from pmacct (with adjb) to those being recorded
> directly by the packeteer. Hopefully they will be a lot
> closer now.
>
The stats below are from a SINGLE day's worth of data (23/09/2009). I have some
small concerns about the validity of the data set for comparison, given the way
data is extracted from the packeteer: I'm not sure whether the daily report
extracts data from 2300-2300 or 0000-0000. Regardless, the difference in the
volume of data between 2300 and 0000 on different days shouldn't be that great
anyway.
Here is the data:
       adjb       pmacct    packeteer  (pack-adjb)         %
11037185152  10733168136  12957484242   1920299090   14.820%
 4216446261   4112843092   4062920012   -153526249   -3.779%
 5176360717   4945117219   5133601176    -42759541   -0.833%
 1347873812   1318879176   1362592012     14718200    1.080%
  955390004    923140839    952564475     -2825529   -0.297%
  871276688    852006937    892911008     21634320    2.423%
  703135346    673351910    695471238     -7664108   -1.102%
  449624941    455719218    453788344      4163403    0.917%
  339088025    324566192    338516514      -571511   -0.169%
  148191479    144684695    149437506      1246027    0.834%
   52648230     38364870     40825032    -11823198  -28.961%
adjb        = Data from pmacct with adjb=26 applied
pmacct      = Direct pmacct data (no adjustment)
packeteer   = Data exported from the packeteer
(pack-adjb) = packeteer column minus adjb column
%           = (pack-adjb) column as a percentage of the packeteer column
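For anyone wanting to re-check the last two columns without a spreadsheet, here is a rough Python sketch of the arithmetic described in the legend. The byte counts are copied from the table; the names ROWS and compare are just made up for illustration and are not part of pmacct or the packeteer.

```python
# (adjb, pmacct, packeteer) byte counts copied from the table in this mail.
ROWS = [
    (11037185152, 10733168136, 12957484242),
    (4216446261, 4112843092, 4062920012),
    (5176360717, 4945117219, 5133601176),
    (1347873812, 1318879176, 1362592012),
    (955390004, 923140839, 952564475),
    (871276688, 852006937, 892911008),
    (703135346, 673351910, 695471238),
    (449624941, 455719218, 453788344),
    (339088025, 324566192, 338516514),
    (148191479, 144684695, 149437506),
    (52648230, 38364870, 40825032),
]


def compare(adjb, packeteer):
    """Return (packeteer - adjb, that difference as a percent of packeteer)."""
    diff = packeteer - adjb
    pct = 100.0 * diff / packeteer
    return diff, pct


if __name__ == "__main__":
    for adjb, _pmacct, packeteer in ROWS:
        diff, pct = compare(adjb, packeteer)
        print(f"{adjb:>11}  {packeteer:>11}  {diff:>11}  {pct:8.3f}%")
```

A positive percentage means the packeteer recorded more than pmacct with adjb applied, and a negative one means it recorded less.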
If you were to score it like they do at the Olympics (discard the highest and
lowest, then average the rest), it would come out at a very respectable -0.103%,
which in anyone's language would be near enough not to worry about. The concern
I have is with the ones that are wildly different (14.8% and -29.0%) and the
fact that they are in opposite directions. The -3.8% is a bit far off too, but
that could just be due to the smallish sample size and might get better over a
few days. The same could be true of the -29%, as it's not a very large sample.
The 14.8%, however, is on more than 10GB of data, which should be big enough for
the variation to even out, given that most of the smaller ones seem to.
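The Olympic-style score mentioned above can be reproduced with a few lines of Python. This is only a sketch: the percentages are the rounded values from the % column, and olympic_mean is a hypothetical helper name, not something from pmacct.

```python
# Percent differences copied (already rounded) from the % column of the table.
PERCENTS = [14.820, -3.779, -0.833, 1.080, -0.297, 2.423,
            -1.102, 0.917, -0.169, 0.834, -28.961]


def olympic_mean(values):
    """Drop the single highest and single lowest value, then average the rest."""
    if len(values) < 3:
        raise ValueError("need at least three values to trim both ends")
    trimmed = sorted(values)[1:-1]
    return sum(trimmed) / len(trimmed)


if __name__ == "__main__":
    print(f"{olympic_mean(PERCENTS):.3f}%")  # prints -0.103%
```

Note that this is an unweighted average of the per-subnet percentages, so the 40MB row would count the same as a 5GB row if it were not discarded.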
I have some issues with the quality of the data extracted from the packeteer
and I'm going to see if I can extract it in a better manner. At the moment it
is grouped into subnets that are allocated to users and it is on a daily basis.
This means that I'm creating a spreadsheet with the info from MySQL for pmacct
and then manually copying stuff from the packeteer output, with a lot of
cross-referencing to match "names" to IP addresses. The above table took about
3 hours to create, which isn't conducive to continual testing as I make
changes.
I'm hopefully going to revisit this early next week and try to get some better
information.
regards,
Tony.
_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists