Re: [pmacct-discussion] reloading config accuracy

2009-09-25 Thread Wim Kerkhoff

On 9/25/2009 4:03 PM, Aaron Glenn wrote:

On Fri, Sep 25, 2009 at 4:59 AM, Tony td_mi...@yahoo.com wrote:
   

Is there a way to sort it properly by IP address (so that .2 comes after .1) in 
either an SQL query or in an XLS sheet ?

 

I hesitate to be 'that guy' but, you should look at using PostgreSQL.
I don't know enough about MySQL to make any suggestions specific to
it.

   


I totally agree; PostgreSQL handles network data types much better than 
MySQL. It will maintain a proper index on the network data types.


You can do some sorting in MySQL based on IP:

http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_inet-aton

But this will not use any of the indexes I believe.
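If the data ends up outside the database anyway (for example on its way to a spreadsheet), the same numeric ordering can be done after the query. The following is only a minimal sketch using Python's standard ipaddress module; the sample rows and the sort_by_ip helper are made up for illustration and are not part of pmacct. Inside MySQL, the function linked above would be used in the ORDER BY clause on the ip_dst column instead.

```python
import ipaddress

def sort_by_ip(rows):
    """Sort (ip, bytes) rows numerically by IPv4 address, not as text."""
    return sorted(rows, key=lambda row: int(ipaddress.IPv4Address(row[0])))

# Hypothetical sample rows: (ip_dst, bytes)
rows = [("10.0.0.10", 300), ("10.0.0.2", 200), ("10.0.0.1", 100)]

sorted_rows = sort_by_ip(rows)
for ip, byte_count in sorted_rows:
    print(ip, byte_count)
```

A plain text sort would place 10.0.0.10 before 10.0.0.2; converting each address to an integer first avoids that.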

Wim

___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] reloading config accuracy

2009-09-24 Thread Tony
Hi,

Replying to myself, but it's only when you start talking to yourself that it's 
an issue, right ? RIGHT ?

--- On Tue, 22/9/09, Tony td_mi...@yahoo.com wrote:

--snip--
 
 I've upgraded to the latest 0.12.0rc2 version and the results
 are a lot better. An example of the data is:
 
--snip--
 
 Now that adjb seems to be doing what it is supposed to do I
 will accumulate a few more days/weeks of data and compare
 the values from pmacct (with adjb) to those being recorded
 directly by the packeteer. Hopefully they will be a lot
 closer now.
 

The below stats are from a SINGLE day (23/09/2009) worth of data. I have some 
small concerns about the validity of the data set for comparison given the way 
data is extracted from the packeteer. The concern I have is that I'm not sure 
if the daily report that runs extracts data from 2300-2300 or -. 
Regardless the difference in the volume of data between 2300- on different 
days shouldn't be that great anyway.

Here is the data:

adjb         pmacct       packeteer    (pack-adjb)  %
11037185152  10733168136  12957484242  1920299090   14.820%
4216446261   4112843092   4062920012   -153526249   -3.779%
5176360717   4945117219   5133601176   -42759541    -0.833%
1347873812   1318879176   1362592012   14718200     1.080%
955390004    923140839    952564475    -2825529     -0.297%
871276688    852006937    892911008    21634320     2.423%
703135346    673351910    695471238    -7664108     -1.102%
449624941    455719218    453788344    4163403      0.917%
339088025    324566192    338516514    -571511      -0.169%
148191479    144684695    149437506    1246027      0.834%
52648230     38364870     40825032     -11823198    -28.961%

adjb = Data from pmacct with adjb=26 applied
pmacct = Direct pmacct data (no adjust)
packeteer = Data exported from the packeteer
(pack-adjb) = 3rd column minus 1st column
% = (pack-adjb) column as a percent of packeteer column


If you were to score it like they do at the Olympics and discard the highest and 
lowest and then average the rest, it would come out at a very respectable -0.103%, 
which in anyone's language would be near enough not to worry about. The concern 
I have is with the ones that are wildly different (14% and 28%) and the fact that 
they are in opposite directions. The -3.8% is a bit far off too, but that could 
just be due to the smallish sample size and might get better over a few days. 
The 28% could be the same, it's not a very large sample. The 14% however is 
10GB of data and should be big enough to reflect proper statistical variance 
given that most of the smaller ones seem to.
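For anyone wanting to repeat the "Olympic" scoring without a spreadsheet, here is a small sketch in Python. The percentages are the ones from the % column above; the olympic_mean function name is just an illustrative choice.

```python
# Values taken from the % column of the comparison table.
percentages = [14.820, -3.779, -0.833, 1.080, -0.297, 2.423,
               -1.102, 0.917, -0.169, 0.834, -28.961]

def olympic_mean(values):
    """Drop the single highest and single lowest value, average the rest."""
    trimmed = sorted(values)[1:-1]
    return sum(trimmed) / len(trimmed)

result = olympic_mean(percentages)
print(f"{result:.3f}%")
```

This discards 14.820 and -28.961 and averages the remaining nine values.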

I have some issues with the quality of the data extracted from the packeteer 
and I'm going to see if I can extract it in a better manner. At the moment it 
is grouped into subnets that are allocated to users and it is on a daily basis. 
This means that I'm creating a spreadsheet with the info from MySQL for pmacct 
and then manually copying stuff from the packeteer output with a lot of 
cross-referencing to match names to IP addresses. The above table took about 
3 hours to create and isn't conducive to continual testing as I 
make changes.

I'm hopefully going to revisit this early next week and try and get some better 
information.


regards,
Tony.


  


Re: [pmacct-discussion] reloading config accuracy

2009-09-21 Thread Tony
Hi Paolo,

--- On Mon, 7/9/09, Paolo Lucente pa...@pmacct.net wrote:

 From: Paolo Lucente pa...@pmacct.net
 Subject: Re: [pmacct-discussion] reloading config  accuracy
 To: Tony td_mi...@yahoo.com
 Cc: pmacct-discussion@pmacct.net
 Received: Monday, 7 September, 2009, 2:28 AM
 Hi Tony,
 
 On Sat, Sep 05, 2009 at 09:01:01PM -0700, Tony wrote:
 
  I have tested the above suggested configuration and it
 is working. There is data going into the SQL table now! I am
 going to let it run in parallel with the unadjusted data
 (which is going into another table) and then compare the two
 of them and also compare to the stats being reported by the
 packeteer.
 
 I've just managed to commit to the CVS repository some
 code to remove a dependency between actions and checks
 in the sql_preprocess layer (so that you can roll-back
 to your original config, which did make sense). I also
 went through an overall review of the feature - which
 resulted in a couple of fixes (one right to the 'adjb'
 section) and some cleanups. 
 
 Hence I would strongly encourage you to make your assessment
 against the version currently in the CVS or alternatively
 wait until the rc2 release is out, later in the week.
 


I haven't upgraded yet, I will be doing that now, but I wanted to give you some 
feedback on what I'm seeing in the old version and we can see if it persists to 
the new version.

The line I have added to the config file is:

sql_preprocess[abc]: minp=1, adjb=26

I am not sure how it is applying the extra though as it is only making a small 
difference.

I am using 10 minute aggregation and an example of data for a single IP address 
is:


10306644  10306462  182  0.00177%
83631880  83631698  182  0.00022%
1016473   1016265   208  0.02047%
3318523   3318341   182  0.00548%
48220490  48220308  182  0.00038%

The first number is the value (bytes) in the adjusted table, the second is the 
unadjusted/original number (bytes), the third is the difference between the two 
and the fourth is the difference as a percentage. The difference column is 
ALWAYS 182 or 208 across the whole data range that I checked.

These were retrieved using a query like:

mysql> select ip_dst, bytes, stamp_inserted from internet where ip_dst like 
'x.x.x.x' and stamp_inserted like '2009-09-18%' order by 3;

You can see that the adjustment isn't making much difference and certainly not 
what I would expect to see if it was adding 26 bytes to each PACKET (which is 
what needs to be done to account for ethernet overhead).

This suggests to me that the adjb value isn't applied to each packet, but to 
only 7 or 8 of the packets or flows.
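The "7 or 8" figure falls straight out of the observed differences divided by the adjb value. A quick sketch of that arithmetic in Python, using the difference column from the sample above (the variable names are only illustrative):

```python
ADJB = 26  # bytes added per adjustment, as set in sql_preprocess

# Difference column (adjusted minus unadjusted bytes) from the sample rows.
observed_differences = [182, 182, 208, 182, 182]

# How many times was the 26-byte adjustment applied in each interval?
times_applied = [diff // ADJB for diff in observed_differences]
remainders = [diff % ADJB for diff in observed_differences]

print(times_applied)
print(remainders)
```

Every difference is an exact multiple of 26, which is consistent with the adjustment being applied a handful of times per 10 minute interval rather than once per packet.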

I just thought I'd alert you to the above and see if the changes you have 
committed make a difference. Hopefully I will be able to report back again in a 
few days.


Thanks,
Tony.


  


Re: [pmacct-discussion] reloading config accuracy

2009-09-21 Thread Paolo Lucente
Hi Tony,

On Sun, Sep 20, 2009 at 06:03:18PM -0700, Tony wrote:

 I haven't upgraded yet, I will be doing that now, but I wanted to give you 
 some feedback on what I'm seeing in the old version and we can see if it 
 persists to the new version.

 [ ... ]
 
 10306644  10306462  182  0.00177%
 83631880  83631698  182  0.00022%
 1016473   1016265   208  0.02047%
 3318523   3318341   182  0.00548%
 48220490  48220308  182  0.00038%
 
 I just thought I'd alert you to the above and see if the changes you have 
 committed make a difference. Hopefully I will be able to report back again in 
 a few days.

I can confirm that this behaviour was precisely the target of
a bugfix which was committed to the CVS a few weeks ago and is
now part of the 0.12.0rc2 release. Let me know how it goes
after the upgrade.

Cheers,
Paolo





Re: [pmacct-discussion] reloading config accuracy

2009-09-06 Thread Paolo Lucente
Hi Tony,

On Sat, Sep 05, 2009 at 09:01:01PM -0700, Tony wrote:

 I have tested the above suggested configuration and it is working. There is 
 data going into the SQL table now! I am going to let it run in parallel with 
 the unadjusted data (which is going into another table) and then compare the 
 two of them and also compare to the stats being reported by the packeteer.

I've just managed to commit to the CVS repository some
code to remove a dependency between actions and checks
in the sql_preprocess layer (so that you can roll-back
to your original config, which did make sense). I also
went through an overall review of the feature - which
resulted in a couple of fixes (one right to the 'adjb'
section) and some cleanups. 

Hence I would strongly encourage you to make your assessment
against the version currently in the CVS or alternatively
wait until the rc2 release is out, later in the week.

Cheers,
Paolo





Re: [pmacct-discussion] reloading config accuracy

2009-09-05 Thread Tony
Hi Paolo,

--- On Sat, 5/9/09, Paolo Lucente pa...@pmacct.net wrote:

  
  If I comment out the sql_preprocess line then the
  config works and I get correct data being put into my SQL
  table. When I add the preprocess line I get nothing being
  put into my table. Am I mis-understanding how to use this
  option ?
 
 The sql_preprocess directive supports both checks and actions;
 a SQL entry must successfully pass a check in order to undergo
 one of the available actions (adjb being one of those). In your
 case you want a dummy check (ie. minp=1) to be put on the stack
 in order for the action to be evaluated against every SQL entry
 being committed to the database, ie.
 

It wasn't obvious to me from the documentation that this was required. The only 
documentation I could find on the sql_preprocess command was the reference to 
it in the FAQ (for what I am trying to do) and the reference in the list of 
config keys, which says the following about sql_preprocess:

==
allows to process aggregates (via a comma-separated list of conditionals and 
checks) while purging data to the RDBMS thus resulting in a powerful selection 
tier
==

It wasn't obvious to me from this that it REQUIRED a check to perform an action, just 
that you could do both check conditions and actions. Most of the other actions 
might make more sense if you had some kind of check on them, but I didn't 
really look at them, I only looked at the one I wanted, which I need to apply 
to every DB write, not selectively.


 sql_preprocess[abc]: minp=1, adjb=26
 
 Can you give this a try? Perhaps this underlying mechanism is
 not properly documented; it could be the trigger to change the
 way it works: if no checks are specified, then evaluate the
 specified action(s) against the entire queue of committed SQL
 entries. 
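The check-then-action behaviour described above can be pictured with a toy model. This is only a sketch in Python of the semantics as explained in this thread, not pmacct's actual code: the Entry class and preprocess function are hypothetical, and the per-packet application of adjb is an assumption based on the stated goal of adding 26 bytes to each packet.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for one aggregated SQL entry."""
    packets: int
    byte_count: int

def preprocess(entries, minp=None, adjb=0):
    """Toy model: an entry must pass a check before an action is applied."""
    committed = []
    for entry in entries:
        # As described in the thread: with no check on the stack, no entry
        # reaches the action (and nothing showed up in the table).
        if minp is None or entry.packets < minp:
            continue
        # Assumption: adjb adds the given number of bytes once per packet.
        adjusted = entry.byte_count + adjb * entry.packets
        committed.append(replace(entry, byte_count=adjusted))
    return committed

queue = [Entry(packets=10, byte_count=1000), Entry(packets=3, byte_count=180)]

without_check = preprocess(queue, adjb=26)        # like "adjb=26" alone
with_check = preprocess(queue, minp=1, adjb=26)   # like "minp=1, adjb=26"
print(without_check)
print(with_check)
```

In this model the dummy minp=1 check lets every entry with at least one packet through, so the action is evaluated against the whole queue.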
 

I have tested the above suggested configuration and it is working. There is 
data going into the SQL table now! I am going to let it run in parallel with 
the unadjusted data (which is going into another table) and then compare the 
two of them and also compare to the stats being reported by the packeteer.


Thanks for your assistance, I'll report back in a week/month or two with the 
results of the testing with adjusted data.

regards,
Tony.


  