Re: [pmacct-discussion] reloading config accuracy
On 9/25/2009 4:03 PM, Aaron Glenn wrote:
> On Fri, Sep 25, 2009 at 4:59 AM, Tony td_mi...@yahoo.com wrote:
>> Is there a way to sort it properly by IP address (so that .2 comes after .1) in either an SQL query or in an XLS sheet?
>
> I hesitate to be 'that guy' but, you should look at using PostgreSQL. I don't know enough about MySQL to make any suggestions specific to it.

I totally agree; PostgreSQL handles network data types much better than MySQL. It will maintain a proper index on the network data types.

You can do some sorting in MySQL based on IP:
http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_inet-aton

But I believe this will not use any of the indexes.

Wim
___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
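For anyone who ends up sorting outside the database (for example before pasting into a spreadsheet), here is a minimal Python sketch of the same idea as INET_ATON: convert each dotted-quad address to its 32-bit integer and sort on that. The helper names and the sample addresses are illustrative assumptions, not something taken from the thread.

```python
import ipaddress


def inet_aton(addr: str) -> int:
    """Return an IPv4 address as an unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(addr))


def sort_ips(addrs):
    """Sort dotted-quad strings numerically rather than as text."""
    return sorted(addrs, key=inet_aton)


# Sample addresses from the documentation range (assumed for illustration).
sample = ["192.0.2.10", "192.0.2.2", "192.0.2.1", "192.0.2.100"]

print(sorted(sample))    # plain text sort: .10 and .100 land before .2
print(sort_ips(sample))  # numeric sort: .1, .2, .10, .100
```

In MySQL the equivalent would be to order by the INET_ATON() value of the address column, with the indexing caveat Wim mentions.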
Re: [pmacct-discussion] reloading config accuracy
Hi,

Replying to myself, but it's only when you start talking to yourself that it's an issue, right? RIGHT?

--- On Tue, 22/9/09, Tony td_mi...@yahoo.com wrote:

> --snip--
>
> I've upgraded to the latest 0.12.0rc2 version and the results are a lot better. An example of the data is:
>
> --snip--
>
> Now that adjb seems to be doing what it is supposed to do I will accumulate a few more days/weeks of data and compare the values from pmacct (with adjb) to those being recorded directly by the packeteer. Hopefully they will be a lot closer now.

The stats below are from a SINGLE day (23/09/2009) worth of data. I have some small concerns about the validity of the data set for comparison given the way data is extracted from the packeteer. The concern I have is that I'm not sure if the daily report that runs extracts data from 2300-2300 or -. Regardless, the difference in the volume of data between 2300- on different days shouldn't be that great anyway.

Here is the data:

       adjb       pmacct    packeteer  (pack-adjb)         %
11037185152  10733168136  12957484242   1920299090   14.820%
 4216446261   4112843092   4062920012   -153526249   -3.779%
 5176360717   4945117219   5133601176    -42759541   -0.833%
 1347873812   1318879176   1362592012     14718200    1.080%
  955390004    923140839    952564475     -2825529   -0.297%
  871276688    852006937    892911008     21634320    2.423%
  703135346    673351910    695471238     -7664108   -1.102%
  449624941    455719218    453788344      4163403    0.917%
  339088025    324566192    338516514      -571511   -0.169%
  148191479    144684695    149437506      1246027    0.834%
   52648230     38364870     40825032    -11823198  -28.961%

adjb        = Data from pmacct with adjb=26 applied
pmacct      = Direct pmacct data (no adjustment)
packeteer   = Data exported from the packeteer
(pack-adjb) = 3rd column minus 1st column
%           = (pack-adjb) column as a percent of packeteer column

If you were to score it like they do at the Olympics and discard the highest and lowest and then average the rest, it would come out at a very respectable -0.103%, which in anyone's language would be near enough not to worry about.
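The two derived columns and the "Olympic" average can be recomputed from the raw byte counts. The following is only a sketch: the numbers are copied from the table above, while the function names are made up for illustration.

```python
# (adjb, pmacct, packeteer) byte counts, one tuple per row of the table.
ROWS = [
    (11037185152, 10733168136, 12957484242),
    (4216446261, 4112843092, 4062920012),
    (5176360717, 4945117219, 5133601176),
    (1347873812, 1318879176, 1362592012),
    (955390004, 923140839, 952564475),
    (871276688, 852006937, 892911008),
    (703135346, 673351910, 695471238),
    (449624941, 455719218, 453788344),
    (339088025, 324566192, 338516514),
    (148191479, 144684695, 149437506),
    (52648230, 38364870, 40825032),
]


def pack_minus_adjb(adjb, packeteer):
    """The (pack-adjb) column: 3rd column minus 1st column."""
    return packeteer - adjb


def percent_of_packeteer(adjb, packeteer):
    """The % column: (pack-adjb) as a percentage of the packeteer value."""
    return 100.0 * pack_minus_adjb(adjb, packeteer) / packeteer


def olympic_mean(values):
    """Drop the single highest and lowest value, then average the rest."""
    if len(values) < 3:
        raise ValueError("need at least three values to trim both ends")
    trimmed = sorted(values)[1:-1]
    return sum(trimmed) / len(trimmed)


percentages = [percent_of_packeteer(adjb, pack) for adjb, _pmacct, pack in ROWS]

for (adjb, _pmacct, pack), pct in zip(ROWS, percentages):
    print(f"{pack_minus_adjb(adjb, pack):>12d} {pct:9.3f}%")

print(f"trimmed mean: {olympic_mean(percentages):.3f}%")
```

Because the trimmed mean throws away exactly the 14% and -28% rows, it says nothing about them, so they still need separate investigation.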
The concern I have is with the ones that are wildly different (14% and 28%) and the fact that they are in opposite directions. The -3.8% is a bit far off too, but that could just be due to the smallish sample size and might get better over a few days. The 28% could be the same; it's not a very large sample. The 14%, however, is 10GB of data and should be big enough to reflect proper statistical variance given that most of the smaller ones seem to.

I have some issues with the quality of the data extracted from the packeteer and I'm going to see if I can extract it in a better manner. At the moment it is grouped into subnets that are allocated to users and it is on a daily basis. This means that I'm creating a spreadsheet with the info from MySQL for pmacct and then manually copying stuff from the packeteer output with a lot of cross-referencing to match names to IP addresses. The above table took about 3 hours to create, which isn't conducive to continual testing as I make changes.

I'm hopefully going to revisit this early next week and try to get some better information.

regards,
Tony.
Re: [pmacct-discussion] reloading config accuracy
Hi Paolo,

--- On Mon, 7/9/09, Paolo Lucente pa...@pmacct.net wrote:

> From: Paolo Lucente pa...@pmacct.net
> Subject: Re: [pmacct-discussion] reloading config accuracy
> To: Tony td_mi...@yahoo.com
> Cc: pmacct-discussion@pmacct.net
> Received: Monday, 7 September, 2009, 2:28 AM
>
> Hi Tony,
>
> On Sat, Sep 05, 2009 at 09:01:01PM -0700, Tony wrote:
>> I have tested the above suggested configuration and it is working. There is data going into the SQL table now! I am going to let it run in parallel with the unadjusted data (which is going into another table) and then compare the two of them and also compare to the stats being reported by the packeteer.
>
> I've just managed to commit to the CVS repository some code to remove a dependency between actions and checks in the sql_preprocess layer (so that you can roll-back to your original config, which did make sense). I also went through an overall review of the feature - which resulted in a couple of fixes (one right to the 'adjb' section) and some cleanups. Hence I would highly invite you to make your assessment against the version currently in the CVS or alternatively wait until the rc2 release is out, later in the week.

I haven't upgraded yet, I will be doing that now, but I wanted to give you some feedback on what I'm seeing in the old version and we can see if it persists to the new version.

The line I have added to the config file is:

sql_preprocess[abc]: minp=1, adjb=26

I am not sure how it is applying the extra though as it is only making a small difference. I am using 10 minute aggregation and an example of data for a single IP address is:

10306644  10306462  182  0.00177%
83631880  83631698  182  0.00022%
 1016473   1016265  208  0.02047%
 3318523   3318341  182  0.00548%
48220490  48220308  182  0.00038%

The first number is the value (bytes) in the adjusted table, the second is the unadjusted/original number (bytes), the third is the difference between the two and the fourth is the difference as a percentage.
The difference column is ALWAYS 182 or 208 across the whole data range that I checked. These were retrieved using a query like:

mysql> select ip_dst, bytes, stamp_inserted from internet where ip_dst like 'x.x.x.x' and stamp_inserted like '2009-09-18%' order by 3;

You can see that the adjustment isn't making much difference and certainly not what I would expect to see if it was adding 26 bytes to each PACKET (which is what needs to be done to account for ethernet overhead). Since 182 = 7 x 26 and 208 = 8 x 26, this suggests to me that the adjb value isn't applied to each packet, but to only 7 or 8 of the packets or flows.

I just thought I'd alert you to the above and see if the changes you have committed make a difference. Hopefully I will be able to report back again in a few days.

Thanks,
Tony.
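The arithmetic behind the "7 or 8" observation can be checked quickly. This is only a sketch: the 182 and 208 differences and the 26-byte adjustment come from the message above, while the packet count used in the last step is a purely hypothetical figure to show the scale a per-packet adjustment would have.

```python
ADJB = 26  # bytes added per packet, from the adjb=26 setting


def units_adjusted(difference, adjb=ADJB):
    """Return (count, remainder): how many times adjb fits into a difference."""
    return divmod(difference, adjb)


def expected_per_packet_adjustment(packets, adjb=ADJB):
    """Bytes that would be added if adjb were applied to every packet."""
    return packets * adjb


# Differences observed between the adjusted and unadjusted tables.
for diff in (182, 208):
    count, remainder = units_adjusted(diff)
    print(f"{diff} bytes = {count} x {ADJB} (remainder {remainder})")

# Hypothetical packet count for one 10 minute aggregate (not from the thread).
hypothetical_packets = 10_000
print(expected_per_packet_adjustment(hypothetical_packets), "bytes expected")
```

Both differences divide by 26 with no remainder, which is consistent with the adjustment being applied a handful of times per aggregate rather than once per packet.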
Re: [pmacct-discussion] reloading config accuracy
Hi Tony,

On Sun, Sep 20, 2009 at 06:03:18PM -0700, Tony wrote:
> I haven't upgraded yet, I will be doing that now, but I wanted to give you some feedback on what I'm seeing in the old version and we can see if it persists to the new version.
>
> [ ... ]
>
> 10306644  10306462  182  0.00177%
> 83631880  83631698  182  0.00022%
>  1016473   1016265  208  0.02047%
>  3318523   3318341  182  0.00548%
> 48220490  48220308  182  0.00038%
>
> I just thought I'd alert you to the above and see if the changes you have committed make a difference. Hopefully I will be able to report back again in a few days.

I can confirm that this behaviour was precisely the target of a bugfix which was committed to the CVS a few weeks ago and is now part of the 0.12.0rc2 release. Let me know how it goes after the upgrade.

Cheers,
Paolo
Re: [pmacct-discussion] reloading config accuracy
Hi Tony,

On Sat, Sep 05, 2009 at 09:01:01PM -0700, Tony wrote:
> I have tested the above suggested configuration and it is working. There is data going into the SQL table now! I am going to let it run in parallel with the unadjusted data (which is going into another table) and then compare the two of them and also compare to the stats being reported by the packeteer.

I've just managed to commit to the CVS repository some code to remove a dependency between actions and checks in the sql_preprocess layer (so that you can roll-back to your original config, which did make sense). I also went through an overall review of the feature - which resulted in a couple of fixes (one right to the 'adjb' section) and some cleanups. Hence I would highly invite you to make your assessment against the version currently in the CVS or alternatively wait until the rc2 release is out, later in the week.

Cheers,
Paolo
Re: [pmacct-discussion] reloading config accuracy
Hi Paolo,

--- On Sat, 5/9/09, Paolo Lucente pa...@pmacct.net wrote:

>> If I comment out the sql_preprocess line then the config works and I get correct data being put into my SQL table. When I add the preprocess line I get nothing being put into my table. Am I mis-understanding how to use this option ?
>
> The sql_preprocess directive supports both checks and actions; a SQL entry must successfully pass a check in order to undergo one of the available actions (adjb being one of those). In your case you want a dummy check (ie. minp=1) to be put on the stack in order for the action to be evaluated against every SQL entry being committed to the database, ie.

It wasn't obvious to me from the documentation that this was required. The only documentation I could find on the sql_preprocess command was the reference to it in the FAQ (for what I am trying to do) and the reference in the list of config keys, which says the following about sql_preprocess:

==
allows to process aggregates (via a comma-separated list of conditionals and checks) while purging data to the RDBMS thus resulting in a powerful selection tier
==

This wasn't obvious to me that it REQUIRED a check to perform an action, just that you could do both check conditions and actions. Most of the other actions might make more sense if you had some kind of check on them, but I didn't really look at them, I only looked at the one I wanted, which I need to apply to every DB write, not selectively.

> sql_preprocess[abc]: minp=1, adjb=26
>
> Can you give this a try? Perhaps this underlying mechanism is not properly documented; it could be the trigger to change the way it works: if no checks are specified, then evaluate the specified action(s) against the entire queue of committed SQL entries.

I have tested the above suggested configuration and it is working. There is data going into the SQL table now!
I am going to let it run in parallel with the unadjusted data (which is going into another table) and then compare the two of them and also compare to the stats being reported by the packeteer.

Thanks for your assistance, I'll report back in a week/month or two with the results of the testing with adjusted data.

regards,
Tony.
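The check-then-action behaviour Paolo describes for "minp=1, adjb=26" can be pictured with a toy model. This is a rough sketch and not pmacct's code: the Entry type and the preprocess function are hypothetical, and it assumes that an entry failing the minp check is not written, and that adjb adds its value once per packet (the behaviour the thread expects from it).

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for one aggregate about to be written to SQL."""
    packets: int
    byte_count: int


def preprocess(entry: Entry, minp: int, adjb: int) -> Optional[Entry]:
    """Toy model of 'minp=<n>, adjb=<m>': run the check, then the action."""
    # Check (assumed): entries with fewer than minp packets are not written.
    if entry.packets < minp:
        return None
    # Action (assumed): add adjb bytes for every packet in the aggregate.
    return replace(entry, byte_count=entry.byte_count + adjb * entry.packets)


# With minp=1 every entry carrying at least one packet passes the dummy check.
print(preprocess(Entry(packets=10, byte_count=5000), minp=1, adjb=26))
print(preprocess(Entry(packets=0, byte_count=0), minp=1, adjb=26))
```

With minp=1 the check is effectively a pass-through, so the adjustment reaches every entry that carries at least one packet, which is what is needed to account for per-packet overhead.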