Hi guys,

I just wanted to summarize that:

It looks like a fine way to address IPs in a DB seems to be the pgsql way. 
Fast, and you can easily select data using subnets without relying on regexp. 
While in mysql you either obtain a int value difficult to handle, or a string 
very slow to search (choose between plague and cholera).

For some reasons Scott is migrating from <something that works> to <something 
that doesn't match the needs and causes problems>.

So, the DB engine choice was obviously not decided according to the needs. So 
the reasons were wrong. Rollback.
Problem solved: there's no problem :)


regards,
Olivier

Le 18 sept. 2014 à 12:04, [email protected] a écrit :

> Hi Paolo and Scott,
> 
>  Your discussion pointed out an important issue - what is the proper design 
> when it comes to high frequently IP-based queries?
> 
>  Inspired by this thread, my idea is to use postgresql and count on its 
> cidr/inet function. (I am using mysql and store ipv4 as string(15))
> 
>  This way the IP address is readable and processing IP address is also 
> efficient.  It seems worthy to try.  I will try this approach later. 
> 
>  Any comment is appreciated.  
> -----Original Message-----
> From: pmacct-discussion [mailto:[email protected]] On 
> Behalf Of Paolo Lucente
> Sent: Thursday, September 18, 2014 8:44 AM
> To: Scott Pettit - Vorco
> Cc: [email protected]
> Subject: Re: [pmacct-discussion] mySQL ATON/NTOA and pmacct performance
> 
> Hi Scott,
> 
> Great. Let me know how your testing goes. Should you hit any bugs or gaps, 
> feel free to contact me privately to solve them. Then we can summarize on the 
> list.
> 
> Cheers,
> Paolo
> 
> On Wed, Sep 17, 2014 at 04:35:05AM +0000, Scott Pettit - Vorco wrote:
>> In MySQL (any SQL actually), string searches are computationally very 
>> expensive when compared with integers. If you’re dealing with a 
>> relatively small number of rows then the difference isn’t a problem.
>> 
>> It’s like asking “show me all the rows where the number is between 1 
>> and 1000” versus “show me all the rows that look like something 
>> between one and one thousand in words”.  The second statement is the 
>> same question but requires a lot more processing because the database 
>> has to think about every number as a word.  It might only take 
>> fractions of a second longer for one row, but multiply that by 300 
>> million rows (which is what I’m trying to analyse) and the difference 
>> is huge. PostgreSQL handles this really well as it has support for 
>> IP’s as a column type which presumably converts to integers in the back end.
>> 
>> I will try pmacct with ipv4 on my test collector with sql_num_hosts 
>> and if it works well then having the same for inet6_aton would be great.
>> 
>> -Scott
>> 
>> 
>> On 17/09/14 16:03, "[email protected]" <[email protected]> wrote:
>> 
>>> Hi Scott and Paolo,
>>> 
>>> May I ask a question per your good experience?
>>> 
>>> What is the performance difference between using string ipv4 and 
>>> integer ipv4?  And potential impact?
>>> 
>>> I have a PC running sfacctd that collects 2 giga-switches.  And it 
>>> collects around 5M records into mysql per day.
>>> 
>>> I am satisfied with the default string ipv4 for its convenience of 
>>> being readable.  And I would like to know the performance drawback.


_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists

Reply via email to