Hi guys, I just wanted to summarize that:
It looks like a fine way to address IPs in a DB seems to be the pgsql way. Fast, and you can easily select data using subnets without relying on regexp. While in mysql you either obtain a int value difficult to handle, or a string very slow to search (choose between plague and cholera). For some reasons Scott is migrating from <something that works> to <something that doesn't match the needs and causes problems>. So, the DB engine choice was obviously not decided according to the needs. So the reasons were wrong. Rollback. Problem solved: there's no problem :) regards, Olivier Le 18 sept. 2014 à 12:04, [email protected] a écrit : > Hi Paolo and Scott, > > Your discussion pointed out an important issue - what is the proper design > when it comes to high frequently IP-based queries? > > Inspired by this thread, my idea is to use postgresql and count on its > cidr/inet function. (I am using mysql and store ipv4 as string(15)) > > This way the IP address is readable and processing IP address is also > efficient. It seems worthy to try. I will try this approach later. > > Any comment is appreciated. > -----Original Message----- > From: pmacct-discussion [mailto:[email protected]] On > Behalf Of Paolo Lucente > Sent: Thursday, September 18, 2014 8:44 AM > To: Scott Pettit - Vorco > Cc: [email protected] > Subject: Re: [pmacct-discussion] mySQL ATON/NTOA and pmacct performance > > Hi Scott, > > Great. Let me know how your testing goes. Should you hit any bugs or gaps, > feel free to contact me privately to solve them. Then we can summarize on the > list. > > Cheers, > Paolo > > On Wed, Sep 17, 2014 at 04:35:05AM +0000, Scott Pettit - Vorco wrote: >> In MySQL (any SQL actually), string searches are computationally very >> expensive when compared with integers. If you’re dealing with a >> relatively small number of rows then the difference isn’t a problem. >> >> It’s like asking “show me all the rows where the number is between 1 >> and 1000” versus “show me all the rows that look like something >> between one and one thousand in words”. The second statement is the >> same question but requires a lot more processing because the database >> has to think about every number as a word. It might only take >> fractions of a second longer for one row, but multiply that by 300 >> million rows (which is what I’m trying to analyse) and the difference >> is huge. PostgreSQL handles this really well as it has support for >> IP’s as a column type which presumably converts to integers in the back end. >> >> I will try pmacct with ipv4 on my test collector with sql_num_hosts >> and if it works well then having the same for inet6_aton would be great. >> >> -Scott >> >> >> On 17/09/14 16:03, "[email protected]" <[email protected]> wrote: >> >>> Hi Scott and Paolo, >>> >>> May I ask a question per your good experience? >>> >>> What is the performance difference between using string ipv4 and >>> integer ipv4? And potential impact? >>> >>> I have a PC running sfacctd that collects 2 giga-switches. And it >>> collects around 5M records into mysql per day. >>> >>> I am satisfied with the default string ipv4 for its convenience of >>> being readable. And I would like to know the performance drawback. _______________________________________________ pmacct-discussion mailing list http://www.pmacct.net/#mailinglists
