Let me explain in more detail what I'm doing. I have a Windows webserver I'm working with and I've built a program to import IIS log data from about 50 websites for a specific range of dates and then doing some analysis SQL queries on the log detail specifically focusing on activity which takes place across the multiple sites/logs on the server. Ultimately, I'm searching for bad actors attempting to do bad things on this server. I have a look up table with URL requests we consider to be bad and I'm doing a look up on the Client IP addresses to determine whether in the world the activity is coming from. To that end, I look up each unique IP address in the IP2LOCATION-LITE-DB3 list and then update the temporary log table table with the source locations. Once we identify a bad actor, I have a blocking system I use to completely block blacklisted IP's from even getting a response from the server, so we don't waste a lot of time generating 403 or 404 errors.
But it just dawned on me while writing this that one way to squeeze a little more speed out of the process, is to just update a separate table of the unique IP addresses and then join the results in my analysis queries rather than physically updating the working log tables. This is really an exercise in optimizing my programming as much as possible, although we are already seeing some significant results from our efforts so far. I cut my server memory usages in half the first day I started this strategy and shaved 35% off the number of active processes running at any one time. It's been like getting a new server. But more importantly, I'm having loads of fun testing what I can do with this little program and how fast I can make it go. :) Thanks! Paul H. Tarver Tarver Program Consultants, Inc. -----Original Message----- From: ProfoxTech [mailto:profoxtech-boun...@leafe.com] On Behalf Of Ted Roche Sent: Saturday, December 16, 2023 3:42 PM To: profoxt...@leafe.com Subject: Re: Index & Seek Question IPv4 addresses are 32-bit numbers expressed for human convenience as 4 numbers between 0 and 255 separated by dots, but they can really be handled more easily as a single 32-bit integer. So, have you added a column to your table that stores the integer value of the IP address? ALTER TABLE YourTable ADD COLUMN dIPADDR INT(16) UPDATE YourTable SET dIPADDR = ipsegment[1]*256^3 +ipsegment[2]*256*2 + ipsegment[3]*256 + ipsegment[4] Then add an index on that column. Convert your search value using the same algorithm. An answer on a 50,000 row table should be returned in less than one second. On Fri, Dec 15, 2023 at 6:08 PM Paul H. Tarver <p...@tpcqpc.com> wrote: > > Been quiet in here for a while, so I thought I'd toss out something I'm > tinkering with and ask for your advice. Besides, I want to keep my name on > the monthly top 10! J > > > > I have a large table populated with the IP2LOCATION-LITE-DB3 list. I'm > running a process to convert an IPV4 address into a decimal number and then > locate the first record in my table where the value I'm looking up is > between the From_IP numeric value and the To_IP numeric value in the table > > > > Here's the code snippet I have working now: > > ---------------------------------------------------------------------- > > LPARAMETERS toParams > > LOCAL lcBase2IP, lnElements, lnIpNumber > > > > IF USED("ip_locs") > > > > lcCurrSel = ALIAS() > > > > lnElements = ALINES(ipsegment, toParams.ocClientIp, 1, ".") > > > > IF lnElements = 4 > > > > lcBase2IP = > PADL(THISFORM.utilities.DEC2BASX(VAL(ALLTRIM(ipsegment[1])), 2), 8, "0") + ; > > > PADL(THISFORM.utilities.DEC2BASX(VAL(ALLTRIM(ipsegment[2])), 2), 8, "0") + ; > > > PADL(THISFORM.utilities.DEC2BASX(VAL(ALLTRIM(ipsegment[3])), 2), 8, "0") + ; > > > PADL(THISFORM.utilities.DEC2BASX(VAL(ALLTRIM(ipsegment[4])), 2), 8, "0") > > > > lnIpNumber = THISFORM.utilities.bin2dec(lcBase2IP) > > > > SELECT ip_locs > > LOCATE FOR lnIpNumber => ip_locs.ip_from AND lnIpNumber <= > ip_locs.ip_to > > > > IF FOUND('ip_locs') > > > > toParams.ocCountry_Code = ip_locs.Country_Code > > toParams.ocCountry_Name = ip_locs.Country_Name > > toParams.ocRegion_Name = ip_locs.Region_Name > > toParams.ocCity_Name = ip_locs.City_Name > > > > ENDIF > > > > ENDIF > > > > SELECT (lcCurrSel) > > > > ENDIF > > > > RETURN > > ----------------------------------------------------------- > > > > This works fine and it really isn't that slow. It takes about 30 - 60 > seconds to do the location look up on a list of 50,000+ ip address records. > I'm pretty sure I'm getting Rushmore optimization on the LOCATE FOR line, > but I've been wondering if there is a way to use SEEK with an index to find > a value between the numeric IP_FROM field and the numeric IP_TO field. But I > cannot figure out how I can create an index that would give me Rushmore > optimization and let me use the SEEK command for a value between those to > numerics. > > > > FYI, I tried doing a simple SQL query to a temporary cursor, but > surprisingly, the LOCATE FOR command is actually faster. Hey, I know I'm > being greedy, but I'm wondering if it is possible to squeeze any more speed > out of this process. I may be using the best method, but I figured if y'all > had a minute you might give me a little feedback. > > > > Thanks in advance! > > > > Paul H. Tarver > Tarver Program Consultants, Inc. > > > > > > > > --- StripMime Report -- processed MIME parts --- > multipart/alternative > text/plain (text body -- kept) > text/html > --- > [excessive quoting removed by server] _______________________________________________ Post Messages to: ProFox@leafe.com Subscription Maintenance: https://mail.leafe.com/mailman/listinfo/profox OT-free version of this list: https://mail.leafe.com/mailman/listinfo/profoxtech Searchable Archive: https://leafe.com/archives This message: https://leafe.com/archives/byMID/1b7401da31c7$ea5b8410$bf128c30$@tpcqpc.com ** All postings, unless explicitly stated otherwise, are the opinions of the author, and do not constitute legal or medical advice. This statement is added to the messages for those lawyers who are too stupid to see the obvious.