Re: [PHP] Re: Fuzzy Array Search
On 06/07/2011 04:35 PM, Alex Nikitin wrote:
> Shawn, == is not good for string comparison, it's a bad habit that one
> should get out of, use ===, it's much safer.

Yes, except that I was comparing a string to an array of integers :)

> Also try the same algorithm on 10 arrays of some number of values
> 10-1000 perhaps, that would give you better performance statistics :)

LOOP WITH PREG_MATCH: 650.627260 seconds
PREG_GREP: 123.110386 seconds

This was with 100,000 arrays, each with 1,000 values. Of course, the loop iterates through the entire array, which is needed to find all matches. If only one match is needed, then you can just break out of the loop to save time. If we assume that half of the matches will be in the lower 500 and half in the upper, then we can halve the time for the loop, but it is still 325 seconds.

Thanks!
-Shawn

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
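The "break out of the loop" idea mentioned above can be sketched roughly as follows. This is only an illustration, not the code that was benchmarked: first_match_key() is a hypothetical helper name, and the sample data is made up. It stops at the first hit, whereas preg_grep() always scans every element.

```php
<?php
// Hypothetical helper (not from the thread): return the key of the first
// value matching $pattern, or null if nothing matches.
// Assumes every value in $haystack is a string.
function first_match_key(array $haystack, $pattern)
{
    foreach ($haystack as $key => $value) {
        if (preg_match($pattern, $value) === 1) {
            return $key; // early exit: the rest of the array is never scanned
        }
    }
    return null;
}

// Made-up sample data for illustration.
$haystack = array('alpha', 'beta', 'something here', 'gamma', 'something else');

// First match only: the loop stops at index 2.
$first = first_match_key($haystack, '/something/');

// All matches: preg_grep() walks the whole array and preserves the keys.
$all = array_keys(preg_grep('/something/', $haystack));
```

Whether the early exit pays off depends on where the first match tends to sit; if every match is needed, the full scan cannot be avoided.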
Re: [PHP] Re: Fuzzy Array Search
On 06/07/2011 04:28 PM, Floyd Resler wrote:
> Shawn,
> I'm terrible with regular expressions. Could you give me an example?
>
> Thanks!
> Floyd

Depends. Could be as simple as this to return an array of all occurrences of $needle in any of the $haystack_array values:

    $haystack_array = array('something is here', 'nothing matches here', 'there will be a somethingsomething match here');
    $needle = 'something';
    $matches_array = preg_grep("/$needle/", $haystack_array);

Depends on where you are getting the search criteria and how you want it to match in the array.

BTW... If this is originally in a database then you can do it there with the LIKE operator and % wildcard.
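If the needle comes from user input, it may contain characters that mean something in a regular expression. A minimal sketch of one way to handle that, assuming a literal, case-insensitive substring match is wanted (the sample values and the /i flag are assumptions, not part of the original example):

```php
<?php
// Made-up sample data; the third value contains regex metacharacters.
$haystack_array = array(
    'something is here',
    'nothing matches here',
    'price is 3.50 (approx.)',
);
$needle = '3.50 (approx.)';

// preg_quote() escapes ".", "(", ")" and the "/" delimiter, so the needle
// is matched literally instead of being interpreted as a pattern.
$pattern = '/' . preg_quote($needle, '/') . '/i';

// preg_grep() keeps the original keys, so the keys of the result are the
// locations of the matching elements in $haystack_array.
$matches_array = preg_grep($pattern, $haystack_array);
$locations = array_keys($matches_array);
```

Because the keys are preserved, the same call gives both the matching values and where they sit in the array.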
Re: [PHP] Re: Fuzzy Array Search
It runs fast on my 2.33 GHz Core 2, and about as fast, on this small data set, on the dual 6-core with 96GB RAM or the 8-core 9GB box. It depends on the size of your data set, memory speed and latency, and a minuscule amount of processing power (once again assuming a small data set).

That said, you could probably do some clever stuff to minimize the range you are looking in. For example, you could implode the array and search it, capture the offset of the match, and use the average record size to estimate which records the match falls in. That way you could potentially cut out a lot of records that you are, within a certain probability, sure the result is not in, making your search execute faster by not even looking at the majority of the data in most cases. This would be interesting to test out, actually. You could also sort the array to further narrow down the search by some criteria, what have you. This would all apply if you are searching very large data sets; I am talking about multiple billion data points. And all that said, arrays are not really a good data structure for searching anyway, that's why they are rarely used in file systems or as memory data structures ;)

Shawn, == is not good for string comparison, it's a bad habit that one should get out of; use ===, it's much safer.

Also try the same algorithm on 10 arrays of some number of values, 10-1000 perhaps, that would give you better performance statistics :)

--
Alex
--
The trouble with programmers is that you can never tell what a programmer is doing until it’s too late. ~Seymour Cray

On Tue, Jun 7, 2011 at 5:25 PM, Shawn McKenzie wrote:
> If you are using a straight equality comparison, then the loop would be
> faster (but then array_search() would probably be better). However, if you
> need to use preg_match() in the loop ("fuzzy search"), then
> preg_grep() will be much faster than the loop.
>
> LOOP WITH PREG_MATCH: 10
> 0.435957 seconds
> PREG_GREP: 10
> 0.085604 seconds
>
> LOOP WITH IF ==: 10
> 0.044594 seconds
> PREG_GREP: 10
> 0.091519 seconds
>
> --
> Thanks!
> -Shawn
> http://www.spidean.com
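To make the == versus === point concrete, here is a small sketch using numeric-looking strings. The values are made up for illustration; the behaviour shown is PHP's documented comparison of numeric strings.

```php
<?php
// Both operands are numeric strings, so == compares them as numbers.
$loose = ('1e3' == '1000');    // true: 1000.0 equals 1000

// === compares type and exact string content.
$strict = ('1e3' === '1000');  // false: the strings differ

// in_array() and array_search() also compare loosely unless the third
// (strict) argument is true.
$ids = array('1000', '2000');
$looseHit  = in_array('1e3', $ids);        // true
$strictHit = in_array('1e3', $ids, true);  // false
```

The same applies to a hand-written loop: an `if ($value == $needle)` test can report matches that an exact string comparison would not.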
Re: [PHP] Re: Fuzzy Array Search
On 06/07/2011 03:57 PM, Floyd Resler wrote:
> I actually do need the location since I need to get the resulting match. I
> went ahead and tried to iterate the array and it was MUCH faster than I
> expected it to be! Of course, considering the machine I'm running this on is
> a monster (2.66 GHz 8 cores, 24GB of RAM) it shouldn't have surprised me!
>
> Thanks!
> Floyd

If you are using a straight equality comparison, then the loop would be faster (but then array_search() would probably be better). However, if you need to use preg_match() in the loop ("fuzzy search"), then preg_grep() will be much faster than the loop.

LOOP WITH PREG_MATCH: 10
0.435957 seconds
PREG_GREP: 10
0.085604 seconds

LOOP WITH IF ==: 10
0.044594 seconds
PREG_GREP: 10
0.091519 seconds

--
Thanks!
-Shawn
http://www.spidean.com
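The benchmark script itself was not posted, so the following is only a guess at its general shape: a minimal timing harness, assuming a synthetic haystack of 1,000 strings in which every 100th one contains the needle. It prints a match count followed by the elapsed time; what the "10" in the figures above denotes is not stated in the thread, and the timings will differ from machine to machine.

```php
<?php
// Synthetic data (an assumption): 1,000 strings, every 100th contains "something".
$haystack = array();
for ($i = 0; $i < 1000; $i++) {
    $haystack[] = ($i % 100 === 0) ? "item $i something" : "item $i";
}
$pattern = '/something/';

// Variant 1: explicit loop calling preg_match() once per element.
$start = microtime(true);
$loopMatches = array();
foreach ($haystack as $key => $value) {
    if (preg_match($pattern, $value)) {
        $loopMatches[$key] = $value;
    }
}
$loopSeconds = microtime(true) - $start;

// Variant 2: a single preg_grep() call over the whole array.
$start = microtime(true);
$grepMatches = preg_grep($pattern, $haystack);
$grepSeconds = microtime(true) - $start;

printf("LOOP WITH PREG_MATCH: %d\n%f seconds\n", count($loopMatches), $loopSeconds);
printf("PREG_GREP: %d\n%f seconds\n", count($grepMatches), $grepSeconds);
```

A single pass over 1,000 elements finishes too quickly to time reliably, so repeating each variant many times and averaging would give steadier numbers.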
Re: [PHP] Re: Fuzzy Array Search
On Jun 7, 2011, at 4:42 PM, Alex Nikitin wrote:
> If you don't need the location, you can implode the array and use preg
> match, quickly testing it, that gives you about 4.5 times performance
> increase, but it wont give you the location, only if a certain value exists
> within the array... You can kind of do some really clever math to get your
> search parameters from there, which would be feasible on really large data
> sets, but if you want location, you will have to iterate at some point...

I actually do need the location since I need to get the resulting match. I went ahead and tried to iterate the array and it was MUCH faster than I expected it to be! Of course, considering the machine I'm running this on is a monster (2.66 GHz 8 cores, 24GB of RAM) it shouldn't have surprised me!

Thanks!
Floyd
Re: [PHP] Re: Fuzzy Array Search
If you don't need the location, you can implode the array and use preg_match(). Quickly testing it, that gives you about a 4.5 times performance increase, but it won't give you the location, only whether a certain value exists within the array... You can kind of do some really clever math to get your search parameters from there, which would be feasible on really large data sets, but if you want the location, you will have to iterate at some point...

(sorry, I keep on hitting reply instead of reply to all)

--
The trouble with programmers is that you can never tell what a programmer is doing until it’s too late. ~Seymour Cray

On Tue, Jun 7, 2011 at 2:57 PM, Shawn McKenzie wrote:
> On 06/07/2011 12:45 PM, Floyd Resler wrote:
> > What would be the easiest way to do a fuzzy array search? Can I do this
> > without having to step through the array?
> >
> > Thanks!
> > Floyd
>
> I use preg_grep()
>
> --
> Thanks!
> -Shawn
> http://www.spidean.com
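The implode-and-match idea can be sketched like this. It is an illustration under assumptions, not Alex's tested code: haystack_contains() is a hypothetical name, the needle is treated as a literal substring, and a newline is used as the glue on the assumption that neither the values nor the needle contain one.

```php
<?php
// Hypothetical helper (not from the thread): report whether $needle occurs
// inside any element of $haystack, without saying which one.
function haystack_contains(array $haystack, $needle)
{
    // Join with "\n" so a match cannot span two neighbouring elements
    // (assumes neither the values nor the needle contain a newline).
    $blob = implode("\n", $haystack);

    // preg_quote() makes the needle match literally.
    return preg_match('/' . preg_quote($needle, '/') . '/', $blob) === 1;
}

// Made-up sample data for illustration.
$haystack = array('something is here', 'nothing matches here');

$found   = haystack_contains($haystack, 'something'); // present in element 0
$missing = haystack_contains($haystack, 'here noth'); // would need to span two elements
```

This answers only the yes/no question. To get the location, the array still has to be walked, for example with preg_grep() or a loop.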
[PHP] Re: Fuzzy Array Search
On 06/07/2011 12:45 PM, Floyd Resler wrote:
> What would be the easiest way to do a fuzzy array search? Can I do this
> without having to step through the array?
>
> Thanks!
> Floyd

I use preg_grep()

--
Thanks!
-Shawn
http://www.spidean.com