Re: [PHP] Re: Fuzzy Array Search

2011-06-07 Thread Alex Nikitin
If you don't need the location, you can implode the array and use
preg_match(). In a quick test that was about 4.5 times faster, but it won't
give you the location, only whether a certain value exists within the
array. You could do some clever math to narrow your search parameters from
there, which might be feasible on really large data sets, but if you want
the location, you will have to iterate at some point.
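
A minimal sketch of the implode-and-match idea, assuming neither the values nor the needle contain the separator (a newline here). The helper name and the sample data are made up for illustration:

```php
<?php
// Hypothetical helper: reports only whether any element contains $needle,
// not which element it was found in.
function haystack_contains(array $haystack, string $needle): bool
{
    // Join everything into one string so the regex engine scans it in one call.
    // Assumes neither the elements nor the needle contain "\n".
    $joined = implode("\n", $haystack);

    // preg_quote() escapes regex metacharacters in the needle.
    return preg_match('/' . preg_quote($needle, '/') . '/', $joined) === 1;
}

$haystack = array('something is here', 'nothing matches here', 'another value');
var_dump(haystack_contains($haystack, 'thing'));   // bool(true)
var_dump(haystack_contains($haystack, 'missing')); // bool(false)
```

Whether this beats a plain loop will depend on the data; the 4.5x figure came from one quick test.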

(Sorry, I keep hitting reply instead of reply to all.)

--
The trouble with programmers is that you can never tell what a programmer is
doing until it’s too late.  ~Seymour Cray



On Tue, Jun 7, 2011 at 2:57 PM, Shawn McKenzie nos...@mckenzies.net wrote:

 On 06/07/2011 12:45 PM, Floyd Resler wrote:
  What would be the easiest way to do a fuzzy array search?  Can I do this
 without having to step through the array?
 
  Thanks!
  Floyd
 

 I use preg_grep()

 --
 Thanks!
 -Shawn
 http://www.spidean.com

 --
 PHP General Mailing List (http://www.php.net/)
 To unsubscribe, visit: http://www.php.net/unsub.php




Re: [PHP] Re: Fuzzy Array Search

2011-06-07 Thread Floyd Resler

I actually do need the location since I need to get the resulting match.  I 
went ahead and tried to iterate the array and it was MUCH faster than I 
expected it to be!  Of course, considering the machine I'm running this on is a 
monster (2.66 GHz 8 cores, 24GB of RAM) it shouldn't have surprised me!

Thanks!
Floyd





Re: [PHP] Re: Fuzzy Array Search

2011-06-07 Thread Shawn McKenzie
If you are using a straight equality comparison then the loop would be
faster (but then array_search() would probably be better); however, if
you need to use preg_match() in the loop (fuzzy search), then
preg_grep() will be much faster than the loop.

LOOP WITH PREG_MATCH: 10
 0.435957 seconds
PREG_GREP: 10
 0.085604 seconds

LOOP WITH IF ==: 10
 0.044594 seconds
PREG_GREP: 10
 0.091519 seconds
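
For anyone who wants to reproduce this kind of comparison, here is a rough sketch of a timing script. The test data, the pattern and the variable names are assumptions, not the script that produced the numbers above, and the timings will differ from machine to machine:

```php
<?php
// Hypothetical test data: 1,000 short strings, a few of which contain "needle".
$haystack = array();
for ($i = 0; $i < 1000; $i++) {
    $haystack[] = ($i % 100 === 0) ? "row $i has a needle in it" : "row $i";
}
$pattern = '/needle/';

// Approach 1: loop and call preg_match() once per element.
$start = microtime(true);
$loopMatches = array();
foreach ($haystack as $key => $value) {
    if (preg_match($pattern, $value)) {
        $loopMatches[$key] = $value;
    }
}
$loopTime = microtime(true) - $start;

// Approach 2: let preg_grep() walk the array internally.
$start = microtime(true);
$grepMatches = preg_grep($pattern, $haystack);
$grepTime = microtime(true) - $start;

printf("LOOP WITH PREG_MATCH: %d\n %f seconds\n", count($loopMatches), $loopTime);
printf("PREG_GREP: %d\n %f seconds\n", count($grepMatches), $grepTime);
```

Both approaches return the same elements under the same keys, so only the timing differs.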

-- 
Thanks!
-Shawn
http://www.spidean.com




Re: [PHP] Re: Fuzzy Array Search

2011-06-07 Thread Alex Nikitin
It runs fast on my 2.33 Core 2, and on this small data set it runs about
as fast on the dual 6-core with 96GB RAM or the 8-core 9GB box. It depends
on the size of your data set, memory speed and latency, and a minuscule
amount of processing power (once again assuming a small data set).

That said, you could probably do some clever stuff to narrow the range you
are searching. For example, you could implode the array, search the
resulting string and capture the match offset, then use the average record
size to estimate where in the array the match sits. That could rule out a
lot of records that you are sure, within a certain probability, the result
is not in, so in most cases the search would never look at the majority of
the data. This would be interesting to test out, actually. You could also
sort the array to further narrow down the search by some criteria, what
have you. This would all apply only if you are searching very large data
sets; I am talking about multiple billion data points. And all that said,
arrays are not really a good data structure for searching anyway, which is
why they are rarely used in file systems or as in-memory data structures ;)
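
As a rough illustration of the offset idea, the sketch below captures the match offset in the imploded string and turns it into an array index. It is a simpler, exact variant (it counts the separators before the match) rather than the probabilistic estimate based on average record size, and it still scans part of the string. The helper name is made up, and it assumes a list with keys 0..n-1, values without newlines, and a pattern that cannot match across a newline:

```php
<?php
// Hypothetical helper: returns the index of the first element matching
// $pattern, or null if nothing matches.
function first_match_index(array $haystack, string $pattern): ?int
{
    $joined = implode("\n", $haystack);

    // PREG_OFFSET_CAPTURE makes $m[0] an array of (matched text, byte offset).
    if (preg_match($pattern, $joined, $m, PREG_OFFSET_CAPTURE) !== 1) {
        return null;
    }
    $offset = $m[0][1];

    // Every separator before the match marks the end of one earlier element.
    return substr_count(substr($joined, 0, $offset), "\n");
}

$haystack = array('alpha', 'beta', 'gamma something', 'delta');
var_dump(first_match_index($haystack, '/some/')); // int(2)
var_dump(first_match_index($haystack, '/zzz/'));  // NULL
```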

Shawn, == is not good for string comparison; it's a bad habit that one
should get out of. Use ===, it's much safer.
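
A short example of the kind of surprise == can cause with strings, using made-up values. When both operands are numeric strings, == compares them as numbers:

```php
<?php
// Both operands are numeric strings, so == compares them as numbers.
var_dump('1e3' == '1000');   // bool(true)
var_dump('1e3' === '1000');  // bool(false)

// The same trap with in_array(), which uses == unless the third argument is true.
$ids = array('1000', '2000');
var_dump(in_array('1e3', $ids));        // bool(true)
var_dump(in_array('1e3', $ids, true));  // bool(false)
```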

Also try the same algorithm on 10 arrays of some number of values
10-1000 perhaps, that would give you better performance statistics :)



-- Alex



Re: [PHP] Re: Fuzzy Array Search

2011-06-07 Thread Shawn McKenzie


On 06/07/2011 04:28 PM, Floyd Resler wrote:

 Shawn,
   I'm terrible with regular expressions.  Could you give me an example?

 Thanks!
 Floyd



Depends.  Could be as simple as this to return an array of all the
$haystack_array values that contain $needle:

$haystack_array = array('something is here', 'nothing matches here',
    'there will be a somethingsomething match here');
$needle = 'something';
// The pattern must be a delimited string; preg_quote() escapes any
// regex metacharacters in $needle.
$matches_array = preg_grep('/' . preg_quote($needle, '/') . '/', $haystack_array);

Depends on where you are getting the search criteria and how you want it
to match in the array.  BTW...  If this is originally in a database then
you can do it there with the LIKE operator and % wildcard.
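
Since the location came up earlier in the thread: preg_grep() keeps the original array keys, so the keys of the result tell you where the matches were. A small sketch along those lines, where the case-insensitive flag is just an assumption about what "fuzzy" might mean here:

```php
<?php
$haystack_array = array('something is here', 'nothing matches here',
    'there will be a somethingsomething match here');
$needle = 'Something';

// preg_quote() keeps characters such as "." or "/" in the needle literal;
// the trailing "i" flag makes the match case-insensitive.
$pattern = '/' . preg_quote($needle, '/') . '/i';

$matches_array = preg_grep($pattern, $haystack_array);

// preg_grep() preserves the original keys, so they give the locations.
print_r(array_keys($matches_array)); // keys 0 and 2
```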




Re: [PHP] Re: Fuzzy Array Search

2011-06-07 Thread Shawn McKenzie


On 06/07/2011 04:35 PM, Alex Nikitin wrote:
 Shawn, == is not good for string comparison; it's a bad habit that one
 should get out of. Use ===, it's much safer.
Yes, except that I was comparing a string to an array of integers :)

 Also try the same algorithm on 10 arrays of some number of values
 10-1000 perhaps, that would give you better performance statistics :)

LOOP WITH PREG MATCH:  650.627260 seconds
PREG_GREP:  123.110386 seconds

This was with 100,000 arrays each with 1,000 values.  Of course the loop
iterates through the entire array, which is needed to find all matches.
If only one match is needed then you can just break out of the loop to
save time.  If we assume that half of the matches will be in the lower
500 and half in the upper, then on average we can halve the time for the
loop, but that is still about 325 seconds.
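
A minimal sketch of the break-early version, for the case where only the first match is needed. The helper name and sample data are made up for illustration:

```php
<?php
// Hypothetical helper: returns the key of the first element matching
// $pattern, or null if none does.
function first_match_key(array $haystack, string $pattern)
{
    foreach ($haystack as $key => $value) {
        if (preg_match($pattern, $value) === 1) {
            return $key; // stop at the first hit instead of scanning the rest
        }
    }
    return null;
}

$haystack = array('a' => 'nothing', 'b' => 'something', 'c' => 'something else');
var_dump(first_match_key($haystack, '/some/')); // string(1) "b"
var_dump(first_match_key($haystack, '/zzz/'));  // NULL
```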

Thanks!
-Shawn
