Ok, I'm at a point where I'm just going to throw this out there and see if 
anyone has any good solutions because I'm just not seeing one.  Of course 
there's always a better solution, but I'm too brain dead right now to see it.  
So any input would be appreciated.

Also..forgive me if this is more of a MySQL issues.. I'm trying to find the 
right combination of MySQL and PHP interaction to make this all work properly 
so thought this would be the better forum (versus just the MySQL lists) and 
that there would be plenty of people with insight on both to add their 2 cents.

Basic info:

We have a table with contacts and we need to determine if someone's already in 
there but with possibly slightly different info or if the old data contains 
duplicates.  Current criteria (I'm sure there's better):

If home phone matches any existing home, work or alt phones.
If work phone matches any existing home, work or alt phones.
If alt phone matches any existing home, work or alt phones.
If a concat of last name, street address and zip match any existing.

Table currently contains about 60,000 entries (imported from old system) and we 
can expect it to grow.

Indexes are on home, work and alt phones, as well as last name, street address 
and zip.  Tried it without indexes, it was slow.. tried it with indexes.. still 
about the same speed.

Tried seeing if putting the dupe check into a prepared statement sped it up 
any.  It didn't.

Analyzed table.  Same speed.

Out of 60,000 entries, we have roughly 40,000+ unique phone numbers (that's 
doing a union on all three phone number columns).  So even pre-compiling just 
that list and querying another table is going to be similar in time issues it 
seems.

For new contacts being brought in, we might have 30-60 at a time (maybe more) 
so this query would end up being run on each of the 30-60.  At 1.5-2 sec for 
each query, that ends up being a full minute or two to do the whole scan.

If I pull the phone #'s and name/address/zip combies into an array in PHP 
first, it's only like 12 secs, but each user would end up pulling it (and maybe 
store in a session variable or something) and re-pull it periodically to make 
sure it's semi-fresh.  This sounds like the quicker, but not the "best" 
solution.

MySQL version 4.1.11-standard

PHP version 4.3.4


Any tricks, tips or philosophical and/or theoretical things I've just not 
seeing, please let me know.

Thanks

-TG


___________________________________________________________
Sent by ePrompter, the premier email notification software.
Free download at http://www.ePrompter.com.

-- 
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to