Re: [PHP-DB] Building indexes

listread Fri, 19 Mar 2010 13:42:32 -0700

On 3/18/2010 4:59 PM, Chris wrote:
> What do your queries end up looking like?


Here's an example of a tough query (there are lots of Smiths!).

SELECT voter1.County, voter1.`First Name`, voter1.`Middle Name`,voter1.`Last Name`, voter1.`Residential ZipCode`,voter1.`House Number`,voter1.`Mailing Address`, voter1.`Street Name`, voter1.`ResidentialCity`, voter1.`Voter ID`, voter1.`Congressional Distri`,voter1.idVoter,voter1.County FROM voter1 WHERE voter1.`County` LIKE '%' ANDvoter1.`Last Name` LIKE 'Smith%' AND voter1.`First Name` LIKE 'John%'AND voter1.`Residential ZipCode` LIKE '%' LIMIT 0, 10;

Do you think the LIKE '%' hurts for the fields we don't have searchcriteria? We do that to keep the query definition simple and flexible.

More upfront db prep can be justified since we have to do two or threehundred thousand queries in the course of validating all the signatureson a petition.


Thanks!

- Ron

listread wrote:
Chris,
I just assumed that everyone on this list was using MySQL... That'swhat we're using (v. 5.1.45 GA) with InnoDB as the engine.
Most people are but there are lots of types of databases out there :)
(I just read your tutorial athttp://www.designmagick.com/article/16/ It was very helpful - I lookforward to checking out more of your articles.)
If it is just as efficient to use multiple separate indexes, thatwould make index building less complicated on our large db.
It is, though wildcard searches can't always use indexes.

If you do

field like 'abcdef%';
then an index can potentially be used because the db (mysql orotherwise) can look at the start of the string to see if it matches.The longer the string the more likely an index can be used (eg doingfield like 'a%' probably won't use an index, it'll end up beingquicker to scan the actual data).
If you do

field like '%abcdef%';
then an index can't be used since abcdef could appear anywhere in thestring.
Without a large dataset, it hard to truly test a system and if youhave a large dataset, like we do, it takes quite a while to buildindexes.
Definitely, it's the best way to test and also the hardest sincerebuilding the db takes so long.
Our project is a petition signature validation suite. Since many ofthe handwritten names and addresses on petition sheets are difficultto read, the user needs to be able to do some fuzzy searching.Sometimes it's easier to read the address than it is the name. Thezip code is usually easy to read. We almost always need to use LIKEqueries, since some part of the name or address is typically hard toread. (We try to use as many of the leading characters as we can andwildcard the remaining.)
I'd suggest fulltext but that won't work with innodb, only myisam. Youcould do something like keep the addresses and names in a separatemyisam table just for searching, though that means a whole new importprocess and also means you'd end up having to do two queries (maybe asubquery or join, you'd have to test) - one do to full text search andone to get the rest of the data based on the result of the first.
What do your queries end up looking like?



--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

Re: [PHP-DB] Building indexes

Reply via email to