On 2011-08-16, adam_pgsql <adam_pg...@witneyweb.org> wrote:
>
> Hi,
>
> I have a query hitting a table of 25 million rows. The table has a
>text field ('identifier') which i need to query for matching rows. The
>question is if i have multiple strings to match against this field I
>can use multiple OR sub-statements or multiple statements in a UNION.
>The UNION seems to run quicker.... is this to be expected? or is there
>anything else I can do improve the speed of this query? Some query
>details:        

>  WHERE
>      ( lower(identifier) LIKE lower('BUGS0000001884677') OR
>        lower(identifier) LIKE lower('BUGS0000001884678') OR
>        lower(identifier) LIKE lower('BUGS0000001884679') OR
>        lower(identifier) LIKE lower('SpTIGR4-2210 (6F24)') )
>  ORDER BY a.identifier;

don't use like use regex.

...

WHERE identifier ~* 
E'^(BUGS0000001884677|BUGS0000001884678|BUGS0000001884679|SpTIGR4-2210
\\(6F24\\))$'

or 

where lower(identifier) ~* 
lower(E'^(BUGS0000001884677|BUGS0000001884678|BUGS0000001884679|SpTIGR4-2210
\\(6F24\\))$')

on the other hand you aren't doing any pattern stuff. - you you could
just use 'in':

WHERE lower(identifier) in
  (lower('BUGS0000001884677'),
   lower('BUGS0000001884678'),
   lower('BUGS0000001884679'),
   lower('SpTIGR4-2210(6F24)') 
  )

or if you need like, use like any:

WHERE lower(identifier) like ANY
  (lower('BUGS0000001884677'),
   lower('BUGS0000001884678'),
   lower('BUGS0000001884679'),
   lower('SpTIGR4-2210(6F24)') 
  )


> Also which should scale better if I add more strings to match? would there be 
> any better design patterns for this problem?

use one of the above: preferably in, else regex, or failing that like any.
"= any" will also work but I don't thing it will ever be better than "in"

-- 
⚂⚃ 100% natural


-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

Reply via email to