[PERFORM] why index scan not working when using 'like'?

2003-11-25 Thread LIANHE SHAO
Hi all,

I want to use index on the gene_symbol column in my
query and gene_symbol is indexed. but when I use
lower (gene_symbol) like lower('%mif%'), the index
is not used. While when I change to
lower(gene_symbol) = lower('mif'), the index is used
and index scan works, but this is not what I like. I
want all the gene_symbols  containing substring
'mif' are pulled out, and not necessarily exactly match.

could anybody give me some hints how to deal with 
this. If I do not used index, it take too long for
the query.

   
PGA explain select distinct probeset_id, chip,
gene_symbol, title, sequence_description, pfam from
affy_array_annotation where lower(gene_symbol) like
upper('%mif%');
   QUERY PLAN
-
 Unique  (cost=29576.44..29591.44 rows=86 width=265)
   -  Sort  (cost=29576.44..29578.59 rows=857
width=265)
 Sort Key: probeset_id, chip, gene_symbol,
title, sequence_description, pfam
 -  Seq Scan on affy_array_annotation 
(cost=0.00..29534.70 rows=857 width=265)
   Filter: (lower((gene_symbol)::text)
~~ 'MIF%'::text)
(5 rows)


PGA= explain select distinct probeset_id, chip,
gene_symbol, title, sequence_description, pfam from
affy_array_annotation where lower(gene_symbol) =
upper('%mif%');
   
 QUERY PLAN
-
 Unique  (cost=3433.44..3448.44 rows=86 width=265)
   -  Sort  (cost=3433.44..3435.58 rows=857 width=265)
 Sort Key: probeset_id, chip, gene_symbol,
title, sequence_description, pfam
 -  Index Scan using gene_symbol_idx_fun1
on affy_array_annotation  (cost=0.00..3391.70
rows=857 width=265)
   Index Cond:
(lower((gene_symbol)::text) = '%MIF%'::text)
(5 rows)





Regards,
William


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] why index scan not working when using 'like'?

2003-11-25 Thread Dror Matalon

Hi,

Searches with like or regexes often can't use the index. Think of the index as
a sorted list of your items. It's easy to find an item when you know it
starts with mif so ('mif%' should use the index). But when you use a
'like' that starts with '%' the index is useless and the search needs to
do a sequential scan.

Regards,

Dror

On Tue, Nov 25, 2003 at 07:48:49PM +, LIANHE SHAO wrote:
 Hi all,
 
 I want to use index on the gene_symbol column in my
 query and gene_symbol is indexed. but when I use
 lower (gene_symbol) like lower('%mif%'), the index
 is not used. While when I change to
 lower(gene_symbol) = lower('mif'), the index is used
 and index scan works, but this is not what I like. I
 want all the gene_symbols  containing substring
 'mif' are pulled out, and not necessarily exactly match.
 
 could anybody give me some hints how to deal with 
 this. If I do not used index, it take too long for
 the query.
 

 PGA explain select distinct probeset_id, chip,
 gene_symbol, title, sequence_description, pfam from
 affy_array_annotation where lower(gene_symbol) like
 upper('%mif%');
QUERY PLAN
 -
  Unique  (cost=29576.44..29591.44 rows=86 width=265)
-  Sort  (cost=29576.44..29578.59 rows=857
 width=265)
  Sort Key: probeset_id, chip, gene_symbol,
 title, sequence_description, pfam
  -  Seq Scan on affy_array_annotation 
 (cost=0.00..29534.70 rows=857 width=265)
Filter: (lower((gene_symbol)::text)
 ~~ 'MIF%'::text)
 (5 rows)
 
 
 PGA= explain select distinct probeset_id, chip,
 gene_symbol, title, sequence_description, pfam from
 affy_array_annotation where lower(gene_symbol) =
 upper('%mif%');

  QUERY PLAN
 -
  Unique  (cost=3433.44..3448.44 rows=86 width=265)
-  Sort  (cost=3433.44..3435.58 rows=857 width=265)
  Sort Key: probeset_id, chip, gene_symbol,
 title, sequence_description, pfam
  -  Index Scan using gene_symbol_idx_fun1
 on affy_array_annotation  (cost=0.00..3391.70
 rows=857 width=265)
Index Cond:
 (lower((gene_symbol)::text) = '%MIF%'::text)
 (5 rows)
 
 
 
 
 
 Regards,
 William
 
 
 ---(end of broadcast)---
 TIP 8: explain analyze is your friend

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] why index scan not working when using 'like'?

2003-11-25 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 In regular text fields containing words, your problem is solvable with full 
 text indexing (FTI).   Unfortunately, FTI is not designed for arbitrary 
 non-language strings.  It could be adapted, but would require a lot of 
 hacking.

I'm not sure why you say that FTI isn't a usable solution.  As long as
the gene symbols are separated by whitespace or some other non-letters
(eg, foo mif bar not foomifbar), I'd think FTI would work.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])