On 5/23/07, Andy <[EMAIL PROTECTED]> wrote:
An example would be:
SELECT * FROM table
                             WHERE name like '%john%' or street like '%srt%'

Anyway, the query planner always does seq scan on the whole table and that
takes some time. How can this be optimized or made in another way to be
faster?

There's no algorithm in existence that can "index" arbitrary
substrings the way you think. The only rational way to accomplish this
is to first break the text into substrings using some algorithm (eg.,
words delimited by whitespace and punctuation), and index the
substrings individually.

You can do this using vanilla PostgreSQL, and you can use Tsearch2
and/or its GIN indexes to help speed up the searches. The simplest
solution would be to put all the substrings in a table, one row per
substring, along with an attribute referencing the source table/row.

At this point you have effectively reduced your search space: you can
use a query to isolate the words in your "dictionary" that contain the
substrings. So a query might be:

 select ... from persons where id in (
   select person_id from person_words
   where word like '%john%';
 )

The "like" search, even though it will use a sequential scan, is bound
to be much faster on these small words than searching for substrings
through large gobs of text in the persons table.

Note that PostgreSQL *can* exploit the index for *prefix* matching, if
you tell it to use the right operator class:

 create index persons_name_index on persons (name text_pattern_ops);

or, if you're using varchars (though there is rarely any reason to):

 create index persons_name_index on persons (name varchar_pattern_ops);

(These two may be identical internally. Check the docs.)

Now you can do:

 select ... from persons where name like 'john%';

which will yield a query plan such as this:

Index Scan using persons_name_index on persons  (cost=0.00..8.27
rows=1 width=29) (actual time=0.184..0.373 rows=51 loops=1)
  Index Cond: ((name ~>=~ 'john'::text) AND (name ~<~ 'joho'::text))
  Filter: (title ~~ 'john%'::text)

Alexander.

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Reply via email to