Re: [PERFORM] sequential scan performance

2005-05-30 Thread Michael Engelhart
Thanks, everyone, for all the suggestions. I'll check into those
contrib modules.


Michael
On May 29, 2005, at 2:44 PM, Oleg Bartunov wrote:


Michael,

I'd recommend our contrib/pg_trgm module, which provides
trigram-based fuzzy search and returns results ordered by similarity
to your query.  Read
http://www.sai.msu.su/~megera/postgres/gist/pg_trgm/README.pg_trgm
for more details.
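
A minimal sketch of the kind of query pg_trgm enables (assuming the contrib
module is installed; the index and column names below are illustrative, not
from the original mail):

  -- gist_trgm_ops, similarity() and the % operator come from contrib/pg_trgm
  CREATE INDEX city_alias_trgm_idx ON city_alias
      USING gist (city_name gist_trgm_ops);

  -- fuzzy search, ordered by similarity to the search term
  SELECT city_name, similarity(city_name, 'FRANCISCO') AS sml
    FROM city_alias
   WHERE city_name % 'FRANCISCO'
   ORDER BY sml DESC;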

Oleg
On Sun, 29 May 2005, Michael Engelhart wrote:



> Hi -
>
> I have a table of about 3 million rows of city aliases that I need
> to query using LIKE - for example:
>
> select * from city_alias where city_name like '%FRANCISCO'
>
> When I do an EXPLAIN ANALYZE on the above query, the result is:
>
>  Seq Scan on city_alias  (cost=0.00..59282.31 rows=2 width=42) (actual time=73.369..3330.281 rows=407 loops=1)
>    Filter: ((name)::text ~~ '%FRANCISCO'::text)
>  Total runtime: 3330.524 ms
> (3 rows)
>
> This is a query that our system needs to do a LOT. Is there any way
> to improve the performance on this, either with changes to our query
> or by configuring the database deployment? We have an index on
> city_name, but when the % operator is at the front of the pattern
> PostgreSQL can't use the index.
>
> Thanks for any help.
>
> Mike






Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83







Re: [PERFORM] sequential scan performance

2005-05-30 Thread Steve Atkins
On Sun, May 29, 2005 at 08:27:26AM -0500, Michael Engelhart wrote:
> Hi -
>
> I have a table of about 3 million rows of city aliases that I need
> to query using LIKE - for example:
>
> select * from city_alias where city_name like '%FRANCISCO'
>
> When I do an EXPLAIN ANALYZE on the above query, the result is:
>
>  Seq Scan on city_alias  (cost=0.00..59282.31 rows=2 width=42) (actual time=73.369..3330.281 rows=407 loops=1)
>    Filter: ((name)::text ~~ '%FRANCISCO'::text)
>  Total runtime: 3330.524 ms
> (3 rows)
>
> This is a query that our system needs to do a LOT. Is there any way
> to improve the performance on this, either with changes to our query
> or by configuring the database deployment? We have an index on
> city_name, but when the % operator is at the front of the pattern
> PostgreSQL can't use the index.

If that's really what you're doing (the wildcard is always at the beginning)
then something like this

  create index city_name_idx on city_alias (reverse(city_name));

  select * from city_alias where reverse(city_name) like reverse('%FRANCISCO');

should do just what you need.

I use this, with a plpgsql implementation of reverse, and it works nicely.

CREATE OR REPLACE FUNCTION reverse(text) RETURNS text AS '
DECLARE
   original ALIAS FOR $1;
   reverse_str text;
   i int4;
BEGIN
   -- start with an empty string ('''' is an empty literal inside the
   -- single-quoted function body)
   reverse_str := '''';
   -- walk the input from its last character back to its first
   FOR i IN REVERSE LENGTH(original)..1 LOOP
      reverse_str := reverse_str || substr(original, i, 1);
   END LOOP;
   RETURN reverse_str;
END;'
LANGUAGE 'plpgsql' IMMUTABLE;


Someone will no doubt suggest using tsearch2, and you might want to
take a look at it if you actually need full-text search, but my
experience has been that it's too slow to be useful in production, and
it's not needed for the simple leading wildcard case.

Cheers,
  Steve



[PERFORM] sequential scan performance

2005-05-29 Thread Michael Engelhart

Hi -

I have a table of about 3 million rows of city aliases that I need  
to query using LIKE - for example:


select * from city_alias where city_name like '%FRANCISCO'


When I do an EXPLAIN ANALYZE on the above query, the result is:

 Seq Scan on city_alias  (cost=0.00..59282.31 rows=2 width=42) (actual time=73.369..3330.281 rows=407 loops=1)
   Filter: ((name)::text ~~ '%FRANCISCO'::text)
 Total runtime: 3330.524 ms
(3 rows)


This is a query that our system needs to do a LOT. Is there any way
to improve the performance on this, either with changes to our query
or by configuring the database deployment? We have an index on
city_name, but when the % operator is at the front of the pattern
PostgreSQL can't use the index.
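
To illustrate: with a plain btree index, only left-anchored patterns can use
the index at all (and, outside the C locale, typically only with a
pattern_ops operator class). A minimal sketch, with an assumed index name
and opclass:

  CREATE INDEX city_alias_name_idx ON city_alias (city_name text_pattern_ops);

  -- can use the index: the pattern is anchored at the start
  SELECT * FROM city_alias WHERE city_name LIKE 'SAN FRANCISCO%';

  -- cannot use the index: the leading % forces a sequential scan
  SELECT * FROM city_alias WHERE city_name LIKE '%FRANCISCO';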


Thanks for any help.

Mike



Re: [PERFORM] sequential scan performance

2005-05-29 Thread Steinar H. Gunderson
On Sun, May 29, 2005 at 08:27:26AM -0500, Michael Engelhart wrote:
> This is a query that our system needs to do a LOT. Is there any way
> to improve the performance on this, either with changes to our query
> or by configuring the database deployment? We have an index on
> city_name, but when the % operator is at the front of the pattern
> PostgreSQL can't use the index.

Try tsearch2 from contrib; it might help you.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] sequential scan performance

2005-05-29 Thread Christopher Kings-Lynne

> When I do an EXPLAIN ANALYZE on the above query, the result is:
>
>  Seq Scan on city_alias  (cost=0.00..59282.31 rows=2 width=42) (actual time=73.369..3330.281 rows=407 loops=1)
>    Filter: ((name)::text ~~ '%FRANCISCO'::text)
>  Total runtime: 3330.524 ms
> (3 rows)
>
> This is a query that our system needs to do a LOT. Is there any way
> to improve the performance on this, either with changes to our query
> or by configuring the database deployment? We have an index on
> city_name, but when the % operator is at the front of the pattern
> PostgreSQL can't use the index.


Of course not. There really is no way to make your literal query above
fast. You could try making a functional index on the reverse() of the
string and querying for the reverse() of 'francisco'.


Or, if you want a general full-text index, you should absolutely be
using contrib/tsearch2.
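
A minimal sketch of the tsearch2 route (assuming contrib/tsearch2 is
installed; the column and index names are illustrative, not from the
original mail):

  -- add an indexed tsvector column; in practice you would also add a
  -- trigger to keep it up to date on INSERT/UPDATE
  ALTER TABLE city_alias ADD COLUMN city_name_fti tsvector;
  UPDATE city_alias SET city_name_fti = to_tsvector('default', city_name);
  CREATE INDEX city_alias_fti_idx ON city_alias USING gist (city_name_fti);

  -- word-level full-text match (not a substring match)
  SELECT * FROM city_alias
   WHERE city_name_fti @@ to_tsquery('default', 'francisco');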


Chris
