Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

Matthew Wakeling Mon, 30 Jun 2008 09:23:04 -0700

On Mon, 30 Jun 2008, Moritz Onken wrote:

select count(1) from result where url in (select shorturl from item
where shorturl = result.url);


I really don't see what your query tries to accomplish. Why would you want
"url IN (... where .. = url)"? Wouldn't you want a different qualifier
somehow?


well, it counts the number of rows with urls which already exist in another
table.
How would you describe the query?
If the "(select shorturl from item where shorturl = result.url)"
clause is empty the row is not counted, that's what I want...

The thing here is that you are effectively causing Postgres to run a sub-select for each row of the "result" table, each time generating either an empty list or a list with one or more identical URLs. This is effectively forcing a nested loop. In a way, you have two constraints where you only need one.

You can safely take out the constraint in the subquery, so it is like this:


SELECT COUNT(*) FROM result WHERE url IN (SELECT shorturl FROM item);

This will generate equivalent results, because those rows that didn't match the constraint wouldn't have affected the IN anyway. However, it will alter the performance, because the subquery will contain more results, but it will only be run once, rather than multiple times. This is effectively forcing a hash join (kind of).

Whereas if you rewrite the query as I demonstrated earlier, then you allow Postgres to make its own choice about which join algorithm will work best.
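
For anyone who wants to convince themselves that the two IN forms above really return the same count, here is a minimal sketch. It is an illustration only: it uses Python's built-in sqlite3 module with an in-memory database rather than Postgres, and the sample rows are made up. Only the table and column names (result.url, item.shorturl) come from the thread. It checks the results, not the plans, so it says nothing about which join algorithm Postgres would choose.

```python
import sqlite3

# In-memory SQLite database standing in for the real Postgres tables.
# The rows below are hypothetical sample data, not from the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (shorturl TEXT);
    CREATE TABLE result (url TEXT);
    INSERT INTO item VALUES ('a'), ('b'), ('b'), ('c');
    INSERT INTO result VALUES ('a'), ('b'), ('x'), ('a');
""")

# Original form: the subquery is correlated with the outer row,
# so it is conceptually evaluated once per row of "result".
correlated = conn.execute("""
    SELECT COUNT(1) FROM result
    WHERE url IN (SELECT shorturl FROM item WHERE shorturl = result.url)
""").fetchone()[0]

# Simplified form: the subquery no longer refers to the outer row,
# so it can be evaluated once and reused for every row.
uncorrelated = conn.execute("""
    SELECT COUNT(*) FROM result
    WHERE url IN (SELECT shorturl FROM item)
""").fetchone()[0]

print(correlated, uncorrelated)
```

On Postgres itself, running EXPLAIN on each form is the way to see which plan the planner actually picks.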


Matthew

--
Anyone who goes to a psychiatrist ought to have his head examined.

--
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
