Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-30 Thread Moritz Onken
Anfang der weitergeleiteten E-Mail: Von: Moritz Onken [EMAIL PROTECTED] Datum: 30. Juni 2008 09:16:06 MESZ An: Steinar H. Gunderson [EMAIL PROTECTED] Betreff: Re: [PERFORM] Planner should use index on a LIKE 'foo%' query Am 28.06.2008 um 21:19 schrieb Steinar H. Gunderson: On Sat, Jun 28

Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-30 Thread Moritz Onken
Am 28.06.2008 um 21:19 schrieb Steinar H. Gunderson: On Sat, Jun 28, 2008 at 06:24:42PM +0200, Moritz Onken wrote: SELECT distinct url from item where url like 'http://www.micro%' limit 10; Here, the planner knows the pattern beforehand, and can see that it's a simple prefix. select *

Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-30 Thread Matthew Wakeling
On Mon, 30 Jun 2008, Moritz Onken wrote: I created a new column in item where I store the shortened url which makes = comparisons possible. Good idea. Now create an index on that column. select count(1) from result where url in (select shorturl from item where shorturl = result.url); What

Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-30 Thread Dimitri Fontaine
Hi, Le samedi 28 juin 2008, Moritz Onken a écrit : select count(*) from result where exists (select * from item where item.url LIKE result.url || '%' limit 1); which basically returns the number of items which exist in table result and match a URL in table item by its prefix. It

Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-30 Thread Moritz Onken
Am 30.06.2008 um 12:19 schrieb Matthew Wakeling: select count(1) from result where url in (select shorturl from item where shorturl = result.url); What on earth is wrong with writing it like this? SELECT COUNT(*) FROM (SELECT DISTINCT result.url FROM result, item WHERE item.shorturl

Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-30 Thread Matthew Wakeling
On Mon, 30 Jun 2008, Moritz Onken wrote: SELECT COUNT(*) FROM (SELECT DISTINCT result.url FROM result, item WHERE item.shorturl = result.url) AS a I tried the this approach but it's slower than WHERE IN in my case. However there's a lot more scope for improving a query along these lines,

Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-30 Thread Moritz Onken
Am 30.06.2008 um 16:59 schrieb Steinar H. Gunderson: On Mon, Jun 30, 2008 at 09:16:06AM +0200, Moritz Onken wrote: the result table has 20.000.000 records and the item table 5.000.000. The query select count(1) from result where url in (select shorturl from item where shorturl = result.url);

Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-30 Thread Matthew Wakeling
On Mon, 30 Jun 2008, Moritz Onken wrote: select count(1) from result where url in (select shorturl from item where shorturl = result.url); I really don't see what your query tries to accomplish. Why would you want url IN (... where .. = url)? Wouldn't you want a different qualifier somehow?

Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-30 Thread Moritz Onken
The thing here is that you are effectively causing Postgres to run a sub-select for each row of the result table, each time generating either an empty list or a list with one or more identical URLs. This is effectively forcing a nested loop. In a way, you have two constraints where you

[PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-28 Thread Moritz Onken
Hi, I have a query select count(*) from result where exists (select * from item where item.url LIKE result.url || '%' limit 1); which basically returns the number of items which exist in table result and match a URL in table item by its prefix. I read all about idexes

Re: [PERFORM] Planner should use index on a LIKE 'foo%' query

2008-06-28 Thread Steinar H. Gunderson
On Sat, Jun 28, 2008 at 06:24:42PM +0200, Moritz Onken wrote: SELECT distinct url from item where url like 'http://www.micro%' limit 10; Here, the planner knows the pattern beforehand, and can see that it's a simple prefix. select * from result where exists (select * from item where