Arjen van der Meijden <[EMAIL PROTECTED]> writes:
I noticed on a forum a query that took a surprisingly long time in MySQL. Of course I wanted to prove that PostgreSQL 8.0.1 could do much better. To my surprise, PostgreSQL was ten times slower on the same machine! And I don't understand why.
Wrong index ... what you probably could use here is an index on data_main.range, so that the query could run with postcodes as the outer side. I get such a plan by default with empty tables:
 Aggregate  (cost=99177.80..99177.80 rows=1 width=0)
   ->  Nested Loop  (cost=0.00..98021.80 rows=462400 width=0)
         ->  Seq Scan on postcodes p  (cost=0.00..30.40 rows=2040 width=4)
         ->  Index Scan using rangei on data_main dm  (cost=0.00..44.63 rows=227 width=2)
               Index Cond: ((dm.range >= "outer".range_from) AND (dm.range <= "outer".range_till))
but I'm not sure if the planner would prefer it with the tables loaded up. (It might not be the right thing anyway ... but seems worth trying.)
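[For reference, a minimal sketch of the suggested setup. The table, column, and index names are taken from the plan above; the query itself (a count over the range join) is an assumption, since the original query is not quoted in this part of the thread:]

    CREATE INDEX rangei ON data_main (range);
    ANALYZE data_main;

    -- Assumed shape of the query under discussion: count the rows of
    -- data_main whose range falls inside some postcodes range.
    EXPLAIN ANALYZE
    SELECT count(*)
      FROM postcodes p
      JOIN data_main dm
        ON dm.range BETWEEN p.range_from AND p.range_till;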
No, it didn't prefer it.
Given the relatively small size of the postcodes table, and the fact that each data_main row seems to join to about one postcodes row, it's possible that what the planner did for you was actually the optimal thing anyhow. I'm not sure that any range-capable index would be faster than just scanning through 160 entries in memory ...
regards, tom lane
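[One standard way to check whether the index plan would actually win, not something suggested in the thread itself, is to disable sequential scans for the session and compare the EXPLAIN ANALYZE output of the two plans. It's a blunt instrument, since it penalizes the outer seqscan as well, but it usually suffices to flip the inner side to the index scan:]

    SET enable_seqscan = off;  -- session-local planner cost override
    EXPLAIN ANALYZE
    SELECT count(*)
      FROM postcodes p
      JOIN data_main dm
        ON dm.range BETWEEN p.range_from AND p.range_till;
    RESET enable_seqscan;      -- restore the default afterwards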
Yep, there is only one, or in corner cases two, postcode ranges per postcode. Actually it should be just one, but my generated data is not perfect.
But the sequential scan per record is not really what surprises me; since the postcodes table is only two pages of data, I didn't really expect otherwise.

What surprises me is that it takes 22 seconds, especially since the two other examples on the same data, which touch about the same number of records per table, take only 1.4 and 0.18 seconds.