Richard van den Berg wrote:
We have a table with 1M rows that contains sessions with start and finish timestamps. When joining this table with a 10k-row table of rounded timestamps, EXPLAIN shows that sequential scans are used, and the join takes about 6 hours (2s per seq scan on the session table * 10000):
 Nested Loop  (cost=252.80..233025873.16 rows=1035480320 width=97)
   Join Filter: (("outer".starttime <= "inner".ts) AND ("outer".finishtime >= "inner".ts))
   ->  Seq Scan on sessions us  (cost=0.00..42548.36 rows=924536 width=105)
   ->  Materialize  (cost=252.80..353.60 rows=10080 width=8)
         ->  Seq Scan on duration du  (cost=0.00..252.80 rows=10080 width=8)
However, during the initial loading of the data (we first load into text tables, then convert to tables using timestamps etc, then run this query) the same query took only 12 minutes. While debugging, I increased cpu_tuple_cost to 0.1 (from 0.01). Now the explain shows an index scan, and the run time comes down to 11 minutes:
 Nested Loop  (cost=0.00..667700310.42 rows=1035480320 width=97)
   ->  Seq Scan on sessions us  (cost=0.00..125756.60 rows=924536 width=105)
   ->  Index Scan using ix_du_ts on duration du  (cost=0.00..604.46 rows=1120 width=8)
         Index Cond: (("outer".starttime <= du.ts) AND ("outer".finishtime >= du.ts))
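For reference, this is a sketch of what I ran; the cost setting is changed only for the current session, so other queries are unaffected (the exact column list in my real query is longer, and the join condition is reconstructed from the plans above):

  -- per-session override; the default cpu_tuple_cost is 0.01
  SET cpu_tuple_cost = 0.1;

  EXPLAIN
  SELECT us.*, du.ts
  FROM sessions us, duration du
  WHERE du.ts BETWEEN us.starttime AND us.finishtime;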
I am glad that I found a way to force the use of the index, but I still can't explain why the planner made the right choice during the initial run, yet now needs a hand. Could this have to do with the statistics of the tables? I make very sure (during the initial load and while testing) to VACUUM ANALYZE all tables after I fill them.
I'm running postgres 7.4.7.
Any help is appreciated.
I believe the problem is that postgres doesn't recognize how restrictive a date range is unless it compares against constants. So saying:
  select blah from du WHERE time between '2004-10-10' and '2004-10-15';

will properly use the index, because the planner realizes it only returns a few rows. However, with

  select blah from du, us where du.ts between us.starttime and us.finishtime;

the planner doesn't know how selective that BETWEEN is.
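You can see the difference directly by comparing the row estimates in the two plans (the table and column names here follow the examples above; the constant dates are just illustrative):

  -- Constant bounds: the planner consults pg_stats for ts and
  -- produces a tight row estimate, so the index looks cheap.
  EXPLAIN SELECT * FROM du WHERE ts BETWEEN '2004-10-10' AND '2004-10-15';

  -- Bounds from another table: no constants to compare against the
  -- statistics, so the planner falls back on default selectivity
  -- guesses and tends to overestimate the rows per outer tuple.
  EXPLAIN SELECT * FROM du, us WHERE du.ts BETWEEN us.starttime AND us.finishtime;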
This has been discussed as a future improvement to the planner (in 8.*). I don't know the current status.
Also, in the future, you really should post your table schema and EXPLAIN ANALYZE output instead of just EXPLAIN. (I realize that with a 6-hour query that is a little painful.)
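EXPLAIN ANALYZE actually executes the query and reports actual row counts and timings next to the estimates, which is what makes the misestimate visible; something along these lines (query reconstructed from the plans above):

  EXPLAIN ANALYZE
  SELECT us.*, du.ts
  FROM sessions us, duration du
  WHERE du.ts BETWEEN us.starttime AND us.finishtime;

Comparing the "rows=" estimate against the "actual ... rows=" figure on each node shows exactly where the planner goes wrong.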
Notice that in the above plans, the expected number of rows from the inner table drops from 10k down to about 1k (which is probably where the planner decides to switch). And if you ran EXPLAIN ANALYZE, the actual number of rows would probably be much lower still.
Probably you should try to find out the status of multi-table selectivity. It was discussed in the last couple of months.