Ok, here's the explain analyze result. Again, this is Postgres 8.3.3 and
I vacuumed-analyzed both tables directly after they were created.
# explain analyze select fls.function_verified, fls.score,
fls.go_category_group_ref, fs1.gene_ref, fs1.function_verified_exactly,
fs2.gene_ref, fs2.function_verified_exactly from
functional_linkage_scores fls, gene_prediction_view fs1,
gene_prediction_view fs2 where fls.gene_ref1 = fs1.gene_ref and
fls.gene_ref2 = fs2.gene_ref and fs1.go_term_ref = 2 and fs2.go_term_ref
= 2;
QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=1399203593.41..6702491234.74 rows=352770803726
width=22) (actual time=6370194.467..22991303.434 rows=15610535128 loops=1)
Merge Cond: (fs2.gene_ref = fls.gene_ref2)
-> Index Scan using gene_prediction_view_gene_ref on
gene_prediction_view fs2 (cost=0.00..12111899.77 rows=197899 width=5)
(actual time=29.592..469838.583 rows=180629 loops=1)
Index Cond: (go_term_ref = 2)
-> Materialize (cost=1399069432.20..1483728633.52 rows=6772736105
width=21) (actual time=6370164.864..16623552.417 rows=15610535121 loops=1)
-> Sort (cost=1399069432.20..1416001272.47 rows=6772736105
width=21) (actual time=6370164.860..13081970.248 rows=1897946790 loops=1)
Sort Key: fls.gene_ref2
Sort Method: external merge Disk: 61192240kB
-> Merge Join (cost=40681244.97..154286110.62
rows=6772736105 width=21) (actual time=592112.778..2043161.851
rows=1897946790 loops=1)
Merge Cond: (fs1.gene_ref = fls.gene_ref1)
-> Index Scan using gene_prediction_view_gene_ref
on gene_prediction_view fs1 (cost=0.00..12111899.77 rows=197899
width=5) (actual time=0.015..246613.129 rows=180644 loops=1)
Index Cond: (go_term_ref = 2)
-> Materialize (cost=40586010.10..43490582.70
rows=232365808 width=20) (actual time=592112.755..1121366.375
rows=1897946783 loops=1)
-> Sort (cost=40586010.10..41166924.62
rows=232365808 width=20) (actual time=592112.721..870349.308
rows=232241678 loops=1)
Sort Key: fls.gene_ref1
Sort Method: external merge Disk:
7260856kB
-> Seq Scan on
functional_linkage_scores fls (cost=0.00..3928457.08 rows=232365808
width=20) (actual time=14.221..86455.902 rows=232241678 loops=1)
Total runtime: 24183346.271 ms
(18 rows)
Jeremy Harris wrote:
John Beaver wrote:
I'm having a strange problem with a query. The query is fairly
simple, with a few constants and two joins. All relevant columns
should be indexed, and I'm pretty sure there aren't any type
conversion issues. But the query plan includes a fairly heavy seq
scan. The only possible complication is that the tables involved are
fairly large - hundreds of millions of rows each.
Can anyone explain this? There should only ever be a maximum of about
50 rows returned when the query is executed.
You didn't say when you last vacuumed?
If there should only be 50 rows returned then the estimates from the
planner are way out.
If that doesn't help, we'll need version info, and (if you can afford
the time) an "explain analyze"
Cheers,
Jeremy
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance