Hi all; I have a simple query against two very large tables ( > 800million rows in theurl_hits_category_jt table and 9.2 million in the url_hits_klk1 table )
I have indexes on the join columns and I've run an explain. also I've set the default statistics to 250 for both join columns. I get a very high overall query cost: explain select category_id, url_hits_id from url_hits_klk1 a , pwreport.url_hits_category_jt b where a.id = b.url_hits_id ; QUERY PLAN -------------------------------------------------------------------------------------------- Hash Join (cost=296959.90..126526916.55 rows=441764338 width=8) Hash Cond: (b.url_hits_id = a.id) -> Seq Scan on url_hits_category_jt b (cost=0.00..62365120.22 rows=4323432222 width=8) -> Hash (cost=179805.51..179805.51 rows=9372351 width=4) -> Seq Scan on url_hits_klk1 a (cost=0.00..179805.51 rows=9372351 width=4) (5 rows) If I turn off sequential scans I still get an even higher query cost: set enable_seqscan = off; SET explain select category_id, url_hits_id from url_hits_klk1 a , pwreport.url_hits_category_jt b where a.id = b.url_hits_id ; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=127548504.83..133214707.19 rows=441791932 width=8) Merge Cond: (a.id = b.url_hits_id) -> Index Scan using klk1 on url_hits_klk1 a (cost=0.00..303773.29 rows=9372351 width=4) -> Index Scan using mt_url_hits_category_jt_url_hits_id_index on url_hits_category_jt b (cost=0.00..125058243.39 rows=4323702284 width=8) (4 rows) Thoughts? Thanks in advance