The following bug has been logged on the website:

Bug reference:      6668
Logged by:          Postgres User
Email address:      postgresu...@yahoo.com
PostgreSQL version: 9.1.3
Operating system:   Ubuntu
Description:
work_mem = 1MB

create table small(i) as select (g/1000) * 1000 from generate_series(1,10000) g;
create table large(i) as select generate_series(1,100000000);
vacuum;
vacuum;
vacuum analyze;

explain analyze select * from small inner join large using (i);

                                                         QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=3083103.00..3475328.00 rows=10000 width=4) (actual time=84079.899..84989.419 rows=9001 loops=1)
   Hash Cond: (small.i = large.i)
   ->  Seq Scan on small  (cost=0.00..145.00 rows=10000 width=4) (actual time=0.008..0.588 rows=10000 loops=1)
   ->  Hash  (cost=1442478.00..1442478.00 rows=100000000 width=4) (actual time=84079.741..84079.741 rows=100000000 loops=1)
         Buckets: 4096  Batches: 4096  Memory Usage: 853kB
         ->  Seq Scan on large  (cost=0.00..1442478.00 rows=100000000 width=4) (actual time=0.005..59011.443 rows=100000000 loops=1)
 Total runtime: 84990.270 ms
(7 rows)

It doesn't matter how big the big table is... for this distribution, the
large table is always the one that gets hashed.

Forcing (in gdb) the cost computed by one of the cost_hashjoin calls to 0,
the planner chooses to hash the smaller table instead, with much better
results:

explain analyze select * from small inner join large using (i);

                                                    QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=270.00..0.00 rows=10000 width=4) (actual time=14028.034..16510.598 rows=9001 loops=1)
   Hash Cond: (large.i = small.i)
   ->  Seq Scan on large  (cost=0.00..1442478.00 rows=100000000 width=4) (actual time=0.010..5893.344 rows=100000000 loops=1)
   ->  Hash  (cost=145.00..145.00 rows=10000 width=4) (actual time=3.854..3.854 rows=10000 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 352kB
         ->  Seq Scan on small  (cost=0.00..145.00 rows=10000 width=4) (actual time=0.005..1.585 rows=10000 loops=1)
 Total runtime: 16510.962 ms
(7 rows)

Poking around further in gdb, essentially all of the cost seems to come
from this term in cost_hashjoin:

    run_cost += hash_qual_cost.per_tuple * outer_path_rows *
        clamp_row_est(inner_path_rows * innerbucketsize) * 0.5;

Here (outer_path_rows * clamp_row_est(inner_path_rows * innerbucketsize) *
0.5) works out to 50 billion, leading to a wild cost. The parent join's row
count is rightly estimated at 10000, so 50 billion comparisons is obviously
a bad estimate.
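A rough back-of-the-envelope check of where that 50 billion plausibly comes
from, assuming the term is being costed for the rejected path that hashes
small, and that innerbucketsize comes out near 0.1 (small.i has only ~11
distinct values, so roughly a tenth of the inner rows land in the bucket an
outer probe hits):

    -- small.i is heavily skewed: ~11 distinct values, ~1000 rows apiece
    select i, count(*) from small group by i order by i;

    -- Plugging those assumed numbers into the term above for the
    -- rejected "hash small" path:
    --   outer_path_rows * clamp_row_est(inner_path_rows * innerbucketsize) * 0.5
    -- = 100000000      * clamp_row_est(10000           * 0.1)             * 0.5
    -- = 100000000 * 1000 * 0.5 = 50 billion hash-qual evaluations
    select 100000000::bigint * 1000 * 0.5 as assumed_comparisons;

In reality only ~9001 of the 100 million outer rows find any matching bucket
entry at all (actual rows=9001 above), which would explain why actually
hashing the small table runs in ~16.5 s despite the enormous estimate.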