The following bug has been logged on the website:

Bug reference:      6668
Logged by:          Postgres User
Email address:      postgresu...@yahoo.com
PostgreSQL version: 9.1.3
Operating system:   Ubuntu
Description:        


set work_mem = '1MB';

create table small(i) as select (g/1000) * 1000 from
generate_series(1,10000) g;

create table large(i) as select generate_series(1,100000000);

vacuum;
vacuum;
vacuum analyze;

explain analyze select * from small inner join large using (i);

                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=3083103.00..3475328.00 rows=10000 width=4) (actual time=84079.899..84989.419 rows=9001 loops=1)
   Hash Cond: (small.i = large.i)
   ->  Seq Scan on small  (cost=0.00..145.00 rows=10000 width=4) (actual time=0.008..0.588 rows=10000 loops=1)
   ->  Hash  (cost=1442478.00..1442478.00 rows=100000000 width=4) (actual time=84079.741..84079.741 rows=100000000 loops=1)
         Buckets: 4096  Batches: 4096  Memory Usage: 853kB
         ->  Seq Scan on large  (cost=0.00..1442478.00 rows=100000000 width=4) (actual time=0.005..59011.443 rows=100000000 loops=1)
 Total runtime: 84990.270 ms
(7 rows)


It doesn't matter how big the large table is: for this data distribution, the
large table is always the one that gets hashed.

Forcing the cost computed by one of the cost_hashjoin() calls to 0 with gdb,
the planner chooses to hash the smaller table instead, with much better results
(a rough sketch of the gdb session follows the plan):

explain analyze select * from small inner join large using (i);
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=270.00..0.00 rows=10000 width=4) (actual time=14028.034..16510.598 rows=9001 loops=1)
   Hash Cond: (large.i = small.i)
   ->  Seq Scan on large  (cost=0.00..1442478.00 rows=100000000 width=4) (actual time=0.010..5893.344 rows=100000000 loops=1)
   ->  Hash  (cost=145.00..145.00 rows=10000 width=4) (actual time=3.854..3.854 rows=10000 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 352kB
         ->  Seq Scan on small  (cost=0.00..145.00 rows=10000 width=4) (actual time=0.005..1.585 rows=10000 loops=1)
 Total runtime: 16510.962 ms
(7 rows)
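
Something along these lines reproduces the forcing (a rough sketch against the
9.1 sources, where cost_hashjoin() takes the HashPath as "path" and fills in
path->jpath.path.total_cost just before returning; not the exact session used):

    $ gdb -p <backend pid>
    (gdb) break cost_hashjoin
    (gdb) continue
        ... run the EXPLAIN in the attached backend; at the breakpoint hit
        whose inner relation is "small":
    (gdb) set $p = path
    (gdb) finish
    (gdb) set variable $p->jpath.path.total_cost = 0
    (gdb) continue

Zeroing total_cost right after cost_hashjoin() returns makes that path look
free to the planner, which then picks it.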

Digging further in gdb, essentially all of the cost seems to come from this
term in cost_hashjoin():

        run_cost += hash_qual_cost.per_tuple * outer_path_rows *
            clamp_row_est(inner_path_rows * innerbucketsize) * 0.5;

The expression (outer_path_rows * clamp_row_est(inner_path_rows *
innerbucketsize) * 0.5) comes out to 50 billion, leading to a wildly inflated
cost. The parent join's row count is rightly estimated at 10000, so 50 billion
comparisons is obviously a bad estimate.
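
For what it's worth, the numbers line up if that term comes from costing the
path that hashes "small": outer_path_rows = 100000000 (large, the probe side),
inner_path_rows = 10000 (small, the hashed side), and innerbucketsize around
0.1 (small has only ~11 distinct values, most accounting for ~10% of the
table). A minimal sketch of the arithmetic, with innerbucketsize = 0.1 assumed
rather than read out of the planner:

    #include <stdio.h>
    #include <math.h>

    /* Rough re-creation of the offending term in cost_hashjoin(), using the
     * row counts from the plans above.  innerbucketsize = 0.1 is an
     * assumption, not a value read out of the planner. */
    static double
    clamp_row_est(double nrows)
    {
        nrows = rint(nrows);
        return (nrows < 1.0) ? 1.0 : nrows;
    }

    int
    main(void)
    {
        double outer_path_rows = 100000000.0;   /* large, probe side */
        double inner_path_rows = 10000.0;       /* small, hashed side */
        double innerbucketsize = 0.1;           /* assumed, see above */

        double comparisons = outer_path_rows *
            clamp_row_est(inner_path_rows * innerbucketsize) * 0.5;

        printf("estimated hash-qual comparisons: %.0f\n", comparisons);
        /* prints 50000000000 -- versus ~10000 rows actually produced */
        return 0;
    }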

