Hi ,
Recently I came across a case where the planner choose larger table on
hash side. I am not sure whether it is an intended behavior or we are missing
something.
I have two tables (a and b) each with single column in it. One table 'a'
is large with around 30 million distinct rows and other table 'b' has merely
70,000 rows with one-seventh (10,000) distinct rows. I have analyzed both the
table. But while joining both the table I get the larger table on hash side.
tpch=# explain select b from b left join a on a = b;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Hash Left Join (cost=824863.75..950104.42 rows=78264 width=4)
Hash Cond: (b.b = a.a)o
-> Foreign Scan on b (cost=0.00..821.64 rows=78264 width=4)
CStore File: /home/likewise-open/pg96/data/cstore_fdw/1818708/1849879
CStore File Size: 314587
-> Hash (cost=321721.22..321721.22 rows=30667722 width=4)
-> Foreign Scan on a (cost=0.00..321721.22 rows=30667722 width=4)
CStore File:
/home/likewise-open/pg96/data/cstore_fdw/1818708/1849876
CStore File Size: 123236206
(9 rows)
I would like to know the reason for choosing this plan and Is there a easy fix
to prevent such plans (especially like this one where it choose a larger hash
table) ?
Thanks,
Pradeep