On Mon, Mar 12, 2018 at 10:02 PM, Narendra Pradeep U U
<narendra.prad...@zohocorp.com> wrote:
> Hi ,
>       Recently I came across a case where the planner choose larger table on
> hash side. I am not sure whether it is an intended  behavior or we are
> missing something.
>      I have two tables (a and b) each with  single column in it. One table
> 'a' is large with around 30 million distinct rows and other table 'b' has
> merely 70,000 rows with one-seventh (10,000) distinct rows. I have analyzed
> both the table.  But while joining both the table I get the larger table on
> hash side.
> tpch=# explain select b from b left join a on a = b;
>                                                QUERY PLAN
> ---------------------------------------------------------------------------------------------------------
>  Hash Left Join  (cost=824863.75..950104.42 rows=78264 width=4)
>    Hash Cond: (b.b = a.a)o
>    ->  Foreign Scan on b  (cost=0.00..821.64 rows=78264 width=4)
>          CStore File:
> /home/likewise-open/pg96/data/cstore_fdw/1818708/1849879
>          CStore File Size: 314587
>    ->  Hash  (cost=321721.22..321721.22 rows=30667722 width=4)
>          ->  Foreign Scan on a  (cost=0.00..321721.22 rows=30667722 width=4)
>                CStore File:
> /home/likewise-open/pg96/data/cstore_fdw/1818708/1849876
>                CStore File Size: 123236206
> (9 rows)
> I would like to know the reason for choosing this plan and Is there a easy
> fix to prevent such plans (especially like this one where it choose a larger
> hash table) ?

A plan with larger table being hashed doesn't necessarily bad
performing one. During partition-wise join analysis I have seen plans
with larger table being hashed perform better than the plans with
smaller table being hashed. But I have seen the other way around as
well. Although, I don't know an easy way to force which side of join
gets hashed. I tried that under the debugger. In your case, if you run
EXPLAIN ANALYZE on this query, produce outputs of two plans: one with
larger table being hashed and second with the smaller one being
hashed, you will see which of them performs better.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Reply via email to