>> I think the gather-reader-order patch will fix this.  Here's a test
>> with all three patches.
> Yes, right. After applying all three patches, this problem is fixed;
> parallel hash join is now faster than normal hash join.

Thanks.  I've committed the two smaller patches; it seems fairly clear
that those are good changes independent of the parallel join stuff.

> I have tested one more case, which Amit mentioned. In that case the
> parallel plan (parallel degree >= 3) is still slow. In the normal case it
> selects "Hash Join", but with parallel workers > 3 it selects a parallel
> "Nest Loop Join", which makes it costlier.

Hmm, I'm not sure why that is happening.  I'll poke at it a bit.

