Hi,
I ran the TPC-DS benchmark on Postgres and found that join size estimation
has several problems.
For example, ndistinct is key to estimating join selectivity, but this value
does not take the restrictions on the rel into account. I hit some cases in
the function eqjoinsel where nd is much larger than vardata.rel->rows.
An accurate estimate needs a good mathematical model that considers the
dependency between the join var and the vars in the restrictions.
But at the very least, ndistinct should not be greater than the number of
rows. See the attached patch, which adjusts nd in eqjoinsel.
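
To make the idea concrete, here is a minimal standalone sketch of the clamp.
It is not the code from the attached patch. The helper names
clamp_ndistinct and simple_eqjoin_selectivity are hypothetical, and the
selectivity formula is a simplification of the case with no MCV lists and
no NULLs, where selectivity is roughly 1 / max(nd1, nd2).

```c
/*
 * Hypothetical helper (not from the attached patch): cap an ndistinct
 * estimate at the estimated row count of the rel after restrictions.
 */
static double
clamp_ndistinct(double nd, double rel_rows)
{
    /* Treat a non-positive row estimate as "unknown" and leave nd alone. */
    if (!(rel_rows > 0.0))
        return nd;

    /* A column cannot have more distinct values than the rel has rows. */
    return (nd > rel_rows) ? rel_rows : nd;
}

/*
 * Simplified equijoin selectivity, assuming no MCV lists and no NULLs:
 * 1 / max(nd1, nd2). An inflated nd makes this value too small, so the
 * join size ends up underestimated.
 */
static double
simple_eqjoin_selectivity(double nd1, double nd2)
{
    double larger = (nd1 > nd2) ? nd1 : nd2;

    return 1.0 / larger;
}
```

With nd = 1000 from the column statistics but only 50 rows surviving the
restrictions, the clamp brings nd down to 50, and the simplified selectivity
moves from 1/1000 to 1/50 when the other side has at most 50 distinct values.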
Best,
Zhenghua Lyu
0001-Adjust-ndistinct-with-nrows-in-the-rel-when-estimati.patch
