Re: Incorrect estimation of HashJoin rows resulted from inaccurate small table statistics

2023-06-17 Thread Tomas Vondra
On 6/17/23 02:02, Quan Zongliang wrote:
> On 2023/6/17 06:46, Tom Lane wrote:
>> Quan Zongliang writes:
>>> Perhaps we should discard this (dups cnt > 1) restriction?
>>
>> That's not going to happen on the basis of one test case that you
>> haven't even shown us.  The implications of doing

Re: Incorrect estimation of HashJoin rows resulted from inaccurate small table statistics

2023-06-17 Thread Tomas Vondra
On 6/17/23 00:32, Quan Zongliang wrote:
> ...
>
> It's not just a small table. If a column's value is nearly unique. It
> also causes the same problem because we exclude values that occur only
> once. samplerows <= num_mcv just solves one scenario.
> Perhaps we should discard this (dups cnt > 1) re

Re: Incorrect estimation of HashJoin rows resulted from inaccurate small table statistics

2023-06-16 Thread Quan Zongliang
On 2023/6/17 06:46, Tom Lane wrote:
Quan Zongliang writes:
Perhaps we should discard this (dups cnt > 1) restriction?
That's not going to happen on the basis of one test case that you haven't even shown us. The implications of doing it are very unclear. In particular, I seem to recall tha

Re: Incorrect estimation of HashJoin rows resulted from inaccurate small table statistics

2023-06-16 Thread Tom Lane
Quan Zongliang writes:
> Perhaps we should discard this (dups cnt > 1) restriction?
That's not going to happen on the basis of one test case that you haven't even shown us. The implications of doing it are very unclear. In particular, I seem to recall that there are bits of logic that depend on

Re: Incorrect estimation of HashJoin rows resulted from inaccurate small table statistics

2023-06-16 Thread Quan Zongliang
On 2023/6/16 23:39, Tomas Vondra wrote:
On 6/16/23 11:25, Quan Zongliang wrote:
We have a small table with only 23 rows and 21 values.
The resulting MCV and histogram is as follows
stanumbers1 | {0.08695652,0.08695652}
stavalues1  | {v1,v2}
stavalues2  | {v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,

Re: Incorrect estimation of HashJoin rows resulted from inaccurate small table statistics

2023-06-16 Thread Tomas Vondra
On 6/16/23 11:25, Quan Zongliang wrote:
>
> We have a small table with only 23 rows and 21 values.
>
> The resulting MCV and histogram is as follows
> stanumbers1 | {0.08695652,0.08695652}
> stavalues1  | {v1,v2}
> stavalues2  |
> {v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,v13,v14,v15,v16,v17,v18,v19,v

Incorrect estimation of HashJoin rows resulted from inaccurate small table statistics

2023-06-16 Thread Quan Zongliang
We have a small table with only 23 rows and 21 values.
The resulting MCV and histogram is as follows
stanumbers1 | {0.08695652,0.08695652}
stavalues1  | {v1,v2}
stavalues2  | {v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,v13,v14,v15,v16,v17,v18,v19,v20,v21}
An incorrect number of rows was estimated when
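
The statistics quoted in the original report can be reproduced arithmetically. The sketch below is only an illustration of the "dups cnt > 1" rule discussed in the thread, not PostgreSQL's actual ANALYZE code: it assumes the 23 rows consist of v1 and v2 appearing twice each and v3..v21 appearing once each (the only split consistent with the reported numbers), and the function name split_sample is hypothetical.

```python
from collections import Counter


def split_sample(sample_rows):
    """Split sampled values into MCV candidates and the remainder.

    Simplified assumption: a value becomes an MCV entry only if it
    occurs more than once in the sample (the "dups cnt > 1" rule).
    Values seen exactly once are left for the histogram.
    """
    counts = Counter(sample_rows)
    total = len(sample_rows)
    # MCV frequency = occurrences / number of sampled rows
    mcv = {value: n / total for value, n in counts.items() if n > 1}
    rest = [value for value, n in counts.items() if n == 1]
    return mcv, rest


# Assumed layout of the 23-row table: v1 and v2 twice, v3..v21 once.
rows = ["v1", "v1", "v2", "v2"] + [f"v{i}" for i in range(3, 22)]
mcv, rest = split_sample(rows)

for value, freq in mcv.items():
    print(f"{value}: {freq:.8f}")
print(f"{len(rest)} values left for the histogram")
```

Under these assumptions each MCV entry has frequency 2/23, which rounds to the 0.08695652 shown in stanumbers1, and the 19 single-occurrence values are the ones listed in stavalues2.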