I looked into the problem described at
https://www.postgresql.org/message-id/flat/VisenaEmail.26.df42f82acae38a58.156463942b8%40tc7-visena
and I believe I've reproduced it: the requirement is that the inner join
column for the antijoin must contain a lot of NULL values, and what isn't
NULL must be unique or nearly so. If ANALYZE doesn't come across any
duplicated values, it will set the column's stadistinct value to "-1",
which causes the planner to believe that each row of the inner table
produces a unique value, resulting in a bogus answer from
get_variable_numdistinct() and thence a bogus join size estimate.
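For reference, here is roughly how a negative stadistinct becomes a
distinct-values estimate at plan time. This is only a sketch of the
fraction-of-rows convention documented in pg_statistic.h, with a made-up
helper name, not the actual code of get_variable_numdistinct():

    /*
     * Sketch of the stadistinct convention: a positive value is an
     * absolute count of distinct values; a negative value is (minus) a
     * fraction of the table's row count, so it scales as the table grows.
     */
    double
    sketch_num_distinct(double stadistinct, double ntuples)
    {
        if (stadistinct > 0.0)
            return stadistinct;
        return -stadistinct * ntuples;
    }

With stadistinct = -1.0 and ntuples = 10000, that comes out as 10000
distinct values, i.e. "every row is distinct", making no allowance at all
for the nulls.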
Here's an example in the regression database, making use of the existing
unique column tenk1.unique1:
regression=# create table manynulls as select case when random() < 0.1 then unique1 else null end as unique1 from tenk1;
SELECT 10000
regression=# analyze manynulls;
ANALYZE
regression=# explain analyze select * from tenk1 t where not exists(select 1 from manynulls m where m.unique1 = t.unique1);
                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Hash Anti Join  (cost=261.00..756.50 rows=1 width=244) (actual time=4.632..14.729 rows=8973 loops=1)
   Hash Cond: (t.unique1 = m.unique1)
   ->  Seq Scan on tenk1 t  (cost=0.00..458.00 rows=10000 width=244) (actual time=0.015..2.683 rows=10000 loops=1)
   ->  Hash  (cost=136.00..136.00 rows=10000 width=4) (actual time=4.553..4.553 rows=1027 loops=1)
         Buckets: 16384  Batches: 1  Memory Usage: 165kB
         ->  Seq Scan on manynulls m  (cost=0.00..136.00 rows=10000 width=4) (actual time=0.019..2.668 rows=10000 loops=1)
 Planning time: 0.808 ms
 Execution time: 15.670 ms
(8 rows)
So the antijoin size estimate is way off, but it's hardly the planner's
fault because the stats are insane:
regression=# select attname,null_frac,n_distinct from pg_stats where tablename = 'manynulls';
 attname | null_frac | n_distinct
---------+-----------+------------
 unique1 |    0.8973 |         -1
(1 row)
With the patch attached below, ANALYZE produces
regression=# analyze manynulls;
ANALYZE
regression=# select attname,null_frac,n_distinct from pg_stats where tablename = 'manynulls';
 attname | null_frac | n_distinct
---------+-----------+------------
 unique1 |    0.8973 |    -0.1027
(1 row)
and now the join size estimate is dead on:
regression=# explain analyze select * from tenk1 t where not exists(select 1 from manynulls m where m.unique1 = t.unique1);
                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Hash Anti Join  (cost=261.00..847.69 rows=8973 width=244) (actual time=4.501..13.888 rows=8973 loops=1)
   Hash Cond: (t.unique1 = m.unique1)
   ->  Seq Scan on tenk1 t  (cost=0.00..458.00 rows=10000 width=244) (actual time=0.031..4.959 rows=10000 loops=1)
   ->  Hash  (cost=136.00..136.00 rows=10000 width=4) (actual time=4.404..4.404 rows=1027 loops=1)
         Buckets: 16384  Batches: 1  Memory Usage: 165kB
         ->  Seq Scan on manynulls m  (cost=0.00..136.00 rows=10000 width=4) (actual time=0.034..2.576 rows=10000 loops=1)
 Planning time: 1.388 ms
 Execution time: 14.542 ms
(8 rows)
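Running the same arithmetic with the corrected stats (same caveats as
before):

    sketch_num_distinct(-0.1027, 10000.0);  /* => ~1027 distinct inner values */

Only 1027 of the 10000 outer values can find a partner, so the antijoin
should pass 10000 - 1027 = 8973 rows, which is precisely both the
estimate and the actual result.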
What I did in the patch is to scale the formerly fixed "-1.0" stadistinct
estimate to discount the fraction of nulls we found. An alternative
possibility might be to decree that a fractional stadistinct considers
only non-nulls, but then all the other paths through ANALYZE would be
wrong. The spec for it in pg_statistic.h doesn't suggest any such
interpretation, either.
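To spell out why the chosen interpretation hangs together: the patched
ANALYZE keeps the rule that a fractional stadistinct is a fraction of all
rows, nulls included, so the planner side needs no change at all. A
sketch, again reusing the hypothetical helper from above:

    double  ntuples = 10000.0;
    double  stanullfrac = 0.8973;

    /* what the patched ANALYZE stores for a unique-except-nulls column */
    double  stadistinct = -1.0 * (1.0 - stanullfrac);       /* -0.1027 */

    /* what the unchanged planner then derives from it */
    double  ndistinct = sketch_num_distinct(stadistinct, ntuples);  /* ~1027 */

That is, one distinct value per non-null row, and the estimate still
scales properly if the table grows.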
Looking around, there are a couple of places outside commands/analyze.c
that are making the same mistake, so this patch isn't complete, but it
illustrates what needs to be done.
This is a bit reminiscent of the nulls-accounting problem we fixed in
commit be4b4dc75, though that one tended to produce underestimates, not
overestimates, of the number of distinct values. We didn't back-patch
that fix, so probably we shouldn't back-patch this one either. On the
other hand, it's far more open-and-shut that this behavior is wrong.
Thoughts?
regards, tom lane
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 5fcedd7..9ac7122 100644
*** a/src/backend/commands/analyze.c
--- b/src/backend/commands/analyze.c
*************** compute_distinct_stats(VacAttrStatsP sta
*** 2049,2056 ****
          if (nmultiple == 0)
          {
!             /* If we found no repeated values, assume it's a unique column */
!             stats->stadistinct = -1.0;
          }
          else if (track_cnt < track_max && toowide_cnt == 0 &&
                   nmultiple == track_cnt)
--- 2049,2059 ----
          if (nmultiple == 0)
          {
!             /*
!              * If we found no repeated non-null values, assume it's a unique
!              * column; but be sure to discount for any nulls we found.
!              */
!             stats->stadistinct = -1.0 * (1.0 - stats->stanullfrac);
          }
          else if (track_cnt < track_max && toowide_cnt == 0 &&
                   nmultiple == track_cnt)
*************** compute_scalar_stats(VacAttrStatsP stats
*** 2426,2433 ****
          if (nmultiple == 0)
          {
!             /* If we found no repeated values, assume it's a unique column */
!             stats->stadistinct = -1.0;
          }
          else if (toowide_cnt == 0 && nmultiple == ndistinct)
          {
--- 2429,2439 ----
          if (nmultiple == 0)
          {
!             /*
!              * If we found no repeated non-null values, assume it's a unique
!              * column; but be sure to discount for any nulls we found.
!              */
!             stats->stadistinct = -1.0 * (1.0 - stats->stanullfrac);
          }
          else if (toowide_cnt == 0 && nmultiple == ndistinct)
          {
*************** compute_scalar_stats(VacAttrStatsP stats
*** 2753,2759 ****
          else
              stats->stawidth = stats->attrtype->typlen;
          /* Assume all too-wide values are distinct, so it's a unique column */
!         stats->stadistinct = -1.0;
      }
      else if (null_cnt > 0)
      {
--- 2759,2765 ----
          else
              stats->stawidth = stats->attrtype->typlen;
          /* Assume all too-wide values are distinct, so it's a unique column */
!         stats->stadistinct = -1.0 * (1.0 - stats->stanullfrac);
      }
      else if (null_cnt > 0)
      {