On Thu, Oct 07, 2021 at 03:26:46PM -0500, Jaime Casanova wrote:
> On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:
> > On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:
> > > It seems like your patch should also check "inh" in examine_variable and
> > > statext_expressions_load.
> >
> > I tried adding that - I mostly kept my patches separate.
> > Hopefully this is more helpful than a complication.
> > I added at: https://commitfest.postgresql.org/35/3332/
> >
>
> Actually, this is confusing. Which patch is the one we should be
> reviewing?
It is confusing, but not as much as I first thought. Please check the commit messages.

The first two patches are meant to be applied to master *and* backpatched. The first one is intended to fix the bug that non-inherited stats are being used for queries of inheritance trees. The 2nd one fixes the regression that stats are not collected for inheritance trees of partitioned tables (which is the only type of stats they could ever possibly have).

And the 3rd+4th patches (Tomas' plus my changes) allow collecting both inherited and non-inherited stats, only in master, since it requires a catalog change. It's a bit confusing that patch #4 removes most of what I added in patches 1 and 2. But that's exactly what's needed to collect and apply both inherited and non-inherited stats: the first two patches avoid applying stats collected with the wrong inheritance. That's also what's needed for the patchset to follow the normal "apply to master and backpatch" process, rather than 2 patches which are backpatched but not applied to master, and one which is applied to master and not backpatched.

@Tomas: I just found commit 427c6b5b9, which is a remarkably similar issue affecting column stats 15 years ago.

Rebased since there were conflicts with my typos fixes.

-- 
Justin
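To make the master-only design of the 3rd+4th patches concrete, here is a minimal sketch, not the patch code itself. It assumes that each statistics object may have up to two data rows, keyed by the statistics OID plus an inherit flag, and that a consumer asks for the row whose flag matches the kind of scan being planned. The struct and function names (ExtStatsRowSketch, find_ext_stats_row) are hypothetical stand-ins, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one row of extended-statistics data. */
typedef struct ExtStatsRowSketch
{
    unsigned int stxoid;      /* statistics object this row belongs to */
    bool         stxdinherit; /* true: built over the whole inheritance tree */
} ExtStatsRowSketch;

/*
 * Return the row for (stxoid, inh), or NULL if no such row has been built.
 * The point is that the lookup uses both keys, not the OID alone.
 */
const ExtStatsRowSketch *
find_ext_stats_row(const ExtStatsRowSketch *rows, size_t nrows,
                   unsigned int stxoid, bool inh)
{
    assert(rows != NULL || nrows == 0);

    for (size_t i = 0; i < nrows; i++)
    {
        if (rows[i].stxoid == stxoid && rows[i].stxdinherit == inh)
            return &rows[i];
    }
    return NULL;
}
```

With such a key, a query over the tree and a FROM ONLY query on the same parent each see only the row that was built from matching data, and a missing inherited row simply means "no stats" rather than "wrong stats".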
From f6fb0e139e5b19ea288040328590a91a8526e2a6 Mon Sep 17 00:00:00 2001 From: Justin Pryzby <[email protected]> Date: Sat, 25 Sep 2021 19:42:41 -0500 Subject: [PATCH v2 1/5] Do not use extended statistics on inheritance trees. Since 859b3003de, inherited ext stats are not built. However, the non-inherited stats were incorrectly used during planning of queries with inheritance hierarchies. Since the ext stats do not include child tables, they can lead to worse estimates. This is remarkably similar to 427c6b5b9, which affected column statistics 15 years ago. choose_best_statistics is handled a bit differently (in the calling function), because it isn't passed rel nor rel->inh, and it's an exported function, so avoid changing its signature in back branches. https://www.postgresql.org/message-id/flat/[email protected] Backpatch to v10 --- src/backend/statistics/dependencies.c | 5 +++++ src/backend/statistics/extended_stats.c | 5 +++++ src/backend/utils/adt/selfuncs.c | 9 +++++++++ src/test/regress/expected/stats_ext.out | 23 +++++++++++++++++++++++ src/test/regress/sql/stats_ext.sql | 14 ++++++++++++++ 5 files changed, 56 insertions(+) diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c index 8bf80db8e4..b2e33329c7 100644 --- a/src/backend/statistics/dependencies.c +++ b/src/backend/statistics/dependencies.c @@ -1593,6 +1593,11 @@ dependencies_clauselist_selectivity(PlannerInfo *root, int nexprs; int k; MVDependencies *deps; + RangeTblEntry *rte = root->simple_rte_array[rel->relid]; + + /* If it's an inheritence tree, skip statistics (which do not include child stats) */ + if (rte->inh) + break; /* skip statistics that are not of the correct type */ if (stat->kind != STATS_EXT_DEPENDENCIES) diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c index 69ca52094f..6c69e5bc96 100644 --- a/src/backend/statistics/extended_stats.c +++ b/src/backend/statistics/extended_stats.c @@ -1744,6
+1744,11 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli StatisticExtInfo *stat; List *stat_clauses; Bitmapset *simple_clauses; + RangeTblEntry *rte = root->simple_rte_array[rel->relid]; + + /* If it's an inheritence tree, skip statistics (which do not include child stats) */ + if (rte->inh) + break; /* find the best suited statistics object for these attnums */ stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index 10895fb287..a0932e39e1 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -3913,6 +3913,11 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel, Oid statOid = InvalidOid; MVNDistinct *stats; StatisticExtInfo *matched_info = NULL; + RangeTblEntry *rte = root->simple_rte_array[rel->relid]; + + /* If it's an inheritence tree, skip statistics (which do not include child stats) */ + if (rte->inh) + return false; /* bail out immediately if the table has no extended statistics */ if (!rel->statlist) @@ -5232,6 +5237,10 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, if (vardata->statsTuple) break; + /* If it's an inheritence tree, skip statistics (which do not include child stats) */ + if (planner_rt_fetch(onerel->relid, root)->inh) + break; + /* skip stats without per-expression stats */ if (info->kind != STATS_EXT_EXPRESSIONS) continue; diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out index c60ba45aba..5c15e44bd6 100644 --- a/src/test/regress/expected/stats_ext.out +++ b/src/test/regress/expected/stats_ext.out @@ -176,6 +176,29 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; ANALYZE ab1; DROP TABLE ab1 CASCADE; NOTICE: drop cascades to table ab1c +-- Ensure non-inherited stats are not applied to inherited query +CREATE TABLE stxdinh(i int, j int); +CREATE TABLE stxdinh1() INHERITS(stxdinh); +INSERT INTO stxdinh 
SELECT a, a/10 FROM generate_series(1,9)a; +INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a; +VACUUM ANALYZE stxdinh, stxdinh1; +-- Without stats object, it looks like this +SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); + estimated | actual +-----------+-------- + 1000 | 1008 +(1 row) + +CREATE STATISTICS stxdinh ON i,j FROM stxdinh; +VACUUM ANALYZE stxdinh, stxdinh1; +-- Since the stats object does not include inherited stats, it should not affect the estimates +SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); + estimated | actual +-----------+-------- + 1000 | 1008 +(1 row) + +DROP TABLE stxdinh, stxdinh1; -- basic test for statistics on expressions CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ); -- expression stats may be built on a single expression column diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql index 6fb37962a7..610f7ed17f 100644 --- a/src/test/regress/sql/stats_ext.sql +++ b/src/test/regress/sql/stats_ext.sql @@ -112,6 +112,20 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; ANALYZE ab1; DROP TABLE ab1 CASCADE; +-- Ensure non-inherited stats are not applied to inherited query +CREATE TABLE stxdinh(i int, j int); +CREATE TABLE stxdinh1() INHERITS(stxdinh); +INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a; +INSERT INTO stxdinh1 SELECT a, a FROM generate_series(1,999)a; +VACUUM ANALYZE stxdinh, stxdinh1; +-- Without stats object, it looks like this +SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); +CREATE STATISTICS stxdinh ON i,j FROM stxdinh; +VACUUM ANALYZE stxdinh, stxdinh1; +-- Since the stats object does not include inherited stats, it should not affect the estimates +SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); +DROP TABLE stxdinh, stxdinh1; + -- basic test for statistics on expressions CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d 
TIMESTAMPTZ); -- 2.17.0
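As a worked check of the regression test in the patch above, the following sketch counts the distinct (i, j) groups in the test data directly. It only reproduces the "actual" column (1008) and the number of groups in the parent alone; it makes no claim about what the planner would estimate. The helper names (mark_pair, count_groups) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_VAL 1000

static bool seen[MAX_VAL][MAX_VAL];

/* Record the pair (i, j); return 1 if it was new, 0 if already seen. */
static int
mark_pair(int i, int j)
{
    assert(i >= 0 && i < MAX_VAL && j >= 0 && j < MAX_VAL);
    if (seen[i][j])
        return 0;
    seen[i][j] = true;
    return 1;
}

/* Count distinct (i, j) groups in the parent only, or in parent plus child. */
int
count_groups(bool include_child)
{
    int groups = 0;

    memset(seen, 0, sizeof(seen));

    /* stxdinh: SELECT a, a/10 FROM generate_series(1,9) */
    for (int a = 1; a <= 9; a++)
        groups += mark_pair(a, a / 10);

    /* stxdinh1: SELECT a, a FROM generate_series(1,999) */
    if (include_child)
    {
        for (int a = 1; a <= 999; a++)
            groups += mark_pair(a, a);
    }

    return groups;
}
```

The parent alone holds 9 groups while the whole tree holds 1008, so group counts gathered from the parent alone describe a very different data set from the one the `stxdinh*` query reads.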
From b5bcff5331d052ea753d25ec7f443dc1d807fb13 Mon Sep 17 00:00:00 2001 From: Tomas Vondra <[email protected]> Date: Sat, 25 Sep 2021 23:01:21 +0200 Subject: [PATCH v2 2/5] Build inherited extended stats on partitioned tables Since 859b3003de, ext stats on partitioned tables are not built, which is a regression. For back branches, pg_statistic_ext cannot support both inherited (FROM) and non-inherited (FROM ONLY) stats on inheritance hierarchies. But there's no issue building inherited stats for partitioned tables, which are themselves empty and so cannot have non-inherited stats. See also: 8c5cdb7f4f6e1d6a6104cb58ce4f23453891651b https://www.postgresql.org/message-id/20210923212624.GI831%40telsasoft.com Backpatch to v10 --- src/backend/commands/analyze.c | 5 ++++- src/backend/statistics/dependencies.c | 2 +- src/backend/statistics/extended_stats.c | 2 +- src/backend/utils/adt/selfuncs.c | 9 ++++++--- src/test/regress/expected/stats_ext.out | 19 +++++++++++++++++++ src/test/regress/sql/stats_ext.sql | 10 ++++++++++ 6 files changed, 41 insertions(+), 6 deletions(-) diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c index 8bfb2ad958..299f4893b8 100644 --- a/src/backend/commands/analyze.c +++ b/src/backend/commands/analyze.c @@ -548,6 +548,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params, { MemoryContext col_context, old_context; + bool build_ext_stats; pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE, PROGRESS_ANALYZE_PHASE_COMPUTE_STATS); @@ -611,13 +612,15 @@ do_analyze_rel(Relation onerel, VacuumParams *params, thisdata->attr_cnt, thisdata->vacattrstats); } + build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh); + /* * Build extended statistics (if there are any). * * For now we only build extended statistics on individual relations, * not for relations representing inheritance trees.
*/ - if (!inh) + if (build_ext_stats) BuildRelationExtStatistics(onerel, totalrows, numrows, rows, attr_cnt, vacattrstats); } diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c index b2e33329c7..0659307b02 100644 --- a/src/backend/statistics/dependencies.c +++ b/src/backend/statistics/dependencies.c @@ -1596,7 +1596,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root, RangeTblEntry *rte = root->simple_rte_array[rel->relid]; /* If it's an inheritence tree, skip statistics (which do not include child stats) */ - if (rte->inh) + if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) break; /* skip statistics that are not of the correct type */ diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c index 6c69e5bc96..9e518830ae 100644 --- a/src/backend/statistics/extended_stats.c +++ b/src/backend/statistics/extended_stats.c @@ -1747,7 +1747,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli RangeTblEntry *rte = root->simple_rte_array[rel->relid]; /* If it's an inheritence tree, skip statistics (which do not include child stats) */ - if (rte->inh) + if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) break; /* find the best suited statistics object for these attnums */ diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index a0932e39e1..b15f14e1a0 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -3916,7 +3916,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte = root->simple_rte_array[rel->relid]; /* If it's an inheritence tree, skip statistics (which do not include child stats) */ - if (rte->inh) + if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) return false; /* bail out immediately if the table has no extended statistics */ @@ -5238,8 +5238,11 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, break; /* If 
it's an inheritence tree, skip statistics (which do not include child stats) */ - if (planner_rt_fetch(onerel->relid, root)->inh) - break; + { + RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root); + if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) + break; + } /* skip stats without per-expression stats */ if (info->kind != STATS_EXT_EXPRESSIONS) diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out index 5c15e44bd6..67234b9fc2 100644 --- a/src/test/regress/expected/stats_ext.out +++ b/src/test/regress/expected/stats_ext.out @@ -199,6 +199,25 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); (1 row) DROP TABLE stxdinh, stxdinh1; +-- Ensure inherited stats ARE applied to inherited query in partitioned table +CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i); +CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100); +INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a; +CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp; +VACUUM ANALYZE stxdinp; -- partitions are processed recursively +SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass; + ?column? 
+---------- + 1 +(1 row) + +SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2'); + estimated | actual +-----------+-------- + 10 | 10 +(1 row) + +DROP TABLE stxdinp; -- basic test for statistics on expressions CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ); -- expression stats may be built on a single expression column diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql index 610f7ed17f..2371043ca1 100644 --- a/src/test/regress/sql/stats_ext.sql +++ b/src/test/regress/sql/stats_ext.sql @@ -126,6 +126,16 @@ VACUUM ANALYZE stxdinh, stxdinh1; SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); DROP TABLE stxdinh, stxdinh1; +-- Ensure inherited stats ARE applied to inherited query in partitioned table +CREATE TABLE stxdinp(i int, a int, b int) PARTITION BY RANGE (i); +CREATE TABLE stxdinp1 PARTITION OF stxdinp FOR VALUES FROM (1)TO(100); +INSERT INTO stxdinp SELECT 1, a/100, a/100 FROM generate_series(1,999)a; +CREATE STATISTICS stxdinp ON (a),(b) FROM stxdinp; +VACUUM ANALYZE stxdinp; -- partitions are processed recursively +SELECT 1 FROM pg_statistic_ext WHERE stxrelid='stxdinp'::regclass; +SELECT * FROM check_estimated_rows('SELECT a, b FROM stxdinp GROUP BY 1,2'); +DROP TABLE stxdinp; + -- basic test for statistics on expressions CREATE TABLE ab1 (a INTEGER, b INTEGER, c TIMESTAMP, d TIMESTAMPTZ); -- 2.17.0
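The back-branch behavior after the first two patches can be summarized as a pair of predicates, sketched below under simplifying assumptions. RelKindSketch and both function names are hypothetical; the real code reads relkind from the relation or RangeTblEntry and is spread over several call sites. The self-check only asserts that whatever ANALYZE pass builds the stats, a query with the same "inh" setting is allowed to use them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two relation kinds that matter here. */
typedef enum RelKindSketch
{
    REL_PLAIN,       /* ordinary table, possibly an inheritance parent */
    REL_PARTITIONED  /* partitioned table: holds no rows of its own */
} RelKindSketch;

/* ANALYZE side: does this pass (inh or !inh) build the extended stats? */
bool
should_build_ext_stats(RelKindSketch relkind, bool inh)
{
    return (relkind == REL_PARTITIONED) ? inh : !inh;
}

/* Planner side: may the stored extended stats be applied to this scan? */
bool
can_apply_ext_stats(RelKindSketch relkind, bool rte_inh)
{
    return !(rte_inh && relkind != REL_PARTITIONED);
}

/* Stats built for a given (relkind, inh) must be usable for that same scan. */
void
check_build_implies_apply(void)
{
    for (int k = REL_PLAIN; k <= REL_PARTITIONED; k++)
    {
        for (int inh = 0; inh <= 1; inh++)
        {
            if (should_build_ext_stats((RelKindSketch) k, inh != 0))
                assert(can_apply_ext_stats((RelKindSketch) k, inh != 0));
        }
    }
}
```

For a plain inheritance parent this means stats are built and used only for FROM ONLY scans; for a partitioned table they are built from the whole tree, which is the only data there is.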
From 8672bbd68cfb76447aafde5133bf2b897625be75 Mon Sep 17 00:00:00 2001 From: Tomas Vondra <[email protected]> Date: Sat, 25 Sep 2021 21:27:10 +0200 Subject: [PATCH v2 3/5] Add stxdinherit; build inherited extended stats on inheritance parents pg_statistic has an inherited flag which is part of the unique index, but pg_statistic_ext_data has never had that. In back branches, pg_statistic_ext_data cannot store both inherited and non-inherited stats. So it stores non-inherited stats (FROM ONLY) for inheritance parents and inherited stats for partitioned tables. This patch allows storing both inherited and non-inherited stats for non-empty inheritance parents, and avoids the above, confusing definition. --- doc/src/sgml/catalogs.sgml | 23 +++ src/backend/catalog/system_views.sql | 1 + src/backend/commands/analyze.c | 15 +- src/backend/commands/statscmds.c | 20 ++- src/backend/optimizer/util/plancat.c | 186 +++++++++++--------- src/backend/statistics/dependencies.c | 13 +- src/backend/statistics/extended_stats.c | 67 ++++--- src/backend/statistics/mcv.c | 9 +- src/backend/statistics/mvdistinct.c | 5 +- src/backend/utils/adt/selfuncs.c | 2 +- src/backend/utils/cache/syscache.c | 6 +- src/include/catalog/pg_statistic_ext_data.h | 4 +- src/include/nodes/pathnodes.h | 1 + src/include/statistics/statistics.h | 9 +- src/test/regress/expected/rules.out | 1 + 15 files changed, 217 insertions(+), 145 deletions(-) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index fd6910ddbe..dc79b0737f 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -7443,6 +7443,19 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l created with <link linkend="sql-createstatistics"><command>CREATE STATISTICS</command></link>. </para> + <para> + Normally there is one entry, with <structfield>stxdinherit</structfield> = + <literal>false</literal>, for each statistics object that has been analyzed.
+ If the table has inheritance children, a second entry with + <structfield>stxdinherit</structfield> = <literal>true</literal> is also created. + This row represents the statistics object over the inheritance tree, i.e., + statistics for the data you'd see with + <literal>SELECT * FROM <replaceable>table</replaceable>*</literal>, + whereas the <structfield>stxdinherit</structfield> = <literal>false</literal> row + represents the results of + <literal>SELECT * FROM ONLY <replaceable>table</replaceable></literal>. + </para> + <para> Like <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>, <structname>pg_statistic_ext_data</structname> should not be @@ -7482,6 +7495,16 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l </para></entry> </row> + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>stxdinherit</structfield> <type>bool</type> + </para> + <para> + If true, the stats include inheritance child columns, not just the + values in the specified relation + </para></entry> + </row> + <row> <entry role="catalog_table_entry"><para role="column_definition"> <structfield>stxdndistinct</structfield> <type>pg_ndistinct</type> diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 55f6e3711d..07ab18dc52 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -266,6 +266,7 @@ CREATE VIEW pg_stats_ext WITH (security_barrier) AS ) AS attnames, pg_get_statisticsobjdef_expressions(s.oid) as exprs, s.stxkind AS kinds, + sd.stxdinherit AS inherited, sd.stxdndistinct AS n_distinct, sd.stxddependencies AS dependencies, m.most_common_vals, diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c index 299f4893b8..7f4b0f5320 100644 --- a/src/backend/commands/analyze.c +++ b/src/backend/commands/analyze.c @@ -548,7 +548,6 @@ do_analyze_rel(Relation onerel, VacuumParams *params, { 
MemoryContext col_context, old_context; - bool build_ext_stats; pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE, PROGRESS_ANALYZE_PHASE_COMPUTE_STATS); @@ -612,17 +611,9 @@ do_analyze_rel(Relation onerel, VacuumParams *params, thisdata->attr_cnt, thisdata->vacattrstats); } - build_ext_stats = (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) ? inh : (!inh); - - /* - * Build extended statistics (if there are any). - * - * For now we only build extended statistics on individual relations, - * not for relations representing inheritance trees. - */ - if (build_ext_stats) - BuildRelationExtStatistics(onerel, totalrows, numrows, rows, - attr_cnt, vacattrstats); + /* Build extended statistics (if there are any). */ + BuildRelationExtStatistics(onerel, inh, totalrows, numrows, rows, + attr_cnt, vacattrstats); } pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE, diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c index 8f1550ec80..4395d878c7 100644 --- a/src/backend/commands/statscmds.c +++ b/src/backend/commands/statscmds.c @@ -524,6 +524,9 @@ CreateStatistics(CreateStatsStmt *stmt) datavalues[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statoid); + /* create only the "stxdinherit=false", because that always exists */ + datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(false); + /* no statistics built yet */ datanulls[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true; datanulls[Anum_pg_statistic_ext_data_stxddependencies - 1] = true; @@ -726,6 +729,7 @@ RemoveStatisticsById(Oid statsOid) HeapTuple tup; Form_pg_statistic_ext statext; Oid relid; + int inh; /* * First delete the pg_statistic_ext_data tuple holding the actual @@ -733,14 +737,20 @@ RemoveStatisticsById(Oid statsOid) */ relation = table_open(StatisticExtDataRelationId, RowExclusiveLock); - tup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid)); + /* hack to delete both stxdinherit = true/false */ + for (inh = 0; inh <=
1; inh++) + { + tup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid), + BoolGetDatum(inh)); - if (!HeapTupleIsValid(tup)) /* should not happen */ - elog(ERROR, "cache lookup failed for statistics data %u", statsOid); + if (!HeapTupleIsValid(tup)) /* should not happen */ + // elog(ERROR, "cache lookup failed for statistics data %u", statsOid); + continue; - CatalogTupleDelete(relation, &tup->t_self); + CatalogTupleDelete(relation, &tup->t_self); - ReleaseSysCache(tup); + ReleaseSysCache(tup); + } table_close(relation, RowExclusiveLock); diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index c5194fdbbf..154d48a330 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -30,6 +30,7 @@ #include "catalog/pg_am.h" #include "catalog/pg_proc.h" #include "catalog/pg_statistic_ext.h" +#include "catalog/pg_statistic_ext_data.h" #include "foreign/fdwapi.h" #include "miscadmin.h" #include "nodes/makefuncs.h" @@ -1311,127 +1312,144 @@ get_relation_statistics(RelOptInfo *rel, Relation relation) { Oid statOid = lfirst_oid(l); Form_pg_statistic_ext staForm; + Form_pg_statistic_ext_data dataForm; HeapTuple htup; HeapTuple dtup; Bitmapset *keys = NULL; List *exprs = NIL; int i; + int inh; htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid)); if (!HeapTupleIsValid(htup)) elog(ERROR, "cache lookup failed for statistics object %u", statOid); staForm = (Form_pg_statistic_ext) GETSTRUCT(htup); - dtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid)); - if (!HeapTupleIsValid(dtup)) - elog(ERROR, "cache lookup failed for statistics object %u", statOid); - - /* - * First, build the array of columns covered. This is ultimately - * wasted if no stats within the object have actually been built, but - * it doesn't seem worth troubling over that case. 
- */ - for (i = 0; i < staForm->stxkeys.dim1; i++) - keys = bms_add_member(keys, staForm->stxkeys.values[i]); - /* - * Preprocess expressions (if any). We read the expressions, run them - * through eval_const_expressions, and fix the varnos. + * Hack to load stats with stxdinherit true/false - there should be + * a better way to do this, I guess. */ + for (inh = 0; inh <= 1; inh++) { - bool isnull; - Datum datum; + dtup = SearchSysCache2(STATEXTDATASTXOID, + ObjectIdGetDatum(statOid), BoolGetDatum((bool) inh)); + if (!HeapTupleIsValid(dtup)) + continue; - /* decode expression (if any) */ - datum = SysCacheGetAttr(STATEXTOID, htup, - Anum_pg_statistic_ext_stxexprs, &isnull); + dataForm = (Form_pg_statistic_ext_data) GETSTRUCT(dtup); - if (!isnull) + /* + * First, build the array of columns covered. This is ultimately + * wasted if no stats within the object have actually been built, but + * it doesn't seem worth troubling over that case. + */ + for (i = 0; i < staForm->stxkeys.dim1; i++) + keys = bms_add_member(keys, staForm->stxkeys.values[i]); + + /* + * Preprocess expressions (if any). We read the expressions, run them + * through eval_const_expressions, and fix the varnos. + */ { - char *exprsString; + bool isnull; + Datum datum; - exprsString = TextDatumGetCString(datum); - exprs = (List *) stringToNode(exprsString); - pfree(exprsString); + /* decode expression (if any) */ + datum = SysCacheGetAttr(STATEXTOID, htup, + Anum_pg_statistic_ext_stxexprs, &isnull); - /* - * Run the expressions through eval_const_expressions. This is - * not just an optimization, but is necessary, because the - * planner will be comparing them to similarly-processed qual - * clauses, and may fail to detect valid matches without this. - * We must not use canonicalize_qual, however, since these - * aren't qual expressions. 
- */ - exprs = (List *) eval_const_expressions(NULL, (Node *) exprs); + if (!isnull) + { + char *exprsString; - /* May as well fix opfuncids too */ - fix_opfuncids((Node *) exprs); + exprsString = TextDatumGetCString(datum); + exprs = (List *) stringToNode(exprsString); + pfree(exprsString); - /* - * Modify the copies we obtain from the relcache to have the - * correct varno for the parent relation, so that they match - * up correctly against qual clauses. - */ - if (varno != 1) - ChangeVarNodes((Node *) exprs, 1, varno, 0); + /* + * Run the expressions through eval_const_expressions. This is + * not just an optimization, but is necessary, because the + * planner will be comparing them to similarly-processed qual + * clauses, and may fail to detect valid matches without this. + * We must not use canonicalize_qual, however, since these + * aren't qual expressions. + */ + exprs = (List *) eval_const_expressions(NULL, (Node *) exprs); + + /* May as well fix opfuncids too */ + fix_opfuncids((Node *) exprs); + + /* + * Modify the copies we obtain from the relcache to have the + * correct varno for the parent relation, so that they match + * up correctly against qual clauses. 
+ */ + if (varno != 1) + ChangeVarNodes((Node *) exprs, 1, varno, 0); + } } - } - /* add one StatisticExtInfo for each kind built */ - if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT)) - { - StatisticExtInfo *info = makeNode(StatisticExtInfo); + /* add one StatisticExtInfo for each kind built */ + if (statext_is_kind_built(dtup, STATS_EXT_NDISTINCT)) + { + StatisticExtInfo *info = makeNode(StatisticExtInfo); - info->statOid = statOid; - info->rel = rel; - info->kind = STATS_EXT_NDISTINCT; - info->keys = bms_copy(keys); - info->exprs = exprs; + info->statOid = statOid; + info->inherit = dataForm->stxdinherit; + info->rel = rel; + info->kind = STATS_EXT_NDISTINCT; + info->keys = bms_copy(keys); + info->exprs = exprs; - stainfos = lappend(stainfos, info); - } + stainfos = lappend(stainfos, info); + } - if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES)) - { - StatisticExtInfo *info = makeNode(StatisticExtInfo); + if (statext_is_kind_built(dtup, STATS_EXT_DEPENDENCIES)) + { + StatisticExtInfo *info = makeNode(StatisticExtInfo); - info->statOid = statOid; - info->rel = rel; - info->kind = STATS_EXT_DEPENDENCIES; - info->keys = bms_copy(keys); - info->exprs = exprs; + info->statOid = statOid; + info->inherit = dataForm->stxdinherit; + info->rel = rel; + info->kind = STATS_EXT_DEPENDENCIES; + info->keys = bms_copy(keys); + info->exprs = exprs; - stainfos = lappend(stainfos, info); - } + stainfos = lappend(stainfos, info); + } - if (statext_is_kind_built(dtup, STATS_EXT_MCV)) - { - StatisticExtInfo *info = makeNode(StatisticExtInfo); + if (statext_is_kind_built(dtup, STATS_EXT_MCV)) + { + StatisticExtInfo *info = makeNode(StatisticExtInfo); - info->statOid = statOid; - info->rel = rel; - info->kind = STATS_EXT_MCV; - info->keys = bms_copy(keys); - info->exprs = exprs; + info->statOid = statOid; + info->inherit = dataForm->stxdinherit; + info->rel = rel; + info->kind = STATS_EXT_MCV; + info->keys = bms_copy(keys); + info->exprs = exprs; - stainfos = 
lappend(stainfos, info); - } + stainfos = lappend(stainfos, info); + } - if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS)) - { - StatisticExtInfo *info = makeNode(StatisticExtInfo); + if (statext_is_kind_built(dtup, STATS_EXT_EXPRESSIONS)) + { + StatisticExtInfo *info = makeNode(StatisticExtInfo); - info->statOid = statOid; - info->rel = rel; - info->kind = STATS_EXT_EXPRESSIONS; - info->keys = bms_copy(keys); - info->exprs = exprs; + info->statOid = statOid; + info->inherit = dataForm->stxdinherit; + info->rel = rel; + info->kind = STATS_EXT_EXPRESSIONS; + info->keys = bms_copy(keys); + info->exprs = exprs; + + stainfos = lappend(stainfos, info); + } - stainfos = lappend(stainfos, info); + ReleaseSysCache(dtup); } ReleaseSysCache(htup); - ReleaseSysCache(dtup); bms_free(keys); } diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c index 0659307b02..835f4bdf7a 100644 --- a/src/backend/statistics/dependencies.c +++ b/src/backend/statistics/dependencies.c @@ -618,14 +618,16 @@ dependency_is_fully_matched(MVDependency *dependency, Bitmapset *attnums) * Load the functional dependencies for the indicated pg_statistic_ext tuple */ MVDependencies * -statext_dependencies_load(Oid mvoid) +statext_dependencies_load(Oid mvoid, bool inh) { MVDependencies *result; bool isnull; Datum deps; HeapTuple htup; - htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid)); + htup = SearchSysCache2(STATEXTDATASTXOID, + ObjectIdGetDatum(mvoid), + BoolGetDatum(inh)); if (!HeapTupleIsValid(htup)) elog(ERROR, "cache lookup failed for statistics object %u", mvoid); @@ -1410,6 +1412,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root, int ndependencies; int i; AttrNumber attnum_offset; + RangeTblEntry *rte = root->simple_rte_array[rel->relid]; /* unique expressions */ Node **unique_exprs; @@ -1603,6 +1606,10 @@ dependencies_clauselist_selectivity(PlannerInfo *root, if (stat->kind != STATS_EXT_DEPENDENCIES) continue; + /* skip 
statistics with mismatching stxdinherit value */ + if (stat->inherit != rte->inh) + continue; + /* * Count matching attributes - we have to undo the attnum offsets. The * input attribute numbers are not offset (expressions are not @@ -1649,7 +1656,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root, if (nmatched + nexprs < 2) continue; - deps = statext_dependencies_load(stat->statOid); + deps = statext_dependencies_load(stat->statOid, rte->inh); /* * The expressions may be represented by different attnums in the diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c index 9e518830ae..8cfcd17ad6 100644 --- a/src/backend/statistics/extended_stats.c +++ b/src/backend/statistics/extended_stats.c @@ -77,7 +77,7 @@ typedef struct StatExtEntry static List *fetch_statentries_for_relation(Relation pg_statext, Oid relid); static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs, int nvacatts, VacAttrStats **vacatts); -static void statext_store(Oid statOid, +static void statext_store(Oid statOid, bool inh, MVNDistinct *ndistinct, MVDependencies *dependencies, MCVList *mcv, Datum exprs, VacAttrStats **stats); static int statext_compute_stattarget(int stattarget, @@ -110,7 +110,7 @@ static StatsBuildData *make_build_data(Relation onerel, StatExtEntry *stat, * requested stats, and serializes them back into the catalog. 
*/ void -BuildRelationExtStatistics(Relation onerel, double totalrows, +BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows, int numrows, HeapTuple *rows, int natts, VacAttrStats **vacattrstats) { @@ -230,7 +230,8 @@ BuildRelationExtStatistics(Relation onerel, double totalrows, } /* store the statistics in the catalog */ - statext_store(stat->statOid, ndistinct, dependencies, mcv, exprstats, stats); + statext_store(stat->statOid, inh, + ndistinct, dependencies, mcv, exprstats, stats); /* for reporting progress */ pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED, @@ -781,7 +782,7 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs, * tuple. */ static void -statext_store(Oid statOid, +statext_store(Oid statOid, bool inh, MVNDistinct *ndistinct, MVDependencies *dependencies, MCVList *mcv, Datum exprs, VacAttrStats **stats) { @@ -790,14 +791,19 @@ statext_store(Oid statOid, oldtup; Datum values[Natts_pg_statistic_ext_data]; bool nulls[Natts_pg_statistic_ext_data]; - bool replaces[Natts_pg_statistic_ext_data]; pg_stextdata = table_open(StatisticExtDataRelationId, RowExclusiveLock); memset(nulls, true, sizeof(nulls)); - memset(replaces, false, sizeof(replaces)); memset(values, 0, sizeof(values)); + /* basic info */ + values[Anum_pg_statistic_ext_data_stxoid - 1] = ObjectIdGetDatum(statOid); + nulls[Anum_pg_statistic_ext_data_stxoid - 1] = false; + + values[Anum_pg_statistic_ext_data_stxdinherit - 1] = BoolGetDatum(inh); + nulls[Anum_pg_statistic_ext_data_stxdinherit - 1] = false; + /* * Construct a new pg_statistic_ext_data tuple, replacing the calculated * stats. 
@@ -830,25 +836,27 @@ statext_store(Oid statOid, values[Anum_pg_statistic_ext_data_stxdexpr - 1] = exprs; } - /* always replace the value (either by bytea or NULL) */ - replaces[Anum_pg_statistic_ext_data_stxdndistinct - 1] = true; - replaces[Anum_pg_statistic_ext_data_stxddependencies - 1] = true; - replaces[Anum_pg_statistic_ext_data_stxdmcv - 1] = true; - replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true; - - /* there should already be a pg_statistic_ext_data tuple */ - oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statOid)); - if (!HeapTupleIsValid(oldtup)) + /* + * Delete the old tuple if it exists, and insert a new one. It's easier + * than trying to update or insert, based on various conditions. + * + * There should always be a pg_statistic_ext_data tuple for inh=false, + * but there may be none for inh=true yet. + */ + oldtup = SearchSysCache2(STATEXTDATASTXOID, + ObjectIdGetDatum(statOid), + BoolGetDatum(inh)); + if (HeapTupleIsValid(oldtup)) + { + CatalogTupleDelete(pg_stextdata, &(oldtup->t_self)); + ReleaseSysCache(oldtup); + } + else if (!inh) elog(ERROR, "cache lookup failed for statistics object %u", statOid); - /* replace it */ - stup = heap_modify_tuple(oldtup, - RelationGetDescr(pg_stextdata), - values, - nulls, - replaces); - ReleaseSysCache(oldtup); - CatalogTupleUpdate(pg_stextdata, &stup->t_self, stup); + /* form a new tuple */ + stup = heap_form_tuple(RelationGetDescr(pg_stextdata), values, nulls); + CatalogTupleInsert(pg_stextdata, stup); heap_freetuple(stup); @@ -1234,7 +1242,7 @@ stat_covers_expressions(StatisticExtInfo *stat, List *exprs, * further tiebreakers are needed. 
*/ StatisticExtInfo * -choose_best_statistics(List *stats, char requiredkind, +choose_best_statistics(List *stats, char requiredkind, bool inh, Bitmapset **clause_attnums, List **clause_exprs, int nclauses) { @@ -1256,6 +1264,10 @@ choose_best_statistics(List *stats, char requiredkind, if (info->kind != requiredkind) continue; + /* skip statistics with mismatching inheritance flag */ + if (info->inherit != inh) + continue; + /* * Collect attributes and expressions in remaining (unestimated) * clauses fully covered by this statistic object. @@ -1694,6 +1706,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli List **list_exprs; /* expressions matched to any statistic */ int listidx; Selectivity sel = (is_or) ? 0.0 : 1.0; + RangeTblEntry *rte = root->simple_rte_array[rel->relid]; /* check if there's any stats that might be useful for us. */ if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV)) @@ -1751,7 +1764,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli break; /* find the best suited statistics object for these attnums */ - stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, + stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh, list_attnums, list_exprs, list_length(clauses)); @@ -1840,7 +1853,7 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli MCVList *mcv_list; /* Load the MCV list stored in the statistics object */ - mcv_list = statext_mcv_load(stat->statOid); + mcv_list = statext_mcv_load(stat->statOid, rte->inh); /* * Compute the selectivity of the ORed list of clauses covered by @@ -2411,7 +2424,7 @@ statext_expressions_load(Oid stxoid, int idx) HeapTupleData tmptup; HeapTuple tup; - htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid)); + htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false)); if (!HeapTupleIsValid(htup)) elog(ERROR, "cache lookup failed for statistics object %u", 
stxoid); diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c index 35b39ece07..173f746e41 100644 --- a/src/backend/statistics/mcv.c +++ b/src/backend/statistics/mcv.c @@ -559,12 +559,13 @@ build_column_frequencies(SortItem *groups, int ngroups, * Load the MCV list for the indicated pg_statistic_ext tuple. */ MCVList * -statext_mcv_load(Oid mvoid) +statext_mcv_load(Oid mvoid, bool inh) { MCVList *result; bool isnull; Datum mcvlist; - HeapTuple htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid)); + HeapTuple htup = SearchSysCache2(STATEXTDATASTXOID, + ObjectIdGetDatum(mvoid), BoolGetDatum(inh)); if (!HeapTupleIsValid(htup)) elog(ERROR, "cache lookup failed for statistics object %u", mvoid); @@ -2040,11 +2041,13 @@ mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat, MCVList *mcv; Selectivity s = 0.0; + RangeTblEntry *rte = root->simple_rte_array[rel->relid]; + /* match/mismatch bitmap for each MCV item */ bool *matches = NULL; /* load the MCV list stored in the statistics object */ - mcv = statext_mcv_load(stat->statOid); + mcv = statext_mcv_load(stat->statOid, rte->inh); /* build a match bitmap for the clauses */ matches = mcv_get_match_bitmap(root, clauses, stat->keys, stat->exprs, diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c index 4481312d61..ab1f10d6c0 100644 --- a/src/backend/statistics/mvdistinct.c +++ b/src/backend/statistics/mvdistinct.c @@ -146,14 +146,15 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data) * Load the ndistinct value for the indicated pg_statistic_ext tuple */ MVNDistinct * -statext_ndistinct_load(Oid mvoid) +statext_ndistinct_load(Oid mvoid, bool inh) { MVNDistinct *result; bool isnull; Datum ndist; HeapTuple htup; - htup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(mvoid)); + htup = SearchSysCache2(STATEXTDATASTXOID, + ObjectIdGetDatum(mvoid), BoolGetDatum(inh)); if (!HeapTupleIsValid(htup)) elog(ERROR, "cache lookup failed 
for statistics object %u", mvoid); diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index b15f14e1a0..aab3bd7696 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -4008,7 +4008,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel, Assert(nmatches_vars + nmatches_exprs > 1); - stats = statext_ndistinct_load(statOid); + stats = statext_ndistinct_load(statOid, rte->inh); /* * If we have a match, search it for the specific item that matches (there diff --git a/src/backend/utils/cache/syscache.c b/src/backend/utils/cache/syscache.c index d6cb78dea8..eabd74952f 100644 --- a/src/backend/utils/cache/syscache.c +++ b/src/backend/utils/cache/syscache.c @@ -740,11 +740,11 @@ static const struct cachedesc cacheinfo[] = { 32 }, {StatisticExtDataRelationId, /* STATEXTDATASTXOID */ - StatisticExtDataStxoidIndexId, - 1, + StatisticExtDataStxoidInhIndexId, + 2, { Anum_pg_statistic_ext_data_stxoid, - 0, + Anum_pg_statistic_ext_data_stxdinherit, 0, 0 }, diff --git a/src/include/catalog/pg_statistic_ext_data.h b/src/include/catalog/pg_statistic_ext_data.h index 7b73b790d2..8ffd8b68cd 100644 --- a/src/include/catalog/pg_statistic_ext_data.h +++ b/src/include/catalog/pg_statistic_ext_data.h @@ -32,6 +32,7 @@ CATALOG(pg_statistic_ext_data,3429,StatisticExtDataRelationId) { Oid stxoid BKI_LOOKUP(pg_statistic_ext); /* statistics object * this data is for */ + bool stxdinherit; /* true if inheritance children are included */ #ifdef CATALOG_VARLEN /* variable-length fields start here */ @@ -53,6 +54,7 @@ typedef FormData_pg_statistic_ext_data * Form_pg_statistic_ext_data; DECLARE_TOAST(pg_statistic_ext_data, 3430, 3431); -DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_index, 3433, StatisticExtDataStxoidIndexId, on pg_statistic_ext_data using btree(stxoid oid_ops)); +DECLARE_UNIQUE_INDEX_PKEY(pg_statistic_ext_data_stxoid_inh_index, 3433, StatisticExtDataStxoidInhIndexId, on pg_statistic_ext_data 
using btree(stxoid oid_ops, stxdinherit bool_ops)); + #endif /* PG_STATISTIC_EXT_DATA_H */ diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index 2a53a6e344..884bda7232 100644 --- a/src/include/nodes/pathnodes.h +++ b/src/include/nodes/pathnodes.h @@ -934,6 +934,7 @@ typedef struct StatisticExtInfo NodeTag type; Oid statOid; /* OID of the statistics row */ + bool inherit; /* includes child relations */ RelOptInfo *rel; /* back-link to statistic's table */ char kind; /* statistics kind of this entry */ Bitmapset *keys; /* attnums of the columns covered */ diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h index 326cf26fea..02ee41b9f3 100644 --- a/src/include/statistics/statistics.h +++ b/src/include/statistics/statistics.h @@ -94,11 +94,11 @@ typedef struct MCVList MCVItem items[FLEXIBLE_ARRAY_MEMBER]; /* array of MCV items */ } MCVList; -extern MVNDistinct *statext_ndistinct_load(Oid mvoid); -extern MVDependencies *statext_dependencies_load(Oid mvoid); -extern MCVList *statext_mcv_load(Oid mvoid); +extern MVNDistinct *statext_ndistinct_load(Oid mvoid, bool inh); +extern MVDependencies *statext_dependencies_load(Oid mvoid, bool inh); +extern MCVList *statext_mcv_load(Oid mvoid, bool inh); -extern void BuildRelationExtStatistics(Relation onerel, double totalrows, +extern void BuildRelationExtStatistics(Relation onerel, bool inh, double totalrows, int numrows, HeapTuple *rows, int natts, VacAttrStats **vacattrstats); extern int ComputeExtStatisticsRows(Relation onerel, @@ -121,6 +121,7 @@ extern Selectivity statext_clauselist_selectivity(PlannerInfo *root, bool is_or); extern bool has_stats_of_kind(List *stats, char requiredkind); extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind, + bool inh, Bitmapset **clause_attnums, List **clause_exprs, int nclauses); diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 2fa00a3c29..8ab5187ccb 100644 
--- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -2425,6 +2425,7 @@ pg_stats_ext| SELECT cn.nspname AS schemaname, JOIN pg_attribute a ON (((a.attrelid = s.stxrelid) AND (a.attnum = k.k))))) AS attnames, pg_get_statisticsobjdef_expressions(s.oid) AS exprs, s.stxkind AS kinds, + sd.stxdinherit AS inherited, sd.stxdndistinct AS n_distinct, sd.stxddependencies AS dependencies, m.most_common_vals, -- 2.17.0
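The storage change in 3/5 can be pictured with a small stand-alone model: rows are keyed on (stxoid, stxdinherit), and a store deletes any existing row for that key before inserting a fresh one. This is only a sketch under stated assumptions. `ToyRow`, `toy_lookup`, `toy_seed` and `toy_store` are hypothetical names, not PostgreSQL APIs, and returning false stands in for the `elog(ERROR)` path in `statext_store()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one pg_statistic_ext_data row. */
typedef struct
{
	unsigned int stxoid;		/* owning statistics object */
	bool		stxdinherit;	/* true if children are included */
	int			payload;		/* stands in for the serialized stats */
	bool		used;			/* slot is occupied */
} ToyRow;

#define TOY_NROWS 8
static ToyRow toy_catalog[TOY_NROWS];

/* Two-key lookup, analogous in spirit to the SearchSysCache2() calls. */
ToyRow *
toy_lookup(unsigned int stxoid, bool inh)
{
	for (size_t i = 0; i < TOY_NROWS; i++)
	{
		ToyRow	   *row = &toy_catalog[i];

		if (row->used && row->stxoid == stxoid && row->stxdinherit == inh)
			return row;
	}
	return NULL;
}

/* Put a row into the first free slot; false if the toy table is full. */
static bool
toy_insert(unsigned int stxoid, bool inh, int payload)
{
	for (size_t i = 0; i < TOY_NROWS; i++)
	{
		if (!toy_catalog[i].used)
		{
			toy_catalog[i] = (ToyRow) {stxoid, inh, payload, true};
			return true;
		}
	}
	return false;
}

/* Seed the inh=false row that is assumed to exist before any store. */
bool
toy_seed(unsigned int stxoid)
{
	assert(stxoid != 0);		/* 0 plays the role of InvalidOid here */
	return toy_insert(stxoid, false, 0);
}

/*
 * Delete the old row if there is one, then insert a new one.  A missing
 * row is tolerated only for inh=true; for inh=false the patch raises
 * "cache lookup failed", which this sketch models by returning false.
 */
bool
toy_store(unsigned int stxoid, bool inh, int payload)
{
	ToyRow	   *old = toy_lookup(stxoid, inh);

	if (old != NULL)
		old->used = false;
	else if (!inh)
		return false;

	return toy_insert(stxoid, inh, payload);
}
```

With this model, a first store for inh=true succeeds even though no such row exists yet, while a store for inh=false on an unknown stxoid is rejected. The inherited and non-inherited rows are looked up and replaced independently of each other.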
>From b9b66a66b70566c6c783d62bda412bb76dcb8563 Mon Sep 17 00:00:00 2001 From: Justin Pryzby <[email protected]> Date: Sat, 25 Sep 2021 18:20:03 -0500 Subject: [PATCH v2 4/5] f! check inh statext_expressions_load examine_variable estimate_multivariate_ndistinct TODO: pg_stats_ext_exprs needs to expose inh flag --- src/backend/statistics/dependencies.c | 5 ----- src/backend/statistics/extended_stats.c | 9 ++------- src/backend/utils/adt/selfuncs.c | 27 ++++++++----------------- src/include/statistics/statistics.h | 2 +- src/test/regress/expected/stats_ext.out | 11 ++++++++-- src/test/regress/sql/stats_ext.sql | 4 +++- 6 files changed, 23 insertions(+), 35 deletions(-) diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c index 835f4bdf7a..02cf0efc66 100644 --- a/src/backend/statistics/dependencies.c +++ b/src/backend/statistics/dependencies.c @@ -1596,11 +1596,6 @@ dependencies_clauselist_selectivity(PlannerInfo *root, int nexprs; int k; MVDependencies *deps; - RangeTblEntry *rte = root->simple_rte_array[rel->relid]; - - /* If it's an inheritence tree, skip statistics (which do not include child stats) */ - if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) - break; /* skip statistics that are not of the correct type */ if (stat->kind != STATS_EXT_DEPENDENCIES) diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c index 8cfcd17ad6..d2ce4e13b1 100644 --- a/src/backend/statistics/extended_stats.c +++ b/src/backend/statistics/extended_stats.c @@ -1706,7 +1706,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli List **list_exprs; /* expressions matched to any statistic */ int listidx; Selectivity sel = (is_or) ? 0.0 : 1.0; - RangeTblEntry *rte = root->simple_rte_array[rel->relid]; /* check if there's any stats that might be useful for us. 
*/ if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV)) @@ -1759,10 +1758,6 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli Bitmapset *simple_clauses; RangeTblEntry *rte = root->simple_rte_array[rel->relid]; - /* If it's an inheritence tree, skip statistics (which do not include child stats) */ - if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) - break; - /* find the best suited statistics object for these attnums */ stat = choose_best_statistics(rel->statlist, STATS_EXT_MCV, rte->inh, list_attnums, list_exprs, @@ -2414,7 +2409,7 @@ serialize_expr_stats(AnlExprData *exprdata, int nexprs) * identified by the supplied index. */ HeapTuple -statext_expressions_load(Oid stxoid, int idx) +statext_expressions_load(Oid stxoid, bool inh, int idx) { bool isnull; Datum value; @@ -2424,7 +2419,7 @@ statext_expressions_load(Oid stxoid, int idx) HeapTupleData tmptup; HeapTuple tup; - htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(false)); + htup = SearchSysCache2(STATEXTDATASTXOID, ObjectIdGetDatum(stxoid), BoolGetDatum(inh)); if (!HeapTupleIsValid(htup)) elog(ERROR, "cache lookup failed for statistics object %u", stxoid); diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index aab3bd7696..19b4aaf7eb 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -3915,10 +3915,6 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel, StatisticExtInfo *matched_info = NULL; RangeTblEntry *rte = root->simple_rte_array[rel->relid]; - /* If it's an inheritence tree, skip statistics (which do not include child stats) */ - if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) - return false; - /* bail out immediately if the table has no extended statistics */ if (!rel->statlist) return false; @@ -5237,13 +5233,6 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, if (vardata->statsTuple) break; - /* If it's an inheritence tree, skip
statistics (which do not include child stats) */ - { - RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root); - if (rte->inh && rte->relkind != RELKIND_PARTITIONED_TABLE) - break; - } - /* skip stats without per-expression stats */ if (info->kind != STATS_EXT_EXPRESSIONS) continue; @@ -5262,22 +5251,22 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, /* found a match, see if we can extract pg_statistic row */ if (equal(node, expr)) { - HeapTuple t = statext_expressions_load(info->statOid, pos); - - /* Get statistics object's table for permission check */ - RangeTblEntry *rte; + RangeTblEntry *rte = planner_rt_fetch(onerel->relid, root); Oid userid; + bool inh; - vardata->statsTuple = t; + Assert(rte->rtekind == RTE_RELATION); /* * XXX Not sure if we should cache the tuple somewhere. * Now we just create a new copy every time. */ - vardata->freefunc = ReleaseDummy; + inh = root->append_rel_array == NULL ? false : + root->append_rel_array[onerel->relid]->parent_relid != 0; + vardata->statsTuple = + statext_expressions_load(info->statOid, inh, pos); - rte = planner_rt_fetch(onerel->relid, root); - Assert(rte->rtekind == RTE_RELATION); + vardata->freefunc = ReleaseDummy; /* * Use checkAsUser if it's set, in case we're accessing diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h index 02ee41b9f3..3868e43f8a 100644 --- a/src/include/statistics/statistics.h +++ b/src/include/statistics/statistics.h @@ -125,6 +125,6 @@ extern StatisticExtInfo *choose_best_statistics(List *stats, char requiredkind, Bitmapset **clause_attnums, List **clause_exprs, int nclauses); -extern HeapTuple statext_expressions_load(Oid stxoid, int idx); +extern HeapTuple statext_expressions_load(Oid stxoid, bool inh, int idx); #endif /* STATISTICS_H */ diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out index 67234b9fc2..35edc6a361 100644 --- a/src/test/regress/expected/stats_ext.out +++ 
b/src/test/regress/expected/stats_ext.out @@ -176,7 +176,6 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; ANALYZE ab1; DROP TABLE ab1 CASCADE; NOTICE: drop cascades to table ab1c --- Ensure non-inherited stats are not applied to inherited query CREATE TABLE stxdinh(i int, j int); CREATE TABLE stxdinh1() INHERITS(stxdinh); INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a; @@ -191,11 +190,19 @@ SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); CREATE STATISTICS stxdinh ON i,j FROM stxdinh; VACUUM ANALYZE stxdinh, stxdinh1; +-- Ensure non-inherited stats are not applied to inherited query -- Since the stats object does not include inherited stats, it should not affect the estimates SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); estimated | actual -----------+-------- - 1000 | 1008 + 1008 | 1008 +(1 row) + +-- Ensure correct (non-inherited) stats are applied to inherited query +SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2'); + estimated | actual +-----------+-------- + 9 | 9 (1 row) DROP TABLE stxdinh, stxdinh1; diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql index 2371043ca1..8490da9558 100644 --- a/src/test/regress/sql/stats_ext.sql +++ b/src/test/regress/sql/stats_ext.sql @@ -112,7 +112,6 @@ CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; ANALYZE ab1; DROP TABLE ab1 CASCADE; --- Ensure non-inherited stats are not applied to inherited query CREATE TABLE stxdinh(i int, j int); CREATE TABLE stxdinh1() INHERITS(stxdinh); INSERT INTO stxdinh SELECT a, a/10 FROM generate_series(1,9)a; @@ -122,8 +121,11 @@ VACUUM ANALYZE stxdinh, stxdinh1; SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); CREATE STATISTICS stxdinh ON i,j FROM stxdinh; VACUUM ANALYZE stxdinh, stxdinh1; +-- Ensure non-inherited stats are not applied to inherited query -- Since the stats object does not include inherited stats, it should not 
affect the estimates SELECT * FROM check_estimated_rows('SELECT * FROM stxdinh* GROUP BY 1,2'); +-- Ensure correct (non-inherited) stats are applied to inherited query +SELECT * FROM check_estimated_rows('SELECT * FROM ONLY stxdinh GROUP BY 1,2'); DROP TABLE stxdinh, stxdinh1; -- Ensure inherited stats ARE applied to inherited query in partitioned table -- 2.17.0
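The `inh` value that 4/5 computes in examine_variable() comes from the parent link in `root->append_rel_array`. Below is a minimal sketch of that derivation with explicit guards, assuming the array may hold NULL entries for relations that are not append-rel children (an assumption worth checking against the planner code). `ToyAppendRelInfo`, `ToyRoot` and `toy_rel_has_parent` are hypothetical stand-ins, not PostgreSQL types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for AppendRelInfo: only the parent link matters. */
typedef struct
{
	unsigned int parent_relid;	/* 0 means "no parent" */
} ToyAppendRelInfo;

/* Hypothetical stand-in for the relevant part of PlannerInfo. */
typedef struct
{
	ToyAppendRelInfo **append_rel_array;	/* NULL if no append rels at all */
	size_t		array_size;					/* number of slots in the array */
} ToyRoot;

/*
 * Report whether relid has a parent link.  Both the "no array" case and
 * the "no entry for this rel" case are treated as "no parent", so the
 * entry is never dereferenced when it is missing.
 */
bool
toy_rel_has_parent(const ToyRoot *root, unsigned int relid)
{
	const ToyAppendRelInfo *appinfo;

	if (root->append_rel_array == NULL)
		return false;

	assert(relid < root->array_size);

	appinfo = root->append_rel_array[relid];
	if (appinfo == NULL)
		return false;

	return appinfo->parent_relid != 0;
}
```

The guarded form returns the same answer as the one-line expression whenever an entry exists. It only differs in returning false, rather than dereferencing a NULL pointer, when the slot for the relation is empty.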
>From daf816806bdd3d927e9ed002a75d48277cad883b Mon Sep 17 00:00:00 2001 From: Justin Pryzby <[email protected]> Date: Sat, 25 Sep 2021 18:58:33 -0500 Subject: [PATCH v2 5/5] Refactor parent ACL check selfuncs.c is 8k lines long, and this makes it 30 LOC shorter. --- src/backend/utils/adt/selfuncs.c | 140 ++++++++++++------------------- 1 file changed, 52 insertions(+), 88 deletions(-) diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index 19b4aaf7eb..54324a71c0 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -187,6 +187,8 @@ static char *convert_string_datum(Datum value, Oid typid, Oid collid, bool *failure); static double convert_timevalue_to_scalar(Datum value, Oid typid, bool *failure); +static void recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, + Oid relid); static void examine_simple_variable(PlannerInfo *root, Var *var, VariableStatData *vardata); static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata, @@ -5152,51 +5154,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, (pg_class_aclcheck(rte->relid, userid, ACL_SELECT) == ACLCHECK_OK); - /* - * If the user doesn't have permissions to - * access an inheritance child relation, check - * the permissions of the table actually - * mentioned in the query, since most likely - * the user does have that permission. Note - * that whole-table select privilege on the - * parent doesn't quite guarantee that the - * user could read all columns of the child. - * But in practice it's unlikely that any - * interesting security violation could result - * from allowing access to the expression - * index's stats, so we allow it anyway. See - * similar code in examine_simple_variable() - * for additional comments. 
- */ - if (!vardata->acl_ok && - root->append_rel_array != NULL) - { - AppendRelInfo *appinfo; - Index varno = index->rel->relid; - - appinfo = root->append_rel_array[varno]; - while (appinfo && - planner_rt_fetch(appinfo->parent_relid, - root)->rtekind == RTE_RELATION) - { - varno = appinfo->parent_relid; - appinfo = root->append_rel_array[varno]; - } - if (varno != index->rel->relid) - { - /* Repeat access check on this rel */ - rte = planner_rt_fetch(varno, root); - Assert(rte->rtekind == RTE_RELATION); - - userid = rte->checkAsUser ? rte->checkAsUser : GetUserId(); - - vardata->acl_ok = - rte->securityQuals == NIL && - (pg_class_aclcheck(rte->relid, - userid, - ACL_SELECT) == ACLCHECK_OK); - } - } + recheck_parent_acl(root, vardata, index->rel->relid); } else { @@ -5287,49 +5245,7 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, (pg_class_aclcheck(rte->relid, userid, ACL_SELECT) == ACLCHECK_OK); - /* - * If the user doesn't have permissions to access an - * inheritance child relation, check the permissions of - * the table actually mentioned in the query, since most - * likely the user does have that permission. Note that - * whole-table select privilege on the parent doesn't - * quite guarantee that the user could read all columns of - * the child. But in practice it's unlikely that any - * interesting security violation could result from - * allowing access to the expression stats, so we allow it - * anyway. See similar code in examine_simple_variable() - * for additional comments. 
- */ - if (!vardata->acl_ok && - root->append_rel_array != NULL) - { - AppendRelInfo *appinfo; - Index varno = onerel->relid; - - appinfo = root->append_rel_array[varno]; - while (appinfo && - planner_rt_fetch(appinfo->parent_relid, - root)->rtekind == RTE_RELATION) - { - varno = appinfo->parent_relid; - appinfo = root->append_rel_array[varno]; - } - if (varno != onerel->relid) - { - /* Repeat access check on this rel */ - rte = planner_rt_fetch(varno, root); - Assert(rte->rtekind == RTE_RELATION); - - userid = rte->checkAsUser ? rte->checkAsUser : GetUserId(); - - vardata->acl_ok = - rte->securityQuals == NIL && - (pg_class_aclcheck(rte->relid, - userid, - ACL_SELECT) == ACLCHECK_OK); - } - } - + recheck_parent_acl(root, vardata, onerel->relid); break; } @@ -5339,6 +5255,54 @@ examine_variable(PlannerInfo *root, Node *node, int varRelid, } } +/* + * If the user doesn't have permissions to access an inheritance child + * relation, check the permissions of the table actually mentioned in the + * query, since most likely the user does have that permission. Note that + * whole-table select privilege on the parent doesn't quite guarantee that the + * user could read all columns of the child. But in practice it's unlikely + * that any interesting security violation could result from allowing access to + * the expression stats, so we allow it anyway. See similar code in + * examine_simple_variable() for additional comments. 
+ */ +static void +recheck_parent_acl(PlannerInfo *root, VariableStatData *vardata, Oid relid) +{ + RangeTblEntry *rte; + Oid userid; + + if (!vardata->acl_ok && + root->append_rel_array != NULL) + { + AppendRelInfo *appinfo; + Index varno = relid; + + appinfo = root->append_rel_array[varno]; + while (appinfo && + planner_rt_fetch(appinfo->parent_relid, + root)->rtekind == RTE_RELATION) + { + varno = appinfo->parent_relid; + appinfo = root->append_rel_array[varno]; + } + + if (varno != relid) + { + /* Repeat access check on this rel */ + rte = planner_rt_fetch(varno, root); + Assert(rte->rtekind == RTE_RELATION); + + userid = rte->checkAsUser ? rte->checkAsUser : GetUserId(); + + vardata->acl_ok = + rte->securityQuals == NIL && + (pg_class_aclcheck(rte->relid, + userid, + ACL_SELECT) == ACLCHECK_OK); + } + } +} + /* * examine_simple_variable * Handle a simple Var for examine_variable -- 2.17.0
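The control flow factored out into recheck_parent_acl() is: if the first check failed, walk up the parent links to the topmost ancestor, and repeat the check there only if that ancestor differs from the starting rel. A minimal stand-alone sketch of that flow follows. It assumes a plain `parent_of` array in place of `root->append_rel_array` and a `can_select` array in place of `pg_class_aclcheck()`, and it leaves out the `rtekind == RTE_RELATION` and `securityQuals` tests that the real function performs. All `toy_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Follow parent links from relid to the topmost ancestor.
 * parent_of[i] == 0 means rel i has no parent; index 0 is unused.
 */
unsigned int
toy_topmost_parent(const unsigned int *parent_of, size_t nrels,
				   unsigned int relid)
{
	unsigned int varno = relid;

	assert(relid < nrels);

	while (parent_of[varno] != 0)
	{
		assert(parent_of[varno] < nrels);
		varno = parent_of[varno];
	}
	return varno;
}

/*
 * Return the final acl_ok value.  A successful first check is kept as is;
 * a failed one is replaced by the ancestor's result, but only when an
 * ancestor actually exists.
 */
bool
toy_recheck_parent_acl(bool acl_ok,
					   const unsigned int *parent_of, size_t nrels,
					   unsigned int relid, const bool *can_select)
{
	unsigned int top;

	if (acl_ok)
		return true;			/* nothing to recheck */

	top = toy_topmost_parent(parent_of, nrels, relid);
	if (top == relid)
		return false;			/* no ancestor, so the failure stands */

	return can_select[top];		/* repeat the access check on the ancestor */
}
```

Note that the sketch takes a range-table index for the relation argument, which is what both callers in the patch pass (`index->rel->relid` and `onerel->relid`).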
