pg_statistic_export is a view that aggregates pg_statistic data by relation oid and stores all of the column statistical data in a system-independent jsonb format (i.e. no oids, collation information removed, all MCV values rendered as text), along with the relation's relname, reltuples, and relpages from pg_class, as well as the schemaname from pg_namespace.
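For example (a sketch; the table name 'mytable' is hypothetical), exporting the statistics for a single table is just a query against the view:

```sql
-- The "columns" jsonb maps each attribute name to an object keyed by
-- 'regular'/'inherited', each holding the stakindN/staopN/stanumbersN/
-- stavaluesN slots rendered as text.
SELECT schemaname, relname, server_version_num, n_tuples, n_pages, columns
FROM pg_catalog.pg_statistic_export
WHERE schemaname = 'public'
  AND relname = 'mytable';
```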
pg_import_rel_stats is a function that takes a relation oid, a server_version_num, num_tuples, num_pages, and a column_stats jsonb in a format matching that of pg_statistic_export, and applies that data to the pg_class and pg_statistic rows for the specified relation.

The most common use case for such a function is upgrades and dump/restore, wherein the upgrade process captures the output of pg_statistic_export into a regular table, performs the upgrade, and then joins that data to the existing pg_class rows, updating statistics to be a close approximation of what they were just prior to the upgrade. The hope is that these statistics are better than the early stages of --analyze-in-stages and can be applied faster, thus reducing system downtime.

The values applied to pg_class are written in-place, which is to say non-transactionally. The values applied to pg_statistic are applied transactionally, as if an ANALYZE operation were reading from a cheat sheet.

This function and view will need to be followed up with corresponding ones for pg_statistic_ext and pg_statistic_ext_data. While we would likely never backport the import functions, we can have user programs do the same work as the export views, so that statistics can be brought forward from versions as far back as jsonb exists to store them.

While the primary purpose of the import function(s) is to reduce downtime during an upgrade, it is not hard to see that they could also be used to facilitate tuning and development operations, asking questions like "how might this query plan change if this table had 1000x rows in it?" without actually putting those rows into the table.
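As a sketch of both use cases (the names stats_snapshot and mytable are hypothetical; the join pattern mirrors the regression tests in the patch):

```sql
-- Upgrade workflow: capture statistics before the upgrade...
CREATE TABLE stats_snapshot AS
SELECT * FROM pg_catalog.pg_statistic_export;

-- ...then, after the upgrade, re-apply them to the matching relations.
SELECT n.nspname, c.relname,
       pg_import_rel_stats(c.oid, e.server_version_num,
                           e.n_tuples, e.n_pages, e.columns)
FROM pg_class AS c
JOIN pg_namespace AS n ON n.oid = c.relnamespace
JOIN stats_snapshot AS e
  ON e.schemaname = n.nspname
 AND e.relname = c.relname;

-- Tuning workflow, "what if this table had 1000x rows?": inflate
-- reltuples/relpages only, leaving per-column stats alone by passing
-- NULL for column_stats.
SELECT pg_import_rel_stats('mytable'::regclass,
                           current_setting('server_version_num')::integer,
                           1000000, 50000, NULL::jsonb);
```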
From 834a415f594b3662716c9728a2ab46e425e80e82 Mon Sep 17 00:00:00 2001
From: coreyhuinker <corey.huin...@gmail.com>
Date: Thu, 31 Aug 2023 01:30:57 -0400
Subject: [PATCH v1] Introduce the system view pg_statistic_export and the
 function pg_import_rel_stats.

pg_statistic_export is a view that aggregates pg_statistic data by relation oid and stores all of the column statistical data in a system-independent (i.e. no oids) jsonb format, along with the relation's relname, reltuples, and relpages from pg_class, as well as the schemaname from pg_namespace.

pg_import_rel_stats is a function which takes a relation oid, server_version_num, num_tuples, num_pages, and a column_stats jsonb in a format matching that of pg_statistic_export, and applies that data to the pg_class and pg_statistic rows for the specified relation.

The most common use case for such a function is upgrades and dump/restore, wherein the upgrade process captures the output of pg_statistic_export into a regular table, performs the upgrade, and then joins that data to the existing pg_class rows, updating statistics to be a close approximation of what they were just prior to the upgrade. The hope is that these statistics are better than the early stages of --analyze-in-stages and can be applied faster, thus reducing system downtime.

The values applied to pg_class are written in-place, which is to say non-transactionally. The values applied to pg_statistic are applied transactionally, as if an ANALYZE operation were reading from a cheat sheet. The statistics applied are no more durable than any other, and will likely be overwritten by the next autovacuum analyze.
--- src/include/catalog/pg_proc.dat | 5 + src/include/commands/vacuum.h | 3 + src/backend/catalog/system_views.sql | 92 +++++ src/backend/commands/analyze.c | 581 +++++++++++++++++++++++++++ src/test/regress/expected/rules.out | 47 +++ src/test/regress/expected/vacuum.out | 95 +++++ src/test/regress/sql/vacuum.sql | 83 ++++ doc/src/sgml/func.sgml | 44 +- doc/src/sgml/system-views.sgml | 5 + 9 files changed, 954 insertions(+), 1 deletion(-) diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 9805bc6118..48c662c93c 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -5636,6 +5636,11 @@ proname => 'pg_stat_get_db_checksum_last_failure', provolatile => 's', proparallel => 'r', prorettype => 'timestamptz', proargtypes => 'oid', prosrc => 'pg_stat_get_db_checksum_last_failure' }, +{ oid => '3813', + descr => 'statistics: import to relation', + proname => 'pg_import_rel_stats', provolatile => 'v', proisstrict => 'f', + proparallel => 'u', prorettype => 'bool', proargtypes => 'oid int4 float4 int4 jsonb', + prosrc => 'pg_import_rel_stats' }, { oid => '3074', descr => 'statistics: last reset for a database', proname => 'pg_stat_get_db_stat_reset_time', provolatile => 's', proparallel => 'r', prorettype => 'timestamptz', proargtypes => 'oid', diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 4af02940c5..3ef05fa8a1 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -386,4 +386,7 @@ extern double anl_random_fract(void); extern double anl_init_selection_state(int n); extern double anl_get_next_S(double t, int n, double *stateptr); +extern Datum pg_import_rel_stats(PG_FUNCTION_ARGS); + + #endif /* VACUUM_H */ diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 77b06e2a7a..37383be786 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -253,6 +253,98 @@ CREATE VIEW 
pg_stats WITH (security_barrier) AS REVOKE ALL ON pg_statistic FROM public; +CREATE VIEW pg_statistic_export WITH (security_barrier) AS + SELECT + n.nspname AS schemaname, + r.relname AS relname, + current_setting('server_version_num')::integer AS server_version_num, + r.reltuples::float4 AS n_tuples, + r.relpages::integer AS n_pages, + ( + SELECT + jsonb_object_agg(a.attname, + ( + SELECT + jsonb_object_agg(attsta.inherit_type, + attsta.stat_obj) + FROM ( + SELECT + CASE s.stainherit + WHEN TRUE THEN 'inherited' + ELSE 'regular' + END AS inherit_type, + jsonb_build_object( + 'attstattarget', + CASE + WHEN a.attstattarget >= 0 THEN a.attstattarget + ELSE current_setting('default_statistics_target')::int4 + END, + 'stanullfrac', s.stanullfrac::text, + 'stawidth', s.stawidth::text, + 'stadistinct', s.stadistinct::text, + 'stakind1', s.stakind1::text, + 'stakind2', s.stakind2::text, + 'stakind3', s.stakind3::text, + 'stakind4', s.stakind4::text, + 'stakind5', s.stakind5::text, + 'staop1', + CASE s.stakind1 + WHEN 0 THEN '0' + WHEN 1 THEN '=' + ELSE '<' + END, + 'staop2', + CASE s.stakind2 + WHEN 0 THEN '0' + WHEN 1 THEN '=' + ELSE '<' + END, + 'staop3', + CASE s.stakind3 + WHEN 0 THEN '0' + WHEN 1 THEN '=' + ELSE '<' + END, + 'staop4', + CASE s.stakind4 + WHEN 0 THEN '0' + WHEN 1 THEN '=' + ELSE '<' + END, + 'staop5', + CASE s.stakind5 + WHEN 0 THEN '0' + WHEN 1 THEN '=' + ELSE '<' + END, + 'stanumbers1', s.stanumbers1::text[], + 'stanumbers2', s.stanumbers2::text[], + 'stanumbers3', s.stanumbers3::text[], + 'stanumbers4', s.stanumbers4::text[], + 'stanumbers5', s.stanumbers5::text[], + /* casting these to text makes re-casting easier */ + 'stavalues1', s.stavalues1::text::text[], + 'stavalues2', s.stavalues2::text::text[], + 'stavalues3', s.stavalues3::text::text[], + 'stavalues4', s.stavalues4::text::text[], + 'stavalues5', s.stavalues5::text::text[] + ) AS stat_obj + FROM pg_statistic AS s + WHERE s.starelid = a.attrelid + AND s.staattnum = a.attnum + ) AS attsta + 
) + ORDER BY a.attnum) + FROM pg_attribute AS a + WHERE a.attrelid = r.oid + AND NOT a.attisdropped + AND has_column_privilege(r.oid, a.attnum, 'SELECT') + AND a.attnum > 0 + ) AS columns + FROM pg_class AS r + JOIN pg_namespace AS n + ON n.oid = r.relnamespace; + CREATE VIEW pg_stats_ext WITH (security_barrier) AS SELECT cn.nspname AS schemaname, c.relname AS tablename, diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c index bfd981aa3f..426eb19990 100644 --- a/src/backend/commands/analyze.c +++ b/src/backend/commands/analyze.c @@ -18,6 +18,7 @@ #include "access/detoast.h" #include "access/genam.h" +#include "access/heapam.h" #include "access/multixact.h" #include "access/relation.h" #include "access/sysattr.h" @@ -40,6 +41,7 @@ #include "commands/vacuum.h" #include "common/pg_prng.h" #include "executor/executor.h" +#include "fmgr.h" #include "foreign/fdwapi.h" #include "miscadmin.h" #include "nodes/nodeFuncs.h" @@ -59,14 +61,17 @@ #include "utils/datum.h" #include "utils/fmgroids.h" #include "utils/guc.h" +#include "utils/jsonb.h" #include "utils/lsyscache.h" #include "utils/memutils.h" +#include "utils/numeric.h" #include "utils/pg_rusage.h" #include "utils/sampling.h" #include "utils/sortsupport.h" #include "utils/spccache.h" #include "utils/syscache.h" #include "utils/timestamp.h" +#include "utils/typcache.h" /* Per-index data for ANALYZE */ @@ -3056,3 +3061,579 @@ analyze_mcv_list(int *mcv_counts, } return num_mcv; } + +/* + * Get a JsonbValue from a JsonbContainer + */ +static +JsonbValue *key_lookup(JsonbContainer *cont, const char *key) +{ + return getKeyJsonValueFromContainer(cont, key, strlen(key), NULL); +} + +/* + * Get a JsonbValue from a JsonbContainer and ensure that it is a string + */ +static +JsonbValue *key_lookup_string(JsonbContainer *cont, const char *key) +{ + JsonbValue *j = key_lookup(cont,key); + + if (j == NULL) + return NULL; + + if (j->type != jbvString) + { + ereport(ERROR, + 
(errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid statistics format, %s must be a string",key))); + } + + return j; +} + +/* + * Get a JsonbContainer from a JsonbContainer and ensure that it is a object + */ +static +JsonbContainer *key_lookup_object(JsonbContainer *cont, const char *key) +{ + JsonbValue *j = key_lookup(cont,key); + + if (j == NULL) + return NULL; + + if ((j->type != jbvBinary) || (!JsonContainerIsObject(j->val.binary.data))) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid statistics format, %s must be a binary object", key))); + + return j->val.binary.data; +} + +static +bool jbvIsBinaryArray(JsonbValue *j) +{ + return (j->type == jbvBinary) && JsonContainerIsArray(j->val.binary.data); +} + +static +char *jbvstr_to_cstring(JsonbValue *j) +{ + char *s; + + Assert(j->type == jbvString); + /* make a string we are sure is null-terminated */ + s = palloc(j->val.string.len + 1); + memcpy(s, j->val.string.val, j->val.string.len); + s[j->val.string.len] = '\0'; + return s; +} + +static +Datum jbvstr_to_float4datum(JsonbValue *j) +{ + char * s = jbvstr_to_cstring(j); + Datum result = DirectFunctionCall1(float4in, CStringGetDatum(s)); + pfree(s); + return result; +} + +static +Datum jbvstr_to_int32datum(JsonbValue *j) +{ + char * s = jbvstr_to_cstring(j); + Datum result = DirectFunctionCall1(int4in, CStringGetDatum(s)); + pfree(s); + return result; +} + +static +Datum jbvstr_to_int16datum(JsonbValue *j) +{ + char * s = jbvstr_to_cstring(j); + Datum result = DirectFunctionCall1(int2in, CStringGetDatum(s)); + pfree(s); + return result; +} + +static +Datum jbvstr_to_attrtypedatum(JsonbValue *j, FmgrInfo *finfo, Oid input_func, Oid typioparam, int32 typmod) +{ + char * s = jbvstr_to_cstring(j); + Datum result = InputFunctionCall(finfo, s, typioparam, typmod); + pfree(s); + return result; +} + +static +void import_pg_statistic_rows(Oid relid, Relation sd, TupleDesc tupleDesc, + Form_pg_attribute attr, + FmgrInfo *finfo, 
Oid input_func, + Oid typioparams, Oid eq_opr, Oid lt_opr, + CatalogIndexState indstate, + JsonbContainer *cont, const char *name, bool inh) +{ + + HeapTuple stup, + tup; + int i, + j, + k; + Datum values[Natts_pg_statistic]; + bool nulls[Natts_pg_statistic]; + bool replaces[Natts_pg_statistic]; + bool newrow = true; + JsonbValue *jv; + + /* Is there already a pg_statistic tuple for this attribute? */ + tup = SearchSysCache3(STATRELATTINH, + ObjectIdGetDatum(relid), + Int16GetDatum(attr->attnum), + BoolGetDatum(inh)); + + if (HeapTupleIsValid(tup)) + { + /* use the tuple that already exists */ + newrow = false; + heap_deform_tuple(tup, tupleDesc, values, nulls); + for (i = 0; i < Natts_pg_statistic; ++i) + replaces[i] = false; + } + else + { + /* new row */ + for (i = 0; i < Natts_pg_statistic; ++i) + { + nulls[i] = false; + replaces[i] = true; + } + + values[Anum_pg_statistic_starelid - 1] = relid; + values[Anum_pg_statistic_staattnum - 1] = attr->attnum; + values[Anum_pg_statistic_stainherit - 1] = BoolGetDatum(inh); + } + + i = Anum_pg_statistic_stanullfrac - 1; + jv = key_lookup_string(cont, "stanullfrac"); + if (jv != NULL) + { + values[i] = jbvstr_to_float4datum(jv); + replaces[i] = true; + } + else if (newrow) + values[i] = Float4GetDatum(0.0); + + i = Anum_pg_statistic_stawidth - 1; + jv = key_lookup_string(cont, "stawidth"); + if (jv != NULL) + { + values[i] = jbvstr_to_int32datum(jv); + replaces[i] = true; + } + else if (newrow) + values[i] = Int32GetDatum(0); + + i = Anum_pg_statistic_stadistinct - 1; + jv = key_lookup_string(cont, "stadistinct"); + if (jv != NULL) + { + values[i] = jbvstr_to_float4datum(jv); + replaces[i] = true; + } + else if (newrow) + values[i] = Float4GetDatum(0.0); + + i = Anum_pg_statistic_stakind1 - 1; + for (k = 1; k <= STATISTIC_NUM_SLOTS; k++) + { + char key[20]; + sprintf(key, "stakind%d", k); + jv = key_lookup_string(cont, key); + if (jv != NULL) + { + values[i] = jbvstr_to_int16datum(jv); + replaces[i] = true; + } + else if 
(newrow) + values[i] = Int16GetDatum(0); + + i++; + } + + + /* + * set staopN rows. Use staopN as a proxy for when to set stacollN + * + * collation cannot be changed in stats, only attempt to set if this is a + * new row, and set it to the attcollation - it is possible that if this + * column is an expression on an index, then the collation could be + * different but this will be reset anyway on the next autoanalyze. + */ + i = Anum_pg_statistic_staop1 - 1; + j = Anum_pg_statistic_stacoll1 - 1; + for (k = 1; k <= STATISTIC_NUM_SLOTS; k++) + { + char key[20]; + Oid res_opr; + sprintf(key, "staop%d", k); + jv = key_lookup(cont, key); + if (jv != NULL) + { + if (jv->type != jbvString) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid format: %s must be one of: '=', '<', ''", key))); + + if (jv->val.string.len != 1) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid format: %s must be one of: '=', '<', ''", key))); + + if (jv->val.string.val[0] == '0') + res_opr = 0; + else if (jv->val.string.val[0] == '=') + res_opr = eq_opr; + else if (jv->val.string.val[0] == '<') + res_opr = lt_opr; + else + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid format: %s must be one of: '=', '<', ''", key))); + + values[i] = ObjectIdGetDatum(res_opr); + replaces[i] = true; + /* staopN was set, so set stacollN if this row is new */ + if (newrow) + { + if (res_opr != 0) + values[j] = ObjectIdGetDatum(attr->attcollation); + else + values[j] = ObjectIdGetDatum(0); + } + } + else if (newrow) + { + values[i] = ObjectIdGetDatum(0); + values[j] = ObjectIdGetDatum(0); + } + + i++; + j++; + } + + for (k = 1; k <= STATISTIC_NUM_SLOTS; k++) + { + char key[20]; + int num_elements; + Datum *numdatums; + ArrayType *arry; + int n; + + sprintf(key, "stanumbers%d", k); + + /* compute offset to allow for continue bailouts */ + i = Anum_pg_statistic_stanumbers1 - 2 + k; + + jv = key_lookup(cont, key); + + if (jv == 
NULL) + { + /* no key set, do not modify existing row value */ + if (newrow) + { + nulls[i] = true; + values[i] = (Datum) 0; + } + continue; + } + + /* can be null or a binary array */ + if (jv->type == jbvNull) + { + /* explicitly set valuesN null */ + nulls[i] = true; + values[i] = (Datum) 0; + replaces[i] = true; + continue; + } + + if (!jbvIsBinaryArray(jv)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid statistics format, %s must be a array or null", key))); + + num_elements = JsonContainerSize(jv->val.binary.data); + + if (num_elements == 0) + { + /* empty array is just null */ + nulls[i] = true; + values[i] = (Datum) 0; + replaces[i] = true; + continue; + } + + numdatums = (Datum *) palloc(num_elements * sizeof(Datum)); + for (n = 0; n < num_elements; n++) + { + JsonbValue *elem; + + elem = getIthJsonbValueFromContainer(jv->val.binary.data, n); + if (elem->type != jbvString) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid format: element %d of %s must be a string", n, key))); + + numdatums[n] = jbvstr_to_float4datum(elem); + } + arry = construct_array_builtin(numdatums, num_elements, FLOAT4OID); + values[i] = PointerGetDatum(arry); /* stanumbersN */ + replaces[i] = true; + } + + i = Anum_pg_statistic_stavalues1 - 1; + for (k = 1; k <= STATISTIC_NUM_SLOTS; k++) + { + char key[20]; + int num_elements; + Datum *numdatums; + ArrayType *arry; + int n; + + sprintf(key, "stavalues%d", k); + + /* compute offset to allow for continue bailouts */ + i = Anum_pg_statistic_stavalues1 - 2 + k; + + jv = key_lookup(cont, key); + + if (jv == NULL) + { + /* no key set, do not modify existing row value */ + if (newrow) + { + nulls[i] = true; + values[i] = (Datum) 0; + } + continue; + } + + /* can be null or a binary array */ + if (jv->type == jbvNull) + { + /* explicitly set valuesN null */ + nulls[i] = true; + values[i] = (Datum) 0; + replaces[i] = true; + continue; + } + + if (!jbvIsBinaryArray(jv)) + 
ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid statistics format, %s must be a array or null", key))); + + num_elements = JsonContainerSize(jv->val.binary.data); + + if (num_elements == 0) + { + /* empty array is just null */ + nulls[i] = true; + values[i] = (Datum) 0; + replaces[i] = true; + continue; + } + + numdatums = (Datum *) palloc(num_elements * sizeof(Datum)); + + for (n = 0; n < num_elements; n++) + { + /* All elements must be of type string that is iocoerce-friendly */ + JsonbValue *elem = getIthJsonbValueFromContainer(jv->val.binary.data, n); + if (elem->type != jbvString) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid format: element %d of %s must be a string", + n, key))); + numdatums[n] = jbvstr_to_attrtypedatum(elem, finfo, + input_func, + typioparams, + attr->atttypmod); + } + + arry = construct_array(numdatums, num_elements, attr->atttypid, + attr->attlen, attr->attbyval, attr->attalign); + values[i] = PointerGetDatum(arry); /* stavaluesN */ + replaces[i] = true; + } + + if (newrow) + { + /* insert new tuple */ + stup = heap_form_tuple(RelationGetDescr(sd), values, nulls); + CatalogTupleInsertWithInfo(sd, stup, indstate); + } + else + { + /* modify existing tuple */ + stup = heap_modify_tuple(tup, RelationGetDescr(sd), + values, nulls, replaces); + CatalogTupleUpdateWithInfo(sd, &stup->t_self, stup, indstate); + ReleaseSysCache(tup); + } + + heap_freetuple(stup); +} + + +/* + * Import statistics from JSONB export into relation + * to-do: pg_import_ext_stats() + */ +Datum +pg_import_rel_stats(PG_FUNCTION_ARGS) +{ + Oid relid = PG_GETARG_OID(0); + int32 stats_version_num = PG_GETARG_INT32(1); + Jsonb *jb = PG_ARGISNULL(4) ? 
NULL : PG_GETARG_JSONB_P(4); + Relation onerel; + + if (jb != NULL && !JB_ROOT_IS_OBJECT(jb)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid statistics format"))); + + if ( stats_version_num < 80000) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid statistics version: %d is earlier than earliest supported version", + stats_version_num))); + + /* to-do: change this to found current server version */ + if ( stats_version_num > 170000) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid statistics version: %d is greater than current version", + stats_version_num))); + + onerel = vacuum_open_relation(relid, NULL, VACOPT_ANALYZE, true, + ShareUpdateExclusiveLock); + + if (onerel == NULL) + PG_RETURN_BOOL(false); + + if (!vacuum_is_relation_owner(RelationGetRelid(onerel), + onerel->rd_rel, + VACOPT_ANALYZE)) + { + relation_close(onerel, ShareUpdateExclusiveLock); + PG_RETURN_BOOL(false); + } + + + /* only modify pg_class row if changes are to be made */ + if ( ! PG_ARGISNULL(2) || ! PG_ARGISNULL(3) ) + { + Relation pg_class_rel; + HeapTuple ctup; + Form_pg_class pgcform; + + /* + * Open the relation, getting ShareUpdateExclusiveLock to ensure that no + * other stat-setting operation can run on it concurrently. 
+ */ + pg_class_rel = table_open(RelationRelationId, ShareUpdateExclusiveLock); + + /* leave if relation could not be opened or locked */ + if (!pg_class_rel) + PG_RETURN_BOOL(false); + + /* to-do: allow import IF FDW allows analyze */ + if (pg_class_rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("cannot import stats to foreign table"))); + + + ctup = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relid)); + if (!HeapTupleIsValid(ctup)) + elog(ERROR, "pg_class entry for relid %u vanished during statistics import", + relid); + pgcform = (Form_pg_class) GETSTRUCT(ctup); + + /* leave un-set values alone */ + if (! PG_ARGISNULL(2)) + pgcform->reltuples = PG_GETARG_FLOAT4(2); + if (! PG_ARGISNULL(3)) + pgcform->relpages = PG_GETARG_INT32(3); + + heap_inplace_update(pg_class_rel, ctup); + table_close(pg_class_rel, ShareUpdateExclusiveLock); + } + + /* Apply statistical updates, if any, to copied tuple */ + if (! PG_ARGISNULL(4)) + { + TupleDesc tupdesc; + Relation sd; + TupleDesc stupdesc; + CatalogIndexState indstate; + int i; + + tupdesc = RelationGetDescr(onerel); + sd = table_open(StatisticRelationId, RowExclusiveLock); + stupdesc = RelationGetDescr(sd); + indstate = CatalogOpenIndexes(sd); + + for (i = 0; i < tupdesc->natts; i++) + { + Form_pg_attribute attr; + char *attname; + JsonbContainer *attrcont; + JsonbContainer *inheritcont; + + Oid in_func; + Oid typioparams; + FmgrInfo finfo; + TypeCacheEntry *typentry; + + attr = TupleDescAttr(tupdesc, i); + typentry = lookup_type_cache(attr->atttypid, TYPECACHE_LT_OPR | TYPECACHE_EQ_OPR); + + /* get input function for stavaluesN InputFunctionCall */ + getTypeInputInfo(attr->atttypid, &in_func, &typioparams); + fmgr_info(in_func, &finfo); + + /* Look for column key matching attname */ + attname = NameStr(attr->attname); + + attrcont = key_lookup_object(&jb->root, attname); + + if (attrcont == NULL) + continue; + + inheritcont = 
key_lookup_object(attrcont, "regular"); + if (inheritcont != NULL) + import_pg_statistic_rows(relid, sd, stupdesc, attr, &finfo, + in_func, typioparams, + typentry->eq_opr, typentry->lt_opr, + indstate, + inheritcont, "regular", false); + + inheritcont = key_lookup_object(attrcont, "inherited"); + if (inheritcont != NULL) + import_pg_statistic_rows(relid, sd, stupdesc, attr, &finfo, + in_func, typioparams, + typentry->eq_opr, typentry->lt_opr, + indstate, + inheritcont, "inherited", true); + } + + CatalogCloseIndexes(indstate); + table_close(sd, RowExclusiveLock); + relation_close(onerel, NoLock); + } + + PG_RETURN_BOOL(true); +} diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 5058be5411..66a56dee65 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -2404,6 +2404,53 @@ pg_statio_user_tables| SELECT relid, tidx_blks_hit FROM pg_statio_all_tables WHERE ((schemaname <> ALL (ARRAY['pg_catalog'::name, 'information_schema'::name])) AND (schemaname !~ '^pg_toast'::text)); +pg_statistic_export| SELECT n.nspname AS schemaname, + r.relname, + (current_setting('server_version_num'::text))::integer AS server_version_num, + r.reltuples AS n_tuples, + r.relpages AS n_pages, + ( SELECT jsonb_object_agg(a.attname, ( SELECT jsonb_object_agg(attsta.inherit_type, attsta.stat_obj) AS jsonb_object_agg + FROM ( SELECT + CASE s.stainherit + WHEN true THEN 'inherited'::text + ELSE 'regular'::text + END AS inherit_type, + jsonb_build_object('attstattarget', + CASE + WHEN (a.attstattarget >= 0) THEN (a.attstattarget)::integer + ELSE (current_setting('default_statistics_target'::text))::integer + END, 'stanullfrac', (s.stanullfrac)::text, 'stawidth', (s.stawidth)::text, 'stadistinct', (s.stadistinct)::text, 'stakind1', (s.stakind1)::text, 'stakind2', (s.stakind2)::text, 'stakind3', (s.stakind3)::text, 'stakind4', (s.stakind4)::text, 'stakind5', (s.stakind5)::text, 'staop1', + CASE s.stakind1 + WHEN 0 THEN 
'0'::text + WHEN 1 THEN '='::text + ELSE '<'::text + END, 'staop2', + CASE s.stakind2 + WHEN 0 THEN '0'::text + WHEN 1 THEN '='::text + ELSE '<'::text + END, 'staop3', + CASE s.stakind3 + WHEN 0 THEN '0'::text + WHEN 1 THEN '='::text + ELSE '<'::text + END, 'staop4', + CASE s.stakind4 + WHEN 0 THEN '0'::text + WHEN 1 THEN '='::text + ELSE '<'::text + END, 'staop5', + CASE s.stakind5 + WHEN 0 THEN '0'::text + WHEN 1 THEN '='::text + ELSE '<'::text + END, 'stanumbers1', (s.stanumbers1)::text[], 'stanumbers2', (s.stanumbers2)::text[], 'stanumbers3', (s.stanumbers3)::text[], 'stanumbers4', (s.stanumbers4)::text[], 'stanumbers5', (s.stanumbers5)::text[], 'stavalues1', ((s.stavalues1)::text)::text[], 'stavalues2', ((s.stavalues2)::text)::text[], 'stavalues3', ((s.stavalues3)::text)::text[], 'stavalues4', ((s.stavalues4)::text)::text[], 'stavalues5', ((s.stavalues5)::text)::text[]) AS stat_obj + FROM pg_statistic s + WHERE ((s.starelid = a.attrelid) AND (s.staattnum = a.attnum))) attsta) ORDER BY a.attnum) AS jsonb_object_agg + FROM pg_attribute a + WHERE ((a.attrelid = r.oid) AND (NOT a.attisdropped) AND has_column_privilege(r.oid, a.attnum, 'SELECT'::text) AND (a.attnum > 0))) AS columns + FROM (pg_class r + JOIN pg_namespace n ON ((n.oid = r.relnamespace))); pg_stats| SELECT n.nspname AS schemaname, c.relname AS tablename, a.attname, diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out index 4def90b805..cf7c4029a5 100644 --- a/src/test/regress/expected/vacuum.out +++ b/src/test/regress/expected/vacuum.out @@ -508,3 +508,98 @@ RESET ROLE; DROP TABLE vacowned; DROP TABLE vacowned_parted; DROP ROLE regress_vacuum; +CREATE TYPE stats_import_complex_type AS ( + a integer, + b float, + c text, + d date, + e jsonb); +CREATE TABLE stats_import_test( + id INTEGER PRIMARY KEY, + name text, + comp stats_import_complex_type +); +INSERT INTO stats_import_test +SELECT 1, 'one', (1, 1.1, 'ONE', '2001-01-01', '{ "xkey": "xval" 
}')::stats_import_complex_type +UNION ALL +SELECT 2, 'two', (2, 2.2, 'TWO', '2002-02-02', '[true, 4, "six"]')::stats_import_complex_type +UNION ALL +SELECT 3, 'tre', (3, 3.3, 'TRE', '2003-03-03', NULL)::stats_import_complex_type +UNION ALL +SELECT 4, 'four', NULL; +ANALYZE stats_import_test; +CREATE TABLE stats_export AS +SELECT e.* +FROM pg_catalog.pg_statistic_export AS e +WHERE e.schemaname = 'public' +AND e.relname = 'stats_import_test'; +SELECT c.reltuples, c.relpages +FROM pg_class AS c +WHERE oid = 'stats_import_test'::regclass; + reltuples | relpages +-----------+---------- + 4 | 1 +(1 row) + +-- test settting tuples and pages but no columns +SELECT pg_import_rel_stats(c.oid, current_setting('server_version_num')::integer, + 1000.0, 200, NULL::jsonb) +FROM pg_class AS c +WHERE oid = 'stats_import_test'::regclass; +WARNING: relcache reference leak: relation "stats_import_test" not closed + pg_import_rel_stats +--------------------- + t +(1 row) + +SELECT c.reltuples, c.relpages +FROM pg_class AS c +WHERE oid = 'stats_import_test'::regclass; + reltuples | relpages +-----------+---------- + 1000 | 200 +(1 row) + +-- create a table just like stats_import_test +CREATE TABLE stats_import_clone ( LIKE stats_import_test ); +-- copy table stats to clone table +SELECT pg_import_rel_stats(c.oid, e.server_version_num, + e.n_tuples, e.n_pages, e.columns) +FROM pg_class AS c +JOIN pg_namespace AS n +ON n.oid = c.relnamespace +JOIN stats_export AS e +ON e.schemaname = 'public' +AND e.relname = 'stats_import_test' +WHERE c.oid = 'stats_import_clone'::regclass; + pg_import_rel_stats +--------------------- + t +(1 row) + +-- stats should match +SELECT staattnum, stainherit, stanullfrac, stawidth, stadistinct, + stakind1, stakind2, stakind3, stakind4, stakind5, + staop1, staop2, staop3, staop4, staop5, stacoll1, stacoll2, stacoll3, stacoll4, stacoll5, + stanumbers1, stanumbers2, stanumbers3, stanumbers4, stanumbers5, + stavalues1::text AS sv1, stavalues2::text AS sv2, 
stavalues3::text AS sv3, + stavalues4::text AS sv4, stavalues5::text AS sv5 +FROM pg_statistic AS s +WHERE s.starelid = 'stats_import_test'::regclass +EXCEPT +SELECT staattnum, stainherit, stanullfrac, stawidth, stadistinct, + stakind1, stakind2, stakind3, stakind4, stakind5, + staop1, staop2, staop3, staop4, staop5, stacoll1, stacoll2, stacoll3, stacoll4, stacoll5, + stanumbers1, stanumbers2, stanumbers3, stanumbers4, stanumbers5, + stavalues1::text AS sv1, stavalues2::text AS sv2, stavalues3::text AS sv3, + stavalues4::text AS sv4, stavalues5::text AS sv5 +FROM pg_statistic AS s +WHERE s.starelid = 'stats_import_clone'::regclass; + staattnum | stainherit | stanullfrac | stawidth | stadistinct | stakind1 | stakind2 | stakind3 | stakind4 | stakind5 | staop1 | staop2 | staop3 | staop4 | staop5 | stacoll1 | stacoll2 | stacoll3 | stacoll4 | stacoll5 | stanumbers1 | stanumbers2 | stanumbers3 | stanumbers4 | stanumbers5 | sv1 | sv2 | sv3 | sv4 | sv5 +-----------+------------+-------------+----------+-------------+----------+----------+----------+----------+----------+--------+--------+--------+--------+--------+----------+----------+----------+----------+----------+-------------+-------------+-------------+-------------+-------------+-----+-----+-----+-----+----- +(0 rows) + +DROP TABLE stats_export; +DROP TABLE stats_import_clone; +DROP TABLE stats_import_test; +DROP TYPE stats_import_complex_type; diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql index 51d7b1fecc..602e84dfa2 100644 --- a/src/test/regress/sql/vacuum.sql +++ b/src/test/regress/sql/vacuum.sql @@ -377,3 +377,86 @@ RESET ROLE; DROP TABLE vacowned; DROP TABLE vacowned_parted; DROP ROLE regress_vacuum; + + +CREATE TYPE stats_import_complex_type AS ( + a integer, + b float, + c text, + d date, + e jsonb); + +CREATE TABLE stats_import_test( + id INTEGER PRIMARY KEY, + name text, + comp stats_import_complex_type +); + +INSERT INTO stats_import_test +SELECT 1, 'one', (1, 1.1, 'ONE', 
'2001-01-01', '{ "xkey": "xval" }')::stats_import_complex_type +UNION ALL +SELECT 2, 'two', (2, 2.2, 'TWO', '2002-02-02', '[true, 4, "six"]')::stats_import_complex_type +UNION ALL +SELECT 3, 'tre', (3, 3.3, 'TRE', '2003-03-03', NULL)::stats_import_complex_type +UNION ALL +SELECT 4, 'four', NULL; + +ANALYZE stats_import_test; + +CREATE TABLE stats_export AS +SELECT e.* +FROM pg_catalog.pg_statistic_export AS e +WHERE e.schemaname = 'public' +AND e.relname = 'stats_import_test'; + +SELECT c.reltuples, c.relpages +FROM pg_class AS c +WHERE oid = 'stats_import_test'::regclass; + +-- test settting tuples and pages but no columns +SELECT pg_import_rel_stats(c.oid, current_setting('server_version_num')::integer, + 1000.0, 200, NULL::jsonb) +FROM pg_class AS c +WHERE oid = 'stats_import_test'::regclass; + +SELECT c.reltuples, c.relpages +FROM pg_class AS c +WHERE oid = 'stats_import_test'::regclass; + +-- create a table just like stats_import_test +CREATE TABLE stats_import_clone ( LIKE stats_import_test ); + +-- copy table stats to clone table +SELECT pg_import_rel_stats(c.oid, e.server_version_num, + e.n_tuples, e.n_pages, e.columns) +FROM pg_class AS c +JOIN pg_namespace AS n +ON n.oid = c.relnamespace +JOIN stats_export AS e +ON e.schemaname = 'public' +AND e.relname = 'stats_import_test' +WHERE c.oid = 'stats_import_clone'::regclass; + +-- stats should match +SELECT staattnum, stainherit, stanullfrac, stawidth, stadistinct, + stakind1, stakind2, stakind3, stakind4, stakind5, + staop1, staop2, staop3, staop4, staop5, stacoll1, stacoll2, stacoll3, stacoll4, stacoll5, + stanumbers1, stanumbers2, stanumbers3, stanumbers4, stanumbers5, + stavalues1::text AS sv1, stavalues2::text AS sv2, stavalues3::text AS sv3, + stavalues4::text AS sv4, stavalues5::text AS sv5 +FROM pg_statistic AS s +WHERE s.starelid = 'stats_import_test'::regclass +EXCEPT +SELECT staattnum, stainherit, stanullfrac, stawidth, stadistinct, + stakind1, stakind2, stakind3, stakind4, stakind5, + staop1, 
staop2, staop3, staop4, staop5, stacoll1, stacoll2, stacoll3, stacoll4, stacoll5, + stanumbers1, stanumbers2, stanumbers3, stanumbers4, stanumbers5, + stavalues1::text AS sv1, stavalues2::text AS sv2, stavalues3::text AS sv3, + stavalues4::text AS sv4, stavalues5::text AS sv5 +FROM pg_statistic AS s +WHERE s.starelid = 'stats_import_clone'::regclass; + +DROP TABLE stats_export; +DROP TABLE stats_import_clone; +DROP TABLE stats_import_test; +DROP TYPE stats_import_complex_type; diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 7a0d4b9134..fbb0257593 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -1,4 +1,4 @@ -<!-- doc/src/sgml/func.sgml --> + <chapter id="functions"> <title>Functions and Operators</title> @@ -27904,6 +27904,48 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset in identifying the specific disk files associated with database objects. </para> + <table id="functions-admin-statsimport"> + <title>Database Object Statistics Import Functions</title> + <tgroup cols="1"> + <thead> + <row> + <entry role="func_table_entry"><para role="func_signature"> + Function + </para> + <para> + Description + </para></entry> + </row> + </thead> + + <tbody> + <row> + <entry role="func_table_entry"><para role="func_signature"> + <indexterm> + <primary>pg_import_rel_stats</primary> + </indexterm> + <function>pg_import_rel_stats</function> ( <parameter>relation</parameter> <type>regclass</type>, <parameter>server_version_num</parameter> <type>integer</type>, <parameter>num_tuples</parameter> <type>float4</type>, <parameter>num_pages</parameter> <type>integer</type>, <parameter>column_stats</parameter> <type>jsonb</type> ) + <returnvalue>boolean</returnvalue> + </para> + <para> + Modifies the <structname>pg_class</structname> row with the + <structfield>oid</structfield> matching <parameter>relation</parameter> + to set the <structfield>reltuples</structfield> and + <structfield>relpages</structfield> fields. 
This is done nontransactionally. + The <structname>pg_statistic</structname> rows for the + <structfield>statrelid</structfield> matching <parameter>relation</parameter> + are replaced with the values found in <parameter>column_stats</parameter>, + and this is done transactionally. The purpose of this function is to apply + statistics values in an upgrade situation that are "good enough" for system + operation until they are replaced by the next auto-analyze. This function + is used by <program>pg_upgrade</program> and <program>pg_restore</program> + to convey the statistics from the old system version into the new one. + </para></entry> + </row> + </tbody> + </tgroup> + </table> + <table id="functions-admin-dblocation"> <title>Database Object Location Functions</title> <tgroup cols="1"> diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml index 2b35c2f91b..17430e581b 100644 --- a/doc/src/sgml/system-views.sgml +++ b/doc/src/sgml/system-views.sgml @@ -191,6 +191,11 @@ <entry>extended planner statistics for expressions</entry> </row> + <row> + <entry><link linkend="view-pg-stats"><structname>pg_stats_export</structname></link></entry> + <entry>planner statistics for export/upgrade purposes</entry> + </row> + <row> <entry><link linkend="view-pg-tables"><structname>pg_tables</structname></link></entry> <entry>tables</entry> -- 2.41.0