This is a new thread that continues the work [1] of the Extended Statistics set/restore/clear functions thread [2] which itself was a continuation of the work [3] of the Statistics Export and Import thread [4], all of which is too much for anyone to review, so I'll give a recap:
A longstanding complaint about postgres has been the amount of time required to rebuild stats after a major version upgrade, during which the database is online but under heavy I/O and queries of any real complexity will get _terrible_ query plans because they lack the optimizer statistics which the previous major version had just moments ago, but weren't carried over to the new version. Version 18 introduced the ability to import stats at the relation level and attribute level [3], and these stats were now dumped and restored by default via pg_upgrade. This meant that most customers could reduce the time in which the database was online but in a state of degraded performance. It was, however, not a complete solution, because it still lacked statistics for extended objects (i.e. CREATE STATISTICS xyz...) and custom statistic kinds like those found in extensions like PostGIS. Still, this made things better for 99% of installations, and while it was not trivial to determine if a given instance was in that 1%, enhancements were made to vacuumdb [5] to detect what tables were missing statistics and analyze just those tables, thus reducing the scope of the I/O-intensive rebuild period for those in the unlucky 1%. Work in this 19-dev cycle has sought to close that 1% gap by importing statistics for extended statistics objects. These stats are quite a bit more complicated than their relation and attribute equivalents, but good progress has been made [1], resulting in the carryover of many statistics types: ndistinct, dependencies, and MCV. All of them, except for the statistics associated with expressions in the definition of the statistics object (i.e. CREATE STATISTICS xyz on upper(name), ...). This last type of statistics has proved to be a tough nut to crack for reasons I will describe in detail. We could stop here, but if we did we would actually create work for vacuumdb, specifically the code that processes the --missing-stats-only option, which currently looks for matching extended statistics data (pg_statistic_ext_data) rows that match the object definition (pg_statistic_ext) rows, and considers any match to be sufficient for "not missing stats". That assumption would no longer hold in the case of stats objects that have expressions, because they'd be missing their stxdexprs stats. While we can teach vacuumdb that difference, we could instead solve the expressions problem, and close the statistics gap even further [6]. We have a working but not thoroughly tested implementation (attached). There remains one sticky problem: the serialization format of the statistics stored in pg_statistic_ext_data.stxdexprs. The type of the attribute is pg_catalog.pg_statistic[], which is to say that it's an array of records, the length of the array is equal to the number of expressions in the extended statistics object. pg_statistic is where attribute statistics are imported, so the structure has the complexity of attribute stats itself, slightly reduced by the fact that the fields describing the attribute are left as InvalidOid values, but still quite complicated. Several of the attributes in pg_statistic are of type ANYARRAY, because they are most-common-values/histogram/most-common-elements arrays, each of which has a composite datatype determined by the datatype of the expression(s) and other columns in the object definition. This presents a problem for utilities like deconstruct_array(), in that the datatype to be deconstructed varies by column and by the datatype of the expression definition, and that datatype could itself be an array which deconstruct_array would then try to deconstruct...there is no way to get deconstruct_array() to stop 2 levels deep. This problem was solved for pg_restore_attribute_stats by having pg_dump export the ANYARRAY values CAST-ed to type "text" rather than "text[]", which allowed each type of statistics to be decomposed according to it's own rules, and that worked fine when each statistic type became a parameter in pg_restore_attribute_stats(). But now we've got all of those types, and we're getting them multiple times, so that method doesn't quite scale. I've considered several ways around this issue: 1. Defining a strict order for the statistics types, following the order they appear in pg_stats_ext (null_frac, avg_width, n_distinct, most_common_elems, ...) and then exprs from pg_stats_ext_exprs in last place. Each value is CAST-ed to "text", which means that we can deconstruct them in a fashion very similar to how we did for attribute stats. Those text values are put into an array in the strict order, and those arrays are aggregated into a 2-D array. Pros: - This method is implemented and it works, and the code for it is reusing tools and coding patterns we've already incorporated (deconstruct_array, safe input functions, arglist arrays). Patch attached. Cons: - The ordering is completely opaque. Documenting that ordering might help a little, but there's nothing intuitive about it and checking it has been an irritant to author and reviewer alike. - This opaque ordering must be kept in sync between pg_restore_extended_stats and pg_dump or else statistical garbage will result. 2. Defining a record type specifically for purpose. Pros: - It could be decomposed via standard composite input function, and then each type deconstructed on its own terms Cons: - It's type-clutter, and a type that is likely only used during upgrades. - When new stats types are introduced, the structure would also have to change, breaking typecasts of composite values from older versions. This alone makes this option unpalatable to most reviewers, and I can't advocate for it. 3. Keeping the 2-D text array in #1, but each "row" is a list of kwargs-like pairs like the arguments used in pg_restore_attribute_stats (i.e. ARRAY['null_frac','0.5','avg_width','1.0','most_common_values',...] Pros: - Flexibility in ordering - Clarity at a glance, provided that the reader has seen the kwargs convention of the pg_restore_*_stats functions. - Still uses existing tooling like #1, and not that much more overhead. - The name-value pairing problem has the same solution as the arg-pairing that the function already does Cons: - adds overhead of storing the stat type names, and the key-value pairing - the 2-D nature of the array requires that the number of elements be fixed, so we couldn't leave out a stat type from one row unless we left it out of the other one as well - adds the possibility of duplicate names 4. JSON. The outer structure would be an array of objects, each object would be a key-value. Pros: - Flexibility in ordering - Clarity at a glance in a format well understood even without prior knowledge of our kwargs convention - we have already implemented similar things for the new formats of pg_ndistinct and pg_dependences. - This method currently has the interest of Michael Paquier, the committer of all the v19 work to date. Cons: - Requires implementing a state engine to parse the json, check for missing values, resolve duplicates. We do that for pg_dependencies, and that takes 800-ish lines of code to handle 3 key names, this would have 10. - the escaping of values in a composite record CASTed to text and then further encoded as a JSON value would be extremely unreadable, and likely quite bloated. - using JSON for stats serialization met with immediate strong opposition from several reviewers. That resistance may not be there for this vastly reduced scope, especially in light of the new JSON-compatible formats for pg_ndistinct and pg_dependencies, but it does give me pause. And...that's the major decision point. If we solve that, the rest is far less controversial. My apologies that this summary itself needs a summary. Thanks for reading. Eager to hear perspectives on the serialization methods propsed, or suggestions of other methods. [1] https://commitfest.postgresql.org/patch/5517/ [2] https://www.postgresql.org/message-id/flat/aTE4AL7U0dp1Jjrx%40paquier.xyz#72116daf9d37828a47ce477a852a78d3 [3] https://commitfest.postgresql.org/patch/4538/ [4] https://www.postgresql.org/message-id/flat/CADkLM%3DcB0rF3p_FuWRTMSV0983ihTRpsH%2BOCpNyiqE7Wk0vUWA%40mail.gmail.com [5] https://commitfest.postgresql.org/patch/5523/ [6] The issue of custom statistic kinds like those found in PostGIS would still remain.
From 96d8ac8cc603e61d9ab5b0ce46d0526a98203d33 Mon Sep 17 00:00:00 2001 From: Corey Huinker <[email protected]> Date: Tue, 27 Jan 2026 03:05:36 -0500 Subject: [PATCH v1] Add support for "exprs" in pg_restore_extended_stats() This commit adds support for the restore of extended statistics of the kind "exprs". The input format consists of a two dimensional text array, one outer row for every expression defined for the statistics object. The inner row consists of the text-casted values of the statistical columns of pg_stats_ext_exprs (i.e. everything after "inherited"). This is necessary because the elements represented in the "most_common_vals" field might themselves be arrays, and our existing array deconstruction infrastructure isn't capable of stopping decomoposition at the second dimension. So, by force-casting those elements to text, we ensure that the we can validate the "most_common_vals" by applyint the proper input functions as defined by the "stxkeys" for that statistics object. --- src/backend/statistics/extended_stats_funcs.c | 520 +++++++++++++++++- src/bin/pg_dump/pg_dump.c | 34 +- src/test/regress/expected/stats_import.out | 363 +++++++++++- src/test/regress/sql/stats_import.sql | 196 ++++++- doc/src/sgml/func/func-admin.sgml | 17 +- 5 files changed, 1121 insertions(+), 9 deletions(-) diff --git a/src/backend/statistics/extended_stats_funcs.c b/src/backend/statistics/extended_stats_funcs.c index db107684607..16edcada219 100644 --- a/src/backend/statistics/extended_stats_funcs.c +++ b/src/backend/statistics/extended_stats_funcs.c @@ -34,6 +34,7 @@ #include "utils/fmgroids.h" #include "utils/lsyscache.h" #include "utils/syscache.h" +#include "utils/typcache.h" /* @@ -51,6 +52,7 @@ enum extended_stats_argnum MOST_COMMON_VALS_ARG, MOST_COMMON_FREQS_ARG, MOST_COMMON_BASE_FREQS_ARG, + EXPRESSIONS_ARG, NUM_EXTENDED_STATS_ARGS, }; @@ -70,9 +72,50 @@ static struct StatsArgInfo extarginfo[] = [MOST_COMMON_VALS_ARG] = {"most_common_vals", TEXTARRAYOID}, [MOST_COMMON_FREQS_ARG] = {"most_common_freqs", FLOAT8ARRAYOID}, [MOST_COMMON_BASE_FREQS_ARG] = {"most_common_base_freqs", FLOAT8ARRAYOID}, + [EXPRESSIONS_ARG] = {"exprs", TEXTARRAYOID}, [NUM_EXTENDED_STATS_ARGS] = {0}, }; +/* + * An index of the elements of a stxdexprs Datum, which repeat for each + * expression in the extended statistics object. + * + * NOTE: the RANGE_LENGTH & RANGE_BOUNDS stats are not yet reflected in any + * version of pg_stat_ext_exprs. + */ +enum extended_stats_exprs_element +{ + NULL_FRAC_ELEM = 0, + AVG_WIDTH_ELEM, + N_DISTINCT_ELEM, + MOST_COMMON_VALS_ELEM, + MOST_COMMON_FREQS_ELEM, + HISTOGRAM_BOUNDS_ELEM, + CORRELATION_ELEM, + MOST_COMMON_ELEMS_ELEM, + MOST_COMMON_ELEM_FREQS_ELEM, + ELEM_COUNT_HISTOGRAM_ELEM, + NUM_ATTRIBUTE_STATS_ELEMS +}; + +/* + * The argument names and typoids of the repeating arguments for stxdexprs. + */ +static struct StatsArgInfo extexprarginfo[] = +{ + [NULL_FRAC_ELEM] = {"null_frac", FLOAT4OID}, + [AVG_WIDTH_ELEM] = {"avg_width", INT4OID}, + [N_DISTINCT_ELEM] = {"n_distinct", FLOAT4OID}, + [MOST_COMMON_VALS_ELEM] = {"most_common_vals", TEXTOID}, + [MOST_COMMON_FREQS_ELEM] = {"most_common_freqs", FLOAT4ARRAYOID}, + [HISTOGRAM_BOUNDS_ELEM] = {"histogram_bounds", TEXTOID}, + [CORRELATION_ELEM] = {"correlation", FLOAT4OID}, + [MOST_COMMON_ELEMS_ELEM] = {"most_common_elems", TEXTOID}, + [MOST_COMMON_ELEM_FREQS_ELEM] = {"most_common_elem_freqs", FLOAT4ARRAYOID}, + [ELEM_COUNT_HISTOGRAM_ELEM] = {"elem_count_histogram", FLOAT4ARRAYOID}, + [NUM_ATTRIBUTE_STATS_ELEMS] = {0} +}; + static bool extended_statistics_update(FunctionCallInfo fcinfo); static HeapTuple get_pg_statistic_ext(Relation pg_stext, Oid nspoid, @@ -98,6 +141,10 @@ static void upsert_pg_statistic_ext_data(const Datum *values, static bool check_mcvlist_array(const ArrayType *arr, int argindex, int required_ndims, int mcv_length); +static Datum import_expressions(Relation pgsd, int numexprs, + Oid *atttypids, int32 *atttypmods, + Oid *atttypcolls, ArrayType *exprs_arr, + bool *ok); static Datum import_mcv(const ArrayType *mcv_arr, const ArrayType *freqs_arr, const ArrayType *base_freqs_arr, @@ -105,6 +152,61 @@ static Datum import_mcv(const ArrayType *mcv_arr, Oid *atttypcolls, int numattrs, bool *ok); +/* + * Safe conversion of text to float4. + */ +static bool +text_to_float4(Datum input, const char *argname, Datum *output) +{ + ErrorSaveContext escontext = {T_ErrorSaveContext}; + + char *s; + bool ok; + + s = TextDatumGetCString(input); + ok = DirectInputFunctionCallSafe(float4in, s, InvalidOid, -1, + (Node *) &escontext, output); + + if (!ok) + { + ereport(WARNING, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("could not parse expression element \"%s\" value \"%s\": invalid type \"%s\"", + argname, s, "float4")); + *output = (Datum) 0; + } + + pfree(s); + return ok; +} + +/* + * Safe conversion of text to int4. + */ +static bool +text_to_int4(Datum input, const char *argname, Datum *output) +{ + ErrorSaveContext escontext = {T_ErrorSaveContext}; + + char *s; + bool ok; + + s = TextDatumGetCString(input); + ok = DirectInputFunctionCallSafe(int4in, s, InvalidOid, -1, + (Node *) &escontext, output); + + if (!ok) + { + ereport(WARNING, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("could not parse expression element \"%s\" value \"%s\": invalid type \"%s\"", + argname, s, "integer")); + *output = (Datum) 0; + } + + pfree(s); + return ok; +} /* * Fetch a pg_statistic_ext row by name and namespace OID. @@ -296,6 +398,7 @@ extended_statistics_update(FunctionCallInfo fcinfo) !PG_ARGISNULL(MOST_COMMON_BASE_FREQS_ARG)); has.ndistinct = !PG_ARGISNULL(NDISTINCT_ARG); has.dependencies = !PG_ARGISNULL(DEPENDENCIES_ARG); + has.expressions = !PG_ARGISNULL(EXPRESSIONS_ARG); if (RecoveryInProgress()) { @@ -492,6 +595,20 @@ extended_statistics_update(FunctionCallInfo fcinfo) } } + /* If the object can't support expressions, we should not have them. */ + if (has.expressions && !enabled.expressions) + { + ereport(WARNING, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("cannot specify parameter \"%s\".", + extarginfo[EXPRESSIONS_ARG].argname), + errhint("Extended statistics object " + "does not support statistics of this type.")); + + has.expressions = false; + success = false; + } + /* * Either of these statistic types requires that we supply a semi-filled * VacAttrStatP array. @@ -501,7 +618,7 @@ extended_statistics_update(FunctionCallInfo fcinfo) * attstattarget is 0, and we may have statistics data to import for those * attributes. */ - if (has.mcv) + if (has.mcv || has.expressions) { atttypids = palloc0_array(Oid, numattrs); atttypmods = palloc0_array(int32, numattrs); @@ -636,6 +753,42 @@ extended_statistics_update(FunctionCallInfo fcinfo) success = false; } + if (has.expressions) + { + Datum datum; + Relation pgsd; + bool ok = false; + + pgsd = table_open(StatisticRelationId, RowExclusiveLock); + + /* + * Generate the expressions array. + * + * The attytypids, attytypmods, and atttypcols arrays have all the + * regular attributes listed first, so we can pass those arrays with a + * start point after the last regular attribute, and there should be + * numexprs elements remaining. + */ + datum = import_expressions(pgsd, numexprs, + &atttypids[numattnums], + &atttypmods[numattnums], + &atttypcolls[numattnums], + PG_GETARG_ARRAYTYPE_P(EXPRESSIONS_ARG), + &ok); + + table_close(pgsd, RowExclusiveLock); + + if (ok) + { + Assert(datum != (Datum) 0); + values[Anum_pg_statistic_ext_data_stxdexpr - 1] = datum; + replaces[Anum_pg_statistic_ext_data_stxdexpr - 1] = true; + nulls[Anum_pg_statistic_ext_data_stxdexpr - 1] = false; + } + else + success = false; + } + upsert_pg_statistic_ext_data(values, nulls, replaces); cleanup: @@ -764,6 +917,371 @@ mcv_error: return mcv; } +/* + * Create the stxdexprs datum using the user input in an array of array of + * text, referenced against the datatypes for the expressions. + * + * This datum is needed to fill out a complete pg_statistic_ext_data tuple. + * + * The input arrays should each have "numexprs" elements in them and they + * should be in the order that the expressions appear in the statistics + * object. + * + * It is not practical to update parts of an element of stxdexprs, so if any + * conversion errors are found in this function, then the entire attribute + * will be left unchanged. + */ +static Datum +import_expressions(Relation pgsd, int numexprs, + Oid *atttypids, int32 *atttypmods, + Oid *atttypcolls, ArrayType *exprs_arr, + bool *ok) +{ + Datum *exprs_elems = NULL; + bool *exprs_nulls = NULL; + int num_text_elems; + int offset = 0; + + FmgrInfo array_in_fn; + + Oid pgstypoid = get_rel_type_id(StatisticRelationId); + + ArrayBuildState *astate = NULL; + Datum result = (Datum) 0; + + *ok = false; + + /* Overall array must be 2-D */ + if (ARR_NDIM(exprs_arr) != 2) + { + ereport(WARNING, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("malformed array \"%s\": expected text array of 2 dimensions", + extarginfo[EXPRESSIONS_ARG].argname)); + goto exprs_error; + } + + /* + * The outer array must be equal in length to the number of expression + * nodes found in pg_statistic_ext.stxexprs. + */ + if (ARR_DIMS(exprs_arr)[0] != numexprs) + { + ereport(WARNING, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("malformed array \"%s\": incorect number of dimensions (%d required)", + extarginfo[EXPRESSIONS_ARG].argname, numexprs)); + goto exprs_error; + } + + /* + * The inner array is constructed from pg_stats_ext_exprs and is for the + * time being is fixed in length. + * + * If future versions of pg_stats_ext_exprs add new statistics types, then + * the number of expected elements will depend on the version parameter + * given in the call to pg_restore_extended_stats(), and we will have to + * accommodate all valid past versions. + */ + if (ARR_DIMS(exprs_arr)[1] != NUM_ATTRIBUTE_STATS_ELEMS) + { + ereport(WARNING, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("could not use parameter \"%s\" due to incorrect inner dimension with %d elements", + extarginfo[EXPRESSIONS_ARG].argname, + NUM_ATTRIBUTE_STATS_ELEMS)); + goto exprs_error; + } + + deconstruct_array_builtin(exprs_arr, TEXTOID, &exprs_elems, + &exprs_nulls, &num_text_elems); + + Assert((numexprs*NUM_ATTRIBUTE_STATS_ELEMS) == num_text_elems); + + fmgr_info(F_ARRAY_IN, &array_in_fn); + + /* + * Iterate over each expected expression. + * + * The values/nulls/replaces arrays are deconstructed into a 1-D arrays, + * so we have to advance an offset by NUM_ATTRIBUTE_STATS_ELEMS to get to + * the next row of the 2-D array. + */ + for (int i = 0; i < numexprs; i++) + { + Oid typid = atttypids[i]; + int32 typmod = atttypmods[i]; + Oid stacoll = atttypcolls[i]; + TypeCacheEntry *typcache; + + Oid elemtypid = InvalidOid; + Oid elem_eq_opr = InvalidOid; + + Datum values[Natts_pg_statistic]; + bool nulls[Natts_pg_statistic]; + bool replaces[Natts_pg_statistic]; + + HeapTuple pgstup; + Datum pgstdat; + + /* Advance the indexes to the next offset. */ + const int null_frac_idx = offset + NULL_FRAC_ELEM; + const int avg_width_idx = offset + AVG_WIDTH_ELEM; + const int n_distinct_idx = offset + N_DISTINCT_ELEM; + const int most_common_vals_idx = offset + MOST_COMMON_VALS_ELEM; + const int most_common_freqs_idx = offset + MOST_COMMON_FREQS_ELEM; + const int histogram_bounds_idx = offset + HISTOGRAM_BOUNDS_ELEM; + const int correlation_idx = offset + CORRELATION_ELEM; + const int most_common_elems_idx = offset + MOST_COMMON_ELEMS_ELEM; + const int most_common_elems_freqs_idx = offset + MOST_COMMON_ELEM_FREQS_ELEM; + const int elem_count_histogram_idx = offset + ELEM_COUNT_HISTOGRAM_ELEM; + + /* + * XXX: + * + * We may need to duplicate some steps from statatt_get_type() that we + * do not currently, those are: + * + * #include "catalog/pg_collation_d.h" if (typid == TSVECTOROID) + * stacoll = DEFAULT_COLLATION_OID; + * + * The multirange step-down may also need to happen here too. + */ + /* This finds the right operators even if atttypid is a domain */ + typcache = lookup_type_cache(typid, TYPECACHE_LT_OPR | TYPECACHE_EQ_OPR); + + statatt_init_empty_tuple(InvalidOid, InvalidAttrNumber, false, + values, nulls, replaces); + + /* + * Check each of the fixed attributes to see if they have values set. + * If not set, then just let them stay with the default values set in + * statatt_init_empty_tuple(). + */ + if (!exprs_nulls[null_frac_idx]) + { + if (!text_to_float4(exprs_elems[null_frac_idx], + extexprarginfo[NULL_FRAC_ELEM].argname, + &values[Anum_pg_statistic_stanullfrac - 1])) + goto exprs_error; + } + + if (!exprs_nulls[avg_width_idx]) + { + if (!text_to_int4(exprs_elems[avg_width_idx], + extexprarginfo[AVG_WIDTH_ELEM].argname, + &values[Anum_pg_statistic_stawidth - 1])) + goto exprs_error; + } + + if (!exprs_nulls[n_distinct_idx]) + { + if (!text_to_float4(exprs_elems[n_distinct_idx], + extexprarginfo[N_DISTINCT_ELEM].argname, + &values[Anum_pg_statistic_stadistinct - 1])) + goto exprs_error; + } + + /* + * The STAKIND statistics are the same as the ones found in attribute + * stats. However, these are all derived from text columns, whereas + * the ones derived for attribute stats are a mix of datatypes. This + * limits the opportunities for code sharing between the two. + * + * Some statistic kinds have both a stanumbers and a stavalues + * components. In those cases, both values must either be NOT NULL or + * both NULL, and if they aren't then we need to reject that stakind + * completely. Currently we go a step further and reject the + * expression array completely. + * + * Once it is established that the pairs are in NULL/NOT-NULL + * alignment, we can test either expr_nulls[] value to see if the + * stakind has value(s) that we can set or not. + */ + + /* STATISTIC_KIND_MCV */ + if (exprs_nulls[most_common_vals_idx] != + exprs_nulls[most_common_freqs_idx]) + { + ereport(WARNING, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid expression elements %s and %s: conflicting NULL/NOT NULL", + extexprarginfo[MOST_COMMON_VALS_ELEM].argname, + extexprarginfo[MOST_COMMON_FREQS_ELEM].argname), + errhint("The elements must both be NULL or both NOT NULL.")); + goto exprs_error; + } + + if (!exprs_nulls[most_common_vals_idx]) + { + Datum stavalues; + Datum stanumbers; + bool val_ok = false; + + stavalues = statatt_build_stavalues(extexprarginfo[MOST_COMMON_VALS_ELEM].argname, + &array_in_fn, exprs_elems[most_common_vals_idx], + typid, typmod, &val_ok); + + if (!val_ok) + goto exprs_error; + + stanumbers = statatt_build_stavalues(extexprarginfo[MOST_COMMON_FREQS_ELEM].argname, + &array_in_fn, exprs_elems[most_common_freqs_idx], + FLOAT4OID, -1, &val_ok); + + if (!val_ok) + goto exprs_error; + + statatt_set_slot(values, nulls, replaces, + STATISTIC_KIND_MCV, + typcache->eq_opr, stacoll, + stanumbers, false, stavalues, false); + } + + /* STATISTIC_KIND_HISTOGRAM */ + if (!exprs_nulls[histogram_bounds_idx]) + { + Datum stavalues; + bool val_ok = false; + + stavalues = statatt_build_stavalues(extexprarginfo[HISTOGRAM_BOUNDS_ELEM].argname, + &array_in_fn, + exprs_elems[histogram_bounds_idx], + typid, typmod, &val_ok); + + if (!val_ok) + goto exprs_error; + + statatt_set_slot(values, nulls, replaces, + STATISTIC_KIND_HISTOGRAM, + typcache->lt_opr, stacoll, + 0, true, stavalues, false); + } + + /* STATISTIC_KIND_CORRELATION */ + if (!exprs_nulls[correlation_idx]) + { + Datum corr[] = {(Datum) 0}; + ArrayType *arry; + Datum stanumbers; + + if (!text_to_float4(exprs_elems[correlation_idx], + extexprarginfo[CORRELATION_ELEM].argname, + &corr[0])) + goto exprs_error; + + arry = construct_array_builtin(corr, 1, FLOAT4OID); + + stanumbers = PointerGetDatum(arry); + + statatt_set_slot(values, nulls, replaces, + STATISTIC_KIND_CORRELATION, + typcache->lt_opr, stacoll, + stanumbers, false, 0, true); + } + + /* STATISTIC_KIND_MCELEM */ + if (exprs_nulls[most_common_elems_idx] != + exprs_nulls[most_common_elems_freqs_idx]) + { + ereport(WARNING, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid expression elements %s and %s: conflicting NULL/NOT NULL", + extexprarginfo[MOST_COMMON_ELEMS_ELEM].argname, + extexprarginfo[MOST_COMMON_ELEM_FREQS_ELEM].argname), + errhint("The elements must both be NULL or both NOT NULL.")); + goto exprs_error; + } + + /* + * We only need to fetch element type and eq operator if we have a + * stat of type MCELEM or DECHIST, otherwise the values are + * unnecessary and not meaningful. + */ + if (!exprs_nulls[most_common_elems_idx] || + !exprs_nulls[elem_count_histogram_idx]) + { + if (!statatt_get_elem_type(typid, typcache->typtype, + &elemtypid, &elem_eq_opr)) + { + ereport(WARNING, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("could not determine element type of expression")); + goto exprs_error; + } + } + + if (!exprs_nulls[most_common_elems_idx]) + { + Datum stavalues; + Datum stanumbers; + bool val_ok = false; + + stavalues = statatt_build_stavalues(extexprarginfo[MOST_COMMON_ELEMS_ELEM].argname, + &array_in_fn, + exprs_elems[most_common_elems_idx], + elemtypid, typmod, &val_ok); + + if (!val_ok) + goto exprs_error; + + stanumbers = statatt_build_stavalues(extexprarginfo[MOST_COMMON_ELEM_FREQS_ELEM].argname, + &array_in_fn, + exprs_elems[most_common_elems_freqs_idx], + FLOAT4OID, -1, &val_ok); + + if (!val_ok) + goto exprs_error; + + statatt_set_slot(values, nulls, replaces, + STATISTIC_KIND_MCELEM, + elem_eq_opr, stacoll, + stanumbers, false, stavalues, false); + } + + if (!exprs_nulls[elem_count_histogram_idx]) + { + Datum stanumbers; + bool val_ok = false; + + stanumbers = statatt_build_stavalues(extexprarginfo[ELEM_COUNT_HISTOGRAM_ELEM].argname, + &array_in_fn, + exprs_elems[elem_count_histogram_idx], + FLOAT4OID, -1, &val_ok); + + if (!val_ok) + goto exprs_error; + + statatt_set_slot(values, nulls, replaces, STATISTIC_KIND_DECHIST, + elem_eq_opr, stacoll, + stanumbers, false, 0, true); + } + + /* + * Currently there is no extended stats export of the statistic kinds + * BOUNDS_HISTOGRAM or RANGE_LENGTH_HISTOGRAM so these cannot be + * imported. These may be added in the future. + */ + + pgstup = heap_form_tuple(RelationGetDescr(pgsd), values, nulls); + pgstdat = heap_copy_tuple_as_datum(pgstup, RelationGetDescr(pgsd)); + astate = accumArrayResult(astate, pgstdat, false, pgstypoid, + CurrentMemoryContext); + + offset += NUM_ATTRIBUTE_STATS_ELEMS; + } + result = makeArrayResult(astate, CurrentMemoryContext); + *ok = true; + +exprs_error: + /* clean up */ + if (exprs_elems != NULL) + pfree(exprs_elems); + if (exprs_nulls != NULL) + pfree(exprs_nulls); + return result; +} + /* * Remove an existing pg_statistic_ext_data row for a given pg_statistic_ext * row and "inherited" pair. diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 2bebefd0ba2..17f38770af1 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -18652,11 +18652,36 @@ dumpStatisticsExtStats(Archive *fout, const StatsExtInfo *statsextinfo) if (fout->remoteVersion >= 130000) appendPQExpBufferStr(pq, "e.most_common_vals, e.most_common_freqs, " - "e.most_common_base_freqs "); + "e.most_common_base_freqs, "); else appendPQExpBufferStr(pq, "NULL AS most_common_vals, NULL AS most_common_freqs, " - "NULL AS most_common_base_freqs "); + "NULL AS most_common_base_freqs, "); + + /* Expressions were introduced in v14 */ + if (fout->remoteVersion >= 140000) + { + appendPQExpBufferStr(pq, + "( " + "SELECT array_agg( " + " ARRAY[ee.null_frac::text, ee.avg_width::text, " + " ee.n_distinct::text, ee.most_common_vals::text, " + " ee.most_common_freqs::text, ee.histogram_bounds::text, " + " ee.correlation::text, ee.most_common_elems::text, " + " ee.most_common_elem_freqs::text, " + " ee.elem_count_histogram::text]) " + "FROM pg_stats_ext_exprs AS ee " + "WHERE ee.statistics_schemaname = $1 " + "AND ee.statistics_name = $2 "); + + /* Inherited expressions introduced in v15 */ + if (fout->remoteVersion >= 150000) + appendPQExpBufferStr(pq, "AND ee.inherited = e.inherited"); + + appendPQExpBufferStr(pq, ") AS exprs "); + } + else + appendPQExpBufferStr(pq, "NULL AS exprs "); /* pg_stats_ext introduced in v12 */ if (fout->remoteVersion >= 120000) @@ -18710,6 +18735,7 @@ dumpStatisticsExtStats(Archive *fout, const StatsExtInfo *statsextinfo) int i_mcv = PQfnumber(res, "most_common_vals"); int i_mcf = PQfnumber(res, "most_common_freqs"); int i_mcbf = PQfnumber(res, "most_common_base_freqs"); + int i_exprs = PQfnumber(res, "exprs"); for (int i = 0; i < nstats; i++) { @@ -18757,6 +18783,10 @@ dumpStatisticsExtStats(Archive *fout, const StatsExtInfo *statsextinfo) appendNamedArgument(out, fout, "most_common_base_freqs", "double precision[]", PQgetvalue(res, i, i_mcbf)); + if (!PQgetisnull(res, i, i_exprs)) + appendNamedArgument(out, fout, "exprs", "text[]", + PQgetvalue(res, i, i_exprs)); + appendPQExpBufferStr(out, "\n);\n"); } diff --git a/src/test/regress/expected/stats_import.out b/src/test/regress/expected/stats_import.out index 37131f9ceab..30bb9138854 100644 --- a/src/test/regress/expected/stats_import.out +++ b/src/test/regress/expected/stats_import.out @@ -2155,7 +2155,8 @@ SELECT pg_catalog.pg_restore_extended_stats( {red,"{[11,13),[15,19),[20,30)}","{[11,13),[15,19),[20,30),[10000,10200)}"}, {red,"{[21,23),[25,29),[120,130)}","{[21,23),[25,29),[120,130),[10000,10200)}"}}'::text[], 'most_common_freqs', '{0.3333333333333333,0.3333333333333333,0.3333333333333333}'::double precision[], - 'most_common_base_freqs', '{0.1111111111111111,0.1111111111111111,0.1111111111111111}'::double precision[] + 'most_common_base_freqs', '{0.1111111111111111,0.1111111111111111,0.1111111111111111}'::double precision[], + 'exprs', '{{0,60,-1,NULL,NULL,NULL,NULL,NULL,NULL,NULL}}'::text[] ); pg_restore_extended_stats --------------------------- @@ -2191,6 +2192,366 @@ most_common_val_nulls | {{f,f,f},{f,f,f},{f,f,f}} most_common_freqs | {0.3333333333333333,0.3333333333333333,0.3333333333333333} most_common_base_freqs | {0.1111111111111111,0.1111111111111111,0.1111111111111111} +SELECT e.expr, e.null_frac, e.avg_width, e.n_distinct, e.most_common_vals, + e.most_common_freqs, e.histogram_bounds, e.correlation, + e.most_common_elems, e.most_common_elem_freqs, e.elem_count_histogram +FROM pg_stats_ext_exprs AS e +WHERE e.statistics_schemaname = 'stats_import' AND + e.statistics_name = 'test_mr_stat' AND + e.inherited = false +\gx +-[ RECORD 1 ]----------+--------------------------------------------- +expr | (mrange + '{[10000,10200)}'::int4multirange) +null_frac | 0 +avg_width | 60 +n_distinct | -1 +most_common_vals | +most_common_freqs | +histogram_bounds | +correlation | +most_common_elems | +most_common_elem_freqs | +elem_count_histogram | + +-- Incorrect extended stats kind, exprs not supported +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_ndistinct', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +WARNING: cannot specify parameter "exprs". +HINT: Extended statistics object does not support statistics of this type. + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- Invalid exprs, not 2-D +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{0,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL}'::text[]); +WARNING: malformed array "exprs": expected text array of 2 dimensions + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs outer dimension wrong +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL}}'::text[]); +WARNING: malformed array "exprs": incorect number of dimensions (2 required) + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs inner dimension missing 1 element +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL}}'::text[]); +WARNING: could not use parameter "exprs" due to incorrect inner dimension with 10 elements + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs null_frac not a float +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{BADNULLFRAC,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +WARNING: could not parse expression element "null_frac" value "BADNULLFRAC": invalid type "float4" + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs avg_width not an integer +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,BADAVGWIDTH,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +WARNING: could not parse expression element "avg_width" value "BADAVGWIDTH": invalid type "integer" + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs n_dinstinct not a float +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,BADNDISTINCT,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +WARNING: could not parse expression element "n_distinct" value "BADNDISTINCT": invalid type "float4" + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs conflicting most_common_vals/most_common_freqs NULL/NOTNULL +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}",NULL,"{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +WARNING: invalid expression elements most_common_vals and most_common_freqs: conflicting NULL/NOT NULL +HINT: The elements must both be NULL or both NOT NULL. + pg_restore_extended_stats +--------------------------- + f +(1 row) + +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,NULL,"{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +WARNING: invalid expression elements most_common_vals and most_common_freqs: conflicting NULL/NOT NULL +HINT: The elements must both be NULL or both NOT NULL. + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs most_common_vals element wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{BADMCV}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +WARNING: invalid input syntax for type integer: "BADMCV" + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs most_common_freqs element wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{BADMCF}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +WARNING: invalid input syntax for type real: "BADMCF" + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs histogram wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{BADHIST,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +WARNING: invalid input syntax for type integer: "BADHIST" + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs correlation wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{-1,0}",BADCORR,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +WARNING: could not parse expression element "correlation" value "BADCORR": invalid type "float4" + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- ok: exprs +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); + pg_restore_extended_stats +--------------------------- + t +(1 row) + +SELECT e.expr, e.null_frac, e.avg_width, e.n_distinct, e.most_common_vals, + e.most_common_freqs, e.histogram_bounds, e.correlation, + e.most_common_elems, e.most_common_elem_freqs, e.elem_count_histogram +FROM pg_stats_ext_exprs AS e +WHERE e.statistics_schemaname = 'stats_import' AND + e.statistics_name = 'test_stat_clone' AND + e.inherited = false +\gx +-[ RECORD 1 ]----------+---------------------- +expr | lower(arange) +null_frac | 0 +avg_width | 4 +n_distinct | -0.75 +most_common_vals | {1} +most_common_freqs | {0.5} +histogram_bounds | {-1,0} +correlation | -0.6 +most_common_elems | +most_common_elem_freqs | +elem_count_histogram | +-[ RECORD 2 ]----------+---------------------- +expr | array_length(tags, 1) +null_frac | 0.25 +avg_width | 4 +n_distinct | -0.5 +most_common_vals | {2} +most_common_freqs | {0.5} +histogram_bounds | +correlation | 1 +most_common_elems | +most_common_elem_freqs | +elem_count_histogram | + +-- A statistics object for testing MCELEM values in expressions +CREATE STATISTICS stats_import.test_stat_mcelem + ON name, (ARRAY[(comp).a, lower(arange)]) + FROM stats_import.test; +-- exprs conflicting most_common_elems/most_common_elem_freqs NULL/NOTNULL +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,"{-1,0,1,2,3}",NULL,"{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); +WARNING: invalid expression elements most_common_elems and most_common_elem_freqs: conflicting NULL/NOT NULL +HINT: The elements must both be NULL or both NOT NULL. + pg_restore_extended_stats +--------------------------- + f +(1 row) + +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,NULL,"{0.25,0.25,0.5,0.25,0.25,0.25,0.5,0.25}","{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); +WARNING: invalid expression elements most_common_elems and most_common_elem_freqs: conflicting NULL/NOT NULL +HINT: The elements must both be NULL or both NOT NULL. + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs most_common_elems element wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,"{-1,BADELEM,1,2,3}","{0.25,0.25,0.5,0.25,0.25,0.25,0.5,0.25}","{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); +WARNING: invalid input syntax for type integer: "BADELEM" + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs most_common_elem_freqs element wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,"{-1,0,1,2,3}","{BADELEMFREQ,0.25,0.5,0.25,0.25,0.25,0.5,0.25}","{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); +WARNING: invalid input syntax for type real: "BADELEMFREQ" + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- exprs histogram bounds element wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,"{-1,0,1,2,3}","{0.25,0.25,0.5,0.25,0.25,0.25,0.5,0.25}","{BADELEMHIST,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); +WARNING: invalid input syntax for type real: "BADELEMHIST" + pg_restore_extended_stats +--------------------------- + f +(1 row) + +-- ok: exprs mcelem +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,"{-1,0,1,2,3}","{0.25,0.25,0.5,0.25,0.25,0.25,0.5,0.25}","{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); + pg_restore_extended_stats +--------------------------- + t +(1 row) + +SELECT e.expr, e.null_frac, e.avg_width, e.n_distinct, e.most_common_vals, + e.most_common_freqs, e.histogram_bounds, e.correlation, + e.most_common_elems, e.most_common_elem_freqs, e.elem_count_histogram +FROM pg_stats_ext_exprs AS e +WHERE e.statistics_schemaname = 'stats_import' AND + e.statistics_name = 'test_stat_mcelem' AND + e.inherited = false +\gx +-[ RECORD 1 ]----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- +expr | ARRAY[(comp).a, lower(arange)] +null_frac | 0 +avg_width | 33 +n_distinct | -1 +most_common_vals | +most_common_freqs | +histogram_bounds | {"{1,1}","{2,1}","{3,-1}","{NULL,0}"} +correlation | 1 +most_common_elems | {-1,0,1,2,3} +most_common_elem_freqs | {0.25,0.25,0.5,0.25,0.25,0.25,0.5,0.25} +elem_count_histogram | {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5} + -- Test the ability of pg_restore_extended_stats() to import all of the -- statistic values from an extended statistic object that has been -- populated via a regular ANALYZE. This checks after the statistics diff --git a/src/test/regress/sql/stats_import.sql b/src/test/regress/sql/stats_import.sql index 8db7cd93b88..138b78c157e 100644 --- a/src/test/regress/sql/stats_import.sql +++ b/src/test/regress/sql/stats_import.sql @@ -1543,7 +1543,8 @@ SELECT pg_catalog.pg_restore_extended_stats( {red,"{[11,13),[15,19),[20,30)}","{[11,13),[15,19),[20,30),[10000,10200)}"}, {red,"{[21,23),[25,29),[120,130)}","{[21,23),[25,29),[120,130),[10000,10200)}"}}'::text[], 'most_common_freqs', '{0.3333333333333333,0.3333333333333333,0.3333333333333333}'::double precision[], - 'most_common_base_freqs', '{0.1111111111111111,0.1111111111111111,0.1111111111111111}'::double precision[] + 'most_common_base_freqs', '{0.1111111111111111,0.1111111111111111,0.1111111111111111}'::double precision[], + 'exprs', '{{0,60,-1,NULL,NULL,NULL,NULL,NULL,NULL,NULL}}'::text[] ); SELECT replace(e.n_distinct, '}, ', E'},\n') AS n_distinct, @@ -1557,6 +1558,199 @@ WHERE e.statistics_schemaname = 'stats_import' AND e.inherited = false \gx +SELECT e.expr, e.null_frac, e.avg_width, e.n_distinct, e.most_common_vals, + e.most_common_freqs, e.histogram_bounds, e.correlation, + e.most_common_elems, e.most_common_elem_freqs, e.elem_count_histogram +FROM pg_stats_ext_exprs AS e +WHERE e.statistics_schemaname = 'stats_import' AND + e.statistics_name = 'test_mr_stat' AND + e.inherited = false +\gx + +-- Incorrect extended stats kind, exprs not supported +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_ndistinct', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); + +-- Invalid exprs, not 2-D +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{0,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL}'::text[]); +-- exprs outer dimension wrong +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL}}'::text[]); +-- exprs inner dimension missing 1 element +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL}}'::text[]); +-- exprs null_frac not a float +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{BADNULLFRAC,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +-- exprs avg_width not an integer +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,BADAVGWIDTH,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +-- exprs n_dinstinct not a float +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,BADNDISTINCT,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +-- exprs conflicting most_common_vals/most_common_freqs NULL/NOTNULL +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}",NULL,"{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,NULL,"{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +-- exprs most_common_vals element wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{BADMCV}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +-- exprs most_common_freqs element wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{BADMCF}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +-- exprs histogram wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{BADHIST,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +-- exprs correlation wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{-1,0}",BADCORR,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); +-- ok: exprs +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test_clone', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_clone', + 'inherited', false, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); + +SELECT e.expr, e.null_frac, e.avg_width, e.n_distinct, e.most_common_vals, + e.most_common_freqs, e.histogram_bounds, e.correlation, + e.most_common_elems, e.most_common_elem_freqs, e.elem_count_histogram +FROM pg_stats_ext_exprs AS e +WHERE e.statistics_schemaname = 'stats_import' AND + e.statistics_name = 'test_stat_clone' AND + e.inherited = false +\gx + +-- A statistics object for testing MCELEM values in expressions +CREATE STATISTICS stats_import.test_stat_mcelem + ON name, (ARRAY[(comp).a, lower(arange)]) + FROM stats_import.test; + +-- exprs conflicting most_common_elems/most_common_elem_freqs NULL/NOTNULL +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,"{-1,0,1,2,3}",NULL,"{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,NULL,"{0.25,0.25,0.5,0.25,0.25,0.25,0.5,0.25}","{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); +-- exprs most_common_elems element wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,"{-1,BADELEM,1,2,3}","{0.25,0.25,0.5,0.25,0.25,0.25,0.5,0.25}","{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); +-- exprs most_common_elem_freqs element wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,"{-1,0,1,2,3}","{BADELEMFREQ,0.25,0.5,0.25,0.25,0.25,0.5,0.25}","{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); +-- exprs histogram bounds element wrong type +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,"{-1,0,1,2,3}","{0.25,0.25,0.5,0.25,0.25,0.25,0.5,0.25}","{BADELEMHIST,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); +-- ok: exprs mcelem +SELECT pg_catalog.pg_restore_extended_stats( + 'schemaname', 'stats_import', + 'relname', 'test', + 'statistics_schemaname', 'stats_import', + 'statistics_name', 'test_stat_mcelem', + 'inherited', false, + 'exprs', '{{0,33,-1,NULL,NULL,"{\"{1,1}\",\"{2,1}\",\"{3,-1}\",\"{NULL,0}\"}",1,"{-1,0,1,2,3}","{0.25,0.25,0.5,0.25,0.25,0.25,0.5,0.25}","{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.5}"}}'::text[]); + +SELECT e.expr, e.null_frac, e.avg_width, e.n_distinct, e.most_common_vals, + e.most_common_freqs, e.histogram_bounds, e.correlation, + e.most_common_elems, e.most_common_elem_freqs, e.elem_count_histogram +FROM pg_stats_ext_exprs AS e +WHERE e.statistics_schemaname = 'stats_import' AND + e.statistics_name = 'test_stat_mcelem' AND + e.inherited = false +\gx + -- Test the ability of pg_restore_extended_stats() to import all of the -- statistic values from an extended statistic object that has been -- populated via a regular ANALYZE. This checks after the statistics diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml index 3ac81905d1f..bdfad1c8929 100644 --- a/doc/src/sgml/func/func-admin.sgml +++ b/doc/src/sgml/func/func-admin.sgml @@ -2198,12 +2198,14 @@ SELECT pg_restore_attribute_stats( <structname>myschema.mystatsobj</structname>: <programlisting> SELECT pg_restore_extended_stats( - 'schemaname', 'tab_schema'::name, - 'relname', 'tab_name'::name, - 'statistics_schemaname', 'stats_schema'::name, - 'statistics_name', 'stats_name'::name, + 'schemaname', 'tab_schema', + 'relname', 'tab_name', + 'statistics_schemaname', 'stats_schema', + 'statistics_name', 'stats_name', 'inherited', false, 'n_distinct', '[{"attributes" : [2,3], "ndistinct" : 4}]'::pg_ndistinct); + 'dependencies', '{"2 => 1": 1.000000, "2 => -1": 1.000000, "2 => -2": 1.000000}'::pg_dependencies, + 'exprs', '{{0,4,-0.75,"{1}","{0.5}","{-1,0}",-0.6,NULL,NULL,NULL},{0.25,4,-0.5,"{2}","{0.5}",NULL,1,NULL,NULL,NULL}}'::text[]); </programlisting> </para> <para> @@ -2226,6 +2228,13 @@ SELECT pg_restore_attribute_stats( <literal>dependencies</literal>, <literal>most_common_vals</literal>, <literal>most_common_freqs</literal>, and <literal>most_common_base_freqs</literal>. + To accept statistics for any expressions in the extended + statistics object, the parameter <literal>exprs</literal> with a type + <type>text[]</type> is available, the array must be two dimensional with + an outer array in length equal to the number of expressions in the object, + and the inner array elements for each of the statistical columns in + <link linkend="view-pg-stats-ext-exprs"><structname>pg_stats_ext_exprs</structname></link>, + some of which are themselves arrays. </para> <para> Additionally, this function accepts argument name base-commit: bb26a81ee28c9d9c64e6f233fafa2792768ece1b -- 2.52.0
