Hello, This should have been added with [1].
Excerpt from the documentation: "pg_stats is also designed to present the information in a more readable format than the underlying catalog — at the cost that its schema must be extended whenever new slot types are defined for pg_statistic." [2] So, I added a reminder in pg_statistic.h. Attached is v2 of this patch with some cosmetic changes. Renamed the columns a bit and updated the docs to be a bit more descriptive. (range_length_empty_frac -> empty_range_frac, range_bounds_histogram -> range_bounds_histograms) One question: We do have the option of representing the histogram of lower bounds separately from the histogram of upper bounds, as two separate view columns. Don't know if there is much utility though and there is a fair bit of added complexity: see below. Thoughts? My attempts via SQL (unnest -> lower|upper -> array_agg) were futile given unnest does not play nice with anyarray. For instance: select unnest(stavalues1) from pg_statistic; ERROR: cannot determine element type of "anyarray" argument Maybe the only option is to write a UDF pg_get_{lower|upper}_bounds_histogram which can do something similar to what calc_hist_selectivity does: /* * Convert histogram of ranges into histograms of its lower and upper * bounds. */ nhist = hslot.nvalues; hist_lower = (RangeBound *) palloc(sizeof(RangeBound) * nhist); hist_upper = (RangeBound *) palloc(sizeof(RangeBound) * nhist); for (i = 0; i < nhist; i++) { bool empty; range_deserialize(rng_typcache, DatumGetRangeTypeP(hslot.values[i]), &hist_lower[i], &hist_upper[i], &empty); /* The histogram should not contain any empty ranges */ if (empty) elog(ERROR, "bounds histogram contains an empty range"); } This is looking good and ready. [1] https://github.com/postgres/postgres/commit/918eee0c497c88260a2e107318843c9b1947bc6f [2] https://www.postgresql.org/docs/devel/view-pg-stats.html Regards, Soumyadeep (VMware)
From 1f2ed0911eb3f7a53b16047cefb499692daf5ef6 Mon Sep 17 00:00:00 2001 From: Egor Rogov <e.ro...@postgrespro.ru> Date: Sun, 11 Jul 2021 11:09:45 -0700 Subject: [PATCH v2 1/1] Display length and bounds histograms in pg_stats Values corresponding to STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM and STATISTIC_KIND_BOUNDS_HISTOGRAM were not exposed to pg_stats when these slot kinds were introduced in 918eee0c49. This commit adds the missing fields to pg_stats. TODO: catalog version bump --- doc/src/sgml/catalogs.sgml | 32 ++++++++++++++++++++++++++++ src/backend/catalog/system_views.sql | 23 +++++++++++++++++++- src/include/catalog/pg_statistic.h | 3 +++ src/test/regress/expected/rules.out | 26 +++++++++++++++++++++- 4 files changed, 82 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index f517a7d4af..7168ff9a13 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -12877,6 +12877,38 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx non-null elements. (Null for scalar types.) </para></entry> </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>range_length_histogram</structfield> <type>anyarray</type> + </para> + <para> + A histogram of the lengths of non-empty and non-null range values of a + range type column. (Null for non-range types.) + </para></entry> + </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>empty_range_frac</structfield> <type>float4</type> + </para> + <para> + Fraction of column entries whose values are empty ranges. + (Null for non-range types.) + </para></entry> + </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>range_bounds_histograms</structfield> <type>anyarray</type> + </para> + <para> + Histograms of lower and upper bounds of non-empty, non-null ranges, + combined into a single array of range values. The lower and upper bounds + of each value correspond to the histograms of lower and upper bounds + respectively. (Null for non-range types.) + </para></entry> + </row> </tbody> </tgroup> </table> diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 999d984068..d8bc622ad5 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -243,7 +243,28 @@ CREATE VIEW pg_stats WITH (security_barrier) AS WHEN stakind3 = 5 THEN stanumbers3 WHEN stakind4 = 5 THEN stanumbers4 WHEN stakind5 = 5 THEN stanumbers5 - END AS elem_count_histogram + END AS elem_count_histogram, + CASE + WHEN stakind1 = 6 THEN stavalues1 + WHEN stakind2 = 6 THEN stavalues2 + WHEN stakind3 = 6 THEN stavalues3 + WHEN stakind4 = 6 THEN stavalues4 + WHEN stakind5 = 6 THEN stavalues5 + END AS range_length_histogram, + CASE + WHEN stakind1 = 6 THEN stanumbers1[1] + WHEN stakind2 = 6 THEN stanumbers2[1] + WHEN stakind3 = 6 THEN stanumbers3[1] + WHEN stakind4 = 6 THEN stanumbers4[1] + WHEN stakind5 = 6 THEN stanumbers5[1] + END AS empty_range_frac, + CASE + WHEN stakind1 = 7 THEN stavalues1 + WHEN stakind2 = 7 THEN stavalues2 + WHEN stakind3 = 7 THEN stavalues3 + WHEN stakind4 = 7 THEN stavalues4 + WHEN stakind5 = 7 THEN stavalues5 + END AS range_bounds_histograms FROM pg_statistic s JOIN pg_class c ON (c.oid = s.starelid) JOIN pg_attribute a ON (c.oid = attrelid AND attnum = s.staattnum) LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace) diff --git a/src/include/catalog/pg_statistic.h b/src/include/catalog/pg_statistic.h index 4f95d7ade4..d0b4923424 100644 --- a/src/include/catalog/pg_statistic.h +++ b/src/include/catalog/pg_statistic.h @@ -152,6 +152,9 @@ DECLARE_FOREIGN_KEY((starelid, staattnum), pg_attribute, (attrelid, attnum)); * data "kind" will appear in any particular slot. Instead, search the * stakind fields to see if the desired data is available. (The standard * function get_attstatsslot() may be used for this.) + * + * Note: The pg_stats view needs to be modified whenever a new slot kind is + * added to core. */ /* diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index e5ab11275d..1ce26aa717 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -2409,7 +2409,31 @@ pg_stats| SELECT n.nspname AS schemaname, WHEN (s.stakind4 = 5) THEN s.stanumbers4 WHEN (s.stakind5 = 5) THEN s.stanumbers5 ELSE NULL::real[] - END AS elem_count_histogram + END AS elem_count_histogram, + CASE + WHEN (s.stakind1 = 6) THEN s.stavalues1 + WHEN (s.stakind2 = 6) THEN s.stavalues2 + WHEN (s.stakind3 = 6) THEN s.stavalues3 + WHEN (s.stakind4 = 6) THEN s.stavalues4 + WHEN (s.stakind5 = 6) THEN s.stavalues5 + ELSE NULL::anyarray + END AS range_length_histogram, + CASE + WHEN (s.stakind1 = 6) THEN s.stanumbers1[1] + WHEN (s.stakind2 = 6) THEN s.stanumbers2[1] + WHEN (s.stakind3 = 6) THEN s.stanumbers3[1] + WHEN (s.stakind4 = 6) THEN s.stanumbers4[1] + WHEN (s.stakind5 = 6) THEN s.stanumbers5[1] + ELSE NULL::real + END AS empty_range_frac, + CASE + WHEN (s.stakind1 = 7) THEN s.stavalues1 + WHEN (s.stakind2 = 7) THEN s.stavalues2 + WHEN (s.stakind3 = 7) THEN s.stavalues3 + WHEN (s.stakind4 = 7) THEN s.stavalues4 + WHEN (s.stakind5 = 7) THEN s.stavalues5 + ELSE NULL::anyarray + END AS range_bounds_histograms FROM (((pg_statistic s JOIN pg_class c ON ((c.oid = s.starelid))) JOIN pg_attribute a ON (((c.oid = a.attrelid) AND (a.attnum = s.staattnum)))) -- 2.25.1