On 1/19/21 1:44 AM, Tatsuro Yamada wrote:
Hi Tomas,
As for how to deal with this, I can think of about three ways:
1) simplify the command to only print information from
pg_statistic_ext (so on information about which stats are built or
sizes)
2) extend pg_stats_ext with necessary information (e.g. sizes)
3) create a new system view, with necessary information (so that
pg_stats_ext does not need to be modified)
4) add functions returning the necessary information, possibly only
for statistics the user can access (similarly to what pg_stats_ext
does)
Options 2-4 have the obvious disadvantage that this won't work on
older releases (we can't add views or functions there). So I'm
leaning towards #1 even if that means we have to remove some of the
details. We can consider adding that for new releases, though.
Thanks for the useful advice. I go with option 1).
The following query is created by using pg_stats_ext instead of
pg_statistic_ext and pg_statistic_ext_data. However, I was confused
about writing a part of the query for calculating MCV size because
there are four columns related to MCV. For example, most_common_vals,
most_common_val_nulls, most_common_freqs, and most_common_base_freqs.
Currently, I don't know how to calculate the size of MCV by using the
four columns. Thoughts? :-)
Well, my suggestion was to use pg_statistic_ext, because that lists
all statistics, while pg_stats_ext is filtering statistics depending
on access privileges. I think that's more appropriate for \dX, the
contents should not change depending on the user.
Also, let me clarify - with option (1) we'd not show the sizes at all.
The size of the formatted statistics may be very different from the
on-disk representation, so I see no point in showing it in \dX.
We might show other stats (e.g. number of MCV items, or the fraction
of data represented by the MCV list), but the user can inspect
pg_stats_ext if needed.
What we might do is to show those stats when a superuser is running
this command, but I'm not sure that's a good idea (or how difficult
would it be to implement).
Thanks for clarifying.
I see that your suggestion was to use pg_statistic_ext, not pg_stats_ext.
And we don't need the size of stats.
If that's the case, we also can't get the status of stats since PG12 or
later
because we can't use pg_statistic_ext_data, as you know. Therefore, it
would be
better to replace the query with the old query that I sent five months
ago like this:
# the old query
SELECT
stxnamespace::pg_catalog.regnamespace AS "Schema",
stxrelid::pg_catalog.regclass AS "Table",
stxname AS "Name",
(SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')
FROM pg_catalog.unnest(stxkeys) s(attnum)
JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND
a.attnum = s.attnum AND NOT attisdropped)) AS "Columns",
'd' = any(stxkind) AS "Ndistinct",
'f' = any(stxkind) AS "Dependencies",
'm' = any(stxkind) AS "MCV"
FROM pg_catalog.pg_statistic_ext stat
ORDER BY 1,2;
Schema | Table | Name | Columns | Ndistinct | Dependencies | MCV
--------+--------+------------+---------+-----------+--------------+-----
public | hoge1 | hoge1_ext | a, b | t | t | t
public | hoge_t | hoge_t_ext | a, b | t | t | t
(2 rows)
The above query is so simple so that we would better to use the
following query:
# This query works on PG10 or later
SELECT
es.stxnamespace::pg_catalog.regnamespace::text AS "Schema",
es.stxname AS "Name",
pg_catalog.format('%s FROM %s',
(SELECT
pg_catalog.string_agg(pg_catalog.quote_ident(a.attname),', ')
FROM pg_catalog.unnest(es.stxkeys) s(attnum)
JOIN pg_catalog.pg_attribute a
ON (es.stxrelid = a.attrelid
AND a.attnum = s.attnum
AND NOT a.attisdropped)),
es.stxrelid::regclass) AS "Definition",
CASE WHEN 'd' = any(es.stxkind) THEN 'defined'
END AS "Ndistinct",
CASE WHEN 'f' = any(es.stxkind) THEN 'defined'
END AS "Dependencies",
CASE WHEN 'm' = any(es.stxkind) THEN 'defined'
END AS "MCV"
FROM pg_catalog.pg_statistic_ext es
ORDER BY 1, 2;
Schema | Name | Definition | Ndistinct | Dependencies |
Dependencies
--------+------------+------------------+-----------+--------------+--------------
public | hoge1_ext | a, b FROM hoge1 | defined | defined |
defined
public | hoge_t_ext | a, b FROM hoge_t | defined | defined |
defined
(2 rows)
I'm going to create the WIP patch to use the above queriy.
Any comments welcome. :-D
Yes, I think using this simpler query makes sense. If we decide we need
something more elaborate, we can improve that by in future PostgreSQL
versions (after adding view/function to core), but I'd leave that as a
work for the future.
Apologies for all the extra work - I haven't realized this flaw when
pushing for showing more stuff :-(
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company