Looking through extended_stats.c, I found a corner case that can lead to a seg-fault:
CREATE TABLE foo(); CREATE STATISTICS s ON (1) FROM foo; ANALYSE foo; This crashes in lookup_var_attr_stats(), because it isn't expecting nvacatts to be 0. I can't think of any case where building stats on a table with no analysable columns is useful, so it should probably just exit early in that case. In BuildRelationExtStatistics(), it looks like min_attrs should be declared assert-only. In evaluate_expressions(): + /* set the pointers */ + result = (ExprInfo *) ptr; + ptr += sizeof(ExprInfo); I think that should probably have a MAXALIGN(). A slightly bigger issue that I don't like is the way it assigns attribute numbers for expressions starting from MaxHeapAttributeNumber+1, so the first expression has an attnum of 1601. That leads to pretty inefficient use of Bitmapsets, since most tables only contain a handful of columns, and there's a large block of unused space in the middle the Bitmapset. An alternative approach might be to use regular attnums for columns and use negative indexes -1, -2, -3, ... for expressions in the stored stats. Then when storing and retrieving attnums from Bitmapsets, it could just offset by STATS_MAX_DIMENSIONS (8) to avoid negative values in the Bitmapsets, since there can't be more than that many expressions (just like other code stores system attributes using FirstLowInvalidHeapAttributeNumber). That would be a somewhat bigger change, but hopefully fairly mechanical, and then some code like add_expressions_to_attributes() would go away. Looking at the new view pg_stats_ext_exprs, I noticed that it fails to show expressions until the statistics have been built. For example: CREATE TABLE foo(a int, b int); CREATE STATISTICS s ON (a+b), (a*b) FROM foo; SELECT statistics_name, tablename, expr, n_distinct FROM pg_stats_ext_exprs; statistics_name | tablename | expr | n_distinct -----------------+-----------+------+------------ s | foo | | (1 row) but after populating and analysing the table, this becomes: statistics_name | tablename | expr | n_distinct -----------------+-----------+---------+------------ s | foo | (a + b) | 11 s | foo | (a * b) | 11 (2 rows) I think it should show the expressions even before the stats have been built. Another issue is that it returns rows for non-expression stats as well. For example: CREATE TABLE foo(a int, b int); CREATE STATISTICS s ON a, b FROM foo; SELECT statistics_name, tablename, expr, n_distinct FROM pg_stats_ext_exprs; statistics_name | tablename | expr | n_distinct -----------------+-----------+------+------------ s | foo | | (1 row) and those values will never be populated, since they're not expressions, so I would expect them to not be shown in the view. So basically, instead of + LEFT JOIN LATERAL ( + SELECT + * + FROM ( + SELECT + unnest(pg_get_statisticsobjdef_expressions(s.oid)) AS expr, + unnest(sd.stxdexpr)::pg_statistic AS a + ) x + ) stat ON sd.stxdexpr IS NOT NULL; perhaps just + JOIN LATERAL ( + SELECT + * + FROM ( + SELECT + unnest(pg_get_statisticsobjdef_expressions(s.oid)) AS expr, + unnest(sd.stxdexpr)::pg_statistic AS a + ) x + ) stat ON true; Regards, Dean