On Thu, 25 Mar 2021 at 19:59, Tomas Vondra <tomas.von...@enterprisedb.com> wrote: > > Attached is an updated patch series, with all the changes discussed > here. I've cleaned up the ndistinct stuff a bit more (essentially > reverting back from GroupExprInfo to GroupVarInfo name), and got rid of > the UpdateStatisticsForTypeChange. >
I've looked over all that and I think it's in pretty good shape. I particularly like how much simpler the ndistinct code has now become. Some (hopefully final) review comments: 1). I don't think index.c is the right place for StatisticsGetRelation(). I appreciate that it is very similar to the adjacent IndexGetRelation() function, but index.c is really only for index-related code, so I think StatisticsGetRelation() should go in statscmds.c 2). Perhaps the error message at statscmds.c:293 should read "expression cannot be used in multivariate statistics because its type %s has no default btree operator class" (i.e., add the word "multivariate", since the same expression *can* be used in univariate statistics even though it has no less-than operator). 3). The comment for ATExecAddStatistics() should probably mention that "ALTER TABLE ADD STATISTICS" isn't a command in the grammar, in a similar way to other similar functions, e.g.: /* * ALTER TABLE ADD STATISTICS * * This is no such command in the grammar, but we use this internally to add * AT_ReAddStatistics subcommands to rebuild extended statistics after a table * column type change. */ 4). The comment at the start of ATPostAlterTypeParse() needs updating to mention CREATE STATISTICS statements. 5). I think ATPostAlterTypeParse() should also attempt to preserve any COMMENTs attached to statistics objects, i.e., something like: --- src/backend/commands/tablecmds.c.orig 2021-03-26 10:39:38.328631864 +0000 +++ src/backend/commands/tablecmds.c 2021-03-26 10:47:21.042279580 +0000 @@ -12619,6 +12619,9 @@ CreateStatsStmt *stmt = (CreateStatsStmt *) stm; AlterTableCmd *newcmd; + /* keep the statistics object's comment */ + stmt->stxcomment = GetComment(oldId, StatisticExtRelationId, 0); + newcmd = makeNode(AlterTableCmd); newcmd->subtype = AT_ReAddStatistics; newcmd->def = (Node *) stmt; 6). Comment typo at extended_stats.c:2532 - s/statitics/statistics/ 7). I don't think that the big XXX comment near the start of estimate_multivariate_ndistinct() is really relevant anymore, now that the code has been simplified and we no longer extract Vars from expressions, so perhaps it can just be deleted. Regards, Dean