On Thu, 2024-07-18 at 02:09 -0400, Corey Huinker wrote: > v23: > > Split pg_set_relation_stats into two functions: pg_set_relation_stats > with named parameters like it had around v19 and > pg_restore_relations_stats with the variadic parameters it has had in > more recent versions, which processes the variadic parameters and > then makes a call to pg_set_relation_stats. > > Split pg_set_attribute_stats into two functions: > pg_set_attribute_stats with named parameters like it had around v19 > and pg_restore_attribute_stats with the variadic parameters it has > had in more recent versions, which processes the variadic parameters > and then makes a call to pg_set_attribute_stats. > > The intention here is that the named parameters signatures are easier > for ad-hoc use, while the variadic signatures are evergreen and thus > ideal for pg_dump/pg_upgrade.
v23-0001: * I like the split for the reason you mention. I'm not 100% sure that we need both, but from the standpoint of reviewing, it makes things easier. We can always remove one at the last minute if its found to be unnecessary. I also like the names. * Doc build error and malformatting. * I'm not certain that we want all changes to relation stats to be non- transactional. Are there transactional use cases? Should it be an option? Should it be transactional for pg_set_relation_stats() but non- transactional for pg_restore_relation_stats()? * The documentation for the pg_set_attribute_stats() still refers to upgrade scenarios -- shouldn't that be in the pg_restore_attribute_stats() docs? I imagine the pg_set variant to be used for ad-hoc planner stuff rather than upgrades. * For the "WARNING: stat names must be of type text" I think we need an ERROR instead. The calling convention of name/value pairs is broken and we can't safely continue. * The huge list of "else if (strcmp(statname, mc_freqs_name) == 0) ..." seems wasteful and hard to read. I think we already discussed this, what was the reason we can't just use an array to map the arg name to an arg position type OID? * How much error checking did we decide is appropriate? Do we need to check that range_length_hist is always specified with range_empty_frac, or should we just call that the planner's problem if one is specified and the other not? Similarly, range stats for a non-range type. * I think most of the tests should be of pg_set_*_stats(). For pg_restore_, we just want to know that it's translating the name/value pairs reasonably well and throwing WARNINGs when appropriate. Then, for pg_dump tests, it should exercise pg_restore_*_stats() more completely. * It might help to clarify which arguments are important (like n_distinct) vs not. I assume the difference is that it's a non-NULLable column in pg_statistic. * Some arguments, like the relid, just seem absolutely required, and it's weird to just emit a WARNING and return false in that case. * To clarify: a return of "true" means all settings were successfully applied, whereas "false" means that some were applied and some were unrecognized, correct? Or does it also mean that some recognized options may not have been applied? * pg_set_attribute_stats(): why initialize the output tuple nulls array to false? It seems like initializing it to true would be safer. * please use a better name for "k" and add some error checking to make sure it doesn't overrun the available slots. * the pg_statistic tuple is always completely replaced, but the way you can call pg_set_attribute_stats() doesn't imply that -- calling pg_set_attribute_stats(..., most_common_vals => ..., most_common_freqs => ...) looks like it would just replace the most_common_vals+freqs and leave histogram_bounds as it was, but it actually clears histogram_bounds, right? Should we make that work or should we document better that it doesn't? Regards, Jeff Davis