> > > I'd also like feedback, though I feel very strongly that we should do what > ANALYZE does. In an upgrade situation, nearly all tables will have stats > imported, which would result in an immediate doubling of pg_class - not the > end of the world, but not great either. > > Given the recent bugs associated with inplace updates and race conditions, > if we don't want to do in-place here, we should also consider getting rid > of it for ANALYZE. I briefly pondered if it would make sense to vertically > partition pg_class into the stable attributes and the attributes that get > modified in-place, but that list is pretty long: relpages, reltuples, > relallvisible, relhasindex, reltoastrelid, relhasrules, relhastriggers, > relfrozenxid, and reminmxid, > > If we don't want to do inplace updates in pg_restore_relation_stats(), > then we could mitigate the bloat with a VACUUM FULL pg_class at the tail > end of the upgrade if stats were enabled. > > >> pg_restore_*_stats() functions. But there's a lot of overlap, so it may >> be worth discussing again whether we should only have one set of >> functions. >> > > For the reason of in-place updates and error tolerance, I think they have > to remain separate functions, but I'm also interested in hearing other's > opinions. > >
A lot of stuff has been committed, but a lot still remains, so I'm going to give a state-of-the-patchset update. WHAT HAS BEEN COMMITTED The functions pg_set_relation_stats(), and pg_set_attribute_stats(). These are new functions allowing the table owner/maintainer to inject arbitrary statistics into a table/index/matview or attribute thereof. The goal is to provide a tool to test "what if" experiments on the planner. The function pg_clear_relation_stats(). It resets pg_class stats variables back to the new-table defaults. However, the potential change to make the default -1 for partitioned tables/indexes means that this function would need to be updated. The function pg_clear_attribute_stats(). This function deletes the pg_statistic row associated with a given attribute. The functions pg_restore_relation_stats() and pg_restore_attribute_stats(). These perform a similar function to their -set counterparts, but in a way more friendly to use via pg_upgrade, and using a parameter syntax that's future-tolerant if not future-proof. NEXT STEPS, PATCHES ATTACHED 0001 Add booleans dumpSchema, dumpData to the dump/restore options structures without removing schemaOnly and dataOnly. This patch is borne of the combinatorial complexity that we would have to deal with if we kept all "should I dump this object?" logic based on schemaOnly, dataOnly, and a new statisticsOnly flag. Better to just flip these negatives to positives and resolve them once. 0002 Based on feedback from Nathan Bossart. This removes schemaOnly and dataOnly from the dump/restore option structure as they are now redundant and could cause confusion. 0003 This adds statistics import to pg_restore and pg_upgrade. This is what all of these patches have been leading up to. 0004 Importing statistics in a pg_upgrade would touch all relations (tables, indexes, matviews) in a database, meaning that each pg_class entry would be updated. If those updates are not inplace (as is done in VACUUM and ANALYZE), then we've just added near-100% bloat to pg_class. To avoid that, we re-introduce inplace updates for pg_restore_relation_stats(). 0005 Depending on the outcome of https://www.postgresql.org/message-id/be7951d424b9c16c761c39b9b2677a57fb697b1f.ca...@j-davis.com, we may have to modify pg_clear_relation_stats() to give a different default relpages depending on the relkind. If that comes to pass, then this patch will be needed. WHAT IS NOT DONE - EXTENDED STATISTICS It is a general consensus in the community that "nobody uses extended statistics", though I've had difficulty getting actual figures to back this up, even from my own employer. Surveying several vendors at PgConf.EU, the highest estimate was that at most 1% of their customers used extended statistics, though more probably should. This reinforces my belief that a feature that would eliminate a major pain point in upgrades for 99% of customers shouldn't be held back by the fact that the other 1% only have a reduced hassle. However, having relation and attribute statistics carry over on major version upgrades presents a slight problem: running vacuumdb --analyze-in-stages after such an upgrade is completely unnecessary for those without extended statistics, and would actually result in _worse_ statistics for the database until the last stage is complete. Granted, we've had great difficulty getting users to know that vacuumdb is a thing that should be run, but word has slowly spread through our own documentation and those "This one simple trick will make your postgres go fast post-upgrade" blog posts. Those posts will continue to lurk in search results long after this feature goes into release, and it would be a rude surprise to users to find out that the extra work they put in to learn about a feature that helped their upgrade in 17 was suddenly detrimental (albeit temporarily) in 18. We should never punish people for only being a little-bit current in their knowledge. Moreover, this surprise would persist even after we add extended statistics import function functionality. I presented this problem to several people at PgConf.EU, and the consensus least-bad solution was that vacuumdb should filter out tables that are not missing any statistics when using options --analyze, --analyze-only, and --analyze-in-stages, with an additional flag for now called --force-analyze to restore the un-filtered functionality. This gives the outcome tree: 1. Users who do not have extended statistics and do not use (or not even know about) vacuumdb will be blissfully unaware, and will get better post-upgrade performance. 2. Users who do not have extended statistics but use vacuumdb --analyze-in-stages will be pleasantly surprised that the vacuumdb run is almost a no-op, and completes quickly. Those who are surprised by this and re-run vacuumdb --analyze-in-stages will get another no-op. 3. Users who have extended statistics and use vacuumdb --analyze-in-stages will get a quicker vacuumdb run, as only the tables with extended stats will pass the filter. Subsequent re-runs of vacuumdb --analyze-in-stages would be the no-op. 4. Users who have extended statistics and don't use vacuumdb will still get better performance than they would have without any stats imported. In case anyone is curious, I'm defining "missing stats" as a table/matview with any of the following: 1. A table with an attribute that lacks a corresponding pg_statistic row. 2. A table with an index with an expression attribute that lacks a corresponding pg_statistic row (non-expression attributes just borrow the pg_statistic row from the table's attribute). 3. A table with at least one extended statistic that does not have a corresponding pg_statistic_ext_data row. Note that none of these criteria are concerned with the substance of the statistics (ex. pg_statistic row should have mcv stats but does not), merely their row-existence. Some rejected alternative solutions were: 1. Adding a new option --analyze-missing-stats. While simple, few people would learn about it, knowledge of it would be drowned out by the aforementioned sea of existing blog posts. 2. Adding --analyze-missing-stats and making --analyze-in-stages fail with an error message educating the user about --analyze-missing-stats. Users might not see the error, existing tooling wouldn't be able to act on the error, and there are legitimate non-upgrade uses of --analyze-in-stages. MAIN CONCERN GOING FORWARD This change to vacuumdb will require some reworking of the vacuum_one_database() function so that the list of tables analyzed is preserved across the stages, as subsequent stages runs won't be able to detect which tables were previously missing stats.
From b329af5d656873938a7fc77cd9244c32e078adf3 Mon Sep 17 00:00:00 2001 From: Corey Huinker <corey.huinker@gmail.com> Date: Sat, 4 May 2024 04:52:38 -0400 Subject: [PATCH v30 1/4] Add derivative flags dumpSchema, dumpData. User-set flags --schema-only and --data-only are often consulted by various operations to determine if they should be skipped or not. While this logic works when there are only two mutually-exclusive -only options, it will get progressively more confusing when more are added. In anticipation of this, create the flags dumpSchema and dumpData which are derivative of the existing options schemaOnly and dataOnly. This allows us to restate current skip-this-section tests in terms of what is enabled, rather than checking if the other -only mode is turned off. --- src/bin/pg_dump/pg_backup.h | 8 ++ src/bin/pg_dump/pg_dump.c | 186 ++++++++++++++++++----------------- src/bin/pg_dump/pg_restore.c | 4 + 3 files changed, 107 insertions(+), 91 deletions(-) diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h index 68ae2970ad..9bd3266a63 100644 --- a/src/bin/pg_dump/pg_backup.h +++ b/src/bin/pg_dump/pg_backup.h @@ -158,6 +158,10 @@ typedef struct _restoreOptions int enable_row_security; int sequence_data; /* dump sequence data even in schema-only mode */ int binary_upgrade; + + /* flags derived entirely from the user-settable flags */ + bool dumpSchema; + bool dumpData; } RestoreOptions; typedef struct _dumpOptions @@ -204,6 +208,10 @@ typedef struct _dumpOptions int sequence_data; /* dump sequence data even in schema-only mode */ int do_nothing; + + /* flags derived entirely from the user-settable flags */ + bool dumpSchema; + bool dumpData; } DumpOptions; /* diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index d8c6330732..3eb8523fd7 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -793,6 +793,10 @@ main(int argc, char **argv) if (dopt.if_exists && !dopt.outputClean) pg_fatal("option --if-exists requires option -c/--clean"); + /* set derivative flags */ + dopt.dumpSchema = (!dopt.dataOnly); + dopt.dumpData = (!dopt.schemaOnly); + /* * --inserts are already implied above if --column-inserts or * --rows-per-insert were specified. @@ -975,7 +979,7 @@ main(int argc, char **argv) * -s means "schema only" and LOs are data, not schema, so we never * include LOs when -s is used. */ - if (dopt.include_everything && !dopt.schemaOnly && !dopt.dontOutputLOs) + if (dopt.include_everything && dopt.dumpData && !dopt.dontOutputLOs) dopt.outputLOs = true; /* @@ -989,15 +993,15 @@ main(int argc, char **argv) */ tblinfo = getSchemaData(fout, &numTables); - if (!dopt.schemaOnly) + if (dopt.dumpData) { getTableData(&dopt, tblinfo, numTables, 0); buildMatViewRefreshDependencies(fout); - if (dopt.dataOnly) + if (!dopt.dumpSchema) getTableDataFKConstraints(); } - if (dopt.schemaOnly && dopt.sequence_data) + if (!dopt.dumpData && dopt.sequence_data) getTableData(&dopt, tblinfo, numTables, RELKIND_SEQUENCE); /* @@ -4159,8 +4163,8 @@ dumpPolicy(Archive *fout, const PolicyInfo *polinfo) const char *cmd; char *tag; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; /* @@ -4375,8 +4379,8 @@ dumpPublication(Archive *fout, const PublicationInfo *pubinfo) char *qpubname; bool first = true; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; delq = createPQExpBuffer(); @@ -4690,8 +4694,8 @@ dumpPublicationNamespace(Archive *fout, const PublicationSchemaInfo *pubsinfo) PQExpBuffer query; char *tag; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; tag = psprintf("%s %s", pubinfo->dobj.name, schemainfo->dobj.name); @@ -4733,8 +4737,8 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo) PQExpBuffer query; char *tag; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; tag = psprintf("%s %s", pubinfo->dobj.name, tbinfo->dobj.name); @@ -5119,8 +5123,8 @@ dumpSubscriptionTable(Archive *fout, const SubRelInfo *subrinfo) PQExpBuffer query; char *tag; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; Assert(fout->dopt->binary_upgrade && fout->remoteVersion >= 170000); @@ -5193,8 +5197,8 @@ dumpSubscription(Archive *fout, const SubscriptionInfo *subinfo) int i; char two_phase_disabled[] = {LOGICALREP_TWOPHASE_STATE_DISABLED, '\0'}; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; delq = createPQExpBuffer(); @@ -7345,8 +7349,8 @@ getPartitioningInfo(Archive *fout) /* hash partitioning didn't exist before v11 */ if (fout->remoteVersion < 110000) return; - /* needn't bother if schema-only dump */ - if (fout->dopt->schemaOnly) + /* needn't bother if not dumping data */ + if (!fout->dopt->dumpData) return; query = createPQExpBuffer(); @@ -8998,7 +9002,7 @@ getTableAttrs(Archive *fout, TableInfo *tblinfo, int numTables) * Now get info about column defaults. This is skipped for a data-only * dump, as it is only needed for table schemas. */ - if (!dopt->dataOnly && tbloids->len > 1) + if (dopt->dumpSchema && tbloids->len > 1) { AttrDefInfo *attrdefs; int numDefaults; @@ -9128,7 +9132,7 @@ getTableAttrs(Archive *fout, TableInfo *tblinfo, int numTables) * Get info about table CHECK constraints. This is skipped for a * data-only dump, as it is only needed for table schemas. */ - if (!dopt->dataOnly && checkoids->len > 2) + if (dopt->dumpSchema && checkoids->len > 2) { ConstraintInfo *constrs; int numConstrs; @@ -10038,13 +10042,13 @@ dumpCommentExtended(Archive *fout, const char *type, /* Comments are schema not data ... except LO comments are data */ if (strcmp(type, "LARGE OBJECT") != 0) { - if (dopt->dataOnly) + if (!dopt->dumpSchema) return; } else { /* We do dump LO comments in binary-upgrade mode */ - if (dopt->schemaOnly && !dopt->binary_upgrade) + if (!dopt->dumpData && !dopt->binary_upgrade) return; } @@ -10151,7 +10155,7 @@ dumpTableComment(Archive *fout, const TableInfo *tbinfo, return; /* Comments are SCHEMA not data */ - if (dopt->dataOnly) + if (!dopt->dumpSchema) return; /* Search for comments associated with relation, using table */ @@ -10597,8 +10601,8 @@ dumpNamespace(Archive *fout, const NamespaceInfo *nspinfo) PQExpBuffer delq; char *qnspname; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -10674,8 +10678,8 @@ dumpExtension(Archive *fout, const ExtensionInfo *extinfo) PQExpBuffer delq; char *qextname; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -10799,8 +10803,8 @@ dumpType(Archive *fout, const TypeInfo *tyinfo) { DumpOptions *dopt = fout->dopt; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; /* Dump out in proper style */ @@ -11910,8 +11914,8 @@ dumpShellType(Archive *fout, const ShellTypeInfo *stinfo) DumpOptions *dopt = fout->dopt; PQExpBuffer q; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -11962,8 +11966,8 @@ dumpProcLang(Archive *fout, const ProcLangInfo *plang) FuncInfo *inlineInfo = NULL; FuncInfo *validatorInfo = NULL; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; /* @@ -12170,8 +12174,8 @@ dumpFunc(Archive *fout, const FuncInfo *finfo) int nconfigitems = 0; const char *keyword; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; query = createPQExpBuffer(); @@ -12562,8 +12566,8 @@ dumpCast(Archive *fout, const CastInfo *cast) const char *sourceType; const char *targetType; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; /* Cannot dump if we don't have the cast function's info */ @@ -12668,8 +12672,8 @@ dumpTransform(Archive *fout, const TransformInfo *transform) char *lanname; const char *transformType; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; /* Cannot dump if we don't have the transform functions' info */ @@ -12817,8 +12821,8 @@ dumpOpr(Archive *fout, const OprInfo *oprinfo) char *oprregproc; char *oprref; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; /* @@ -13104,8 +13108,8 @@ dumpAccessMethod(Archive *fout, const AccessMethodInfo *aminfo) PQExpBuffer delq; char *qamname; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -13207,8 +13211,8 @@ dumpOpclass(Archive *fout, const OpclassInfo *opcinfo) bool needComma; int i; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; query = createPQExpBuffer(); @@ -13478,8 +13482,8 @@ dumpOpfamily(Archive *fout, const OpfamilyInfo *opfinfo) bool needComma; int i; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; query = createPQExpBuffer(); @@ -13685,8 +13689,8 @@ dumpCollation(Archive *fout, const CollInfo *collinfo) const char *colllocale; const char *collicurules; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; query = createPQExpBuffer(); @@ -13939,8 +13943,8 @@ dumpConversion(Archive *fout, const ConvInfo *convinfo) const char *conproc; bool condefault; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; query = createPQExpBuffer(); @@ -14087,8 +14091,8 @@ dumpAgg(Archive *fout, const AggInfo *agginfo) const char *proparallel; char defaultfinalmodify; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; query = createPQExpBuffer(); @@ -14417,8 +14421,8 @@ dumpTSParser(Archive *fout, const TSParserInfo *prsinfo) PQExpBuffer delq; char *qprsname; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -14485,8 +14489,8 @@ dumpTSDictionary(Archive *fout, const TSDictInfo *dictinfo) char *nspname; char *tmplname; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -14561,8 +14565,8 @@ dumpTSTemplate(Archive *fout, const TSTemplateInfo *tmplinfo) PQExpBuffer delq; char *qtmplname; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -14627,8 +14631,8 @@ dumpTSConfig(Archive *fout, const TSConfigInfo *cfginfo) int i_tokenname; int i_dictname; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -14739,8 +14743,8 @@ dumpForeignDataWrapper(Archive *fout, const FdwInfo *fdwinfo) PQExpBuffer delq; char *qfdwname; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -14812,8 +14816,8 @@ dumpForeignServer(Archive *fout, const ForeignServerInfo *srvinfo) char *qsrvname; char *fdwname; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -15003,8 +15007,8 @@ dumpDefaultACL(Archive *fout, const DefaultACLInfo *daclinfo) PQExpBuffer tag; const char *type; - /* Do nothing in data-only dump, or if we're skipping ACLs */ - if (dopt->dataOnly || dopt->aclsSkip) + /* Do nothing if not dumping schema, or if we're skipping ACLs */ + if (!dopt->dumpSchema || dopt->aclsSkip) return; q = createPQExpBuffer(); @@ -15104,7 +15108,7 @@ dumpACL(Archive *fout, DumpId objDumpId, DumpId altDumpId, return InvalidDumpId; /* --data-only skips ACLs *except* large object ACLs */ - if (dopt->dataOnly && strcmp(type, "LARGE OBJECT") != 0) + if (!dopt->dumpSchema && strcmp(type, "LARGE OBJECT") != 0) return InvalidDumpId; sql = createPQExpBuffer(); @@ -15233,13 +15237,13 @@ dumpSecLabel(Archive *fout, const char *type, const char *name, */ if (strcmp(type, "LARGE OBJECT") != 0) { - if (dopt->dataOnly) + if (!dopt->dumpSchema) return; } else { /* We do dump large object security labels in binary-upgrade mode */ - if (dopt->schemaOnly && !dopt->binary_upgrade) + if (!dopt->dumpData && !dopt->binary_upgrade) return; } @@ -15307,7 +15311,7 @@ dumpTableSecLabel(Archive *fout, const TableInfo *tbinfo, const char *reltypenam return; /* SecLabel are SCHEMA not data */ - if (dopt->dataOnly) + if (!dopt->dumpSchema) return; /* Search for comments associated with relation, using table */ @@ -15546,8 +15550,8 @@ dumpTable(Archive *fout, const TableInfo *tbinfo) DumpId tableAclDumpId = InvalidDumpId; char *namecopy; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; if (tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION) @@ -16643,8 +16647,8 @@ dumpTableAttach(Archive *fout, const TableAttachInfo *attachinfo) PGresult *res; char *partbound; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -16715,8 +16719,8 @@ dumpAttrDef(Archive *fout, const AttrDefInfo *adinfo) char *tag; char *foreign; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; /* Skip if not "separate"; it was dumped in the table's definition */ @@ -16804,8 +16808,8 @@ dumpIndex(Archive *fout, const IndxInfo *indxinfo) char *qindxname; char *qqindxname; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -16937,8 +16941,8 @@ dumpIndex(Archive *fout, const IndxInfo *indxinfo) static void dumpIndexAttach(Archive *fout, const IndexAttachInfo *attachinfo) { - /* Do nothing in data-only dump */ - if (fout->dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!fout->dopt->dumpSchema) return; if (attachinfo->partitionIdx->dobj.dump & DUMP_COMPONENT_DEFINITION) @@ -16984,8 +16988,8 @@ dumpStatisticsExt(Archive *fout, const StatsExtInfo *statsextinfo) PGresult *res; char *stxdef; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -17060,8 +17064,8 @@ dumpConstraint(Archive *fout, const ConstraintInfo *coninfo) char *tag = NULL; char *foreign; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; q = createPQExpBuffer(); @@ -17797,8 +17801,8 @@ dumpTrigger(Archive *fout, const TriggerInfo *tginfo) char *qtabname; char *tag; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; query = createPQExpBuffer(); @@ -17919,8 +17923,8 @@ dumpEventTrigger(Archive *fout, const EventTriggerInfo *evtinfo) PQExpBuffer delqry; char *qevtname; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; query = createPQExpBuffer(); @@ -18010,8 +18014,8 @@ dumpRule(Archive *fout, const RuleInfo *rinfo) PGresult *res; char *tag; - /* Do nothing in data-only dump */ - if (dopt->dataOnly) + /* Do nothing if not dumping schema */ + if (!dopt->dumpSchema) return; /* diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c index df119591cc..3475168a64 100644 --- a/src/bin/pg_dump/pg_restore.c +++ b/src/bin/pg_dump/pg_restore.c @@ -360,6 +360,10 @@ main(int argc, char **argv) if (opts->single_txn && numWorkers > 1) pg_fatal("cannot specify both --single-transaction and multiple jobs"); + /* set derivative flags */ + opts->dumpSchema = (opts->dataOnly != 1); + opts->dumpData = (opts->schemaOnly != 1); + opts->disable_triggers = disable_triggers; opts->enable_row_security = enable_row_security; opts->noDataForFailedTables = no_data_for_failed_tables; -- 2.47.0
From 1adbdb01db2a0498d247fd2f3a860945a12ab228 Mon Sep 17 00:00:00 2001 From: Corey Huinker <corey.huinker@gmail.com> Date: Mon, 4 Nov 2024 14:02:57 -0500 Subject: [PATCH v30 4/4] Enable in-place updates for pg_restore_relation_stats. This matches the behavior of the ANALYZE command, and would avoid bloating pg_class in an upgrade situation wherein pg_restore_relation_stats would be called for nearly every relation in the database. --- src/backend/statistics/relation_stats.c | 72 +++++++++++++++++++------ 1 file changed, 55 insertions(+), 17 deletions(-) diff --git a/src/backend/statistics/relation_stats.c b/src/backend/statistics/relation_stats.c index ed5dea2e05..939ad56fc2 100644 --- a/src/backend/statistics/relation_stats.c +++ b/src/backend/statistics/relation_stats.c @@ -22,6 +22,7 @@ #include "statistics/stat_utils.h" #include "utils/fmgrprotos.h" #include "utils/syscache.h" +#include "utils/fmgroids.h" #define DEFAULT_RELPAGES Int32GetDatum(0) #define DEFAULT_RELTUPLES Float4GetDatum(-1.0) @@ -50,13 +51,14 @@ static struct StatsArgInfo relarginfo[] = [NUM_RELATION_STATS_ARGS] = {0} }; -static bool relation_statistics_update(FunctionCallInfo fcinfo, int elevel); +static bool relation_statistics_update(FunctionCallInfo fcinfo, int elevel, + bool inplace); /* * Internal function for modifying statistics for a relation. */ static bool -relation_statistics_update(FunctionCallInfo fcinfo, int elevel) +relation_statistics_update(FunctionCallInfo fcinfo, int elevel, bool inplace) { Oid reloid; Relation crel; @@ -68,6 +70,7 @@ relation_statistics_update(FunctionCallInfo fcinfo, int elevel) int ncols = 0; TupleDesc tupdesc; bool result = true; + void *inplace_state; stats_check_required_arg(fcinfo, relarginfo, RELATION_ARG); reloid = PG_GETARG_OID(RELATION_ARG); @@ -81,7 +84,20 @@ relation_statistics_update(FunctionCallInfo fcinfo, int elevel) crel = table_open(RelationRelationId, RowExclusiveLock); tupdesc = RelationGetDescr(crel); - ctup = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(reloid)); + if (inplace) + { + ScanKeyData key[1]; + + ctup = NULL; + + ScanKeyInit(&key[0], Anum_pg_class_oid, BTEqualStrategyNumber, F_OIDEQ, + ObjectIdGetDatum(reloid)); + systable_inplace_update_begin(crel, ClassOidIndexId, true, NULL, 1, key, + &ctup, &inplace_state); + } + else + ctup = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(reloid)); + if (!HeapTupleIsValid(ctup)) { ereport(elevel, @@ -112,8 +128,13 @@ relation_statistics_update(FunctionCallInfo fcinfo, int elevel) } else if (relpages != pgcform->relpages) { - replaces[ncols] = Anum_pg_class_relpages; - values[ncols] = Int32GetDatum(relpages); + if (inplace) + pgcform->relpages = relpages; + else + { + replaces[ncols] = Anum_pg_class_relpages; + values[ncols] = Int32GetDatum(relpages); + } ncols++; } } @@ -131,8 +152,13 @@ relation_statistics_update(FunctionCallInfo fcinfo, int elevel) } else if (reltuples != pgcform->reltuples) { - replaces[ncols] = Anum_pg_class_reltuples; - values[ncols] = Float4GetDatum(reltuples); + if (inplace) + pgcform->reltuples = reltuples; + else + { + replaces[ncols] = Anum_pg_class_reltuples; + values[ncols] = Float4GetDatum(reltuples); + } ncols++; } @@ -151,8 +177,13 @@ relation_statistics_update(FunctionCallInfo fcinfo, int elevel) } else if (relallvisible != pgcform->relallvisible) { - replaces[ncols] = Anum_pg_class_relallvisible; - values[ncols] = Int32GetDatum(relallvisible); + if (inplace) + pgcform->relallvisible = relallvisible; + else + { + replaces[ncols] = Anum_pg_class_relallvisible; + values[ncols] = Int32GetDatum(relallvisible); + } ncols++; } } @@ -160,13 +191,20 @@ relation_statistics_update(FunctionCallInfo fcinfo, int elevel) /* only update pg_class if there is a meaningful change */ if (ncols > 0) { - HeapTuple newtup; + if (inplace) + systable_inplace_update_finish(inplace_state, ctup); + else + { + HeapTuple newtup; - newtup = heap_modify_tuple_by_cols(ctup, tupdesc, ncols, replaces, values, - nulls); - CatalogTupleUpdate(crel, &newtup->t_self, newtup); - heap_freetuple(newtup); + newtup = heap_modify_tuple_by_cols(ctup, tupdesc, ncols, replaces, values, + nulls); + CatalogTupleUpdate(crel, &newtup->t_self, newtup); + heap_freetuple(newtup); + } } + else if (inplace) + systable_inplace_update_cancel(inplace_state); /* release the lock, consistent with vac_update_relstats() */ table_close(crel, RowExclusiveLock); @@ -182,7 +220,7 @@ relation_statistics_update(FunctionCallInfo fcinfo, int elevel) Datum pg_set_relation_stats(PG_FUNCTION_ARGS) { - relation_statistics_update(fcinfo, ERROR); + relation_statistics_update(fcinfo, ERROR, false); PG_RETURN_VOID(); } @@ -206,7 +244,7 @@ pg_clear_relation_stats(PG_FUNCTION_ARGS) newfcinfo->args[3].value = DEFAULT_RELALLVISIBLE; newfcinfo->args[3].isnull = false; - relation_statistics_update(newfcinfo, ERROR); + relation_statistics_update(newfcinfo, ERROR, false); PG_RETURN_VOID(); } @@ -224,7 +262,7 @@ pg_restore_relation_stats(PG_FUNCTION_ARGS) relarginfo, WARNING)) result = false; - if (!relation_statistics_update(positional_fcinfo, WARNING)) + if (!relation_statistics_update(positional_fcinfo, WARNING, true)) result = false; PG_RETURN_BOOL(result); -- 2.47.0
From beca392b00f53b33438f0d019286239e45631cea Mon Sep 17 00:00:00 2001 From: Corey Huinker <corey.huinker@gmail.com> Date: Fri, 1 Nov 2024 11:56:34 -0400 Subject: [PATCH v30 2/4] Remove schemaOnly, dataOnly from dump/restore options structures. The user-facing flags still exist, but the results of those flags are already resolved into dumpSchema and dumpData. All logic about which objects to dump which previously used schemaOnly and dataOnly in the negative (ex. we should dump large objects because schemaOnly is NOT set), which would have gotten unweildly when a third option (statisticsOnly) was added. --- src/bin/pg_dump/pg_backup.h | 4 ---- src/bin/pg_dump/pg_backup_archiver.c | 26 +++++++++++++------------- src/bin/pg_dump/pg_dump.c | 26 ++++++++++++++------------ src/bin/pg_dump/pg_restore.c | 15 +++++++++------ 4 files changed, 36 insertions(+), 35 deletions(-) diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h index 9bd3266a63..f74e848714 100644 --- a/src/bin/pg_dump/pg_backup.h +++ b/src/bin/pg_dump/pg_backup.h @@ -116,8 +116,6 @@ typedef struct _restoreOptions int strict_names; const char *filename; - int dataOnly; - int schemaOnly; int dumpSections; int verbose; int aclsSkip; @@ -171,8 +169,6 @@ typedef struct _dumpOptions int binary_upgrade; /* various user-settable parameters */ - bool schemaOnly; - bool dataOnly; int dumpSections; /* bitmask of chosen sections */ bool aclsSkip; const char *lockWaitTimeout; diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c index 8c20c263c4..32c8e588f5 100644 --- a/src/bin/pg_dump/pg_backup_archiver.c +++ b/src/bin/pg_dump/pg_backup_archiver.c @@ -165,8 +165,8 @@ dumpOptionsFromRestoreOptions(RestoreOptions *ropt) dopt->cparams.username = ropt->cparams.username ? pg_strdup(ropt->cparams.username) : NULL; dopt->cparams.promptPassword = ropt->cparams.promptPassword; dopt->outputClean = ropt->dropSchema; - dopt->dataOnly = ropt->dataOnly; - dopt->schemaOnly = ropt->schemaOnly; + dopt->dumpData = ropt->dumpData; + dopt->dumpSchema = ropt->dumpSchema; dopt->if_exists = ropt->if_exists; dopt->column_inserts = ropt->column_inserts; dopt->dumpSections = ropt->dumpSections; @@ -419,12 +419,12 @@ RestoreArchive(Archive *AHX) * Work out if we have an implied data-only restore. This can happen if * the dump was data only or if the user has used a toc list to exclude * all of the schema data. All we do is look for schema entries - if none - * are found then we set the dataOnly flag. + * are found then we set the data-only flag. * * We could scan for wanted TABLE entries, but that is not the same as - * dataOnly. At this stage, it seems unnecessary (6-Mar-2001). + * data-only. At this stage, it seems unnecessary (6-Mar-2001). */ - if (!ropt->dataOnly) + if (ropt->dumpSchema) { int impliedDataOnly = 1; @@ -438,7 +438,7 @@ RestoreArchive(Archive *AHX) } if (impliedDataOnly) { - ropt->dataOnly = impliedDataOnly; + ropt->dumpSchema = false; pg_log_info("implied data-only restore"); } } @@ -824,7 +824,7 @@ restore_toc_entry(ArchiveHandle *AH, TocEntry *te, bool is_parallel) /* Dump any relevant dump warnings to stderr */ if (!ropt->suppressDumpWarnings && strcmp(te->desc, "WARNING") == 0) { - if (!ropt->dataOnly && te->defn != NULL && strlen(te->defn) != 0) + if (ropt->dumpSchema && te->defn != NULL && strlen(te->defn) != 0) pg_log_warning("warning from original dump file: %s", te->defn); else if (te->copyStmt != NULL && strlen(te->copyStmt) != 0) pg_log_warning("warning from original dump file: %s", te->copyStmt); @@ -1090,7 +1090,7 @@ _disableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te) RestoreOptions *ropt = AH->public.ropt; /* This hack is only needed in a data-only restore */ - if (!ropt->dataOnly || !ropt->disable_triggers) + if (ropt->dumpSchema || !ropt->disable_triggers) return; pg_log_info("disabling triggers for %s", te->tag); @@ -1116,7 +1116,7 @@ _enableTriggersIfNecessary(ArchiveHandle *AH, TocEntry *te) RestoreOptions *ropt = AH->public.ropt; /* This hack is only needed in a data-only restore */ - if (!ropt->dataOnly || !ropt->disable_triggers) + if (ropt->dumpSchema || !ropt->disable_triggers) return; pg_log_info("enabling triggers for %s", te->tag); @@ -3148,12 +3148,12 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH) return 0; /* Mask it if we only want schema */ - if (ropt->schemaOnly) + if (!ropt->dumpData) { /* - * The sequence_data option overrides schemaOnly for SEQUENCE SET. + * The sequence_data option overrides schema-only for SEQUENCE SET. * - * In binary-upgrade mode, even with schemaOnly set, we do not mask + * In binary-upgrade mode, even with schema-only set, we do not mask * out large objects. (Only large object definitions, comments and * other metadata should be generated in binary-upgrade mode, not the * actual data, but that need not concern us here.) @@ -3172,7 +3172,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH) } /* Mask it if we only want data */ - if (ropt->dataOnly) + if (!ropt->dumpSchema) res = res & REQ_DATA; return res; diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 3eb8523fd7..d5d2e51be2 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -425,6 +425,8 @@ main(int argc, char **argv) char *compression_algorithm_str = "none"; char *error_detail = NULL; bool user_compression_defined = false; + bool data_only = false; + bool schema_only = false; DataDirSyncMethod sync_method = DATA_DIR_SYNC_METHOD_FSYNC; static DumpOptions dopt; @@ -541,7 +543,7 @@ main(int argc, char **argv) switch (c) { case 'a': /* Dump data only */ - dopt.dataOnly = true; + data_only = true; break; case 'b': /* Dump LOs */ @@ -614,7 +616,7 @@ main(int argc, char **argv) break; case 's': /* dump schema only */ - dopt.schemaOnly = true; + schema_only = true; break; case 'S': /* Username for superuser in plain text output */ @@ -778,24 +780,24 @@ main(int argc, char **argv) if (dopt.binary_upgrade) dopt.sequence_data = 1; - if (dopt.dataOnly && dopt.schemaOnly) + if (data_only && schema_only) pg_fatal("options -s/--schema-only and -a/--data-only cannot be used together"); - if (dopt.schemaOnly && foreign_servers_include_patterns.head != NULL) + if (schema_only && foreign_servers_include_patterns.head != NULL) pg_fatal("options -s/--schema-only and --include-foreign-data cannot be used together"); if (numWorkers > 1 && foreign_servers_include_patterns.head != NULL) pg_fatal("option --include-foreign-data is not supported with parallel backup"); - if (dopt.dataOnly && dopt.outputClean) + if (data_only && dopt.outputClean) pg_fatal("options -c/--clean and -a/--data-only cannot be used together"); if (dopt.if_exists && !dopt.outputClean) pg_fatal("option --if-exists requires option -c/--clean"); /* set derivative flags */ - dopt.dumpSchema = (!dopt.dataOnly); - dopt.dumpData = (!dopt.schemaOnly); + dopt.dumpSchema = (!data_only); + dopt.dumpData = (!schema_only); /* * --inserts are already implied above if --column-inserts or @@ -1093,8 +1095,8 @@ main(int argc, char **argv) ropt->cparams.username = dopt.cparams.username ? pg_strdup(dopt.cparams.username) : NULL; ropt->cparams.promptPassword = dopt.cparams.promptPassword; ropt->dropSchema = dopt.outputClean; - ropt->dataOnly = dopt.dataOnly; - ropt->schemaOnly = dopt.schemaOnly; + ropt->dumpData = dopt.dumpData; + ropt->dumpSchema = dopt.dumpSchema; ropt->if_exists = dopt.if_exists; ropt->column_inserts = dopt.column_inserts; ropt->dumpSections = dopt.dumpSections; @@ -1989,7 +1991,7 @@ selectDumpableType(TypeInfo *tyinfo, Archive *fout) * Mark a default ACL as to be dumped or not * * For per-schema default ACLs, dump if the schema is to be dumped. - * Otherwise dump if we are dumping "everything". Note that dataOnly + * Otherwise dump if we are dumping "everything". Note that data-only * and aclsSkip are checked separately. */ static void @@ -17414,7 +17416,7 @@ collectSequences(Archive *fout) if (fout->remoteVersion < 100000) return; else if (fout->remoteVersion < 180000 || - (fout->dopt->schemaOnly && !fout->dopt->sequence_data)) + (!fout->dopt->dumpData && !fout->dopt->sequence_data)) query = "SELECT seqrelid, format_type(seqtypid, NULL), " "seqstart, seqincrement, " "seqmax, seqmin, " @@ -18281,7 +18283,7 @@ processExtensionTables(Archive *fout, ExtensionInfo extinfo[], * objects for them, ensuring their data will be dumped even though the * tables themselves won't be. * - * Note that we create TableDataInfo objects even in schemaOnly mode, ie, + * Note that we create TableDataInfo objects even in schema-only mode, ie, * user data in a configuration table is treated like schema data. This * seems appropriate since system data in a config table would get * reloaded by CREATE EXTENSION. If the extension is not listed in the diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c index 3475168a64..f6e04e3a3b 100644 --- a/src/bin/pg_dump/pg_restore.c +++ b/src/bin/pg_dump/pg_restore.c @@ -77,6 +77,9 @@ main(int argc, char **argv) static int no_subscriptions = 0; static int strict_names = 0; + bool data_only = false; + bool schema_only = false; + struct option cmdopts[] = { {"clean", 0, NULL, 'c'}, {"create", 0, NULL, 'C'}, @@ -161,7 +164,7 @@ main(int argc, char **argv) switch (c) { case 'a': /* Dump data only */ - opts->dataOnly = 1; + data_only = true; break; case 'c': /* clean (i.e., drop) schema prior to create */ opts->dropSchema = 1; @@ -237,7 +240,7 @@ main(int argc, char **argv) simple_string_list_append(&opts->triggerNames, optarg); break; case 's': /* dump schema only */ - opts->schemaOnly = 1; + schema_only = true; break; case 'S': /* Superuser username */ if (strlen(optarg) != 0) @@ -340,10 +343,10 @@ main(int argc, char **argv) opts->useDB = 1; } - if (opts->dataOnly && opts->schemaOnly) + if (data_only && schema_only) pg_fatal("options -s/--schema-only and -a/--data-only cannot be used together"); - if (opts->dataOnly && opts->dropSchema) + if (data_only && opts->dropSchema) pg_fatal("options -c/--clean and -a/--data-only cannot be used together"); if (opts->single_txn && opts->txn_size > 0) @@ -361,8 +364,8 @@ main(int argc, char **argv) pg_fatal("cannot specify both --single-transaction and multiple jobs"); /* set derivative flags */ - opts->dumpSchema = (opts->dataOnly != 1); - opts->dumpData = (opts->schemaOnly != 1); + opts->dumpSchema = (!data_only); + opts->dumpData = (!schema_only); opts->disable_triggers = disable_triggers; opts->enable_row_security = enable_row_security; -- 2.47.0
From 85969bb9df4049b175ada9f613efd55e71055f7e Mon Sep 17 00:00:00 2001 From: Corey Huinker <corey.huinker@gmail.com> Date: Sat, 16 Mar 2024 17:21:10 -0400 Subject: [PATCH v30 3/4] Enable dumping of table/index stats in pg_dump. For each table/matview/index dumped, it will generate a statement that calls pg_set_relation_stats(), and it will generate a series of statements that call pg_set_attribute_stats(), one per attribute. These statements will restore the statistics of the current system onto the destination system. Adds the command-line options -X / --statistics-only, which are mutually exclusive to --schema-only and --data-only. Statistics are not dumped when --schema-only is specified, except during a binary upgrade. As is the pattern with pg_dump options, staistics can be disabled using --no-statistics. Table statistics are dumped in the data section. This is true even if dumping stats in a binary upgrade. Index and Materialized View statistics are dumped in the post-data section. --- src/bin/pg_dump/pg_backup.h | 4 + src/bin/pg_dump/pg_backup_archiver.c | 5 + src/bin/pg_dump/pg_dump.c | 395 ++++++++++++++++++++++++++- src/bin/pg_dump/pg_dump.h | 8 + src/bin/pg_dump/pg_dump_sort.c | 5 + src/bin/pg_dump/pg_dumpall.c | 5 + src/bin/pg_dump/pg_restore.c | 18 +- src/bin/pg_dump/t/001_basic.pl | 18 ++ doc/src/sgml/ref/pg_dump.sgml | 36 ++- doc/src/sgml/ref/pg_restore.sgml | 31 ++- 10 files changed, 509 insertions(+), 16 deletions(-) diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h index f74e848714..b03b69ff00 100644 --- a/src/bin/pg_dump/pg_backup.h +++ b/src/bin/pg_dump/pg_backup.h @@ -113,6 +113,7 @@ typedef struct _restoreOptions int no_publications; /* Skip publication entries */ int no_security_labels; /* Skip security label entries */ int no_subscriptions; /* Skip subscription entries */ + int no_statistics; /* Skip statistics import */ int strict_names; const char *filename; @@ -160,6 +161,7 @@ typedef struct _restoreOptions /* flags derived entirely from the user-settable flags */ bool dumpSchema; bool dumpData; + bool dumpStatistics; } RestoreOptions; typedef struct _dumpOptions @@ -182,6 +184,7 @@ typedef struct _dumpOptions int no_security_labels; int no_publications; int no_subscriptions; + int no_statistics; int no_toast_compression; int no_unlogged_table_data; int serializable_deferrable; @@ -208,6 +211,7 @@ typedef struct _dumpOptions /* flags derived entirely from the user-settable flags */ bool dumpSchema; bool dumpData; + bool dumpStatistics; } DumpOptions; /* diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c index 32c8e588f5..4edcb4eca5 100644 --- a/src/bin/pg_dump/pg_backup_archiver.c +++ b/src/bin/pg_dump/pg_backup_archiver.c @@ -2958,6 +2958,10 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH) if (ropt->no_subscriptions && strcmp(te->desc, "SUBSCRIPTION") == 0) return 0; + /* If it's a stats dump, maybe ignore it */ + if (ropt->no_statistics && strcmp(te->desc, "STATISTICS") == 0) + return 0; + /* Ignore it if section is not to be dumped/restored */ switch (curSection) { @@ -2987,6 +2991,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH) */ if (strcmp(te->desc, "ACL") == 0 || strcmp(te->desc, "COMMENT") == 0 || + strcmp(te->desc, "STATISTICS") == 0 || strcmp(te->desc, "SECURITY LABEL") == 0) { /* Database properties react to createDB, not selectivity options. */ diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index d5d2e51be2..1f1845c396 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -427,6 +427,7 @@ main(int argc, char **argv) bool user_compression_defined = false; bool data_only = false; bool schema_only = false; + bool statistics_only = false; DataDirSyncMethod sync_method = DATA_DIR_SYNC_METHOD_FSYNC; static DumpOptions dopt; @@ -464,6 +465,7 @@ main(int argc, char **argv) {"encoding", required_argument, NULL, 'E'}, {"help", no_argument, NULL, '?'}, {"version", no_argument, NULL, 'V'}, + {"statistics-only", no_argument, NULL, 'X'}, /* * the following options don't have an equivalent short option letter @@ -492,6 +494,7 @@ main(int argc, char **argv) {"no-comments", no_argument, &dopt.no_comments, 1}, {"no-publications", no_argument, &dopt.no_publications, 1}, {"no-security-labels", no_argument, &dopt.no_security_labels, 1}, + {"no-statistics", no_argument, &dopt.no_statistics, 1}, {"no-subscriptions", no_argument, &dopt.no_subscriptions, 1}, {"no-toast-compression", no_argument, &dopt.no_toast_compression, 1}, {"no-unlogged-table-data", no_argument, &dopt.no_unlogged_table_data, 1}, @@ -537,7 +540,7 @@ main(int argc, char **argv) InitDumpOptions(&dopt); - while ((c = getopt_long(argc, argv, "abBcCd:e:E:f:F:h:j:n:N:Op:RsS:t:T:U:vwWxZ:", + while ((c = getopt_long(argc, argv, "abBcCd:e:E:f:F:h:j:n:N:Op:PRsS:t:T:U:vwWxXZ:", long_options, &optindex)) != -1) { switch (c) @@ -611,6 +614,10 @@ main(int argc, char **argv) dopt.cparams.pgport = pg_strdup(optarg); break; + case 'X': /* Dump statistics only */ + statistics_only = true; + break; + case 'R': /* no-op, still accepted for backwards compatibility */ break; @@ -782,6 +789,13 @@ main(int argc, char **argv) if (data_only && schema_only) pg_fatal("options -s/--schema-only and -a/--data-only cannot be used together"); + if (schema_only && statistics_only) + pg_fatal("options -s/--schema-only and -X/--statistics-only cannot be used together"); + if (data_only && statistics_only) + pg_fatal("options -a/--data-only and -X/--statistics-only cannot be used together"); + + if (statistics_only && dopt.no_statistics) + pg_fatal("options -X/--statistics-only and --no-statistics cannot be used together"); if (schema_only && foreign_servers_include_patterns.head != NULL) pg_fatal("options -s/--schema-only and --include-foreign-data cannot be used together"); @@ -796,8 +810,20 @@ main(int argc, char **argv) pg_fatal("option --if-exists requires option -c/--clean"); /* set derivative flags */ - dopt.dumpSchema = (!data_only); - dopt.dumpData = (!schema_only); + dopt.dumpData = data_only || (!schema_only && !statistics_only); + dopt.dumpSchema = schema_only || (!data_only && !statistics_only); + + if (statistics_only) + /* stats are only thing wanted */ + dopt.dumpStatistics = true; + else if (dopt.no_statistics) + /* stats specifically excluded */ + dopt.dumpStatistics = false; + else if (dopt.binary_upgrade) + /* binary upgrade and not specifically excluded */ + dopt.dumpStatistics = true; + else + dopt.dumpStatistics = (!data_only && !schema_only); /* * --inserts are already implied above if --column-inserts or @@ -1097,6 +1123,7 @@ main(int argc, char **argv) ropt->dropSchema = dopt.outputClean; ropt->dumpData = dopt.dumpData; ropt->dumpSchema = dopt.dumpSchema; + ropt->dumpStatistics = dopt.dumpStatistics; ropt->if_exists = dopt.if_exists; ropt->column_inserts = dopt.column_inserts; ropt->dumpSections = dopt.dumpSections; @@ -1175,7 +1202,7 @@ help(const char *progname) printf(_(" -?, --help show this help, then exit\n")); printf(_("\nOptions controlling the output content:\n")); - printf(_(" -a, --data-only dump only the data, not the schema\n")); + printf(_(" -a, --data-only dump only the data, not the schema or statistics\n")); printf(_(" -b, --large-objects include large objects in dump\n")); printf(_(" --blobs (same as --large-objects, deprecated)\n")); printf(_(" -B, --no-large-objects exclude large objects in dump\n")); @@ -1188,11 +1215,12 @@ help(const char *progname) printf(_(" -N, --exclude-schema=PATTERN do NOT dump the specified schema(s)\n")); printf(_(" -O, --no-owner skip restoration of object ownership in\n" " plain-text format\n")); - printf(_(" -s, --schema-only dump only the schema, no data\n")); + printf(_(" -s, --schema-only dump only the schema, no data or statistics\n")); printf(_(" -S, --superuser=NAME superuser user name to use in plain-text format\n")); printf(_(" -t, --table=PATTERN dump only the specified table(s)\n")); printf(_(" -T, --exclude-table=PATTERN do NOT dump the specified table(s)\n")); printf(_(" -x, --no-privileges do not dump privileges (grant/revoke)\n")); + printf(_(" -X, --statistics-only dump only the statistics, not schema or data\n")); printf(_(" --binary-upgrade for use by upgrade utilities only\n")); printf(_(" --column-inserts dump data as INSERT commands with column names\n")); printf(_(" --disable-dollar-quoting disable dollar quoting, use SQL standard quoting\n")); @@ -1219,6 +1247,7 @@ help(const char *progname) printf(_(" --no-comments do not dump comments\n")); printf(_(" --no-publications do not dump publications\n")); printf(_(" --no-security-labels do not dump security label assignments\n")); + printf(_(" --no-statistics do not dump statistics\n")); printf(_(" --no-subscriptions do not dump subscriptions\n")); printf(_(" --no-table-access-method do not dump table access methods\n")); printf(_(" --no-tablespaces do not dump tablespace assignments\n")); @@ -6769,6 +6798,42 @@ getFuncs(Archive *fout) destroyPQExpBuffer(query); } +/* + * getRelationStatistics + * register the statistics object as a dependent of the relation. + * + */ +static RelStatsInfo * +getRelationStatistics(Archive *fout, DumpableObject *rel, char relkind) +{ + if ((relkind == RELKIND_RELATION) || + (relkind == RELKIND_PARTITIONED_TABLE) || + (relkind == RELKIND_INDEX) || + (relkind == RELKIND_PARTITIONED_INDEX) || + (relkind == RELKIND_MATVIEW)) + { + RelStatsInfo *info = pg_malloc0(sizeof(RelStatsInfo)); + DumpableObject *dobj = &info->dobj; + + dobj->objType = DO_REL_STATS; + dobj->catId.tableoid = 0; + dobj->catId.oid = 0; + AssignDumpId(dobj); + dobj->dependencies = (DumpId *) pg_malloc(sizeof(DumpId)); + dobj->dependencies[0] = rel->dumpId; + dobj->nDeps = 1; + dobj->allocDeps = 1; + dobj->components |= DUMP_COMPONENT_STATISTICS; + dobj->dump = rel->dump; + dobj->name = pg_strdup(rel->name); + dobj->namespace = rel->namespace; + info->relkind = relkind; + + return info; + } + return NULL; +} + /* * getTables * read all the tables (no indexes) in the system catalogs, @@ -7146,6 +7211,7 @@ getTables(Archive *fout, int *numTables) /* Tables have data */ tblinfo[i].dobj.components |= DUMP_COMPONENT_DATA; + tblinfo[i].dobj.components |= DUMP_COMPONENT_STATISTICS; /* Mark whether table has an ACL */ if (!PQgetisnull(res, i, i_relacl)) @@ -7194,6 +7260,8 @@ getTables(Archive *fout, int *numTables) } } } + if (tblinfo[i].interesting) + getRelationStatistics(fout, &tblinfo[i].dobj, tblinfo[i].relkind); } if (query->len != 0) @@ -7638,11 +7706,14 @@ getIndexes(Archive *fout, TableInfo tblinfo[], int numTables) for (int c = 0; c < numinds; c++, j++) { char contype; + char indexkind; + RelStatsInfo *relstats; indxinfo[j].dobj.objType = DO_INDEX; indxinfo[j].dobj.catId.tableoid = atooid(PQgetvalue(res, j, i_tableoid)); indxinfo[j].dobj.catId.oid = atooid(PQgetvalue(res, j, i_oid)); AssignDumpId(&indxinfo[j].dobj); + indxinfo[j].dobj.components |= DUMP_COMPONENT_STATISTICS; indxinfo[j].dobj.dump = tbinfo->dobj.dump; indxinfo[j].dobj.name = pg_strdup(PQgetvalue(res, j, i_indexname)); indxinfo[j].dobj.namespace = tbinfo->dobj.namespace; @@ -7665,7 +7736,14 @@ getIndexes(Archive *fout, TableInfo tblinfo[], int numTables) { NULL, NULL }; + + if (indxinfo[j].parentidx == 0) + indexkind = RELKIND_INDEX; + else + indexkind = RELKIND_PARTITIONED_INDEX; + contype = *(PQgetvalue(res, j, i_contype)); + relstats = getRelationStatistics(fout, &indxinfo[j].dobj, indexkind); if (contype == 'p' || contype == 'u' || contype == 'x') { @@ -7699,6 +7777,8 @@ getIndexes(Archive *fout, TableInfo tblinfo[], int numTables) constrinfo->separate = true; indxinfo[j].indexconstraint = constrinfo->dobj.dumpId; + if (relstats != NULL) + addObjectDependency(&relstats->dobj, constrinfo->dobj.dumpId); } else { @@ -10136,6 +10216,296 @@ dumpComment(Archive *fout, const char *type, catalogId, subid, dumpId, NULL); } +/* + * Tabular description of the parameters to pg_restore_relation_stats() + * required, param_name, param_type + */ +static const char *rel_stats_arginfo[][3] = { + {"f", "relation", "regclass"}, + {"f", "version", "integer"}, + {"f", "relpages", "integer"}, + {"f", "reltuples", "real"}, + {"f", "relallvisible", "integer"}, +}; + +/* + * Tabular description of the parameters to pg_restore_attribute_stats() + * required, param_name, param_type + */ +static const char *att_stats_arginfo[][3] = { + {"f", "relation", "regclass"}, + {"f", "attname", "name"}, + {"f", "inherited", "boolean"}, + {"f", "version", "integer"}, + {"f", "null_frac", "float4"}, + {"f", "avg_width", "integer"}, + {"f", "n_distinct", "float4"}, + {"f", "most_common_vals", "text"}, + {"f", "most_common_freqs", "float4[]"}, + {"f", "histogram_bounds", "text"}, + {"f", "correlation", "float4"}, + {"f", "most_common_elems", "text"}, + {"f", "most_common_elem_freqs", "float4[]"}, + {"f", "elem_count_histogram", "float4[]"}, + {"f", "range_length_histogram", "text"}, + {"f", "range_empty_frac", "float4"}, + {"f", "range_bounds_histogram", "text"}, +}; + +/* + * getRelStatsExportQuery -- + * + * Generate a query that will fetch all relation (e.g. pg_class) + * stats for a given relation. + */ +static void +getRelStatsExportQuery(PQExpBuffer query, Archive *fout, + const char *schemaname, const char *relname) +{ + resetPQExpBuffer(query); + appendPQExpBufferStr(query, + "SELECT c.oid::regclass AS relation, " + "current_setting('server_version_num') AS version, " + "c.relpages, c.reltuples, c.relallvisible " + "FROM pg_class c " + "JOIN pg_namespace n " + "ON n.oid = c.relnamespace " + "WHERE n.nspname = "); + appendStringLiteralAH(query, schemaname, fout); + appendPQExpBufferStr(query, " AND c.relname = "); + appendStringLiteralAH(query, relname, fout); +} + +/* + * getAttStatsExportQuery -- + * + * Generate a query that will fetch all attribute (e.g. pg_statistic) + * stats for a given relation. + */ +static void +getAttStatsExportQuery(PQExpBuffer query, Archive *fout, + const char *schemaname, const char *relname) +{ + resetPQExpBuffer(query); + appendPQExpBufferStr(query, + "SELECT c.oid::regclass AS relation, " + "s.attname," + "s.inherited," + "current_setting('server_version_num') AS version, " + "s.null_frac," + "s.avg_width," + "s.n_distinct," + "s.most_common_vals," + "s.most_common_freqs," + "s.histogram_bounds," + "s.correlation," + "s.most_common_elems," + "s.most_common_elem_freqs," + "s.elem_count_histogram,"); + + if (fout->remoteVersion >= 170000) + appendPQExpBufferStr(query, + "s.range_length_histogram," + "s.range_empty_frac," + "s.range_bounds_histogram "); + else + appendPQExpBufferStr(query, + "NULL AS range_length_histogram," + "NULL AS range_empty_frac," + "NULL AS range_bounds_histogram "); + + appendPQExpBufferStr(query, + "FROM pg_stats s " + "JOIN pg_namespace n " + "ON n.nspname = s.schemaname " + "JOIN pg_class c " + "ON c.relname = s.tablename " + "AND c.relnamespace = n.oid " + "WHERE s.schemaname = "); + appendStringLiteralAH(query, schemaname, fout); + appendPQExpBufferStr(query, " AND s.tablename = "); + appendStringLiteralAH(query, relname, fout); + appendPQExpBufferStr(query, " ORDER BY s.attname, s.inherited"); +} + + +/* + * appendNamedArgument -- + * + * Convenience routine for constructing parameters of the form: + * 'paraname', 'value'::type + */ +static void +appendNamedArgument(PQExpBuffer out, Archive *fout, + const char *argname, bool positional, + const char *argval, const char *argtype) +{ + appendPQExpBufferStr(out, "\t"); + + if (!positional) + { + appendStringLiteralAH(out, argname, fout); + appendPQExpBufferStr(out, ", "); + } + + appendStringLiteralAH(out, argval, fout); + appendPQExpBuffer(out, "::%s", argtype); +} + +/* + * appendRelStatsImport -- + * + * Append a formatted pg_restore_relation_stats statement. + */ +static void +appendRelStatsImport(PQExpBuffer out, Archive *fout, PGresult *res) +{ + const char *sep = ""; + + if (PQntuples(res) == 0) + return; + + appendPQExpBufferStr(out, "SELECT * FROM pg_catalog.pg_restore_relation_stats(\n"); + + for (int argno = 0; argno < lengthof(rel_stats_arginfo); argno++) + { + bool positional = (rel_stats_arginfo[argno][0][0] == 't'); + const char *argname = rel_stats_arginfo[argno][1]; + const char *argtype = rel_stats_arginfo[argno][2]; + int fieldno = PQfnumber(res, argname); + + if (fieldno < 0) + pg_fatal("relation stats export query missing field '%s'", + argname); + + if (PQgetisnull(res, 0, fieldno)) + { + if (!positional) + pg_fatal("relation stats export query unexpected NULL in '%s'", + argname); + else + continue; + } + + appendPQExpBufferStr(out, sep); + appendNamedArgument(out, fout, argname, positional, + PQgetvalue(res, 0, fieldno), argtype); + + sep = ",\n"; + } + appendPQExpBufferStr(out, "\n);\n"); +} + +/* + * appendAttStatsImport -- + * + * Append a series of formatted pg_restore_attribute_stats statements. + */ +static void +appendAttStatsImport(PQExpBuffer out, Archive *fout, PGresult *res) +{ + for (int rownum = 0; rownum < PQntuples(res); rownum++) + { + const char *sep = ""; + + appendPQExpBufferStr(out, "SELECT * FROM pg_catalog.pg_restore_attribute_stats(\n"); + for (int argno = 0; argno < lengthof(att_stats_arginfo); argno++) + { + bool positional = (att_stats_arginfo[argno][0][0] == 't'); + const char *argname = att_stats_arginfo[argno][1]; + const char *argtype = att_stats_arginfo[argno][2]; + int fieldno = PQfnumber(res, argname); + + if (fieldno < 0) + pg_fatal("attribute stats export query missing field '%s'", + argname); + + if (PQgetisnull(res, rownum, fieldno)) + { + if (positional) + pg_fatal("attribute stats export query unexpected NULL in '%s'", + argname); + else + continue; + } + + appendPQExpBufferStr(out, sep); + appendNamedArgument(out, fout, argname, positional, + PQgetvalue(res, rownum, fieldno), argtype); + sep = ",\n"; + } + appendPQExpBufferStr(out, "\n);\n"); + } +} + +/* + * get statistics dump section, which depends on the parent object type + * + * objects created in SECTION_PRE_DATA have stats in SECTION_DATA + * objects created in SECTION_POST_DATA have stats in SECTION_POST_DATA + */ +static teSection +statisticsDumpSection(const RelStatsInfo *rsinfo) +{ + + if ((rsinfo->relkind == RELKIND_MATVIEW) || + (rsinfo->relkind == RELKIND_INDEX) || + (rsinfo->relkind == RELKIND_PARTITIONED_INDEX)) + return SECTION_POST_DATA; + + return SECTION_DATA; +} + +/* + * dumpRelationStats -- + * + * Dump command to import stats into the relation on the new database. + */ +static void +dumpRelationStats(Archive *fout, const RelStatsInfo *rsinfo) +{ + PGresult *res; + PQExpBuffer query; + PQExpBuffer out; + PQExpBuffer tag; + DumpableObject *dobj = (DumpableObject *) &rsinfo->dobj; + + /* nothing to do if we are not dumping statistics */ + if (!fout->dopt->dumpStatistics) + return; + + tag = createPQExpBuffer(); + appendPQExpBuffer(tag, "%s %s", "STATISTICS DATA", fmtId(dobj->name)); + + query = createPQExpBuffer(); + out = createPQExpBuffer(); + + getRelStatsExportQuery(query, fout, dobj->namespace->dobj.name, + dobj->name); + res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK); + appendRelStatsImport(out, fout, res); + PQclear(res); + + getAttStatsExportQuery(query, fout, dobj->namespace->dobj.name, + dobj->name); + res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK); + appendAttStatsImport(out, fout, res); + PQclear(res); + + ArchiveEntry(fout, nilCatalogId, createDumpId(), + ARCHIVE_OPTS(.tag = tag->data, + .namespace = dobj->namespace->dobj.name, + .description = "STATISTICS DATA", + .section = statisticsDumpSection(rsinfo), + .createStmt = out->data, + .deps = dobj->dependencies, + .nDeps = dobj->nDeps)); + + destroyPQExpBuffer(query); + destroyPQExpBuffer(out); + destroyPQExpBuffer(tag); +} + /* * dumpTableComment -- * @@ -10584,6 +10954,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj) case DO_SUBSCRIPTION_REL: dumpSubscriptionTable(fout, (const SubRelInfo *) dobj); break; + case DO_REL_STATS: + dumpRelationStats(fout, (const RelStatsInfo *) dobj); + break; case DO_PRE_DATA_BOUNDARY: case DO_POST_DATA_BOUNDARY: /* never dumped, nothing to do */ @@ -16921,6 +17294,8 @@ dumpIndex(Archive *fout, const IndxInfo *indxinfo) free(indstatvalsarray); } + /* Comments and stats share same .dep */ + /* Dump Index Comments */ if (indxinfo->dobj.dump & DUMP_COMPONENT_COMMENT) dumpComment(fout, "INDEX", qindxname, @@ -18708,6 +19083,16 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs, /* must come after the pre-data boundary */ addObjectDependency(dobj, preDataBound->dumpId); break; + case DO_REL_STATS: + /* stats section varies by parent object type, DATA or POST */ + if (statisticsDumpSection((RelStatsInfo *) dobj) == SECTION_DATA) + { + addObjectDependency(dobj, preDataBound->dumpId); + addObjectDependency(postDataBound, dobj->dumpId); + } + else + addObjectDependency(dobj, postDataBound->dumpId); + break; } } } diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h index 9f907ed5ad..5039ae6e2b 100644 --- a/src/bin/pg_dump/pg_dump.h +++ b/src/bin/pg_dump/pg_dump.h @@ -84,6 +84,7 @@ typedef enum DO_PUBLICATION_TABLE_IN_SCHEMA, DO_SUBSCRIPTION, DO_SUBSCRIPTION_REL, /* see note for SubRelInfo */ + DO_REL_STATS, } DumpableObjectType; /* @@ -101,6 +102,7 @@ typedef uint32 DumpComponents; #define DUMP_COMPONENT_ACL (1 << 4) #define DUMP_COMPONENT_POLICY (1 << 5) #define DUMP_COMPONENT_USERMAP (1 << 6) +#define DUMP_COMPONENT_STATISTICS (1 << 7) #define DUMP_COMPONENT_ALL (0xFFFF) /* @@ -417,6 +419,12 @@ typedef struct _indexAttachInfo IndxInfo *partitionIdx; /* link to index on partition */ } IndexAttachInfo; +typedef struct _relStatsInfo +{ + DumpableObject dobj; + char relkind; /* 'r', 'v', 'c', etc */ +} RelStatsInfo; + typedef struct _statsExtInfo { DumpableObject dobj; diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c index 675bbf1233..e0afe48e0e 100644 --- a/src/bin/pg_dump/pg_dump_sort.c +++ b/src/bin/pg_dump/pg_dump_sort.c @@ -1501,6 +1501,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize) "POST-DATA BOUNDARY (ID %d)", obj->dumpId); return; + case DO_REL_STATS: + snprintf(buf, bufsize, + "RELATION STATISTICS FOR %s (ID %d OID %u)", + obj->name, obj->dumpId, obj->catId.oid); + return; } /* shouldn't get here */ snprintf(buf, bufsize, diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c index e3ad8fb295..1799a03ff1 100644 --- a/src/bin/pg_dump/pg_dumpall.c +++ b/src/bin/pg_dump/pg_dumpall.c @@ -103,6 +103,7 @@ static int use_setsessauth = 0; static int no_comments = 0; static int no_publications = 0; static int no_security_labels = 0; +static int no_statistics = 0; static int no_subscriptions = 0; static int no_toast_compression = 0; static int no_unlogged_table_data = 0; @@ -172,6 +173,7 @@ main(int argc, char *argv[]) {"no-role-passwords", no_argument, &no_role_passwords, 1}, {"no-security-labels", no_argument, &no_security_labels, 1}, {"no-subscriptions", no_argument, &no_subscriptions, 1}, + {"no-statistics", no_argument, &no_statistics, 1}, {"no-sync", no_argument, NULL, 4}, {"no-toast-compression", no_argument, &no_toast_compression, 1}, {"no-unlogged-table-data", no_argument, &no_unlogged_table_data, 1}, @@ -451,6 +453,8 @@ main(int argc, char *argv[]) appendPQExpBufferStr(pgdumpopts, " --no-publications"); if (no_security_labels) appendPQExpBufferStr(pgdumpopts, " --no-security-labels"); + if (no_statistics) + appendPQExpBufferStr(pgdumpopts, " --no-statistics"); if (no_subscriptions) appendPQExpBufferStr(pgdumpopts, " --no-subscriptions"); if (no_toast_compression) @@ -666,6 +670,7 @@ help(void) printf(_(" --no-publications do not dump publications\n")); printf(_(" --no-role-passwords do not dump passwords for roles\n")); printf(_(" --no-security-labels do not dump security label assignments\n")); + printf(_(" --no-statistics do not dump statistics\n")); printf(_(" --no-subscriptions do not dump subscriptions\n")); printf(_(" --no-sync do not wait for changes to be written safely to disk\n")); printf(_(" --no-table-access-method do not dump table access methods\n")); diff --git a/src/bin/pg_dump/pg_restore.c b/src/bin/pg_dump/pg_restore.c index f6e04e3a3b..41857066c4 100644 --- a/src/bin/pg_dump/pg_restore.c +++ b/src/bin/pg_dump/pg_restore.c @@ -64,6 +64,7 @@ main(int argc, char **argv) int numWorkers = 1; Archive *AH; char *inputFileSpec; + bool statistics_only = false; static int disable_triggers = 0; static int enable_row_security = 0; static int if_exists = 0; @@ -75,6 +76,7 @@ main(int argc, char **argv) static int no_publications = 0; static int no_security_labels = 0; static int no_subscriptions = 0; + static int no_statistics = 0; static int strict_names = 0; bool data_only = false; @@ -110,6 +112,7 @@ main(int argc, char **argv) {"username", 1, NULL, 'U'}, {"verbose", 0, NULL, 'v'}, {"single-transaction", 0, NULL, '1'}, + {"statistics-only", no_argument, NULL, 'P'}, /* * the following options don't have an equivalent short option letter @@ -130,6 +133,7 @@ main(int argc, char **argv) {"no-security-labels", no_argument, &no_security_labels, 1}, {"no-subscriptions", no_argument, &no_subscriptions, 1}, {"filter", required_argument, NULL, 4}, + {"no-statistics", no_argument, &no_statistics, 1}, {NULL, 0, NULL, 0} }; @@ -273,6 +277,10 @@ main(int argc, char **argv) opts->aclsSkip = 1; break; + case 'X': /* Restore statistics only */ + statistics_only = true; + break; + case '1': /* Restore data in a single transaction */ opts->single_txn = true; opts->exit_on_error = true; @@ -345,6 +353,10 @@ main(int argc, char **argv) if (data_only && schema_only) pg_fatal("options -s/--schema-only and -a/--data-only cannot be used together"); + if (data_only && statistics_only) + pg_fatal("options -a/--data-only and -X/--statistics-only cannot be used together"); + if (schema_only && statistics_only) + pg_fatal("options -s/--schema-only and -X/--statistics-only cannot be used together"); if (data_only && opts->dropSchema) pg_fatal("options -c/--clean and -a/--data-only cannot be used together"); @@ -364,8 +376,9 @@ main(int argc, char **argv) pg_fatal("cannot specify both --single-transaction and multiple jobs"); /* set derivative flags */ - opts->dumpSchema = (!data_only); - opts->dumpData = (!schema_only); + opts->dumpSchema = (!data_only && !statistics_only); + opts->dumpData = (!schema_only && !statistics_only); + opts->dumpStatistics = (!no_statistics && !data_only && !schema_only); opts->disable_triggers = disable_triggers; opts->enable_row_security = enable_row_security; @@ -377,6 +390,7 @@ main(int argc, char **argv) opts->no_publications = no_publications; opts->no_security_labels = no_security_labels; opts->no_subscriptions = no_subscriptions; + opts->no_statistics = no_statistics; if (if_exists && !opts->dropSchema) pg_fatal("option --if-exists requires option -c/--clean"); diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl index b9d13a0e1d..76ebf15f55 100644 --- a/src/bin/pg_dump/t/001_basic.pl +++ b/src/bin/pg_dump/t/001_basic.pl @@ -50,12 +50,30 @@ command_fails_like( 'pg_dump: options -s/--schema-only and -a/--data-only cannot be used together' ); +command_fails_like( + [ 'pg_dump', '-s', '-X' ], + qr/\Qpg_dump: error: options -s\/--schema-only and -X\/--statistics-only cannot be used together\E/, + 'pg_dump: error: options -s/--schema-only and -X/--statistics-only cannot be used together' +); + +command_fails_like( + [ 'pg_dump', '-a', '-X' ], + qr/\Qpg_dump: error: options -a\/--data-only and -X\/--statistics-only cannot be used together\E/, + 'pg_dump: error: options -a/--data-only and -X/--statistics-only cannot be used together' +); + command_fails_like( [ 'pg_dump', '-s', '--include-foreign-data=xxx' ], qr/\Qpg_dump: error: options -s\/--schema-only and --include-foreign-data cannot be used together\E/, 'pg_dump: options -s/--schema-only and --include-foreign-data cannot be used together' ); +command_fails_like( + [ 'pg_dump', '--statistics-only', '--no-statistics' ], + qr/\Qpg_dump: error: options -X\/--statistics-only and --no-statistics cannot be used together\E/, + 'pg_dump: options -X\/--statistics-only and --no-statistics cannot be used together' +); + command_fails_like( [ 'pg_dump', '-j2', '--include-foreign-data=xxx' ], qr/\Qpg_dump: error: option --include-foreign-data is not supported with parallel backup\E/, diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml index ffc29b04fb..99f25e40c0 100644 --- a/doc/src/sgml/ref/pg_dump.sgml +++ b/doc/src/sgml/ref/pg_dump.sgml @@ -123,7 +123,7 @@ PostgreSQL documentation <term><option>--data-only</option></term> <listitem> <para> - Dump only the data, not the schema (data definitions). + Dump only the data, not the schema (data definitions) or statistics. Table data, large objects, and sequence values are dumped. </para> @@ -141,8 +141,9 @@ PostgreSQL documentation <listitem> <para> Include large objects in the dump. This is the default behavior - except when <option>--schema</option>, <option>--table</option>, or - <option>--schema-only</option> is specified. The <option>-b</option> + except when <option>--schema</option>, <option>--table</option>, + <option>--schema-only</option>, or + <option>--statistics-only</option> is specified. The <option>-b</option> switch is therefore only useful to add large objects to dumps where a specific schema or table has been requested. Note that large objects are considered data and therefore will be included when @@ -516,10 +517,11 @@ PostgreSQL documentation <term><option>--schema-only</option></term> <listitem> <para> - Dump only the object definitions (schema), not data. + Dump only the object definitions (schema), not data or statistics. </para> <para> - This option is the inverse of <option>--data-only</option>. + This option is mutually exclusive to <option>--data-only</option> + and <option>--statistics-only</option>. It is similar to, but for historical reasons not identical to, specifying <option>--section=pre-data --section=post-data</option>. @@ -652,6 +654,18 @@ PostgreSQL documentation </listitem> </varlistentry> + <varlistentry> + <term><option>-X</option></term> + <term><option>--statistics-only</option></term> + <listitem> + <para> + Dump only the statistics, not the schema (data definitions) or data. + Statistics for tables, materialized views, and indexes are dumped. + </para> + + </listitem> + </varlistentry> + <varlistentry> <term><option>-Z <replaceable class="parameter">level</replaceable></option></term> <term><option>-Z <replaceable class="parameter">method</replaceable></option>[:<replaceable>detail</replaceable>]</term> @@ -833,7 +847,8 @@ PostgreSQL documentation though you do not need the data in it. </para> <para> - To exclude data for all tables in the database, see <option>--schema-only</option>. + To exclude data for all tables in the database, see <option>--schema-only</option> + or <option>--statistics-only</option>. </para> </listitem> </varlistentry> @@ -1098,6 +1113,15 @@ PostgreSQL documentation </listitem> </varlistentry> + <varlistentry> + <term><option>--no-statistics</option></term> + <listitem> + <para> + Do not dump statistics. + </para> + </listitem> + </varlistentry> + <varlistentry> <term><option>--no-subscriptions</option></term> <listitem> diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml index b8b27e1719..ff1441a243 100644 --- a/doc/src/sgml/ref/pg_restore.sgml +++ b/doc/src/sgml/ref/pg_restore.sgml @@ -94,7 +94,7 @@ PostgreSQL documentation <term><option>--data-only</option></term> <listitem> <para> - Restore only the data, not the schema (data definitions). + Restore only the data, not the schema (data definitions) or statistics. Table data, large objects, and sequence values are restored, if present in the archive. </para> @@ -483,10 +483,11 @@ PostgreSQL documentation to the extent that schema entries are present in the archive. </para> <para> - This option is the inverse of <option>--data-only</option>. + This option is mutually exclusive of <option>--data-only</option> + and <option>--statistics-only</option>. It is similar to, but for historical reasons not identical to, specifying - <option>--section=pre-data --section=post-data</option>. + <option>--section=pre-data --section=post-data --no-statistics</option>. </para> <para> (Do not confuse this with the <option>--schema</option> option, which @@ -599,6 +600,20 @@ PostgreSQL documentation </listitem> </varlistentry> + <varlistentry> + <term><option>-X</option></term> + <term><option>--statistics-only</option></term> + <listitem> + <para> + Restore only the statistics, not schema or data. + </para> + <para> + (Do not confuse this with the <option>--schema</option> option, which + uses the word <quote>schema</quote> in a different meaning.) + </para> + </listitem> + </varlistentry> + <varlistentry> <term><option>-1</option></term> <term><option>--single-transaction</option></term> @@ -723,6 +738,16 @@ PostgreSQL documentation </listitem> </varlistentry> + <varlistentry> + <term><option>--no-statistics</option></term> + <listitem> + <para> + Do not output commands to restore statistics, even if the archive + contains them. + </para> + </listitem> + </varlistentry> + <varlistentry> <term><option>--no-subscriptions</option></term> <listitem> -- 2.47.0
From 7f68a7e4d84ee5f1376eb4d8681773e4eaf305e3 Mon Sep 17 00:00:00 2001 From: Corey Huinker <corey.huinker@gmail.com> Date: Mon, 4 Nov 2024 16:21:39 -0500 Subject: [PATCH v30 5/5] Enable pg_clear_relation_stats to handle different default relpages. If it comes to pass that relpages has the default of -1.0 for partitioned tables (and indexes), then this patch will handle that. --- src/backend/statistics/relation_stats.c | 47 +++++++++++++------------ 1 file changed, 24 insertions(+), 23 deletions(-) diff --git a/src/backend/statistics/relation_stats.c b/src/backend/statistics/relation_stats.c index 939ad56fc2..1aab63f5d8 100644 --- a/src/backend/statistics/relation_stats.c +++ b/src/backend/statistics/relation_stats.c @@ -19,15 +19,12 @@ #include "access/heapam.h" #include "catalog/indexing.h" +#include "catalog/pg_class_d.h" #include "statistics/stat_utils.h" #include "utils/fmgrprotos.h" #include "utils/syscache.h" #include "utils/fmgroids.h" -#define DEFAULT_RELPAGES Int32GetDatum(0) -#define DEFAULT_RELTUPLES Float4GetDatum(-1.0) -#define DEFAULT_RELALLVISIBLE Int32GetDatum(0) - /* * Positional argument numbers, names, and types for * relation_statistics_update(). @@ -51,14 +48,11 @@ static struct StatsArgInfo relarginfo[] = [NUM_RELATION_STATS_ARGS] = {0} }; -static bool relation_statistics_update(FunctionCallInfo fcinfo, int elevel, - bool inplace); - /* * Internal function for modifying statistics for a relation. */ static bool -relation_statistics_update(FunctionCallInfo fcinfo, int elevel, bool inplace) +relation_statistics_update(FunctionCallInfo fcinfo, int elevel, bool inplace, bool clear) { Oid reloid; Relation crel; @@ -110,9 +104,19 @@ relation_statistics_update(FunctionCallInfo fcinfo, int elevel, bool inplace) pgcform = (Form_pg_class) GETSTRUCT(ctup); /* relpages */ - if (!PG_ARGISNULL(RELPAGES_ARG)) + if (!PG_ARGISNULL(RELPAGES_ARG) || clear) { - int32 relpages = PG_GETARG_INT32(RELPAGES_ARG); + int32 relpages; + + if (clear) + /* relpages default varies by relkind */ + if ((crel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) || + (crel->rd_rel->relkind == RELKIND_PARTITIONED_INDEX)) + relpages = -1; + else + relpages = 0; + else + relpages = PG_GETARG_INT32(RELPAGES_ARG); /* * Partitioned tables may have relpages=-1. Note: for relations with @@ -139,9 +143,9 @@ relation_statistics_update(FunctionCallInfo fcinfo, int elevel, bool inplace) } } - if (!PG_ARGISNULL(RELTUPLES_ARG)) + if (!PG_ARGISNULL(RELTUPLES_ARG) || clear) { - float reltuples = PG_GETARG_FLOAT4(RELTUPLES_ARG); + float reltuples = (clear) ? -1.0 : PG_GETARG_FLOAT4(RELTUPLES_ARG); if (reltuples < -1.0) { @@ -164,9 +168,9 @@ relation_statistics_update(FunctionCallInfo fcinfo, int elevel, bool inplace) } - if (!PG_ARGISNULL(RELALLVISIBLE_ARG)) + if (!PG_ARGISNULL(RELALLVISIBLE_ARG) || clear) { - int32 relallvisible = PG_GETARG_INT32(RELALLVISIBLE_ARG); + int32 relallvisible = (clear) ? 0 : PG_GETARG_INT32(RELALLVISIBLE_ARG); if (relallvisible < 0) { @@ -220,7 +224,7 @@ relation_statistics_update(FunctionCallInfo fcinfo, int elevel, bool inplace) Datum pg_set_relation_stats(PG_FUNCTION_ARGS) { - relation_statistics_update(fcinfo, ERROR, false); + relation_statistics_update(fcinfo, ERROR, false, false); PG_RETURN_VOID(); } @@ -237,14 +241,11 @@ pg_clear_relation_stats(PG_FUNCTION_ARGS) newfcinfo->args[0].value = PG_GETARG_OID(0); newfcinfo->args[0].isnull = PG_ARGISNULL(0); - newfcinfo->args[1].value = DEFAULT_RELPAGES; - newfcinfo->args[1].isnull = false; - newfcinfo->args[2].value = DEFAULT_RELTUPLES; - newfcinfo->args[2].isnull = false; - newfcinfo->args[3].value = DEFAULT_RELALLVISIBLE; - newfcinfo->args[3].isnull = false; + newfcinfo->args[1].isnull = true; + newfcinfo->args[2].isnull = true; + newfcinfo->args[3].isnull = true; - relation_statistics_update(newfcinfo, ERROR, false); + relation_statistics_update(newfcinfo, ERROR, false, true); PG_RETURN_VOID(); } @@ -262,7 +263,7 @@ pg_restore_relation_stats(PG_FUNCTION_ARGS) relarginfo, WARNING)) result = false; - if (!relation_statistics_update(positional_fcinfo, WARNING, true)) + if (!relation_statistics_update(positional_fcinfo, WARNING, true, false)) result = false; PG_RETURN_BOOL(result); -- 2.47.0