On Thu, Aug 10, 2017 at 8:26 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Aug 10, 2017 at 3:47 AM, Rushabh Lathia > <rushabh.lat...@gmail.com> wrote: > >> (1) seems like a pretty arbitrary restriction, so I don't like that > >> option. (2) would hurt performance in some use cases. Do we have an > >> option (3)? > > > > How about protecting option 2) with the load-via-partition-root > protection. > > Means > > load the parents information even dump is not set only when > > load-via-partition-root > > & ispartition. > > > > This won't hurt performance for the normal cases. > > Yes, that seems like the right approach. > > + Dump data via the top-most partitioned table (rather than > partition > + table) when dumping data for the partition table. > > I think we should phrase this a bit more clearly, something like this: > When dumping a COPY or INSERT statement for a partitioned table, > target the root of the partitioning hierarchy which contains it rather > than the partition itself. This may be useful when reloading data on > a server where rows do not always fall into the same partitions as > they did on the original server. This could happen, for example, if > the partitioning column is of type text and the two system have > different definitions of the collation used to partition the data. > > Done. > + printf(_(" --load-via-partition-root load partition table via > the root relation\n")); > > "relation" seems odd to me here. root table? > > Done. > /* Don't bother computing anything for non-target tables, either > */ > if (!tblinfo[i].dobj.dump) > + { > + /* > + * Load the parents information for the partition table when > + * the load-via-partition-root option is set. As we need the > + * parents information to get the partition root. > + */ > + if (dopt->load_via_partition_root && > + tblinfo[i].ispartition) > + findParentsByOid(&tblinfo[i], inhinfo, numInherits); > continue; > + } > > Duplicating the call to findParentsByOid seems less then ideal. How > about doing something like this: > > if (tblinfo[i].dobj.dump) > { > find_parents = true; > mark_parents = true; > } > else if (dopt->load_via_partition_root && tblinfo[i].ispartition) > find_parents = true; > > if (find_parents) > findParentsByOid(&tblinfo[i], inhinfo, numInherits); > > etc. > > Done changes to avoid duplicate call to findParentsByOid(). > The comments for this function also need some work - e.g. the function > header comment deserves some kind of update for these changes. > > Done. > +static TableInfo * > +getRootTableInfo(TableInfo *tbinfo) > +{ > + Assert(tbinfo->ispartition); > + Assert(tbinfo->numParents == 1); > + if (tbinfo->parents[0]->ispartition) > + return getRootTableInfo(tbinfo->parents[0]); > + > + return tbinfo->parents[0]; > +} > > This code should iterate, not recurse, to avoid any risk of blowing > out the stack. > > Done. Please find attach patch with the changes. Thanks, Rushabh Lathia www.EnterpriseDB.com
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml index bafa031..de8297a 100644 --- a/doc/src/sgml/ref/pg_dump.sgml +++ b/doc/src/sgml/ref/pg_dump.sgml @@ -889,6 +889,21 @@ PostgreSQL documentation </varlistentry> <varlistentry> + <term><option>--load-via-partition-root</></term> + <listitem> + <para> + When dumping a COPY or INSERT statement for a partitioned table, + target the root of the partitioning hierarchy which contains it rather + than the partition itself. This will be useful when reloading data on + a server where rows do not always fall into the same partitions as + they did on the original server. This could happen, for example, if + the partitioning column is of type text and the two system have + different definitions of the collation used to partition the data. + </para> + </listitem> + </varlistentry> + + <varlistentry> <term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term> <listitem> <para> diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml index aa944a2..dc1d3cc 100644 --- a/doc/src/sgml/ref/pg_dumpall.sgml +++ b/doc/src/sgml/ref/pg_dumpall.sgml @@ -431,6 +431,21 @@ PostgreSQL documentation </varlistentry> <varlistentry> + <term><option>--load-via-partition-root</></term> + <listitem> + <para> + When dumping a COPY or INSERT statement for a partitioned table, + target the root of the partitioning hierarchy which contains it rather + than the partition itself. This will be useful when reloading data on + a server where rows do not always fall into the same partitions as + they did on the original server. This could happen, for example, if + the partitioning column is of type text and the two system have + different definitions of the collation used to partition the data. + </para> + </listitem> + </varlistentry> + + <varlistentry> <term><option>--use-set-session-authorization</></term> <listitem> <para> diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c index 47191be..e7ee069 100644 --- a/src/bin/pg_dump/common.c +++ b/src/bin/pg_dump/common.c @@ -66,7 +66,7 @@ static int numExtensions; static ExtensionMemberId *extmembers; static int numextmembers; -static void flagInhTables(TableInfo *tbinfo, int numTables, +static void flagInhTables(Archive *fout, TableInfo *tbinfo, int numTables, InhInfo *inhinfo, int numInherits); static void flagInhAttrs(DumpOptions *dopt, TableInfo *tblinfo, int numTables); static DumpableObject **buildIndexArray(void *objArray, int numObjs, @@ -243,7 +243,7 @@ getSchemaData(Archive *fout, int *numTablesPtr) /* Link tables to parents, mark parents of target tables interesting */ if (g_verbose) write_msg(NULL, "finding inheritance relationships\n"); - flagInhTables(tblinfo, numTables, inhinfo, numInherits); + flagInhTables(fout, tblinfo, numTables, inhinfo, numInherits); if (g_verbose) write_msg(NULL, "reading column info for interesting tables\n"); @@ -301,12 +301,16 @@ getSchemaData(Archive *fout, int *numTablesPtr) * This is sufficient; we don't much care whether they inherited their * attributes or not. * + * When load-via-partition-root is set, load the parents information for the + * partition table. + * * modifies tblinfo */ static void -flagInhTables(TableInfo *tblinfo, int numTables, +flagInhTables(Archive *fout, TableInfo *tblinfo, int numTables, InhInfo *inhinfo, int numInherits) { + DumpOptions *dopt = fout->dopt; int i, j; int numParents; @@ -314,6 +318,8 @@ flagInhTables(TableInfo *tblinfo, int numTables, for (i = 0; i < numTables; i++) { + bool only_find_parents = false; + /* Some kinds never have parents */ if (tblinfo[i].relkind == RELKIND_SEQUENCE || tblinfo[i].relkind == RELKIND_VIEW || @@ -322,11 +328,26 @@ flagInhTables(TableInfo *tblinfo, int numTables, /* Don't bother computing anything for non-target tables, either */ if (!tblinfo[i].dobj.dump) - continue; + { + /* + * Load the parents information for the partition table when the + * load-via-partition-root option is set. As we need the parents + * information to get the partition root. + */ + if (dopt->load_via_partition_root && + tblinfo[i].ispartition) + only_find_parents = true; + else + continue; + } /* Find all the immediate parent tables */ findParentsByOid(&tblinfo[i], inhinfo, numInherits); + /* Skip mark the parents */ + if (only_find_parents) + continue; + /* Mark the parents as interesting for getTableAttrs */ numParents = tblinfo[i].numParents; parents = tblinfo[i].parents; diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h index 144068a..ce3100f 100644 --- a/src/bin/pg_dump/pg_backup.h +++ b/src/bin/pg_dump/pg_backup.h @@ -157,6 +157,7 @@ typedef struct _dumpOptions int outputNoTablespaces; int use_setsessauth; int enable_row_security; + int load_via_partition_root; /* default, if no "inclusion" switches appear, is to dump everything */ bool include_everything; diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 393b9e2..1703de8 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -269,6 +269,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions, const char *prefix, Archive *fout); static char *get_synchronized_snapshot(Archive *fout); static void setupDumpWorker(Archive *AHX); +static TableInfo *getRootTableInfo(TableInfo *tbinfo); int @@ -345,6 +346,7 @@ main(int argc, char **argv) {"lock-wait-timeout", required_argument, NULL, 2}, {"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1}, {"quote-all-identifiers", no_argument, "e_all_identifiers, 1}, + {"load-via-partition-root", no_argument, &dopt.load_via_partition_root, 1}, {"role", required_argument, NULL, 3}, {"section", required_argument, NULL, 5}, {"serializable-deferrable", no_argument, &dopt.serializable_deferrable, 1}, @@ -959,6 +961,7 @@ help(const char *progname) printf(_(" --no-tablespaces do not dump tablespace assignments\n")); printf(_(" --no-unlogged-table-data do not dump unlogged table data\n")); printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n")); + printf(_(" --load-via-partition-root load partition table via the root table\n")); printf(_(" --section=SECTION dump named section (pre-data, data, or post-data)\n")); printf(_(" --serializable-deferrable wait until the dump can run without anomalies\n")); printf(_(" --snapshot=SNAPSHOT use given snapshot for the dump\n")); @@ -1902,8 +1905,32 @@ dumpTableData_insert(Archive *fout, void *dcontext) if (insertStmt == NULL) { insertStmt = createPQExpBuffer(); + + /* + * When load-via-partition-root is set, get the root table + * name for the partition table, so that we can reload data + * through the root table. + */ + if (dopt->load_via_partition_root && tbinfo->ispartition) + { + TableInfo *parentTbinfo; + + parentTbinfo = getRootTableInfo(tbinfo); + + /* + * When we loading data through the root, we will qualify + * the table name. This is needed because earlier + * search_path will be set for the partition table. + */ + classname = (char *) fmtQualifiedId(fout->remoteVersion, + parentTbinfo->dobj.namespace->dobj.name, + parentTbinfo->dobj.name); + } + else + classname = fmtId(tbinfo->dobj.name); + appendPQExpBuffer(insertStmt, "INSERT INTO %s ", - fmtId(classname)); + classname); /* corner case for zero-column table */ if (nfields == 0) @@ -2025,6 +2052,27 @@ dumpTableData_insert(Archive *fout, void *dcontext) return 1; } +/* + * getRootTableInfo: + * get the root TableInfo for the given partition table. + */ +static TableInfo * +getRootTableInfo(TableInfo *tbinfo) +{ + TableInfo *parentTbinfo; + + Assert(tbinfo->ispartition); + Assert(tbinfo->numParents == 1); + + parentTbinfo = tbinfo->parents[0]; + while (parentTbinfo->ispartition) + { + Assert(parentTbinfo->numParents == 1); + parentTbinfo = parentTbinfo->parents[0]; + } + + return parentTbinfo; +} /* * dumpTableData - @@ -2041,14 +2089,38 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo) PQExpBuffer clistBuf = createPQExpBuffer(); DataDumperPtr dumpFn; char *copyStmt; + const char *copyFrom; if (!dopt->dump_inserts) { /* Dump/restore using COPY */ dumpFn = dumpTableData_copy; - /* must use 2 steps here 'cause fmtId is nonreentrant */ + + /* + * When load-via-partition-root is set, get the root table name for + * the partition table, so that we can reload data through the root + * table. + */ + if (dopt->load_via_partition_root && tbinfo->ispartition) + { + TableInfo *parentTbinfo; + + parentTbinfo = getRootTableInfo(tbinfo); + + /* + * When we loading data through the root, we will qualify the + * table name. This is needed because earlier search_path will be + * set for the partition table. + */ + copyFrom = fmtQualifiedId(fout->remoteVersion, + parentTbinfo->dobj.namespace->dobj.name, + parentTbinfo->dobj.name); + } + else + copyFrom = fmtId(tbinfo->dobj.name); + appendPQExpBuffer(copyBuf, "COPY %s ", - fmtId(tbinfo->dobj.name)); + copyFrom); appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n", fmtCopyColumnList(tbinfo, clistBuf), (tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : ""); diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c index b14bb8e..c4741c2 100644 --- a/src/bin/pg_dump/pg_dumpall.c +++ b/src/bin/pg_dump/pg_dumpall.c @@ -80,6 +80,7 @@ static int no_subscriptions = 0; static int no_unlogged_table_data = 0; static int no_role_passwords = 0; static int server_version; +static int load_via_partition_root = 0; static char role_catalog[10]; #define PG_AUTHID "pg_authid" @@ -128,6 +129,7 @@ main(int argc, char *argv[]) {"lock-wait-timeout", required_argument, NULL, 2}, {"no-tablespaces", no_argument, &no_tablespaces, 1}, {"quote-all-identifiers", no_argument, "e_all_identifiers, 1}, + {"load-via-partition-root", no_argument, &load_via_partition_root, 1}, {"role", required_argument, NULL, 3}, {"use-set-session-authorization", no_argument, &use_setsessauth, 1}, {"no-publications", no_argument, &no_publications, 1}, @@ -385,6 +387,8 @@ main(int argc, char *argv[]) appendPQExpBufferStr(pgdumpopts, " --no-tablespaces"); if (quote_all_identifiers) appendPQExpBufferStr(pgdumpopts, " --quote-all-identifiers"); + if (load_via_partition_root) + appendPQExpBufferStr(pgdumpopts, " --load-via-partition-root"); if (use_setsessauth) appendPQExpBufferStr(pgdumpopts, " --use-set-session-authorization"); if (no_publications) @@ -606,6 +610,7 @@ help(void) printf(_(" --no-tablespaces do not dump tablespace assignments\n")); printf(_(" --no-unlogged-table-data do not dump unlogged table data\n")); printf(_(" --quote-all-identifiers quote all identifiers, even if not key words\n")); + printf(_(" --load-via-partition-root load partition table via the root table\n")); printf(_(" --use-set-session-authorization\n" " use SET SESSION AUTHORIZATION commands instead of\n" " ALTER OWNER commands to set ownership\n"));
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers