On Thu, Aug 10, 2017 at 8:26 PM, Robert Haas <robertmh...@gmail.com> wrote:

> On Thu, Aug 10, 2017 at 3:47 AM, Rushabh Lathia
> <rushabh.lat...@gmail.com> wrote:
> >> (1) seems like a pretty arbitrary restriction, so I don't like that
> >> option.  (2) would hurt performance in some use cases.  Do we have an
> >> option (3)?
> >
> > How about protecting option 2) with the load-via-partition-root
> protection.
> > Means
> > load the parents information even dump is not set only when
> > load-via-partition-root
> > & ispartition.
> >
> > This won't hurt performance for the normal cases.
>
> Yes, that seems like the right approach.
>
> +        Dump data via the top-most partitioned table (rather than
> partition
> +        table) when dumping data for the partition table.
>
> I think we should phrase this a bit more clearly, something like this:
> When dumping a COPY or INSERT statement for a partitioned table,
> target the root of the partitioning hierarchy which contains it rather
> than the partition itself.  This may be useful when reloading data on
> a server where rows do not always fall into the same partitions as
> they did on the original server.  This could happen, for example, if
> the partitioning column is of type text and the two system have
> different definitions of the collation used to partition the data.
>
>
Done.


> +    printf(_("  --load-via-partition-root    load partition table via
> the root relation\n"));
>
> "relation" seems odd to me here.  root table?
>
>
Done.


>          /* Don't bother computing anything for non-target tables, either
> */
>          if (!tblinfo[i].dobj.dump)
> +        {
> +            /*
> +             * Load the parents information for the partition table when
> +             * the load-via-partition-root option is set. As we need the
> +             * parents information to get the partition root.
> +             */
> +            if (dopt->load_via_partition_root &&
> +                tblinfo[i].ispartition)
> +                findParentsByOid(&tblinfo[i], inhinfo, numInherits);
>              continue;
> +        }
>
> Duplicating the call to findParentsByOid seems less then ideal.  How
> about doing something like this:
>
> if (tblinfo[i].dobj.dump)
> {
>     find_parents = true;
>     mark_parents = true;
> }
> else if (dopt->load_via_partition_root && tblinfo[i].ispartition)
>     find_parents = true;
>
> if (find_parents)
>     findParentsByOid(&tblinfo[i], inhinfo, numInherits);
>
> etc.
>
>
Done changes to avoid duplicate call to findParentsByOid().


> The comments for this function also need some work - e.g. the function
> header comment deserves some kind of update for these changes.
>
>
Done.


> +static TableInfo *
> +getRootTableInfo(TableInfo *tbinfo)
> +{
> +    Assert(tbinfo->ispartition);
> +    Assert(tbinfo->numParents == 1);
> +    if (tbinfo->parents[0]->ispartition)
> +        return getRootTableInfo(tbinfo->parents[0]);
> +
> +    return tbinfo->parents[0];
> +}
>
> This code should iterate, not recurse, to avoid any risk of blowing
> out the stack.
>
>
Done.

Please find attach patch with the changes.

Thanks,
Rushabh Lathia
www.EnterpriseDB.com
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index bafa031..de8297a 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -889,6 +889,21 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>--load-via-partition-root</></term>
+      <listitem>
+       <para>
+        When dumping a COPY or INSERT statement for a partitioned table,
+        target the root of the partitioning hierarchy which contains it rather
+        than the partition itself.  This will be useful when reloading data on
+        a server where rows do not always fall into the same partitions as
+        they did on the original server.  This could happen, for example, if
+        the partitioning column is of type text and the two system have
+        different definitions of the collation used to partition the data.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
        <term><option>--section=<replaceable class="parameter">sectionname</replaceable></option></term>
        <listitem>
          <para>
diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index aa944a2..dc1d3cc 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -431,6 +431,21 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>--load-via-partition-root</></term>
+      <listitem>
+       <para>
+        When dumping a COPY or INSERT statement for a partitioned table,
+        target the root of the partitioning hierarchy which contains it rather
+        than the partition itself.  This will be useful when reloading data on
+        a server where rows do not always fall into the same partitions as
+        they did on the original server.  This could happen, for example, if
+        the partitioning column is of type text and the two system have
+        different definitions of the collation used to partition the data.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>--use-set-session-authorization</></term>
       <listitem>
        <para>
diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 47191be..e7ee069 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -66,7 +66,7 @@ static int	numExtensions;
 static ExtensionMemberId *extmembers;
 static int	numextmembers;
 
-static void flagInhTables(TableInfo *tbinfo, int numTables,
+static void flagInhTables(Archive *fout, TableInfo *tbinfo, int numTables,
 			  InhInfo *inhinfo, int numInherits);
 static void flagInhAttrs(DumpOptions *dopt, TableInfo *tblinfo, int numTables);
 static DumpableObject **buildIndexArray(void *objArray, int numObjs,
@@ -243,7 +243,7 @@ getSchemaData(Archive *fout, int *numTablesPtr)
 	/* Link tables to parents, mark parents of target tables interesting */
 	if (g_verbose)
 		write_msg(NULL, "finding inheritance relationships\n");
-	flagInhTables(tblinfo, numTables, inhinfo, numInherits);
+	flagInhTables(fout, tblinfo, numTables, inhinfo, numInherits);
 
 	if (g_verbose)
 		write_msg(NULL, "reading column info for interesting tables\n");
@@ -301,12 +301,16 @@ getSchemaData(Archive *fout, int *numTablesPtr)
  * This is sufficient; we don't much care whether they inherited their
  * attributes or not.
  *
+ * When load-via-partition-root is set, load the parents information for the
+ * partition table.
+ *
  * modifies tblinfo
  */
 static void
-flagInhTables(TableInfo *tblinfo, int numTables,
+flagInhTables(Archive *fout, TableInfo *tblinfo, int numTables,
 			  InhInfo *inhinfo, int numInherits)
 {
+	DumpOptions *dopt = fout->dopt;
 	int			i,
 				j;
 	int			numParents;
@@ -314,6 +318,8 @@ flagInhTables(TableInfo *tblinfo, int numTables,
 
 	for (i = 0; i < numTables; i++)
 	{
+		bool		only_find_parents = false;
+
 		/* Some kinds never have parents */
 		if (tblinfo[i].relkind == RELKIND_SEQUENCE ||
 			tblinfo[i].relkind == RELKIND_VIEW ||
@@ -322,11 +328,26 @@ flagInhTables(TableInfo *tblinfo, int numTables,
 
 		/* Don't bother computing anything for non-target tables, either */
 		if (!tblinfo[i].dobj.dump)
-			continue;
+		{
+			/*
+			 * Load the parents information for the partition table when the
+			 * load-via-partition-root option is set. As we need the parents
+			 * information to get the partition root.
+			 */
+			if (dopt->load_via_partition_root &&
+				tblinfo[i].ispartition)
+				only_find_parents = true;
+			else
+				continue;
+		}
 
 		/* Find all the immediate parent tables */
 		findParentsByOid(&tblinfo[i], inhinfo, numInherits);
 
+		/* Skip mark the parents */
+		if (only_find_parents)
+			continue;
+
 		/* Mark the parents as interesting for getTableAttrs */
 		numParents = tblinfo[i].numParents;
 		parents = tblinfo[i].parents;
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 144068a..ce3100f 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -157,6 +157,7 @@ typedef struct _dumpOptions
 	int			outputNoTablespaces;
 	int			use_setsessauth;
 	int			enable_row_security;
+	int			load_via_partition_root;
 
 	/* default, if no "inclusion" switches appear, is to dump everything */
 	bool		include_everything;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 393b9e2..1703de8 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -269,6 +269,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 						const char *prefix, Archive *fout);
 static char *get_synchronized_snapshot(Archive *fout);
 static void setupDumpWorker(Archive *AHX);
+static TableInfo *getRootTableInfo(TableInfo *tbinfo);
 
 
 int
@@ -345,6 +346,7 @@ main(int argc, char **argv)
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &dopt.outputNoTablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
+		{"load-via-partition-root", no_argument, &dopt.load_via_partition_root, 1},
 		{"role", required_argument, NULL, 3},
 		{"section", required_argument, NULL, 5},
 		{"serializable-deferrable", no_argument, &dopt.serializable_deferrable, 1},
@@ -959,6 +961,7 @@ help(const char *progname)
 	printf(_("  --no-tablespaces             do not dump tablespace assignments\n"));
 	printf(_("  --no-unlogged-table-data     do not dump unlogged table data\n"));
 	printf(_("  --quote-all-identifiers      quote all identifiers, even if not key words\n"));
+	printf(_("  --load-via-partition-root    load partition table via the root table\n"));
 	printf(_("  --section=SECTION            dump named section (pre-data, data, or post-data)\n"));
 	printf(_("  --serializable-deferrable    wait until the dump can run without anomalies\n"));
 	printf(_("  --snapshot=SNAPSHOT          use given snapshot for the dump\n"));
@@ -1902,8 +1905,32 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 			if (insertStmt == NULL)
 			{
 				insertStmt = createPQExpBuffer();
+
+				/*
+				 * When load-via-partition-root is set, get the root table
+				 * name for the partition table, so that we can reload data
+				 * through the root table.
+				 */
+				if (dopt->load_via_partition_root && tbinfo->ispartition)
+				{
+					TableInfo  *parentTbinfo;
+
+					parentTbinfo = getRootTableInfo(tbinfo);
+
+					/*
+					 * When we loading data through the root, we will qualify
+					 * the table name. This is needed because earlier
+					 * search_path will be set for the partition table.
+					 */
+					classname = (char *) fmtQualifiedId(fout->remoteVersion,
+														parentTbinfo->dobj.namespace->dobj.name,
+														parentTbinfo->dobj.name);
+				}
+				else
+					classname = fmtId(tbinfo->dobj.name);
+
 				appendPQExpBuffer(insertStmt, "INSERT INTO %s ",
-								  fmtId(classname));
+								  classname);
 
 				/* corner case for zero-column table */
 				if (nfields == 0)
@@ -2025,6 +2052,27 @@ dumpTableData_insert(Archive *fout, void *dcontext)
 	return 1;
 }
 
+/*
+ * getRootTableInfo:
+ *     get the root TableInfo for the given partition table.
+ */
+static TableInfo *
+getRootTableInfo(TableInfo *tbinfo)
+{
+	TableInfo  *parentTbinfo;
+
+	Assert(tbinfo->ispartition);
+	Assert(tbinfo->numParents == 1);
+
+	parentTbinfo = tbinfo->parents[0];
+	while (parentTbinfo->ispartition)
+	{
+		Assert(parentTbinfo->numParents == 1);
+		parentTbinfo = parentTbinfo->parents[0];
+	}
+
+	return parentTbinfo;
+}
 
 /*
  * dumpTableData -
@@ -2041,14 +2089,38 @@ dumpTableData(Archive *fout, TableDataInfo *tdinfo)
 	PQExpBuffer clistBuf = createPQExpBuffer();
 	DataDumperPtr dumpFn;
 	char	   *copyStmt;
+	const char *copyFrom;
 
 	if (!dopt->dump_inserts)
 	{
 		/* Dump/restore using COPY */
 		dumpFn = dumpTableData_copy;
-		/* must use 2 steps here 'cause fmtId is nonreentrant */
+
+		/*
+		 * When load-via-partition-root is set, get the root table name for
+		 * the partition table, so that we can reload data through the root
+		 * table.
+		 */
+		if (dopt->load_via_partition_root && tbinfo->ispartition)
+		{
+			TableInfo  *parentTbinfo;
+
+			parentTbinfo = getRootTableInfo(tbinfo);
+
+			/*
+			 * When we loading data through the root, we will qualify the
+			 * table name. This is needed because earlier search_path will be
+			 * set for the partition table.
+			 */
+			copyFrom = fmtQualifiedId(fout->remoteVersion,
+									  parentTbinfo->dobj.namespace->dobj.name,
+									  parentTbinfo->dobj.name);
+		}
+		else
+			copyFrom = fmtId(tbinfo->dobj.name);
+
 		appendPQExpBuffer(copyBuf, "COPY %s ",
-						  fmtId(tbinfo->dobj.name));
+						  copyFrom);
 		appendPQExpBuffer(copyBuf, "%s %sFROM stdin;\n",
 						  fmtCopyColumnList(tbinfo, clistBuf),
 						  (tdinfo->oids && tbinfo->hasoids) ? "WITH OIDS " : "");
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index b14bb8e..c4741c2 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -80,6 +80,7 @@ static int	no_subscriptions = 0;
 static int	no_unlogged_table_data = 0;
 static int	no_role_passwords = 0;
 static int	server_version;
+static int	load_via_partition_root = 0;
 
 static char role_catalog[10];
 #define PG_AUTHID "pg_authid"
@@ -128,6 +129,7 @@ main(int argc, char *argv[])
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, &no_tablespaces, 1},
 		{"quote-all-identifiers", no_argument, &quote_all_identifiers, 1},
+		{"load-via-partition-root", no_argument, &load_via_partition_root, 1},
 		{"role", required_argument, NULL, 3},
 		{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
 		{"no-publications", no_argument, &no_publications, 1},
@@ -385,6 +387,8 @@ main(int argc, char *argv[])
 		appendPQExpBufferStr(pgdumpopts, " --no-tablespaces");
 	if (quote_all_identifiers)
 		appendPQExpBufferStr(pgdumpopts, " --quote-all-identifiers");
+	if (load_via_partition_root)
+		appendPQExpBufferStr(pgdumpopts, " --load-via-partition-root");
 	if (use_setsessauth)
 		appendPQExpBufferStr(pgdumpopts, " --use-set-session-authorization");
 	if (no_publications)
@@ -606,6 +610,7 @@ help(void)
 	printf(_("  --no-tablespaces             do not dump tablespace assignments\n"));
 	printf(_("  --no-unlogged-table-data     do not dump unlogged table data\n"));
 	printf(_("  --quote-all-identifiers      quote all identifiers, even if not key words\n"));
+	printf(_("  --load-via-partition-root    load partition table via the root table\n"));
 	printf(_("  --use-set-session-authorization\n"
 			 "                               use SET SESSION AUTHORIZATION commands instead of\n"
 			 "                               ALTER OWNER commands to set ownership\n"));
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to