Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Fri, Apr 9, 2021 at 9:49 AM Justin Pryzby wrote: > I rebased this patch to resolve a trivial 1 line conflict from c5b7ba4e6. Thanks for rebasing! Actually, I've started reviewing this, but I couldn't finish my review. My apologies for not having much time on this. I'll continue to work on it for PG15. Sorry for the empty email. Best regards, Etsuro Fujita
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Fri, Apr 9, 2021 at 9:49 AM Justin Pryzby wrote: > > I rebased this patch to resolve a trivial 1 line conflict from c5b7ba4e6. > > -- > Justin
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Thu, Apr 8, 2021 at 5:49 PM Justin Pryzby wrote: > I rebased this patch to resolve a trivial 1 line conflict from c5b7ba4e6. > > -- > Justin > Hi, In src/backend/commands/copyfrom.c : + if (resultRelInfo->ri_RelationDesc->rd_rel->relkind == RELKIND_FOREIGN_TABLE) There are a few steps of indirection. Adding assertion before the if statement on resultRelInfo->ri_RelationDesc, etc would help catch potential invalid pointer. +CopyToStart(CopyToState cstate) ... +CopyToFinish(CopyToState cstate) Since 'copy to' is the action, it would be easier to read the method names if they're called StartCopyTo, FinishCopyTo, respectively. That way, the method names would be consistent with existing ones, such as: extern uint64 DoCopyTo(CopyToState cstate); +* If a partition's root parent isn't allowed to use it, neither is the In the above sentence, 'it' refers to multi insert. It would be more readable to explicitly mention 'multi insert' instead of 'it' Cheers
Re: [POC] Fast COPY FROM command for the table with foreign partitions
I rebased this patch to resolve a trivial 1 line conflict from c5b7ba4e6. -- Justin >From 0987ca4f62fb8c9b43a3fe142d955d8a9cb6f36f Mon Sep 17 00:00:00 2001 From: Takayuki Tsunakawa Date: Tue, 9 Feb 2021 12:50:00 +0900 Subject: [PATCH] Fast COPY FROM into the foreign or sharded table. This feature enables bulk COPY into foreign table when multi-insert is possible and foreign table has non-zero number of columns. The following routines are added to the FDW interface: * BeginForeignCopy * ExecForeignCopy * EndForeignCopy BeginForeignCopy and EndForeignCopy initialize and free the CopyState of bulk COPY. The ExecForeignCopy routine runs 'COPY ... FROM STDIN' command to the foreign server, in an iterative manner to send tuples using the CopyTo() machinery. Code that constructs a list of columns for a given foreign relation in the deparseAnalyzeSql() routine is split into deparseRelColumnList(). It is reused in deparseCopyFromSql(). Added TAP-tests on the specific corner cases of COPY FROM STDIN operation. By the analogy of CopyFrom() the CopyState structure was extended with data_dest_cb callback. It is used to send the text representation of a tuple to a custom destination. The PgFdwModifyState structure is extended with the cstate field. It is needed for avoid repeated initialization of CopyState. Also for this reason CopyTo() routine is split into the set of routines CopyToStart()/ CopyTo()/CopyToFinish(). When 0d5f05cde introduced support for using multi-insert mode when copying into partitioned tables, it introduced single variable of enum type CopyInsertMethod shared across all potential target relations (partitions) that, along with some target relation properties, dictated whether to engage multi-insert mode for a given target relation. Change that decision logic to the combination of ExecMultiInsertAllowed() and its caller. The former encapsulates the common criteria to allow multi-insert. The latter uses additional criteria and sets the new boolean field ri_usesMultiInsert of ResultRelInfo. That prevents repeated computation of the same information in some cases, especially for partitions, and the new arrangement results in slightly more readability. Enum CopyInsertMethod is removed. Authors: Andrey Lepikhov, Ashutosh Bapat, Amit Langote, Takayuki Tsunakawa Reviewed-by: Ashutosh Bapat, Amit Langote, Takayuki Tsunakawa Discussion: https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru --- contrib/postgres_fdw/deparse.c| 63 +++- .../postgres_fdw/expected/postgres_fdw.out| 46 ++- contrib/postgres_fdw/postgres_fdw.c | 141 + contrib/postgres_fdw/postgres_fdw.h | 1 + contrib/postgres_fdw/sql/postgres_fdw.sql | 45 +++ doc/src/sgml/fdwhandler.sgml | 71 - src/backend/commands/copy.c | 2 +- src/backend/commands/copyfrom.c | 271 -- src/backend/commands/copyto.c | 88 -- src/backend/executor/execMain.c | 44 +++ src/backend/executor/execPartition.c | 37 ++- src/include/commands/copy.h | 5 + src/include/commands/copyfrom_internal.h | 10 - src/include/executor/executor.h | 1 + src/include/foreign/fdwapi.h | 15 + src/include/nodes/execnodes.h | 8 +- 16 files changed, 637 insertions(+), 211 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index bdc4c3620d..bf93c1d091 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -185,6 +185,8 @@ static void appendAggOrderBy(List *orderList, List *targetList, static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); static Node *deparseSortGroupClause(Index ref, List *tlist, bool force_colno, deparse_expr_cxt *context); +static List *deparseRelColumnList(StringInfo buf, Relation rel, + bool enclose_in_parens); /* * Helper functions @@ -1859,6 +1861,23 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte, withCheckOptionList, returningList, retrieved_attrs); } +/* + * Deparse remote COPY FROM statement + * + * Note that this explicitly specifies the list of COPY's target columns + * to account for the fact that the remote table's columns may not match + * exactly with the columns declared in the local definition. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + (void) deparseRelColumnList(buf, rel, true); + + appendStringInfoString(buf, " FROM STDIN "); +} + /* * deparse remote UPDATE statement * @@ -2120,6 +2139,30 @@ deparseAnalyzeSizeSql(StringInfo buf, Relation rel) */ void deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) +{ + appendStringInfoString(buf, "SELECT "); + *retrieved_attrs = deparseRelColumnList(buf, rel,
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Justin Pryzby > On Mon, Mar 22, 2021 at 08:18:56PM -0700, Zhihong Yu wrote: > > with data_dest_cb callback. It is used for send text representation of a > > tuple to a custom destination. > > > > send text -> sending text > > I would say "It is used to send the text representation ..." I took Justin-san's suggestion. (It feels like I'm in a junior English class...) > It's actually a pointer: > src/include/commands/copy.h:typedef struct CopyToStateData *CopyToState; > > There's many data structures like this, where a structure is typedefed with a > "Data" suffix and the pointer is typedefed without the "Data" Yes. Thank you for good explanation, Justin-san. Regards TakayukiTsunakawa v22-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch Description: v22-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Mon, Mar 22, 2021 at 08:18:56PM -0700, Zhihong Yu wrote: > with data_dest_cb callback. It is used for send text representation of a > tuple to a custom destination. > > send text -> sending text I would say "It is used to send the text representation ..." > struct PgFdwModifyState *aux_fmstate; /* foreign-insert state, if > * created */ > + CopyToState cstate; /* foreign COPY state, if used */ > > Since foreign COPY is optional, should cstate be a pointer ? That would be > in line with aux_fmstate. It's actually a pointer: src/include/commands/copy.h:typedef struct CopyToStateData *CopyToState; There's many data structures like this, where a structure is typedefed with a "Data" suffix and the pointer is typedefed without the "Data" -- Justin
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi, In the description: with data_dest_cb callback. It is used for send text representation of a tuple to a custom destination. send text -> sending text struct PgFdwModifyState *aux_fmstate; /* foreign-insert state, if * created */ + CopyToState cstate; /* foreign COPY state, if used */ Since foreign COPY is optional, should cstate be a pointer ? That would be in line with aux_fmstate. Cheers On Mon, Mar 22, 2021 at 7:02 PM tsunakawa.ta...@fujitsu.com < tsunakawa.ta...@fujitsu.com> wrote: > From: Andrey Lepikhov > > Macros _() at the postgresExecForeignCopy routine: > > if (PQputCopyEnd(conn, OK ? NULL : _("canceled by server")) <= 0) > > > > uses gettext. Under linux it is compiled ok, because (as i understood) > > uses standard implementation of gettext: > > objdump -t contrib/postgres_fdw/postgres_fdw.so | grep 'gettext' > > gettext@@GLIBC_2.2.5 > > > > but in MacOS (and maybe somewhere else) we need to explicitly link > > libintl library in the Makefile: > > SHLIB_LINK += $(filter -lintl, $(LIBS) > > > > Also, we may not use gettext at all in this part of the code. > > I'm afraid so, because no extension in contrib/ has po/ directory. I just > removed _() and rebased the patch on HEAD. > > > Regards > TakayukiTsunakawa > > >
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Andrey Lepikhov > Macros _() at the postgresExecForeignCopy routine: > if (PQputCopyEnd(conn, OK ? NULL : _("canceled by server")) <= 0) > > uses gettext. Under linux it is compiled ok, because (as i understood) > uses standard implementation of gettext: > objdump -t contrib/postgres_fdw/postgres_fdw.so | grep 'gettext' > gettext@@GLIBC_2.2.5 > > but in MacOS (and maybe somewhere else) we need to explicitly link > libintl library in the Makefile: > SHLIB_LINK += $(filter -lintl, $(LIBS) > > Also, we may not use gettext at all in this part of the code. I'm afraid so, because no extension in contrib/ has po/ directory. I just removed _() and rebased the patch on HEAD. Regards TakayukiTsunakawa v21-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch Description: v21-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 5/3/21 21:54, tsunakawa.ta...@fujitsu.com wrote: I've managed to rebased it, although it took unexpectedly long. The patch is attached. It passes make check against core and postgres_fdw. I'll turn the CF status back to ready for committer shortly. Macros _() at the postgresExecForeignCopy routine: if (PQputCopyEnd(conn, OK ? NULL : _("canceled by server")) <= 0) uses gettext. Under linux it is compiled ok, because (as i understood) uses standard implementation of gettext: objdump -t contrib/postgres_fdw/postgres_fdw.so | grep 'gettext' gettext@@GLIBC_2.2.5 but in MacOS (and maybe somewhere else) we need to explicitly link libintl library in the Makefile: SHLIB_LINK += $(filter -lintl, $(LIBS) Also, we may not use gettext at all in this part of the code. -- regards, Andrey Lepikhov Postgres Professional
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Justin Pryzby > Could you rebase again and send an updated patch ? > I could do it if you want. Rebased and attached. Fortunately, there was no rebase conflict this time. make check passed for PG core and postgres_fdw. Regards Takayuki Tsunakawa v20-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch Description: v20-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Justin Pryzby > I think this change to the regression tests is suspicous: > > -CONTEXT: remote SQL command: INSERT INTO public.loc2(f1, f2) VALUES > ($1, $2) > -COPY rem2, line 1: "-1 xyzzy" > +CONTEXT: COPY loc2, line 1: "-1 xyzzy" > +remote SQL command: COPY public.loc2(f1, f2) FROM STDIN > +COPY rem2, line 2 > > I think it shouldn't say "COPY rem2, line 2" but rather a remote version of > the > same: > |COPY loc2, line 1: "-1 xyzzy" No, the output is OK. The remote message is included as the first line of the CONTEXT message field. The last line of the CONTEXT field is something that was added by the local COPY command. (Anyway, useful enough information is included in the message -- the constraint violation and the data that caused it.) > I have rebased this on my side over yesterday's libpq changes - I'll send it > if > you want, but it's probably just as easy if you do it. I've managed to rebased it, although it took unexpectedly long. The patch is attached. It passes make check against core and postgres_fdw. I'll turn the CF status back to ready for committer shortly. Regards Takayuki Tsunakawa v19-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch Description: v19-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch
Re: [POC] Fast COPY FROM command for the table with foreign partitions
I think this change to the regression tests is suspicous: -CONTEXT: remote SQL command: INSERT INTO public.loc2(f1, f2) VALUES ($1, $2) -COPY rem2, line 1: "-1 xyzzy" +CONTEXT: COPY loc2, line 1: "-1 xyzzy" +remote SQL command: COPY public.loc2(f1, f2) FROM STDIN +COPY rem2, line 2 I think it shouldn't say "COPY rem2, line 2" but rather a remote version of the same: |COPY loc2, line 1: "-1 xyzzy" I have rebased this on my side over yesterday's libpq changes - I'll send it if you want, but it's probably just as easy if you do it. -- Justin
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Thu, Mar 4, 2021 at 12:40 PM tsunakawa.ta...@fujitsu.com < tsunakawa.ta...@fujitsu.com> wrote: > From: Zhihong Yu > > This feature enables bulk COPY into foreign table in the case of > > multi inserts is possible > > > > 'is possible' -> 'if possible' > > > > FDWAPI was extended by next routines: > > > > next routines -> the following routines > > Thank you, fixed slightly differently. (I feel the need for pgEnglish > again.) > > > > + if ((!OK && PQresultStatus(res) != PGRES_FATAL_ERROR) || > > > > Is PGRES_FATAL_ERROR handled somewhere else ? I don't seem to find that > in the patch. > > Good catch. ok doesn't need to be consulted here, because failure during > row transmission causes PQputCopyEnd() to receive non-NULL for its second > argument, which in turn makes PQgetResult() return non-COMMAND_OK. > > > Regards > Takayuki Tsunakawa > > This patch set no longer applies http://cfbot.cputube.org/patch_32_2601.log Can we get a rebase? I am marking the patch "Waiting on Author" -- Ibrar Ahmed
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Zhihong Yu > This feature enables bulk COPY into foreign table in the case of > multi inserts is possible > > 'is possible' -> 'if possible' > > FDWAPI was extended by next routines: > > next routines -> the following routines Thank you, fixed slightly differently. (I feel the need for pgEnglish again.) > + if ((!OK && PQresultStatus(res) != PGRES_FATAL_ERROR) || > > Is PGRES_FATAL_ERROR handled somewhere else ? I don't seem to find that in > the patch. Good catch. ok doesn't need to be consulted here, because failure during row transmission causes PQputCopyEnd() to receive non-NULL for its second argument, which in turn makes PQgetResult() return non-COMMAND_OK. Regards Takayuki Tsunakawa v18-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch Description: v18-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi, This feature enables bulk COPY into foreign table in the case of multi inserts is possible 'is possible' -> 'if possible' FDWAPI was extended by next routines: next routines -> the following routines For postgresExecForeignCopy(): + if ((!OK && PQresultStatus(res) != PGRES_FATAL_ERROR) || Is PGRES_FATAL_ERROR handled somewhere else ? I don't seem to find that in the patch. Cheers On Wed, Mar 3, 2021 at 6:24 PM tsunakawa.ta...@fujitsu.com < tsunakawa.ta...@fujitsu.com> wrote: > From: Justin Pryzby > > Find attached some language fixes. > > Thanks a lot! (I wish there will be some tool like "pgEnglish" that > corrects English in code comments and docs.) > > > > |/* Do this to ensure we've pumped libpq back to idle state */ > > > > I don't know why you mean by "pumped"? > > I changed it to "have not gotten extra results" to match the error message. > > > > The CopySendEndOfRow "case COPY_CALLBACK:" should have a "break;" > > Added. > > > This touches some of the same parts as my "bulk insert" patch: > > https://commitfest.postgresql.org/32/2553/ > > My colleague will be reviewing it. > > > Regards > Takayuki Tsunakawa > >
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Justin Pryzby > Find attached some language fixes. Thanks a lot! (I wish there will be some tool like "pgEnglish" that corrects English in code comments and docs.) > |/* Do this to ensure we've pumped libpq back to idle state */ > > I don't know why you mean by "pumped"? I changed it to "have not gotten extra results" to match the error message. > The CopySendEndOfRow "case COPY_CALLBACK:" should have a "break;" Added. > This touches some of the same parts as my "bulk insert" patch: > https://commitfest.postgresql.org/32/2553/ My colleague will be reviewing it. Regards Takayuki Tsunakawa v17-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch Description: v17-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Find attached some language fixes. |/* Do this to ensure we've pumped libpq back to idle state */ I don't know why you mean by "pumped"? The CopySendEndOfRow "case COPY_CALLBACK:" should have a "break;" This touches some of the same parts as my "bulk insert" patch: https://commitfest.postgresql.org/32/2553/ -- Justin >From f7bb368963f5808bc5126f179b78507ca52b9cd2 Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Wed, 24 Feb 2021 02:23:17 -0600 Subject: [PATCH] language fixen --- contrib/postgres_fdw/postgres_fdw.c | 4 ++-- doc/src/sgml/fdwhandler.sgml| 13 +++-- src/backend/commands/copyfrom.c | 2 +- src/backend/commands/copyto.c | 4 ++-- 4 files changed, 12 insertions(+), 11 deletions(-) diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index b55f19b193..a3c394360e 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -2203,7 +2203,7 @@ pgfdw_copy_dest_cb(void *buf, int len) /* * postgresBeginForeignCopy - * Begin an COPY operation on a foreign table + * Begin a COPY operation on a foreign table */ static void postgresBeginForeignCopy(EState *estate, @@ -2306,7 +2306,7 @@ postgresExecForeignCopy(ResultRelInfo *resultRelInfo, /* * postgresEndForeignCopy - * Finish an COPY operation on a foreign table + * Finish a COPY operation on a foreign table */ static void postgresEndForeignCopy(EState *estate, ResultRelInfo *resultRelInfo) diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml index a49c17251f..666148aeb3 100644 --- a/doc/src/sgml/fdwhandler.sgml +++ b/doc/src/sgml/fdwhandler.sgml @@ -813,8 +813,9 @@ BeginForeignInsert(ModifyTableState *mtstate, Begin executing an insert operation on a foreign table. This routine is called right before the first tuple is inserted into the foreign table - in both cases when it is the partition chosen for tuple routing and the - target specified in a COPY FROM command. It should + target specified in a COPY FROM command, or when + the foreign table is the partition chosen for tuple routing of a + partitioned table. It should perform any initialization needed prior to the actual insertion. Subsequently, ExecForeignInsert or ExecForeignBatchInsert will be called for @@ -1072,12 +1073,12 @@ BeginForeignCopy(EState *estate, ResultRelInfo *rinfo); - Begin executing an copy operation on a foreign table. This routine is + Begin executing a copy operation on a foreign table. This routine is called right before the first call of ExecForeignCopy routine for the foreign table. It should perform any initialization needed prior to the actual COPY FROM operation. Subsequently, ExecForeignCopy will be called for - a bulk of tuples to be copied into the foreign table. + a batch of tuples to be copied into the foreign table. @@ -1101,12 +1102,12 @@ ExecForeignCopy(ResultRelInfo *rinfo, int nslots); - Copy a bulk of tuples into the foreign table. + Copy a batch of tuples into the foreign table. rinfo is the ResultRelInfo struct describing the target foreign table. slots contains the tuples to be inserted; it will match the row-type definition of the foreign table. - nslots is a number of tuples in the slots + nslots is the number of tuples in the slots diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c index 7b05da7871..b7c912cc3f 100644 --- a/src/backend/commands/copyfrom.c +++ b/src/backend/commands/copyfrom.c @@ -660,7 +660,7 @@ CopyFrom(CopyFromState cstate) resultRelInfo = target_resultRelInfo = makeNode(ResultRelInfo); ExecInitResultRelation(estate, resultRelInfo, 1); - Assert(target_resultRelInfo->ri_usesMultiInsert == false); + Assert(!target_resultRelInfo->ri_usesMultiInsert); /* * It's generally more efficient to prepare a bunch of tuples for diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c index 03c9df5084..2ac1e27acd 100644 --- a/src/backend/commands/copyto.c +++ b/src/backend/commands/copyto.c @@ -859,8 +859,8 @@ EndCopyTo(CopyToState cstate) /* * Start COPY TO operation. - * Separated to the routine to prevent duplicate operations in the case of - * manual mode, where tuples are copied to the destination one by one, by call of + * Separate from the main routine to prevent duplicate operations in + * manual mode, where tuples are copied to the destination one by one, by calling * the CopyOneRowTo() routine. */ void -- 2.17.0
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 2/15/21 1:31 PM, Amit Langote wrote: Tsunakawa-san, Andrey, +static void +postgresBeginForeignCopy(ModifyTableState *mtstate, + ResultRelInfo *resultRelInfo) +{ ... + if (resultRelInfo->ri_RangeTableIndex == 0) + { + ResultRelInfo *rootResultRelInfo = resultRelInfo->ri_RootResultRelInfo; + + rte = exec_rt_fetch(rootResultRelInfo->ri_RangeTableIndex, estate); It's better to add an Assert(rootResultRelInfo != NULL) here. Apparently, there are cases where ri_RangeTableIndex == 0 without ri_RootResultRelInfo being set. The Assert will ensure that BeginForeignCopy() is not mistakenly called on such ResultRelInfos. +1 I can't parse what the function's comment says about "using list of parameters". Maybe it means to say "list of columns" specified in the COPY FROM statement. How about writing this as: /* * Deparse remote COPY FROM statement * * Note that this explicitly specifies the list of COPY's target columns * to account for the fact that the remote table's columns may not match * exactly with the columns declared in the local definition. */ I'm hoping that I'm interpreting the original note correctly. Andrey? Yes, this is a good option. + + mtstate is the overall state of the + ModifyTable plan node being executed; global data about + the plan and execution state is available via this structure. ... +typedef void (*BeginForeignCopy_function) (ModifyTableState *mtstate, + ResultRelInfo *rinfo); Maybe a bit late realizing this, but why does BeginForeignCopy() accept a ModifyTableState pointer whereas maybe just an EState pointer will do? I can't imagine why an FDW would want to look at the ModifyTableState. Case in point, I see that postgresBeginForeignCopy() only uses the EState from the ModifyTableState passed to it. I think the ResultRelInfo that's being passed to the Copy APIs contains most of the necessary information. Also, EndForeignCopy() seems fine with just receiving the EState. +1 If the intention is to only prevent this error, maybe the condition above could be changed as this: /* * Check whether we support copying data out of the specified relation, * unless the caller also passed a non-NULL data_dest_cb, in which case, * the callback will take care of it */ if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION && data_dest_cb == NULL) Agreed. This is an atavism. In the first versions, I did not use the data_dest_cb routine. But now this is a redundant parameter. -- regards, Andrey Lepikhov Postgres Professional
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Amit Langote > Think I have mentioned upthread that this looks better as: > > if (rootResultRelInfo->ri_usesMultiInsert) > leaf_part_rri->ri_usesMultiInsert = ExecMultiInsertAllowed(leaf_part_rri); > > This keeps the logic confined to ExecInitPartitionInfo() where it > belongs. No point in burdening other callers of > ExecMultiInsertAllowed() in deciding whether or not it should pass a > valid value for the 2nd parameter. Oh, that's a good idea. (Why didn't I think of such a simple idea?) > Maybe a bit late realizing this, but why does BeginForeignCopy() > accept a ModifyTableState pointer whereas maybe just an EState pointer > will do? I can't imagine why an FDW would want to look at the > ModifyTableState. Case in point, I see that > postgresBeginForeignCopy() only uses the EState from the > ModifyTableState passed to it. I think the ResultRelInfo that's being > passed to the Copy APIs contains most of the necessary information. You're right. COPY is not under the control of a ModifyTable plan, so it's strange to pass ModifyTableState. > Also, EndForeignCopy() seems fine with just receiving the EState. I think this can have the ResultRelInfo like EndForeignInsert() and EndForeignModify() to correspond to the Begin function: "begin/end COPYing into this relation." > /* > * Check whether we support copying data out of the specified relation, > * unless the caller also passed a non-NULL data_dest_cb, in which case, > * the callback will take care of it > */ > if (rel != NULL && rel->rd_rel->relkind != RELKIND_RELATION && > data_dest_cb == NULL) > > I just checked that this works or at least doesn't break any newly added > tests. Good idea, too. The code has become more readable. Thank you a lot. Your other comments that are not mentioned above are also reflected. The attached patch passes the postgres_fdw regression test. Regards Takayuki Tsunakawa v16-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch Description: v16-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Tsunakawa-san, Andrey, On Mon, Feb 15, 2021 at 1:54 PM tsunakawa.ta...@fujitsu.com wrote: > From: Justin Pryzby > > This is crashing during fdw check. > > http://cfbot.cputube.org/andrey-lepikhov.html > > > > Maybe it's related to this patch: > > |commit 6214e2b2280462cbc3aa1986e350e167651b3905 > > |Fix permission checks on constraint violation errors on partitions. > > |Security: CVE-2021-3393 > > Thank you for your kind detailed investigation. The rebased version is > attached. Thanks for updating the patch. The commit message says this: Move that decision logic into InitResultRelInfo which now sets a new boolean field ri_usesMultiInsert of ResultRelInfo when a target relation is first initialized. That prevents repeated computation of the same information in some cases, especially for partitions, and the new arrangement results in slightly more readability. Enum CopyInsertMethod removed. This logic implements by ri_usesMultiInsert field of the ResultRelInfo structure. However, it is no longer InitResultRelInfo() that sets ri_usesMultiInsert. Doing that is now left for concerned functions who set it when they have enough information to do that correctly. Maybe update the message to make that clear to interested readers. + /* +* Use multi-insert mode if the condition checking passes for the +* parent and its child. +*/ + leaf_part_rri->ri_usesMultiInsert = + ExecMultiInsertAllowed(leaf_part_rri, rootResultRelInfo); Think I have mentioned upthread that this looks better as: if (rootResultRelInfo->ri_usesMultiInsert) leaf_part_rri->ri_usesMultiInsert = ExecMultiInsertAllowed(leaf_part_rri); This keeps the logic confined to ExecInitPartitionInfo() where it belongs. No point in burdening other callers of ExecMultiInsertAllowed() in deciding whether or not it should pass a valid value for the 2nd parameter. +static void +postgresBeginForeignCopy(ModifyTableState *mtstate, + ResultRelInfo *resultRelInfo) +{ ... + if (resultRelInfo->ri_RangeTableIndex == 0) + { + ResultRelInfo *rootResultRelInfo = resultRelInfo->ri_RootResultRelInfo; + + rte = exec_rt_fetch(rootResultRelInfo->ri_RangeTableIndex, estate); It's better to add an Assert(rootResultRelInfo != NULL) here. Apparently, there are cases where ri_RangeTableIndex == 0 without ri_RootResultRelInfo being set. The Assert will ensure that BeginForeignCopy() is not mistakenly called on such ResultRelInfos. +/* + * Deparse COPY FROM into given buf. + * We need to use list of parameters at each query. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + (void) deparseRelColumnList(buf, rel, true); + + appendStringInfoString(buf, " FROM STDIN "); +} I can't parse what the function's comment says about "using list of parameters". Maybe it means to say "list of columns" specified in the COPY FROM statement. How about writing this as: /* * Deparse remote COPY FROM statement * * Note that this explicitly specifies the list of COPY's target columns * to account for the fact that the remote table's columns may not match * exactly with the columns declared in the local definition. */ I'm hoping that I'm interpreting the original note correctly. Andrey? + + mtstate is the overall state of the + ModifyTable plan node being executed; global data about + the plan and execution state is available via this structure. ... +typedef void (*BeginForeignCopy_function) (ModifyTableState *mtstate, + ResultRelInfo *rinfo); Maybe a bit late realizing this, but why does BeginForeignCopy() accept a ModifyTableState pointer whereas maybe just an EState pointer will do? I can't imagine why an FDW would want to look at the ModifyTableState. Case in point, I see that postgresBeginForeignCopy() only uses the EState from the ModifyTableState passed to it. I think the ResultRelInfo that's being passed to the Copy APIs contains most of the necessary information. Also, EndForeignCopy() seems fine with just receiving the EState. + TupleDesc tupDesc;/* COPY TO will be used for manual tuple copying + * into the destination */ ... @@ -382,19 +393,24 @@ EndCopy(CopyToState cstate) CopyToState BeginCopyTo(ParseState *pstate, Relation rel, + TupleDesc srcTupDesc, I think that either the commentary around tupDesc/srcTupDesc needs to be improved or we should really find a way to do this without maintaining TupleDesc separately from the CopyState.rel. IIUC, this change is merely to allow postgres_fdw's ExecForeignCopy() to use CopyOneRowTo() which needs to be passed a valid CopyState. postgresBeginForeignCopy() initializes one by calling BeginCopyTo(), but it can't just pass the target foreign Relation to it, because generic BeginCopyTo() has this: if
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Justin Pryzby > This is crashing during fdw check. > http://cfbot.cputube.org/andrey-lepikhov.html > > Maybe it's related to this patch: > |commit 6214e2b2280462cbc3aa1986e350e167651b3905 > |Fix permission checks on constraint violation errors on partitions. > |Security: CVE-2021-3393 Thank you for your kind detailed investigation. The rebased version is attached. Regards Takayuki Tsunakawa v15-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch Description: v15-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Tue, Feb 09, 2021 at 04:35:03AM +, tsunakawa.ta...@fujitsu.com wrote: > Rebased to HEAD with the following modifications. It passes make check in > the top directory and contrib/postgres_fdw. > That said, with the reviews from some people and good performance results, I > think this can be ready for committer. This is crashing during fdw check. http://cfbot.cputube.org/andrey-lepikhov.html Maybe it's related to this patch: |commit 6214e2b2280462cbc3aa1986e350e167651b3905 |Fix permission checks on constraint violation errors on partitions. |Security: CVE-2021-3393 TRAP: FailedAssertion("n >= 0 && n < list->length", File: "../../src/include/nodes/pg_list.h", Line: 259, PID: 19780) (gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x7fd33a557801 in __GI_abort () at abort.c:79 #2 0x55f7f53bbc88 in ExceptionalCondition (conditionName=conditionName@entry=0x7fd33b81bc40 "n >= 0 && n < list->length", errorType=errorType@entry=0x7fd33b81b698 "FailedAssertion", fileName=fileName@entry=0x7fd33b81be70 "../../src/include/nodes/pg_list.h", lineNumber=lineNumber@entry=259) at assert.c:69 #3 0x7fd33b816b54 in list_nth_cell (n=, list=) at ../../src/include/nodes/pg_list.h:259 #4 list_nth (n=, list=) at ../../src/include/nodes/pg_list.h:281 #5 exec_rt_fetch (estate=, rti=) at ../../src/include/executor/executor.h:558 #6 postgresBeginForeignCopy (mtstate=, resultRelInfo=) at postgres_fdw.c:2208 #7 0x55f7f5114bb4 in ExecInitRoutingInfo (mtstate=mtstate@entry=0x55f7f710a508, estate=estate@entry=0x55f7f71a7d50, proute=proute@entry=0x55f7f710a720, dispatch=dispatch@entry=0x55f7f710a778, partRelInfo=partRelInfo@entry=0x55f7f710eb20, partidx=partidx@entry=0) at execPartition.c:1004 #8 0x55f7f511618d in ExecInitPartitionInfo (partidx=0, rootResultRelInfo=0x55f7f710a278, dispatch=0x55f7f710a778, proute=0x55f7f710a720, estate=0x55f7f71a7d50, mtstate=0x55f7f710a508) at execPartition.c:742 #9 ExecFindPartition () at execPartition.c:400 #10 0x55f7f50a2718 in CopyFrom () at copyfrom.c:857 #11 0x55f7f50a1b06 in DoCopy () at copy.c:299 (gdb) up #7 0x55f7f5114bb4 in ExecInitRoutingInfo (mtstate=mtstate@entry=0x55f7f710a508, estate=estate@entry=0x55f7f71a7d50, proute=proute@entry=0x55f7f710a720, dispatch=dispatch@entry=0x55f7f710a778, partRelInfo=partRelInfo@entry=0x55f7f710eb20, partidx=partidx@entry=0) at execPartition.c:1004 1004 partRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate, partRelInfo); (gdb) p partRelInfo->ri_RangeTableIndex $7 = 0 (gdb) p *estate->es_range_table $9 = {type = T_List, length = 1, max_length = 5, elements = 0x55f7f717a2c0, initial_elements = 0x55f7f717a2c0} -- Justin
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Andrey V. Lepikhov > On 2/9/21 12:47 PM, tsunakawa.ta...@fujitsu.com wrote: > > As Tang-san and you showed, the basic part already demonstrated > impressive improvement. If there's no objection, I'd like to make this ready > for > committer in a few days. > Good. I've marked this as ready for committer. Good luck. Regards Takayuki Tsunakawa
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Andrey V. Lepikhov > I tried to use 1E4 - 1E8 rows in a tuple buffer. But the results weren't > impressive. I guess that's because the 64 KB threshold came first. > We can use one more GUC instead of a precompiled constant. Yes, agreed. > > Why don't we focus on committing the basic part and addressing the > extended part (0003 and 0004) separately later? > I focused only on the 0001 and 0002 patches. > > As Tang-san and you showed, the basic part already demonstrated > impressive improvement. If there's no objection, I'd like to make this ready > for > committer in a few days. > Good. Glad to hear that. Regards Takayuki Tsunakawa
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 2/9/21 12:47 PM, tsunakawa.ta...@fujitsu.com wrote: From: Andrey V. Lepikhov I guess you used many hash partitions. Sadly, The current COPY implementation only accumulates either 1,000 rows or 64 KB of input data (very small!) before flushing all CopyMultiInsertBuffers. One CopyMultiInsertBuffer corresponds to one partition. Flushing a CopyMultiInsertBuffer calls ExecForeignCopy() once, which connects to a remote database, runs COPY FROM STDIN, and disconnects. Here, the flushing trigger (1,000 rows or 64 KB input data, whichever comes first) is so small that if there are many target partitions, the amount of data for each partition is small. I tried to use 1E4 - 1E8 rows in a tuple buffer. But the results weren't impressive. We can use one more GUC instead of a precompiled constant. Why don't we focus on committing the basic part and addressing the extended part (0003 and 0004) separately later? I focused only on the 0001 and 0002 patches. As Tang-san and you showed, the basic part already demonstrated impressive improvement. If there's no objection, I'd like to make this ready for committer in a few days. Good. -- regards, Andrey Lepikhov Postgres Professional
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Andrey V. Lepikhov > On 2/9/21 9:35 AM, tsunakawa.ta...@fujitsu.com wrote: > > * Why is a separate FDW connection established for each COPY? To avoid > using the same FDW connection for multiple foreign table partitions in a > single > COPY run? > With separate connection you can init a 'COPY FROM' session for each > foreign partition just one time on partition initialization. > > > > * In what kind of test did you get 2-4x performance gain? COPY into many > foreign table partitions where the input rows are ordered randomly enough that > many rows don't accumulate in the COPY buffer? > I used 'INSERT INTO .. SELECT * FROM generate_series(1, N)' to generate > test data and HASH partitioning to avoid skews. I guess you used many hash partitions. Sadly, The current COPY implementation only accumulates either 1,000 rows or 64 KB of input data (very small!) before flushing all CopyMultiInsertBuffers. One CopyMultiInsertBuffer corresponds to one partition. Flushing a CopyMultiInsertBuffer calls ExecForeignCopy() once, which connects to a remote database, runs COPY FROM STDIN, and disconnects. Here, the flushing trigger (1,000 rows or 64 KB input data, whichever comes first) is so small that if there are many target partitions, the amount of data for each partition is small. Looking at the triggering threshold values, the description (of MAX_BUFFERED_TUPLES at least) seems to indicate that they take effect per CopyMultiInsertBuffer: /* * No more than this many tuples per CopyMultiInsertBuffer * * Caution: Don't make this too big, as we could end up with this many * CopyMultiInsertBuffer items stored in CopyMultiInsertInfo's * multiInsertBuffers list. Increasing this can cause quadratic growth in * memory requirements during copies into partitioned tables with a large * number of partitions. */ #define MAX_BUFFERED_TUPLES 1000 /* * Flush buffers if there are >= this many bytes, as counted by the input * size, of tuples stored. */ #define MAX_BUFFERED_BYTES 65535 But these threshold take effect across all CopyMultiInsertBuffers: /* * Returns true if the buffers are full */ static inline bool CopyMultiInsertInfoIsFull(CopyMultiInsertInfo *miinfo) { if (miinfo->bufferedTuples >= MAX_BUFFERED_TUPLES || miinfo->bufferedBytes >= MAX_BUFFERED_BYTES) return true; return false; } So, I think the direction to take is to allow more data to accumulate before flushing. I'm not very excited about the way 0003 and 0004 establishes a new connection for each partition; it adds flags to many places, and postgresfdw_xact_callback() has to be aware of COPY-specific processing. Plus, we have to take care of the message difference you found in the regression test. Why don't we focus on committing the basic part and addressing the extended part (0003 and 0004) separately later? As Tang-san and you showed, the basic part already demonstrated impressive improvement. If there's no objection, I'd like to make this ready for committer in a few days. Regards Takayuki Tsunakawa
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 2/9/21 9:35 AM, tsunakawa.ta...@fujitsu.com wrote: From: tsunakawa.ta...@fujitsu.com From: Andrey Lepikhov Also, I might defer working on the extended part (v9 0003 and 0004) and further separate them in a different thread, if it seems to take longer. I reviewed them but haven't rebased them (it seems to take more labor.) Andrey-san, could you tell us: * Why is a separate FDW connection established for each COPY? To avoid using the same FDW connection for multiple foreign table partitions in a single COPY run? With separate connection you can init a 'COPY FROM' session for each foreign partition just one time on partition initialization. * In what kind of test did you get 2-4x performance gain? COPY into many foreign table partitions where the input rows are ordered randomly enough that many rows don't accumulate in the COPY buffer? I used 'INSERT INTO .. SELECT * FROM generate_series(1, N)' to generate test data and HASH partitioning to avoid skews. -- regards, Andrey Lepikhov Postgres Professional
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: tsunakawa.ta...@fujitsu.com > From: Andrey Lepikhov > > Of course, you can rebase it. > > Thank you. I might modify the basic part to incorporate my past proposal > about improving the layering or modularity related to ri_useMultiInsert. > (But I > may end up giving up due to lack of energy.) Rebased to HEAD with the following modifications. It passes make check in the top directory and contrib/postgres_fdw. (1) Placed and ordered new three FDW functions consistently among their documentation, declaration and definition. (2) Check if BeginForeignCopy is not NULL before calling it, because the documentation says it's not mandatory. (3) Changed the function name ExecSetRelationUsesMultiInsert() to ExecMultiInsertAllowed() because it does *not* set anything but returns a boolean value to indicate whether the relation allows multi-insert. I was bugged about this function's interface and the use of ri_usesMultiInsert in ResultRelInfo. I still feel a bit uneasy about things like whether the function should really take the partition root (parent) argument, and whether it's a good design that ri_usesMultiInsert is used for the executor functions to determine which of Begin/EndForeignCopy() or Begin/EndForeignInsert() should be called. I'm fine with COPY using executor, but it feels a bit uncomfortable for the executor functions to be aware of COPY. That said, with the reviews from some people and good performance results, I think this can be ready for committer. > Also, I might defer working on the extended part (v9 0003 and 0004) and > further > separate them in a different thread, if it seems to take longer. I reviewed them but haven't rebased them (it seems to take more labor.) Andrey-san, could you tell us: * Why is a separate FDW connection established for each COPY? To avoid using the same FDW connection for multiple foreign table partitions in a single COPY run? * In what kind of test did you get 2-4x performance gain? COPY into many foreign table partitions where the input rows are ordered randomly enough that many rows don't accumulate in the COPY buffer? Regards Takayuki Tsunakawa v14-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch Description: v14-0001-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Andrey Lepikhov > Of course, you can rebase it. Thank you. I might modify the basic part to incorporate my past proposal about improving the layering or modularity related to ri_useMultiInsert. (But I may end up giving up due to lack of energy.) Also, I might defer working on the extended part (v9 0003 and 0004) and further separate them in a different thread, if it seems to take longer. Regards Takayuki Tsunakawa
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 2/2/21 11:57, tsunakawa.ta...@fujitsu.com wrote: Hello, Andrey-san, From: Tang, Haiying Sometimes before i suggested additional optimization [1] which can additionally speed up COPY by 2-4 times. Maybe you can perform the benchmark for this solution too? ... But the patch no longer applies, I tried to apply on e42b3c3bd6(2021/1/26) but failed. Can you send a rebased version? When do you think you can submit the rebased version of them? (IIUC at the off-list HighGo meeting, you're planning to post them late this week after the global snapshot patch.) Just in case you are not going to do them for the moment, can we rebase and/or further modify them so that they can be committed in PG 14? Of course, you can rebase it. -- regards, Andrey Lepikhov Postgres Professional
RE: [POC] Fast COPY FROM command for the table with foreign partitions
Hello, Andrey-san, From: Tang, Haiying > > Sometimes before i suggested additional optimization [1] which can > > additionally speed up COPY by 2-4 times. Maybe you can perform the > > benchmark for this solution too? ... > But the patch no longer applies, I tried to apply on e42b3c3bd6(2021/1/26) but > failed. > > Can you send a rebased version? I think the basic part of this patch set is the following. The latter file unfortunately no longer applies to HEAD. v13-0001-Move-multi-insert-decision-logic-into-executor.patch v13_3-0002-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch Plus, as Tang-san said, I'm afraid the following files are older and doesn't apply. v9-0003-Add-separated-connections-into-the-postgres_fdw.patch v9-0004-Optimized-version-of-the-Fast-COPY-FROM-feature When do you think you can submit the rebased version of them? (IIUC at the off-list HighGo meeting, you're planning to post them late this week after the global snapshot patch.) Just in case you are not going to do them for the moment, can we rebase and/or further modify them so that they can be committed in PG 14? Regards Takayuki Tsunakawa
RE: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, > Sometimes before i suggested additional optimization [1] which can > additionally speed up COPY by 2-4 times. Maybe you can perform the > benchmark for this solution too? Sorry for the late reply, I just have time to take this test now. But the patch no longer applies, I tried to apply on e42b3c3bd6(2021/1/26) but failed. Can you send a rebased version? Regards, Tang
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 1/11/21 11:16 PM, Tomas Vondra wrote: Hi Andrey, Unfortunately, this no longer applies :-( I tried to apply this on top of c532d15ddd (from 2020/12/30) but even that has non-trivial conflicts. Can you send a rebased version? regards Applied on 044aa9e70e. -- regards, Andrey Lepikhov Postgres Professional >From f8e0cd305c691108313c2365cc4576e4d5e0bd38 Mon Sep 17 00:00:00 2001 From: Andrey Lepikhov Date: Tue, 12 Jan 2021 08:54:45 +0500 Subject: [PATCH 2/2] Fast COPY FROM into the foreign or sharded table. This feature enables bulk COPY into foreign table in the case of multi inserts is possible and foreign table has non-zero number of columns. FDWAPI was extended by next routines: * BeginForeignCopy * EndForeignCopy * ExecForeignCopy BeginForeignCopy and EndForeignCopy initialize and free the CopyState of bulk COPY. The ExecForeignCopy routine send 'COPY ... FROM STDIN' command to the foreign server, in iterative manner send tuples by CopyTo() machinery, send EOF to this connection. Code that constructed list of columns for a given foreign relation in the deparseAnalyzeSql() routine is separated to the deparseRelColumnList(). It is reused in the deparseCopyFromSql(). Added TAP-tests on the specific corner cases of COPY FROM STDIN operation. By the analogy of CopyFrom() the CopyState structure was extended with data_dest_cb callback. It is used for send text representation of a tuple to a custom destination. The PgFdwModifyState structure is extended with the cstate field. It is needed for avoid repeated initialization of CopyState. ALso for this reason CopyTo() routine was split into the set of routines CopyToStart()/ CopyTo()/CopyToFinish(). Enum CopyInsertMethod removed. This logic implements by ri_usesMultiInsert field of the ResultRelInfo sructure. Discussion: https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru Authors: Andrey Lepikhov, Ashutosh Bapat, Amit Langote --- contrib/postgres_fdw/deparse.c| 60 ++-- .../postgres_fdw/expected/postgres_fdw.out| 46 ++- contrib/postgres_fdw/postgres_fdw.c | 130 ++ contrib/postgres_fdw/postgres_fdw.h | 1 + contrib/postgres_fdw/sql/postgres_fdw.sql | 45 ++ doc/src/sgml/fdwhandler.sgml | 73 ++ src/backend/commands/copy.c | 4 +- src/backend/commands/copyfrom.c | 126 ++--- src/backend/commands/copyto.c | 84 --- src/backend/executor/execMain.c | 8 +- src/backend/executor/execPartition.c | 27 +++- src/include/commands/copy.h | 8 +- src/include/foreign/fdwapi.h | 15 ++ 13 files changed, 533 insertions(+), 94 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 3cf7b4eb1e..b1ca479a65 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -184,6 +184,8 @@ static void appendAggOrderBy(List *orderList, List *targetList, static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); static Node *deparseSortGroupClause(Index ref, List *tlist, bool force_colno, deparse_expr_cxt *context); +static List *deparseRelColumnList(StringInfo buf, Relation rel, + bool enclose_in_parens); /* * Helper functions @@ -1763,6 +1765,20 @@ deparseInsertSql(StringInfo buf, RangeTblEntry *rte, withCheckOptionList, returningList, retrieved_attrs); } +/* + * Deparse COPY FROM into given buf. + * We need to use list of parameters at each query. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + (void) deparseRelColumnList(buf, rel, true); + + appendStringInfoString(buf, " FROM STDIN "); +} + /* * deparse remote UPDATE statement * @@ -2066,6 +2082,30 @@ deparseAnalyzeSizeSql(StringInfo buf, Relation rel) */ void deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) +{ + appendStringInfoString(buf, "SELECT "); + *retrieved_attrs = deparseRelColumnList(buf, rel, false); + + /* Don't generate bad syntax for zero-column relation. */ + if (list_length(*retrieved_attrs) == 0) + appendStringInfoString(buf, "NULL"); + + /* + * Construct FROM clause + */ + appendStringInfoString(buf, " FROM "); + deparseRelation(buf, rel); +} + +/* + * Construct the list of columns of given foreign relation in the order they + * appear in the tuple descriptor of the relation. Ignore any dropped columns. + * Use column names on the foreign server instead of local names. + * + * Optionally enclose the list in parantheses. + */ +static List * +deparseRelColumnList(StringInfo buf, Relation rel, bool enclose_in_parens) { Oid relid = RelationGetRelid(rel); TupleDesc tupdesc = RelationGetDescr(rel); @@ -2074,10 +2114,8 @@ deparseAnalyzeSql(StringInfo buf, Relation rel, List
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 1/11/21 4:59 PM, Tang, Haiying wrote: Hi Andrey, I had a general look at this extension feature, I think it's beneficial for some application scenarios of PostgreSQL. So I did 7 performance cases test on your patch(v13). The results are really good. As you can see below we can get 7-10 times improvement with this patch. PSA test_copy_from.sql shows my test cases detail(I didn't attach my data file since it's too big). Below are the test results: 'Test No' corresponds to the number(0 1...6) in attached test_copy_from.sql. %reg=(Patched-Unpatched)/Unpatched), Unit is millisecond. |Test No| Test Case |Patched(ms) | Unpatched(ms) |%reg | |---|-|-|---|---| |0 |COPY FROM insertion into the partitioned table(parition is foreign table)| 102483.223 | 1083300.907 | -91% | |1 |COPY FROM insertion into the partitioned table(parition is foreign partition)| 104779.893 | 1207320.287 | -91% | |2 |COPY FROM insertion into the foreign table(without partition) | 100268.730 | 1077309.158 | -91% | |3 |COPY FROM insertion into the partitioned table(part of foreign partitions) | 104110.620 | 1134781.855 | -91% | |4 |COPY FROM insertion into the partitioned table with constraint(part of foreign partition)| 136356.201 | 1238539.603 | -89% | |5 |COPY FROM insertion into the foreign table with constraint(without partition)| 136818.262 | 1189921.742 | -89% | |6 |\copy insertion into the partitioned table with constraint. | 140368.072 | 1242689.924 | -89% | If there is any question on my tests, please feel free to ask. Best Regard, Tang Thank you for this work. Sometimes before i suggested additional optimization [1] which can additionally speed up COPY by 2-4 times. Maybe you can perform the benchmark for this solution too? [1] https://www.postgresql.org/message-id/da7ed3f5-b596-2549-3710-4cc2a602ec17%40postgrespro.ru -- regards, Andrey Lepikhov Postgres Professional
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, Unfortunately, this no longer applies :-( I tried to apply this on top of c532d15ddd (from 2020/12/30) but even that has non-trivial conflicts. Can you send a rebased version? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
RE: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, I had a general look at this extension feature, I think it's beneficial for some application scenarios of PostgreSQL. So I did 7 performance cases test on your patch(v13). The results are really good. As you can see below we can get 7-10 times improvement with this patch. PSA test_copy_from.sql shows my test cases detail(I didn't attach my data file since it's too big). Below are the test results: 'Test No' corresponds to the number(0 1...6) in attached test_copy_from.sql. %reg=(Patched-Unpatched)/Unpatched), Unit is millisecond. |Test No| Test Case |Patched(ms) | Unpatched(ms) |%reg | |---|-|-|---|---| |0 |COPY FROM insertion into the partitioned table(parition is foreign table)| 102483.223 | 1083300.907 | -91% | |1 |COPY FROM insertion into the partitioned table(parition is foreign partition)| 104779.893 | 1207320.287 | -91% | |2 |COPY FROM insertion into the foreign table(without partition) | 100268.730 | 1077309.158 | -91% | |3 |COPY FROM insertion into the partitioned table(part of foreign partitions) | 104110.620 | 1134781.855 | -91% | |4 |COPY FROM insertion into the partitioned table with constraint(part of foreign partition)| 136356.201 | 1238539.603 | -89% | |5 |COPY FROM insertion into the foreign table with constraint(without partition)| 136818.262 | 1189921.742 | -89% | |6 |\copy insertion into the partitioned table with constraint. | 140368.072 | 1242689.924 | -89% | If there is any question on my tests, please feel free to ask. Best Regard, Tang test_copy_from.sql Description: test_copy_from.sql
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 29.12.2020 16:20, Hou, Zhijie wrote: see new version in attachment. I took a look into the patch, and have some comments. 1. + PG_FINALLY(); + { + copy_fmstate = NULL; /* Detect problems */ I don't quite understand this comment, does it means we want to detect something like Null reference ? 2. + PG_FINALLY(); + { ... + if (!OK) + PG_RE_THROW(); + } Is this PG_RE_THROW() necessary ? IMO, PG_FINALLY will reproduce the PG_RE_THROW action if we get to the code block due to an error being thrown. This is a debugging stage atavisms. fixed. 3. + ereport(ERROR, + (errmsg("unexpected extra results during COPY of table: %s", + PQerrorMessage(conn; I found some similar message like the following: pg_log_warning("unexpected extra results during COPY of table \"%s\"", tocEntryTag); How about using existing messages style ? This style is intended for use in frontend utilities, not for contrib extensions, i think. 4. I noticed some not standard code comment[1]. I think it's better to comment like: /* * line 1 * line 2 */ [1]--- + /* Finish COPY IN protocol. It is needed to do after successful copy or +* after an error. +*/ +/* + * + * postgresExecForeignCopy +/* + * + * postgresBeginForeignCopy Thanks, fixed. The patch in attachment rebased on 107a2d4204. -- regards, Andrey Lepikhov Postgres Professional From 1c5439d802b7654ee50dc4326b9bc24fc7f44677 Mon Sep 17 00:00:00 2001 From: Andrey Lepikhov Date: Mon, 14 Dec 2020 13:37:40 +0500 Subject: [PATCH 2/2] Fast COPY FROM into the foreign or sharded table. This feature enables bulk COPY into foreign table in the case of multi inserts is possible and foreign table has non-zero number of columns. FDWAPI was extended by next routines: * BeginForeignCopy * EndForeignCopy * ExecForeignCopy BeginForeignCopy and EndForeignCopy initialize and free the CopyState of bulk COPY. The ExecForeignCopy routine send 'COPY ... FROM STDIN' command to the foreign server, in iterative manner send tuples by CopyTo() machinery, send EOF to this connection. Code that constructed list of columns for a given foreign relation in the deparseAnalyzeSql() routine is separated to the deparseRelColumnList(). It is reused in the deparseCopyFromSql(). Added TAP-tests on the specific corner cases of COPY FROM STDIN operation. By the analogy of CopyFrom() the CopyState structure was extended with data_dest_cb callback. It is used for send text representation of a tuple to a custom destination. The PgFdwModifyState structure is extended with the cstate field. It is needed for avoid repeated initialization of CopyState. ALso for this reason CopyTo() routine was split into the set of routines CopyToStart()/ CopyTo()/CopyToFinish(). Enum CopyInsertMethod removed. This logic implements by ri_usesMultiInsert field of the ResultRelInfo sructure. Discussion: https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru Authors: Andrey Lepikhov, Ashutosh Bapat, Amit Langote --- contrib/postgres_fdw/deparse.c| 60 ++-- .../postgres_fdw/expected/postgres_fdw.out| 46 ++- contrib/postgres_fdw/postgres_fdw.c | 130 ++ contrib/postgres_fdw/postgres_fdw.h | 1 + contrib/postgres_fdw/sql/postgres_fdw.sql | 45 ++ doc/src/sgml/fdwhandler.sgml | 73 ++ src/backend/commands/copy.c | 4 +- src/backend/commands/copyfrom.c | 126 ++--- src/backend/commands/copyto.c | 84 --- src/backend/executor/execMain.c | 8 +- src/backend/executor/execPartition.c | 27 +++- src/include/commands/copy.h | 8 +- src/include/foreign/fdwapi.h | 15 ++ 13 files changed, 533 insertions(+), 94 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index ca2f9f3215..b2a71faabc 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -184,6 +184,8 @@ static void appendAggOrderBy(List *orderList, List *targetList, static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); static Node *deparseSortGroupClause(Index ref, List *tlist, bool force_colno, deparse_expr_cxt *context); +static List *deparseRelColumnList(StringInfo buf, Relation rel, + bool enclose_in_parens); /* * Helper functions @@ -1763,6 +1765,20 @@ deparseInsertSql(StringInfo buf, RangeTblEntry *rte,
RE: [POC] Fast COPY FROM command for the table with foreign partitions
Hi > see new version in attachment. I took a look into the patch, and have some comments. 1. + PG_FINALLY(); + { + copy_fmstate = NULL; /* Detect problems */ I don't quite understand this comment, does it means we want to detect something like Null reference ? 2. + PG_FINALLY(); + { ... + if (!OK) + PG_RE_THROW(); + } Is this PG_RE_THROW() necessary ? IMO, PG_FINALLY will reproduce the PG_RE_THROW action if we get to the code block due to an error being thrown. 3. + ereport(ERROR, + (errmsg("unexpected extra results during COPY of table: %s", + PQerrorMessage(conn; I found some similar message like the following: pg_log_warning("unexpected extra results during COPY of table \"%s\"", tocEntryTag); How about using existing messages style ? 4. I noticed some not standard code comment[1]. I think it's better to comment like: /* * line 1 * line 2 */ [1]--- + /* Finish COPY IN protocol. It is needed to do after successful copy or +* after an error. +*/ +/* + * + * postgresExecForeignCopy +/* + * + * postgresBeginForeignCopy --- Best regards, Houzj
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 12/22/20 12:04 PM, Tang, Haiying wrote: Hi Andrey, There is an error report in your patch as follows. Please take a check. https://travis-ci.org/github/postgresql-cfbot/postgresql/jobs/750682857#L1519 copyfrom.c:374:21: error: ‘save_cur_lineno’ is used uninitialized in this function [-Werror=uninitialized] Regards, Tang Thank you, see new version in attachment. -- regards, Andrey Lepikhov Postgres Professional >From e2bc0980f05061afe199de63b76b00020208510a Mon Sep 17 00:00:00 2001 From: Andrey Lepikhov Date: Mon, 14 Dec 2020 13:37:40 +0500 Subject: [PATCH 2/2] Fast COPY FROM into the foreign or sharded table. This feature enables bulk COPY into foreign table in the case of multi inserts is possible and foreign table has non-zero number of columns. FDWAPI was extended by next routines: * BeginForeignCopy * EndForeignCopy * ExecForeignCopy BeginForeignCopy and EndForeignCopy initialize and free the CopyState of bulk COPY. The ExecForeignCopy routine send 'COPY ... FROM STDIN' command to the foreign server, in iterative manner send tuples by CopyTo() machinery, send EOF to this connection. Code that constructed list of columns for a given foreign relation in the deparseAnalyzeSql() routine is separated to the deparseRelColumnList(). It is reused in the deparseCopyFromSql(). Added TAP-tests on the specific corner cases of COPY FROM STDIN operation. By the analogy of CopyFrom() the CopyState structure was extended with data_dest_cb callback. It is used for send text representation of a tuple to a custom destination. The PgFdwModifyState structure is extended with the cstate field. It is needed for avoid repeated initialization of CopyState. ALso for this reason CopyTo() routine was split into the set of routines CopyToStart()/ CopyTo()/CopyToFinish(). Enum CopyInsertMethod removed. This logic implements by ri_usesMultiInsert field of the ResultRelInfo sructure. Discussion: https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru Authors: Andrey Lepikhov, Ashutosh Bapat, Amit Langote --- contrib/postgres_fdw/deparse.c| 60 ++-- .../postgres_fdw/expected/postgres_fdw.out| 46 +- contrib/postgres_fdw/postgres_fdw.c | 137 ++ contrib/postgres_fdw/postgres_fdw.h | 1 + contrib/postgres_fdw/sql/postgres_fdw.sql | 45 ++ doc/src/sgml/fdwhandler.sgml | 73 ++ src/backend/commands/copy.c | 4 +- src/backend/commands/copyfrom.c | 126 +--- src/backend/commands/copyto.c | 84 --- src/backend/executor/execMain.c | 8 +- src/backend/executor/execPartition.c | 27 +++- src/include/commands/copy.h | 8 +- src/include/foreign/fdwapi.h | 15 ++ 13 files changed, 540 insertions(+), 94 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index ca2f9f3215..b2a71faabc 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -184,6 +184,8 @@ static void appendAggOrderBy(List *orderList, List *targetList, static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); static Node *deparseSortGroupClause(Index ref, List *tlist, bool force_colno, deparse_expr_cxt *context); +static List *deparseRelColumnList(StringInfo buf, Relation rel, + bool enclose_in_parens); /* * Helper functions @@ -1763,6 +1765,20 @@ deparseInsertSql(StringInfo buf, RangeTblEntry *rte, withCheckOptionList, returningList, retrieved_attrs); } +/* + * Deparse COPY FROM into given buf. + * We need to use list of parameters at each query. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + (void) deparseRelColumnList(buf, rel, true); + + appendStringInfoString(buf, " FROM STDIN "); +} + /* * deparse remote UPDATE statement * @@ -2066,6 +2082,30 @@ deparseAnalyzeSizeSql(StringInfo buf, Relation rel) */ void deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) +{ + appendStringInfoString(buf, "SELECT "); + *retrieved_attrs = deparseRelColumnList(buf, rel, false); + + /* Don't generate bad syntax for zero-column relation. */ + if (list_length(*retrieved_attrs) == 0) + appendStringInfoString(buf, "NULL"); + + /* + * Construct FROM clause + */ + appendStringInfoString(buf, " FROM "); + deparseRelation(buf, rel); +} + +/* + * Construct the list of columns of given foreign relation in the order they + * appear in the tuple descriptor of the relation. Ignore any dropped columns. + * Use column names on the foreign server instead of local names. + * + * Optionally enclose the list in parantheses. + */ +static List * +deparseRelColumnList(StringInfo buf, Relation rel, bool enclose_in_parens) { Oid relid = RelationGetRelid(rel); TupleDesc tupdesc =
RE: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, There is an error report in your patch as follows. Please take a check. https://travis-ci.org/github/postgresql-cfbot/postgresql/jobs/750682857#L1519 >copyfrom.c:374:21: error: ‘save_cur_lineno’ is used uninitialized in this >function [-Werror=uninitialized] Regards, Tang
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 12/1/20 2:02 PM, Amit Langote wrote: On Tue, Dec 1, 2020 at 2:40 PM tsunakawa.ta...@fujitsu.com wrote: From: Amit Langote >> The code appears to require both BeginForeignCopy and EndForeignCopy, >> while the following documentation says they are optional. Which is >> correct? (I suppose the latter is correct just like other existing >> Begin/End functions are optional.) Fixed. > Anyway, one thing we could do is rename > ExecRelationAllowsMultiInsert() to ExecSetRelationUsesMultiInsert( Renamed. >> I agree with your idea of adding multi_insert argument to ExecFindPartition() to request a multi-insert-capable partition. At first, I thought ExecFindPartition() is used for all operations, insert/delete/update/select, so I found it odd to add multi_insert argument. But ExecFindPartion() is used only for insert, so multi_insert argument seems okay. > > Good. Andrey, any thoughts on this? I have no serious technical arguments against this, other than code readability and reduce of a routine parameters. Maybe we will be rethinking it later? The new version rebased on commit 525e60b742 is attached. -- regards, Andrey Lepikhov Postgres Professional >From 98a6f077cd3b694683ec0e4a3250c040cc33cb39 Mon Sep 17 00:00:00 2001 From: Andrey Lepikhov Date: Mon, 14 Dec 2020 11:29:03 +0500 Subject: [PATCH 1/2] Move multi-insert decision logic into executor When 0d5f05cde introduced support for using multi-insert mode when copying into partitioned tables, it introduced single variable of enum type CopyInsertMethod shared across all potential target relations (partitions) that, along with some target relation properties, dictated whether to engage multi-insert mode for a given target relation. Move that decision logic into InitResultRelInfo which now sets a new boolean field ri_usesMultiInsert of ResultRelInfo when a target relation is first initialized. That prevents repeated computation of the same information in some cases, especially for partitions, and the new arrangement results in slightly more readability. --- src/backend/commands/copyfrom.c | 142 ++- src/backend/executor/execMain.c | 52 + src/backend/executor/execPartition.c | 7 ++ src/include/commands/copyfrom_internal.h | 10 -- src/include/executor/execPartition.h | 2 + src/include/executor/executor.h | 2 + src/include/nodes/execnodes.h| 8 +- 7 files changed, 108 insertions(+), 115 deletions(-) diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c index 1b14e9a6eb..6d4f6cb80d 100644 --- a/src/backend/commands/copyfrom.c +++ b/src/backend/commands/copyfrom.c @@ -535,12 +535,10 @@ CopyFrom(CopyFromState cstate) CommandId mycid = GetCurrentCommandId(true); int ti_options = 0; /* start with default options for insert */ BulkInsertState bistate = NULL; - CopyInsertMethod insertMethod; CopyMultiInsertInfo multiInsertInfo = {0}; /* pacify compiler */ uint64 processed = 0; bool has_before_insert_row_trig; bool has_instead_insert_row_trig; - bool leafpart_use_multi_insert = false; Assert(cstate->rel); Assert(list_length(cstate->range_table) == 1); @@ -650,6 +648,30 @@ CopyFrom(CopyFromState cstate) resultRelInfo = target_resultRelInfo = makeNode(ResultRelInfo); ExecInitResultRelation(estate, resultRelInfo, 1); + Assert(target_resultRelInfo->ri_usesMultiInsert == false); + + /* + * It's generally more efficient to prepare a bunch of tuples for + * insertion, and insert them in bulk, for example, with one + * table_multi_insert() call than call table_tuple_insert() separately for + * every tuple. However, there are a number of reasons why we might not be + * able to do this. For example, if there any volatile expressions in the + * table's default values or in the statement's WHERE clause, which may + * query the table we are inserting into, buffering tuples might produce + * wrong results. Also, the relation we are trying to insert into itself + * may not be amenable to buffered inserts. + * + * Note: For partitions, this flag is set considering the target table's + * flag that is being set here and partition's own properties which are + * checked by calling ExecSetRelationUsesMultiInsert(). It does not matter + * whether partitions have any volatile default expressions as we use the + * defaults from the target of the COPY command. + */ + if (!cstate->volatile_defexprs && + !contain_volatile_functions(cstate->whereClause)) + target_resultRelInfo->ri_usesMultiInsert = + ExecSetRelationUsesMultiInsert(target_resultRelInfo, NULL); + /* Verify the named relation is a valid target for INSERT */ CheckValidResultRel(resultRelInfo, CMD_INSERT); @@ -665,6 +687,10 @@ CopyFrom(CopyFromState cstate) mtstate->operation = CMD_INSERT; mtstate->resultRelInfo = resultRelInfo; + /* + * Init copying process into foreign table. Initialization of copying into + * foreign
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Tue, Dec 1, 2020 at 2:40 PM tsunakawa.ta...@fujitsu.com wrote: > From: Amit Langote > > Andrey's original patch had the flag to, as I understand it, make the > > partitioning case work correctly. When inserting into a > > non-partitioned table, there's only one relation to care about. In > > that case, CopyFrom() can use either the new COPY interface or the > > INSERT interface for the entire operation when talking to a foreign > > target relation's FDW driver. With partitions, that has to be > > considered separately for each partition. What complicates the matter > > further is that while the original target relation (the root > > partitioned table in the partitioning case) is fully initialized in > > CopyFrom(), partitions are lazily initialized by ExecFindPartition(). > > Yeah, I felt it a bit confusing to see the calls to Begin/EndForeignInsert() > in both CopyFrom() and ExecInitRoutingInfo(). > > > Note that the initialization of a given target relation can also > > optionally involve calling the FDW to perform any pre-COPY > > initializations. So if a given partition is a foreign table, whether > > the copy operation was initialized using the COPY interface or the > > INSERT interface is determined away from CopyFrom(). Andrey created > > ri_usesMultiInsert to remember which was used so that CopyFrom() can > > use the correct interface during the subsequent interactions with the > > partition's driver. > > > > Now, it does not seem outright impossible to do this without the flag, > > but maybe Andrey thinks it is good for readability? If it is > > confusing from a modularity standpoint, maybe we should rethink that. > > That said, I still think that there should be a way for CopyFrom() to > > tell ExecFindPartition() which FDW interface to initialize a given > > foreign table partition's copy operation with -- COPY if the copy > > allows multi-insert, INSERT if not. Maybe the multi_insert parameter > > I mentioned earlier would serve that purpose. > > I agree with your idea of adding multi_insert argument to ExecFindPartition() > to request a multi-insert-capable partition. At first, I thought > ExecFindPartition() is used for all operations, insert/delete/update/select, > so I found it odd to add multi_insert argument. But ExecFindPartion() is > used only for insert, so multi_insert argument seems okay. Good. Andrey, any thoughts on this? -- Amit Langote EDB: http://www.enterprisedb.com
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Amit Langote > Andrey's original patch had the flag to, as I understand it, make the > partitioning case work correctly. When inserting into a > non-partitioned table, there's only one relation to care about. In > that case, CopyFrom() can use either the new COPY interface or the > INSERT interface for the entire operation when talking to a foreign > target relation's FDW driver. With partitions, that has to be > considered separately for each partition. What complicates the matter > further is that while the original target relation (the root > partitioned table in the partitioning case) is fully initialized in > CopyFrom(), partitions are lazily initialized by ExecFindPartition(). Yeah, I felt it a bit confusing to see the calls to Begin/EndForeignInsert() in both CopyFrom() and ExecInitRoutingInfo(). > Note that the initialization of a given target relation can also > optionally involve calling the FDW to perform any pre-COPY > initializations. So if a given partition is a foreign table, whether > the copy operation was initialized using the COPY interface or the > INSERT interface is determined away from CopyFrom(). Andrey created > ri_usesMultiInsert to remember which was used so that CopyFrom() can > use the correct interface during the subsequent interactions with the > partition's driver. > > Now, it does not seem outright impossible to do this without the flag, > but maybe Andrey thinks it is good for readability? If it is > confusing from a modularity standpoint, maybe we should rethink that. > That said, I still think that there should be a way for CopyFrom() to > tell ExecFindPartition() which FDW interface to initialize a given > foreign table partition's copy operation with -- COPY if the copy > allows multi-insert, INSERT if not. Maybe the multi_insert parameter > I mentioned earlier would serve that purpose. I agree with your idea of adding multi_insert argument to ExecFindPartition() to request a multi-insert-capable partition. At first, I thought ExecFindPartition() is used for all operations, insert/delete/update/select, so I found it odd to add multi_insert argument. But ExecFindPartion() is used only for insert, so multi_insert argument seems okay. Regards Takayuki Tsunakawa
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Thu, Nov 26, 2020 at 11:42 AM tsunakawa.ta...@fujitsu.com wrote: > From: Amit Langote > > Anyway, one thing we could do is rename > > ExecRelationAllowsMultiInsert() to ExecSetRelationUsesMultiInsert(), > > that is, to make it actually set ri_usesMultiInsert and have places > > like CopyFrom() call it if (and only if) its local logic allows > > multi-insert to be used. So, ri_usesMultiInsert starts out set to > > false and if a module wants to use multi-insert for a given target > > relation, it calls ExecSetRelationUsesMultiInsert() to turn the flag > > on. Also, given the confusion regarding how execPartition.c > > I think separating the setting and inspection of the property into different > functions will be good, at least. > > > manipulates the flag, maybe change ExecFindPartition() to accept a > > Boolean parameter multi_insert, which it will pass down to > > ExecInitPartitionInfo(), which in turn will call > > ExecSetRelationUsesMultiInsert() for a given partition. Of course, if > > the logic in ExecSetRelationUsesMultiInsert() determines that > > multi-insert can't be used, for the reasons listed in the function, > > then the caller will have to live with that decision. > > I can't say for sure, but it looks strange to me, because I can't find a good > description of multi_insert argument for ExecFindPartition(). If we add > multi_insert, I'm afraid we may want to add further arguments for other > properties in the future like "Hey, get me the partition that has triggers.", > "Next, pass me a partition that uses a foreign table.", etc. I think the > current ExecFindPartition() is good -- "Get me a partition that accepts this > row." > > I wonder if ri_usesMultiInsert is really necessary. Would it cut down enough > costs in the intended use case(s), say the heavyweight COPY FROM? Thinking on this more, I think I'm starting to agree with you on this. I skimmed the CopyFrom()'s main loop again today and indeed it doesn't seem that the cost of checking the individual conditions for whether or not to buffer the current tuple for the given target relation is all that big to save with ri_usesMultiInsert. So my argument that it is good for performance is perhaps not that strong. Andrey's original patch had the flag to, as I understand it, make the partitioning case work correctly. When inserting into a non-partitioned table, there's only one relation to care about. In that case, CopyFrom() can use either the new COPY interface or the INSERT interface for the entire operation when talking to a foreign target relation's FDW driver. With partitions, that has to be considered separately for each partition. What complicates the matter further is that while the original target relation (the root partitioned table in the partitioning case) is fully initialized in CopyFrom(), partitions are lazily initialized by ExecFindPartition(). Note that the initialization of a given target relation can also optionally involve calling the FDW to perform any pre-COPY initializations. So if a given partition is a foreign table, whether the copy operation was initialized using the COPY interface or the INSERT interface is determined away from CopyFrom(). Andrey created ri_usesMultiInsert to remember which was used so that CopyFrom() can use the correct interface during the subsequent interactions with the partition's driver. Now, it does not seem outright impossible to do this without the flag, but maybe Andrey thinks it is good for readability? If it is confusing from a modularity standpoint, maybe we should rethink that. That said, I still think that there should be a way for CopyFrom() to tell ExecFindPartition() which FDW interface to initialize a given foreign table partition's copy operation with -- COPY if the copy allows multi-insert, INSERT if not. Maybe the multi_insert parameter I mentioned earlier would serve that purpose. -- Amit Langote EDB: http://www.enterprisedb.com
RE: [POC] Fast COPY FROM command for the table with foreign partitions
From: Amit Langote > Second is whether the interface for setting ri_usesMultiInsert > encourages situations where different modules could possibly engage in > conflicting behaviors. I can't think of a real-life example of that > with the current implementation, but maybe the interface provided in > the patch makes it harder to ensure that that remains true in the > future. Tsunakawa-san, have you encountered an example of this, maybe > when trying to integrate this patch with some other? Thanks. No, I pointed out purely from the standpoint of program modularity (based on structured programming?) > Anyway, one thing we could do is rename > ExecRelationAllowsMultiInsert() to ExecSetRelationUsesMultiInsert(), > that is, to make it actually set ri_usesMultiInsert and have places > like CopyFrom() call it if (and only if) its local logic allows > multi-insert to be used. So, ri_usesMultiInsert starts out set to > false and if a module wants to use multi-insert for a given target > relation, it calls ExecSetRelationUsesMultiInsert() to turn the flag > on. Also, given the confusion regarding how execPartition.c I think separating the setting and inspection of the property into different functions will be good, at least. > manipulates the flag, maybe change ExecFindPartition() to accept a > Boolean parameter multi_insert, which it will pass down to > ExecInitPartitionInfo(), which in turn will call > ExecSetRelationUsesMultiInsert() for a given partition. Of course, if > the logic in ExecSetRelationUsesMultiInsert() determines that > multi-insert can't be used, for the reasons listed in the function, > then the caller will have to live with that decision. I can't say for sure, but it looks strange to me, because I can't find a good description of multi_insert argument for ExecFindPartition(). If we add multi_insert, I'm afraid we may want to add further arguments for other properties in the future like "Hey, get me the partition that has triggers.", "Next, pass me a partition that uses a foreign table.", etc. I think the current ExecFindPartition() is good -- "Get me a partition that accepts this row." I wonder if ri_usesMultiInsert is really necessary. Would it cut down enough costs in the intended use case(s), say the heavyweight COPY FROM? Regards Takayuki Tsunakawa
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi, On Tue, Oct 20, 2020 at 11:31 AM tsunakawa.ta...@fujitsu.com wrote: > > > (2) > > > + Assert(rri->ri_usesMultiInsert == false); > > > > > > As the above assertion represents, I'm afraid the semantics of > > ExecRelationAllowsMultiInsert() and ResultRelInfo->ri_usesMultiInsert are > > unclear. In CopyFrom(), ri_usesMultiInsert is set by also considering the > > COPY-specific conditions: > > > > > > + if (!cstate->volatile_defexprs && > > > + !contain_volatile_functions(cstate->whereClause) && > > > + ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL)) > > > + target_resultRelInfo->ri_usesMultiInsert = true; > > > > > > On the other hand, in below ExecInitPartitionInfo(), ri_usesMultiInsert > > > is set > > purely based on the relation's characteristics. > > > > > > + leaf_part_rri->ri_usesMultiInsert = > > > + ExecRelationAllowsMultiInsert(leaf_part_rri, > > rootResultRelInfo); > > > > > > In addition to these differences, I think it's a bit confusing that the > > > function > > itself doesn't record the check result in ri_usesMultiInsert. > > > > > > It's probably easy to understand to not add ri_usesMultiInsert, and the > > function just encapsulates the check logic based solely on the relation > > characteristics and returns the result. So, the argument is just one > > ResultRelInfo. The caller (e.g. COPY) combines the function result with > > other > > specific conditions. > > > I can't fully agreed with this suggestion. We do so because in the future > > anyone > > can call this code from another subsystem for another purposes. And we want > > all the relation-related restrictions contains in one routine. > > CopyState-related > > restrictions used in copy.c only and taken out of this function. > > I'm sorry if I'm misinterpreting you, but I think the following simply serves > its role sufficiently and cleanly without using ri_usesMultiInsert. > > bool > ExecRelationAllowsMultiInsert(RelationRelInfo *rri) > { > check if the relation allows multiinsert based on its characteristics; > return true or false; > } > > I'm concerned that if one subsystem sets ri_usesMultiInsert to true based on > its additional specific conditions, it might lead to another subsystem's > misjudgment. For example, when subsystem A and B want to do different things > respectively: > > [Subsystem A] > if (ExecRelationAllowsMultiInsert(rri) && {A's conditions}) > rri->ri_usesMultiInsert = true; > ... > if (rri->ri_usesMultiInsert) > do A's business; > > [Subsystem B] > if (rri->ri_usesMultiInsert) > do B's business; > > Here, what if subsystem A and B don't want each other's specific conditions > to hold true? That is, A wants to do A's business only if B's specific > conditions don't hold true. If A sets rri->ri_usesMultiInsert to true and > passes rri to B, then B wrongly does B's business despite that A's specific > conditions are true. > > (I think this is due to some form of violation of encapsulation.) Sorry about chiming in late, but I think Tsunakawa-san raises some valid concerns. First, IIUC, is whether we need the ri_usesMultiInsert flag at all. I think yes, because computing that information repeatedly for every row seems wasteful, especially for a bulk operation, and even more so if we're going to call a function when doing so. Second is whether the interface for setting ri_usesMultiInsert encourages situations where different modules could possibly engage in conflicting behaviors. I can't think of a real-life example of that with the current implementation, but maybe the interface provided in the patch makes it harder to ensure that that remains true in the future. Tsunakawa-san, have you encountered an example of this, maybe when trying to integrate this patch with some other? Anyway, one thing we could do is rename ExecRelationAllowsMultiInsert() to ExecSetRelationUsesMultiInsert(), that is, to make it actually set ri_usesMultiInsert and have places like CopyFrom() call it if (and only if) its local logic allows multi-insert to be used. So, ri_usesMultiInsert starts out set to false and if a module wants to use multi-insert for a given target relation, it calls ExecSetRelationUsesMultiInsert() to turn the flag on. Also, given the confusion regarding how execPartition.c manipulates the flag, maybe change ExecFindPartition() to accept a Boolean parameter multi_insert, which it will pass down to ExecInitPartitionInfo(), which in turn will call ExecSetRelationUsesMultiInsert() for a given partition. Of course, if the logic in ExecSetRelationUsesMultiInsert() determines that multi-insert can't be used, for the reasons listed in the function, then the caller will have to live with that decision. Any other ideas on how to make this work and look better? -- Amit Langote EDB: http://www.enterprisedb.com [1] https://www.postgresql.org/message-id/d3fbf3bc93b7bcd99ff7fa9ee41e0e20%40postgrespro.ru
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 11/24/20 9:27 AM, tsunakawa.ta...@fujitsu.com wrote: Andrey-san, Fujita-san, From: Etsuro Fujita On Mon, Nov 23, 2020 at 5:39 PM Andrey Lepikhov wrote: On 11/23/20 7:49 AM, tsunakawa.ta...@fujitsu.com wrote: Could I or my colleague continue this patch in a few days? It looks it's stalled over one month. I don't found any problems with this patch that needed to be corrected. It is wait for actions from committers side, i think. I'm planning to review this patch. I think it would be better for another pair of eyes to take a look at it, though. There are the following two issues left untouched. https://www.postgresql.org/message-id/TYAPR01MB2990DC396B338C98F27C8ED3FE1F0%40TYAPR01MB2990.jpnprd01.prod.outlook.com I disagree with your opinion about changing the interface of the ExecRelationAllowsMultiInsert routine. If you insist on the need for this change, we need another opinion. -- regards, Andrey Lepikhov Postgres Professional
RE: [POC] Fast COPY FROM command for the table with foreign partitions
Andrey-san, Fujita-san, From: Etsuro Fujita > On Mon, Nov 23, 2020 at 5:39 PM Andrey Lepikhov > wrote: > > On 11/23/20 7:49 AM, tsunakawa.ta...@fujitsu.com wrote: > > > Could I or my colleague continue this patch in a few days? It looks it's > stalled over one month. > > > > I don't found any problems with this patch that needed to be corrected. > > It is wait for actions from committers side, i think. > > I'm planning to review this patch. I think it would be better for > another pair of eyes to take a look at it, though. There are the following two issues left untouched. https://www.postgresql.org/message-id/TYAPR01MB2990DC396B338C98F27C8ED3FE1F0%40TYAPR01MB2990.jpnprd01.prod.outlook.com Regards Takayuki Tsunakawa
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Mon, Nov 23, 2020 at 5:39 PM Andrey Lepikhov wrote: > On 11/23/20 7:49 AM, tsunakawa.ta...@fujitsu.com wrote: > > Could I or my colleague continue this patch in a few days? It looks it's > > stalled over one month. > > I don't found any problems with this patch that needed to be corrected. > It is wait for actions from committers side, i think. I'm planning to review this patch. I think it would be better for another pair of eyes to take a look at it, though. Best regards, Etsuro Fujita
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 11/23/20 7:49 AM, tsunakawa.ta...@fujitsu.com wrote: Hi Andrey-san, From: Tomas Vondra I needed to look at this patch while working on something related, and I found it got broken by 6973533650c a couple days ago. So here's a fixed version, to keep cfbot happy. I haven't done any serious review yet. Could I or my colleague continue this patch in a few days? It looks it's stalled over one month. I don't found any problems with this patch that needed to be corrected. It is wait for actions from committers side, i think. -- regards, Andrey Lepikhov Postgres Professional
RE: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey-san, From: Tomas Vondra > I needed to look at this patch while working on something related, and I > found it > got broken by 6973533650c a couple days ago. So here's a fixed version, to > keep > cfbot happy. I haven't done any serious review yet. Could I or my colleague continue this patch in a few days? It looks it's stalled over one month. Regards Takayuki Tsunakawa
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi, I needed to look at this patch while working on something related, and I found it got broken by 6973533650c a couple days ago. So here's a fixed version, to keep cfbot happy. I haven't done any serious review yet. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company >From 75ad201a09238c8fb69a22a85b2cde72838c2c84 Mon Sep 17 00:00:00 2001 From: "Andrey V. Lepikhov" Date: Tue, 10 Nov 2020 18:54:05 +0100 Subject: [PATCH 1/2] Move multi-insert decision logic into executor When 0d5f05cde introduced support for using multi-insert mode when copying into partitioned tables, it introduced single variable of enum type CopyInsertMethod shared across all potential target relations (partitions) that, along with some target relation proprties, dictated whether to engage multi-insert mode for a given target relation. Move that decision logic into InitResultRelInfo which now sets a new boolean field ri_usesMultiInsert of ResultRelInfo when a target relation is first initialized. That prevents repeated computation of the same information in some cases, especially for partitions, and the new arrangement results in slightly more readability. --- src/backend/commands/copy.c | 152 +++ src/backend/executor/execMain.c | 52 + src/backend/executor/execPartition.c | 7 ++ src/include/executor/execPartition.h | 2 + src/include/executor/executor.h | 2 + src/include/nodes/execnodes.h| 9 +- 6 files changed, 109 insertions(+), 115 deletions(-) diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 115860a9d4..d882396d6f 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -85,16 +85,6 @@ typedef enum EolType EOL_CRNL } EolType; -/* - * Represents the heap insert method to be used during COPY FROM. - */ -typedef enum CopyInsertMethod -{ - CIM_SINGLE, /* use table_tuple_insert or fdw routine */ - CIM_MULTI, /* always use table_multi_insert */ - CIM_MULTI_CONDITIONAL /* use table_multi_insert only if valid */ -} CopyInsertMethod; - /* * This struct contains all the state variables used throughout a COPY * operation. For simplicity, we use the same struct for all variants of COPY, @@ -2717,12 +2707,10 @@ CopyFrom(CopyState cstate) CommandId mycid = GetCurrentCommandId(true); int ti_options = 0; /* start with default options for insert */ BulkInsertState bistate = NULL; - CopyInsertMethod insertMethod; CopyMultiInsertInfo multiInsertInfo = {0}; /* pacify compiler */ uint64 processed = 0; bool has_before_insert_row_trig; bool has_instead_insert_row_trig; - bool leafpart_use_multi_insert = false; Assert(cstate->rel); Assert(list_length(cstate->range_table) == 1); @@ -2832,6 +2820,30 @@ CopyFrom(CopyState cstate) resultRelInfo = target_resultRelInfo = makeNode(ResultRelInfo); ExecInitResultRelation(estate, resultRelInfo, 1); + Assert(target_resultRelInfo->ri_usesMultiInsert == false); + + /* + * It's generally more efficient to prepare a bunch of tuples for + * insertion, and insert them in bulk, for example, with one + * table_multi_insert() call than call table_tuple_insert() separately for + * every tuple. However, there are a number of reasons why we might not be + * able to do this. For example, if there any volatile expressions in the + * table's default values or in the statement's WHERE clause, which may + * query the table we are inserting into, buffering tuples might produce + * wrong results. Also, the relation we are trying to insert into itself + * may not be amenable to buffered inserts. + * + * Note: For partitions, this flag is set considering the target table's + * flag that is being set here and partition's own properties which are + * checked by calling ExecRelationAllowsMultiInsert(). It does not matter + * whether partitions have any volatile default expressions as we use the + * defaults from the target of the COPY command. + */ + if (!cstate->volatile_defexprs && + !contain_volatile_functions(cstate->whereClause)) + target_resultRelInfo->ri_usesMultiInsert = + ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL); + /* Verify the named relation is a valid target for INSERT */ CheckValidResultRel(resultRelInfo, CMD_INSERT); @@ -2847,6 +2859,10 @@ CopyFrom(CopyState cstate) mtstate->operation = CMD_INSERT; mtstate->resultRelInfo = resultRelInfo; + /* + * Init COPY into foreign table. Initialization of copying into foreign + * partitions will be done later. + */ if (resultRelInfo->ri_FdwRoutine != NULL && resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL) resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, @@ -2879,83 +2895,9 @@ CopyFrom(CopyState cstate) cstate->qualexpr = ExecInitQual(castNode(List, cstate->whereClause), >ps); - /* - * It's generally more efficient to prepare a bunch of tuples for - * insertion,
RE: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey-san, Thanks for the revision. The patch looks good except for the following two items. (18) + if (target_resultRelInfo->ri_FdwRoutine != NULL) + { + if (target_resultRelInfo->ri_usesMultiInsert) + { + Assert(target_resultRelInfo->ri_FdwRoutine->BeginForeignCopy != NULL); + target_resultRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate, + resultRelInfo); + } > > (14) > > @@ -1205,10 +1209,18 @@ ExecCleanupTupleRouting(ModifyTableState > *mtstate, > > ResultRelInfo *resultRelInfo = proute->partitions[i]; > > > > /* Allow any FDWs to shut down */ > > - if (resultRelInfo->ri_FdwRoutine != NULL && > > - resultRelInfo->ri_FdwRoutine->EndForeignInsert != > NULL) > > - > resultRelInfo->ri_FdwRoutine->EndForeignInsert(mtstate->ps.state, > > - > resultRelInfo); > > + if (resultRelInfo->ri_FdwRoutine != NULL) > > + { > > + if (resultRelInfo->ri_usesMultiInsert) > > + { > > + > Assert(resultRelInfo->ri_FdwRoutine->EndForeignCopy != NULL); > > + > resultRelInfo->ri_FdwRoutine->EndForeignCopy(mtstate->ps.state, > > + > resultRelInfo); > > + } > > + else if > (resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL) > > + > resultRelInfo->ri_FdwRoutine->EndForeignInsert(mtstate->ps.state, > > + > resultRelInfo); > > + } > > > > EndForeignCopy() is an optional function, isn't it? That is, it's called > > if it's > defined. > > > ri_usesMultiInsert must guarantee that we will use multi-insertions. And we > use only assertions to control this. The code appears to require both BeginForeignCopy and EndForeignCopy, while the following documentation says they are optional. Which is correct? (I suppose the latter is correct just like other existing Begin/End functions are optional.) + If the BeginForeignCopy pointer is set to + NULL, no action is taken for the initialization. + If the EndForeignCopy pointer is set to + NULL, no action is taken for the termination. > > (2) > > + Assert(rri->ri_usesMultiInsert == false); > > > > As the above assertion represents, I'm afraid the semantics of > ExecRelationAllowsMultiInsert() and ResultRelInfo->ri_usesMultiInsert are > unclear. In CopyFrom(), ri_usesMultiInsert is set by also considering the > COPY-specific conditions: > > > > + if (!cstate->volatile_defexprs && > > + !contain_volatile_functions(cstate->whereClause) && > > + ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL)) > > + target_resultRelInfo->ri_usesMultiInsert = true; > > > > On the other hand, in below ExecInitPartitionInfo(), ri_usesMultiInsert is > > set > purely based on the relation's characteristics. > > > > + leaf_part_rri->ri_usesMultiInsert = > > + ExecRelationAllowsMultiInsert(leaf_part_rri, > rootResultRelInfo); > > > > In addition to these differences, I think it's a bit confusing that the > > function > itself doesn't record the check result in ri_usesMultiInsert. > > > > It's probably easy to understand to not add ri_usesMultiInsert, and the > function just encapsulates the check logic based solely on the relation > characteristics and returns the result. So, the argument is just one > ResultRelInfo. The caller (e.g. COPY) combines the function result with other > specific conditions. > I can't fully agreed with this suggestion. We do so because in the future > anyone > can call this code from another subsystem for another purposes. And we want > all the relation-related restrictions contains in one routine. > CopyState-related > restrictions used in copy.c only and taken out of this function. I'm sorry if I'm misinterpreting you, but I think the following simply serves its role sufficiently and cleanly without using ri_usesMultiInsert. bool ExecRelationAllowsMultiInsert(RelationRelInfo *rri) { check if the relation allows multiinsert based on its characteristics; return true or false; } I'm concerned that if one subsystem sets ri_usesMultiInsert to true based on its additional specific conditions, it might lead to another subsystem's misjudgment. For example, when subsystem A and B want to do different things respectively: [Subsystem A] if (ExecRelationAllowsMultiInsert(rri) && {A's conditions}) rri->ri_usesMultiInsert = true; ... if (rri->ri_usesMultiInsert) do A's business; [Subsystem B] if (rri->ri_usesMultiInsert) do B's business; Here, what if subsystem A and B don't want each other's specific conditions to
Re: [POC] Fast COPY FROM command for the table with foreign partitions
19.10.2020 09:12, tsunakawa.ta...@fujitsu.com пишет: Hello Andrey-san, Thank you for challenging an interesting feature. Below are my review comments. (1) - /* for use by copy.c when performing multi-inserts */ + /* +* The following fields are currently only relevant to copy.c. +* +* True if okay to use multi-insert on this relation +*/ + bool ri_usesMultiInsert; + + /* Buffer allocated to this relation when using multi-insert mode */ struct CopyMultiInsertBuffer *ri_CopyMultiInsertBuffer; } ResultRelInfo; It's better to place the new bool member next to an existing bool member, so that the structure doesn't get larger. Here the variable position chosen in accordance with the logical meaning. I don't see large problem with size of this structure. (2) + Assert(rri->ri_usesMultiInsert == false); As the above assertion represents, I'm afraid the semantics of ExecRelationAllowsMultiInsert() and ResultRelInfo->ri_usesMultiInsert are unclear. In CopyFrom(), ri_usesMultiInsert is set by also considering the COPY-specific conditions: + if (!cstate->volatile_defexprs && + !contain_volatile_functions(cstate->whereClause) && + ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL)) + target_resultRelInfo->ri_usesMultiInsert = true; On the other hand, in below ExecInitPartitionInfo(), ri_usesMultiInsert is set purely based on the relation's characteristics. + leaf_part_rri->ri_usesMultiInsert = + ExecRelationAllowsMultiInsert(leaf_part_rri, rootResultRelInfo); In addition to these differences, I think it's a bit confusing that the function itself doesn't record the check result in ri_usesMultiInsert. It's probably easy to understand to not add ri_usesMultiInsert, and the function just encapsulates the check logic based solely on the relation characteristics and returns the result. So, the argument is just one ResultRelInfo. The caller (e.g. COPY) combines the function result with other specific conditions. I can't fully agreed with this suggestion. We do so because in the future anyone can call this code from another subsystem for another purposes. And we want all the relation-related restrictions contains in one routine. CopyState-related restrictions used in copy.c only and taken out of this function. (3) +typedef void (*BeginForeignCopy_function) (ModifyTableState *mtstate, + ResultRelInfo *rinfo); + +typedef void (*EndForeignCopy_function) (EState *estate, + ResultRelInfo *rinfo); + +typedef void (*ExecForeignCopy_function) (ResultRelInfo *rinfo, + TupleTableSlot **slots, + int nslots); To align with other function groups, it's better to place the functions in order of Begin, Exec, and End. Ok, thanks. (4) + /* COPY a bulk of tuples into a foreign relation */ + BeginForeignCopy_function BeginForeignCopy; + EndForeignCopy_function EndForeignCopy; + ExecForeignCopy_function ExecForeignCopy; To align with the other functions' comment, the comment should be: /* Support functions for COPY */ Agreed (5) + +TupleTableSlot * +ExecForeignCopy(ResultRelInfo *rinfo, + TupleTableSlot **slots, + int nslots); + + + Copy a bulk of tuples into the foreign table. + estate is global execution state for the query. The return type is void. Agreed (6) + nslots cis a number of tuples in the slots cis -> is Ok (7) + + If the ExecForeignCopy pointer is set to + NULL, attempts to insert into the foreign table will fail + with an error message. + "attempts to insert into" should be "attempts to run COPY on", because it's used for COPY. Furthermore, if ExecForeignCopy is NULL, COPY should use ExecForeignInsert() instead, right? Otherwise, existing FDWs would become unable to be used for COPY. Thanks (8) + boolpipe = (filename == NULL) && (data_dest_cb == NULL); The above pipe in BeginCopyTo() is changed to not match pipe in DoCopyTo(), which only refers to filename. Should pipe in DoCopyTo() also be changed? If no, the use of the same variable name for different conditions is confusing. Ok (9) -* partitions will be done later. +- * partitions will be done later. This is an unintended addition of '-'? Ok (10) - if (resultRelInfo->ri_FdwRoutine != NULL && - resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL) - resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, -
RE: [POC] Fast COPY FROM command for the table with foreign partitions
Hello Andrey-san, Thank you for challenging an interesting feature. Below are my review comments. (1) - /* for use by copy.c when performing multi-inserts */ + /* +* The following fields are currently only relevant to copy.c. +* +* True if okay to use multi-insert on this relation +*/ + bool ri_usesMultiInsert; + + /* Buffer allocated to this relation when using multi-insert mode */ struct CopyMultiInsertBuffer *ri_CopyMultiInsertBuffer; } ResultRelInfo; It's better to place the new bool member next to an existing bool member, so that the structure doesn't get larger. (2) + Assert(rri->ri_usesMultiInsert == false); As the above assertion represents, I'm afraid the semantics of ExecRelationAllowsMultiInsert() and ResultRelInfo->ri_usesMultiInsert are unclear. In CopyFrom(), ri_usesMultiInsert is set by also considering the COPY-specific conditions: + if (!cstate->volatile_defexprs && + !contain_volatile_functions(cstate->whereClause) && + ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL)) + target_resultRelInfo->ri_usesMultiInsert = true; On the other hand, in below ExecInitPartitionInfo(), ri_usesMultiInsert is set purely based on the relation's characteristics. + leaf_part_rri->ri_usesMultiInsert = + ExecRelationAllowsMultiInsert(leaf_part_rri, rootResultRelInfo); In addition to these differences, I think it's a bit confusing that the function itself doesn't record the check result in ri_usesMultiInsert. It's probably easy to understand to not add ri_usesMultiInsert, and the function just encapsulates the check logic based solely on the relation characteristics and returns the result. So, the argument is just one ResultRelInfo. The caller (e.g. COPY) combines the function result with other specific conditions. (3) +typedef void (*BeginForeignCopy_function) (ModifyTableState *mtstate, + ResultRelInfo *rinfo); + +typedef void (*EndForeignCopy_function) (EState *estate, + ResultRelInfo *rinfo); + +typedef void (*ExecForeignCopy_function) (ResultRelInfo *rinfo, + TupleTableSlot **slots, + int nslots); To align with other function groups, it's better to place the functions in order of Begin, Exec, and End. (4) + /* COPY a bulk of tuples into a foreign relation */ + BeginForeignCopy_function BeginForeignCopy; + EndForeignCopy_function EndForeignCopy; + ExecForeignCopy_function ExecForeignCopy; To align with the other functions' comment, the comment should be: /* Support functions for COPY */ (5) + +TupleTableSlot * +ExecForeignCopy(ResultRelInfo *rinfo, + TupleTableSlot **slots, + int nslots); + + + Copy a bulk of tuples into the foreign table. + estate is global execution state for the query. The return type is void. (6) + nslots cis a number of tuples in the slots cis -> is (7) + + If the ExecForeignCopy pointer is set to + NULL, attempts to insert into the foreign table will fail + with an error message. + "attempts to insert into" should be "attempts to run COPY on", because it's used for COPY. Furthermore, if ExecForeignCopy is NULL, COPY should use ExecForeignInsert() instead, right? Otherwise, existing FDWs would become unable to be used for COPY. (8) + boolpipe = (filename == NULL) && (data_dest_cb == NULL); The above pipe in BeginCopyTo() is changed to not match pipe in DoCopyTo(), which only refers to filename. Should pipe in DoCopyTo() also be changed? If no, the use of the same variable name for different conditions is confusing. (9) -* partitions will be done later. +- * partitions will be done later. This is an unintended addition of '-'? (10) - if (resultRelInfo->ri_FdwRoutine != NULL && - resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL) - resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, - resultRelInfo); + if (target_resultRelInfo->ri_FdwRoutine != NULL) + { + if (target_resultRelInfo->ri_usesMultiInsert) + target_resultRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate, + resultRelInfo); + else if
Re: [POC] Fast COPY FROM command for the table with foreign partitions
This patch currently looks very ready for use. And I'm taking a close look at the error reporting. Here we have difference in behavior of local and foreign table: regression test in postgres_fdw.sql: copy rem2 from stdin; -1 xyzzy \. reports error (1): = ERROR: new row for relation "loc2" violates check constraint... DETAIL: Failing row contains (-1, xyzzy). CONTEXT: COPY loc2, line 1: "-1 xyzzy" remote SQL command: COPY public.loc2(f1, f2) FROM STDIN COPY rem2, line 2 But local COPY into loc2 reports another error (2): === copy loc2 from stdin; ERROR: new row for relation "loc2" violates check constraint... DETAIL: Failing row contains (-1, xyzzy). CONTEXT: COPY loc2, line 1: "-1 xyzzy" Report (2) is shorter and more specific. Report (1) contains meaningless information. Maybe we need to improve error report? For example like this: ERROR: Failed COPY into foreign table "rem2": new row for relation "loc2" violates check constraint... DETAIL: Failing row contains (-1, xyzzy). remote SQL command: COPY public.loc2(f1, f2) FROM STDIN COPY rem2, line 1 The problem here is that we run into an error after the COPY FROM command completes. And we need to translate lineno from foreign server to lineno of overall COPY command. -- regards, Andrey Lepikhov Postgres Professional
Re: [POC] Fast COPY FROM command for the table with foreign partitions
16.09.2020 12:10, Amit Langote пишет: On Thu, Sep 10, 2020 at 6:57 PM Andrey V. Lepikhov wrote: On 9/9/20 5:51 PM, Amit Langote wrote: Ok. I rewrited the patch 0001 with the Alexey suggestion. Thank you. Some mostly cosmetic suggestions on that: +bool +checkMultiInsertMode(const ResultRelInfo *rri, const ResultRelInfo *parent) I think we should put this definition in executor.c and export in executor.h, not execPartition.c/h. Also, better to match the naming style of surrounding executor routines, say, ExecRelationAllowsMultiInsert? I'm not sure if we need the 'parent' parameter but as it's pretty specific to partition's case, maybe partition_root is a better name. Agreed + if (!checkMultiInsertMode(target_resultRelInfo, NULL)) + { + /* +* Do nothing. Can't allow multi-insert mode if previous conditions +* checking disallow this. +*/ + } Personally, I find this notation with empty blocks a bit strange. Maybe it's easier to read this instead: if (!cstate->volatile_defexprs && !contain_volatile_functions(cstate->whereClause) && ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL)) target_resultRelInfo->ri_usesMultiInsert = true; Agreed Also, I don't really understand why we need list_length(cstate->attnumlist) > 0 to use multi-insert on foreign tables but apparently we do. The next patch should add that condition here along with a brief note on that in the comment. This is a feature of the COPY command. It can't be used without any column in braces. However, foreign tables without columns can exist. You can see this problem if you apply the 0002 patch on top of your delta patch. Ashutosh in [1] noticed this problem and anchored it with regression test. I included this expression (with comments) into the 0002 patch. - if (resultRelInfo->ri_FdwRoutine != NULL && - resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL) - resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, -resultRelInfo); + /* +* Init COPY into foreign table. Initialization of copying into foreign +* partitions will be done later. +*/ + if (target_resultRelInfo->ri_FdwRoutine != NULL && + target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL) + target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, + resultRelInfo); @@ -3349,11 +3302,10 @@ CopyFrom(CopyState cstate) if (target_resultRelInfo->ri_FdwRoutine != NULL && target_resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL) target_resultRelInfo->ri_FdwRoutine->EndForeignInsert(estate, - target_resultRelInfo); + target_resultRelInfo); These two hunks seem unnecessary, which I think I introduced into this patch when breaking it out of the main one. Please check the attached delta patch which contains the above changes. I applied your delta patch to the 0001 patch and fix the 0002 patch in accordance with these changes. Patches 0003 and 0004 are experimental and i will not support them before discussing on applicability. [1] https://www.postgresql.org/message-id/CAExHW5uAtyAVL-iuu1Hsd0fycqS5UHoHCLfauYDLQwRucwC9Og%40mail.gmail.com -- regards, Andrey Lepikhov Postgres Professional >From 05e0e9cf9de7a3893dd1692d8ea0131cc1332433 Mon Sep 17 00:00:00 2001 From: "Andrey V. Lepikhov" Date: Sun, 20 Sep 2020 11:44:30 +0300 Subject: [PATCH 2/2] Fast COPY FROM into the foreign or sharded table. This feature enables bulk COPY into foreign table in the case of multi inserts is possible and foreign table has non-zero number of columns. FDWAPI was extended by next routines: * BeginForeignCopy * EndForeignCopy * ExecForeignCopy BeginForeignCopy and EndForeignCopy initialize and free the CopyState of bulk COPY. The ExecForeignCopy routine send 'COPY ... FROM STDIN' command to the foreign server, in iterative manner send tuples by CopyTo() machinery, send EOF to this connection. Code that constructed list of columns for a given foreign relation in the deparseAnalyzeSql() routine is separated to the deparseRelColumnList(). It is reused in the deparseCopyFromSql(). Added TAP-tests on the specific corner cases of COPY FROM STDIN operation. By the analogy of CopyFrom() the CopyState structure was extended with data_dest_cb callback. It is used for send text representation of a tuple to a custom destination. The PgFdwModifyState structure is extended with the cstate field. It is needed for avoid repeated initialization of CopyState. ALso for this reason CopyTo() routine was split into the set of routines CopyToStart()/ CopyTo()/CopyToFinish(). Enum CopyInsertMethod removed. This logic implements by ri_usesMultiInsert field of the ResultRelInfo sructure. Discussion:
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Thu, Sep 10, 2020 at 6:57 PM Andrey V. Lepikhov wrote: > On 9/9/20 5:51 PM, Amit Langote wrote: > > On Wed, Sep 9, 2020 at 6:42 PM Alexey Kondratov > > wrote: > >> And InitResultRelInfo() may set ri_usesMultiInsert to false by default, > >> since it's used only by COPY now. Then you won't need this in several > >> places: > >> > >> + resultRelInfo->ri_usesMultiInsert = false; > >> > >> While the logic of turning multi-insert on with all the validations > >> required could be factored out of InitResultRelInfo() to a separate > >> routine. > > > > Interesting idea. Maybe better to have a separate routine like Alexey says. > Ok. I rewrited the patch 0001 with the Alexey suggestion. Thank you. Some mostly cosmetic suggestions on that: +bool +checkMultiInsertMode(const ResultRelInfo *rri, const ResultRelInfo *parent) I think we should put this definition in executor.c and export in executor.h, not execPartition.c/h. Also, better to match the naming style of surrounding executor routines, say, ExecRelationAllowsMultiInsert? I'm not sure if we need the 'parent' parameter but as it's pretty specific to partition's case, maybe partition_root is a better name. + if (!checkMultiInsertMode(target_resultRelInfo, NULL)) + { + /* +* Do nothing. Can't allow multi-insert mode if previous conditions +* checking disallow this. +*/ + } Personally, I find this notation with empty blocks a bit strange. Maybe it's easier to read this instead: if (!cstate->volatile_defexprs && !contain_volatile_functions(cstate->whereClause) && ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL)) target_resultRelInfo->ri_usesMultiInsert = true; Also, I don't really understand why we need list_length(cstate->attnumlist) > 0 to use multi-insert on foreign tables but apparently we do. The next patch should add that condition here along with a brief note on that in the comment. - if (resultRelInfo->ri_FdwRoutine != NULL && - resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL) - resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, -resultRelInfo); + /* +* Init COPY into foreign table. Initialization of copying into foreign +* partitions will be done later. +*/ + if (target_resultRelInfo->ri_FdwRoutine != NULL && + target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL) + target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, + resultRelInfo); @@ -3349,11 +3302,10 @@ CopyFrom(CopyState cstate) if (target_resultRelInfo->ri_FdwRoutine != NULL && target_resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL) target_resultRelInfo->ri_FdwRoutine->EndForeignInsert(estate, - target_resultRelInfo); + target_resultRelInfo); These two hunks seem unnecessary, which I think I introduced into this patch when breaking it out of the main one. Please check the attached delta patch which contains the above changes. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com 0001-delta.patch Description: Binary data
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 9/9/20 5:51 PM, Amit Langote wrote: On Wed, Sep 9, 2020 at 6:42 PM Alexey Kondratov wrote: On 2020-09-09 11:45, Andrey V. Lepikhov wrote: This does not seem very convenient and will lead to errors in the future. So, I agree with Amit. And InitResultRelInfo() may set ri_usesMultiInsert to false by default, since it's used only by COPY now. Then you won't need this in several places: + resultRelInfo->ri_usesMultiInsert = false; While the logic of turning multi-insert on with all the validations required could be factored out of InitResultRelInfo() to a separate routine. Interesting idea. Maybe better to have a separate routine like Alexey says. Ok. I rewrited the patch 0001 with the Alexey suggestion. Patch 0002... required minor changes (new version see in attachment). Also I added some optimization (see 0003 and 0004 patches). Here we execute 'COPY .. FROM STDIN' at foreign server only once, in the BeginForeignCopy routine. It is a proof-of-concept patches. Also I see that error messages processing needs to be rewritten. Unlike the INSERT operation applied to each row, here we find out copy errors only after sending the END of copy. Currently implementations 0002 and 0004 provide uninformative error messages for some cases. -- regards, Andrey Lepikhov Postgres Professional >From 2053ac530db87ae4617aa953142c447e0b27e3a2 Mon Sep 17 00:00:00 2001 From: amitlan Date: Mon, 24 Aug 2020 15:08:37 +0900 Subject: [PATCH 1/4] Move multi-insert decision logic into executor When 0d5f05cde introduced support for using multi-insert mode when copying into partitioned tables, it introduced single variable of enum type CopyInsertMethod shared across all potential target relations (partitions) that, along with some target relation proprties, dictated whether to engage multi-insert mode for a given target relation. Move that decision logic into InitResultRelInfo which now sets a new boolean field ri_usesMultiInsert of ResultRelInfo when a target relation is first initialized. That prevents repeated computation of the same information in some cases, especially for partitions, and the new arrangement results in slightly more readability. --- src/backend/commands/copy.c | 190 ++- src/backend/executor/execMain.c | 3 + src/backend/executor/execPartition.c | 47 +++ src/include/executor/execPartition.h | 2 + src/include/nodes/execnodes.h| 9 +- 5 files changed, 131 insertions(+), 120 deletions(-) diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index db7d24a511..2119db4213 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -85,16 +85,6 @@ typedef enum EolType EOL_CRNL } EolType; -/* - * Represents the heap insert method to be used during COPY FROM. - */ -typedef enum CopyInsertMethod -{ - CIM_SINGLE, /* use table_tuple_insert or fdw routine */ - CIM_MULTI, /* always use table_multi_insert */ - CIM_MULTI_CONDITIONAL /* use table_multi_insert only if valid */ -} CopyInsertMethod; - /* * This struct contains all the state variables used throughout a COPY * operation. For simplicity, we use the same struct for all variants of COPY, @@ -2715,12 +2705,10 @@ CopyFrom(CopyState cstate) CommandId mycid = GetCurrentCommandId(true); int ti_options = 0; /* start with default options for insert */ BulkInsertState bistate = NULL; - CopyInsertMethod insertMethod; CopyMultiInsertInfo multiInsertInfo = {0}; /* pacify compiler */ uint64 processed = 0; bool has_before_insert_row_trig; bool has_instead_insert_row_trig; - bool leafpart_use_multi_insert = false; Assert(cstate->rel); @@ -2833,6 +2821,58 @@ CopyFrom(CopyState cstate) 0); target_resultRelInfo = resultRelInfo; + Assert(target_resultRelInfo->ri_usesMultiInsert == false); + + /* + * It's generally more efficient to prepare a bunch of tuples for + * insertion, and insert them in bulk, for example, with one + * table_multi_insert() call than call table_tuple_insert() separately + * for every tuple. However, there are a number of reasons why we might + * not be able to do this. We check some conditions below while some + * other target relation properties are checked in InitResultRelInfo(). + * Partition initialization will use result of this check implicitly as + * the ri_usesMultiInsert value of the parent relation. + */ + if (!checkMultiInsertMode(target_resultRelInfo, NULL)) + { + /* + * Do nothing. Can't allow multi-insert mode if previous conditions + * checking disallow this. + */ + } + else if (cstate->volatile_defexprs || list_length(cstate->attnumlist) == 0) + { + /* + * Can't support bufferization of copy into foreign tables without any + * defined columns or if there are any volatile default expressions in the + * table. Similarly to the trigger case above, such expressions may query + * the table we're inserting into. + * + * Note: It does not matter
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Wed, Sep 9, 2020 at 6:42 PM Alexey Kondratov wrote: > On 2020-09-09 11:45, Andrey V. Lepikhov wrote: > > This does not seem very convenient and will lead to errors in the > > future. So, I agree with Amit. > > And InitResultRelInfo() may set ri_usesMultiInsert to false by default, > since it's used only by COPY now. Then you won't need this in several > places: > > + resultRelInfo->ri_usesMultiInsert = false; > > While the logic of turning multi-insert on with all the validations > required could be factored out of InitResultRelInfo() to a separate > routine. Interesting idea. Maybe better to have a separate routine like Alexey says. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 2020-09-09 11:45, Andrey V. Lepikhov wrote: On 9/8/20 8:34 PM, Alexey Kondratov wrote: On 2020-09-08 17:00, Amit Langote wrote: wrote: On 2020-09-08 10:34, Amit Langote wrote: Another ambiguous part of the refactoring was in changing InitResultRelInfo() arguments: @@ -1278,6 +1280,7 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo, Relation resultRelationDesc, Index resultRelationIndex, Relation partition_root, + bool use_multi_insert, int instrument_options) Why do we need to pass this use_multi_insert flag here? Would it be better to set resultRelInfo->ri_usesMultiInsert in the InitResultRelInfo() unconditionally like it is done for ri_usesFdwDirectModify? And after that it will be up to the caller whether to use multi-insert or not based on their own circumstances. Otherwise now we have a flag to indicate that we want to check for another flag, while this check doesn't look costly. Hmm, I think having two flags seems confusing and bug prone, especially if you consider partitions. For example, if a partition's ri_usesMultiInsert is true, but CopyFrom()'s local flag is false, then execPartition.c: ExecInitPartitionInfo() would wrongly perform BeginForeignCopy() based on only ri_usesMultiInsert, because it wouldn't know CopyFrom()'s local flag. Am I missing something? No, you're right. If someone want to share a state and use ResultRelInfo (RRI) for that purpose, then it's fine, but CopyFrom() may simply override RRI->ri_usesMultiInsert if needed and pass this RRI further. This is how it's done for RRI->ri_usesFdwDirectModify. InitResultRelInfo() initializes it to false and then ExecInitModifyTable() changes the flag if needed. Probably this is just a matter of personal choice, but for me the current implementation with additional argument in InitResultRelInfo() doesn't look completely right. Maybe because a caller now should pass an additional argument (as false) even if it doesn't care about ri_usesMultiInsert at all. It also adds additional complexity and feels like abstractions leaking. I didn't feel what the problem was and prepared a patch version according to Alexey's suggestion (see Alternate.patch). Yes, that's very close to what I've meant. + leaf_part_rri->ri_usesMultiInsert = (leaf_part_rri->ri_usesMultiInsert && + rootResultRelInfo->ri_usesMultiInsert) ? true : false; This could be just: + leaf_part_rri->ri_usesMultiInsert = (leaf_part_rri->ri_usesMultiInsert && + rootResultRelInfo->ri_usesMultiInsert); This does not seem very convenient and will lead to errors in the future. So, I agree with Amit. And InitResultRelInfo() may set ri_usesMultiInsert to false by default, since it's used only by COPY now. Then you won't need this in several places: + resultRelInfo->ri_usesMultiInsert = false; While the logic of turning multi-insert on with all the validations required could be factored out of InitResultRelInfo() to a separate routine. Anyway, I don't insist at all and think it's fine to stick to the original v7's logic. Regards -- Alexey Kondratov Postgres Professional https://www.postgrespro.com Russian Postgres Company
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Version 8 split into two patches (in accordance with Amit suggestion). Also I eliminate naming inconsistency (thanks to Alexey). Based on master, f481d28232. -- regards, Andrey Lepikhov Postgres Professional >From 21b11f4ec0bec71bc7226014ef15c58dee9002da Mon Sep 17 00:00:00 2001 From: amitlan Date: Mon, 24 Aug 2020 15:08:37 +0900 Subject: [PATCH 1/2] Move multi-insert decision logic into executor When 0d5f05cde introduced support for using multi-insert mode when copying into partitioned tables, it introduced single variable of enum type CopyInsertMethod shared across all potential target relations (partitions) that, along with some target relation proprties, dictated whether to engage multi-insert mode for a given target relation. Move that decision logic into InitResultRelInfo which now sets a new boolean field ri_usesMultiInsert of ResultRelInfo when a target relation is first initialized. That prevents repeated computation of the same information in some cases, especially for partitions, and the new arrangement results in slightly more readability. --- src/backend/commands/copy.c | 186 --- src/backend/commands/tablecmds.c | 1 + src/backend/executor/execMain.c | 49 ++ src/backend/executor/execPartition.c | 3 +- src/backend/replication/logical/worker.c | 2 +- src/include/executor/executor.h | 1 + src/include/nodes/execnodes.h| 9 +- 7 files changed, 129 insertions(+), 122 deletions(-) diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index db7d24a511..4e63926cb7 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -85,16 +85,6 @@ typedef enum EolType EOL_CRNL } EolType; -/* - * Represents the heap insert method to be used during COPY FROM. - */ -typedef enum CopyInsertMethod -{ - CIM_SINGLE, /* use table_tuple_insert or fdw routine */ - CIM_MULTI, /* always use table_multi_insert */ - CIM_MULTI_CONDITIONAL /* use table_multi_insert only if valid */ -} CopyInsertMethod; - /* * This struct contains all the state variables used throughout a COPY * operation. For simplicity, we use the same struct for all variants of COPY, @@ -2715,12 +2705,11 @@ CopyFrom(CopyState cstate) CommandId mycid = GetCurrentCommandId(true); int ti_options = 0; /* start with default options for insert */ BulkInsertState bistate = NULL; - CopyInsertMethod insertMethod; + bool use_multi_insert; CopyMultiInsertInfo multiInsertInfo = {0}; /* pacify compiler */ uint64 processed = 0; bool has_before_insert_row_trig; bool has_instead_insert_row_trig; - bool leafpart_use_multi_insert = false; Assert(cstate->rel); @@ -2820,6 +2809,52 @@ CopyFrom(CopyState cstate) ti_options |= TABLE_INSERT_FROZEN; } + /* + * It's generally more efficient to prepare a bunch of tuples for + * insertion, and insert them in bulk, for example, with one + * table_multi_insert() call than call table_tuple_insert() separately + * for every tuple. However, there are a number of reasons why we might + * not be able to do this. We check some conditions below while some + * other target relation properties are left for InitResultRelInfo() to + * check, because they must also be checked for partitions which are + * initialized later. + */ + if (cstate->volatile_defexprs || list_length(cstate->attnumlist) == 0) + { + /* + * Can't support bufferization of copy into foreign tables without any + * defined columns or if there are any volatile default expressions in the + * table. Similarly to the trigger case above, such expressions may query + * the table we're inserting into. + * + * Note: It does not matter if any partitions have any volatile + * default expressions as we use the defaults from the target of the + * COPY command. + */ + use_multi_insert = false; + } + else if (contain_volatile_functions(cstate->whereClause)) + { + /* + * Can't support multi-inserts if there are any volatile function + * expressions in WHERE clause. Similarly to the trigger case above, + * such expressions may query the table we're inserting into. + */ + use_multi_insert = false; + } + else + { + /* + * Looks okay to try multi-insert, but that may change once we + * check few more properties in InitResultRelInfo(). + * + * For partitioned tables, whether or not to use multi-insert depends + * on the individual parition's properties which are also checked in + * InitResultRelInfo(). + */ + use_multi_insert = true; + } + /* * We need a ResultRelInfo so we can use the regular executor's * index-entry-making machinery. (There used to be a huge amount of code @@ -2830,6 +2865,7 @@ CopyFrom(CopyState cstate) cstate->rel, 1, /* must match rel's position in range_table */ NULL, + use_multi_insert, 0); target_resultRelInfo = resultRelInfo; @@ -2854,10 +2890,14 @@ CopyFrom(CopyState cstate)
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 9/8/20 8:34 PM, Alexey Kondratov wrote: On 2020-09-08 17:00, Amit Langote wrote: wrote: On 2020-09-08 10:34, Amit Langote wrote: Another ambiguous part of the refactoring was in changing InitResultRelInfo() arguments: @@ -1278,6 +1280,7 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo, Relation resultRelationDesc, Index resultRelationIndex, Relation partition_root, + bool use_multi_insert, int instrument_options) Why do we need to pass this use_multi_insert flag here? Would it be better to set resultRelInfo->ri_usesMultiInsert in the InitResultRelInfo() unconditionally like it is done for ri_usesFdwDirectModify? And after that it will be up to the caller whether to use multi-insert or not based on their own circumstances. Otherwise now we have a flag to indicate that we want to check for another flag, while this check doesn't look costly. Hmm, I think having two flags seems confusing and bug prone, especially if you consider partitions. For example, if a partition's ri_usesMultiInsert is true, but CopyFrom()'s local flag is false, then execPartition.c: ExecInitPartitionInfo() would wrongly perform BeginForeignCopy() based on only ri_usesMultiInsert, because it wouldn't know CopyFrom()'s local flag. Am I missing something? No, you're right. If someone want to share a state and use ResultRelInfo (RRI) for that purpose, then it's fine, but CopyFrom() may simply override RRI->ri_usesMultiInsert if needed and pass this RRI further. This is how it's done for RRI->ri_usesFdwDirectModify. InitResultRelInfo() initializes it to false and then ExecInitModifyTable() changes the flag if needed. Probably this is just a matter of personal choice, but for me the current implementation with additional argument in InitResultRelInfo() doesn't look completely right. Maybe because a caller now should pass an additional argument (as false) even if it doesn't care about ri_usesMultiInsert at all. It also adds additional complexity and feels like abstractions leaking. I didn't feel what the problem was and prepared a patch version according to Alexey's suggestion (see Alternate.patch). This does not seem very convenient and will lead to errors in the future. So, I agree with Amit. -- regards, Andrey Lepikhov Postgres Professional >From 73705843d300ad1016384e6cb8893c80246372a6 Mon Sep 17 00:00:00 2001 From: amitlan Date: Mon, 24 Aug 2020 15:08:37 +0900 Subject: [PATCH 1/2] Move multi-insert decision logic into executor When 0d5f05cde introduced support for using multi-insert mode when copying into partitioned tables, it introduced single variable of enum type CopyInsertMethod shared across all potential target relations (partitions) that, along with some target relation proprties, dictated whether to engage multi-insert mode for a given target relation. Move that decision logic into InitResultRelInfo which now sets a new boolean field ri_usesMultiInsert of ResultRelInfo when a target relation is first initialized. That prevents repeated computation of the same information in some cases, especially for partitions, and the new arrangement results in slightly more readability. --- src/backend/commands/copy.c | 189 +-- src/backend/commands/tablecmds.c | 1 + src/backend/executor/execMain.c | 40 + src/backend/executor/execPartition.c | 7 + src/backend/replication/logical/worker.c | 1 + src/include/nodes/execnodes.h| 9 +- 6 files changed, 127 insertions(+), 120 deletions(-) diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index db7d24a511..94f6e71a94 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -85,16 +85,6 @@ typedef enum EolType EOL_CRNL } EolType; -/* - * Represents the heap insert method to be used during COPY FROM. - */ -typedef enum CopyInsertMethod -{ - CIM_SINGLE, /* use table_tuple_insert or fdw routine */ - CIM_MULTI, /* always use table_multi_insert */ - CIM_MULTI_CONDITIONAL /* use table_multi_insert only if valid */ -} CopyInsertMethod; - /* * This struct contains all the state variables used throughout a COPY * operation. For simplicity, we use the same struct for all variants of COPY, @@ -2715,12 +2705,10 @@ CopyFrom(CopyState cstate) CommandId mycid = GetCurrentCommandId(true); int ti_options = 0; /* start with default options for insert */ BulkInsertState bistate = NULL; - CopyInsertMethod insertMethod; CopyMultiInsertInfo multiInsertInfo = {0}; /* pacify compiler */ uint64 processed = 0; bool has_before_insert_row_trig; bool has_instead_insert_row_trig; - bool leafpart_use_multi_insert = false; Assert(cstate->rel); @@ -2833,6 +2821,57 @@ CopyFrom(CopyState cstate) 0); target_resultRelInfo = resultRelInfo; + /* + * It's
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 2020-09-08 17:00, Amit Langote wrote: Hi Alexey, On Tue, Sep 8, 2020 at 6:29 PM Alexey Kondratov wrote: On 2020-09-08 10:34, Amit Langote wrote: > Any thoughts on the taking out the refactoring changes out of the main > patch as I suggested? > +1 for splitting the patch. It was rather difficult for me to distinguish changes required by COPY via postgres_fdw from this refactoring. Another ambiguous part of the refactoring was in changing InitResultRelInfo() arguments: @@ -1278,6 +1280,7 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo, Relation resultRelationDesc, Index resultRelationIndex, Relation partition_root, + bool use_multi_insert, int instrument_options) Why do we need to pass this use_multi_insert flag here? Would it be better to set resultRelInfo->ri_usesMultiInsert in the InitResultRelInfo() unconditionally like it is done for ri_usesFdwDirectModify? And after that it will be up to the caller whether to use multi-insert or not based on their own circumstances. Otherwise now we have a flag to indicate that we want to check for another flag, while this check doesn't look costly. Hmm, I think having two flags seems confusing and bug prone, especially if you consider partitions. For example, if a partition's ri_usesMultiInsert is true, but CopyFrom()'s local flag is false, then execPartition.c: ExecInitPartitionInfo() would wrongly perform BeginForeignCopy() based on only ri_usesMultiInsert, because it wouldn't know CopyFrom()'s local flag. Am I missing something? No, you're right. If someone want to share a state and use ResultRelInfo (RRI) for that purpose, then it's fine, but CopyFrom() may simply override RRI->ri_usesMultiInsert if needed and pass this RRI further. This is how it's done for RRI->ri_usesFdwDirectModify. InitResultRelInfo() initializes it to false and then ExecInitModifyTable() changes the flag if needed. Probably this is just a matter of personal choice, but for me the current implementation with additional argument in InitResultRelInfo() doesn't look completely right. Maybe because a caller now should pass an additional argument (as false) even if it doesn't care about ri_usesMultiInsert at all. It also adds additional complexity and feels like abstractions leaking. Regards -- Alexey Kondratov Postgres Professional https://www.postgrespro.com Russian Postgres Company
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Alexey, On Tue, Sep 8, 2020 at 6:29 PM Alexey Kondratov wrote: > On 2020-09-08 10:34, Amit Langote wrote: > > Any thoughts on the taking out the refactoring changes out of the main > > patch as I suggested? > > > > +1 for splitting the patch. It was rather difficult for me to > distinguish changes required by COPY via postgres_fdw from this > refactoring. > > Another ambiguous part of the refactoring was in changing > InitResultRelInfo() arguments: > > @@ -1278,6 +1280,7 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo, > Relation resultRelationDesc, > Index resultRelationIndex, > Relation partition_root, > + bool use_multi_insert, > int instrument_options) > > Why do we need to pass this use_multi_insert flag here? Would it be > better to set resultRelInfo->ri_usesMultiInsert in the > InitResultRelInfo() unconditionally like it is done for > ri_usesFdwDirectModify? And after that it will be up to the caller > whether to use multi-insert or not based on their own circumstances. > Otherwise now we have a flag to indicate that we want to check for > another flag, while this check doesn't look costly. Hmm, I think having two flags seems confusing and bug prone, especially if you consider partitions. For example, if a partition's ri_usesMultiInsert is true, but CopyFrom()'s local flag is false, then execPartition.c: ExecInitPartitionInfo() would wrongly perform BeginForeignCopy() based on only ri_usesMultiInsert, because it wouldn't know CopyFrom()'s local flag. Am I missing something? -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 9/8/20 12:34 PM, Amit Langote wrote: Hi Andrey, On Mon, Sep 7, 2020 at 7:31 PM Andrey V. Lepikhov wrote: On 9/7/20 12:26 PM, Michael Paquier wrote: While on it, the CF bot is telling that the documentation of the patch fails to compile. This needs to be fixed. -- Michael v.7 (in attachment) fixes this problem. I also accepted Amit's suggestion to rename all fdwapi routines such as ForeignCopyIn to *ForeignCopy. Any thoughts on the taking out the refactoring changes out of the main patch as I suggested? Sorry I thought you asked to ignore your previous letter. I'll look into this patch set shortly. -- regards, Andrey Lepikhov Postgres Professional
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi, I've started doing a review of v7 yesterday. On 2020-09-08 10:34, Amit Langote wrote: On Mon, Sep 7, 2020 at 7:31 PM Andrey V. Lepikhov wrote: > v.7 (in attachment) fixes this problem. I also accepted Amit's suggestion to rename all fdwapi routines such as ForeignCopyIn to *ForeignCopy. It seems that naming is quite inconsistent now: + /* COPY a bulk of tuples into a foreign relation */ + BeginForeignCopyIn_function BeginForeignCopy; + EndForeignCopyIn_function EndForeignCopy; + ExecForeignCopyIn_function ExecForeignCopy; You get rid of this 'In' in the function names, but the types are still with it: +typedef void (*BeginForeignCopyIn_function) (ModifyTableState *mtstate, + ResultRelInfo *rinfo); + +typedef void (*EndForeignCopyIn_function) (EState *estate, + ResultRelInfo *rinfo); + +typedef void (*ExecForeignCopyIn_function) (ResultRelInfo *rinfo, + TupleTableSlot **slots, + int nslots); Also docs refer to old function names: +void +BeginForeignCopyIn(ModifyTableState *mtstate, + ResultRelInfo *rinfo); I think that it'd be better to choose either of these two naming schemes and use it everywhere for consistency. Any thoughts on the taking out the refactoring changes out of the main patch as I suggested? +1 for splitting the patch. It was rather difficult for me to distinguish changes required by COPY via postgres_fdw from this refactoring. Another ambiguous part of the refactoring was in changing InitResultRelInfo() arguments: @@ -1278,6 +1280,7 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo, Relation resultRelationDesc, Index resultRelationIndex, Relation partition_root, + bool use_multi_insert, int instrument_options) Why do we need to pass this use_multi_insert flag here? Would it be better to set resultRelInfo->ri_usesMultiInsert in the InitResultRelInfo() unconditionally like it is done for ri_usesFdwDirectModify? And after that it will be up to the caller whether to use multi-insert or not based on their own circumstances. Otherwise now we have a flag to indicate that we want to check for another flag, while this check doesn't look costly. Regards -- Alexey Kondratov Postgres Professional https://www.postgrespro.com Russian Postgres Company
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, On Mon, Sep 7, 2020 at 7:31 PM Andrey V. Lepikhov wrote: > On 9/7/20 12:26 PM, Michael Paquier wrote: > > While on it, the CF bot is telling that the documentation of the patch > > fails to compile. This needs to be fixed. > > -- > > Michael > > > v.7 (in attachment) fixes this problem. > I also accepted Amit's suggestion to rename all fdwapi routines such as > ForeignCopyIn to *ForeignCopy. Any thoughts on the taking out the refactoring changes out of the main patch as I suggested? -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 9/7/20 12:26 PM, Michael Paquier wrote: On Mon, Aug 24, 2020 at 06:19:28PM +0900, Amit Langote wrote: On Mon, Aug 24, 2020 at 4:18 PM Amit Langote wrote: I would Oops, thought I'd continue writing, but hit send before actually doing that. Please ignore. I have some comments on v6, which I will share later this week. While on it, the CF bot is telling that the documentation of the patch fails to compile. This needs to be fixed. -- Michael v.7 (in attachment) fixes this problem. I also accepted Amit's suggestion to rename all fdwapi routines such as ForeignCopyIn to *ForeignCopy. -- regards, Andrey Lepikhov Postgres Professional >From db4ba1bac6a8d642dffd1b907dcc1dd082203fab Mon Sep 17 00:00:00 2001 From: Andrey Lepikhov Date: Thu, 9 Jul 2020 11:16:56 +0500 Subject: [PATCH] Fast COPY FROM into the foreign or sharded table. This feature enables bulk COPY into foreign table in the case of multi inserts is possible and foreign table has non-zero number of columns. FDWAPI was extended by next routines: * BeginForeignCopy * EndForeignCopy * ExecForeignCopy BeginForeignCopy and EndForeignCopy initialize and free the CopyState of bulk COPY. The ExecForeignCopy routine send 'COPY ... FROM STDIN' command to the foreign server, in iterative manner send tuples by CopyTo() machinery, send EOF to this connection. Code that constructed list of columns for a given foreign relation in the deparseAnalyzeSql() routine is separated to the deparseRelColumnList(). It is reused in the deparseCopyFromSql(). Added TAP-tests on the specific corner cases of COPY FROM STDIN operation. By the analogy of CopyFrom() the CopyState structure was extended with data_dest_cb callback. It is used for send text representation of a tuple to a custom destination. The PgFdwModifyState structure is extended with the cstate field. It is needed for avoid repeated initialization of CopyState. ALso for this reason CopyTo() routine was split into the set of routines CopyToStart()/ CopyTo()/CopyToFinish(). Discussion: https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru Authors: Andrey Lepikhov, Ashutosh Bapat, Amit Langote --- contrib/postgres_fdw/deparse.c| 60 ++- .../postgres_fdw/expected/postgres_fdw.out| 46 +- contrib/postgres_fdw/postgres_fdw.c | 143 +++ contrib/postgres_fdw/postgres_fdw.h | 1 + contrib/postgres_fdw/sql/postgres_fdw.sql | 45 ++ doc/src/sgml/fdwhandler.sgml | 75 src/backend/commands/copy.c | 398 +- src/backend/commands/tablecmds.c | 1 + src/backend/executor/execMain.c | 53 +++ src/backend/executor/execPartition.c | 28 +- src/backend/replication/logical/worker.c | 2 +- src/include/commands/copy.h | 11 + src/include/executor/executor.h | 1 + src/include/foreign/fdwapi.h | 15 + src/include/nodes/execnodes.h | 9 +- 15 files changed, 670 insertions(+), 218 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index ad37a74221..a37981ff66 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -184,6 +184,8 @@ static void appendAggOrderBy(List *orderList, List *targetList, static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); static Node *deparseSortGroupClause(Index ref, List *tlist, bool force_colno, deparse_expr_cxt *context); +static List *deparseRelColumnList(StringInfo buf, Relation rel, + bool enclose_in_parens); /* * Helper functions @@ -1758,6 +1760,20 @@ deparseInsertSql(StringInfo buf, RangeTblEntry *rte, withCheckOptionList, returningList, retrieved_attrs); } +/* + * Deparse COPY FROM into given buf. + * We need to use list of parameters at each query. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + (void) deparseRelColumnList(buf, rel, true); + + appendStringInfoString(buf, " FROM STDIN "); +} + /* * deparse remote UPDATE statement * @@ -2061,6 +2077,30 @@ deparseAnalyzeSizeSql(StringInfo buf, Relation rel) */ void deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) +{ + appendStringInfoString(buf, "SELECT "); + *retrieved_attrs = deparseRelColumnList(buf, rel, false); + + /* Don't generate bad syntax for zero-column relation. */ + if (list_length(*retrieved_attrs) == 0) + appendStringInfoString(buf, "NULL"); + + /* + * Construct FROM clause + */ + appendStringInfoString(buf, " FROM "); + deparseRelation(buf, rel); +} + +/* + * Construct the list of columns of given foreign relation in the order they + * appear in the tuple descriptor of the relation. Ignore any dropped columns. + * Use column names on the foreign server instead of local names. + * + * Optionally enclose the list
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Mon, Aug 24, 2020 at 06:19:28PM +0900, Amit Langote wrote: > On Mon, Aug 24, 2020 at 4:18 PM Amit Langote wrote: > > I would > > Oops, thought I'd continue writing, but hit send before actually doing > that. Please ignore. > > I have some comments on v6, which I will share later this week. While on it, the CF bot is telling that the documentation of the patch fails to compile. This needs to be fixed. -- Michael signature.asc Description: PGP signature
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Mon, Aug 24, 2020 at 4:18 PM Amit Langote wrote: > I would Oops, thought I'd continue writing, but hit send before actually doing that. Please ignore. I have some comments on v6, which I will share later this week. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, On Fri, Aug 21, 2020 at 9:19 PM Andrey Lepikhov wrote: > On 8/7/20 2:14 PM, Amit Langote wrote: > > I was playing around with v5 and I noticed an assertion failure which > > I concluded is due to improper setting of ri_usesBulkModify. You can > > reproduce it with these steps. > > > > create extension postgres_fdw; > > create server lb foreign data wrapper postgres_fdw ; > > create user mapping for current_user server lb; > > create table foo (a int, b int) partition by list (a); > > create table foo1 (like foo); > > create foreign table ffoo1 partition of foo for values in (1) server > > lb options (table_name 'foo1'); > > create table foo2 (like foo); > > create foreign table ffoo2 partition of foo for values in (2) server > > lb options (table_name 'foo2'); > > create function print_new_row() returns trigger language plpgsql as $$ > > begin raise notice '%', new; return new; end; $$; > > create trigger ffoo1_br_trig before insert on ffoo1 for each row > > execute function print_new_row(); > > copy foo from stdin csv; > > Enter data to be copied followed by a newline. > > End with a backslash and a period on a line by itself, or an EOF signal. > >>> 1,2 > >>> 2,3 > >>> \. > > NOTICE: (1,2) > > server closed the connection unexpectedly > > This probably means the server terminated abnormally > > before or while processing the request. > > Thnx, I added TAP-test on this problem> However instead of duplicating > the same logic to do so in two places Good call. > > (CopyFrom and ExecInitPartitionInfo), I think it might be a good idea > > to refactor the code to decide if multi-insert mode can be used for a > > given relation by checking its properties and put it in some place > > that both the main target relation and partitions need to invoke. > > InitResultRelInfo() seems to be one such place. > +1 > > > > Also, it might be a good idea to use ri_usesBulkModify more generally > > than only for foreign relations as the patch currently does, because I > > can see that it can replace the variable insertMethod in CopyFrom(). > > Having both insertMethod and ri_usesBulkModify in each ResultRelInfo > > seems confusing and bug-prone. > > > > Finally, I suggest renaming ri_usesBulkModify to ri_usesMultiInsert to > > reflect its scope. > > > > Please check the attached delta patch that applies on top of v5 to see > > what that would look like. > > I merged your delta patch (see v6 in attachment) to the main patch. > Currently it seems more commitable than before. Thanks for accepting the changes. Actually, I was thinking maybe making the patch to replace CopyInsertMethod enum by ri_usesMultiInsert separate from the rest might be better as I can see it as independent refactoring. Attached is how the division would look like. I would -- Amit Langote EnterpriseDB: http://www.enterprisedb.com v6-0001-Move-multi-insert-decision-logic-into-executor.patch Description: Binary data v6-0002-Fast-COPY-FROM-into-the-foreign-or-sharded-table.patch Description: Binary data
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 8/7/20 2:14 PM, Amit Langote wrote: I was playing around with v5 and I noticed an assertion failure which I concluded is due to improper setting of ri_usesBulkModify. You can reproduce it with these steps. create extension postgres_fdw; create server lb foreign data wrapper postgres_fdw ; create user mapping for current_user server lb; create table foo (a int, b int) partition by list (a); create table foo1 (like foo); create foreign table ffoo1 partition of foo for values in (1) server lb options (table_name 'foo1'); create table foo2 (like foo); create foreign table ffoo2 partition of foo for values in (2) server lb options (table_name 'foo2'); create function print_new_row() returns trigger language plpgsql as $$ begin raise notice '%', new; return new; end; $$; create trigger ffoo1_br_trig before insert on ffoo1 for each row execute function print_new_row(); copy foo from stdin csv; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself, or an EOF signal. 1,2 2,3 \. NOTICE: (1,2) server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. #0 0x7f2d5e266337 in raise () from /lib64/libc.so.6 #1 0x7f2d5e267a28 in abort () from /lib64/libc.so.6 #2 0x00aafd5d in ExceptionalCondition (conditionName=0x7f2d37b468d0 "!resultRelInfo->ri_usesBulkModify || resultRelInfo->ri_FdwRoutine->BeginForeignCopyIn == NULL", errorType=0x7f2d37b46680 "FailedAssertion", fileName=0x7f2d37b4677f "postgres_fdw.c", lineNumber=1863) at assert.c:67 #3 0x7f2d37b3b0fe in postgresExecForeignInsert (estate=0x2456320, resultRelInfo=0x23a8f58, slot=0x23a9480, planSlot=0x0) at postgres_fdw.c:1862 #4 0x0066362a in CopyFrom (cstate=0x23a8d40) at copy.c:3331 The problem is that partition ffoo1's BR trigger prevents it from using multi-insert, but its ResultRelInfo.ri_usesBulkModify is true, which is copied from its parent. We should really check the same things for a partition that CopyFrom() checks for the main target relation (root parent) when deciding whether to use multi-insert. Thnx, I added TAP-test on this problem> However instead of duplicating the same logic to do so in two places (CopyFrom and ExecInitPartitionInfo), I think it might be a good idea to refactor the code to decide if multi-insert mode can be used for a given relation by checking its properties and put it in some place that both the main target relation and partitions need to invoke. InitResultRelInfo() seems to be one such place. +1 Also, it might be a good idea to use ri_usesBulkModify more generally than only for foreign relations as the patch currently does, because I can see that it can replace the variable insertMethod in CopyFrom(). Having both insertMethod and ri_usesBulkModify in each ResultRelInfo seems confusing and bug-prone. Finally, I suggest renaming ri_usesBulkModify to ri_usesMultiInsert to reflect its scope. Please check the attached delta patch that applies on top of v5 to see what that would look like. I merged your delta patch (see v6 in attachment) to the main patch. Currently it seems more commitable than before. -- regards, Andrey Lepikhov Postgres Professional >From da6c4fe8262df58164e2c4ab80e085a019c9c6c1 Mon Sep 17 00:00:00 2001 From: Andrey Lepikhov Date: Thu, 9 Jul 2020 11:16:56 +0500 Subject: [PATCH] Fast COPY FROM into the foreign or sharded table. This feature enables bulk COPY into foreign table in the case of multi inserts is possible and foreign table has non-zero number of columns. FDWAPI was extended by next routines: * BeginForeignCopyIn * EndForeignCopyIn * ExecForeignCopyIn BeginForeignCopyIn and EndForeignCopyIn initialize and free the CopyState of bulk COPY. The ExecForeignCopyIn routine send 'COPY ... FROM STDIN' command to the foreign server, in iterative manner send tuples by CopyTo() machinery, send EOF to this connection. Code that constructed list of columns for a given foreign relation in the deparseAnalyzeSql() routine is separated to the deparseRelColumnList(). It is reused in the deparseCopyFromSql(). Added TAP-tests on the specific corner cases of COPY FROM STDIN operation. By the analogy of CopyFrom() the CopyState structure was extended with data_dest_cb callback. It is used for send text representation of a tuple to a custom destination. The PgFdwModifyState structure is extended with the cstate field. It is needed for avoid repeated initialization of CopyState. ALso for this reason CopyTo() routine was split into the set of routines CopyToStart()/ CopyTo()/CopyToFinish(). Enum CopyInsertMethod removed. This logic implements by ri_usesMultiInsert field of the ResultRelInfo sructure. Discussion: https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru Authors: Andrey Lepikhov, Ashutosh Bapat, Amit Langote --- contrib/postgres_fdw/deparse.c| 60 ++-
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Mon, Aug 3, 2020 at 8:38 PM Amit Langote wrote: > On Wed, Jul 29, 2020 at 5:36 PM Andrey V. Lepikhov > wrote: > > > Will send more comments after reading the v5 patch. > > > > > Ok. I'll be waiting for the end of your review. > > Sorry about the late reply. > > If you'd like to send a new version for other reviewers, please feel > free. I haven't managed to take more than a brief look at the v5 > patch, but will try to look at it (or maybe the new version if you > post) more closely this week. I was playing around with v5 and I noticed an assertion failure which I concluded is due to improper setting of ri_usesBulkModify. You can reproduce it with these steps. create extension postgres_fdw; create server lb foreign data wrapper postgres_fdw ; create user mapping for current_user server lb; create table foo (a int, b int) partition by list (a); create table foo1 (like foo); create foreign table ffoo1 partition of foo for values in (1) server lb options (table_name 'foo1'); create table foo2 (like foo); create foreign table ffoo2 partition of foo for values in (2) server lb options (table_name 'foo2'); create function print_new_row() returns trigger language plpgsql as $$ begin raise notice '%', new; return new; end; $$; create trigger ffoo1_br_trig before insert on ffoo1 for each row execute function print_new_row(); copy foo from stdin csv; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself, or an EOF signal. >> 1,2 >> 2,3 >> \. NOTICE: (1,2) server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. #0 0x7f2d5e266337 in raise () from /lib64/libc.so.6 #1 0x7f2d5e267a28 in abort () from /lib64/libc.so.6 #2 0x00aafd5d in ExceptionalCondition (conditionName=0x7f2d37b468d0 "!resultRelInfo->ri_usesBulkModify || resultRelInfo->ri_FdwRoutine->BeginForeignCopyIn == NULL", errorType=0x7f2d37b46680 "FailedAssertion", fileName=0x7f2d37b4677f "postgres_fdw.c", lineNumber=1863) at assert.c:67 #3 0x7f2d37b3b0fe in postgresExecForeignInsert (estate=0x2456320, resultRelInfo=0x23a8f58, slot=0x23a9480, planSlot=0x0) at postgres_fdw.c:1862 #4 0x0066362a in CopyFrom (cstate=0x23a8d40) at copy.c:3331 The problem is that partition ffoo1's BR trigger prevents it from using multi-insert, but its ResultRelInfo.ri_usesBulkModify is true, which is copied from its parent. We should really check the same things for a partition that CopyFrom() checks for the main target relation (root parent) when deciding whether to use multi-insert. However instead of duplicating the same logic to do so in two places (CopyFrom and ExecInitPartitionInfo), I think it might be a good idea to refactor the code to decide if multi-insert mode can be used for a given relation by checking its properties and put it in some place that both the main target relation and partitions need to invoke. InitResultRelInfo() seems to be one such place. Also, it might be a good idea to use ri_usesBulkModify more generally than only for foreign relations as the patch currently does, because I can see that it can replace the variable insertMethod in CopyFrom(). Having both insertMethod and ri_usesBulkModify in each ResultRelInfo seems confusing and bug-prone. Finally, I suggest renaming ri_usesBulkModify to ri_usesMultiInsert to reflect its scope. Please check the attached delta patch that applies on top of v5 to see what that would look like. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com ri_usesMultiInsert.patch Description: Binary data
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, On Wed, Jul 29, 2020 at 5:36 PM Andrey V. Lepikhov wrote: > > Will send more comments after reading the v5 patch. > > > Ok. I'll be waiting for the end of your review. Sorry about the late reply. If you'd like to send a new version for other reviewers, please feel free. I haven't managed to take more than a brief look at the v5 patch, but will try to look at it (or maybe the new version if you post) more closely this week. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 7/29/20 1:03 PM, Amit Langote wrote: Hi Andrey, Thanks for updating the patch. I will try to take a look later. On Wed, Jul 22, 2020 at 6:09 PM Andrey V. Lepikhov wrote: On 7/16/20 2:14 PM, Amit Langote wrote: * Why the "In" in these API names? + /* COPY a bulk of tuples into a foreign relation */ + BeginForeignCopyIn_function BeginForeignCopyIn; + EndForeignCopyIn_function EndForeignCopyIn; + ExecForeignCopyIn_function ExecForeignCopyIn; I used an analogy from copy.c. Hmm, if we were going to also need *ForeignCopyOut APIs, maybe it makes sense to have "In" here, but maybe we don't, so how about leaving out the "In" for clarity? Ok, sounds good. * I see that the remote copy is performed from scratch on every call of postgresExecForeignCopyIn(), but wouldn't it be more efficient to send the `COPY remote_table FROM STDIN` in postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn() when there are no errors during the copy? It is not possible. FDW share one connection between all foreign relations from a server. If two or more partitions will be placed at one foreign server you will have problems with concurrent COPY command. Ah, you're right. I didn't consider multiple foreign partitions pointing to the same server. Indeed, we would need separate connections to a given server to COPY to multiple remote relations on that server in parallel. May be we can create new connection for each partition? Yeah, perhaps, although it sounds like something that might be more generally useful and so we should work on that separately if at all. I will try to prepare a separate patch. I tried implementing these two changes -- pgfdw_copy_data_dest_cb() and sending `COPY remote_table FROM STDIN` only once instead of on every flush -- and I see significant speedup. Please check the attached patch that applies on top of yours. I integrated first change and rejected the second by the reason as above. Thanks. Will send more comments after reading the v5 patch. Ok. I'll be waiting for the end of your review. -- regards, Andrey Lepikhov Postgres Professional
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, Thanks for updating the patch. I will try to take a look later. On Wed, Jul 22, 2020 at 6:09 PM Andrey V. Lepikhov wrote: > On 7/16/20 2:14 PM, Amit Langote wrote: > > * Why the "In" in these API names? > > > > + /* COPY a bulk of tuples into a foreign relation */ > > + BeginForeignCopyIn_function BeginForeignCopyIn; > > + EndForeignCopyIn_function EndForeignCopyIn; > > + ExecForeignCopyIn_function ExecForeignCopyIn; > > I used an analogy from copy.c. Hmm, if we were going to also need *ForeignCopyOut APIs, maybe it makes sense to have "In" here, but maybe we don't, so how about leaving out the "In" for clarity? > > * I see that the remote copy is performed from scratch on every call > > of postgresExecForeignCopyIn(), but wouldn't it be more efficient to > > send the `COPY remote_table FROM STDIN` in > > postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn() > > when there are no errors during the copy? > > It is not possible. FDW share one connection between all foreign > relations from a server. If two or more partitions will be placed at one > foreign server you will have problems with concurrent COPY command. Ah, you're right. I didn't consider multiple foreign partitions pointing to the same server. Indeed, we would need separate connections to a given server to COPY to multiple remote relations on that server in parallel. > May be we can create new connection for each partition? Yeah, perhaps, although it sounds like something that might be more generally useful and so we should work on that separately if at all. > > I tried implementing these two changes -- pgfdw_copy_data_dest_cb() > > and sending `COPY remote_table FROM STDIN` only once instead of on > > every flush -- and I see significant speedup. Please check the > > attached patch that applies on top of yours. > > I integrated first change and rejected the second by the reason as above. Thanks. Will send more comments after reading the v5 patch. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 2020-07-28 03:33, Andrey Lepikhov wrote: 27.07.2020 21:34, Alexey Kondratov пишет: Hi Andrey, On 2020-07-23 09:23, Andrey V. Lepikhov wrote: On 7/16/20 2:14 PM, Amit Langote wrote: Amit Langote EnterpriseDB: http://www.enterprisedb.com Version 5 of the patch. With changes caused by Amit's comments. Just got a segfault with your v5 patch by deleting from a foreign table. It seems that ShutdownForeignScan inside node->fdwroutine doesn't have a correct pointer to the required function. I haven't had a chance to look closer on the code, but you can easily reproduce this error with the attached script (patched Postgres binaries should be available in the PATH). It works well with master and fails with your patch applied. I used master a3ab7a707d and v5 version of the patch with your script. No errors found. Can you check your test case? Yes, my bad. I forgot to re-install postgres_fdw extension, only did it for postgres core, sorry for disturb. Regards -- Alexey Kondratov Postgres Professional https://www.postgrespro.com Russian Postgres Company
Re: [POC] Fast COPY FROM command for the table with foreign partitions
27.07.2020 21:34, Alexey Kondratov пишет: Hi Andrey, On 2020-07-23 09:23, Andrey V. Lepikhov wrote: On 7/16/20 2:14 PM, Amit Langote wrote: Amit Langote EnterpriseDB: http://www.enterprisedb.com Version 5 of the patch. With changes caused by Amit's comments. Just got a segfault with your v5 patch by deleting from a foreign table. It seems that ShutdownForeignScan inside node->fdwroutine doesn't have a correct pointer to the required function. I haven't had a chance to look closer on the code, but you can easily reproduce this error with the attached script (patched Postgres binaries should be available in the PATH). It works well with master and fails with your patch applied. I used master a3ab7a707d and v5 version of the patch with your script. No errors found. Can you check your test case? -- regards, Andrey Lepikhov Postgres Professional
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, On 2020-07-23 09:23, Andrey V. Lepikhov wrote: On 7/16/20 2:14 PM, Amit Langote wrote: Amit Langote EnterpriseDB: http://www.enterprisedb.com Version 5 of the patch. With changes caused by Amit's comments. Just got a segfault with your v5 patch by deleting from a foreign table. Here is a part of backtrace: * frame #0: 0x0001029069ec postgres`ExecShutdownForeignScan(node=0x7ff28c8909b0) at nodeForeignscan.c:385:3 frame #1: 0x0001028e7b06 postgres`ExecShutdownNode(node=0x7ff28c8909b0) at execProcnode.c:779:4 frame #2: 0x00010299b3fa postgres`planstate_walk_members(planstates=0x7ff28c8906d8, nplans=1, walker=(postgres`ExecShutdownNode at execProcnode.c:752), context=0x) at nodeFuncs.c:3998:7 frame #3: 0x00010299b010 postgres`planstate_tree_walker(planstate=0x7ff28c8904c0, walker=(postgres`ExecShutdownNode at execProcnode.c:752), context=0x) at nodeFuncs.c:3914:8 frame #4: 0x0001028e7ab7 postgres`ExecShutdownNode(node=0x7ff28c8904c0) at execProcnode.c:771:2 (lldb) f 0 frame #0: 0x0001029069ec postgres`ExecShutdownForeignScan(node=0x7ff28c8909b0) at nodeForeignscan.c:385:3 382 FdwRoutine *fdwroutine = node->fdwroutine; 383 384 if (fdwroutine->ShutdownForeignScan) -> 385 fdwroutine->ShutdownForeignScan(node); 386 } (lldb) p node->fdwroutine->ShutdownForeignScan (ShutdownForeignScan_function) $1 = 0x7f7f7f7f7f7f7f7f It seems that ShutdownForeignScan inside node->fdwroutine doesn't have a correct pointer to the required function. I haven't had a chance to look closer on the code, but you can easily reproduce this error with the attached script (patched Postgres binaries should be available in the PATH). It works well with master and fails with your patch applied. Regards -- Alexey Kondratov Postgres Professional https://www.postgrespro.com Russian Postgres Company#!/usr/bin/env sh pg_ctl -D node1 stop > /dev/null pg_ctl -D node2 stop > /dev/null rm -rf node1 node2 rm node1.log node2.log initdb -D node1 initdb -D node2 echo "port = 5433" >> node2/postgresql.conf pg_ctl -D node1 -l node1.log start pg_ctl -D node2 -l node2.log start createdb createdb -p5433 psql -p5433 -c "CREATE TABLE test (id INT) PARTITION BY HASH (id)" psql -c "CREATE EXTENSION IF NOT EXISTS postgres_fdw" psql -c "CREATE SERVER node2 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (port '5433'); CREATE USER MAPPING FOR current_user SERVER node2" psql -c "CREATE FOREIGN TABLE test(id INT) SERVER node2" psql -c "DELETE FROM test"
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 7/16/20 2:14 PM, Amit Langote wrote: Amit Langote EnterpriseDB: http://www.enterprisedb.com Version 5 of the patch. With changes caused by Amit's comments. -- regards, Andrey Lepikhov Postgres Professional >From 24465d61d6f0ec6a45578d252bda1690ac045543 Mon Sep 17 00:00:00 2001 From: Andrey Lepikhov Date: Thu, 9 Jul 2020 11:16:56 +0500 Subject: [PATCH] Fast COPY FROM into the foreign or sharded table. This feature enables bulk COPY into foreign table in the case of multi inserts is possible and foreign table has non-zero number of columns. FDWAPI was extended by next routines: * BeginForeignCopyIn * EndForeignCopyIn * ExecForeignCopyIn BeginForeignCopyIn and EndForeignCopyIn initialize and free the CopyState of bulk COPY. The ExecForeignCopyIn routine send 'COPY ... FROM STDIN' command to the foreign server, in iterative manner send tuples by CopyTo() machinery, send EOF to this connection. Code that constructed list of columns for a given foreign relation in the deparseAnalyzeSql() routine is separated to the deparseRelColumnList(). It is reused in the deparseCopyFromSql(). Added TAP-tests on the specific corner cases of COPY FROM STDIN operation. By the analogy of CopyFrom() the CopyState structure was extended with data_dest_cb callback. It is used for send text representation of a tuple to a custom destination. The PgFdwModifyState structure is extended with the cstate field. It is needed for avoid repeated initialization of CopyState. ALso for this reason CopyTo() routine was split into the set of routines CopyToStart()/ CopyTo()/CopyToFinish(). Discussion: https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru Authors: Andrey Lepikhov, Ashutosh Bapat, Amit Langote --- contrib/postgres_fdw/deparse.c| 60 - .../postgres_fdw/expected/postgres_fdw.out| 33 ++- contrib/postgres_fdw/postgres_fdw.c | 146 +++ contrib/postgres_fdw/postgres_fdw.h | 1 + contrib/postgres_fdw/sql/postgres_fdw.sql | 28 ++ doc/src/sgml/fdwhandler.sgml | 74 ++ src/backend/commands/copy.c | 247 +++--- src/backend/executor/execMain.c | 1 + src/backend/executor/execPartition.c | 34 ++- src/include/commands/copy.h | 11 + src/include/foreign/fdwapi.h | 15 ++ src/include/nodes/execnodes.h | 8 + 12 files changed, 547 insertions(+), 111 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index ad37a74221..a37981ff66 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -184,6 +184,8 @@ static void appendAggOrderBy(List *orderList, List *targetList, static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); static Node *deparseSortGroupClause(Index ref, List *tlist, bool force_colno, deparse_expr_cxt *context); +static List *deparseRelColumnList(StringInfo buf, Relation rel, + bool enclose_in_parens); /* * Helper functions @@ -1758,6 +1760,20 @@ deparseInsertSql(StringInfo buf, RangeTblEntry *rte, withCheckOptionList, returningList, retrieved_attrs); } +/* + * Deparse COPY FROM into given buf. + * We need to use list of parameters at each query. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + (void) deparseRelColumnList(buf, rel, true); + + appendStringInfoString(buf, " FROM STDIN "); +} + /* * deparse remote UPDATE statement * @@ -2061,6 +2077,30 @@ deparseAnalyzeSizeSql(StringInfo buf, Relation rel) */ void deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) +{ + appendStringInfoString(buf, "SELECT "); + *retrieved_attrs = deparseRelColumnList(buf, rel, false); + + /* Don't generate bad syntax for zero-column relation. */ + if (list_length(*retrieved_attrs) == 0) + appendStringInfoString(buf, "NULL"); + + /* + * Construct FROM clause + */ + appendStringInfoString(buf, " FROM "); + deparseRelation(buf, rel); +} + +/* + * Construct the list of columns of given foreign relation in the order they + * appear in the tuple descriptor of the relation. Ignore any dropped columns. + * Use column names on the foreign server instead of local names. + * + * Optionally enclose the list in parantheses. + */ +static List * +deparseRelColumnList(StringInfo buf, Relation rel, bool enclose_in_parens) { Oid relid = RelationGetRelid(rel); TupleDesc tupdesc = RelationGetDescr(rel); @@ -2069,10 +2109,8 @@ deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) List *options; ListCell *lc; bool first = true; + List *retrieved_attrs = NIL; - *retrieved_attrs = NIL; - - appendStringInfoString(buf, "SELECT "); for (i = 0; i < tupdesc->natts; i++) { /* Ignore dropped columns. */ @@ -2081,6 +2119,9 @@ deparseAnalyzeSql(StringInfo
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 7/16/20 2:14 PM, Amit Langote wrote: Hi Andrey, Thanks for this work. I have been reading through your patch and here's a what I understand it does and how: The patch aims to fix the restriction that COPYing into a foreign table can't use multi-insert buffer mechanism effectively. That's because copy.c currently uses the ExecForeignInsert() FDW API which can be passed only 1 row at a time. postgres_fdw's implementation issues an `INSERT INTO remote_table VALUES (...)` statement to the remote side for each row, which is pretty inefficient for bulk loads. The patch introduces a new FDW API ExecForeignCopyIn() that can receive multiple rows and copy.c now calls it every time it flushes the multi-insert buffer so that all the flushed rows can be sent to the remote side in one go. postgres_fdw's now issues a `COPY remote_table FROM STDIN` to the remote server and postgresExecForeignCopyIn() funnels the tuples flushed by the local copy to the server side waiting for tuples on the COPY protocol. Fine Here are some comments on the patch. * Why the "In" in these API names? + /* COPY a bulk of tuples into a foreign relation */ + BeginForeignCopyIn_function BeginForeignCopyIn; + EndForeignCopyIn_function EndForeignCopyIn; + ExecForeignCopyIn_function ExecForeignCopyIn; I used an analogy from copy.c. * fdwhandler.sgml should be updated with the description of these new APIs. * As far as I can tell, the following copy.h additions are for an FDW to use copy.c to obtain an external representation (char string) to send to the remote side of the individual rows that are passed to ExecForeignCopyIn(): +typedef void (*copy_data_dest_cb) (void *outbuf, int len); +extern CopyState BeginForeignCopyTo(Relation rel); +extern char *NextForeignCopyRow(CopyState cstate, TupleTableSlot *slot); +extern void EndForeignCopyTo(CopyState cstate); So, an FDW's ExecForeignCopyIn() calls copy.c: NextForeignCopyRow() which in turn calls copy.c: CopyOneRowTo() which fills CopyState.fe_msgbuf. The data_dest_cb() callback that runs after fe_msgbuf contains the full row simply copies it into a palloc'd char buffer whose pointer is returned back to ExecForeignCopyIn(). I wonder why not let FDWs implement the callback and pass it to copy.c through BeginForeignCopyTo()? For example, you could implement a pgfdw_copy_data_dest_cb() in postgres_fdw.c which gets a direct pointer of fe_msgbuf to send it to the remote server. It is good point! Thank you. Do you think all FDWs would want to use copy,c like above? If not, maybe the above APIs are really postgres_fdw-specific? Anyway, adding comments above the definitions of these functions would be helpful. Agreed * I see that the remote copy is performed from scratch on every call of postgresExecForeignCopyIn(), but wouldn't it be more efficient to send the `COPY remote_table FROM STDIN` in postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn() when there are no errors during the copy? It is not possible. FDW share one connection between all foreign relations from a server. If two or more partitions will be placed at one foreign server you will have problems with concurrent COPY command. May be we can create new connection for each partition? I tried implementing these two changes -- pgfdw_copy_data_dest_cb() and sending `COPY remote_table FROM STDIN` only once instead of on every flush -- and I see significant speedup. Please check the attached patch that applies on top of yours. I integrated first change and rejected the second by the reason as above. One problem I spotted when trying my patch but didn't spend much time debugging is that local COPY cannot be interrupted by Ctrl+C anymore, but that should be fixable by adjusting PG_TRY() blocks. Thanks * ResultRelInfo.UseBulkModifying should be ri_usesBulkModify for consistency. +1 I will post a new version of the patch a little bit later. -- regards, Andrey Lepikhov Postgres Professional
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, Thanks for this work. I have been reading through your patch and here's a what I understand it does and how: The patch aims to fix the restriction that COPYing into a foreign table can't use multi-insert buffer mechanism effectively. That's because copy.c currently uses the ExecForeignInsert() FDW API which can be passed only 1 row at a time. postgres_fdw's implementation issues an `INSERT INTO remote_table VALUES (...)` statement to the remote side for each row, which is pretty inefficient for bulk loads. The patch introduces a new FDW API ExecForeignCopyIn() that can receive multiple rows and copy.c now calls it every time it flushes the multi-insert buffer so that all the flushed rows can be sent to the remote side in one go. postgres_fdw's now issues a `COPY remote_table FROM STDIN` to the remote server and postgresExecForeignCopyIn() funnels the tuples flushed by the local copy to the server side waiting for tuples on the COPY protocol. Here are some comments on the patch. * Why the "In" in these API names? + /* COPY a bulk of tuples into a foreign relation */ + BeginForeignCopyIn_function BeginForeignCopyIn; + EndForeignCopyIn_function EndForeignCopyIn; + ExecForeignCopyIn_function ExecForeignCopyIn; * fdwhandler.sgml should be updated with the description of these new APIs. * As far as I can tell, the following copy.h additions are for an FDW to use copy.c to obtain an external representation (char string) to send to the remote side of the individual rows that are passed to ExecForeignCopyIn(): +typedef void (*copy_data_dest_cb) (void *outbuf, int len); +extern CopyState BeginForeignCopyTo(Relation rel); +extern char *NextForeignCopyRow(CopyState cstate, TupleTableSlot *slot); +extern void EndForeignCopyTo(CopyState cstate); So, an FDW's ExecForeignCopyIn() calls copy.c: NextForeignCopyRow() which in turn calls copy.c: CopyOneRowTo() which fills CopyState.fe_msgbuf. The data_dest_cb() callback that runs after fe_msgbuf contains the full row simply copies it into a palloc'd char buffer whose pointer is returned back to ExecForeignCopyIn(). I wonder why not let FDWs implement the callback and pass it to copy.c through BeginForeignCopyTo()? For example, you could implement a pgfdw_copy_data_dest_cb() in postgres_fdw.c which gets a direct pointer of fe_msgbuf to send it to the remote server. Do you think all FDWs would want to use copy,c like above? If not, maybe the above APIs are really postgres_fdw-specific? Anyway, adding comments above the definitions of these functions would be helpful. * I see that the remote copy is performed from scratch on every call of postgresExecForeignCopyIn(), but wouldn't it be more efficient to send the `COPY remote_table FROM STDIN` in postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn() when there are no errors during the copy? I tried implementing these two changes -- pgfdw_copy_data_dest_cb() and sending `COPY remote_table FROM STDIN` only once instead of on every flush -- and I see significant speedup. Please check the attached patch that applies on top of yours. One problem I spotted when trying my patch but didn't spend much time debugging is that local COPY cannot be interrupted by Ctrl+C anymore, but that should be fixable by adjusting PG_TRY() blocks. * ResultRelInfo.UseBulkModifying should be ri_usesBulkModify for consistency. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 0db8d74..5668977 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -2063,6 +2063,21 @@ postgresEndForeignInsert(EState *estate, finish_foreign_modify(fmstate); } +static PgFdwModifyState *copy_fmstate = NULL; + +static void +pgfdw_copy_dest_cb(void *buf, int len) +{ + PGconn *conn = copy_fmstate->conn; + + if (PQputCopyData(conn, (char *) buf, len) <= 0) + { + PGresult *res = PQgetResult(conn); + + pgfdw_report_error(ERROR, res, conn, true, copy_fmstate->query); + } +} + /* * * postgresBeginForeignCopyIn @@ -2076,6 +2091,8 @@ postgresBeginForeignCopyIn(ModifyTableState *mtstate, Relation rel = resultRelInfo->ri_RelationDesc; StringInfoData sql; RangeTblEntry *rte; + PGconn *conn; + PGresult *res; rte = exec_rt_fetch(resultRelInfo->ri_RangeTableIndex, mtstate->ps.state); initStringInfo(); @@ -2090,8 +2107,16 @@ postgresBeginForeignCopyIn(ModifyTableState *mtstate, NIL, false, NIL); - fmstate->cstate = BeginForeignCopyTo(resultRelInfo->ri_RelationDesc); + fmstate->cstate = BeginForeignCopyTo(rel, pgfdw_copy_dest_cb); resultRelInfo->ri_FdwState = fmstate; + + conn = fmstate->conn; + res = PQexec(conn, fmstate->query); + if (PQresultStatus(res) != PGRES_COPY_IN) + pgfdw_report_error(ERROR, res, conn, true, fmstate->query); + PQclear(res); + + copy_fmstate = fmstate; } /* @@ -2102,14 +2127,40 @@ static void
Re: [POC] Fast COPY FROM command for the table with foreign partitions
22.06.2020 17:11, Ashutosh Bapat пишет: On Wed, 17 Jun 2020 at 11:54, Andrey V. Lepikhov mailto:a.lepik...@postgrespro.ru>> wrote: On 6/15/20 10:26 AM, Ashutosh Bapat wrote: > Thanks Andrey for the patch. I am glad that the patch has taken care > of some corner cases already but there exist still more. > > COPY command constructed doesn't take care of dropped columns. There > is code in deparseAnalyzeSql which constructs list of columns for a > given foreign relation. 0002 patch attached here, moves that code to a > separate function and reuses it for COPY. If you find that code change > useful please include it in the main patch. Thanks, i included it. > 2. In the same case, if the foreign table declared locally didn't have > any non-dropped columns but the relation that it referred to on the > foreign server had some non-dropped columns, COPY command fails. I > added a test case for this in 0002 but haven't fixed it. I fixed it. This is very special corner case. The problem was that COPY FROM does not support semantics like the "INSERT INTO .. DEFAULT VALUES". To simplify the solution, i switched off bulk copying for this case. > I think this work is useful. Please add it to the next commitfest so > that it's tracked. Ok. It looks like we call BeginForeignInsert and EndForeignInsert even though actual copy is performed using BeginForeignCopy, ExecForeignCopy and EndForeignCopy. BeginForeignInsert constructs the INSERT query which looks unnecessary. Also some of the other PgFdwModifyState members are initialized unnecessarily. It also gives an impression that we are using INSERT underneath the copy. Instead a better way would be to call BeginForeignCopy instead of BeginForeignInsert and EndForeignCopy instead of EndForeignInsert, if we are going to use COPY protocol to copy data to the foreign server. Corresponding postgres_fdw implementations need to change in order to do that. Fixed. I replaced names of CopyIn FDW API. Also the partition routing initializer calls BeginForeignInsert or BeginForeignCopyIn routines in accordance with value of ResultRelInfo::UseBulkModifying. I introduced this parameter because foreign partitions can be placed at foreign servers with different types of foreign wrapper. Not all wrappers can support CopyIn API. Also I ran the Tomas Vondra benchmark. At my laptop we have results: * regular: 5000 ms. * Tomas buffering patch: 11000 ms. * This CopyIn patch: 8000 ms. -- regards, Andrey Lepikhov Postgres Professional >From ac43384af911acd0a07b3fae0ab25a9131a4504c Mon Sep 17 00:00:00 2001 From: Andrey Lepikhov Date: Thu, 9 Jul 2020 11:16:56 +0500 Subject: [PATCH] Fast COPY FROM into the foreign or sharded table. This feature enables bulk COPY into foreign table in the case of multi inserts is possible and foreign table has non-zero number of columns. --- contrib/postgres_fdw/deparse.c| 60 - .../postgres_fdw/expected/postgres_fdw.out| 33 ++- contrib/postgres_fdw/postgres_fdw.c | 130 ++ contrib/postgres_fdw/postgres_fdw.h | 1 + contrib/postgres_fdw/sql/postgres_fdw.sql | 28 ++ src/backend/commands/copy.c | 239 -- src/backend/executor/execMain.c | 1 + src/backend/executor/execPartition.c | 34 ++- src/include/commands/copy.h | 5 + src/include/foreign/fdwapi.h | 15 ++ src/include/nodes/execnodes.h | 8 + 11 files changed, 456 insertions(+), 98 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index ad37a74221..a37981ff66 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -184,6 +184,8 @@ static void appendAggOrderBy(List *orderList, List *targetList, static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); static Node *deparseSortGroupClause(Index ref, List *tlist, bool force_colno, deparse_expr_cxt *context); +static List *deparseRelColumnList(StringInfo buf, Relation rel, + bool enclose_in_parens); /* * Helper functions @@ -1758,6 +1760,20 @@ deparseInsertSql(StringInfo buf, RangeTblEntry *rte, withCheckOptionList, returningList, retrieved_attrs); } +/* + * Deparse COPY FROM into given buf. + * We need to use list of parameters at each query. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + (void) deparseRelColumnList(buf, rel, true); + + appendStringInfoString(buf, " FROM STDIN "); +} + /* * deparse remote UPDATE statement * @@ -2061,6 +2077,30 @@ deparseAnalyzeSizeSql(StringInfo buf, Relation rel) */ void deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) +{ + appendStringInfoString(buf, "SELECT "); + *retrieved_attrs =
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 6/22/20 5:11 PM, Ashutosh Bapat wrote: mailto:a.lepik...@postgrespro.ru>> wrote: It looks like we call BeginForeignInsert and EndForeignInsert even though actual copy is performed using BeginForeignCopy, ExecForeignCopy and EndForeignCopy. BeginForeignInsert constructs the INSERT query which looks unnecessary. Also some of the other PgFdwModifyState members are initialized unnecessarily. It also gives an impression that we are using INSERT underneath the copy. Instead a better way would be to call BeginForeignCopy instead of BeginForeignInsert and EndForeignCopy instead of EndForeignInsert, if we are going to use COPY protocol to copy data to the foreign server. Corresponding postgres_fdw implementations need to change in order to do that. I did not answer for a long time, because of waiting for the results of the discussion on Tomas approach to bulk INSERT/UPDATE/DELETE. It seems more general. I can move the query construction into the first execution of INSERT or COPY operation. But another changes seems more invasive because BeginForeignInsert/EndForeignInsert are used in the execPartition.c module. We will need to pass copy/insert state of operation into ExecFindPartition() and ExecCleanupTupleRouting(). -- regards, Andrey Lepikhov Postgres Professional
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Wed, 17 Jun 2020 at 11:54, Andrey V. Lepikhov wrote: > On 6/15/20 10:26 AM, Ashutosh Bapat wrote: > > Thanks Andrey for the patch. I am glad that the patch has taken care > > of some corner cases already but there exist still more. > > > > COPY command constructed doesn't take care of dropped columns. There > > is code in deparseAnalyzeSql which constructs list of columns for a > > given foreign relation. 0002 patch attached here, moves that code to a > > separate function and reuses it for COPY. If you find that code change > > useful please include it in the main patch. > > Thanks, i included it. > > > 2. In the same case, if the foreign table declared locally didn't have > > any non-dropped columns but the relation that it referred to on the > > foreign server had some non-dropped columns, COPY command fails. I > > added a test case for this in 0002 but haven't fixed it. > > I fixed it. > This is very special corner case. The problem was that COPY FROM does > not support semantics like the "INSERT INTO .. DEFAULT VALUES". To > simplify the solution, i switched off bulk copying for this case. > > > I think this work is useful. Please add it to the next commitfest so > > that it's tracked. > Ok. > It looks like we call BeginForeignInsert and EndForeignInsert even though actual copy is performed using BeginForeignCopy, ExecForeignCopy and EndForeignCopy. BeginForeignInsert constructs the INSERT query which looks unnecessary. Also some of the other PgFdwModifyState members are initialized unnecessarily. It also gives an impression that we are using INSERT underneath the copy. Instead a better way would be to call BeginForeignCopy instead of BeginForeignInsert and EndForeignCopy instead of EndForeignInsert, if we are going to use COPY protocol to copy data to the foreign server. Corresponding postgres_fdw implementations need to change in order to do that. This isn't a full review. I will continue reviewing this patch further. -- Best Wishes, Ashutosh
Re: [POC] Fast COPY FROM command for the table with foreign partitions
19.06.2020 19:58, Etsuro Fujita пишет: On Tue, Jun 2, 2020 at 2:51 PM Andrey Lepikhov wrote: Hiding the COPY code under the buffers management machinery allows us to generalize buffers machinery, execute one COPY operation on each buffer and simplify error handling. I'm not sure that it's really a good idea that the bulk-insert API is designed the way it's tightly coupled with the bulk-insert machinery in the core, because 1) some FDWs might want to send tuples provided by the core to the remote, one by one, without storing them in a buffer, or 2) some other FDWs might want to store the tuples in the buffer and send them in a lump as postgres_fdw in the proposed patch but might want to do so independently of MAX_BUFFERED_TUPLES and/or MAX_BUFFERED_BYTES defined in the bulk-insert machinery. I agree that we would need special handling for cases you mentioned above if we design this API based on something like the idea I proposed in that thread. Agreed As i understand, main idea of the thread, mentioned by you, is to add "COPY FROM" support without changes in FDW API. I don't think so; I think we should introduce new API for this feature to keep the ExecForeignInsert() API simple. Ok All that I can offer in this place now is to introduce one new ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM STDIN" operation, send tuples and close the operation. We can use the ExecForeignInsert() routine for each buffer tuple if ExecForeignBulkInsert() is not supported. Agreed. In the next version (see attachment) of the patch i removed Begin/End fdwapi routines. Now we have only the ExecForeignBulkInsert() routine. -- Andrey Lepikhov Postgres Professional >From 108dc421cec88ab5afd092f40da3fa31b8fcfbc5 Mon Sep 17 00:00:00 2001 From: "Andrey V. Lepikhov" Date: Mon, 22 Jun 2020 10:28:42 +0500 Subject: [PATCH] Fast COPY FROM into the foreign or sharded table. This feature enables bulk COPY into foreign table in the case of multi inserts is possible and foreign table has non-zero number of columns. --- contrib/postgres_fdw/deparse.c| 60 - .../postgres_fdw/expected/postgres_fdw.out| 33 ++- contrib/postgres_fdw/postgres_fdw.c | 87 contrib/postgres_fdw/postgres_fdw.h | 1 + contrib/postgres_fdw/sql/postgres_fdw.sql | 28 +++ src/backend/commands/copy.c | 206 -- src/include/commands/copy.h | 5 + src/include/foreign/fdwapi.h | 8 + 8 files changed, 344 insertions(+), 84 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index ad37a74221..a37981ff66 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -184,6 +184,8 @@ static void appendAggOrderBy(List *orderList, List *targetList, static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); static Node *deparseSortGroupClause(Index ref, List *tlist, bool force_colno, deparse_expr_cxt *context); +static List *deparseRelColumnList(StringInfo buf, Relation rel, + bool enclose_in_parens); /* * Helper functions @@ -1758,6 +1760,20 @@ deparseInsertSql(StringInfo buf, RangeTblEntry *rte, withCheckOptionList, returningList, retrieved_attrs); } +/* + * Deparse COPY FROM into given buf. + * We need to use list of parameters at each query. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + (void) deparseRelColumnList(buf, rel, true); + + appendStringInfoString(buf, " FROM STDIN "); +} + /* * deparse remote UPDATE statement * @@ -2061,6 +2077,30 @@ deparseAnalyzeSizeSql(StringInfo buf, Relation rel) */ void deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) +{ + appendStringInfoString(buf, "SELECT "); + *retrieved_attrs = deparseRelColumnList(buf, rel, false); + + /* Don't generate bad syntax for zero-column relation. */ + if (list_length(*retrieved_attrs) == 0) + appendStringInfoString(buf, "NULL"); + + /* + * Construct FROM clause + */ + appendStringInfoString(buf, " FROM "); + deparseRelation(buf, rel); +} + +/* + * Construct the list of columns of given foreign relation in the order they + * appear in the tuple descriptor of the relation. Ignore any dropped columns. + * Use column names on the foreign server instead of local names. + * + * Optionally enclose the list in parantheses. + */ +static List * +deparseRelColumnList(StringInfo buf, Relation rel, bool enclose_in_parens) { Oid relid = RelationGetRelid(rel); TupleDesc tupdesc = RelationGetDescr(rel); @@ -2069,10 +2109,8 @@ deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) List *options; ListCell *lc; bool first = true; + List *retrieved_attrs = NIL; - *retrieved_attrs = NIL; - - appendStringInfoString(buf, "SELECT "); for (i = 0; i < tupdesc->natts; i++) { /* Ignore
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On Tue, Jun 2, 2020 at 2:51 PM Andrey Lepikhov wrote: > 02.06.2020 05:02, Etsuro Fujita пишет: > > I think I also thought something similar to this before [1]. Will take a > > look. I'm still reviewing the patch, but let me comment on it. > > [1] > > https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp > I have looked into the thread. > My first version of the patch was like your idea. But when developing > the “COPY FROM” code, the following features were discovered: > 1. Two or more partitions can be placed at the same node. We need to > finish COPY into one partition before start COPY into another partition > at the same node. > 2. On any error we need to send EOF to all started "COPY .. FROM STDIN" > operations. Otherwise FDW can't cancel operation. > > Hiding the COPY code under the buffers management machinery allows us to > generalize buffers machinery, execute one COPY operation on each buffer > and simplify error handling. I'm not sure that it's really a good idea that the bulk-insert API is designed the way it's tightly coupled with the bulk-insert machinery in the core, because 1) some FDWs might want to send tuples provided by the core to the remote, one by one, without storing them in a buffer, or 2) some other FDWs might want to store the tuples in the buffer and send them in a lump as postgres_fdw in the proposed patch but might want to do so independently of MAX_BUFFERED_TUPLES and/or MAX_BUFFERED_BYTES defined in the bulk-insert machinery. I agree that we would need special handling for cases you mentioned above if we design this API based on something like the idea I proposed in that thread. > As i understand, main idea of the thread, mentioned by you, is to add > "COPY FROM" support without changes in FDW API. I don't think so; I think we should introduce new API for this feature to keep the ExecForeignInsert() API simple. > All that I can offer in this place now is to introduce one new > ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM > STDIN" operation, send tuples and close the operation. We can use the > ExecForeignInsert() routine for each buffer tuple if > ExecForeignBulkInsert() is not supported. Agreed. > One of main questions here is to use COPY TO machinery for serializing a > tuple. It is needed (if you will take a look into the patch) to > transform the CopyTo() routine to an iterative representation: > start/next/finish. May it be acceptable? +1 for the general idea. > In the attachment there is a patch with the correction of a stupid error. Thanks for the patch! Sorry for the delay. Best regards, Etsuro Fujita
Re: [POC] Fast COPY FROM command for the table with foreign partitions
On 6/15/20 10:26 AM, Ashutosh Bapat wrote: Thanks Andrey for the patch. I am glad that the patch has taken care of some corner cases already but there exist still more. COPY command constructed doesn't take care of dropped columns. There is code in deparseAnalyzeSql which constructs list of columns for a given foreign relation. 0002 patch attached here, moves that code to a separate function and reuses it for COPY. If you find that code change useful please include it in the main patch. Thanks, i included it. 2. In the same case, if the foreign table declared locally didn't have any non-dropped columns but the relation that it referred to on the foreign server had some non-dropped columns, COPY command fails. I added a test case for this in 0002 but haven't fixed it. I fixed it. This is very special corner case. The problem was that COPY FROM does not support semantics like the "INSERT INTO .. DEFAULT VALUES". To simplify the solution, i switched off bulk copying for this case. > I think this work is useful. Please add it to the next commitfest so > that it's tracked. Ok. -- Andrey Lepikhov Postgres Professional https://postgrespro.com >From abe4db0a5391735f7663daac81df579644a70fc3 Mon Sep 17 00:00:00 2001 From: Andrey Lepikhov Date: Wed, 17 Jun 2020 11:07:54 +0500 Subject: [PATCH] Fast COPY FROM into the foreign or sharded table. This feature enables bulk COPY into foreign table in the case of multi inserts is possible and foreign table has non-zero number of columns. --- contrib/postgres_fdw/deparse.c| 60 - .../postgres_fdw/expected/postgres_fdw.out| 33 ++- contrib/postgres_fdw/postgres_fdw.c | 98 contrib/postgres_fdw/postgres_fdw.h | 1 + contrib/postgres_fdw/sql/postgres_fdw.sql | 28 +++ src/backend/commands/copy.c | 223 -- src/include/commands/copy.h | 5 + src/include/foreign/fdwapi.h | 9 + 8 files changed, 374 insertions(+), 83 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index ad37a74221..a37981ff66 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -184,6 +184,8 @@ static void appendAggOrderBy(List *orderList, List *targetList, static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); static Node *deparseSortGroupClause(Index ref, List *tlist, bool force_colno, deparse_expr_cxt *context); +static List *deparseRelColumnList(StringInfo buf, Relation rel, + bool enclose_in_parens); /* * Helper functions @@ -1758,6 +1760,20 @@ deparseInsertSql(StringInfo buf, RangeTblEntry *rte, withCheckOptionList, returningList, retrieved_attrs); } +/* + * Deparse COPY FROM into given buf. + * We need to use list of parameters at each query. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + (void) deparseRelColumnList(buf, rel, true); + + appendStringInfoString(buf, " FROM STDIN "); +} + /* * deparse remote UPDATE statement * @@ -2061,6 +2077,30 @@ deparseAnalyzeSizeSql(StringInfo buf, Relation rel) */ void deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) +{ + appendStringInfoString(buf, "SELECT "); + *retrieved_attrs = deparseRelColumnList(buf, rel, false); + + /* Don't generate bad syntax for zero-column relation. */ + if (list_length(*retrieved_attrs) == 0) + appendStringInfoString(buf, "NULL"); + + /* + * Construct FROM clause + */ + appendStringInfoString(buf, " FROM "); + deparseRelation(buf, rel); +} + +/* + * Construct the list of columns of given foreign relation in the order they + * appear in the tuple descriptor of the relation. Ignore any dropped columns. + * Use column names on the foreign server instead of local names. + * + * Optionally enclose the list in parantheses. + */ +static List * +deparseRelColumnList(StringInfo buf, Relation rel, bool enclose_in_parens) { Oid relid = RelationGetRelid(rel); TupleDesc tupdesc = RelationGetDescr(rel); @@ -2069,10 +2109,8 @@ deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) List *options; ListCell *lc; bool first = true; + List *retrieved_attrs = NIL; - *retrieved_attrs = NIL; - - appendStringInfoString(buf, "SELECT "); for (i = 0; i < tupdesc->natts; i++) { /* Ignore dropped columns. */ @@ -2081,6 +2119,9 @@ deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) if (!first) appendStringInfoString(buf, ", "); + else if (enclose_in_parens) + appendStringInfoChar(buf, '('); + first = false; /* Use attribute name or column_name option. */ @@ -2100,18 +2141,13 @@ deparseAnalyzeSql(StringInfo buf, Relation rel, List **retrieved_attrs) appendStringInfoString(buf, quote_identifier(colname)); - *retrieved_attrs = lappend_int(*retrieved_attrs, i + 1); + retrieved_attrs =
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Thanks Andrey for the patch. I am glad that the patch has taken care of some corner cases already but there exist still more. COPY command constructed doesn't take care of dropped columns. There is code in deparseAnalyzeSql which constructs list of columns for a given foreign relation. 0002 patch attached here, moves that code to a separate function and reuses it for COPY. If you find that code change useful please include it in the main patch. While working on that, I found two issues 1. The COPY command constructed an empty columns list when there were no non-dropped columns in the relation. This caused a syntax error. Fixed that in 0002. 2. In the same case, if the foreign table declared locally didn't have any non-dropped columns but the relation that it referred to on the foreign server had some non-dropped columns, COPY command fails. I added a test case for this in 0002 but haven't fixed it. I think this work is useful. Please add it to the next commitfest so that it's tracked. On Tue, Jun 2, 2020 at 11:21 AM Andrey Lepikhov wrote: > > Thank you for the answer, > > 02.06.2020 05:02, Etsuro Fujita пишет: > > I think I also thought something similar to this before [1]. Will take a > > look. > > > [1] > > https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp > > > I have looked into the thread. > My first version of the patch was like your idea. But when developing > the “COPY FROM” code, the following features were discovered: > 1. Two or more partitions can be placed at the same node. We need to > finish COPY into one partition before start COPY into another partition > at the same node. > 2. On any error we need to send EOF to all started "COPY .. FROM STDIN" > operations. Otherwise FDW can't cancel operation. > > Hiding the COPY code under the buffers management machinery allows us to > generalize buffers machinery, execute one COPY operation on each buffer > and simplify error handling. > > As i understand, main idea of the thread, mentioned by you, is to add > "COPY FROM" support without changes in FDW API. > It is possible to remove BeginForeignCopy() and EndForeignCopy() from > the patch. But it is not trivial to change ExecForeignInsert() for the > COPY purposes. > All that I can offer in this place now is to introduce one new > ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM > STDIN" operation, send tuples and close the operation. We can use the > ExecForeignInsert() routine for each buffer tuple if > ExecForeignBulkInsert() is not supported. > > One of main questions here is to use COPY TO machinery for serializing a > tuple. It is needed (if you will take a look into the patch) to > transform the CopyTo() routine to an iterative representation: > start/next/finish. May it be acceptable? > > In the attachment there is a patch with the correction of a stupid error. > > -- > Andrey Lepikhov > Postgres Professional > https://postgrespro.com > The Russian Postgres Company -- Best Wishes, Ashutosh Bapat From 9c4e09bd03cb98b1f84c42c34ce7b76e0a87011c Mon Sep 17 00:00:00 2001 From: "Andrey V. Lepikhov" Date: Fri, 29 May 2020 10:39:57 +0500 Subject: [PATCH 1/2] Fast COPY FROM into the foreign (or sharded) table. --- contrib/postgres_fdw/deparse.c| 25 ++ .../postgres_fdw/expected/postgres_fdw.out| 5 +- contrib/postgres_fdw/postgres_fdw.c | 95 contrib/postgres_fdw/postgres_fdw.h | 1 + src/backend/commands/copy.c | 213 -- src/include/commands/copy.h | 5 + src/include/foreign/fdwapi.h | 9 + 7 files changed, 286 insertions(+), 67 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index ad37a74221..427402c8eb 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -1758,6 +1758,31 @@ deparseInsertSql(StringInfo buf, RangeTblEntry *rte, withCheckOptionList, returningList, retrieved_attrs); } +/* + * Deparse COPY FROM into given buf. + * We need to use list of parameters at each query. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + int attnum; + + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + appendStringInfoString(buf, " ( "); + + for(attnum = 0; attnum < rel->rd_att->natts; attnum++) + { + appendStringInfoString(buf, NameStr(rel->rd_att->attrs[attnum].attname)); + + if (attnum != rel->rd_att->natts-1) + appendStringInfoString(buf, ", "); + } + + appendStringInfoString(buf, " ) "); + appendStringInfoString(buf, " FROM STDIN "); +} + /* * deparse remote UPDATE statement * diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 82fc1290ef..922c08d2dc 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -8063,8 +8063,9 @@ copy rem2 from stdin; copy rem2
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Thank you for the answer, 02.06.2020 05:02, Etsuro Fujita пишет: I think I also thought something similar to this before [1]. Will take a look. [1] https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp I have looked into the thread. My first version of the patch was like your idea. But when developing the “COPY FROM” code, the following features were discovered: 1. Two or more partitions can be placed at the same node. We need to finish COPY into one partition before start COPY into another partition at the same node. 2. On any error we need to send EOF to all started "COPY .. FROM STDIN" operations. Otherwise FDW can't cancel operation. Hiding the COPY code under the buffers management machinery allows us to generalize buffers machinery, execute one COPY operation on each buffer and simplify error handling. As i understand, main idea of the thread, mentioned by you, is to add "COPY FROM" support without changes in FDW API. It is possible to remove BeginForeignCopy() and EndForeignCopy() from the patch. But it is not trivial to change ExecForeignInsert() for the COPY purposes. All that I can offer in this place now is to introduce one new ExecForeignBulkInsert(buf) routine that will execute single "COPY FROM STDIN" operation, send tuples and close the operation. We can use the ExecForeignInsert() routine for each buffer tuple if ExecForeignBulkInsert() is not supported. One of main questions here is to use COPY TO machinery for serializing a tuple. It is needed (if you will take a look into the patch) to transform the CopyTo() routine to an iterative representation: start/next/finish. May it be acceptable? In the attachment there is a patch with the correction of a stupid error. -- Andrey Lepikhov Postgres Professional https://postgrespro.com The Russian Postgres Company >From 92a86ca52e33444ef2f559a1da9c8632398892a4 Mon Sep 17 00:00:00 2001 From: "Andrey V. Lepikhov" Date: Fri, 29 May 2020 10:39:57 +0500 Subject: [PATCH] Fast COPY FROM into the foreign (or sharded) table. --- contrib/postgres_fdw/deparse.c| 25 ++ .../postgres_fdw/expected/postgres_fdw.out| 5 +- contrib/postgres_fdw/postgres_fdw.c | 95 contrib/postgres_fdw/postgres_fdw.h | 1 + src/backend/commands/copy.c | 213 -- src/include/commands/copy.h | 5 + src/include/foreign/fdwapi.h | 9 + 7 files changed, 286 insertions(+), 67 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index ad37a74221..427402c8eb 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -1758,6 +1758,31 @@ deparseInsertSql(StringInfo buf, RangeTblEntry *rte, withCheckOptionList, returningList, retrieved_attrs); } +/* + * Deparse COPY FROM into given buf. + * We need to use list of parameters at each query. + */ +void +deparseCopyFromSql(StringInfo buf, Relation rel) +{ + int attnum; + + appendStringInfoString(buf, "COPY "); + deparseRelation(buf, rel); + appendStringInfoString(buf, " ( "); + + for(attnum = 0; attnum < rel->rd_att->natts; attnum++) + { + appendStringInfoString(buf, NameStr(rel->rd_att->attrs[attnum].attname)); + + if (attnum != rel->rd_att->natts-1) + appendStringInfoString(buf, ", "); + } + + appendStringInfoString(buf, " ) "); + appendStringInfoString(buf, " FROM STDIN "); +} + /* * deparse remote UPDATE statement * diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 90db550b92..5ae24fef7c 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -8063,8 +8063,9 @@ copy rem2 from stdin; copy rem2 from stdin; -- ERROR ERROR: new row for relation "loc2" violates check constraint "loc2_f1positive" DETAIL: Failing row contains (-1, xyzzy). -CONTEXT: remote SQL command: INSERT INTO public.loc2(f1, f2) VALUES ($1, $2) -COPY rem2, line 1: "-1 xyzzy" +CONTEXT: COPY loc2, line 1: "-1 xyzzy" +remote SQL command: COPY public.loc2 ( f1, f2 ) FROM STDIN +COPY rem2, line 2 select * from rem2; f1 | f2 +- diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 9fc53cad68..bd2a8f596f 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -18,6 +18,7 @@ #include "access/sysattr.h" #include "access/table.h" #include "catalog/pg_class.h" +#include "commands/copy.h" #include "commands/defrem.h" #include "commands/explain.h" #include "commands/vacuum.h" @@ -190,6 +191,7 @@ typedef struct PgFdwModifyState /* for update row movement if subplan result rel */ struct PgFdwModifyState *aux_fmstate; /* foreign-insert state, if * created */ + CopyState fdwcstate; } PgFdwModifyState; /* @@ -350,12 +352,16 @@ static TupleTableSlot
Re: [POC] Fast COPY FROM command for the table with foreign partitions
Hi Andrey, On Mon, Jun 1, 2020 at 6:29 PM Andrey Lepikhov wrote: > Currently i see, COPY FROM insertion into the partitioned table with > foreign partitions is not optimal: even if table constraints allows can > do multi insert copy, we will flush the buffers and prepare new INSERT > query for each tuple, routed into the foreign partition. > To solve this problem i tried to use the multi insert buffers for > foreign tuples too. Flushing of these buffers performs by the analogy > with 'COPY .. FROM STDIN' machinery as it is done by the psql '\copy' > command. > The patch in attachment was prepared from the private scratch developed > by Arseny Sher a couple of years ago. > Benchmarks shows that it speeds up COPY FROM operation: > Command "COPY pgbench_accounts FROM ..." (test file contains 1e7 tuples, > copy to three partitions) executes on my laptop in 14 minutes without > the patch and in 1.5 minutes with the patch. Theoretical minimum here > (with infinite buffer size) is 40 seconds. Great! > A couple of questions: > 1. Can this feature be interesting for the PostgreSQL core or not? Yeah, I think this is especially useful for sharding. > 2. If this is a useful feature, is the correct way chosen? I think I also thought something similar to this before [1]. Will take a look. Thanks! Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/23990375-45a6-5823-b0aa-a6a7a6a957f0%40lab.ntt.co.jp