Re: should INSERT SELECT use a BulkInsertState?

2021-11-04 Thread Daniel Gustafsson
> On 27 Sep 2021, at 02:08, Justin Pryzby  wrote:

> I split off the simple part again.  If there's no interest in the 0001 patch
> alone, then I guess it should be closed in the CF.

Since the thread has stalled, maybe that's the best course of action here.  Any
objections from anyone on the thread?

--
Daniel Gustafsson   https://vmware.com/





Re: should INSERT SELECT use a BulkInsertState?

2021-09-26 Thread Justin Pryzby
On Mon, May 11, 2020 at 03:19:34PM +0900, Michael Paquier wrote:
> On Fri, May 08, 2020 at 02:25:45AM -0500, Justin Pryzby wrote:
> > Seems to me it should, at least conditionally.  At least if there's a
> > function scan or a relation or ...
> >
> > I mentioned a bit about our use-case here:
> > https://www.postgresql.org/message-id/20200219173742.GA30939%40telsasoft.com
> > => I'd prefer our loaders to write their own data rather than dirtying
> > large fractions of the buffer cache and leaving it around for other
> > backends to clean up.
>
> Does it matter in terms of performance and for which cases does it
> actually matter?

Every 15min we're inserting 10s of thousands of rows, which dirties a large
number of buffers:

postgres=# CREATE EXTENSION pg_buffercache; DROP TABLE tt;
CREATE TABLE tt(i int); INSERT INTO tt SELECT generate_series(1,99);
SELECT usagecount, COUNT(1) FROM pg_buffercache
WHERE isdirty GROUP BY 1 ORDER BY 1;
 usagecount | count
------------+-------
          1 |     1
          2 |     1
          3 |     2
          4 |     2
          5 |  4436

With this patch it dirties fewer pages and with lower usage count:

          1 |  2052
          2 |     1
          3 |     3
          4 |     2
          5 |    10

The goal is to avoid cache churn by using a small ring buffer.
Note that indexes on the target table will themselves use up buffers, and
BulkInsert won't help so much.

On Thu, Mar 18, 2021 at 08:29:50AM -0700, Zhihong Yu wrote:
> +   mtstate->ntuples > bulk_insert_ntuples &&
> +   bulk_insert_ntuples >= 0)
> 
> I wonder why bulk_insert_ntuples == 0 is included in the above. It
> seems bulk_insert_ntuples having value of 0 should mean not enabling bulk
> insertions.

I think it ought to be possible to enable bulk insertions immediately, which is
what 0 does.  -1 is the value defined to mean "do not use bulk insert".
I realize there's no documentation yet.

This patch started out with the goal of using a BulkInsertState for INSERT,
same as for COPY.  We use INSERT ON CONFLICT VALUES(),()..., and it'd be nice
if our data loaders could avoid leaving behind dirty buffers.

Simon suggested to also use MultiInsertInfo.  However that patch is complicated:
it cannot work with volatile default expressions, and probably many other
things that can appear in a SELECT but are not supported by miistate.  That may
be better handled by another patch (or in any case by someone else) like this
one:
| New Table Access Methods for Multi and Single Inserts
https://commitfest.postgresql.org/31/2871/

I split off the simple part again.  If there's no interest in the 0001 patch
alone, then I guess it should be closed in the CF.

However, the "new table AM" patch doesn't address our case, since neither
VALUES nor INSERT SELECT is considered a bulk insert.

-- 
Justin
>From 19eea95d1eca89a90f3f1b2992e248e3e3459caa Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 27 Apr 2021 09:09:13 -0500
Subject: [PATCH v11] ExecInsert to use BulkInsertState

---
 src/backend/executor/nodeModifyTable.c | 34 --
 src/backend/utils/misc/guc.c   |  9 +++
 src/include/executor/nodeModifyTable.h |  2 ++
 src/include/nodes/execnodes.h  |  5 
 src/test/regress/expected/insert.out   | 15 
 src/test/regress/sql/insert.sql| 10 
 6 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d328856ae5..9ce81dc922 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -81,6 +81,8 @@ static TupleTableSlot *ExecPrepareTupleRouting(ModifyTableState *mtstate,
 			   ResultRelInfo *targetRelInfo,
 			   TupleTableSlot *slot,
 			   ResultRelInfo **partRelInfo);
+/* guc */
+int bulk_insert_ntuples = 1000;
 
 /*
  * Verify that the tuples to be produced by INSERT match the
@@ -625,6 +627,19 @@ ExecInsert(ModifyTableState *mtstate,
 
 	resultRelationDesc = resultRelInfo->ri_RelationDesc;
 
+	/* Use bulk insert after a threshold number of tuples */
+	// XXX: maybe this should only be done if it's not a partitioned table or
+	// if the partitions don't support miinfo, which uses its own bistates
+	mtstate->ntuples++;
+	if (mtstate->bistate == NULL &&
+			mtstate->ntuples > bulk_insert_ntuples &&
+			bulk_insert_ntuples >= 0)
+	{
+		elog(DEBUG1, "enabling bulk insert");
+		mtstate->bistate = 

RE: should INSERT SELECT use a BulkInsertState?

2021-03-22 Thread houzj.f...@fujitsu.com
Hi,

About the 0002-patch [Check for volatile defaults].

I wonder if we can check for volatile default values by traversing
"query->targetList" in the planner.

IMO, the column default expression is written into the targetList, and the
current parallel-safety check traverses the "query->targetList" to determine
whether it contains an unsafe column default expression.
Like: standard_planner -> query_tree_walker
if (walker((Node *) query->targetList, context))
return true;
Maybe we can do a similar thing to check the volatile defaults; if so, we do
not need to add a field to TargetEntry.

Best regards,
houzj





Re: should INSERT SELECT use a BulkInsertState?

2021-03-18 Thread Zhihong Yu
Hi,
+   mtstate->ntuples > bulk_insert_ntuples &&
+   bulk_insert_ntuples >= 0)

I wonder why bulk_insert_ntuples == 0 is included in the above. It
seems bulk_insert_ntuples having value of 0 should mean not enabling bulk
insertions.

+   }
+   else
+   {

nit: the else should be on the same line as the right brace.

Cheers

On Thu, Mar 18, 2021 at 6:38 AM Ibrar Ahmed  wrote:

>
>
> On Mon, Mar 8, 2021 at 2:18 PM houzj.f...@fujitsu.com <
> houzj.f...@fujitsu.com> wrote:
>
>> > > > I am very interested in this patch, and I plan to do some
>> > > > experiments with the patch.  Can you please rebase the patch because
>> > > > it seems it cannot be applied to master now.
>> > >
>> > > Thanks for your interest.
>> > >
>> > > I was sitting on a rebased version since the bulk FDW patch will cause
>> > > conflicts, and since this should maybe be built on top of the table-am
>> > > patch (2871).
>> > > Have fun :)
>> > Hi,
>> >
>> > When testing with the patch, I found I cannot use "\d tablename".
>> > It reports the following error; is this related to the patch?
>>
>> Sorry, solved by re-initdb.
>>
>> Best regards,
>> houzj
>>
>>
> One of the patches
> (v10-0001-INSERT-SELECT-to-use-BulkInsertState-and-multi_i.patch) from the
> patchset does not apply.
>
> http://cfbot.cputube.org/patch_32_2553.log
>
> 1 out of 13 hunks FAILED -- saving rejects to file
> src/backend/commands/copyfrom.c.rej
>
> It is a minor change, therefore I fixed that to make cfbot happy. Please
> take a look if that works for you.
>
>
>
>
> --
> Ibrar Ahmed
>


Re: should INSERT SELECT use a BulkInsertState?

2021-03-18 Thread Ibrar Ahmed
On Mon, Mar 8, 2021 at 2:18 PM houzj.f...@fujitsu.com <
houzj.f...@fujitsu.com> wrote:

> > > > I am very interested in this patch, and I plan to do some
> > > > experiments with the patch.  Can you please rebase the patch because
> > > > it seems it cannot be applied to master now.
> > >
> > > Thanks for your interest.
> > >
> > > I was sitting on a rebased version since the bulk FDW patch will cause
> > > conflicts, and since this should maybe be built on top of the table-am
> > > patch (2871).
> > > Have fun :)
> > Hi,
> >
> > When testing with the patch, I found I cannot use "\d tablename".
> > It reports the following error; is this related to the patch?
>
> Sorry, solved by re-initdb.
>
> Best regards,
> houzj
>
>
One of the patches
(v10-0001-INSERT-SELECT-to-use-BulkInsertState-and-multi_i.patch) from the
patchset does not apply.

http://cfbot.cputube.org/patch_32_2553.log

1 out of 13 hunks FAILED -- saving rejects to file
src/backend/commands/copyfrom.c.rej

It is a minor change, therefore I fixed that to make cfbot happy. Please
take a look if that works for you.




-- 
Ibrar Ahmed


v11-0001-INSERT-SELECT-to-use-BulkInsertState-and-multi_i.patch
Description: Binary data


RE: should INSERT SELECT use a BulkInsertState?

2021-03-08 Thread houzj.f...@fujitsu.com
> > > I am very interested in this patch, and I plan to do some
> > > experiments with the patch.  Can you please rebase the patch because
> > > it seems it cannot be applied to master now.
> >
> > Thanks for your interest.
> >
> > I was sitting on a rebased version since the bulk FDW patch will cause
> > conflicts, and since this should maybe be built on top of the table-am
> > patch (2871).
> > Have fun :)
> Hi,
> 
> When testing with the patch, I found I cannot use "\d tablename".
> It reports the following error; is this related to the patch?

Sorry, solved by re-initdb.

Best regards,
houzj




RE: should INSERT SELECT use a BulkInsertState?

2021-03-08 Thread houzj.f...@fujitsu.com
> > I am very interested in this patch, and I plan to do some experiments
> > with the patch.  Can you please rebase the patch because it seems it
> > cannot be applied to master now.
> 
> Thanks for your interest.
> 
> I was sitting on a rebased version since the bulk FDW patch will cause
> conflicts, and since this should maybe be built on top of the table-am
> patch (2871).
> Have fun :)
Hi,

When testing with the patch, I found I cannot use "\d tablename".
It reports the following error; is this related to the patch?

--
ERROR:  did not find '}' at end of input node at character 141
STATEMENT:  SELECT pol.polname, pol.polpermissive,
  CASE WHEN pol.polroles = '{0}' THEN NULL ELSE 
pg_catalog.array_to_string(array(select rolname from pg_catalog.pg_roles where 
oid = any (pol.polroles) order by 1),',') END,
  pg_catalog.pg_get_expr(pol.polqual, pol.polrelid),
  pg_catalog.pg_get_expr(pol.polwithcheck, pol.polrelid),
  CASE pol.polcmd
WHEN 'r' THEN 'SELECT'
WHEN 'a' THEN 'INSERT'
WHEN 'w' THEN 'UPDATE'
WHEN 'd' THEN 'DELETE'
END AS cmd
FROM pg_catalog.pg_policy pol
WHERE pol.polrelid = '58112' ORDER BY 1;
ERROR:  did not find '}' at end of input node
LINE 2: ...catalog.array_to_string(array(select rolname from pg_catalog...
--

Best regards,
houzj




Re: should INSERT SELECT use a BulkInsertState?

2021-02-21 Thread Justin Pryzby
On Mon, Feb 22, 2021 at 02:25:22AM +, houzj.f...@fujitsu.com wrote:
> > > Yes, you can see that I've copied the checks from copy.
> > > Like copy, some checks are done once, in ExecInitModifyTable, outside
> > > of the ExecModifyTable "loop".
> > >
> > > This squishes some commits together.
> > > And uses bistate for ON CONFLICT.
> > > And attempts to use memory context for tuple size.
> > 
> > Rebased on 9dc718bdf2b1a574481a45624d42b674332e2903
> > 
> > I guess my patch should/may be subsumed by this other one - I'm fine with
> > that.
> > https://commitfest.postgresql.org/31/2871/
> > 
> > Note that my interest here is just in bistate, to avoid leaving behind
> > many dirty buffers, not improved performance of COPY.
> 
> I am very interested in this patch, and I plan to do some experiments with
> the patch.  Can you please rebase the patch because it seems it cannot be
> applied to master now.

Thanks for your interest.

I was sitting on a rebased version since the bulk FDW patch will cause
conflicts, and since this should maybe be built on top of the table-am patch
(2871).  Have fun :)

-- 
Justin
>From e2b93b3b3aaa32f680193b42d91a80bab40768a4 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v10 1/4] INSERT SELECT to use BulkInsertState and multi_insert

Renames structures;
Move MultipleInsert functions from copyfrom.c to (tentatively) nodeModifyTable.h;
Move into MultiInsertInfo: transition_capture and cur_lineno (via cstate->miinfo);

Dynamically switch to multi-insert mode based on the number of insertions.
This is intended to accommodate 1) the original use case of INSERT using a small
ring buffer to avoid leaving behind dirty buffers; and, 2) Automatically using
multi-inserts for batch operations; 3) allow the old behavior of leaving behind
dirty buffers, which might allow INSERT to run more quickly, at the cost of
leaving behind many dirty buffers which other backends may have to write out.

XXX: for (1), the bulk-insert state is used even if not multi-insert, including
for a VALUES.

TODO: use cstate->miinfo.cur_lineno++ instead of mtstate->miinfo->ntuples
---
 src/backend/commands/copyfrom.c  | 394 +--
 src/backend/commands/copyfromparse.c |  10 +-
 src/backend/executor/execMain.c  |   2 +-
 src/backend/executor/execPartition.c |   2 +-
 src/backend/executor/nodeModifyTable.c   | 196 +--
 src/backend/utils/misc/guc.c |  10 +
 src/include/commands/copyfrom_internal.h |   5 +-
 src/include/executor/nodeModifyTable.h   | 371 +
 src/include/nodes/execnodes.h|  16 +-
 src/test/regress/expected/insert.out |  43 +++
 src/test/regress/sql/insert.sql  |  20 ++
 src/tools/pgindent/typedefs.list |   4 +-
 12 files changed, 658 insertions(+), 415 deletions(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 796ca7b3f7..5b8a1e4b61 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -46,54 +46,6 @@
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
 
-/*
- * No more than this many tuples per CopyMultiInsertBuffer
- *
- * Caution: Don't make this too big, as we could end up with this many
- * CopyMultiInsertBuffer items stored in CopyMultiInsertInfo's
- * multiInsertBuffers list.  Increasing this can cause quadratic growth in
- * memory requirements during copies into partitioned tables with a large
- * number of partitions.
- */
-#define MAX_BUFFERED_TUPLES		1000
-
-/*
- * Flush buffers if there are >= this many bytes, as counted by the input
- * size, of tuples stored.
- */
-#define MAX_BUFFERED_BYTES		65535
-
-/* Trim the list of buffers back down to this number after flushing */
-#define MAX_PARTITION_BUFFERS	32
-
-/* Stores multi-insert data related to a single relation in CopyFrom. */
-typedef struct CopyMultiInsertBuffer
-{
-	TupleTableSlot *slots[MAX_BUFFERED_TUPLES]; /* Array to store tuples */
-	ResultRelInfo *resultRelInfo;	/* ResultRelInfo for 'relid' */
-	BulkInsertState bistate;	/* BulkInsertState for this rel */
-	int			nused;			/* number of 'slots' containing tuples */
-	uint64		linenos[MAX_BUFFERED_TUPLES];	/* Line # of tuple in copy
- * stream */
-} CopyMultiInsertBuffer;
-
-/*
- * Stores one or many CopyMultiInsertBuffers and details about the size and
- * number of tuples which are stored in them.  This allows multiple buffers to
- * exist at once when COPYing into a partitioned table.
- */
-typedef struct CopyMultiInsertInfo
-{
-	List	   *multiInsertBuffers; /* List of tracked CopyMultiInsertBuffers */
-	int			bufferedTuples; /* number of tuples buffered over all buffers */
-	int			bufferedBytes;	/* number of bytes from all buffered tuples */
-	CopyFromState	cstate;			/* Copy state for this CopyMultiInsertInfo */
-	EState	   *estate;			/* Executor state used for COPY */
-	CommandId	mycid;			/* Command Id used for COPY */
-	int			

Re: should INSERT SELECT use a BulkInsertState?

2021-01-20 Thread Justin Pryzby
On Sat, Dec 05, 2020 at 01:59:41PM -0600, Justin Pryzby wrote:
> On Thu, Dec 03, 2020 at 10:59:34AM +0530, Bharath Rupireddy wrote:
> > On Wed, Dec 2, 2020 at 10:24 PM Justin Pryzby  wrote:
> > >
> > > One loose end in this patch is how to check for volatile default 
> > > expressions.
> > 
> > I think we should be doing all the necessary checks in the planner and
> > have a flag in the planned stmt to indicate whether to go with multi
> > insert or not. For the required checks, we can have a look at how the
> > existing COPY decides to go with either CIM_MULTI or CIM_SINGLE.
> 
> Yes, you can see that I've copied the checks from copy.
> Like copy, some checks are done once, in ExecInitModifyTable, outside of the
> ExecModifyTable "loop".
> 
> This squishes some commits together.
> And uses bistate for ON CONFLICT.
> And attempts to use memory context for tuple size.

Rebased on 9dc718bdf2b1a574481a45624d42b674332e2903

I guess my patch should/may be subsumed by this other one - I'm fine with that.
https://commitfest.postgresql.org/31/2871/

Note that my interest here is just in bistate, to avoid leaving behind many
dirty buffers, not improved performance of COPY.

-- 
Justin
>From 65fd9d9352634e4d0ad49685a988c5e4e157b9d8 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v9 1/4] INSERT SELECT to use BulkInsertState and multi_insert

Renames structures;
Move MultipleInsert functions from copyfrom.c to (tentatively) nodeModifyTable.h;
Move into MultiInsertInfo: transition_capture and cur_lineno (via cstate->miinfo);

Dynamically switch to multi-insert mode based on the number of insertions.
This is intended to accommodate 1) the original use case of INSERT using a small
ring buffer to avoid leaving behind dirty buffers; and, 2) Automatically using
multi-inserts for batch operations; 3) allow the old behavior of leaving behind
dirty buffers, which might allow INSERT to run more quickly, at the cost of
leaving behind many dirty buffers which other backends may have to write out.

XXX: for (1), the bulk-insert state is used even if not multi-insert, including
for a VALUES.

TODO: use cstate->miinfo.cur_lineno++ instead of mtstate->miinfo->ntuples
---
 src/backend/commands/copyfrom.c  | 394 +--
 src/backend/commands/copyfromparse.c |  10 +-
 src/backend/executor/execMain.c  |   2 +-
 src/backend/executor/execPartition.c |   2 +-
 src/backend/executor/nodeModifyTable.c   | 196 +--
 src/backend/utils/misc/guc.c |  10 +
 src/include/commands/copyfrom_internal.h |   5 +-
 src/include/executor/nodeModifyTable.h   | 371 +
 src/include/nodes/execnodes.h|  16 +-
 src/test/regress/expected/insert.out |  43 +++
 src/test/regress/sql/insert.sql  |  20 ++
 src/tools/pgindent/typedefs.list |   4 +-
 12 files changed, 658 insertions(+), 415 deletions(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index c39cc736ed..8221a2c5d3 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -46,54 +46,6 @@
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
 
-/*
- * No more than this many tuples per CopyMultiInsertBuffer
- *
- * Caution: Don't make this too big, as we could end up with this many
- * CopyMultiInsertBuffer items stored in CopyMultiInsertInfo's
- * multiInsertBuffers list.  Increasing this can cause quadratic growth in
- * memory requirements during copies into partitioned tables with a large
- * number of partitions.
- */
-#define MAX_BUFFERED_TUPLES		1000
-
-/*
- * Flush buffers if there are >= this many bytes, as counted by the input
- * size, of tuples stored.
- */
-#define MAX_BUFFERED_BYTES		65535
-
-/* Trim the list of buffers back down to this number after flushing */
-#define MAX_PARTITION_BUFFERS	32
-
-/* Stores multi-insert data related to a single relation in CopyFrom. */
-typedef struct CopyMultiInsertBuffer
-{
-	TupleTableSlot *slots[MAX_BUFFERED_TUPLES]; /* Array to store tuples */
-	ResultRelInfo *resultRelInfo;	/* ResultRelInfo for 'relid' */
-	BulkInsertState bistate;	/* BulkInsertState for this rel */
-	int			nused;			/* number of 'slots' containing tuples */
-	uint64		linenos[MAX_BUFFERED_TUPLES];	/* Line # of tuple in copy
- * stream */
-} CopyMultiInsertBuffer;
-
-/*
- * Stores one or many CopyMultiInsertBuffers and details about the size and
- * number of tuples which are stored in them.  This allows multiple buffers to
- * exist at once when COPYing into a partitioned table.
- */
-typedef struct CopyMultiInsertInfo
-{
-	List	   *multiInsertBuffers; /* List of tracked CopyMultiInsertBuffers */
-	int			bufferedTuples; /* number of tuples buffered over all buffers */
-	int			bufferedBytes;	/* number of bytes from all buffered tuples */
-	CopyFromState	cstate;			/* Copy state for this CopyMultiInsertInfo */
-	EState	   *estate;			/* Executor state used for COPY */

Re: should INSERT SELECT use a BulkInsertState?

2020-12-05 Thread Justin Pryzby
On Thu, Dec 03, 2020 at 10:59:34AM +0530, Bharath Rupireddy wrote:
> On Wed, Dec 2, 2020 at 10:24 PM Justin Pryzby  wrote:
> >
> > One loose end in this patch is how to check for volatile default 
> > expressions.
> 
> I think we should be doing all the necessary checks in the planner and
> have a flag in the planned stmt to indicate whether to go with multi
> insert or not. For the required checks, we can have a look at how the
> existing COPY decides to go with either CIM_MULTI or CIM_SINGLE.

Yes, you can see that I've copied the checks from copy.
Like copy, some checks are done once, in ExecInitModifyTable, outside of the
ExecModifyTable "loop".

This squishes some commits together.
And uses bistate for ON CONFLICT.
And attempts to use memory context for tuple size.

For the bufferedBytes check, I'm not sure what's best.  Copy flushes buffers
after 65k of input line length, but that's totally different from tuple slot
memory context size, which is what I used for insert.  Maybe COPY should also
use slot size?  Or maybe the threshold to flush needs to be set in miinfo,
rather than a #define, and differ between COPY and INSERT.

-- 
Justin
>From f83313efc8612a5e94f1f13a87d80fb0c393c7b0 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v8 1/4] INSERT SELECT to use BulkInsertState and multi_insert

Renames structures;
Move MultipleInsert functions from copyfrom.c to (tentatively) nodeModifyTable.h;
Move into MultiInsertInfo: transition_capture and cur_lineno (via cstate->miinfo);

Dynamically switch to multi-insert mode based on the number of insertions.
This is intended to accommodate 1) the original use case of INSERT using a small
ring buffer to avoid leaving behind dirty buffers; and, 2) Automatically using
multi-inserts for batch operations; 3) allow the old behavior of leaving behind
dirty buffers, which might allow INSERT to run more quickly, at the cost of
leaving behind many dirty buffers which other backends may have to write out.

XXX: for (1), the bulk-insert state is used even if not multi-insert, including
for a VALUES.

TODO: use cstate->miinfo.cur_lineno++ instead of mtstate->miinfo->ntuples
---
 src/backend/commands/copyfrom.c  | 394 +--
 src/backend/commands/copyfromparse.c |  10 +-
 src/backend/executor/execMain.c  |   2 +-
 src/backend/executor/execPartition.c |   2 +-
 src/backend/executor/nodeModifyTable.c   | 196 +--
 src/backend/utils/misc/guc.c |  10 +
 src/include/commands/copyfrom_internal.h |   5 +-
 src/include/executor/nodeModifyTable.h   | 370 +
 src/include/nodes/execnodes.h|  16 +-
 src/test/regress/expected/insert.out |  43 +++
 src/test/regress/sql/insert.sql  |  20 ++
 src/tools/pgindent/typedefs.list |   4 +-
 12 files changed, 657 insertions(+), 415 deletions(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 1b14e9a6eb..c4fe75df8e 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -44,54 +44,6 @@
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
 
-/*
- * No more than this many tuples per CopyMultiInsertBuffer
- *
- * Caution: Don't make this too big, as we could end up with this many
- * CopyMultiInsertBuffer items stored in CopyMultiInsertInfo's
- * multiInsertBuffers list.  Increasing this can cause quadratic growth in
- * memory requirements during copies into partitioned tables with a large
- * number of partitions.
- */
-#define MAX_BUFFERED_TUPLES		1000
-
-/*
- * Flush buffers if there are >= this many bytes, as counted by the input
- * size, of tuples stored.
- */
-#define MAX_BUFFERED_BYTES		65535
-
-/* Trim the list of buffers back down to this number after flushing */
-#define MAX_PARTITION_BUFFERS	32
-
-/* Stores multi-insert data related to a single relation in CopyFrom. */
-typedef struct CopyMultiInsertBuffer
-{
-	TupleTableSlot *slots[MAX_BUFFERED_TUPLES]; /* Array to store tuples */
-	ResultRelInfo *resultRelInfo;	/* ResultRelInfo for 'relid' */
-	BulkInsertState bistate;	/* BulkInsertState for this rel */
-	int			nused;			/* number of 'slots' containing tuples */
-	uint64		linenos[MAX_BUFFERED_TUPLES];	/* Line # of tuple in copy
- * stream */
-} CopyMultiInsertBuffer;
-
-/*
- * Stores one or many CopyMultiInsertBuffers and details about the size and
- * number of tuples which are stored in them.  This allows multiple buffers to
- * exist at once when COPYing into a partitioned table.
- */
-typedef struct CopyMultiInsertInfo
-{
-	List	   *multiInsertBuffers; /* List of tracked CopyMultiInsertBuffers */
-	int			bufferedTuples; /* number of tuples buffered over all buffers */
-	int			bufferedBytes;	/* number of bytes from all buffered tuples */
-	CopyFromState	cstate;			/* Copy state for this CopyMultiInsertInfo */
-	EState	   *estate;			/* Executor state used for COPY */
-	CommandId	mycid;			/* Command Id 

Re: should INSERT SELECT use a BulkInsertState?

2020-12-02 Thread Bharath Rupireddy
On Wed, Dec 2, 2020 at 10:24 PM Justin Pryzby  wrote:
>
> One loose end in this patch is how to check for volatile default expressions.
>
> copyfrom.c is a utility statement, so it can look at the parser's column list:
> COPY table(c1,c2)...
>
> However, for INSERT, in nodeModifyTable.c, it looks like parsing, rewriting,
> and planning are done, at which point I don't know if there's a good way to
> find that.  The default expressions will have been rewritten into the planned
> statement.
>
> We need the list of columns whose default is volatile, excluding columns for
> which a non-default value is specified.
>
> INSERT INTO table (c1,c2) VALUES (1,default);
>
> We'd want the list of any column in the table with a volatile default,
> excluding columns c1, but not-excluding explicit default columns c2 or any
> implicit default columns (c3, etc).
>
> Any idea?
>

I think we should be doing all the necessary checks in the planner and
have a flag in the planned stmt to indicate whether to go with multi
insert or not. For the required checks, we can have a look at how the
existing COPY decides to go with either CIM_MULTI or CIM_SINGLE.

Now, for the question of how we can know whether a given relation
has default expressions or volatile expressions, it is worth looking
at build_column_default() and contain_volatile_functions().

I prefer to have the multi-insert deciding code for COPY and INSERT
SELECT in a single common function which can be reused. Though COPY
has some things, like default expressions, ready in a way INSERT
SELECT does not, we can try to keep them under a common function, and
say for COPY we can skip some code and for INSERT SELECT we can do
extra work to find default expressions.

Although unrelated, for parallel inserts in INSERT SELECT[1], in the
planner there are some checks to see if the parallelism is safe or
not. Check max_parallel_hazard_for_modify() in
v8-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT.patch from
[1]. On the similar lines, we can also have multi insert deciding
code.

[1] 
https://www.postgresql.org/message-id/CAJcOf-fy3P%2BkDArvmbEtdQTxFMf7Rn2%3DV-sqCnMmKO3QKBsgPA%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: should INSERT SELECT use a BulkInsertState?

2020-12-02 Thread Justin Pryzby
One loose end in this patch is how to check for volatile default expressions.

copyfrom.c is a utility statement, so it can look at the parser's column list:
COPY table(c1,c2)...

However, for INSERT, in nodeModifyTable.c, it looks like parsing, rewriting,
and planning are done, at which point I don't know if there's a good way to
find that.  The default expressions will have been rewritten into the planned
statement.

We need the list of columns whose default is volatile, excluding columns for
which a non-default value is specified.

INSERT INTO table (c1,c2) VALUES (1,default);

We'd want the list of any column in the table with a volatile default,
excluding columns c1, but not-excluding explicit default columns c2 or any
implicit default columns (c3, etc).

Any idea?

-- 
Justin




Re: should INSERT SELECT use a BulkInsertState?

2020-11-29 Thread Justin Pryzby
On Mon, Nov 23, 2020 at 08:00:20PM -0600, Justin Pryzby wrote:
> On Mon, Nov 02, 2020 at 12:45:51PM -0600, Justin Pryzby wrote:
> > On Mon, Nov 02, 2020 at 07:53:45AM +0100, Luc Vlaming wrote:
> > > On 30.10.20 05:51, Justin Pryzby wrote:
> > > > On Thu, Oct 22, 2020 at 01:29:53PM +0100, Simon Riggs wrote:
> > > > > On Fri, 16 Oct 2020 at 22:05, Justin Pryzby  
> > > > > wrote:
> > > > > 
> > > > > > > > I made this conditional on BEGIN BULK/SET bulk, so I'll solicit 
> > > > > > > > comments on that.
> > > > > 
> > > > > I think it would be better if this was self-tuning. So that we don't
> > > > > allocate a bulkinsert state until we've done say 100 (?) rows
> > > > > inserted.
> > > > 
> > > > I made it an optional, non-default behavior in response to the 
> > > > legitimate
> > > > concern for performance regression for the cases where a loader needs 
> > > > to be as
> > > > fast as possible - as compared with our case, where we want instead to 
> > > > optimize
> > > > for our reports by making the loaders responsible for their own writes, 
> > > > rather
> > > > than leaving behind many dirty pages, and clobbering the cache, too.
> > > > 
> > > > Also, INSERT SELECT doesn't immediately help us (telsasoft), since we 
> > > > use
> > > > INSERT .. VALUES () .. ON CONFLICT.  This would handle that case, which 
> > > > is
> > > > great, even though that wasn't a design goal.  It could also be an 
> > > > integer GUC
> > > > to allow configuring the size of the ring buffer.
> > > > 
> > > > > You should also use table_multi_insert() since that will give further
> > > > > performance gains by reducing block access overheads. Switching from
> > > > > single row to multi-row should also only happen once we've loaded a
> > > > few rows, so we don't introduce overheads for smaller SQL statements.
> > > > 
> > > > Good idea...multi_insert (which reduces the overhead of individual 
> > > > inserts) is
> > > > mostly independent from BulkInsert state (which uses a ring-buffer to 
> > > > avoid
> > > > dirtying the cache).  I made this 0002.
> > > > 
> > > > This makes INSERT SELECT several times faster, and not clobber the 
> > > > cache too.
> 
>  - Rebased on Heikki's copy.c split;
>  - Rename structures without "Copy" prefix;
>  - Move MultiInsert* from copyfrom.c to (tentatively) nodeModifyTable.h;
>  - Move cur_lineno and transition_capture into MultiInsertInfo;
> 
> This switches to multi insert after a configurable number of tuples.
> If set to -1, that provides the historic behavior that bulk inserts
> can leave behind many dirty buffers.  Perhaps that should be the default.
> 
> I guess this shouldn't be in copy.h or in commands/* at all.
> It'll be included by both: commands/copyfrom_internal.h and
> executor/nodeModifyTable.h.  Maybe it should go in util or lib...
> I don't know how to do it without including executor.h, which seems
> to be undesirable.

Attached resolves an issue with the FDW contrib module by including the
MultiInsertInfo structure rather than a pointer, and makes the logic related
to partitions/triggers more closely match copyfrom.c.

I had made this a conditional based on the concern that bulk insert state would
cause regression.  But then it occurred to me that COPY uses a bulk insert
unconditionally.  Should COPY be conditional, too?  Or maybe that's OK, since
COPY is assumed to be a bulk operation.

-- 
Justin
>From 886709926523f480255b4897d5bb08984be26a29 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v7 1/3] Allow INSERT SELECT to use a BulkInsertState

---
 src/backend/executor/nodeModifyTable.c | 22 --
 src/backend/parser/gram.y  |  7 ++-
 src/backend/tcop/utility.c |  4 
 src/backend/utils/misc/guc.c   | 11 +++
 src/include/executor/nodeModifyTable.h |  2 ++
 src/include/nodes/execnodes.h  |  3 +++
 src/include/parser/kwlist.h|  1 +
 src/test/regress/expected/insert.out   | 23 +++
 src/test/regress/sql/insert.sql| 13 +
 9 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index e0f24283b8..f65ae2c0d6 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -72,6 +72,8 @@ static TupleTableSlot *ExecPrepareTupleRouting(ModifyTableState *mtstate,
 			   ResultRelInfo *targetRelInfo,
 			   TupleTableSlot *slot,
 			   ResultRelInfo **partRelInfo);
+/* guc */
+bool insert_in_bulk = false;
 
 /*
  * Verify that the tuples to be produced by INSERT or UPDATE match the
@@ -594,7 +596,7 @@ ExecInsert(ModifyTableState *mtstate,
 			table_tuple_insert_speculative(resultRelationDesc, slot,
 		   estate->es_output_cid,
 		   0,
-		   NULL,
+		   NULL, /* Bulk insert not supported */
 		   specToken);
 
 			/* insert index 

Re: should INSERT SELECT use a BulkInsertState?

2020-11-23 Thread Justin Pryzby
On Mon, Nov 02, 2020 at 12:45:51PM -0600, Justin Pryzby wrote:
> On Mon, Nov 02, 2020 at 07:53:45AM +0100, Luc Vlaming wrote:
> > On 30.10.20 05:51, Justin Pryzby wrote:
> > > On Thu, Oct 22, 2020 at 01:29:53PM +0100, Simon Riggs wrote:
> > > > On Fri, 16 Oct 2020 at 22:05, Justin Pryzby  
> > > > wrote:
> > > > 
> > > > > > > I made this conditional on BEGIN BULK/SET bulk, so I'll solicit 
> > > > > > > comments on that.
> > > > 
> > > > I think it would be better if this was self-tuning. So that we don't
> > > > allocate a bulkinsert state until we've done say 100 (?) rows
> > > > inserted.
> > > 
> > > I made it an optional, non-default behavior in response to the legitimate
> > > concern for performance regression for the cases where a loader needs to 
> > > be as
> > > fast as possible - as compared with our case, where we want instead to 
> > > optimize
> > > for our reports by making the loaders responsible for their own writes, 
> > > rather
> > > than leaving behind many dirty pages, and clobbering the cache, too.
> > > 
> > > Also, INSERT SELECT doesn't immediately help us (telsasoft), since we use
> > > INSERT .. VALUES () .. ON CONFLICT.  This would handle that case, which is
> > > great, even though that wasn't a design goal.  It could also be an 
> > > integer GUC
> > > to allow configuring the size of the ring buffer.
> > > 
> > > > You should also use table_multi_insert() since that will give further
> > > > performance gains by reducing block access overheads. Switching from
> > > > single row to multi-row should also only happen once we've loaded a
> > > > few rows, so we don't introduce overheads for smaller SQL statements.
> > > 
> > > Good idea...multi_insert (which reduces the overhead of individual 
> > > inserts) is
> > > mostly independent from BulkInsert state (which uses a ring-buffer to 
> > > avoid
> > > dirtying the cache).  I made this 0002.
> > > 
> > > This makes INSERT SELECT several times faster, and not clobber the cache 
> > > too.

 - Rebased on Heikki's copy.c split;
 - Rename structures without "Copy" prefix;
 - Move MultiInsert* from copyfrom.c to (tentatively) nodeModifyTable.h;
 - Move cur_lineno and transition_capture into MultiInsertInfo;

This switches to multi insert after a configurable number of tuples.
If set to -1, that provides the historic behavior that bulk inserts
can leave behind many dirty buffers.  Perhaps that should be the default.

I guess this shouldn't be in copy.h or in commands/* at all.
It'll be included by both: commands/copyfrom_internal.h and
executor/nodeModifyTable.h.  Maybe it should go in util or lib...
I don't know how to do it without including executor.h, which seems
to be undesirable.

-- 
Justin
>From d67c82e870c640e6f4ba25b3da5acf54df7165d2 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v6 1/3] Allow INSERT SELECT to use a BulkInsertState

---
 src/backend/executor/nodeModifyTable.c | 22 --
 src/backend/parser/gram.y  |  7 ++-
 src/backend/tcop/utility.c |  4 
 src/backend/utils/misc/guc.c   | 11 +++
 src/include/executor/nodeModifyTable.h |  2 ++
 src/include/nodes/execnodes.h  |  3 +++
 src/include/parser/kwlist.h|  1 +
 src/test/regress/expected/insert.out   | 23 +++
 src/test/regress/sql/insert.sql| 13 +
 9 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 29e07b7228..26ff964105 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -72,6 +72,8 @@ static TupleTableSlot *ExecPrepareTupleRouting(ModifyTableState *mtstate,
 			   ResultRelInfo *targetRelInfo,
 			   TupleTableSlot *slot,
 			   ResultRelInfo **partRelInfo);
+/* guc */
+bool insert_in_bulk = false;
 
 /*
  * Verify that the tuples to be produced by INSERT or UPDATE match the
@@ -594,7 +596,7 @@ ExecInsert(ModifyTableState *mtstate,
 			table_tuple_insert_speculative(resultRelationDesc, slot,
 		   estate->es_output_cid,
 		   0,
-		   NULL,
+		   NULL, /* Bulk insert not supported */
 		   specToken);
 
 			/* insert index entries for tuple */
@@ -631,10 +633,17 @@ ExecInsert(ModifyTableState *mtstate,
 		}
 		else
 		{
+			if (proute && mtstate->prevResultRelInfo != resultRelInfo)
+			{
+if (mtstate->bistate)
+	ReleaseBulkInsertStatePin(mtstate->bistate);
+mtstate->prevResultRelInfo = resultRelInfo;
+			}
+
 			/* insert the tuple normally */
 			table_tuple_insert(resultRelationDesc, slot,
 			   estate->es_output_cid,
-			   0, NULL);
+			   0, mtstate->bistate);
 
 			/* insert index entries for tuple */
 			if (resultRelInfo->ri_NumIndices > 0)
@@ -2232,6 +2241,9 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	

Re: should INSERT SELECT use a BulkInsertState?

2020-11-02 Thread Justin Pryzby
On Mon, Nov 02, 2020 at 07:53:45AM +0100, Luc Vlaming wrote:
> On 30.10.20 05:51, Justin Pryzby wrote:
> > On Thu, Oct 22, 2020 at 01:29:53PM +0100, Simon Riggs wrote:
> > > On Fri, 16 Oct 2020 at 22:05, Justin Pryzby  wrote:
> > > 
> > > > > > I made this conditional on BEGIN BULK/SET bulk, so I'll solicit 
> > > > > > comments on that.
> > > 
> > > I think it would be better if this was self-tuning. So that we don't
> > > allocate a bulkinsert state until we've done say 100 (?) rows
> > > inserted.
> > 
> > I made it an optional, non-default behavior in response to the legitimate
> > concern for performance regression for the cases where a loader needs to be 
> > as
> > fast as possible - as compared with our case, where we want instead to 
> > optimize
> > for our reports by making the loaders responsible for their own writes, 
> > rather
> > than leaving behind many dirty pages, and clobbering the cache, too.
> > 
> > Also, INSERT SELECT doesn't immediately help us (telsasoft), since we use
> > INSERT .. VALUES () .. ON CONFLICT.  This would handle that case, which is
> > great, even though that wasn't a design goal.  It could also be an integer 
> > GUC
> > to allow configuring the size of the ring buffer.
> > 
> > > You should also use table_multi_insert() since that will give further
> > > performance gains by reducing block access overheads. Switching from
> > > single row to multi-row should also only happen once we've loaded a
> > > few rows, so we don't introduce overheads for smaller SQL statements.
> > 
> > Good idea...multi_insert (which reduces the overhead of individual inserts) 
> > is
> > mostly independent from BulkInsert state (which uses a ring-buffer to avoid
> > dirtying the cache).  I made this 0002.
> > 
> > This makes INSERT SELECT several times faster, and not clobber the cache 
> > too.
> > 
> > Time: 4700.606 ms (00:04.701)
> > 123 |  1
> >  37 |  2
> >  20 |  3
> >  11 |  4
> >4537 |  5
> >   11656 |
> > 
> > Time: 1125.302 ms (00:01.125)
> >2171 |  1
> >  37 |  2
> >  20 |  3
> >  11 |  4
> > 111 |  5
> >   14034 |
> > 
> > When enabled, this passes nearly all regression tests, and all but 2 of the
> > changes are easily understood.  The 2nd patch still needs work.
> > 
> 
> Hi,
> 
> Came across this thread because I'm working on an improvement for the
> relation extension to improve the speed of the bulkinsert itself in (highly)
> parallel cases and would like to make sure that our approaches work nicely

Thanks for looking.

Since this is a GUC, I thought it would accommodate users optimizing for either
inserts vs selects, as well as users who don't want to change their application
(they can "ALTER SYSTEM SET bulk_insert=on").  I'm not thrilled about making a
new guc, but that seems to be required for "begin bulk", which was the obvious
way to make it an 'opt-in' feature.

I guess it'd be easy to add a counter to ModifyTableState, although it makes
the code a bit less clean and conceivably performs "discontinuously" - inserting
100 rows/sec for the first 999 rows and then 200 rows/sec afterwards.

If you "mix" small inserts and big inserts, it would be a bad strategy to
optimize for the small ones.  Anyway, in a quick test, small inserts were not
slower.
https://www.postgresql.org/message-id/20200713015700.GA23581%40telsasoft.com
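The counter idea discussed here can be sketched as a standalone simulation.  The
types below are invented stand-ins, not PostgreSQL's real executor structures,
and the names `bulk_insert_ntuples`, `maybe_bulk_state`, and
`GetBulkInsertStateStub` are hypothetical; the threshold value is only Simon's
suggested order of magnitude:

```c
#include <assert.h>
#include <stddef.h>

/* Invented stand-ins for executor state; not PostgreSQL's real types. */
typedef struct BulkInsertState { int dummy; } BulkInsertState;

typedef struct ModifyTableState
{
    size_t          ntuples;    /* rows inserted so far by this node */
    BulkInsertState *bistate;   /* NULL until the threshold is crossed */
} ModifyTableState;

/* Threshold after which a ring buffer is assumed to be worth its
 * XLogFlush overhead; 1000 is illustrative, not a benchmarked value. */
static const size_t bulk_insert_ntuples = 1000;

static BulkInsertState dummy_state;

static BulkInsertState *
GetBulkInsertStateStub(void)
{
    return &dummy_state;
}

/* Called once per row, mirroring a hypothetical check in ExecInsert():
 * stay on the default path for small statements, and switch to the bulk
 * path only after enough rows suggest this is really a bulk load. */
static BulkInsertState *
maybe_bulk_state(ModifyTableState *mtstate)
{
    mtstate->ntuples++;
    if (mtstate->bistate == NULL && mtstate->ntuples > bulk_insert_ntuples)
        mtstate->bistate = GetBulkInsertStateStub();
    return mtstate->bistate;    /* NULL means "insert normally" */
}
```

This is the "discontinuous" behavior mentioned above: the first
`bulk_insert_ntuples` rows take the normal path, and everything after them takes
the bulk path.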

Do you have an example that regresses with bulk insert ?

The two patches are separate, and it's possible they should be enabled
differently or independently.

-- 
Justin

> Given what I've seen and tried so far with various benchmarks I would also
> really like to see a different approach here. The "BEGIN BULK" can be
> problematic for example if you mix small amounts of inserts and big amounts
> in the same transaction, or if your application possibly does a bulk insert
> but otherwise mostly OLTP transactions.

> To me the idea from Simon sounds good to only use a bulk insert state after
> inserting e.g. a 1000 rows, and this also seems more applicable to most
> applications compared to requiring a change to any application that wishes
> to have faster ingest.
> 
> Another approach could be to combine this, for example, with a few extra
> requirements to limit the amount of regressions and first learn more how
> this behaves in the field.
> We could, for example (just throwing out some ideas), require that:
> - the relation has a certain size
> - a BufferStrategy of at most a certain size is used
> - there is a certain number of lock waiters on relation extension (like we
> do with bulk extend)
> - we have extended the relation for at least e.g. 4 MB and not used the FSM
> anymore thereby proving that we are doing bulk operations instead of random
> small extensions everywhere into the relation that use the FSM.
> 
> Another thing is that we first try to improve the bulk operation facilities
> in general and then have another 

Re: should INSERT SELECT use a BulkInsertState?

2020-11-01 Thread Luc Vlaming

On 30.10.20 05:51, Justin Pryzby wrote:

On Thu, Oct 22, 2020 at 01:29:53PM +0100, Simon Riggs wrote:

On Fri, 16 Oct 2020 at 22:05, Justin Pryzby  wrote:


I made this conditional on BEGIN BULK/SET bulk, so I'll solicit comments on 
that.


I think it would be better if this was self-tuning. So that we don't
allocate a bulkinsert state until we've done say 100 (?) rows
inserted.


I made it an optional, non-default behavior in response to the legitimate
concern for performance regression for the cases where a loader needs to be as
fast as possible - as compared with our case, where we want instead to optimize
for our reports by making the loaders responsible for their own writes, rather
than leaving behind many dirty pages, and clobbering the cache, too.

Also, INSERT SELECT doesn't immediately help us (telsasoft), since we use
INSERT .. VALUES () .. ON CONFLICT.  This would handle that case, which is
great, even though that wasn't a design goal.  It could also be an integer GUC
to allow configuring the size of the ring buffer.


You should also use table_multi_insert() since that will give further
performance gains by reducing block access overheads. Switching from
single row to multi-row should also only happen once we've loaded a
few rows, so we don't introduce overheads for smaller SQL statements.


Good idea...multi_insert (which reduces the overhead of individual inserts) is
mostly independent from BulkInsert state (which uses a ring-buffer to avoid
dirtying the cache).  I made this 0002.

This makes INSERT SELECT several times faster, and not clobber the cache too.

Time: 4700.606 ms (00:04.701)
123 |  1
 37 |  2
 20 |  3
 11 |  4
   4537 |  5
  11656 |

Time: 1125.302 ms (00:01.125)
   2171 |  1
 37 |  2
 20 |  3
 11 |  4
111 |  5
  14034 |

When enabled, this passes nearly all regression tests, and all but 2 of the
changes are easily understood.  The 2nd patch still needs work.



Hi,

Came across this thread because I'm working on an improvement for the 
relation extension to improve the speed of the bulkinsert itself in 
(highly) parallel cases and would like to make sure that our approaches 
work nicely together.


Given what I've seen and tried so far with various benchmarks I would 
also really like to see a different approach here. The "BEGIN BULK" can 
be problematic for example if you mix small amounts of inserts and big 
amounts in the same transaction, or if your application possibly does a 
bulk insert but otherwise mostly OLTP transactions.


To me the idea from Simon sounds good to only use a bulk insert state 
after inserting e.g. a 1000 rows, and this also seems more applicable to 
most applications compared to requiring a change to any application that 
wishes to have faster ingest.


Another approach could be to combine this, for example, with a few extra 
requirements to limit the amount of regressions and first learn more how 
this behaves in the field.

We could, for example (just throwing out some ideas), require that:
- the relation has a certain size
- a BufferStrategy of at most a certain size is used
- there is a certain number of lock waiters on relation extension (like
we do with bulk extend)
- we have extended the relation for at least e.g. 4 MB and not used the 
FSM anymore thereby proving that we are doing bulk operations instead of 
random small extensions everywhere into the relation that use the FSM.


Another option is to first improve the bulk operation
facilities in general and then have another shot at this? Not sure if
there is some benchmark / query that shows where such a 10x slowdown 
would appear but maybe that would be worth a look as well possibly.


Regards,
Luc




Re: should INSERT SELECT use a BulkInsertState?

2020-10-29 Thread Justin Pryzby
On Thu, Oct 22, 2020 at 01:29:53PM +0100, Simon Riggs wrote:
> On Fri, 16 Oct 2020 at 22:05, Justin Pryzby  wrote:
> 
> > > > I made this conditional on BEGIN BULK/SET bulk, so I'll solicit 
> > > > comments on that.
> 
> I think it would be better if this was self-tuning. So that we don't
> allocate a bulkinsert state until we've done say 100 (?) rows
> inserted.

I made it an optional, non-default behavior in response to the legitimate
concern for performance regression for the cases where a loader needs to be as
fast as possible - as compared with our case, where we want instead to optimize
for our reports by making the loaders responsible for their own writes, rather
than leaving behind many dirty pages, and clobbering the cache, too.

Also, INSERT SELECT doesn't immediately help us (telsasoft), since we use
INSERT .. VALUES () .. ON CONFLICT.  This would handle that case, which is
great, even though that wasn't a design goal.  It could also be an integer GUC
to allow configuring the size of the ring buffer.

> You should also use table_multi_insert() since that will give further
> performance gains by reducing block access overheads. Switching from
> single row to multi-row should also only happen once we've loaded a
> few rows, so we don't introduce overheads for smaller SQL statements.

Good idea...multi_insert (which reduces the overhead of individual inserts) is
mostly independent from BulkInsert state (which uses a ring-buffer to avoid
dirtying the cache).  I made this 0002.

This makes INSERT SELECT several times faster, and not clobber the cache too.

Time: 4700.606 ms (00:04.701)
   123 |  1
37 |  2
20 |  3
11 |  4
  4537 |  5
 11656 |   

Time: 1125.302 ms (00:01.125)
  2171 |  1
37 |  2
20 |  3
11 |  4
   111 |  5
 14034 |   

When enabled, this passes nearly all regression tests, and all but 2 of the
changes are easily understood.  The 2nd patch still needs work.

-- 
Justin
>From 16057608bd58f54a5e365433ded18757aca8ec48 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v5 1/2] Allow INSERT SELECT to use a BulkInsertState

---
 src/backend/executor/nodeModifyTable.c | 22 --
 src/backend/parser/gram.y  |  7 ++-
 src/backend/tcop/utility.c |  4 
 src/backend/utils/misc/guc.c   | 11 +++
 src/include/executor/nodeModifyTable.h |  2 ++
 src/include/nodes/execnodes.h  |  3 +++
 src/include/parser/kwlist.h|  1 +
 src/test/regress/expected/insert.out   | 23 +++
 src/test/regress/sql/insert.sql| 13 +
 9 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 29e07b7228..26ff964105 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -72,6 +72,8 @@ static TupleTableSlot *ExecPrepareTupleRouting(ModifyTableState *mtstate,
 			   ResultRelInfo *targetRelInfo,
 			   TupleTableSlot *slot,
 			   ResultRelInfo **partRelInfo);
+/* guc */
+bool insert_in_bulk = false;
 
 /*
  * Verify that the tuples to be produced by INSERT or UPDATE match the
@@ -594,7 +596,7 @@ ExecInsert(ModifyTableState *mtstate,
 			table_tuple_insert_speculative(resultRelationDesc, slot,
 		   estate->es_output_cid,
 		   0,
-		   NULL,
+		   NULL, /* Bulk insert not supported */
 		   specToken);
 
 			/* insert index entries for tuple */
@@ -631,10 +633,17 @@ ExecInsert(ModifyTableState *mtstate,
 		}
 		else
 		{
+			if (proute && mtstate->prevResultRelInfo != resultRelInfo)
+			{
+if (mtstate->bistate)
+	ReleaseBulkInsertStatePin(mtstate->bistate);
+mtstate->prevResultRelInfo = resultRelInfo;
+			}
+
 			/* insert the tuple normally */
 			table_tuple_insert(resultRelationDesc, slot,
 			   estate->es_output_cid,
-			   0, NULL);
+			   0, mtstate->bistate);
 
 			/* insert index entries for tuple */
 			if (resultRelInfo->ri_NumIndices > 0)
@@ -2232,6 +2241,9 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
+	mtstate->bistate = NULL;
+	if (operation == CMD_INSERT && insert_in_bulk)
+		mtstate->bistate = GetBulkInsertState();
 
 	/* set up epqstate with dummy subplan data for the moment */
 	EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
@@ -2698,6 +2710,12 @@ ExecEndModifyTable(ModifyTableState *node)
 		   resultRelInfo);
 	}
 
+	if (node->bistate)
+	{
+		FreeBulkInsertState(node->bistate);
+		table_finish_bulk_insert(node->rootResultRelInfo->ri_RelationDesc, 0);
+	}
+
 	/*
 	 * Close all the partitioned tables, leaf partitions, and their indices
 	 * and 

Re: should INSERT SELECT use a BulkInsertState?

2020-10-22 Thread Simon Riggs
On Fri, 16 Oct 2020 at 22:05, Justin Pryzby  wrote:

> > > I made this conditional on BEGIN BULK/SET bulk, so I'll solicit comments 
> > > on that.

I think it would be better if this was self-tuning. So that we don't
allocate a bulkinsert state until we've done say 100 (?) rows
inserted.

If there are other conditions under which this is non-optimal
(Andres?), we can also autodetect that and avoid them.

You should also use table_multi_insert() since that will give further
performance gains by reducing block access overheads. Switching from
single row to multi-row should also only happen once we've loaded a
few rows, so we don't introduce overheads for smaller SQL statements.
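The batching idea behind table_multi_insert() can be sketched with a toy
accumulator.  The struct, the helper names, and the batch size are invented for
illustration; the real code buffers TupleTableSlots and also flushes on memory
limits, as copyfrom.c does:

```c
#include <assert.h>

#define BATCH_SIZE 64           /* illustrative buffer size, not the real one */

/* Toy model of multi-insert batching: accumulate tuples and flush them
 * in one call, so per-row block-access overhead is paid once per batch.
 * flush_count records how many multi-insert-style calls happened. */
typedef struct MultiInsertBuffer
{
    int nbuffered;              /* tuples currently buffered */
    int flush_count;            /* batches written out so far */
    int tuples[BATCH_SIZE];     /* stand-in for buffered tuple slots */
} MultiInsertBuffer;

static void
flush_batch(MultiInsertBuffer *buf)
{
    if (buf->nbuffered > 0)
    {
        buf->flush_count++;     /* one table_multi_insert()-style call */
        buf->nbuffered = 0;
    }
}

static void
buffer_tuple(MultiInsertBuffer *buf, int tuple)
{
    buf->tuples[buf->nbuffered++] = tuple;
    if (buf->nbuffered == BATCH_SIZE)
        flush_batch(buf);       /* full batch: write it out */
}
```

A final flush_batch() at end of statement handles the partial last batch, which
is why small statements pay almost nothing extra: they get exactly one flush.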

-- 
Simon Riggs    http://www.EnterpriseDB.com/




Re: should INSERT SELECT use a BulkInsertState?

2020-10-22 Thread Simon Riggs
On Thu, 4 Jun 2020 at 18:31, Andres Freund  wrote:

> On 2020-05-08 02:25:45 -0500, Justin Pryzby wrote:
> > Seems to me it should, at least conditionally.  At least if there's a 
> > function
> > scan or a relation or ..
>
> Well, the problem is that this can cause very very significant
> regressions. As in 10x slower or more. The ringbuffer can cause constant
> XLogFlush() calls (due to the lsn interlock), and the eviction from
> shared_buffers (regardless of actual available) will mean future vacuums
> etc will be much slower.  I think this is likely to cause pretty
> widespread regressions on upgrades.
>
> Now, it sucks that we have this problem in the general facility that's
> supposed to be used for this kind of bulk operation. But I don't really
> see it realistic as expanding use of bulk insert strategies unless we
> have some more fundamental fixes.

Are you saying that *anything* that uses the BulkInsertState is
generally broken? We use it for VACUUM and COPY writes, so you are
saying they are broken??

When we put that in, the use of the ringbuffer for writes required a
much larger number of blocks to smooth out the extra XLogFlush()
calls, but overall it was a clear win in those earlier tests. Perhaps
the ring buffer needs to be increased, or made configurable. The
eviction behavior was/is deliberate, to avoid large data loads
spoiling cache - perhaps that could also be configurable for the case
where data fits in shared buffers.
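The ring behavior being debated here can be illustrated with a toy model.  The
slot management and the size are invented for illustration and differ from the
real BufferAccessStrategy code in freelist.c in many details (notably the WAL
interlock, which is the XLogFlush cost Andres describes):

```c
#include <assert.h>

/* Toy model of a buffer-access-strategy ring: a bulk writer cycles
 * through a small fixed set of buffer slots instead of taking a fresh
 * shared buffer for every page.  Slot numbers stand in for buffer IDs. */
#define RING_SIZE 16            /* real strategies size this in pages */

typedef struct BufferRing
{
    int current;                /* index of the most recently used slot */
    int slots[RING_SIZE];       /* buffer IDs owned by this strategy */
} BufferRing;

static void
ring_init(BufferRing *ring)
{
    ring->current = -1;
    for (int i = 0; i < RING_SIZE; i++)
        ring->slots[i] = i;     /* pretend these were allocated once */
}

/* Return the buffer to dirty for the next page: always the next slot in
 * the ring, so a load of any size dirties at most RING_SIZE buffers and
 * never clobbers the rest of the cache.  The real code must flush WAL
 * before reusing a still-dirty slot, which is where the cost comes in. */
static int
ring_next_buffer(BufferRing *ring)
{
    ring->current = (ring->current + 1) % RING_SIZE;
    return ring->slots[ring->current];
}
```

The trade-off in the thread is visible directly in the model: a bigger ring
amortizes the WAL-flush cost over more pages, while a smaller ring evicts more
aggressively and spoils less cache.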

Anyway, if we can discuss what you see as broken, we can fix that and
then extend the usage to other cases, such as INSERT SELECT.

-- 
Simon Riggs    http://www.EnterpriseDB.com/




Re: should INSERT SELECT use a BulkInsertState?

2020-10-16 Thread Justin Pryzby
On Sat, Sep 19, 2020 at 08:32:15AM -0500, Justin Pryzby wrote:
> On Sun, Jul 12, 2020 at 08:57:00PM -0500, Justin Pryzby wrote:
> > On Thu, Jun 04, 2020 at 10:30:47AM -0700, Andres Freund wrote:
> > > On 2020-05-08 02:25:45 -0500, Justin Pryzby wrote:
> > > > Seems to me it should, at least conditionally.  At least if there's a 
> > > > function
> > > > scan or a relation or ..
> > > 
> > > Well, the problem is that this can cause very very significant
> > > regressions. As in 10x slower or more. The ringbuffer can cause constant
> > > XLogFlush() calls (due to the lsn interlock), and the eviction from
> > > shared_buffers (regardless of actual available) will mean future vacuums
> > > etc will be much slower.  I think this is likely to cause pretty
> > > widespread regressions on upgrades.
> > > 
> > > Now, it sucks that we have this problem in the general facility that's
> > > supposed to be used for this kind of bulk operation. But I don't really
> > > see it realistic as expanding use of bulk insert strategies unless we
> > > have some more fundamental fixes.
> > 
> > I made this conditional on BEGIN BULK/SET bulk, so I'll solicit comments on 
> > that.
> 
> @cfbot: rebased

again

-- 
Justin
>From 7f856d0597a34a98be848c337612f1671497f52f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v4] Allow INSERT SELECT to use a BulkInsertState

---
 src/backend/executor/nodeModifyTable.c | 22 --
 src/backend/parser/gram.y  |  7 ++-
 src/backend/tcop/utility.c |  4 
 src/backend/utils/misc/guc.c   | 11 +++
 src/include/executor/nodeModifyTable.h |  2 ++
 src/include/nodes/execnodes.h  |  2 ++
 src/include/parser/kwlist.h|  1 +
 7 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 0c055ed408..d19d6d1a85 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -76,6 +76,8 @@ static ResultRelInfo *getTargetResultRelInfo(ModifyTableState *node);
 static void ExecSetupChildParentMapForSubplan(ModifyTableState *mtstate);
 static TupleConversionMap *tupconv_map_for_subplan(ModifyTableState *node,
    int whichplan);
+/* guc */
+bool insert_in_bulk = false;
 
 /*
  * Verify that the tuples to be produced by INSERT or UPDATE match the
@@ -598,7 +600,7 @@ ExecInsert(ModifyTableState *mtstate,
 			table_tuple_insert_speculative(resultRelationDesc, slot,
 		   estate->es_output_cid,
 		   0,
-		   NULL,
+		   NULL, /* Bulk insert not supported */
 		   specToken);
 
 			/* insert index entries for tuple */
@@ -638,7 +640,7 @@ ExecInsert(ModifyTableState *mtstate,
 			/* insert the tuple normally */
 			table_tuple_insert(resultRelationDesc, slot,
 			   estate->es_output_cid,
-			   0, NULL);
+			   0, mtstate->bistate);
 
 			/* insert index entries for tuple */
 			if (resultRelInfo->ri_NumIndices > 0)
@@ -2365,6 +2367,16 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
+	mtstate->bistate = NULL;
+	if (operation == CMD_INSERT &&
+			node->onConflictAction != ONCONFLICT_UPDATE &&
+			node->rootRelation == 0)
+	{
+		Assert(nplans == 1);
+
+		if (insert_in_bulk)
+			mtstate->bistate = GetBulkInsertState();
+	}
 
 	/* set up epqstate with dummy subplan data for the moment */
 	EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
@@ -2805,6 +2817,12 @@ ExecEndModifyTable(ModifyTableState *node)
 		   resultRelInfo);
 	}
 
+	if (node->bistate)
+	{
+		FreeBulkInsertState(node->bistate);
+		table_finish_bulk_insert((getTargetResultRelInfo(node))->ri_RelationDesc, 0);
+	}
+
 	/*
 	 * Close all the partitioned tables, leaf partitions, and their indices
 	 * and release the slot used for tuple routing, if set.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 480d168346..b87e31e36a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -631,7 +631,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	ASSERTION ASSIGNMENT ASYMMETRIC AT ATTACH ATTRIBUTE AUTHORIZATION
 
 	BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BINARY BIT
-	BOOLEAN_P BOTH BY
+	BOOLEAN_P BOTH BY BULK
 
 	CACHE CALL CALLED CASCADE CASCADED CASE CAST CATALOG_P CHAIN CHAR_P
 	CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
@@ -9873,6 +9873,9 @@ transaction_mode_item:
 			| NOT DEFERRABLE
 	{ $$ = makeDefElem("transaction_deferrable",
 	   makeIntConst(false, @1), @1); }
+			| BULK
+	{ $$ = makeDefElem("bulk",
+	   makeIntConst(true, @1), @1); }
 		;
 
 /* Syntax with commas is SQL-spec, without commas is Postgres historical */
@@ -15079,6 +15082,7 @@ unreserved_keyword:
 			| BACKWARD
 

Re: should INSERT SELECT use a BulkInsertState?

2020-09-19 Thread Justin Pryzby
On Sun, Jul 12, 2020 at 08:57:00PM -0500, Justin Pryzby wrote:
> On Thu, Jun 04, 2020 at 10:30:47AM -0700, Andres Freund wrote:
> > On 2020-05-08 02:25:45 -0500, Justin Pryzby wrote:
> > > Seems to me it should, at least conditionally.  At least if there's a 
> > > function
> > > scan or a relation or ..
> > 
> > Well, the problem is that this can cause very very significant
> > regressions. As in 10x slower or more. The ringbuffer can cause constant
> > XLogFlush() calls (due to the lsn interlock), and the eviction from
> > shared_buffers (regardless of actual available) will mean future vacuums
> > etc will be much slower.  I think this is likely to cause pretty
> > widespread regressions on upgrades.
> > 
> > Now, it sucks that we have this problem in the general facility that's
> > supposed to be used for this kind of bulk operation. But I don't really
> > see it realistic as expanding use of bulk insert strategies unless we
> > have some more fundamental fixes.
> 
> I made this conditional on BEGIN BULK/SET bulk, so I'll solicit comments on 
> that.

@cfbot: rebased
>From acfc6ef7b84a6753a49b7f4c9d5b77a0abbfd37c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v3] Allow INSERT SELECT to use a BulkInsertState

---
 src/backend/executor/nodeModifyTable.c | 23 +--
 src/backend/parser/gram.y  |  7 ++-
 src/backend/tcop/utility.c |  4 
 src/backend/utils/misc/guc.c   | 12 +++-
 src/include/executor/nodeModifyTable.h |  2 ++
 src/include/nodes/execnodes.h  |  2 ++
 src/include/parser/kwlist.h|  1 +
 7 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 9812089161..464ad5e346 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -75,6 +75,8 @@ static ResultRelInfo *getTargetResultRelInfo(ModifyTableState *node);
 static void ExecSetupChildParentMapForSubplan(ModifyTableState *mtstate);
 static TupleConversionMap *tupconv_map_for_subplan(ModifyTableState *node,
    int whichplan);
+/* guc */
+bool insert_in_bulk = false;
 
 /*
  * Verify that the tuples to be produced by INSERT or UPDATE match the
@@ -578,7 +580,7 @@ ExecInsert(ModifyTableState *mtstate,
 			table_tuple_insert_speculative(resultRelationDesc, slot,
 		   estate->es_output_cid,
 		   0,
-		   NULL,
+		   NULL, /* Bulk insert not supported */
 		   specToken);
 
 			/* insert index entries for tuple */
@@ -617,7 +619,7 @@ ExecInsert(ModifyTableState *mtstate,
 			/* insert the tuple normally */
 			table_tuple_insert(resultRelationDesc, slot,
 			   estate->es_output_cid,
-			   0, NULL);
+			   0, mtstate->bistate);
 
 			/* insert index entries for tuple */
 			if (resultRelInfo->ri_NumIndices > 0)
@@ -2332,6 +2334,17 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
+	mtstate->bistate = NULL;
+	if (operation == CMD_INSERT &&
+			node->onConflictAction != ONCONFLICT_UPDATE &&
+			node->rootResultRelIndex < 0)
+	{
+		// Plan *p = linitial(node->plans);
+		Assert(nplans == 1);
+
+		if (insert_in_bulk)
+			mtstate->bistate = GetBulkInsertState();
+	}
 
 	/* set up epqstate with dummy subplan data for the moment */
 	EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
@@ -2776,6 +2789,12 @@ ExecEndModifyTable(ModifyTableState *node)
 		   resultRelInfo);
 	}
 
+	if (node->bistate)
+	{
+		FreeBulkInsertState(node->bistate);
+		table_finish_bulk_insert((getTargetResultRelInfo(node))->ri_RelationDesc, 0);
+	}
+
 	/*
 	 * Close all the partitioned tables, leaf partitions, and their indices
 	 * and release the slot used for tuple routing, if set.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 017940bdcd..0bc2108a2b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -631,7 +631,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	ASSERTION ASSIGNMENT ASYMMETRIC AT ATTACH ATTRIBUTE AUTHORIZATION
 
 	BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BINARY BIT
-	BOOLEAN_P BOTH BY
+	BOOLEAN_P BOTH BY BULK
 
 	CACHE CALL CALLED CASCADE CASCADED CASE CAST CATALOG_P CHAIN CHAR_P
 	CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
@@ -9846,6 +9846,9 @@ transaction_mode_item:
 			| NOT DEFERRABLE
 	{ $$ = makeDefElem("transaction_deferrable",
 	   makeIntConst(false, @1), @1); }
+			| BULK
+	{ $$ = makeDefElem("bulk",
+	   makeIntConst(true, @1), @1); }
 		;
 
 /* Syntax with commas is SQL-spec, without commas is Postgres historical */
@@ -15052,6 +15055,7 @@ unreserved_keyword:
 			| BACKWARD
 			| BEFORE
 			| BEGIN_P
+			| BULK
 			| BY
 			| CACHE
 			| CALL
@@ -15563,6 

Re: should INSERT SELECT use a BulkInsertState?

2020-07-12 Thread Justin Pryzby
On Thu, Jun 04, 2020 at 10:30:47AM -0700, Andres Freund wrote:
> On 2020-05-08 02:25:45 -0500, Justin Pryzby wrote:
> > Seems to me it should, at least conditionally.  At least if there's a 
> > function
> > scan or a relation or ..
> 
> Well, the problem is that this can cause very very significant
> regressions. As in 10x slower or more. The ringbuffer can cause constant
> XLogFlush() calls (due to the lsn interlock), and the eviction from
> shared_buffers (regardless of actual available) will mean future vacuums
> etc will be much slower.  I think this is likely to cause pretty
> widespread regressions on upgrades.
> 
> Now, it sucks that we have this problem in the general facility that's
> supposed to be used for this kind of bulk operation. But I don't really
> see it realistic as expanding use of bulk insert strategies unless we
> have some more fundamental fixes.

I made this conditional on BEGIN BULK/SET bulk, so I'll solicit comments on 
that.

postgres=# \t on \\ \set QUIET \\ VACUUM FULL t; \dt+ t \\ begin ; \timing on \\ INSERT INTO t SELECT * FROM t; rollback; SELECT COUNT(1), usagecount FROM pg_buffercache GROUP BY 2 ORDER BY 2;
| public | t| table | pryzbyj | 35 MB | 
|Time: 9497.318 ms (00:09.497)
|33 |  1
| 3 |  2
|18 |  3
| 5 |  4
|  4655 |  5
| 11670 |   

vs

postgres=# \t on \\ \set QUIET \\ VACUUM FULL t; \dt+ t \\ begin BULK ; \timing on \\ INSERT INTO t SELECT * FROM t; rollback; SELECT COUNT(1), usagecount FROM pg_buffercache GROUP BY 2 ORDER BY 2;
| public | t| table | pryzbyj | 35 MB | 
|Time: 8268.780 ms (00:08.269)
|  2080 |  1
| 3 |  2
|19 |  4
|   234 |  5
| 14048 |   

And:

postgres=# begin ; \x \\ \t \\ SELECT statement_timestamp(); \o /dev/null \\ SELECT 'INSERT INTO t VALUES(0)' FROM generate_series(1,99); \set ECHO errors \\ \set QUIET on \\ \o \\ \gexec \\ SELECT statement_timestamp(); abort; \x \\ SELECT COUNT(1), usagecount FROM pg_buffercache GROUP BY 2 ORDER BY 2;
|statement_timestamp | 2020-07-12 20:31:43.717328-05
|statement_timestamp | 2020-07-12 20:36:16.692469-05
|
|52 |  1
|24 |  2
|17 |  3
| 6 |  4
|  4531 |  5
| 11754 |   

vs

postgres=# begin BULK ; \x \\ \t \\ SELECT statement_timestamp(); \o /dev/null \\ SELECT 'INSERT INTO t VALUES(0)' FROM generate_series(1,99); \set ECHO errors \\ \set QUIET on \\ \o \\ \gexec \\ SELECT statement_timestamp(); abort; \x \\ SELECT COUNT(1), usagecount FROM pg_buffercache GROUP BY 2 ORDER BY 2;
|statement_timestamp | 2020-07-12 20:43:47.089538-05
|statement_timestamp | 2020-07-12 20:48:04.798138-05
|
|  4456 |  1
|22 |  2
| 1 |  3
| 7 |  4
|79 |  5
| 11819 |

-- 
Justin
From 0362537e0f3a8496ac760574931db66c59f7c1ba Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v2] Allow INSERT SELECT to use a BulkInsertState

---
 src/backend/executor/nodeModifyTable.c | 23 +--
 src/backend/parser/gram.y  |  6 +-
 src/backend/tcop/utility.c |  4 
 src/backend/utils/misc/guc.c   | 12 +++-
 src/include/executor/nodeModifyTable.h |  2 ++
 src/include/nodes/execnodes.h  |  2 ++
 src/include/parser/kwlist.h|  1 +
 7 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 20a4c474cc..5ff4a2e901 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -75,6 +75,8 @@ static ResultRelInfo *getTargetResultRelInfo(ModifyTableState *node);
 static void ExecSetupChildParentMapForSubplan(ModifyTableState *mtstate);
 static TupleConversionMap *tupconv_map_for_subplan(ModifyTableState *node,
    int whichplan);
+/* guc */
+bool insert_in_bulk = false;
 
 /*
  * Verify that the tuples to be produced by INSERT or UPDATE match the
@@ -578,7 +580,7 @@ ExecInsert(ModifyTableState *mtstate,
 			table_tuple_insert_speculative(resultRelationDesc, slot,
 		   estate->es_output_cid,
 		   0,
-		   NULL,
+		   NULL, /* Bulk insert not supported */
 		   specToken);
 
 			/* insert index entries for tuple */
@@ -617,7 +619,7 @@ ExecInsert(ModifyTableState *mtstate,
 			/* insert the tuple normally */
 			table_tuple_insert(resultRelationDesc, slot,
 			   estate->es_output_cid,
-			   0, NULL);
+			   0, mtstate->bistate);
 
 			/* insert index entries for tuple */
 			if (resultRelInfo->ri_NumIndices > 0)
@@ -2332,6 +2334,17 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
+	mtstate->bistate = NULL;
+	if (operation == CMD_INSERT &&
+			

Re: should INSERT SELECT use a BulkInsertState?

2020-07-12 Thread Daniel Gustafsson
> On 4 Jun 2020, at 19:30, Andres Freund  wrote:
> On 2020-05-08 02:25:45 -0500, Justin Pryzby wrote:

>> Seems to me it should, at least conditionally.  At least if there's a 
>> function
>> scan or a relation or ..
> 
> Well, the problem is that this can cause very very significant
> regressions. As in 10x slower or more. The ringbuffer can cause constant
> XLogFlush() calls (due to the lsn interlock), and the eviction from
> shared_buffers (regardless of actual available) will mean future vacuums
> etc will be much slower.  I think this is likely to cause pretty
> widespread regressions on upgrades.
> 
> Now, it sucks that we have this problem in the general facility that's
> supposed to be used for this kind of bulk operation. But I don't really
> see it realistic as expanding use of bulk insert strategies unless we
> have some more fundamental fixes.

Based on the above, and the lack of activity in the thread, it sounds like this
patch should be marked Returned with Feedback; but Justin: you set it to
Waiting on Author at the start of the commitfest, are you working on a new
version?

cheers ./daniel



Re: should INSERT SELECT use a BulkInsertState?

2020-06-04 Thread Andres Freund
Hi,

On 2020-05-08 02:25:45 -0500, Justin Pryzby wrote:
> Seems to me it should, at least conditionally.  At least if there's a function
> scan or a relation or ..

Well, the problem is that this can cause very very significant
regressions. As in 10x slower or more. The ringbuffer can cause constant
XLogFlush() calls (due to the lsn interlock), and the eviction from
shared_buffers (regardless of actual available) will mean future vacuums
etc will be much slower.  I think this is likely to cause pretty
widespread regressions on upgrades.

Now, it sucks that we have this problem in the general facility that's
supposed to be used for this kind of bulk operation. But I don't really
see it realistic as expanding use of bulk insert strategies unless we
have some more fundamental fixes.

Regards,

Andres




Re: should INSERT SELECT use a BulkInsertState?

2020-05-11 Thread Michael Paquier
On Fri, May 08, 2020 at 02:25:45AM -0500, Justin Pryzby wrote:
> Seems to me it should, at least conditionally.  At least if there's a function
> scan or a relation or ..
> 
> I mentioned a bit about our use-case here:
> https://www.postgresql.org/message-id/20200219173742.GA30939%40telsasoft.com
> => I'd prefer our loaders to write their own data rather than dirtying large
> fractions of buffer cache and leaving it around for other backends to clean 
> up.

Does it matter in terms of performance and for which cases does it
actually matter?

> diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
> index 4fee043bb2..daf365f181 100644
> --- a/src/include/nodes/execnodes.h
> +++ b/src/include/nodes/execnodes.h
> @@ -14,6 +14,7 @@
>  #ifndef EXECNODES_H
>  #define EXECNODES_H
>  
> +#include "access/heapam.h"
>  #include "access/tupconvert.h"
>  #include "executor/instrument.h"
>  #include "fmgr.h"
> @@ -1177,6 +1178,7 @@ typedef struct ModifyTableState
>   List  **mt_arowmarks;   /* per-subplan ExecAuxRowMark lists */
>   EPQState    mt_epqstate;    /* for evaluating EvalPlanQual rechecks */
>   bool        fireBSTriggers; /* do we need to fire stmt triggers? */
> + BulkInsertState bistate;    /* State for bulk insert like INSERT SELECT */

I think that this needs more thoughts.  You are introducing a
dependency between some generic execution-related nodes and heap, a
table AM.
--
Michael




Re: should INSERT SELECT use a BulkInsertState?

2020-05-10 Thread Justin Pryzby
On Fri, May 08, 2020 at 02:25:45AM -0500, Justin Pryzby wrote:
> Seems to me it should, at least conditionally.  At least if there's a function
> scan or a relation or ..
> 
> I mentioned a bit about our use-case here:
> https://www.postgresql.org/message-id/20200219173742.GA30939%40telsasoft.com
> => I'd prefer our loaders to write their own data rather than dirtying large
> fractions of buffer cache and leaving it around for other backends to clean 
> up.

Nobody suggested otherwise so I added here and cleaned up to pass tests.
https://commitfest.postgresql.org/28/2553/

-- 
Justin
From ba5cf05960a097cf82c10a29af81f4f66a9274a6 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v1] Make INSERT SELECT use a BulkInsertState

---
 src/backend/executor/nodeModifyTable.c | 21 +++--
 src/include/nodes/execnodes.h  |  2 ++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 20a4c474cc..aa85245f39 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -578,7 +578,7 @@ ExecInsert(ModifyTableState *mtstate,
 			table_tuple_insert_speculative(resultRelationDesc, slot,
 		   estate->es_output_cid,
 		   0,
-		   NULL,
+		   NULL, /* Bulk insert not supported */
 		   specToken);
 
 			/* insert index entries for tuple */
@@ -617,7 +617,7 @@ ExecInsert(ModifyTableState *mtstate,
 			/* insert the tuple normally */
 			table_tuple_insert(resultRelationDesc, slot,
 			   estate->es_output_cid,
-			   0, NULL);
+			   0, mtstate->bistate);
 
 			/* insert index entries for tuple */
 			if (resultRelInfo->ri_NumIndices > 0)
@@ -2332,6 +2332,17 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
+	mtstate->bistate = NULL;
+	if (operation == CMD_INSERT &&
+			node->onConflictAction != ONCONFLICT_UPDATE &&
+			node->rootResultRelIndex < 0)
+	{
+		Plan *p = linitial(node->plans);
+		Assert(nplans == 1);
+
+		if (!IsA(p, Result) && !IsA(p, ProjectSet) && !IsA(p, ValuesScan))
+			mtstate->bistate = GetBulkInsertState();
+	}
 
 	/* set up epqstate with dummy subplan data for the moment */
 	EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
@@ -2776,6 +2787,12 @@ ExecEndModifyTable(ModifyTableState *node)
 		   resultRelInfo);
 	}
 
+	if (node->bistate)
+	{
+		FreeBulkInsertState(node->bistate);
+		table_finish_bulk_insert((getTargetResultRelInfo(node))->ri_RelationDesc, 0);
+	}
+
 	/*
 	 * Close all the partitioned tables, leaf partitions, and their indices
 	 * and release the slot used for tuple routing, if set.
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 4fee043bb2..daf365f181 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -14,6 +14,7 @@
 #ifndef EXECNODES_H
 #define EXECNODES_H
 
+#include "access/heapam.h"
 #include "access/tupconvert.h"
 #include "executor/instrument.h"
 #include "fmgr.h"
@@ -1177,6 +1178,7 @@ typedef struct ModifyTableState
 	List	  **mt_arowmarks;	/* per-subplan ExecAuxRowMark lists */
 	EPQState	mt_epqstate;	/* for evaluating EvalPlanQual rechecks */
 	bool		fireBSTriggers; /* do we need to fire stmt triggers? */
+	BulkInsertState	bistate;	/* State for bulk insert like INSERT SELECT */
 
 	/*
 	 * Slot for storing tuples in the root partitioned table's rowtype during
-- 
2.17.0



should INSERT SELECT use a BulkInsertState?

2020-05-08 Thread Justin Pryzby
Seems to me it should, at least conditionally.  At least if there's a function
scan or a relation or ..

I mentioned a bit about our use-case here:
https://www.postgresql.org/message-id/20200219173742.GA30939%40telsasoft.com
=> I'd prefer our loaders to write their own data rather than dirtying large
fractions of buffer cache and leaving it around for other backends to clean up.

commit 7f9e061363e58f30eee08a0e46f637bf137b
Author: Justin Pryzby 
Date:   Fri May 8 02:17:32 2020 -0500

Make INSERT SELECT use a BulkInsertState

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 20a4c474cc..6da4325225 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -578,7 +578,7 @@ ExecInsert(ModifyTableState *mtstate,
 			table_tuple_insert_speculative(resultRelationDesc, slot,
 										   estate->es_output_cid,
 										   0,
-										   NULL,
+										   mtstate->bistate,
 										   specToken);
 
 			/* insert index entries for tuple */
@@ -617,7 +617,7 @@ ExecInsert(ModifyTableState *mtstate,
 			/* insert the tuple normally */
 			table_tuple_insert(resultRelationDesc, slot,
 							   estate->es_output_cid,
-							   0, NULL);
+							   0, mtstate->bistate);
 
 			/* insert index entries for tuple */
 			if (resultRelInfo->ri_NumIndices > 0)
@@ -2332,6 +2332,14 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
+	mtstate->bistate = NULL;
+	if (operation == CMD_INSERT)
+	{
+		Plan *p = linitial(node->plans);
+		Assert(nplans == 1);
+		if (!IsA(p, Result) && !IsA(p, ValuesScan))
+			mtstate->bistate = GetBulkInsertState();
+	}
 
 	/* set up epqstate with dummy subplan data for the moment */
 	EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
@@ -2809,6 +2817,9 @@ ExecEndModifyTable(ModifyTableState *node)
 	 */
 	for (i = 0; i < node->mt_nplans; i++)
 		ExecEndNode(node->mt_plans[i]);
+
+	if (node->bistate)
+		FreeBulkInsertState(node->bistate);
 }
 
 void
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 4fee043bb2..daf365f181 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -14,6 +14,7 @@
 #ifndef EXECNODES_H
 #define EXECNODES_H
 
+#include "access/heapam.h"
 #include "access/tupconvert.h"
 #include "executor/instrument.h"
 #include "fmgr.h"
@@ -1177,6 +1178,7 @@ typedef struct ModifyTableState
 	List	  **mt_arowmarks;	/* per-subplan ExecAuxRowMark lists */
 	EPQState	mt_epqstate;	/* for evaluating EvalPlanQual rechecks */
 	bool		fireBSTriggers; /* do we need to fire stmt triggers? */
+	BulkInsertState	bistate;	/* State for bulk insert like INSERT SELECT */
 
 	/*
 	 * Slot for storing tuples in the root partitioned table's rowtype during