On Mon, May 11, 2020 at 03:19:34PM +0900, Michael Paquier wrote:
> On Fri, May 08, 2020 at 02:25:45AM -0500, Justin Pryzby wrote:
> > Seems to me it should, at least conditionally.  At least if there's a
> > function scan or a relation or ..
> >
> > I mentioned a bit about our use-case here:
> > https://www.postgresql.org/message-id/20200219173742.GA30939%40telsasoft.com
> > => I'd prefer our loaders to write their own data rather than dirtying large
> > fractions of buffer cache and leaving it around for other backends to clean
> > up.
>
> Does it matter in terms of performance and for which cases does it
> actually matter?
Every 15min we're inserting tens of thousands of rows, which dirties a large
number of buffers:

postgres=# CREATE EXTENSION pg_buffercache;
postgres=# DROP TABLE tt; CREATE TABLE tt(i int);
postgres=# INSERT INTO tt SELECT generate_series(1,999999);
postgres=# SELECT usagecount, COUNT(1) FROM pg_buffercache WHERE isdirty GROUP BY 1 ORDER BY 1;
 usagecount | count
------------+-------
          1 |     1
          2 |     1
          3 |     2
          4 |     2
          5 |  4436

With this patch, it dirties fewer pages, and with a lower usage count:

          1 |  2052
          2 |     1
          3 |     3
          4 |     2
          5 |    10

The goal is to avoid cache churn by using a small ring buffer.  Note that
indexes on the target table will themselves use up buffers, so a
BulkInsertState won't help as much there.

On Thu, Mar 18, 2021 at 08:29:50AM -0700, Zhihong Yu wrote:
> +		mtstate->ntuples > bulk_insert_ntuples &&
> +		bulk_insert_ntuples >= 0)
>
> I wonder why bulk_insert_ntuples == 0 is included in the above.  It
> seems bulk_insert_ntuples having value of 0 should mean not enabling bulk
> insertions.

I think it ought to be possible to enable bulk insertions immediately, which
is what 0 does.  -1 is the value defined to mean "do not use bulk insert".
I realize there's no documentation yet.

This patch started out with the goal of using a BulkInsertState for INSERT,
the same as for COPY.  We use INSERT ON CONFLICT VALUES(),()..., and it'd be
nice if our data loaders could avoid leaving behind dirty buffers.

Simon suggested also using MultiInsertInfo.  However, that patch is
complicated: it cannot work with volatile default expressions, and probably
many other things that could go in a SELECT but are not supported by
miistate.  That may be better handled by another patch (or in any case by
someone else), like this one:

| New Table Access Methods for Multi and Single Inserts
https://commitfest.postgresql.org/31/2871/

I split off the simple part again.  If there's no interest in the 0001 patch
alone, then I guess it should be closed in the CF.
However, the "new table AM" patch doesn't address our case, since neither
VALUES nor INSERT SELECT is considered a bulk insert.

-- 
Justin
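For clarity, here's how the proposed GUC would behave from a session, with
the patch below applied (a sketch; bulk_insert_ntuples, its default of 1000,
and the DEBUG1 message are from the patch, not anything committed):

```sql
-- Default: switch to a BulkInsertState after 1000 tuples
SHOW bulk_insert_ntuples;

-- 0: use bulk insert from the first tuple
SET bulk_insert_ntuples = 0;

-- -1: never use bulk insert
SET bulk_insert_ntuples = -1;

-- Observe when the ring buffer is engaged:
SET client_min_messages = debug1;
INSERT INTO tt SELECT generate_series(1, 999999);
-- DEBUG:  enabling bulk insert
```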
>From 19eea95d1eca89a90f3f1b2992e248e3e3459caa Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryz...@telsasoft.com>
Date: Tue, 27 Apr 2021 09:09:13 -0500
Subject: [PATCH v11] ExecInsert to use BulkInsertState

---
 src/backend/executor/nodeModifyTable.c | 34 ++++++++++++++++++++++++--
 src/backend/utils/misc/guc.c           |  9 +++++++
 src/include/executor/nodeModifyTable.h |  2 ++
 src/include/nodes/execnodes.h          |  5 ++++
 src/test/regress/expected/insert.out   | 15 ++++++++++++
 src/test/regress/sql/insert.sql        | 10 ++++++++
 6 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d328856ae5..9ce81dc922 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -81,6 +81,8 @@ static TupleTableSlot *ExecPrepareTupleRouting(ModifyTableState *mtstate,
 											   ResultRelInfo *targetRelInfo,
 											   TupleTableSlot *slot,
 											   ResultRelInfo **partRelInfo);
+/* guc */
+int			bulk_insert_ntuples = 1000;
 
 /*
  * Verify that the tuples to be produced by INSERT match the
@@ -625,6 +627,19 @@ ExecInsert(ModifyTableState *mtstate,
 
 	resultRelationDesc = resultRelInfo->ri_RelationDesc;
 
+	/* Use bulk insert after a threshold number of tuples */
+	// XXX: maybe this should only be done if it's not a partitioned table or
+	// if the partitions don't support miinfo, which uses its own bistates
+	mtstate->ntuples++;
+	if (mtstate->bistate == NULL &&
+		mtstate->ntuples > bulk_insert_ntuples &&
+		bulk_insert_ntuples >= 0)
+	{
+		elog(DEBUG1, "enabling bulk insert");
+		mtstate->bistate = GetBulkInsertState();
+	}
+
 	/*
 	 * Open the table's indexes, if we have not done so already, so that we
 	 * can add new index entries for the inserted tuple.
@@ -893,7 +908,7 @@ ExecInsert(ModifyTableState *mtstate,
 			table_tuple_insert_speculative(resultRelationDesc, slot,
 										   estate->es_output_cid,
 										   0,
-										   NULL,
+										   mtstate->bistate,
 										   specToken);
 
 			/* insert index entries for tuple */
@@ -930,10 +945,17 @@ ExecInsert(ModifyTableState *mtstate,
 	}
 	else
 	{
+		if (proute && mtstate->prevResultRelInfo != resultRelInfo)
+		{
+			if (mtstate->bistate)
+				ReleaseBulkInsertStatePin(mtstate->bistate);
+			mtstate->prevResultRelInfo = resultRelInfo;
+		}
+
 		/* insert the tuple normally */
 		table_tuple_insert(resultRelationDesc, slot,
 						   estate->es_output_cid,
-						   0, NULL);
+						   0, mtstate->bistate);
 
 		/* insert index entries for tuple */
 		if (resultRelInfo->ri_NumIndices > 0)
@@ -2736,6 +2758,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 	mtstate->resultRelInfo = (ResultRelInfo *)
 		palloc(nrels * sizeof(ResultRelInfo));
 
+	mtstate->bistate = NULL;
+
 	/*----------
 	 * Resolve the target relation.  This is the same as:
 	 *
@@ -3196,6 +3220,12 @@ ExecEndModifyTable(ModifyTableState *node)
 		}
 	}
 
+	if (node->bistate)
+	{
+		FreeBulkInsertState(node->bistate);
+		table_finish_bulk_insert(node->rootResultRelInfo->ri_RelationDesc, 0);
+	}
+
 	/*
 	 * Close all the partitioned tables, leaf partitions, and their indices
 	 * and release the slot used for tuple routing, if set.
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index d2ce4a8450..984e636c5d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -51,6 +51,7 @@
 #include "commands/vacuum.h"
 #include "commands/variable.h"
 #include "common/string.h"
+#include "executor/nodeModifyTable.h"
 #include "funcapi.h"
 #include "jit/jit.h"
 #include "libpq/auth.h"
@@ -3569,6 +3570,14 @@ static struct config_int ConfigureNamesInt[] =
 		0, 0, INT_MAX,
 		check_client_connection_check_interval, NULL, NULL
 	},
 
+	{
+		{"bulk_insert_ntuples", PGC_USERSET, CLIENT_CONN_STATEMENT,
+			gettext_noop("Enable bulk insertions after this number of tuples."),
+			gettext_noop("A ring buffer of limited size will be used and updates done in batch"),
+		},
+		&bulk_insert_ntuples,
+		1000, -1, INT_MAX,
+	},
 	/* End-of-list marker */
 	{
diff --git a/src/include/executor/nodeModifyTable.h b/src/include/executor/nodeModifyTable.h
index 83e2965531..479124899a 100644
--- a/src/include/executor/nodeModifyTable.h
+++ b/src/include/executor/nodeModifyTable.h
@@ -15,6 +15,8 @@
 
 #include "nodes/execnodes.h"
 
+extern PGDLLIMPORT int bulk_insert_ntuples;
+
 extern void ExecComputeStoredGenerated(ResultRelInfo *resultRelInfo,
 									   EState *estate, TupleTableSlot *slot,
 									   CmdType cmdtype);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 37cb4f3d59..f2bd880d28 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -14,6 +14,7 @@
 #ifndef EXECNODES_H
 #define EXECNODES_H
 
+#include "access/heapam.h"
 #include "access/tupconvert.h"
 #include "executor/instrument.h"
 #include "fmgr.h"
@@ -1202,6 +1203,10 @@ typedef struct ModifyTableState
 	EPQState	mt_epqstate;	/* for evaluating EvalPlanQual rechecks */
 	bool		fireBSTriggers; /* do we need to fire stmt triggers? */
+	BulkInsertState bistate;	/* state for bulk insert like INSERT SELECT, when miinfo cannot be used */
+	ResultRelInfo *prevResultRelInfo;	/* last child inserted with bistate */
+	size_t		ntuples;		/* Number of tuples inserted */
+
 	/*
 	 * These fields are used for inherited UPDATE and DELETE, to track which
 	 * target relation a given tuple is from.  If there are a lot of target
diff --git a/src/test/regress/expected/insert.out b/src/test/regress/expected/insert.out
index 5063a3dc22..ea1b15ff83 100644
--- a/src/test/regress/expected/insert.out
+++ b/src/test/regress/expected/insert.out
@@ -483,6 +483,21 @@ Partitions: part_aa_bb FOR VALUES IN ('aa', 'bb'),
             part_xx_yy FOR VALUES IN ('xx', 'yy'), PARTITIONED,
             part_default DEFAULT, PARTITIONED
 
+-- bulk inserts
+truncate hash_parted;
+begin;
+-- exercise bulk insert to partitions
+SET client_min_messages=debug1;
+insert into hash_parted select generate_series(1,9999);
+DEBUG:  enabling bulk insert
+RESET client_min_messages;
+select count(1) from hash_parted;
+ count 
+-------
+  9999
+(1 row)
+
+commit;
 -- cleanup
 drop table range_parted, list_parted;
 drop table hash_parted;
diff --git a/src/test/regress/sql/insert.sql b/src/test/regress/sql/insert.sql
index bfaa8a3b27..1a792084c1 100644
--- a/src/test/regress/sql/insert.sql
+++ b/src/test/regress/sql/insert.sql
@@ -302,6 +302,16 @@ from hash_parted order by part;
 -- partitions
 \d+ list_parted
 
+-- bulk inserts
+truncate hash_parted;
+begin;
+-- exercise bulk insert to partitions
+SET client_min_messages=debug1;
+insert into hash_parted select generate_series(1,9999);
+RESET client_min_messages;
+select count(1) from hash_parted;
+commit;
+
 -- cleanup
 drop table range_parted, list_parted;
 drop table hash_parted;
-- 
2.17.0