On Mon, Feb 05, 2024 at 06:05:15PM +0900, Sutou Kouhei wrote:
> In <zcckwaefrloqp...@paquier.xyz>
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Mon, 5 Feb 2024 16:14:08 +0900,
>   Michael Paquier <mich...@paquier.xyz> wrote:
>> 2) I have backpedaled on the postpare callback, which did not bring
>> much in clarity IMO while being a CSV-only callback.  Note that we
>> have in copyfromparse.c more paths that are only for CSV but the past
>> versions of the patch never cared about that.  This makes the text and
>> CSV implementations much closer to each other, as a result.
> 
> Ah, sorry. I forgot to eliminate cstate->opts.csv_mode in
> CopyReadLineText(). The postpare callback is for
> optimization. If it doesn't improve performance, we don't
> need to introduce it.

No worries.

> We may want to try eliminating cstate->opts.csv_mode in
> CopyReadLineText() for performance. But we don't need to
> do this in introducing CopyFromRoutine. We can defer it.
> 
> So I don't object removing the postpare callback.

Rather related, but there has been a comment from Andres about this
kind of splits a few hours ago, so perhaps this is for the best:
https://www.postgresql.org/message-id/20240205182118.h5rkbnjgujwzuxip%40awork3.anarazel.de

I'll reply to this one in a bit.

>>                                              Let me know if you have
>> comments about all that.
> 
> Here are some comments for the patch:

Thanks.  My head was spinning after reading the diffs more than 20
times :)

> fmgr_info ->
> finfo
> optinally ->
> optionally
> CopyFromRoutine->OneRow ->
> CopyFromRoutine->CopyFromOneRow
> CopyFromTextStart ->
> CopyFromBinaryStart
> CopyFromTextEnd ->
> CopyFromBinaryEnd

Fixed all these.

> How about passing CopyFromState cstate too like other
> callbacks for consistency?

Yes, I was wondering a bit if this can be useful for the custom
formats.

> +     /*
> +      * Copy one row to a set of `values` and `nulls` of size tupDesc->natts.
> +      *
> +      * 'econtext' is used to evaluate default expression for each column 
> that
> +      * is either not read from the file or is using the DEFAULT option of 
> COPY
> 
> or is ->
> or

"or is" is correct here IMO.

> Wow! I didn't know that we need to update typedefs.list when
> I add a "typedef struct".

That's for the automated indentation.  This is a habit I have when it
comes to work on shaping up patches to avoid weird diffs with pgindent
and new structure names.  It's OK to forget about it :)

Attaching a v13 for now.
--
Michael
From 50ebc2d6b1b5f7bcb2248dd3fa4cef3f14f7adaa Mon Sep 17 00:00:00 2001
From: Michael Paquier <mich...@paquier.xyz>
Date: Tue, 6 Feb 2024 08:40:17 +0900
Subject: [PATCH v13] Extract COPY FROM/TO format implementations

This doesn't change the current behavior.  This just introduces a set of
copy routines called CopyFromRoutine and CopyToRoutine, which just has
function pointers of format implementation like TupleTableSlotOps, and
use it for existing "text", "csv" and "binary" format implementations.
There are plans to extend that more in the future with custom formats.

This improves performance when a relation has many attributes as this
eliminates format-based if branches.  The more the attributes, the
better the performance gain.  Blah add some numbers.
---
 src/include/commands/copyapi.h           | 100 ++++++
 src/include/commands/copyfrom_internal.h |  10 +
 src/backend/commands/copyfrom.c          | 209 ++++++++---
 src/backend/commands/copyfromparse.c     | 362 ++++++++++---------
 src/backend/commands/copyto.c            | 438 +++++++++++++++--------
 src/tools/pgindent/typedefs.list         |   2 +
 6 files changed, 770 insertions(+), 351 deletions(-)
 create mode 100644 src/include/commands/copyapi.h

diff --git a/src/include/commands/copyapi.h b/src/include/commands/copyapi.h
new file mode 100644
index 0000000000..87e0629c2f
--- /dev/null
+++ b/src/include/commands/copyapi.h
@@ -0,0 +1,100 @@
+/*-------------------------------------------------------------------------
+ *
+ * copyapi.h
+ *	  API for COPY TO/FROM handlers
+ *
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/commands/copyapi.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef COPYAPI_H
+#define COPYAPI_H
+
+#include "executor/tuptable.h"
+#include "nodes/execnodes.h"
+
+/* These are private in commands/copy[from|to].c */
+typedef struct CopyFromStateData *CopyFromState;
+typedef struct CopyToStateData *CopyToState;
+
+/*
+ * API structure for a COPY FROM format implementation.  Note this must be
+ * allocated in a server-lifetime manner, typically as a static const struct.
+ */
+typedef struct CopyFromRoutine
+{
+	/*
+	 * Called when COPY FROM is started to set up the input functions
+	 * associated to the relation's attributes writing to.  `finfo` can be
+	 * optionally filled to provide the catalog information of the input
+	 * function.  `typioparam` can be optionally filled to define the OID of
+	 * the type to pass to the input function.  `atttypid` is the OID of data
+	 * type used by the relation's attribute.
+	 */
+	void		(*CopyFromInFunc) (CopyFromState cstate, Oid atttypid,
+								   FmgrInfo *finfo, Oid *typioparam);
+
+	/*
+	 * Called when COPY FROM is started.
+	 *
+	 * `tupDesc` is the tuple descriptor of the relation where the data needs
+	 * to be copied.  This can be used for any initialization steps required
+	 * by a format.
+	 */
+	void		(*CopyFromStart) (CopyFromState cstate, TupleDesc tupDesc);
+
+	/*
+	 * Copy one row to a set of `values` and `nulls` of size tupDesc->natts.
+	 *
+	 * 'econtext' is used to evaluate default expression for each column that
+	 * is either not read from the file or is using the DEFAULT option of COPY
+	 * FROM.  It is NULL if no default values are used.
+	 *
+	 * Returns false if there are no more tuples to copy.
+	 */
+	bool		(*CopyFromOneRow) (CopyFromState cstate, ExprContext *econtext,
+								   Datum *values, bool *nulls);
+
+	/* Called when COPY FROM has ended. */
+	void		(*CopyFromEnd) (CopyFromState cstate);
+} CopyFromRoutine;
+
+/*
+ * API structure for a COPY TO format implementation.   Note this must be
+ * allocated in a server-lifetime manner, typically as a static const struct.
+ */
+typedef struct CopyToRoutine
+{
+	/*
+	 * Called when COPY TO is started to set up the output functions
+	 * associated to the relation's attributes reading from.  `finfo` can be
+	 * optionally filled. `atttypid` is the OID of data type used by the
+	 * relation's attribute.
+	 */
+	void		(*CopyToOutFunc) (CopyToState cstate, Oid atttypid,
+								  FmgrInfo *finfo);
+
+	/*
+	 * Called when COPY TO is started.
+	 *
+	 * `tupDesc` is the tuple descriptor of the relation from where the data
+	 * is read.
+	 */
+	void		(*CopyToStart) (CopyToState cstate, TupleDesc tupDesc);
+
+	/*
+	 * Copy one row for COPY TO.
+	 *
+	 * `slot` is the tuple slot where the data is emitted.
+	 */
+	void		(*CopyToOneRow) (CopyToState cstate, TupleTableSlot *slot);
+
+	/* Called when COPY TO has ended */
+	void		(*CopyToEnd) (CopyToState cstate);
+} CopyToRoutine;
+
+#endif							/* COPYAPI_H */
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index 759f8e3d09..5fb52dc629 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -15,6 +15,7 @@
 #define COPYFROM_INTERNAL_H
 
 #include "commands/copy.h"
+#include "commands/copyapi.h"
 #include "commands/trigger.h"
 #include "nodes/miscnodes.h"
 
@@ -65,6 +66,9 @@ typedef int (*CopyReadAttributes) (CopyFromState cstate);
  */
 typedef struct CopyFromStateData
 {
+	/* format routine */
+	const CopyFromRoutine *routine;
+
 	/* low-level state data */
 	CopySource	copy_src;		/* type of copy source */
 	FILE	   *copy_file;		/* used if copy_src == COPY_FILE */
@@ -200,4 +204,10 @@ extern void ReceiveCopyBinaryHeader(CopyFromState cstate);
 extern int	CopyReadAttributesCSV(CopyFromState cstate);
 extern int	CopyReadAttributesText(CopyFromState cstate);
 
+/* Callbacks for CopyFromRoutine->CopyFromOneRow */
+extern bool CopyFromTextOneRow(CopyFromState cstate, ExprContext *econtext,
+							   Datum *values, bool *nulls);
+extern bool CopyFromBinaryOneRow(CopyFromState cstate, ExprContext *econtext,
+								 Datum *values, bool *nulls);
+
 #endif							/* COPYFROM_INTERNAL_H */
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 41f6bc43e4..a90b7189b5 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -108,6 +108,160 @@ static char *limit_printout_length(const char *str);
 
 static void ClosePipeFromProgram(CopyFromState cstate);
 
+
+/*
+ * CopyFromRoutine implementations for text and CSV.
+ */
+
+/*
+ * CopyFromTextInFunc
+ *
+ * Assign input function data for a relation's attribute in text/CSV format.
+ */
+static void
+CopyFromTextInFunc(CopyFromState cstate, Oid atttypid,
+				   FmgrInfo *finfo, Oid *typioparam)
+{
+	Oid			func_oid;
+
+	getTypeInputInfo(atttypid, &func_oid, typioparam);
+	fmgr_info(func_oid, finfo);
+}
+
+/*
+ * CopyFromTextStart
+ *
+ * Start of COPY FROM for text/CSV format.
+ */
+static void
+CopyFromTextStart(CopyFromState cstate, TupleDesc tupDesc)
+{
+	AttrNumber	attr_count;
+
+	/*
+	 * If encoding conversion is needed, we need another buffer to hold the
+	 * converted input data.  Otherwise, we can just point input_buf to the
+	 * same buffer as raw_buf.
+	 */
+	if (cstate->need_transcoding)
+	{
+		cstate->input_buf = (char *) palloc(INPUT_BUF_SIZE + 1);
+		cstate->input_buf_index = cstate->input_buf_len = 0;
+	}
+	else
+		cstate->input_buf = cstate->raw_buf;
+	cstate->input_reached_eof = false;
+
+	initStringInfo(&cstate->line_buf);
+
+	/* create workspace for CopyReadAttributes results */
+	attr_count = list_length(cstate->attnumlist);
+	cstate->max_fields = attr_count;
+	cstate->raw_fields = (char **) palloc(attr_count * sizeof(char *));
+
+	/* Set read attribute callback */
+	if (cstate->opts.csv_mode)
+		cstate->copy_read_attributes = CopyReadAttributesCSV;
+	else
+		cstate->copy_read_attributes = CopyReadAttributesText;
+}
+
+/*
+ * CopyFromTextEnd
+ *
+ * End of COPY FROM for text/CSV format.
+ */
+static void
+CopyFromTextEnd(CopyFromState cstate)
+{
+	/* nothing to do */
+}
+
+/*
+ * CopyFromRoutine implementation for "binary".
+ */
+
+/*
+ * CopyFromBinaryInFunc
+ *
+ * Assign input function data for a relation's attribute in binary format.
+ */
+static void
+CopyFromBinaryInFunc(CopyFromState cstate, Oid atttypid,
+					 FmgrInfo *finfo, Oid *typioparam)
+{
+	Oid			func_oid;
+
+	getTypeBinaryInputInfo(atttypid, &func_oid, typioparam);
+	fmgr_info(func_oid, finfo);
+}
+
+/*
+ * CopyFromBinaryStart
+ *
+ * Start of COPY FROM for binary format.
+ */
+static void
+CopyFromBinaryStart(CopyFromState cstate, TupleDesc tupDesc)
+{
+	/* Read and verify binary header */
+	ReceiveCopyBinaryHeader(cstate);
+}
+
+/*
+ * CopyFromBinaryEnd
+ *
+ * End of COPY FROM for binary format.
+ */
+static void
+CopyFromBinaryEnd(CopyFromState cstate)
+{
+	/* nothing to do */
+}
+
+/*
+ * Routines assigned to each format.
++
+ * CSV and text share the same implementation, at the exception of the
+ * copy_read_attributes callback.
+ */
+static const CopyFromRoutine CopyFromRoutineText = {
+	.CopyFromInFunc = CopyFromTextInFunc,
+	.CopyFromStart = CopyFromTextStart,
+	.CopyFromOneRow = CopyFromTextOneRow,
+	.CopyFromEnd = CopyFromTextEnd,
+};
+
+static const CopyFromRoutine CopyFromRoutineCSV = {
+	.CopyFromInFunc = CopyFromTextInFunc,
+	.CopyFromStart = CopyFromTextStart,
+	.CopyFromOneRow = CopyFromTextOneRow,
+	.CopyFromEnd = CopyFromTextEnd,
+};
+
+static const CopyFromRoutine CopyFromRoutineBinary = {
+	.CopyFromInFunc = CopyFromBinaryInFunc,
+	.CopyFromStart = CopyFromBinaryStart,
+	.CopyFromOneRow = CopyFromBinaryOneRow,
+	.CopyFromEnd = CopyFromBinaryEnd,
+};
+
+/*
+ * Define the COPY FROM routines to use for a format.
+ */
+static const CopyFromRoutine *
+CopyFromGetRoutine(CopyFormatOptions opts)
+{
+	if (opts.csv_mode)
+		return &CopyFromRoutineCSV;
+	else if (opts.binary)
+		return &CopyFromRoutineBinary;
+
+	/* default is text */
+	return &CopyFromRoutineText;
+}
+
+
 /*
  * error context callback for COPY FROM
  *
@@ -1386,7 +1540,6 @@ BeginCopyFrom(ParseState *pstate,
 				num_defaults;
 	FmgrInfo   *in_functions;
 	Oid		   *typioparams;
-	Oid			in_func_oid;
 	int		   *defmap;
 	ExprState **defexprs;
 	MemoryContext oldcontext;
@@ -1418,6 +1571,9 @@ BeginCopyFrom(ParseState *pstate,
 	/* Extract options from the statement node tree */
 	ProcessCopyOptions(pstate, &cstate->opts, true /* is_from */ , options);
 
+	/* Set format routine */
+	cstate->routine = CopyFromGetRoutine(cstate->opts);
+
 	/* Process the target relation */
 	cstate->rel = rel;
 
@@ -1571,25 +1727,6 @@ BeginCopyFrom(ParseState *pstate,
 	cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	cstate->raw_reached_eof = false;
 
-	if (!cstate->opts.binary)
-	{
-		/*
-		 * If encoding conversion is needed, we need another buffer to hold
-		 * the converted input data.  Otherwise, we can just point input_buf
-		 * to the same buffer as raw_buf.
-		 */
-		if (cstate->need_transcoding)
-		{
-			cstate->input_buf = (char *) palloc(INPUT_BUF_SIZE + 1);
-			cstate->input_buf_index = cstate->input_buf_len = 0;
-		}
-		else
-			cstate->input_buf = cstate->raw_buf;
-		cstate->input_reached_eof = false;
-
-		initStringInfo(&cstate->line_buf);
-	}
-
 	initStringInfo(&cstate->attribute_buf);
 
 	/* Assign range table and rteperminfos, we'll need them in CopyFrom. */
@@ -1622,13 +1759,9 @@ BeginCopyFrom(ParseState *pstate,
 			continue;
 
 		/* Fetch the input function and typioparam info */
-		if (cstate->opts.binary)
-			getTypeBinaryInputInfo(att->atttypid,
-								   &in_func_oid, &typioparams[attnum - 1]);
-		else
-			getTypeInputInfo(att->atttypid,
-							 &in_func_oid, &typioparams[attnum - 1]);
-		fmgr_info(in_func_oid, &in_functions[attnum - 1]);
+		cstate->routine->CopyFromInFunc(cstate, att->atttypid,
+										&in_functions[attnum - 1],
+										&typioparams[attnum - 1]);
 
 		/* Get default info if available */
 		defexprs[attnum - 1] = NULL;
@@ -1763,25 +1896,7 @@ BeginCopyFrom(ParseState *pstate,
 
 	pgstat_progress_update_multi_param(3, progress_cols, progress_vals);
 
-	if (cstate->opts.binary)
-	{
-		/* Read and verify binary header */
-		ReceiveCopyBinaryHeader(cstate);
-	}
-
-	/* create workspace for CopyReadAttributes results */
-	if (!cstate->opts.binary)
-	{
-		AttrNumber	attr_count = list_length(cstate->attnumlist);
-
-		cstate->max_fields = attr_count;
-		cstate->raw_fields = (char **) palloc(attr_count * sizeof(char *));
-
-		if (cstate->opts.csv_mode)
-			cstate->copy_read_attributes = CopyReadAttributesCSV;
-		else
-			cstate->copy_read_attributes = CopyReadAttributesText;
-	}
+	cstate->routine->CopyFromStart(cstate, tupDesc);
 
 	MemoryContextSwitchTo(oldcontext);
 
@@ -1794,6 +1909,8 @@ BeginCopyFrom(ParseState *pstate,
 void
 EndCopyFrom(CopyFromState cstate)
 {
+	cstate->routine->CopyFromEnd(cstate);
+
 	/* No COPY FROM related resources except memory. */
 	if (cstate->is_program)
 	{
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 906756362e..c45f9ae134 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -832,6 +832,200 @@ NextCopyFromRawFields(CopyFromState cstate, char ***fields, int *nfields)
 	return true;
 }
 
+/*
+ * CopyFromTextOneRow
+ *
+ * Copy one row to a set of `values` and `nulls` for the text and CSV
+ * formats.
+ */
+bool
+CopyFromTextOneRow(CopyFromState cstate,
+				   ExprContext *econtext,
+				   Datum *values,
+				   bool *nulls)
+{
+	TupleDesc	tupDesc;
+	AttrNumber	attr_count;
+	FmgrInfo   *in_functions = cstate->in_functions;
+	Oid		   *typioparams = cstate->typioparams;
+	ExprState **defexprs = cstate->defexprs;
+	char	  **field_strings;
+	ListCell   *cur;
+	int			fldct;
+	int			fieldno;
+	char	   *string;
+
+	tupDesc = RelationGetDescr(cstate->rel);
+	attr_count = list_length(cstate->attnumlist);
+
+	/* read raw fields in the next line */
+	if (!NextCopyFromRawFields(cstate, &field_strings, &fldct))
+		return false;
+
+	/* check for overflowing fields */
+	if (attr_count > 0 && fldct > attr_count)
+		ereport(ERROR,
+				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
+				 errmsg("extra data after last expected column")));
+
+	fieldno = 0;
+
+	/* Loop to read the user attributes on the line. */
+	foreach(cur, cstate->attnumlist)
+	{
+		int			attnum = lfirst_int(cur);
+		int			m = attnum - 1;
+		Form_pg_attribute att = TupleDescAttr(tupDesc, m);
+
+		if (fieldno >= fldct)
+			ereport(ERROR,
+					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
+					 errmsg("missing data for column \"%s\"",
+							NameStr(att->attname))));
+		string = field_strings[fieldno++];
+
+		if (cstate->convert_select_flags &&
+			!cstate->convert_select_flags[m])
+		{
+			/* ignore input field, leaving column as NULL */
+			continue;
+		}
+
+		cstate->cur_attname = NameStr(att->attname);
+		cstate->cur_attval = string;
+
+		if (cstate->opts.csv_mode)
+		{
+			if (string == NULL &&
+				cstate->opts.force_notnull_flags[m])
+			{
+				/*
+				 * FORCE_NOT_NULL option is set and column is NULL - convert
+				 * it to the NULL string.
+				 */
+				string = cstate->opts.null_print;
+			}
+			else if (string != NULL && cstate->opts.force_null_flags[m]
+					 && strcmp(string, cstate->opts.null_print) == 0)
+			{
+				/*
+				 * FORCE_NULL option is set and column matches the NULL
+				 * string. It must have been quoted, or otherwise the string
+				 * would already have been set to NULL. Convert it to NULL as
+				 * specified.
+				 */
+				string = NULL;
+			}
+		}
+
+		if (string != NULL)
+			nulls[m] = false;
+
+		if (cstate->defaults[m])
+		{
+			/*
+			 * The caller must supply econtext and have switched into the
+			 * per-tuple memory context in it.
+			 */
+			Assert(econtext != NULL);
+			Assert(CurrentMemoryContext == econtext->ecxt_per_tuple_memory);
+
+			values[m] = ExecEvalExpr(defexprs[m], econtext, &nulls[m]);
+		}
+
+		/*
+		 * If ON_ERROR is specified with IGNORE, skip rows with soft errors
+		 */
+		else if (!InputFunctionCallSafe(&in_functions[m],
+										string,
+										typioparams[m],
+										att->atttypmod,
+										(Node *) cstate->escontext,
+										&values[m]))
+		{
+			cstate->num_errors++;
+			return true;
+		}
+
+		cstate->cur_attname = NULL;
+		cstate->cur_attval = NULL;
+	}
+
+	Assert(fieldno == attr_count);
+
+	return true;
+}
+
+/*
+ * CopyFromBinaryOneRow
+ *
+ * Copy one row to a set of `values` and `nulls` for the binary format.
+ */
+bool
+CopyFromBinaryOneRow(CopyFromState cstate, ExprContext *econtext,
+					 Datum *values, bool *nulls)
+{
+	TupleDesc	tupDesc;
+	AttrNumber	attr_count;
+	FmgrInfo   *in_functions = cstate->in_functions;
+	Oid		   *typioparams = cstate->typioparams;
+	int16		fld_count;
+	ListCell   *cur;
+
+	tupDesc = RelationGetDescr(cstate->rel);
+	attr_count = list_length(cstate->attnumlist);
+
+	cstate->cur_lineno++;
+
+	if (!CopyGetInt16(cstate, &fld_count))
+	{
+		/* EOF detected (end of file, or protocol-level EOF) */
+		return false;
+	}
+
+	if (fld_count == -1)
+	{
+		/*
+		 * Received EOF marker.  Wait for the protocol-level EOF, and complain
+		 * if it doesn't come immediately.  In COPY FROM STDIN, this ensures
+		 * that we correctly handle CopyFail, if client chooses to send that
+		 * now.  When copying from file, we could ignore the rest of the file
+		 * like in text mode, but we choose to be consistent with the COPY
+		 * FROM STDIN case.
+		 */
+		char		dummy;
+
+		if (CopyReadBinaryData(cstate, &dummy, 1) > 0)
+			ereport(ERROR,
+					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
+					 errmsg("received copy data after EOF marker")));
+		return false;
+	}
+
+	if (fld_count != attr_count)
+		ereport(ERROR,
+				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
+				 errmsg("row field count is %d, expected %d",
+						(int) fld_count, attr_count)));
+
+	foreach(cur, cstate->attnumlist)
+	{
+		int			attnum = lfirst_int(cur);
+		int			m = attnum - 1;
+		Form_pg_attribute att = TupleDescAttr(tupDesc, m);
+
+		cstate->cur_attname = NameStr(att->attname);
+		values[m] = CopyReadBinaryAttribute(cstate,
+											&in_functions[m],
+											typioparams[m],
+											att->atttypmod,
+											&nulls[m]);
+		cstate->cur_attname = NULL;
+	}
+
+	return true;
+}
+
 /*
  * Read next tuple from file for COPY FROM. Return false if no more tuples.
  *
@@ -841,7 +1035,8 @@ NextCopyFromRawFields(CopyFromState cstate, char ***fields, int *nfields)
  * read from the file, and DEFAULT option is unset.
  *
  * 'values' and 'nulls' arrays must be the same length as columns of the
- * relation passed to BeginCopyFrom. This function fills the arrays.
+ * relation passed to BeginCopyFrom. This function calls the format routine
+ * that should fill the arrays.
  */
 bool
 NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
@@ -849,181 +1044,22 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
 {
 	TupleDesc	tupDesc;
 	AttrNumber	num_phys_attrs,
-				attr_count,
 				num_defaults = cstate->num_defaults;
-	FmgrInfo   *in_functions = cstate->in_functions;
-	Oid		   *typioparams = cstate->typioparams;
 	int			i;
 	int		   *defmap = cstate->defmap;
 	ExprState **defexprs = cstate->defexprs;
 
 	tupDesc = RelationGetDescr(cstate->rel);
 	num_phys_attrs = tupDesc->natts;
-	attr_count = list_length(cstate->attnumlist);
 
 	/* Initialize all values for row to NULL */
 	MemSet(values, 0, num_phys_attrs * sizeof(Datum));
 	MemSet(nulls, true, num_phys_attrs * sizeof(bool));
 	MemSet(cstate->defaults, false, num_phys_attrs * sizeof(bool));
 
-	if (!cstate->opts.binary)
-	{
-		char	  **field_strings;
-		ListCell   *cur;
-		int			fldct;
-		int			fieldno;
-		char	   *string;
-
-		/* read raw fields in the next line */
-		if (!NextCopyFromRawFields(cstate, &field_strings, &fldct))
-			return false;
-
-		/* check for overflowing fields */
-		if (attr_count > 0 && fldct > attr_count)
-			ereport(ERROR,
-					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
-					 errmsg("extra data after last expected column")));
-
-		fieldno = 0;
-
-		/* Loop to read the user attributes on the line. */
-		foreach(cur, cstate->attnumlist)
-		{
-			int			attnum = lfirst_int(cur);
-			int			m = attnum - 1;
-			Form_pg_attribute att = TupleDescAttr(tupDesc, m);
-
-			if (fieldno >= fldct)
-				ereport(ERROR,
-						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
-						 errmsg("missing data for column \"%s\"",
-								NameStr(att->attname))));
-			string = field_strings[fieldno++];
-
-			if (cstate->convert_select_flags &&
-				!cstate->convert_select_flags[m])
-			{
-				/* ignore input field, leaving column as NULL */
-				continue;
-			}
-
-			if (cstate->opts.csv_mode)
-			{
-				if (string == NULL &&
-					cstate->opts.force_notnull_flags[m])
-				{
-					/*
-					 * FORCE_NOT_NULL option is set and column is NULL -
-					 * convert it to the NULL string.
-					 */
-					string = cstate->opts.null_print;
-				}
-				else if (string != NULL && cstate->opts.force_null_flags[m]
-						 && strcmp(string, cstate->opts.null_print) == 0)
-				{
-					/*
-					 * FORCE_NULL option is set and column matches the NULL
-					 * string. It must have been quoted, or otherwise the
-					 * string would already have been set to NULL. Convert it
-					 * to NULL as specified.
-					 */
-					string = NULL;
-				}
-			}
-
-			cstate->cur_attname = NameStr(att->attname);
-			cstate->cur_attval = string;
-
-			if (string != NULL)
-				nulls[m] = false;
-
-			if (cstate->defaults[m])
-			{
-				/*
-				 * The caller must supply econtext and have switched into the
-				 * per-tuple memory context in it.
-				 */
-				Assert(econtext != NULL);
-				Assert(CurrentMemoryContext == econtext->ecxt_per_tuple_memory);
-
-				values[m] = ExecEvalExpr(defexprs[m], econtext, &nulls[m]);
-			}
-
-			/*
-			 * If ON_ERROR is specified with IGNORE, skip rows with soft
-			 * errors
-			 */
-			else if (!InputFunctionCallSafe(&in_functions[m],
-											string,
-											typioparams[m],
-											att->atttypmod,
-											(Node *) cstate->escontext,
-											&values[m]))
-			{
-				cstate->num_errors++;
-				return true;
-			}
-
-			cstate->cur_attname = NULL;
-			cstate->cur_attval = NULL;
-		}
-
-		Assert(fieldno == attr_count);
-	}
-	else
-	{
-		/* binary */
-		int16		fld_count;
-		ListCell   *cur;
-
-		cstate->cur_lineno++;
-
-		if (!CopyGetInt16(cstate, &fld_count))
-		{
-			/* EOF detected (end of file, or protocol-level EOF) */
-			return false;
-		}
-
-		if (fld_count == -1)
-		{
-			/*
-			 * Received EOF marker.  Wait for the protocol-level EOF, and
-			 * complain if it doesn't come immediately.  In COPY FROM STDIN,
-			 * this ensures that we correctly handle CopyFail, if client
-			 * chooses to send that now.  When copying from file, we could
-			 * ignore the rest of the file like in text mode, but we choose to
-			 * be consistent with the COPY FROM STDIN case.
-			 */
-			char		dummy;
-
-			if (CopyReadBinaryData(cstate, &dummy, 1) > 0)
-				ereport(ERROR,
-						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
-						 errmsg("received copy data after EOF marker")));
-			return false;
-		}
-
-		if (fld_count != attr_count)
-			ereport(ERROR,
-					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
-					 errmsg("row field count is %d, expected %d",
-							(int) fld_count, attr_count)));
-
-		foreach(cur, cstate->attnumlist)
-		{
-			int			attnum = lfirst_int(cur);
-			int			m = attnum - 1;
-			Form_pg_attribute att = TupleDescAttr(tupDesc, m);
-
-			cstate->cur_attname = NameStr(att->attname);
-			values[m] = CopyReadBinaryAttribute(cstate,
-												&in_functions[m],
-												typioparams[m],
-												att->atttypmod,
-												&nulls[m]);
-			cstate->cur_attname = NULL;
-		}
-	}
+	if (!cstate->routine->CopyFromOneRow(cstate, econtext, values,
+										 nulls))
+		return false;
 
 	/*
 	 * Now compute and insert any defaults available for the columns not
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 52b42f8a52..70cacac6b2 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -24,6 +24,7 @@
 #include "access/xact.h"
 #include "access/xlog.h"
 #include "commands/copy.h"
+#include "commands/copyapi.h"
 #include "commands/progress.h"
 #include "executor/execdesc.h"
 #include "executor/executor.h"
@@ -79,6 +80,9 @@ typedef void (*CopyAttributeOut) (CopyToState cstate, const char *string,
  */
 typedef struct CopyToStateData
 {
+	/* format routine */
+	const CopyToRoutine *routine;
+
 	/* low-level state data */
 	CopyDest	copy_dest;		/* type of copy source/destination */
 	FILE	   *copy_file;		/* used if copy_dest == COPY_FILE */
@@ -143,6 +147,291 @@ static void CopySendEndOfRow(CopyToState cstate);
 static void CopySendInt32(CopyToState cstate, int32 val);
 static void CopySendInt16(CopyToState cstate, int16 val);
 
+/*
+ * CopyToRoutine implementations.
+ */
+
+/*
+ * CopyToTextSendEndOfRow
+ *
+ * Apply line terminations for a line sent in text or CSV format depending
+ * on the destination, then send the end of a row.
+ */
+static inline void
+CopyToTextSendEndOfRow(CopyToState cstate)
+{
+	switch (cstate->copy_dest)
+	{
+		case COPY_FILE:
+			/* Default line termination depends on platform */
+#ifndef WIN32
+			CopySendChar(cstate, '\n');
+#else
+			CopySendString(cstate, "\r\n");
+#endif
+			break;
+		case COPY_FRONTEND:
+			/* The FE/BE protocol uses \n as newline for all platforms */
+			CopySendChar(cstate, '\n');
+			break;
+		default:
+			break;
+	}
+
+	/* Now take the actions related to the end of a row */
+	CopySendEndOfRow(cstate);
+}
+
+/*
+ * CopyToTextStart
+ *
+ * Start of COPY TO for text and CSV format.
+ */
+static void
+CopyToTextStart(CopyToState cstate, TupleDesc tupDesc)
+{
+	ListCell   *cur;
+
+	/* Set output representation callback */
+	if (cstate->opts.csv_mode)
+		cstate->copy_attribute_out = CopyAttributeOutCSV;
+	else
+		cstate->copy_attribute_out = CopyAttributeOutText;
+
+	/*
+	 * For non-binary copy, we need to convert null_print to file encoding,
+	 * because it will be sent directly with CopySendString.
+	 */
+	if (cstate->need_transcoding)
+		cstate->opts.null_print_client = pg_server_to_any(cstate->opts.null_print,
+														  cstate->opts.null_print_len,
+														  cstate->file_encoding);
+
+	/* if a header has been requested send the line */
+	if (cstate->opts.header_line)
+	{
+		bool		hdr_delim = false;
+
+		foreach(cur, cstate->attnumlist)
+		{
+			int			attnum = lfirst_int(cur);
+			char	   *colname;
+
+			if (hdr_delim)
+				CopySendChar(cstate, cstate->opts.delim[0]);
+			hdr_delim = true;
+
+			colname = NameStr(TupleDescAttr(tupDesc, attnum - 1)->attname);
+
+			/* Ignore quotes */
+			cstate->copy_attribute_out(cstate, colname, false);
+		}
+
+		CopyToTextSendEndOfRow(cstate);
+	}
+}
+
+/*
+ * CopyToTextOutFunc
+ *
+ * Assign output function data for a relation's attribute in text/CSV format.
+ */
+static void
+CopyToTextOutFunc(CopyToState cstate, Oid atttypid, FmgrInfo *finfo)
+{
+	Oid			func_oid;
+	bool		is_varlena;
+
+	/* Set output function for an attribute */
+	getTypeOutputInfo(atttypid, &func_oid, &is_varlena);
+	fmgr_info(func_oid, finfo);
+}
+
+/*
+ * CopyToTextOneRow
+ *
+ * Process one row for text/CSV format.
+ */
+static void
+CopyToTextOneRow(CopyToState cstate,
+				 TupleTableSlot *slot)
+{
+	bool		need_delim = false;
+	FmgrInfo   *out_functions = cstate->out_functions;
+	ListCell   *cur;
+
+	foreach(cur, cstate->attnumlist)
+	{
+		int			attnum = lfirst_int(cur);
+		Datum		value = slot->tts_values[attnum - 1];
+		bool		isnull = slot->tts_isnull[attnum - 1];
+
+		if (need_delim)
+			CopySendChar(cstate, cstate->opts.delim[0]);
+		need_delim = true;
+
+		if (isnull)
+		{
+			CopySendString(cstate, cstate->opts.null_print_client);
+		}
+		else
+		{
+			char	   *string;
+
+			string = OutputFunctionCall(&out_functions[attnum - 1], value);
+
+			cstate->copy_attribute_out(cstate, string,
+									   cstate->opts.force_quote_flags[attnum - 1]);
+		}
+	}
+
+	CopyToTextSendEndOfRow(cstate);
+}
+
+/*
+ * CopyToTextEnd
+ *
+ * End of COPY TO for text/CSV format.
+ */
+static void
+CopyToTextEnd(CopyToState cstate)
+{
+	/* Nothing to do here */
+}
+
+/*
+ * CopyToRoutine implementation for "binary".
+ */
+
+/*
+ * CopyToBinaryStart
+ *
+ * Start of COPY TO for binary format.
+ */
+static void
+CopyToBinaryStart(CopyToState cstate, TupleDesc tupDesc)
+{
+	/* Generate header for a binary copy */
+	int32		tmp;
+
+	/* Signature */
+	CopySendData(cstate, BinarySignature, 11);
+	/* Flags field */
+	tmp = 0;
+	CopySendInt32(cstate, tmp);
+	/* No header extension */
+	tmp = 0;
+	CopySendInt32(cstate, tmp);
+}
+
+/*
+ * CopyToBinaryOutFunc
+ *
+ * Assign output function data for a relation's attribute in binary format.
+ */
+static void
+CopyToBinaryOutFunc(CopyToState cstate, Oid atttypid, FmgrInfo *finfo)
+{
+	Oid			func_oid;
+	bool		is_varlena;
+
+	/* Set output function for an attribute */
+	getTypeBinaryOutputInfo(atttypid, &func_oid, &is_varlena);
+	fmgr_info(func_oid, finfo);
+}
+
+/*
+ * CopyToBinaryOneRow
+ *
+ * Process one row for binary format.
+ */
+static void
+CopyToBinaryOneRow(CopyToState cstate, TupleTableSlot *slot)
+{
+	FmgrInfo   *out_functions = cstate->out_functions;
+	ListCell   *cur;
+
+	/* Binary per-tuple header */
+	CopySendInt16(cstate, list_length(cstate->attnumlist));
+
+	foreach(cur, cstate->attnumlist)
+	{
+		int			attnum = lfirst_int(cur);
+		Datum		value = slot->tts_values[attnum - 1];
+		bool		isnull = slot->tts_isnull[attnum - 1];
+
+		if (isnull)
+		{
+			CopySendInt32(cstate, -1);
+		}
+		else
+		{
+			bytea	   *outputbytes;
+
+			outputbytes = SendFunctionCall(&out_functions[attnum - 1], value);
+			CopySendInt32(cstate, VARSIZE(outputbytes) - VARHDRSZ);
+			CopySendData(cstate, VARDATA(outputbytes),
+						 VARSIZE(outputbytes) - VARHDRSZ);
+		}
+	}
+
+	CopySendEndOfRow(cstate);
+}
+
+/*
+ * CopyToBinaryEnd
+ *
+ * End of COPY TO for binary format.
+ */
+static void
+CopyToBinaryEnd(CopyToState cstate)
+{
+	/* Generate trailer for a binary copy */
+	CopySendInt16(cstate, -1);
+	/* Need to flush out the trailer */
+	CopySendEndOfRow(cstate);
+}
+
+/*
+ * CSV and text share the same implementation, at the exception of the
+ * output representation function.
+ */
+static const CopyToRoutine CopyToRoutineText = {
+	.CopyToStart = CopyToTextStart,
+	.CopyToOutFunc = CopyToTextOutFunc,
+	.CopyToOneRow = CopyToTextOneRow,
+	.CopyToEnd = CopyToTextEnd,
+};
+
+static const CopyToRoutine CopyToRoutineCSV = {
+	.CopyToStart = CopyToTextStart,
+	.CopyToOutFunc = CopyToTextOutFunc,
+	.CopyToOneRow = CopyToTextOneRow,
+	.CopyToEnd = CopyToTextEnd,
+};
+
+static const CopyToRoutine CopyToRoutineBinary = {
+	.CopyToStart = CopyToBinaryStart,
+	.CopyToOutFunc = CopyToBinaryOutFunc,
+	.CopyToOneRow = CopyToBinaryOneRow,
+	.CopyToEnd = CopyToBinaryEnd,
+};
+
+/*
+ * Define the COPY TO routines to use for a format.  This should be called
+ * after options are parsed.
+ */
+static const CopyToRoutine *
+CopyToGetRoutine(CopyFormatOptions opts)
+{
+	if (opts.csv_mode)
+		return &CopyToRoutineCSV;
+	else if (opts.binary)
+		return &CopyToRoutineBinary;
+
+	/* default is text */
+	return &CopyToRoutineText;
+}
 
 /*
  * Send copy start/stop messages for frontend copies.  These have changed
@@ -210,16 +499,6 @@ CopySendEndOfRow(CopyToState cstate)
 	switch (cstate->copy_dest)
 	{
 		case COPY_FILE:
-			if (!cstate->opts.binary)
-			{
-				/* Default line termination depends on platform */
-#ifndef WIN32
-				CopySendChar(cstate, '\n');
-#else
-				CopySendString(cstate, "\r\n");
-#endif
-			}
-
 			if (fwrite(fe_msgbuf->data, fe_msgbuf->len, 1,
 					   cstate->copy_file) != 1 ||
 				ferror(cstate->copy_file))
@@ -254,10 +533,6 @@ CopySendEndOfRow(CopyToState cstate)
 			}
 			break;
 		case COPY_FRONTEND:
-			/* The FE/BE protocol uses \n as newline for all platforms */
-			if (!cstate->opts.binary)
-				CopySendChar(cstate, '\n');
-
 			/* Dump the accumulated row as one CopyData message */
 			(void) pq_putmessage(PqMsg_CopyData, fe_msgbuf->data, fe_msgbuf->len);
 			break;
@@ -445,14 +720,8 @@ BeginCopyTo(ParseState *pstate,
 	/* Extract options from the statement node tree */
 	ProcessCopyOptions(pstate, &cstate->opts, false /* is_from */ , options);
 
-	/* Set output representation callback */
-	if (!cstate->opts.binary)
-	{
-		if (cstate->opts.csv_mode)
-			cstate->copy_attribute_out = CopyAttributeOutCSV;
-		else
-			cstate->copy_attribute_out = CopyAttributeOutText;
-	}
+	/* Set format routine */
+	cstate->routine = CopyToGetRoutine(cstate->opts);
 
 	/* Process the source/target relation or query */
 	if (rel)
@@ -791,19 +1060,10 @@ DoCopyTo(CopyToState cstate)
 	foreach(cur, cstate->attnumlist)
 	{
 		int			attnum = lfirst_int(cur);
-		Oid			out_func_oid;
-		bool		isvarlena;
 		Form_pg_attribute attr = TupleDescAttr(tupDesc, attnum - 1);
 
-		if (cstate->opts.binary)
-			getTypeBinaryOutputInfo(attr->atttypid,
-									&out_func_oid,
-									&isvarlena);
-		else
-			getTypeOutputInfo(attr->atttypid,
-							  &out_func_oid,
-							  &isvarlena);
-		fmgr_info(out_func_oid, &cstate->out_functions[attnum - 1]);
+		cstate->routine->CopyToOutFunc(cstate, attr->atttypid,
+									   &cstate->out_functions[attnum - 1]);
 	}
 
 	/*
@@ -816,54 +1076,7 @@ DoCopyTo(CopyToState cstate)
 											   "COPY TO",
 											   ALLOCSET_DEFAULT_SIZES);
 
-	if (cstate->opts.binary)
-	{
-		/* Generate header for a binary copy */
-		int32		tmp;
-
-		/* Signature */
-		CopySendData(cstate, BinarySignature, 11);
-		/* Flags field */
-		tmp = 0;
-		CopySendInt32(cstate, tmp);
-		/* No header extension */
-		tmp = 0;
-		CopySendInt32(cstate, tmp);
-	}
-	else
-	{
-		/*
-		 * For non-binary copy, we need to convert null_print to file
-		 * encoding, because it will be sent directly with CopySendString.
-		 */
-		if (cstate->need_transcoding)
-			cstate->opts.null_print_client = pg_server_to_any(cstate->opts.null_print,
-															  cstate->opts.null_print_len,
-															  cstate->file_encoding);
-
-		/* if a header has been requested send the line */
-		if (cstate->opts.header_line)
-		{
-			bool		hdr_delim = false;
-
-			foreach(cur, cstate->attnumlist)
-			{
-				int			attnum = lfirst_int(cur);
-				char	   *colname;
-
-				if (hdr_delim)
-					CopySendChar(cstate, cstate->opts.delim[0]);
-				hdr_delim = true;
-
-				colname = NameStr(TupleDescAttr(tupDesc, attnum - 1)->attname);
-
-				/* Ignore quotes */
-				cstate->copy_attribute_out(cstate, colname, false);
-			}
-
-			CopySendEndOfRow(cstate);
-		}
-	}
+	cstate->routine->CopyToStart(cstate, tupDesc);
 
 	if (cstate->rel)
 	{
@@ -902,13 +1115,7 @@ DoCopyTo(CopyToState cstate)
 		processed = ((DR_copy *) cstate->queryDesc->dest)->processed;
 	}
 
-	if (cstate->opts.binary)
-	{
-		/* Generate trailer for a binary copy */
-		CopySendInt16(cstate, -1);
-		/* Need to flush out the trailer */
-		CopySendEndOfRow(cstate);
-	}
+	cstate->routine->CopyToEnd(cstate);
 
 	MemoryContextDelete(cstate->rowcontext);
 
@@ -924,68 +1131,15 @@ DoCopyTo(CopyToState cstate)
 static void
 CopyOneRowTo(CopyToState cstate, TupleTableSlot *slot)
 {
-	bool		need_delim = false;
-	FmgrInfo   *out_functions = cstate->out_functions;
 	MemoryContext oldcontext;
-	ListCell   *cur;
-	char	   *string;
 
 	MemoryContextReset(cstate->rowcontext);
 	oldcontext = MemoryContextSwitchTo(cstate->rowcontext);
 
-	if (cstate->opts.binary)
-	{
-		/* Binary per-tuple header */
-		CopySendInt16(cstate, list_length(cstate->attnumlist));
-	}
-
 	/* Make sure the tuple is fully deconstructed */
 	slot_getallattrs(slot);
 
-	foreach(cur, cstate->attnumlist)
-	{
-		int			attnum = lfirst_int(cur);
-		Datum		value = slot->tts_values[attnum - 1];
-		bool		isnull = slot->tts_isnull[attnum - 1];
-
-		if (!cstate->opts.binary)
-		{
-			if (need_delim)
-				CopySendChar(cstate, cstate->opts.delim[0]);
-			need_delim = true;
-		}
-
-		if (isnull)
-		{
-			if (!cstate->opts.binary)
-				CopySendString(cstate, cstate->opts.null_print_client);
-			else
-				CopySendInt32(cstate, -1);
-		}
-		else
-		{
-			if (!cstate->opts.binary)
-			{
-				string = OutputFunctionCall(&out_functions[attnum - 1],
-											value);
-
-				cstate->copy_attribute_out(cstate, string,
-										   cstate->opts.force_quote_flags[attnum - 1]);
-			}
-			else
-			{
-				bytea	   *outputbytes;
-
-				outputbytes = SendFunctionCall(&out_functions[attnum - 1],
-											   value);
-				CopySendInt32(cstate, VARSIZE(outputbytes) - VARHDRSZ);
-				CopySendData(cstate, VARDATA(outputbytes),
-							 VARSIZE(outputbytes) - VARHDRSZ);
-			}
-		}
-	}
-
-	CopySendEndOfRow(cstate);
+	cstate->routine->CopyToOneRow(cstate, slot);
 
 	MemoryContextSwitchTo(oldcontext);
 }
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 91433d439b..d02a7773e3 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -473,6 +473,7 @@ ConvertRowtypeExpr
 CookedConstraint
 CopyDest
 CopyFormatOptions
+CopyFromRoutine
 CopyFromState
 CopyFromStateData
 CopyHeaderChoice
@@ -482,6 +483,7 @@ CopyMultiInsertInfo
 CopyOnErrorChoice
 CopySource
 CopyStmt
+CopyToRoutine
 CopyToState
 CopyToStateData
 Cost
-- 
2.43.0

Attachment: signature.asc
Description: PGP signature

Reply via email to