On 2024-09-25 00:46, Fujii Masao wrote:
Thanks for the comments!

On 2024/09/24 20:08, torikoshia wrote:
Thanks for the explanation and suggestion.
Since there is almost the same code in copyfrom.c, attached 0003 patch for refactoring both.

Thanks for updating the patches!

Regarding 0002.patch, I think it’s better to include the refactored code
from the start rather than adding redundant code intentionally.
How about leaving just the refactor in copyfrom.c to 0003.patch?
If that works, as a refactoring, you could also replace "skipped" with
"cstate->num_errors" in that patch, as you suggested earlier.

Agreed.

While reviewing again, I noticed that running ANALYZE on a file_fdw
foreign table also calls NextCopyFrom(), but it doesn’t seem to
skip erroneous rows when on_error is set to "ignore." This could lead
to inaccurate statistics. Shouldn’t ANALYZE on file_fdw foreign tables
with on_error=ignore also skip erroneous rows?

Thanks, it seems the right thing to do.

The tab-completion needs to be updated to support the "silent" option?

Yes, updated 0002 patch.

Thanks! Also, this should be part of 0001.patch since "silent" is
introduced there, right?

Agreed.

Updated the patches.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation
From 494447fcf99abab141c27e35171dc6d63f64c994 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Wed, 25 Sep 2024 21:28:15 +0900
Subject: [PATCH v5 1/3] Add log_verbosity to 'silent'

Currently when on_error is set to ignore, COPY logs a NOTICE message
containing the ignored row count.
When using on_error option to file_fdw, it'd be better to omit this
message in some cases, such as frequent access to a malformed file
via the same foreign table.
As a preliminary step to add on_error option to file_fdw, this patch
adds new value 'silent' to log_verbosity and enables to silence
the message.
---
 doc/src/sgml/ref/copy.sgml          | 10 +++++++---
 src/backend/commands/copy.c         |  4 +++-
 src/backend/commands/copyfrom.c     |  3 ++-
 src/bin/psql/tab-complete.c         |  2 +-
 src/include/commands/copy.h         |  6 ++++--
 src/test/regress/expected/copy2.out |  4 +++-
 src/test/regress/sql/copy2.sql      |  4 ++++
 7 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 1518af8a04..d87684a5be 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -407,6 +407,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
       <literal>verbose</literal>, a <literal>NOTICE</literal> message
       containing the line of the input file and the column name whose input
       conversion has failed is emitted for each discarded row.
+      When it is set to <literal>silent</literal>, no message is emitted
+      regarding ignored rows.
      </para>
     </listitem>
    </varlistentry>
@@ -428,9 +430,11 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     <listitem>
      <para>
       Specify the amount of messages emitted by a <command>COPY</command>
-      command: <literal>default</literal> or <literal>verbose</literal>. If
-      <literal>verbose</literal> is specified, additional messages are emitted
-      during processing.
+      command: <literal>default</literal>, <literal>verbose</literal>, or
+      <literal>silent</literal>.
+      If <literal>verbose</literal> is specified, additional messages are
+      emitted during processing.
+      <literal>silent</literal> suppresses both verbose and default messages.
      </para>
      <para>
       This is currently used in <command>COPY FROM</command> command when
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 3bb579a3a4..03eb7a4eba 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -427,9 +427,11 @@ defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
 	char	   *sval;
 
 	/*
-	 * Allow "default", or "verbose" values.
+	 * Allow "silent", "default", or "verbose" values.
 	 */
 	sval = defGetString(def);
+	if (pg_strcasecmp(sval, "silent") == 0)
+		return COPY_LOG_VERBOSITY_SILENT;
 	if (pg_strcasecmp(sval, "default") == 0)
 		return COPY_LOG_VERBOSITY_DEFAULT;
 	if (pg_strcasecmp(sval, "verbose") == 0)
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 2d3462913e..47879994f7 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1320,7 +1320,8 @@ CopyFrom(CopyFromState cstate)
 	error_context_stack = errcallback.previous;
 
 	if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
-		cstate->num_errors > 0)
+		cstate->num_errors > 0 &&
+		cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
 		ereport(NOTICE,
 				errmsg_plural("%llu row was skipped due to data type incompatibility",
 							  "%llu rows were skipped due to data type incompatibility",
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index a7ccde6d7d..6530b0f1ce 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2916,7 +2916,7 @@ psql_completion(const char *text, int start, int end)
 
 	/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
-		COMPLETE_WITH("default", "verbose");
+		COMPLETE_WITH("silent", "default", "verbose");
 
 	/* Complete COPY <sth> FROM <sth> WITH (<options>) */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 141fd48dc1..9fb13e952d 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -45,8 +45,10 @@ typedef enum CopyOnErrorChoice
  */
 typedef enum CopyLogVerbosityChoice
 {
-	COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
-	COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+	COPY_LOG_VERBOSITY_SILENT = -1,	/* logs none */
+	COPY_LOG_VERBOSITY_DEFAULT = 0,	/* logs no additional messages. As this is
+									   the default, assign 0 */
+	COPY_LOG_VERBOSITY_VERBOSE,	/* logs additional messages */
 } CopyLogVerbosityChoice;
 
 /*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 61a19cdc4c..4e752977b5 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -760,6 +760,7 @@ COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
 NOTICE:  skipping row due to data type incompatibility at line 2 for column "l": null input
 CONTEXT:  COPY check_ign_err2
 NOTICE:  1 row was skipped due to data type incompatibility
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity silent);
 -- reset context choice
 \set SHOW_CONTEXT errors
 SELECT * FROM check_ign_err;
@@ -774,7 +775,8 @@ SELECT * FROM check_ign_err2;
  n |  m  | k |   l   
 ---+-----+---+-------
  1 | {1} | 1 | 'foo'
-(1 row)
+ 3 | {3} | 3 | 'bar'
+(2 rows)
 
 -- test datatype error that can't be handled as soft: should fail
 CREATE TABLE hard_err(foo widget);
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 8b14962194..fa6aa17344 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -533,6 +533,10 @@ COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
 1	{1}	1	'foo'
 2	{2}	2	\N
 \.
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity silent);
+3	{3}	3	'bar'
+4	{4}	4	\N
+\.
 
 -- reset context choice
 \set SHOW_CONTEXT errors

base-commit: 1ab67c9dfaadda86059f405e5746efb6ddb9fe21
-- 
2.39.2

From dd8f20716524982e3687ea3e6a44a11e5f0b886c Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Wed, 25 Sep 2024 21:29:48 +0900
Subject: [PATCH v5 2/3] Add on_error and log_verbosity options to file_fdw

9e2d870, b725b7e and f5a2278 introduced on_error and log_verbosity to COPY.
Since these options seem useful to file_fdw when accessing dirty data,
this patch adds them to file_fdw.
---
 contrib/file_fdw/expected/file_fdw.out | 18 +++++++
 contrib/file_fdw/file_fdw.c            | 74 +++++++++++++++++++++++---
 contrib/file_fdw/sql/file_fdw.sql      |  6 +++
 doc/src/sgml/file-fdw.sgml             | 23 ++++++++
 4 files changed, 115 insertions(+), 6 deletions(-)

diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 86c148a86b..2ea0deeb10 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -206,6 +206,24 @@ SELECT * FROM agg_csv c JOIN agg_text t ON (t.a = c.a) ORDER BY c.a;
 SELECT * FROM agg_bad;               -- ERROR
 ERROR:  invalid input syntax for type real: "aaa"
 CONTEXT:  COPY agg_bad, line 3, column b: "aaa"
+-- on_error and log_verbosity tests
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'ignore');
+SELECT * FROM agg_bad;
+NOTICE:  1 row was skipped due to data type incompatibility
+  a  |   b    
+-----+--------
+ 100 | 99.097
+  42 | 324.78
+(2 rows)
+
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD log_verbosity 'silent');
+SELECT * FROM agg_bad;
+  a  |   b    
+-----+--------
+ 100 | 99.097
+  42 | 324.78
+(2 rows)
+
 -- misc query tests
 \t on
 SELECT explain_filter('EXPLAIN (VERBOSE, COSTS FALSE) SELECT * FROM agg_csv');
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index d16821f8e1..543b1ca4f9 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -22,8 +22,10 @@
 #include "catalog/pg_authid.h"
 #include "catalog/pg_foreign_table.h"
 #include "commands/copy.h"
+#include "commands/copyfrom_internal.h"
 #include "commands/defrem.h"
 #include "commands/explain.h"
+#include "commands/progress.h"
 #include "commands/vacuum.h"
 #include "foreign/fdwapi.h"
 #include "foreign/foreign.h"
@@ -34,6 +36,7 @@
 #include "optimizer/planmain.h"
 #include "optimizer/restrictinfo.h"
 #include "utils/acl.h"
+#include "utils/backend_progress.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
 #include "utils/sampling.h"
@@ -74,6 +77,8 @@ static const struct FileFdwOption valid_options[] = {
 	{"null", ForeignTableRelationId},
 	{"default", ForeignTableRelationId},
 	{"encoding", ForeignTableRelationId},
+	{"on_error", ForeignTableRelationId},
+	{"log_verbosity", ForeignTableRelationId},
 	{"force_not_null", AttributeRelationId},
 	{"force_null", AttributeRelationId},
 
@@ -726,11 +731,12 @@ fileIterateForeignScan(ForeignScanState *node)
 	MemoryContext oldcontext;
 	TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
 	bool		found;
+	CopyFromState cstate = festate->cstate;
 	ErrorContextCallback errcallback;
 
 	/* Set up callback to identify error line number. */
 	errcallback.callback = CopyFromErrorCallback;
-	errcallback.arg = (void *) festate->cstate;
+	errcallback.arg = (void *) cstate;
 	errcallback.previous = error_context_stack;
 	error_context_stack = &errcallback;
 
@@ -751,10 +757,36 @@ fileIterateForeignScan(ForeignScanState *node)
 	 * switch in case we are doing that.
 	 */
 	oldcontext = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
-	found = NextCopyFrom(festate->cstate, econtext,
-						 slot->tts_values, slot->tts_isnull);
-	if (found)
+
+	for(;;)
+	{
+		found = NextCopyFrom(cstate, econtext,
+					 slot->tts_values, slot->tts_isnull);
+		if (!found)
+			break;
+
+		if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE &&
+			cstate->escontext->error_occurred)
+		{
+			/*
+			 * Soft error occurred, skip this tuple and just make
+			 * ErrorSaveContext ready for the next NextCopyFrom. Since we don't
+			 * set details_wanted and error_data is not to be filled, just
+			 * resetting error_occurred is enough.
+			 */
+			cstate->escontext->error_occurred = false;
+
+			/* Report that this tuple was skipped by the ON_ERROR clause */
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
+										cstate->num_errors);
+
+			/* Repeat NextCopyFrom() until no soft error occurs */
+			continue;
+		}
+
 		ExecStoreVirtualTuple(slot);
+		break;
+	}
 
 	/* Switch back to original memory context */
 	MemoryContextSwitchTo(oldcontext);
@@ -796,8 +828,19 @@ fileEndForeignScan(ForeignScanState *node)
 	FileFdwExecutionState *festate = (FileFdwExecutionState *) node->fdw_state;
 
 	/* if festate is NULL, we are in EXPLAIN; nothing to do */
-	if (festate)
-		EndCopyFrom(festate->cstate);
+	if (!festate)
+		return;
+
+	if (festate->cstate->opts.on_error != COPY_ON_ERROR_STOP &&
+		festate->cstate->num_errors > 0 &&
+		festate->cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
+		ereport(NOTICE,
+				errmsg_plural("%llu row was skipped due to data type incompatibility",
+							  "%llu rows were skipped due to data type incompatibility",
+							  (unsigned long long) festate->cstate->num_errors,
+							  (unsigned long long) festate->cstate->num_errors));
+
+	EndCopyFrom(festate->cstate);
 }
 
 /*
@@ -1191,6 +1234,25 @@ file_acquire_sample_rows(Relation onerel, int elevel,
 		if (!found)
 			break;
 
+		if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE &&
+			cstate->escontext->error_occurred)
+		{
+			/*
+			 * Soft error occurred, skip this tuple and just make
+			 * ErrorSaveContext ready for the next NextCopyFrom. Since we don't
+			 * set details_wanted and error_data is not to be filled, just
+			 * resetting error_occurred is enough.
+			 */
+			cstate->escontext->error_occurred = false;
+
+			/* Report that this tuple was skipped by the ON_ERROR clause */
+			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
+										cstate->num_errors);
+
+			/* Repeat NextCopyFrom() until no soft error occurs */
+			continue;
+		}
+
 		/*
 		 * The first targrows sample rows are simply copied into the
 		 * reservoir.  Then we start replacing tuples in the sample until we
diff --git a/contrib/file_fdw/sql/file_fdw.sql b/contrib/file_fdw/sql/file_fdw.sql
index f0548e14e1..a417067e54 100644
--- a/contrib/file_fdw/sql/file_fdw.sql
+++ b/contrib/file_fdw/sql/file_fdw.sql
@@ -150,6 +150,12 @@ SELECT * FROM agg_csv c JOIN agg_text t ON (t.a = c.a) ORDER BY c.a;
 -- error context report tests
 SELECT * FROM agg_bad;               -- ERROR
 
+-- on_error and log_verbosity tests
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'ignore');
+SELECT * FROM agg_bad;
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD log_verbosity 'silent');
+SELECT * FROM agg_bad;
+
 -- misc query tests
 \t on
 SELECT explain_filter('EXPLAIN (VERBOSE, COSTS FALSE) SELECT * FROM agg_csv');
diff --git a/doc/src/sgml/file-fdw.sgml b/doc/src/sgml/file-fdw.sgml
index f2f2af9a59..bb3579b077 100644
--- a/doc/src/sgml/file-fdw.sgml
+++ b/doc/src/sgml/file-fdw.sgml
@@ -126,6 +126,29 @@
    </listitem>
   </varlistentry>
 
+  <varlistentry>
+   <term><literal>on_error</literal></term>
+
+   <listitem>
+    <para>
+     Specifies how to behave when encountering an error converting a column's
+     input value into its data type,
+     the same as <command>COPY</command>'s <literal>ON_ERROR</literal> option.
+    </para>
+   </listitem>
+  </varlistentry>
+
+  <varlistentry>
+   <term><literal>log_verbosity</literal></term>
+
+   <listitem>
+    <para>
+     Specifies the amount of messages emitted by <literal>file_fdw</literal>,
+     the same as <command>COPY</command>'s <literal>LOG_VERBOSITY</literal> option.
+    </para>
+   </listitem>
+  </varlistentry>
+
  </variablelist>
 
  <para>
-- 
2.39.2

From 4cc913a0965ddc07413b256ab979c50cbbfa8b1a Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Wed, 25 Sep 2024 21:30:26 +0900
Subject: [PATCH v5 3/3] Refactor copyfrom.c regarding ON_ERROR

Refactor copyfrom.c For the consistency with 0001 and 0002 patch:
- currently local variable 'skipped' is defined, but remove it because 
  we've already counted it as cstate->num_errors
- simplify the logic using the condtion that CopyOnErrorChoice only accepts
  two values
- add the same comment

---
 src/backend/commands/copyfrom.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 47879994f7..1cd004ffbc 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -657,7 +657,6 @@ CopyFrom(CopyFromState cstate)
 	CopyMultiInsertInfo multiInsertInfo = {0};	/* pacify compiler */
 	int64		processed = 0;
 	int64		excluded = 0;
-	int64		skipped = 0;
 	bool		has_before_insert_row_trig;
 	bool		has_instead_insert_row_trig;
 	bool		leafpart_use_multi_insert = false;
@@ -1004,26 +1003,22 @@ CopyFrom(CopyFromState cstate)
 		if (!NextCopyFrom(cstate, econtext, myslot->tts_values, myslot->tts_isnull))
 			break;
 
-		if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
+		if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE &&
 			cstate->escontext->error_occurred)
 		{
 			/*
-			 * Soft error occurred, skip this tuple and deal with error
-			 * information according to ON_ERROR.
+			 * Soft error occurred, skip this tuple and just make
+			 * ErrorSaveContext ready for the next NextCopyFrom. Since we don't
+			 * set details_wanted and error_data is not to be filled, just
+			 * resetting error_occurred is enough.
 			 */
-			if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
-
-				/*
-				 * Just make ErrorSaveContext ready for the next NextCopyFrom.
-				 * Since we don't set details_wanted and error_data is not to
-				 * be filled, just resetting error_occurred is enough.
-				 */
-				cstate->escontext->error_occurred = false;
+			cstate->escontext->error_occurred = false;
 
 			/* Report that this tuple was skipped by the ON_ERROR clause */
 			pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
-										 ++skipped);
+										 cstate->num_errors);
 
+			/* Repeat NextCopyFrom() until no soft error occurs */
 			continue;
 		}
 
-- 
2.39.2

Reply via email to