On 2024-09-20 11:27, Fujii Masao wrote:
Thanks for your comments!
On 2024/09/19 23:16, torikoshia wrote:
- COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional
messages, default */
- COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+ COPY_LOG_VERBOSITY_SILENT = -1, /* logs none */
+ COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional
messages, default */
+ COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages
*/
Why do we need to assign specific numbers like -1 or 0 in this enum
definition?
CopyFormatOptions is initialized by palloc0() at the beginning of
ProcessCopyOptions().
The reason to assign specific numbers here is to assign
COPY_LOG_VERBOSITY_DEFAULT to 0 as default value and sort elements of
enum according to the amount of logging.
Understood.
BTW CopyFrom() also uses local variable skipped. It isn't reset like
file_fdw, but using local variable seems not necessary since we can
use cstate->num_errors here as well.
Sounds reasonable to me.
+ if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
+ cstate->escontext->error_occurred)
+ {
+ /*
+ * Soft error occurred, skip this tuple and
deal with error
+ * information according to ON_ERROR.
+ */
+ if (cstate->opts.on_error ==
COPY_ON_ERROR_IGNORE)
If COPY_ON_ERROR_IGNORE indicates tuple skipping, shouldn’t we not
only reset
error_occurred but also call "pgstat_progress_update_param" and
continue within
this block?
I may misunderstand your comment, but I thought it to behave as you
expect in the below codes:
The "on_error == COPY_ON_ERROR_IGNORE" condition isn't needed since
"on_error != COPY_ON_ERROR_STOP" is already checked, and on_error
accepts
only two values "ignore" and "stop." I assume you added it with
a future option in mind, like "set_to_null" (as discussed in another
thread).
However, I’m not sure how much this helps such future changes.
So, how about simplifying the code by replacing "on_error !=
COPY_ON_ERROR_STOP"
with "on_error == COPY_ON_ERROR_IGNORE" at the top and removing
the "on_error == COPY_ON_ERROR_IGNORE" check in the middle?
It could improve readability.
Thanks for the explanation and suggestion.
Since there is almost the same code in copyfrom.c, attached 0003 patch
for refactoring both.
+ for(;;)
+ {
Using "goto" here might improve readability instead of using a "for"
loop.
Hmm, AFAIU we need to return a slot here before the end of file is
reached.
```
--src/backend/executor/execMain.c [ExecutePlan()]
/*
* if the tuple is null, then we assume there is nothing
more to
* process so we just end the loop...
*/
if (TupIsNull(slot))
break;
```
When ignoring errors, we have to keep calling NextCopyFrom() until we
find a non error tuple or EOF and to do so calling NextCopyFrom() in
for loop seems straight forward.
Replacing the "for" loop using "goto" as follows is possible, but
seems not so readable because of the upward "goto":
Understood.
Attached v4 patches reflected these comments.
Thanks for updating the patches!
The tab-completion needs to be updated to support the "silent" option?
Yes, updated 0002 patch.
--
Regards,
--
Atsushi Torikoshi
NTT DATA Group Corporation
From 89a97461eab2e14ab84778d93f0b4e91999606df Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Tue, 24 Sep 2024 14:32:54 +0900
Subject: [PATCH v4 1/3] Add log_verbosity to 'silent'
Currently when on_error is set to ignore, COPY logs a NOTICE message
containing the ignored row count.
When using on_error option to file_fdw, it'd be better to omit this
message in some cases, such as frequent access to a malformed file
via the same foreign table.
As a preliminary step to add on_error option to file_fdw, this patch
adds new value 'silent' to log_verbosity and enables to silence
the message.
---
doc/src/sgml/ref/copy.sgml | 10 +++++++---
src/backend/commands/copy.c | 4 +++-
src/backend/commands/copyfrom.c | 3 ++-
src/include/commands/copy.h | 6 ++++--
src/test/regress/expected/copy2.out | 4 +++-
src/test/regress/sql/copy2.sql | 4 ++++
6 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 1518af8a04..d87684a5be 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -407,6 +407,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
<literal>verbose</literal>, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
conversion has failed is emitted for each discarded row.
+ When it is set to <literal>silent</literal>, no message is emitted
+ regarding ignored rows.
</para>
</listitem>
</varlistentry>
@@ -428,9 +430,11 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
<listitem>
<para>
Specify the amount of messages emitted by a <command>COPY</command>
- command: <literal>default</literal> or <literal>verbose</literal>. If
- <literal>verbose</literal> is specified, additional messages are emitted
- during processing.
+ command: <literal>default</literal>, <literal>verbose</literal>, or
+ <literal>silent</literal>.
+ If <literal>verbose</literal> is specified, additional messages are
+ emitted during processing.
+ <literal>silent</literal> suppresses both verbose and default messages.
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 3bb579a3a4..03eb7a4eba 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -427,9 +427,11 @@ defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
char *sval;
/*
- * Allow "default", or "verbose" values.
+ * Allow "silent", "default", or "verbose" values.
*/
sval = defGetString(def);
+ if (pg_strcasecmp(sval, "silent") == 0)
+ return COPY_LOG_VERBOSITY_SILENT;
if (pg_strcasecmp(sval, "default") == 0)
return COPY_LOG_VERBOSITY_DEFAULT;
if (pg_strcasecmp(sval, "verbose") == 0)
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 2d3462913e..47879994f7 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1320,7 +1320,8 @@ CopyFrom(CopyFromState cstate)
error_context_stack = errcallback.previous;
if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0)
+ cstate->num_errors > 0 &&
+ cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
ereport(NOTICE,
errmsg_plural("%llu row was skipped due to data type incompatibility",
"%llu rows were skipped due to data type incompatibility",
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 141fd48dc1..9fb13e952d 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -45,8 +45,10 @@ typedef enum CopyOnErrorChoice
*/
typedef enum CopyLogVerbosityChoice
{
- COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
- COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+ COPY_LOG_VERBOSITY_SILENT = -1, /* logs none */
+ COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages. As this is
+ the default, assign 0 */
+ COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
} CopyLogVerbosityChoice;
/*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 61a19cdc4c..4e752977b5 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -760,6 +760,7 @@ COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
NOTICE: skipping row due to data type incompatibility at line 2 for column "l": null input
CONTEXT: COPY check_ign_err2
NOTICE: 1 row was skipped due to data type incompatibility
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity silent);
-- reset context choice
\set SHOW_CONTEXT errors
SELECT * FROM check_ign_err;
@@ -774,7 +775,8 @@ SELECT * FROM check_ign_err2;
n | m | k | l
---+-----+---+-------
1 | {1} | 1 | 'foo'
-(1 row)
+ 3 | {3} | 3 | 'bar'
+(2 rows)
-- test datatype error that can't be handled as soft: should fail
CREATE TABLE hard_err(foo widget);
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 8b14962194..fa6aa17344 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -533,6 +533,10 @@ COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
1 {1} 1 'foo'
2 {2} 2 \N
\.
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity silent);
+3 {3} 3 'bar'
+4 {4} 4 \N
+\.
-- reset context choice
\set SHOW_CONTEXT errors
base-commit: bbba59e69a56e1622e270f5e47b402c3a904cefc
--
2.39.2
From c3ad3211c6660f74abda30b0607da6cf9f7fb2b0 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Tue, 24 Sep 2024 14:36:17 +0900
Subject: [PATCH v4 2/3] Add on_error and log_verbosity options to file_fdw
9e2d870, b725b7e and f5a2278 introduced on_error and log_verbosity to COPY.
Since these options seem useful to file_fdw when accessing dirty data,
this patch adds them to file_fdw.
---
contrib/file_fdw/expected/file_fdw.out | 18 ++++++++
contrib/file_fdw/file_fdw.c | 60 +++++++++++++++++++++++---
contrib/file_fdw/sql/file_fdw.sql | 6 +++
doc/src/sgml/file-fdw.sgml | 23 ++++++++++
src/bin/psql/tab-complete.c | 2 +-
5 files changed, 101 insertions(+), 8 deletions(-)
diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 86c148a86b..2ea0deeb10 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -206,6 +206,24 @@ SELECT * FROM agg_csv c JOIN agg_text t ON (t.a = c.a) ORDER BY c.a;
SELECT * FROM agg_bad; -- ERROR
ERROR: invalid input syntax for type real: "aaa"
CONTEXT: COPY agg_bad, line 3, column b: "aaa"
+-- on_error and log_verbosity tests
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'ignore');
+SELECT * FROM agg_bad;
+NOTICE: 1 row was skipped due to data type incompatibility
+ a | b
+-----+--------
+ 100 | 99.097
+ 42 | 324.78
+(2 rows)
+
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD log_verbosity 'silent');
+SELECT * FROM agg_bad;
+ a | b
+-----+--------
+ 100 | 99.097
+ 42 | 324.78
+(2 rows)
+
-- misc query tests
\t on
SELECT explain_filter('EXPLAIN (VERBOSE, COSTS FALSE) SELECT * FROM agg_csv');
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index d16821f8e1..82fd9cbe55 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -22,8 +22,10 @@
#include "catalog/pg_authid.h"
#include "catalog/pg_foreign_table.h"
#include "commands/copy.h"
+#include "commands/copyfrom_internal.h"
#include "commands/defrem.h"
#include "commands/explain.h"
+#include "commands/progress.h"
#include "commands/vacuum.h"
#include "foreign/fdwapi.h"
#include "foreign/foreign.h"
@@ -34,6 +36,7 @@
#include "optimizer/planmain.h"
#include "optimizer/restrictinfo.h"
#include "utils/acl.h"
+#include "utils/backend_progress.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "utils/sampling.h"
@@ -74,6 +77,8 @@ static const struct FileFdwOption valid_options[] = {
{"null", ForeignTableRelationId},
{"default", ForeignTableRelationId},
{"encoding", ForeignTableRelationId},
+ {"on_error", ForeignTableRelationId},
+ {"log_verbosity", ForeignTableRelationId},
{"force_not_null", AttributeRelationId},
{"force_null", AttributeRelationId},
@@ -725,12 +730,12 @@ fileIterateForeignScan(ForeignScanState *node)
ExprContext *econtext;
MemoryContext oldcontext;
TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
- bool found;
+ CopyFromState cstate = festate->cstate;
ErrorContextCallback errcallback;
/* Set up callback to identify error line number. */
errcallback.callback = CopyFromErrorCallback;
- errcallback.arg = (void *) festate->cstate;
+ errcallback.arg = (void *) cstate;
errcallback.previous = error_context_stack;
error_context_stack = &errcallback;
@@ -751,10 +756,40 @@ fileIterateForeignScan(ForeignScanState *node)
* switch in case we are doing that.
*/
oldcontext = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
- found = NextCopyFrom(festate->cstate, econtext,
- slot->tts_values, slot->tts_isnull);
- if (found)
+
+ for(;;)
+ {
+ if (!NextCopyFrom(cstate, econtext,
+ slot->tts_values, slot->tts_isnull))
+ break;
+
+ if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
+ cstate->escontext->error_occurred)
+ {
+ /*
+ * Soft error occurred, skip this tuple and deal with error
+ * information according to ON_ERROR.
+ */
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+
+ /*
+ * Just make ErrorSaveContext ready for the next NextCopyFrom.
+ * Since we don't set details_wanted and error_data is not to
+ * be filled, just resetting error_occurred is enough.
+ */
+ cstate->escontext->error_occurred = false;
+
+ /* Report that this tuple was skipped by the ON_ERROR clause */
+ pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
+ cstate->num_errors);
+
+ /* Repeat NextCopyFrom() until no soft error occurs */
+ continue;
+ }
+
ExecStoreVirtualTuple(slot);
+ break;
+ }
/* Switch back to original memory context */
MemoryContextSwitchTo(oldcontext);
@@ -796,8 +831,19 @@ fileEndForeignScan(ForeignScanState *node)
FileFdwExecutionState *festate = (FileFdwExecutionState *) node->fdw_state;
/* if festate is NULL, we are in EXPLAIN; nothing to do */
- if (festate)
- EndCopyFrom(festate->cstate);
+ if (!festate)
+ return;
+
+ if (festate->cstate->opts.on_error != COPY_ON_ERROR_STOP &&
+ festate->cstate->num_errors > 0 &&
+ festate->cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
+ ereport(NOTICE,
+ errmsg_plural("%llu row was skipped due to data type incompatibility",
+ "%llu rows were skipped due to data type incompatibility",
+ (unsigned long long) festate->cstate->num_errors,
+ (unsigned long long) festate->cstate->num_errors));
+
+ EndCopyFrom(festate->cstate);
}
/*
diff --git a/contrib/file_fdw/sql/file_fdw.sql b/contrib/file_fdw/sql/file_fdw.sql
index f0548e14e1..a417067e54 100644
--- a/contrib/file_fdw/sql/file_fdw.sql
+++ b/contrib/file_fdw/sql/file_fdw.sql
@@ -150,6 +150,12 @@ SELECT * FROM agg_csv c JOIN agg_text t ON (t.a = c.a) ORDER BY c.a;
-- error context report tests
SELECT * FROM agg_bad; -- ERROR
+-- on_error and log_verbosity tests
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'ignore');
+SELECT * FROM agg_bad;
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD log_verbosity 'silent');
+SELECT * FROM agg_bad;
+
-- misc query tests
\t on
SELECT explain_filter('EXPLAIN (VERBOSE, COSTS FALSE) SELECT * FROM agg_csv');
diff --git a/doc/src/sgml/file-fdw.sgml b/doc/src/sgml/file-fdw.sgml
index f2f2af9a59..bb3579b077 100644
--- a/doc/src/sgml/file-fdw.sgml
+++ b/doc/src/sgml/file-fdw.sgml
@@ -126,6 +126,29 @@
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>on_error</literal></term>
+
+ <listitem>
+ <para>
+ Specifies how to behave when encountering an error converting a column's
+ input value into its data type,
+ the same as <command>COPY</command>'s <literal>ON_ERROR</literal> option.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>log_verbosity</literal></term>
+
+ <listitem>
+ <para>
+ Specifies the amount of messages emitted by <literal>file_fdw</literal>,
+ the same as <command>COPY</command>'s <literal>LOG_VERBOSITY</literal> option.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
<para>
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index a7ccde6d7d..6530b0f1ce 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2916,7 +2916,7 @@ psql_completion(const char *text, int start, int end)
/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
- COMPLETE_WITH("default", "verbose");
+ COMPLETE_WITH("silent", "default", "verbose");
/* Complete COPY <sth> FROM <sth> WITH (<options>) */
else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
--
2.39.2
From 096e22c705b165cc1c8f6850fee1e6460a2b3a9d Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Tue, 24 Sep 2024 14:40:56 +0900
Subject: [PATCH v4 3/3] Refactor CopyOnErrorChoice option handling
Currently we assume that the possible values of CopyOnErrorChoice will
increase in the futue, but it decreases the readability.
This patch simplifies the logic using the condition that CopyOnErrorChoice
has only 2 options now.
---
contrib/file_fdw/file_fdw.c | 18 ++++++------------
src/backend/commands/copyfrom.c | 17 ++++++-----------
2 files changed, 12 insertions(+), 23 deletions(-)
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 82fd9cbe55..2cf1f2cc85 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -763,27 +763,21 @@ fileIterateForeignScan(ForeignScanState *node)
slot->tts_values, slot->tts_isnull))
break;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE &&
cstate->escontext->error_occurred)
{
/*
- * Soft error occurred, skip this tuple and deal with error
- * information according to ON_ERROR.
+ * Soft error occurred, skip this tuple and just make
+ * ErrorSaveContext ready for the next NextCopyFrom. Since we don't
+ * set details_wanted and error_data is not to be filled, just
+ * resetting error_occurred is enough.
*/
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
-
- /*
- * Just make ErrorSaveContext ready for the next NextCopyFrom.
- * Since we don't set details_wanted and error_data is not to
- * be filled, just resetting error_occurred is enough.
- */
- cstate->escontext->error_occurred = false;
+ cstate->escontext->error_occurred = false;
/* Report that this tuple was skipped by the ON_ERROR clause */
pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
cstate->num_errors);
- /* Repeat NextCopyFrom() until no soft error occurs */
continue;
}
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 47879994f7..6b5192cbed 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1004,21 +1004,16 @@ CopyFrom(CopyFromState cstate)
if (!NextCopyFrom(cstate, econtext, myslot->tts_values, myslot->tts_isnull))
break;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE &&
cstate->escontext->error_occurred)
{
/*
- * Soft error occurred, skip this tuple and deal with error
- * information according to ON_ERROR.
+ * Soft error occurred, skip this tuple and just make
+ * ErrorSaveContext ready for the next NextCopyFrom. Since we don't
+ * set details_wanted and error_data is not to be filled, just
+ * resetting error_occurred is enough.
*/
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
-
- /*
- * Just make ErrorSaveContext ready for the next NextCopyFrom.
- * Since we don't set details_wanted and error_data is not to
- * be filled, just resetting error_occurred is enough.
- */
- cstate->escontext->error_occurred = false;
+ cstate->escontext->error_occurred = false;
/* Report that this tuple was skipped by the ON_ERROR clause */
pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
--
2.39.2