On 2024-09-11 23:50, Fujii Masao wrote:
Thanks for your comments!
Here are the review comments for 0001 patch.
+ <literal>silent</literal> excludes verbose messages.
This should clarify that in silent mode, not only verbose messages but
also
default ones are suppressed?
Agreed.
+ cstate->opts.log_verbosity !=
COPY_LOG_VERBOSITY_SILENT)
I think using "cstate->opts.log_verbosity >=
COPY_LOG_VERBOSITY_DEFAULT" instead
might improve readability.
Agreed.
I've also modified a similar code in fileEndForeignScan() in 0002 patch.
- COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages,
default */
- COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+ COPY_LOG_VERBOSITY_SILENT = -1, /* logs none */
+ COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages,
default */
+ COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
Why do we need to assign specific numbers like -1 or 0 in this enum
definition?
CopyFormatOptions is initialized by palloc0() at the beginning of
ProcessCopyOptions().
The reason to assign specific numbers here is to assign
COPY_LOG_VERBOSITY_DEFAULT to 0 as default value and sort elements of
enum according to the amount of logging.
Should we assign 0 to COPY_LOG_VERBOSITY_SILENT and change
opts_out->log_verbosity after the initialization?
Here are the review comments for 0002 patch.
+
pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
+
++skipped);
The skipped tuple count isn't accurate because fileIterateForeignScan()
resets
"skipped" to 0.
Ugh. Fixed to use cstate->num_errors.
BTW CopyFrom() also uses local variable skipped. It isn't reset like
file_fdw, but using local variable seems not necessary since we can use
cstate->num_errors here as well.
I'm going to discuss it in another thread.
+ if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
+ cstate->escontext->error_occurred)
+ {
+ /*
+ * Soft error occurred, skip this tuple and
deal with error
+ * information according to ON_ERROR.
+ */
+ if (cstate->opts.on_error ==
COPY_ON_ERROR_IGNORE)
If COPY_ON_ERROR_IGNORE indicates tuple skipping, shouldn’t we not only
reset
error_occurred but also call "pgstat_progress_update_param" and
continue within
this block?
I may misunderstand your comment, but I thought it to behave as you
expect in the below codes:
```
+ /* Report that this tuple was skipped by the ON_ERROR clause
*/
+ pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
+ ++skipped);
+
+ /* Repeat NextCopyFrom() until no soft error occurs */
+ continue;
```
+ for(;;)
+ {
Using "goto" here might improve readability instead of using a "for"
loop.
Hmm, AFAIU we need to return a slot here before the end of file is
reached.
```
--src/backend/executor/execMain.c [ExecutePlan()]
/*
* if the tuple is null, then we assume there is nothing more
to
* process so we just end the loop...
*/
if (TupIsNull(slot))
break;
```
When ignoring errors, we have to keep calling NextCopyFrom() until we
find a non error tuple or EOF and to do so calling NextCopyFrom() in for
loop seems straight forward.
Replacing the "for" loop using "goto" as follows is possible, but seems
not so readable because of the upward "goto":
```
start_of_getting_tuple:
if (!NextCopyFrom(cstate, econtext,
slot->tts_values, slot->tts_isnull))
goto end_of_fileread;
if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
cstate->escontext->error_occurred)
{
/*
* Soft error occurred, skip this tuple and deal with error
* information according to ON_ERROR.
*/
if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
/*
* Just make ErrorSaveContext ready for the next
NextCopyFrom.
* Since we don't set details_wanted and error_data is not
to
* be filled, just resetting error_occurred is enough.
*/
cstate->escontext->error_occurred = false;
/* Report that this tuple was skipped by the ON_ERROR clause */
pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
cstate->num_errors);
goto start_of_fileread;
}
ExecStoreVirtualTuple(slot);
end_of_getting_tuple:
```
Attached v4 patches reflected these comments.
--
Regards,
--
Atsushi Torikoshi
NTT DATA Group Corporation
From 8643f8e97305bf2a979fe0b5460917c454d63911 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Thu, 19 Sep 2024 00:16:37 +0900
Subject: [PATCH v4 1/2] Add log_verbosity to 'silent'
When on_error is set to ignore, COPY currently logs a NOTICE message
containing the ignored row count.
When using on_error option to file_fdw, it'd be better to omit this
message in some cases, such as frequent access to a malformed file
via the same foreign table.
As a preliminary step to add on_error option to file_fdw, this patch
adds new value 'silent' to log_verbosity and enables to silence
the message.
---
doc/src/sgml/ref/copy.sgml | 10 +++++++---
src/backend/commands/copy.c | 4 +++-
src/backend/commands/copyfrom.c | 3 ++-
src/include/commands/copy.h | 6 ++++--
src/test/regress/expected/copy2.out | 4 +++-
src/test/regress/sql/copy2.sql | 4 ++++
6 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 1518af8a04..d87684a5be 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -407,6 +407,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
<literal>verbose</literal>, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
conversion has failed is emitted for each discarded row.
+ When it is set to <literal>silent</literal>, no message is emitted
+ regarding ignored rows.
</para>
</listitem>
</varlistentry>
@@ -428,9 +430,11 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
<listitem>
<para>
Specify the amount of messages emitted by a <command>COPY</command>
- command: <literal>default</literal> or <literal>verbose</literal>. If
- <literal>verbose</literal> is specified, additional messages are emitted
- during processing.
+ command: <literal>default</literal>, <literal>verbose</literal>, or
+ <literal>silent</literal>.
+ If <literal>verbose</literal> is specified, additional messages are
+ emitted during processing.
+ <literal>silent</literal> suppresses both verbose and default messages.
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 3bb579a3a4..03eb7a4eba 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -427,9 +427,11 @@ defGetCopyLogVerbosityChoice(DefElem *def, ParseState *pstate)
char *sval;
/*
- * Allow "default", or "verbose" values.
+ * Allow "silent", "default", or "verbose" values.
*/
sval = defGetString(def);
+ if (pg_strcasecmp(sval, "silent") == 0)
+ return COPY_LOG_VERBOSITY_SILENT;
if (pg_strcasecmp(sval, "default") == 0)
return COPY_LOG_VERBOSITY_DEFAULT;
if (pg_strcasecmp(sval, "verbose") == 0)
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 2d3462913e..47879994f7 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1320,7 +1320,8 @@ CopyFrom(CopyFromState cstate)
error_context_stack = errcallback.previous;
if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0)
+ cstate->num_errors > 0 &&
+ cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
ereport(NOTICE,
errmsg_plural("%llu row was skipped due to data type incompatibility",
"%llu rows were skipped due to data type incompatibility",
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 141fd48dc1..9fb13e952d 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -45,8 +45,10 @@ typedef enum CopyOnErrorChoice
*/
typedef enum CopyLogVerbosityChoice
{
- COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages, default */
- COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
+ COPY_LOG_VERBOSITY_SILENT = -1, /* logs none */
+ COPY_LOG_VERBOSITY_DEFAULT = 0, /* logs no additional messages. As this is
+ the default, assign 0 */
+ COPY_LOG_VERBOSITY_VERBOSE, /* logs additional messages */
} CopyLogVerbosityChoice;
/*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 61a19cdc4c..4e752977b5 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -760,6 +760,7 @@ COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
NOTICE: skipping row due to data type incompatibility at line 2 for column "l": null input
CONTEXT: COPY check_ign_err2
NOTICE: 1 row was skipped due to data type incompatibility
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity silent);
-- reset context choice
\set SHOW_CONTEXT errors
SELECT * FROM check_ign_err;
@@ -774,7 +775,8 @@ SELECT * FROM check_ign_err2;
n | m | k | l
---+-----+---+-------
1 | {1} | 1 | 'foo'
-(1 row)
+ 3 | {3} | 3 | 'bar'
+(2 rows)
-- test datatype error that can't be handled as soft: should fail
CREATE TABLE hard_err(foo widget);
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 8b14962194..fa6aa17344 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -533,6 +533,10 @@ COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity verbose);
1 {1} 1 'foo'
2 {2} 2 \N
\.
+COPY check_ign_err2 FROM STDIN WITH (on_error ignore, log_verbosity silent);
+3 {3} 3 'bar'
+4 {4} 4 \N
+\.
-- reset context choice
\set SHOW_CONTEXT errors
base-commit: 95d6e9af07d2e5af2fdd272e72b5b552bad3ea0a
--
2.39.2
From 64bc3c53605d2e71dab816aad31097644b63e6d6 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Thu, 19 Sep 2024 00:18:23 +0900
Subject: [PATCH v4 2/2] Add on_error and log_verbosity options to file_fdw
9e2d870, b725b7e and f5a2278 introduced on_error and log_verbosity to COPY.
Since these options seems useful to file_fdw when accessing dirty data,
this patch adds them to file_fdw.
---
contrib/file_fdw/expected/file_fdw.out | 18 ++++++++
contrib/file_fdw/file_fdw.c | 60 +++++++++++++++++++++++---
contrib/file_fdw/sql/file_fdw.sql | 6 +++
doc/src/sgml/file-fdw.sgml | 23 ++++++++++
4 files changed, 100 insertions(+), 7 deletions(-)
diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 86c148a86b..2ea0deeb10 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -206,6 +206,24 @@ SELECT * FROM agg_csv c JOIN agg_text t ON (t.a = c.a) ORDER BY c.a;
SELECT * FROM agg_bad; -- ERROR
ERROR: invalid input syntax for type real: "aaa"
CONTEXT: COPY agg_bad, line 3, column b: "aaa"
+-- on_error and log_verbosity tests
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'ignore');
+SELECT * FROM agg_bad;
+NOTICE: 1 row was skipped due to data type incompatibility
+ a | b
+-----+--------
+ 100 | 99.097
+ 42 | 324.78
+(2 rows)
+
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD log_verbosity 'silent');
+SELECT * FROM agg_bad;
+ a | b
+-----+--------
+ 100 | 99.097
+ 42 | 324.78
+(2 rows)
+
-- misc query tests
\t on
SELECT explain_filter('EXPLAIN (VERBOSE, COSTS FALSE) SELECT * FROM agg_csv');
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index d16821f8e1..82fd9cbe55 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -22,8 +22,10 @@
#include "catalog/pg_authid.h"
#include "catalog/pg_foreign_table.h"
#include "commands/copy.h"
+#include "commands/copyfrom_internal.h"
#include "commands/defrem.h"
#include "commands/explain.h"
+#include "commands/progress.h"
#include "commands/vacuum.h"
#include "foreign/fdwapi.h"
#include "foreign/foreign.h"
@@ -34,6 +36,7 @@
#include "optimizer/planmain.h"
#include "optimizer/restrictinfo.h"
#include "utils/acl.h"
+#include "utils/backend_progress.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "utils/sampling.h"
@@ -74,6 +77,8 @@ static const struct FileFdwOption valid_options[] = {
{"null", ForeignTableRelationId},
{"default", ForeignTableRelationId},
{"encoding", ForeignTableRelationId},
+ {"on_error", ForeignTableRelationId},
+ {"log_verbosity", ForeignTableRelationId},
{"force_not_null", AttributeRelationId},
{"force_null", AttributeRelationId},
@@ -725,12 +730,12 @@ fileIterateForeignScan(ForeignScanState *node)
ExprContext *econtext;
MemoryContext oldcontext;
TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
- bool found;
+ CopyFromState cstate = festate->cstate;
ErrorContextCallback errcallback;
/* Set up callback to identify error line number. */
errcallback.callback = CopyFromErrorCallback;
- errcallback.arg = (void *) festate->cstate;
+ errcallback.arg = (void *) cstate;
errcallback.previous = error_context_stack;
error_context_stack = &errcallback;
@@ -751,10 +756,40 @@ fileIterateForeignScan(ForeignScanState *node)
* switch in case we are doing that.
*/
oldcontext = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
- found = NextCopyFrom(festate->cstate, econtext,
- slot->tts_values, slot->tts_isnull);
- if (found)
+
+ for(;;)
+ {
+ if (!NextCopyFrom(cstate, econtext,
+ slot->tts_values, slot->tts_isnull))
+ break;
+
+ if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
+ cstate->escontext->error_occurred)
+ {
+ /*
+ * Soft error occurred, skip this tuple and deal with error
+ * information according to ON_ERROR.
+ */
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+
+ /*
+ * Just make ErrorSaveContext ready for the next NextCopyFrom.
+ * Since we don't set details_wanted and error_data is not to
+ * be filled, just resetting error_occurred is enough.
+ */
+ cstate->escontext->error_occurred = false;
+
+ /* Report that this tuple was skipped by the ON_ERROR clause */
+ pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
+ cstate->num_errors);
+
+ /* Repeat NextCopyFrom() until no soft error occurs */
+ continue;
+ }
+
ExecStoreVirtualTuple(slot);
+ break;
+ }
/* Switch back to original memory context */
MemoryContextSwitchTo(oldcontext);
@@ -796,8 +831,19 @@ fileEndForeignScan(ForeignScanState *node)
FileFdwExecutionState *festate = (FileFdwExecutionState *) node->fdw_state;
/* if festate is NULL, we are in EXPLAIN; nothing to do */
- if (festate)
- EndCopyFrom(festate->cstate);
+ if (!festate)
+ return;
+
+ if (festate->cstate->opts.on_error != COPY_ON_ERROR_STOP &&
+ festate->cstate->num_errors > 0 &&
+ festate->cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
+ ereport(NOTICE,
+ errmsg_plural("%llu row was skipped due to data type incompatibility",
+ "%llu rows were skipped due to data type incompatibility",
+ (unsigned long long) festate->cstate->num_errors,
+ (unsigned long long) festate->cstate->num_errors));
+
+ EndCopyFrom(festate->cstate);
}
/*
diff --git a/contrib/file_fdw/sql/file_fdw.sql b/contrib/file_fdw/sql/file_fdw.sql
index f0548e14e1..a417067e54 100644
--- a/contrib/file_fdw/sql/file_fdw.sql
+++ b/contrib/file_fdw/sql/file_fdw.sql
@@ -150,6 +150,12 @@ SELECT * FROM agg_csv c JOIN agg_text t ON (t.a = c.a) ORDER BY c.a;
-- error context report tests
SELECT * FROM agg_bad; -- ERROR
+-- on_error and log_verbosity tests
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'ignore');
+SELECT * FROM agg_bad;
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD log_verbosity 'silent');
+SELECT * FROM agg_bad;
+
-- misc query tests
\t on
SELECT explain_filter('EXPLAIN (VERBOSE, COSTS FALSE) SELECT * FROM agg_csv');
diff --git a/doc/src/sgml/file-fdw.sgml b/doc/src/sgml/file-fdw.sgml
index f2f2af9a59..bb3579b077 100644
--- a/doc/src/sgml/file-fdw.sgml
+++ b/doc/src/sgml/file-fdw.sgml
@@ -126,6 +126,29 @@
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>on_error</literal></term>
+
+ <listitem>
+ <para>
+ Specifies how to behave when encountering an error converting a column's
+ input value into its data type,
+ the same as <command>COPY</command>'s <literal>ON_ERROR</literal> option.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>log_verbosity</literal></term>
+
+ <listitem>
+ <para>
+ Specifies the amount of messages emitted by <literal>file_fdw</literal>,
+ the same as <command>COPY</command>'s <literal>LOG_VERBOSITY</literal> option.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
<para>
--
2.39.2