On Fri, Jan 26, 2024 at 2:30 AM Yugo NAGATA <nag...@sraoss.co.jp> wrote:

> On Fri, 26 Jan 2024 00:00:57 -0700
> "David G. Johnston" <david.g.johns...@gmail.com> wrote:
>
> > I will need to make this tweak and probably a couple others to my own
> > suggestions in 12 hours or so.
> >
>
>
And here is my v2.

Notably I choose to introduce the verbiage "soft error" and then define in
the ON_ERROR clause the specific soft error that matters here - "invalid
input syntax".

I also note the log message behavior when ignore mode is chosen.  I haven't
confirmed that it is accurate but that is readily tweaked if approved of.

David J.
From 2d656fe0a69ea1349472d88794c18f8a2e2d37e9 Mon Sep 17 00:00:00 2001
From: "David G. Johnston" <david.g.johns...@gmail.com>
Date: Thu, 25 Jan 2024 23:35:44 -0700
Subject: [PATCH] v2 Improve ON_ERROR verbiage in COPY documentation

---
 doc/src/sgml/ref/copy.sgml | 48 +++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..38ce0b0a4c 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -90,6 +90,13 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     in the <structname>pg_stat_progress_copy</structname> view. See
     <xref linkend="copy-progress-reporting"/> for details.
   </para>
+
+  <para>
+    By default, <command>COPY</command> will fail if it encounters an error
+    during processing. For use cases where a best-effort attempt at loading
+    the entire file is desired, the <literal>ON_ERROR</literal> clause can
+    be used to specify some other behavior.
+  </para>
  </refsect1>
 
  <refsect1>
@@ -378,17 +385,23 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
     <term><literal>ON_ERROR</literal></term>
     <listitem>
      <para>
-      Specifies which <replaceable class="parameter">
-      error_action</replaceable> to perform when there is malformed data in the input.
-      Currently, only <literal>stop</literal> (default) and <literal>ignore</literal>
-      values are supported.
-      If the <literal>stop</literal> value is specified,
-      <command>COPY</command> stops operation at the first error.
-      If the <literal>ignore</literal> value is specified,
-      <command>COPY</command> skips malformed data and continues copying data.
-      The option is allowed only in <command>COPY FROM</command>.
-      Only <literal>stop</literal> value is allowed when
-      using <literal>binary</literal> format.
+      Specifies which how to behave when encountering a soft error.
+      An <replaceable class="parameter">error_action</replaceable> value of
+      <literal>stop</literal> means fail the command, while
+      <literal>ignore</literal> means discard the input row and continue with the next one.
+      The default is <literal>stop</literal>
+     </para>
+     <para>
+      The <literal>ignore</literal> option is applicable only for <command>COPY FROM</command>
+      when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
+     </para>
+     <para>
+      The only relevant soft error is "invalid input syntax", which manifests when attempting
+      to create a column value from the text input.
+     </para>
+     <para>
+      An <literal>INFO</literal> level context message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one row was discarded.
      </para>
     </listitem>
    </varlistentry>
@@ -576,15 +589,12 @@ COPY <replaceable class="parameter">count</replaceable>
    </para>
 
    <para>
-    <command>COPY</command> stops operation at the first error when
-    <literal>ON_ERROR</literal> is not specified. This
-    should not lead to problems in the event of a <command>COPY
-    TO</command>, but the target table will already have received
-    earlier rows in a <command>COPY FROM</command>. These rows will not
-    be visible or accessible, but they still occupy disk space. This might
+    The <command>COPY FROM</command> command physically inserts input rows
+    into the table as it progresses.  If the command fails these rows are
+    left in a deleted state, still occupying disk space. This might
     amount to a considerable amount of wasted disk space if the failure
-    happened well into a large copy operation. You might wish to invoke
-    <command>VACUUM</command> to recover the wasted space.
+    happened well into a large copy operation. <command>VACUUM</command>
+    should be used to recover the wasted space.
    </para>
 
    <para>
-- 
2.34.1

Reply via email to