On 2020/03/10 11:36, Fujii Masao wrote:


On 2020/03/09 14:21, Magnus Hagander wrote:
On Sun, Mar 8, 2020 at 10:13 PM Kyotaro Horiguchi
<horikyota....@gmail.com> wrote:

At Fri, 6 Mar 2020 09:54:09 -0800, Magnus Hagander <mag...@hagander.net> wrote 
in
On Fri, Mar 6, 2020 at 1:51 AM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
I believe that the time required to estimate the backup size is not so large
in most cases, so in the above idea, most users don't need to specify more
option for the estimation. This is good for UI perspective.

OTOH, users who are worried about the estimation time can use
--no-estimate-backup-size option and skip the time-consuming estimation.

Personally, I think this is the best idea. it brings a "reasonable
default", since most people are not going to have this problem, and
yet a good way to get out from the issue for those that potentially
have it. Especially since we are now already showing the state that
"walsender is estimating the size", it should be easy enugh for people
to determine if they need to use this flag or not.

In nitpicking mode, I'd just call the flag --no-estimate-size -- it's
pretty clear things are about backups when you call pg_basebackup, and
it keeps the option a bit more reasonable in length.

+1

I agree to the negative option and the shortened name.  What if both
--no-estimate-size and -P are specifed?  Rejecting as conflicting
options or -P supercedes?  I would choose the former because we don't
know which of them has priority.

I would definitely prefer rejecting an invalid combination of options.

+1

So, I will make the patch adding support for --no-estimate-size option
in pg_basebackup.

Patch attached.

Regards,


--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 987580d6df..a35b690e33 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4388,10 +4388,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
      <entry><structfield>backup_total</structfield></entry>
      <entry><type>bigint</type></entry>
      <entry>
-      Total amount of data that will be streamed. If progress reporting
-      is not enabled in <application>pg_basebackup</application>
-      (i.e., <literal>--progress</literal> option is not specified),
-      this is <literal>0</literal>. Otherwise, this is estimated and
+      Total amount of data that will be streamed. This is estimated and
       reported as of the beginning of
       <literal>streaming database files</literal> phase. Note that
       this is only an approximation since the database
@@ -4399,7 +4396,10 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       and WAL log may be included in the backup later. This is always
       the same value as <structfield>backup_streamed</structfield>
       once the amount of data streamed exceeds the estimated
-      total size.
+      total size. If the estimation is disabled in
+      <application>pg_basebackup</application>
+      (i.e., <literal>--no-estimate-size</literal> option is specified),
+      this is always <literal>0</literal>.
      </entry>
     </row>
     <row>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml 
b/doc/src/sgml/ref/pg_basebackup.sgml
index 29bf2f9b97..33f9e69418 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -460,20 +460,14 @@ PostgreSQL documentation
         in this case the estimated target size will increase once it passes the
         total estimate without WAL.
        </para>
-       <para>
-        When this is enabled, the backup will start by enumerating the size of
-        the entire database, and then go back and send the actual contents.
-        This may make the backup take slightly longer, and in particular it
-        will take longer before the first data is sent.
-       </para>
        <para>
         Whether this is enabled or not, the
         <structname>pg_stat_progress_basebackup</structname> view
-        report the progress of the backup in the server side. But note
-        that the total amount of data that will be streamed is estimated
-        and reported only when this option is enabled. In other words,
-        <literal>backup_total</literal> column in the view always
-        indicates <literal>0</literal> if this option is disabled.
+        report the progress of the backup in the server side.
+       </para>
+       <para>
+        This option is not allowed when using
+        <option>--no-estimate-size</option>.
        </para>
       </listitem>
      </varlistentry>
@@ -552,6 +546,30 @@ PostgreSQL documentation
        </para>
       </listitem>
      </varlistentry>
+
+     <varlistentry>
+      <term><option>--no-estimate-size</option></term>
+      <listitem>
+       <para>
+        This option prevents the server from estimating the total
+        amount of backup data that will be streamed. In other words,
+        <literal>backup_total</literal> column in the
+        <structname>pg_stat_progress_basebackup</structname>
+        view always indicates <literal>0</literal> if this option is enabled.
+       </para>
+       <para>
+        When this is disabled, the backup will start by enumerating
+        the size of the entire database, and then go back and send
+        the actual contents. This may make the backup take slightly
+        longer, and in particular it will take longer before the first
+        data is sent. This option is useful to avoid such estimation
+        time if it's too long.
+       </para>
+       <para>
+        This option is not allowed when using <option>--progress</option>.
+       </para>
+      </listitem>
+     </varlistentry>
     </variablelist>
    </para>
 
@@ -767,6 +785,13 @@ PostgreSQL documentation
    permissions are enabled on the source cluster.
   </para>
 
+  <para>
+   <application>pg_basebackup</application> asks the server to estimate
+   the total amount of data that will be streamed by default (unless
+   <option>--no-estimate-size</option> is specified) in version 13 or later,
+   and does that only when <option>--progress</option> is specified in
+   the older versions.
+  </para>
  </refsect1>
 
  <refsect1>
diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
b/src/bin/pg_basebackup/pg_basebackup.c
index 48bd838803..1032b474b4 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -102,6 +102,11 @@ typedef void (*WriteDataCallback) (size_t nbytes, char 
*buf,
  */
 #define MINIMUM_VERSION_FOR_TEMP_SLOTS 100000
 
+/*
+ * --no-estimate-size option is supported from version 13.
+ */
+#define MINIMUM_VERSION_FOR_ESTIMATE_SIZE 130000
+
 /*
  * Different ways to include WAL
  */
@@ -121,6 +126,7 @@ static char *label = "pg_basebackup base backup";
 static bool noclean = false;
 static bool checksum_failure = false;
 static bool showprogress = false;
+static bool estimatesize = true;
 static int     verbose = 0;
 static int     compresslevel = 0;
 static IncludeWal includewal = STREAM_WAL;
@@ -386,6 +392,7 @@ usage(void)
        printf(_("      --no-slot          prevent creation of temporary 
replication slot\n"));
        printf(_("      --no-verify-checksums\n"
                         "                         do not verify checksums\n"));
+       printf(_("      --no-estimate-size do not estimate backup size in 
server side\n"));
        printf(_("  -?, --help             show this help, then exit\n"));
        printf(_("\nConnection options:\n"));
        printf(_("  -d, --dbname=CONNSTR   connection string\n"));
@@ -1738,10 +1745,17 @@ BaseBackup(void)
                        fprintf(stderr, "\n");
        }
 
+       /*
+        * In 12 or before, PROGRESS is specified in BASE_BACKUP command to
+        * estimate the backup size only when --progress option is specified.
+        */
+       if (PQserverVersion(conn) < MINIMUM_VERSION_FOR_ESTIMATE_SIZE)
+               estimatesize = showprogress;
+
        basebkp =
                psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s %s",
                                 escaped_label,
-                                showprogress ? "PROGRESS" : "",
+                                estimatesize ? "PROGRESS" : "",
                                 includewal == FETCH_WAL ? "WAL" : "",
                                 fastcheckpoint ? "FAST" : "",
                                 includewal == NO_WAL ? "" : "NOWAIT",
@@ -2066,6 +2080,7 @@ main(int argc, char **argv)
                {"waldir", required_argument, NULL, 1},
                {"no-slot", no_argument, NULL, 2},
                {"no-verify-checksums", no_argument, NULL, 3},
+               {"no-estimate-size", no_argument, NULL, 4},
                {NULL, 0, NULL, 0}
        };
        int                     c;
@@ -2234,6 +2249,9 @@ main(int argc, char **argv)
                        case 3:
                                verify_checksums = false;
                                break;
+                       case 4:
+                               estimatesize = false;
+                               break;
                        default:
 
                                /*
@@ -2356,6 +2374,14 @@ main(int argc, char **argv)
        }
 #endif
 
+       if (showprogress && !estimatesize)
+       {
+               pg_log_error("--progress and --no-estimate-size are 
incompatible options");
+               fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+                               progname);
+               exit(1);
+       }
+
        /* connection in replication mode to server */
        conn = GetConnection();
        if (!conn)

Reply via email to