Re: [DOCS] Updated docs on base backups

Amir Rohan Sat, 26 Sep 2015 18:24:39 -0700

On 09/26/2015 12:02 PM, Amir Rohan wrote:
> Hi all,
> 
> See attached changes to the current docs on the subject. They have clearly
> been reworked by multiple people piecemeal and had many issues
> which made them less than a joy to read; in fact they were damn frustrating
> to read.


Updated version, with missing step for file system level backup
revised and added back.

Amir

From fcae4deeda20621b04c8ca33bb5269ba40251b77 Mon Sep 17 00:00:00 2001
From: root <root@localhost.localdomain>
Date: Sat, 26 Sep 2015 11:50:43 +0300
Subject: [PATCH] Rewritten documentation on base backups V2


diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 7413666..5b6d71c 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -427,8 +427,8 @@ tar -cf backup.tar /usr/local/pgsql/data
   <para>
    If simultaneous snapshots are not possible, one option is to shut down
    the database server long enough to establish all the frozen snapshots.
-   Another option is to perform a continuous archiving base backup (<xref
-   linkend="backup-base-backup">) because such backups are immune to file
+   Another option is to create a base backup using the continuous archiving feature
+   (<xref linkend="backup-base-backup">) because such backups are immune to file
    system changes during the backup.  This requires enabling continuous
    archiving just during the backup process; restore is done using
    continuous archive recovery (<xref linkend="backup-pitr-recovery">).
@@ -752,60 +752,65 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_xlog/
    <title>Making a Base Backup</title>
 
    <para>
-    The easiest way to perform a base backup is to use the
-    <xref linkend="app-pgbasebackup"> tool. It can create
-    a base backup either as regular files or as a tar archive. If more
-    flexibility than <xref linkend="app-pgbasebackup"> can provide is
-    required, you can also make a base backup using the low level API
-    (see <xref linkend="backup-lowlevel-base-backup">).
+    A base backup consists of one or more WAL files and a small textual
+    file containing associated metadata. Together with a file system
+    level backup, a base backup is all that's required to recreate the
+    database's state at some point in the past. Once a base backup is made,
+    the WAL files that precede its creation are no longer necessary in order
+    to recover the database to some later point in time.
+   </para>
+
+   <para>
+    The interval between base backups should usually be
+    chosen based on how much storage you want to expend on archived WAL
+    files, since you must keep all the archived WAL files back to your
+    last base backup.
+    You should also consider how long you are prepared to spend
+    recovering, if recovery should be necessary &mdash; the system will have to
+    replay all those WAL segments, and that could take awhile if it has
+    been a long time since the last base backup.
    </para>
 
    <para>
-    It is not necessary to be concerned about the amount of time it takes
-    to make a base backup. However, if you normally run the
-    server with <varname>full_page_writes</> disabled, you might notice a drop
-    in performance while the backup runs since <varname>full_page_writes</> is
-    effectively forced on during backup mode.
+    Creating a base backup may be a lengthy process if you have a lot of data.
+    Be aware that if you normally run the server with <varname>full_page_writes</>
+    disabled, you might notice a drop in performance while the backup runs since
+    <varname>full_page_writes</> is effectively forced on during backup mode.
    </para>
 
+
    <para>
     To make use of the backup, you will need to keep all the WAL
     segment files generated during and after the file system backup.
-    To aid you in doing this, the base backup process
-    creates a <firstterm>backup history file</> that is immediately
-    stored into the WAL archive area. This file is named after the first
-    WAL segment file that you need for the file system backup.
-    For example, if the starting WAL file is
-    <literal>0000000100001234000055CD</> the backup history file will be
-    named something like
-    <literal>0000000100001234000055CD.007C9330.backup</>. (The second
-    part of the file name stands for an exact position within the WAL
-    file, and can ordinarily be ignored.) Once you have safely archived
-    the file system backup and the WAL segment files used during the
-    backup (as specified in the backup history file), all archived WAL
-    segments with names numerically less are no longer needed to recover
-    the file system backup and can be deleted. However, you should
-    consider keeping several backup sets to be absolutely certain that
-    you can recover your data.
+    To aid you in doing this, the base backup process creates a
+    text file, termed a <firstterm>backup history file</>, which details
+    the range of WAL files making up the base backup, together with other
+    useful information such as the date of the backup, and the text label
+    associated with the backup which you provide when initiating the backup.
    </para>
 
    <para>
-    The backup history file is just a small text file. It contains the
-    label string you gave to <xref linkend="app-pgbasebackup">, as well as
-    the starting and ending times and WAL segments of the backup.
-    If you used the label to identify the associated dump file,
-    then the archived history file is enough to tell you which dump file to
-    restore.
+    The location of this file depends on the method used to perform the backup.
+    The simplest way to perform a base backup is by using
+    <xref linkend="app-pgbasebackup">, which uses <literal>postgres</>'s
+    replication mechanism to connect to a running database and create a backup
+    archive or directory. When using pg_basebackup, the backup history file is
+    named <filename>backup_label</> and can be found in the root of the backup
+    archive/directory created by pg_basebackup. Note that, because pg_basebackup
+    creates a single archive (or directory) to hold the backup, the
+    <filename>backup_label</> file lists only the backup's first WAL file, and
+    does not list the last WAL file in the series. As long as you provide the
+    <literal>-x</> switch to pg_basebackup, it will fetch a copy of all the
+    required WAL files in the backup and save them in the created
+    archive/directory. It is recommended that you use the <literal>-l</> switch to
+    set a label for the backup. This will become part of the backup history file,
+    for future reference.
    </para>
 
    <para>
-    Since you have to keep around all the archived WAL files back to your
-    last base backup, the interval between base backups should usually be
-    chosen based on how much storage you want to expend on archived WAL
-    files.  You should also consider how long you are prepared to spend
-    recovering, if recovery should be necessary &mdash; the system will have to
-    replay all those WAL segments, and that could take awhile if it has
-    been a long time since the last base backup.
+    If more flexibility than <xref linkend="app-pgbasebackup"> can provide is
+    required, you can also make a base backup using the low level API
+    (see <xref linkend="backup-lowlevel-base-backup">).
    </para>
   </sect2>
 
@@ -833,82 +838,118 @@ SELECT pg_start_backup('label');
      where <literal>label</> is any string you want to use to uniquely
      identify this backup operation.  (One good practice is to use the
      full path where you intend to put the backup dump file.)
-     <function>pg_start_backup</> creates a <firstterm>backup label</> file,
-     called <filename>backup_label</>, in the cluster directory with
-     information about your backup, including the start time and label
-     string.  The function also creates a <firstterm>tablespace map</> file,
-     called <filename>tablespace_map</>, in the cluster directory with
+     After you initiate <function>pg_start_backup</>, the backup history file
+     (see previous section) describing your backup will be created in the root
+     of your cluster directory with the name <filename>backup_label</>.
+     This file will be moved to the archive directory under a new name
+     when the backup ends (if archiving is not enabled, it is simply deleted then).
+     <function>pg_start_backup</> also creates a <firstterm>tablespace map</> file,
+     called <filename>tablespace_map</>, in the cluster directory, with
      information about tablespace symbolic links in <filename>pg_tblspc/</>
      if one or more such link is present.  Both files are critical to the
      integrity of the backup, should you need to restore from it.
     </para>
 
     <para>
-     It does not matter which database within the cluster you connect to to
-     issue this command.  You can ignore the result returned by the function;
-     but if it reports an error, deal with that before proceeding.
+     It does not matter which database within the cluster you are connected
+     to when you issue this command; the base backup always backs up an entire
+     server.
+     If <function>pg_start_backup</> reports an error, you should resolve the
+     issue before proceeding. The result returned by the function is
+     an identifier for the first WAL file in the base backup. You can
+     use the <function>pg_xlogfile_name</> function to get the filename
+     for the actual WAL file it identifies.
     </para>
 
     <para>
-     By default, <function>pg_start_backup</> can take a long time to finish.
-     This is because it performs a checkpoint, and the I/O
-     required for the checkpoint will be spread out over a significant
-     period of time, by default half your inter-checkpoint interval
+     <function>pg_start_backup</> performs a checkpoint in preparation
+     for the backup, spreading the I/O involved over a period
+     of time to minimize the impact on queries during the backup process.
+     By default, this period is set at half your inter-checkpoint interval
      (see the configuration parameter
-     <xref linkend="guc-checkpoint-completion-target">).  This is
-     usually what you want, because it minimizes the impact on query
-     processing.  If you want to start the backup as soon as
-     possible, use:
+     <xref linkend="guc-checkpoint-completion-target">).
+     This can make the backup process lengthy, but is usually desirable
+     because it minimizes the impact on query processing.
+     If you prefer to start the backup as soon as possible, with
+     possibly a larger impact on your server's performance during
+     the backup, use:
 <programlisting>
 SELECT pg_start_backup('label', true);
 </programlisting>
-     This forces the checkpoint to be done as quickly as possible.
+     This forces the checkpoint to be done as quickly as possible.
     </para>
    </listitem>
    <listitem>
     <para>
-     Perform the backup, using any convenient file-system-backup tool
+     Perform a file system backup of your cluster directory;
+     see <xref linkend="backup-file">.
+     You can use any convenient file-system-backup tool
      such as <application>tar</> or <application>cpio</> (not
      <application>pg_dump</application> or
-     <application>pg_dumpall</application>).  It is neither
-     necessary nor desirable to stop normal operation of the database
-     while you do this.
-    </para>
-   </listitem>
+     <application>pg_dumpall</application>).  Note that, when used in conjunction
+     with a base backup, it is neither necessary nor desirable to stop normal
+     operation of the database while you create the file system backup.
+     </para>
+    </listitem>
    <listitem>
     <para>
-     Again connect to the database as a superuser, and issue the command:
+     Once the file system backup has finished, connect to the database
+     as a superuser again, and issue the command:
 <programlisting>
 SELECT pg_stop_backup();
 </programlisting>
-     This terminates the backup mode and performs an automatic switch to
-     the next WAL segment.  The reason for the switch is to arrange for
-     the last WAL segment file written during the backup interval to be
-     ready to archive.
+     This completes the backup and forces a switch to the next WAL segment.
+     The reason for the switch is so that the last WAL segment file written to
+     during the backup interval is released and becomes the last WAL file in the
+     sequence comprising the base backup, which can now be backed up.
+     Again, the result returned is an identifier for a WAL file, this time the
+     last in the sequence making up the base backup, and again you can use the
+     <function>pg_xlogfile_name</> function to get the filename for the actual
+     WAL file it identifies.
     </para>
    </listitem>
    <listitem>
     <para>
-     Once the WAL segment files active during the backup are archived, you are
-     done.  The file identified by <function>pg_stop_backup</>'s result is
-     the last segment that is required to form a complete set of backup files.
-     If <varname>archive_mode</> is enabled,
-     <function>pg_stop_backup</> does not return until the last segment has
-     been archived.
-     Archiving of these files happens automatically since you have
-     already configured <varname>archive_command</>. In most cases this
-     happens quickly, but you are advised to monitor your archive
-     system to ensure there are no delays.
-     If the archive process has fallen behind
-     because of failures of the archive command, it will keep retrying
-     until the archive succeeds and the backup is complete.
-     If you wish to place a time limit on the execution of
-     <function>pg_stop_backup</>, set an appropriate
-     <varname>statement_timeout</varname> value.
+    Because the base backup was created with continuous archiving enabled,
+    the WAL files comprising the base backup and the associated backup
+    history file should now be archived in the usual way by
+    <literal>postgres</>'s continuous archiving feature.  Note that during
+    archiving the backup history file, <filename>backup_label</>, that
+    appeared in your cluster directory during the backup is moved to the
+    archive directory and <emphasis>renamed</emphasis>. This file is
+    part of the base backup. The file's extension is changed to <literal>.backup</>,
+    and the filename is changed to match the first WAL segment file in
+    the base backup. For example, if the first WAL file in the range is
+    <literal>0000000100001234000055CD</> the backup history file will be
+    named similarly to <literal>0000000100001234000055CD.007C9330.backup</>.
+    (The second part of the file name stands for an exact position within
+    the WAL file, and can ordinarily be ignored.)
     </para>
-   </listitem>
-  </orderedlist>
+
+    <para>
+    Once all the WAL files included in the range listed inside the backup
+    history file, as well as the backup history file itself, have been
+    archived, the base backup is complete. <function>pg_stop_backup</> will
+    not return until the last segment has been archived. In most cases this
+    happens quickly, but you are advised to monitor your archive system to
+    ensure there are no delays. If the archive process has fallen behind because
+    of failures of the archive command, it will keep retrying until the archive
+    succeeds and the backup is complete.
+    If you wish to place a time limit on the execution of
+    <function>pg_stop_backup</>, set an appropriate
+    <varname>statement_timeout</varname> value.
+    </para>
+
+    <para>
+    Once archiving is complete, all archived WAL segments with names
+    numerically less than that of the first WAL segment file in the base
+    backup are no longer needed to recover the file system backup and can
+    be deleted. However, you should consider keeping several backup sets
+    to be absolutely certain that you can recover your data.
    </para>
+  </listitem>
+  </orderedlist>
+  </para>
 
    <para>
     Some file system backup tools emit warnings or errors
-- 
2.4.3

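For illustration only, and not part of the patch: a rough Python sketch of the
cleanup rule described in the last paragraph of the patched text, i.e. that
archived WAL segments whose names are numerically less than the first segment
of the base backup are no longer needed to recover that backup. It assumes the
naming shown in the docs (a 24-hex-digit segment name, with the backup history
file adding an 8-hex-digit position and a .backup extension) and a single
timeline. The function names first_segment and removable_segments are
hypothetical, made up for this example.

```python
import re

# Name patterns as shown in the docs example:
#   WAL segment:         0000000100001234000055CD
#   backup history file: 0000000100001234000055CD.007C9330.backup
HISTORY_RE = re.compile(r"^([0-9A-F]{24})\.([0-9A-F]{8})\.backup$")
SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")


def first_segment(history_name):
    """Return the WAL segment name embedded in a backup history file name."""
    match = HISTORY_RE.match(history_name)
    if match is None:
        raise ValueError("not a backup history file name: %r" % history_name)
    return match.group(1)


def removable_segments(archived_names, history_name):
    """List archived WAL segments named numerically below the backup's first segment.

    Assumption: every name belongs to the same timeline, so comparing the
    whole name as one hexadecimal number is meaningful.
    """
    cutoff = int(first_segment(history_name), 16)
    return sorted(
        name
        for name in archived_names
        if SEGMENT_RE.match(name) and int(name, 16) < cutoff
    )


if __name__ == "__main__":
    archive = [
        "0000000100001234000055CB",
        "0000000100001234000055CC",
        "0000000100001234000055CD",
        "0000000100001234000055CD.007C9330.backup",
        "0000000100001234000055CE",
    ]
    history = "0000000100001234000055CD.007C9330.backup"
    # Only the two segments preceding ...55CD are reported.
    print(removable_segments(archive, history))
```

The sketch only reports candidates and never deletes anything. On a real
archive, the pg_archivecleanup tool shipped with PostgreSQL is probably the
better choice.
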
-- 
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs
