Re: [HACKERS] Allowing multiple concurrent base backups

Robert Haas Thu, 17 Mar 2011 12:39:47 -0700

On Mon, Jan 31, 2011 at 10:45 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Tue, Feb 1, 2011 at 1:31 AM, Heikki Linnakangas
> <heikki.linnakan...@enterprisedb.com> wrote:
>> Hmm, good point. It's harmless, but creating the history file in the first
>> place sure seems like a waste of time.
>
> The attached patch changes pg_stop_backup so that it doesn't create
> the backup history file if archiving is not enabled.
>
> When I tested multiple concurrent backups, I found that they can have the
> same checkpoint location and the same history file name.
>
> --------------------
> $ for ((i=0; i<4; i++)); do
> pg_basebackup -D test$i -c fast -x -l test$i &
> done
>
> $ cat test0/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test0
>
> $ cat test1/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test1
>
> $ cat test2/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test2
>
> $ cat test3/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test3
>
> $ ls archive/*.backup
> archive/000000010000000000000002.000000B0.backup
> --------------------
>
> This would cause a serious problem, because the backup-end record
> which indicates the same "START WAL LOCATION" can be written by the
> first backup before the others finish. So we might wrongly conclude that
> we've already reached a consistent state by reading the backup-end
> record (written by the first backup) before reading the last required WAL
> file.
>
>                /*
>                 * Force a CHECKPOINT.  Aside from being necessary to prevent torn
>                 * page problems, this guarantees that two successive backup runs will
>                 * have different checkpoint positions and hence different history
>                 * file names, even if nothing happened in between.
>                 *
>                 * We use CHECKPOINT_IMMEDIATE only if requested by user (via passing
>                 * fast = true).  Otherwise this can take awhile.
>                 */
>                RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |
>                                                  (fast ? CHECKPOINT_IMMEDIATE : 0));
>
> This problem happens because the above code (in do_pg_start_backup)
> doesn't actually ensure that concurrent backups have different
> checkpoint locations. ISTM that we should change the above or elsewhere
> to ensure that. Or we should include the backup label name in the
> backup-end record, to prevent a recovery from reading a backup-end record
> that is not its own.
>
> Thoughts?
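
To make the failure mode in the quoted report concrete, here is a toy
model in Python. It is only a sketch and is not PostgreSQL code: the
BackupEnd class, its label field, and consistency_index() are
hypothetical names invented for illustration, and the WAL is reduced to
a plain list. It assumes, as the report describes, that recovery
recognizes "its" backup-end record by comparing the start location.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BackupEnd:
    """Toy stand-in for a backup-end WAL record (hypothetical layout)."""
    start_loc: str               # START WAL LOCATION of the backup that ended
    label: Optional[str] = None  # hypothetical extra field: the backup label


def consistency_index(wal, start_loc, label=None):
    """Return the index of the first backup-end record accepted as 'ours'.

    With label=None only the start location is compared. Passing a label
    additionally requires the record's label to match.
    """
    for i, rec in enumerate(wal):
        if not isinstance(rec, BackupEnd):
            continue
        if rec.start_loc != start_loc:
            continue
        if label is not None and rec.label != label:
            continue
        return i
    return None


# Two concurrent backups that share START WAL LOCATION 0/20000B0.
# test0 finishes first; test3 still needs the WAL written after that.
wal = [
    "wal needed by test0 and test3",
    BackupEnd("0/20000B0", "test0"),
    "wal needed by test3 only",
    BackupEnd("0/20000B0", "test3"),
]

# Matching on start location alone: recovering test3 stops at test0's record.
by_location = consistency_index(wal, "0/20000B0")
# Matching on location plus label: recovering test3 waits for its own record.
by_label = consistency_index(wal, "0/20000B0", label="test3")
print(by_location, by_label)
```

In this model, matching on location alone accepts the record at index 1,
before the WAL that only test3 needs has been replayed. Adding the label
to the comparison defers acceptance to index 3, which corresponds to the
second remedy suggested in the report.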


This patch is on the 9.1 open items list, but I don't understand it
well enough to know whether it's correct.  Can someone else pick it
up?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
