On Mon, Jan 31, 2011 at 10:45 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Tue, Feb 1, 2011 at 1:31 AM, Heikki Linnakangas
> <heikki.linnakan...@enterprisedb.com> wrote:
>> Hmm, good point. It's harmless, but creating the history file in the first
>> place sure seems like a waste of time.
>
> The attached patch changes pg_stop_backup so that it doesn't create
> the backup history file if archiving is not enabled.
>
> When I tested multiple concurrent backups, I found that they can have the
> same checkpoint location and the same history file name.
>
> --------------------
> $ for ((i=0; i<4; i++)); do
>     pg_basebackup -D test$i -c fast -x -l test$i &
> done
>
> $ cat test0/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test0
>
> $ cat test1/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test1
>
> $ cat test2/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test2
>
> $ cat test3/backup_label
> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
> CHECKPOINT LOCATION: 0/20000E8
> START TIME: 2011-02-01 12:12:31 JST
> LABEL: test3
>
> $ ls archive/*.backup
> archive/000000010000000000000002.000000B0.backup
> --------------------
>
> This would cause a serious problem, because the backup-end record
> which indicates the same "START WAL LOCATION" can be written by the
> first backup before the others finish. So we might wrongly conclude that
> we've already reached a consistent state by reading the backup-end
> record (written by the first backup) before reading the last required WAL
> file.
>
>     /*
>      * Force a CHECKPOINT.  Aside from being necessary to prevent torn
>      * page problems, this guarantees that two successive backup runs will
>      * have different checkpoint positions and hence different history
>      * file names, even if nothing happened in between.
>      *
>      * We use CHECKPOINT_IMMEDIATE only if requested by user (via passing
>      * fast = true).  Otherwise this can take awhile.
>      */
>     RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |
>                       (fast ? CHECKPOINT_IMMEDIATE : 0));
>
> This problem happens because the above code (in do_pg_start_backup)
> doesn't actually ensure that concurrent backups have different
> checkpoint locations. ISTM that we should change the above or something
> elsewhere to ensure that. Or we should include the backup label name in
> the backup-end record, to prevent a recovery from reading a backup-end
> record that is not its own.
>
> Thoughts?
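
For anyone trying to follow the failure mode, here is a rough Python
model of it. This is a sketch, not PostgreSQL code. It assumes the
default 16MB WAL segment size, and the function names
(backup_history_file_name, first_matching_backup_end) and the
list-of-tuples WAL are made up for illustration. The label check is
only the idea proposed in the quoted mail, not existing behaviour.

```python
# Toy model of the collision described above. Not PostgreSQL code.
WAL_SEG_SIZE = 16 * 1024 * 1024  # assumption: default 16MB WAL segments


def backup_history_file_name(tli, xlogid, xrecoff):
    """Build a history file name from the timeline and start WAL location.

    The name depends only on the start location, so backups that share a
    start location also share a history file name.
    """
    seg = xrecoff // WAL_SEG_SIZE
    offset = xrecoff % WAL_SEG_SIZE
    return "%08X%08X%08X.%08X.backup" % (tli, xlogid, seg, offset)


def first_matching_backup_end(wal, start_loc, label=None):
    """Return the index of the first backup-end record that would be accepted.

    wal is a list of (record_type, start_loc, label) tuples in WAL order.
    With label=None only the start location is compared, as described in
    the quoted mail. Passing a label models the proposed fix of storing
    the backup label in the backup-end record (hypothetical).
    """
    for i, (rtype, rec_start, rec_label) in enumerate(wal):
        if rtype != "backup_end":
            continue
        if rec_start != start_loc:
            continue
        if label is not None and rec_label != label:
            continue
        return i
    return None


if __name__ == "__main__":
    start = (0, 0x20000B0)  # START WAL LOCATION 0/20000B0 from the test run
    print(backup_history_file_name(1, *start))

    # test0 finishes first; test3 still needs WAL up to its own end record.
    wal = [
        ("backup_end", start, "test0"),
        ("other", None, None),
        ("backup_end", start, "test3"),
    ]
    print(first_matching_backup_end(wal, start))           # location only
    print(first_matching_backup_end(wal, start, "test3"))  # location + label
```

In this model, a recovery of test3 that matches on the start location
alone stops at test0's backup-end record, which comes before the WAL
that test3 actually needs. Matching on the label as well picks test3's
own record.
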
This patch is on the 9.1 open items list, but I don't understand it
well enough to know whether it's correct. Can someone else pick it
up?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers