On 2 July 2016 at 22:42, Bruce Momjian <br...@momjian.us> wrote:

> > I suspect, but cannot prove, that it is also safe to snapshot pg_xlog on a
> > separate filesystem if and only if you take the datadir snapshot before the
> > pg_xlog snapshot and you have wal_keep_segments high enough to ensure that WAL
> > needed by the redo checkpoint in the datadir snapshot is not removed. I
> > wouldn't want to do this, and certainly not document it, since it's way saner
> > to use pg_start_backup() etc.
>
> Yes, I have wanted to document that WAL-at-the-end is sufficient for
> non-atomic snapshots assuming the needed WAL is there. However, even if
> the WAL is backed up, it doesn't mean we are going to read it during
> crash recovery, i.e. we only read from the last checkpoint or something
> like that. I have no idea how to tell people when this is safe.
>
> My simplistic idea would be to tell people to run a checkpoint right
> before all the snapshots are taken, but even that doesn't seem 100%
> safe. This needs someone who understands the WAL and how to tell people
> a totally safe procedure.

The main thing is to provide an easy way to get the filenames of all the
archives that are required. pg_start_backup() and pg_stop_backup() provide
the range of LSNs required to restore, but you have to - correctly -
convert them into xlog file names and copy everything in that range, not
just the start and end.

There's no simple way to ask PostgreSQL for the file list via a query,
since we lack arithmetic operators for pg_lsn or any sort of
pg_xlogfile_name_next function or similar. You can easily get the start
and end filenames using pg_xlogfile_name() and rely on lexical comparison
of filenames, but that's way less convenient than just getting a file list
you can feed into rsync / tar / whatever. It's too complicated. Yes, most
users should just use pg_basebackup, but that doesn't play well with
snapshots etc.

A pg_xlogfile_name_range function would be a real help I think, so you
could psql -qAt a simple query to get all the xlogs you must copy from the
saved LSNs reported by pg_start_backup() and pg_stop_backup(). Especially
if the docs incorporated a sample script, including a test that marks the
backup aborted/failed if there are any missing files.

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
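
P.S. To make the LSN-to-file-list conversion concrete, here is a rough
sketch in Python of what a pg_xlogfile_name_range-style helper might
compute outside the server. Nothing here is shipped by PostgreSQL: the
function names xlogfile_name_range and missing_files are hypothetical, and
the sketch assumes the default 16MB WAL segment size, a single timeline for
the whole backup, and the file naming used since 9.3 (no skipped segments).
It also assumes the stop LSN maps to the segment holding the byte just
before it, as pg_xlogfile_name() does; check that against your version.

```python
import os

# Assumption: default segment size; a custom build may use another value.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE


def parse_lsn(text):
    """Convert an LSN such as '2/A1000028' into an integer byte position."""
    high, low = text.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


def segment_file_name(timeline, segno):
    """Build the 24-character WAL file name for a segment number."""
    return "%08X%08X%08X" % (
        timeline,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )


def xlogfile_name_range(start_lsn, stop_lsn, timeline=1):
    """List every WAL file name needed between two LSNs (hypothetical helper).

    Assumes the whole range lies on one timeline.
    """
    start = parse_lsn(start_lsn)
    stop = parse_lsn(stop_lsn)
    if stop < start:
        raise ValueError("stop LSN precedes start LSN")
    first = start // WAL_SEGMENT_SIZE
    # The stop LSN points just past the last needed byte, so step back one.
    last = max(first, (stop - 1) // WAL_SEGMENT_SIZE)
    return [segment_file_name(timeline, n) for n in range(first, last + 1)]


def missing_files(required, directory):
    """Return the required WAL files that are absent from a directory."""
    present = set(os.listdir(directory))
    return [name for name in required if name not in present]
```

A backup script could print the result of xlogfile_name_range() one name
per line for rsync or tar, and mark the backup failed whenever
missing_files() returns a non-empty list.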