Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2013-01-02 Thread Heikki Linnakangas

On 27.12.2012 12:06, Heikki Linnakangas wrote:

On 23.12.2012 15:33, Fujii Masao wrote:

On Fri, Dec 21, 2012 at 9:54 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

Yes, this should be backpatched to 9.2. I came up with the attached.


In this patch, if '-X stream' is specified in pg_basebackup, the timeline
history files are not backed up.


Good point.


We should change pg_basebackup background
process and walsender so that they stream also timeline history files,
for example, by using 'TIMELINE_HISTORY' replication command?
Or basebackup.c should send all timeline history files at the end of
backup
even if '-X stream' is specified?


Perhaps. We should enhance pg_receivexlog to follow timeline switches,
anyway. I was thinking of leaving that as a todo item, but pg_basebackup
-X stream shares the code, so we should implement that now to get that
support into both.

In the problem you reported on the other thread
(http://archives.postgresql.org/message-id/50db5ea9.7010...@vmware.com),
you also need the timeline history files, but that one didn't use -X
at all. Even if we teach pg_basebackup to fetch the timeline history
files in -X stream mode, that still leaves the problem on that other
thread.

The simplest solution would be to always include all timeline history
files in the backup, even if -X is not used. Currently, however, pg_xlog
is backed up as an empty directory in that case, but that would no
longer be the case if we start including timeline history files there. I
wonder if that would confuse any existing backup scripts people are using.


This thread has spread out a bit, so here's a summary of the remaining 
issues and what I'm going to do about them:


9.2
---

If you take a backup with pg_basebackup -X fetch, and the timeline 
switches while the backup is taken, you currently get an error like 
"requested WAL segment 0001000C has already been 
removed". To fix, let's change the server-side support of -X fetch to 
include all WAL files between the backup start and end pointers, 
regardless of timelines. I'm thinking of doing this by scanning pg_xlog 
with readdir(), and sending over any files in that range. Another option 
would be to read and parse the timeline history file to figure out the 
exact filenames expected, but the readdir() approach seems simpler.
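To make the readdir() idea concrete, here is a minimal sketch of the per-file test it implies. It is not the patch: the names SegPos, parse_wal_filename and wal_file_in_backup_range are made up for illustration, and it assumes the WAL file name layout of 24 hex digits (timeline, log id, segment id).

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Position of a WAL segment, ignoring its timeline (hypothetical type). */
typedef struct
{
	uint32_t	log;
	uint32_t	seg;
} SegPos;

/*
 * Parse a WAL segment file name of the form TTTTTTTTXXXXXXXXYYYYYYYY
 * (timeline, log id, segment id, 8 upper-case hex digits each).
 * Returns false for anything else, e.g. "00000002.history".
 */
bool
parse_wal_filename(const char *name, uint32_t *tli, SegPos *pos)
{
	unsigned int t, l, s;

	if (strlen(name) != 24 || strspn(name, "0123456789ABCDEF") != 24)
		return false;
	if (sscanf(name, "%8X%8X%8X", &t, &l, &s) != 3)
		return false;
	*tli = t;
	pos->log = l;
	pos->seg = s;
	return true;
}

/* Order two segment positions; the timeline plays no part. */
int
segpos_cmp(SegPos a, SegPos b)
{
	if (a.log != b.log)
		return a.log < b.log ? -1 : 1;
	if (a.seg != b.seg)
		return a.seg < b.seg ? -1 : 1;
	return 0;
}

/* Would this pg_xlog entry be sent for a backup spanning [start, end]? */
bool
wal_file_in_backup_range(const char *name, SegPos start, SegPos end)
{
	uint32_t	tli;
	SegPos		pos;

	if (!parse_wal_filename(name, &tli, &pos))
		return false;
	return segpos_cmp(pos, start) >= 0 && segpos_cmp(pos, end) <= 0;
}
```

Because the timeline is parsed but never compared, a file from any timeline matches as long as its segment position is in range. That is what makes the approach simple, and it is also why files from timelines unrelated to the backup would be picked up if they are present in pg_xlog.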


You also need the timeline history files. With -X fetch, I think we 
should always include them in the pg_xlog directory of the backup, along 
with the WAL files themselves.
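For reference, picking the history files out of pg_xlog comes down to a file name test. The sketch below restates, as a standalone function, the condition used in the patch posted earlier in this thread; the function name is_timeline_history_filename is made up here and is not part of the patch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * True if "name" looks like a timeline history file: exactly 8 upper-case
 * hex digits followed by ".history", e.g. "00000002.history".
 */
bool
is_timeline_history_filename(const char *name)
{
	return strlen(name) == 8 + strlen(".history") &&
		strspn(name, "0123456789ABCDEF") == 8 &&
		strcmp(name + 8, ".history") == 0;
}
```

The test is purely on the name, so a WAL segment or a file with an extra suffix is rejected without looking at its contents.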


-X stream has a similar problem. If timeline changes during backup, 
the replication will stop at the timeline switch, and the backup fails. 
There isn't much we can do about that, as you can't follow a timeline 
switch via streaming replication in 9.2. At best, we could try to detect 
the situation and give a better error message.


With plain pg_basebackup, without -X option, you usually need a WAL 
archive to restore. The only exception is when you initialize a 
streaming standby with plain pg_basebackup, intending to connect it to 
the master right after taking the backup, so that it can stream all the 
required WAL at that point. We have a problem with that scenario even 
if there was no timeline switch while the backup was taken (if there 
was, you're screwed the same as with -X stream), because of the issue 
mentioned in the first post in this thread: the beginning of 
the first WAL file contains the old timeline ID. Even though that's not 
replayed, because the checkpoint is in the later part of the segment, 
recovery still complains if there is a timeline ID in the beginning of 
the file that we don't recognize as our ancestor. This could be fixed if 
we somehow got the timeline history files in the backup, but I'm worried 
that might break people's backup scripts. At the moment, the pg_xlog 
directory in the backup is empty when -X is not used; we even spell that 
out explicitly in the manual. Including timeline history files would 
change that. That might be an issue if you symlink pg_xlog to a 
different drive, for example. To make things worse, there is no timeline 
history file for timeline 1, so you would not notice when you test your 
backup scripts in simple cases.
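The complaint described above can be pictured with a simplified model of the check. This is only a sketch, not the recovery code: tli_is_expected is a made-up name, and the assumption is that a page's timeline ID is accepted if it is the timeline being recovered or one of the ancestors listed in that timeline's history file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model: a WAL page is acceptable if its timeline ID is the
 * timeline we are recovering on, or one of the ancestors read from that
 * timeline's history file. With no history file, "ancestors" is empty.
 */
bool
tli_is_expected(uint32_t page_tli, uint32_t target_tli,
				const uint32_t *ancestors, size_t n_ancestors)
{
	if (page_tli == target_tli)
		return true;
	for (size_t i = 0; i < n_ancestors; i++)
	{
		if (ancestors[i] == page_tli)
			return true;
	}
	return false;
}
```

Under this model, a segment on timeline 2 that begins with pages from timeline 1 is accepted only if 00000002.history is available to name timeline 1 as an ancestor. A backup taken entirely on timeline 1 never hits the problem, which matches the point that simple tests will not reveal it.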


In summary, in 9.2 I think we should fix -X fetch to tolerate a 
timeline switch, and include all the timeline history files. The 
behavior of other modes would not be changed, and they will fail if a 
timeline changes during or just before backup.


Master
------

In master, we can try harder for the -X stream case, because you can 
replicate a timeline switch, and fetch timeline history files via a 
replication connection. Let's teach pg_receivexlog, and pg_basebackup 
-X stream, to use those facilities, so that even if the timeline 
changes while the backup is taken, all the history files and WAL files 
are correctly included in the backup. I'll start working on a patch to 
do that.


That leaves one case not covered: If you take a backup with plain 

Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2013-01-02 Thread Greg Stark
On Wed, Jan 2, 2013 at 1:55 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 If you take a backup with pg_basebackup -X fetch, and the timeline
 switches while the backup is taken, you currently get an error like
 "requested WAL segment 0001000C has already been removed".
 To fix, let's change the server-side support of -X fetch to include all
 WAL files between the backup start and end pointers, regardless of
 timelines. I'm thinking of doing this by scanning pg_xlog with readdir(),
 and sending over any files in that range. Another option would be to read
 and parse the timeline history file to figure out the exact filenames
 expected, but the readdir() approach seems simpler.

I'm not clear what you mean by "any files in that range". There could
be other timelines in the archive that aren't relevant to the restore
at all. For example if the database you're requesting a backup from
has previously been restored from an old backup the archive could have
archives from the original timeline as well as the active timeline.

I'm trying to wrap my head around what other combinations are
possible. Is it possible there have been other false starts or
multiple timeline switches during the time the backup was being taken?
At first blush I think not, I think it's only possible for there to be
one timeline switch and it would be when a standby database was being
backed up and is activated while the backup was being taken.


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2012-12-27 Thread Heikki Linnakangas

On 23.12.2012 15:33, Fujii Masao wrote:

On Fri, Dec 21, 2012 at 9:54 PM, Heikki Linnakangas
hlinnakan...@vmware.com  wrote:

Yes, this should be backpatched to 9.2. I came up with the attached.


In this patch, if '-X stream' is specified in pg_basebackup, the timeline
history files are not backed up.


Good point.


We should change pg_basebackup background
process and walsender so that they stream also timeline history files,
for example, by using 'TIMELINE_HISTORY' replication command?
Or basebackup.c should send all timeline history files at the end of backup
even if '-X stream' is specified?


Perhaps. We should enhance pg_receivexlog to follow timeline switches, 
anyway. I was thinking of leaving that as a todo item, but pg_basebackup 
-X stream shares the code, so we should implement that now to get that 
support into both.


In the problem you reported on the other thread 
(http://archives.postgresql.org/message-id/50db5ea9.7010...@vmware.com), 
you also need the timeline history files, but that one didn't use -X 
at all. Even if we teach pg_basebackup to fetch the timeline history 
files in -X stream mode, that still leaves the problem on that other 
thread.


The simplest solution would be to always include all timeline history 
files in the backup, even if -X is not used. Currently, however, pg_xlog 
is backed up as an empty directory in that case, but that would no 
longer be the case if we start including timeline history files there. I 
wonder if that would confuse any existing backup scripts people are using.


- Heikki




Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2012-12-23 Thread Fujii Masao
On Fri, Dec 21, 2012 at 9:54 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Yes, this should be backpatched to 9.2. I came up with the attached.

In this patch, if '-X stream' is specified in pg_basebackup, the timeline
history files are not backed up. We should change pg_basebackup background
process and walsender so that they stream also timeline history files,
for example, by using 'TIMELINE_HISTORY' replication command?
Or basebackup.c should send all timeline history files at the end of backup
even if '-X stream' is specified?

 However, thinking about this some more, there's another bug in the way WAL
 files are included in the backup, when a timeline switch happens.
 basebackup.c includes all the WAL files on ThisTimeLineID, but when the
 backup is taken from a standby, the standby might've followed a timeline
 switch. So it's possible that some of the WAL files should come from
 timeline 1, while others should come from timeline 2. This leads to an error
 like "requested WAL segment 0001000C has already been
 removed" in pg_basebackup.

 Attached is a script to reproduce that bug, if someone wants to play with
 it. It's a bit sensitive to timing, and needs tweaking the paths at the top.

 One solution to that would be to pay more attention to the timelines to
 include WAL from. basebackup.c could read the timeline history file, to see
 exactly where the timeline switches happened, and then construct the
 filename of each WAL segment using the correct timeline id. Another approach
 would be to do readdir() on pg_xlog, and include all WAL files, regardless
 of timeline IDs, that fall in the right XLogRecPtr range. The latter seems
 easier to backpatch.

The latter sounds good to me. But how would all the WAL files with different
timelines be shipped in pg_basebackup -X stream mode?

Regards,

-- 
Fujii Masao




Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2012-12-21 Thread Heikki Linnakangas

On 17.12.2012 18:58, Magnus Hagander wrote:

On Mon, Dec 17, 2012 at 5:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

I'm not happy with the fact that we just ignore the problem in a backup
taken from a standby, silently giving the user a backup that won't start
up. Why not include the timeline history file in the backup?


+1.  I was not aware that we weren't doing that --- it seems pretty
foolish, especially since as you say they're tiny.


Yeah, +1. That should probably have been a part of the whole
basebackup from slave patch, so it can probably be considered a
back-patchable bugfix in itself, no?


Yes, this should be backpatched to 9.2. I came up with the attached.

However, thinking about this some more, there's another bug in the way 
WAL files are included in the backup, when a timeline switch happens. 
basebackup.c includes all the WAL files on ThisTimeLineID, but when the 
backup is taken from a standby, the standby might've followed a timeline 
switch. So it's possible that some of the WAL files should come from 
timeline 1, while others should come from timeline 2. This leads to an 
error like "requested WAL segment 0001000C has already 
been removed" in pg_basebackup.


Attached is a script to reproduce that bug, if someone wants to play 
with it. It's a bit sensitive to timing, and needs tweaking the paths at 
the top.


One solution to that would be to pay more attention to the timelines to 
include WAL from. basebackup.c could read the timeline history file, to 
see exactly where the timeline switches happened, and then construct the 
filename of each WAL segment using the correct timeline id. Another 
approach would be to do readdir() on pg_xlog, and include all WAL files, 
regardless of timeline IDs, that fall in the right XLogRecPtr range. The 
latter seems easier to backpatch.
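For comparison, the first option (work out the right timeline for each segment, then build the file name) could look roughly like the sketch below. The TimelineStart array is a stand-in for whatever parsing the history file would yield; it is hypothetical, as are the function names. The history file format itself is not modeled, nor is the exact treatment of the segment in which the switch happens.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: the first segment that belongs to a given timeline. */
typedef struct
{
	uint32_t	tli;
	uint32_t	log;
	uint32_t	seg;
} TimelineStart;

/*
 * Return the timeline owning segment (log, seg). "hist" must hold at least
 * one entry and be sorted by ascending start position, oldest timeline first.
 */
uint32_t
tli_for_segment(const TimelineStart *hist, size_t n,
				uint32_t log, uint32_t seg)
{
	uint32_t	tli = hist[0].tli;

	for (size_t i = 0; i < n; i++)
	{
		if (hist[i].log < log ||
			(hist[i].log == log && hist[i].seg <= seg))
			tli = hist[i].tli;
	}
	return tli;
}

/* Build the 24-character WAL file name; buf needs room for 25 bytes. */
void
wal_filename(char *buf, size_t buflen,
			 uint32_t tli, uint32_t log, uint32_t seg)
{
	snprintf(buf, buflen, "%08X%08X%08X",
			 (unsigned int) tli, (unsigned int) log, (unsigned int) seg);
}
```

With a history saying timeline 2 starts at segment 0x0C of log 0, segment 0x0B maps to a timeline 1 file name and segment 0x0C to a timeline 2 file name. The extra bookkeeping compared with a plain directory scan is what makes this option the harder one to backpatch.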


- Heikki
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 65200c1..5c0deaa 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -12,12 +12,13 @@
  */
 #include "postgres.h"
 
+#include <string.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <unistd.h>
 #include <time.h>
 
-#include "access/xlog_internal.h"		/* for pg_start/stop_backup */
+#include "access/xlog_internal.h"
 #include "catalog/pg_type.h"
 #include "lib/stringinfo.h"
 #include "libpq/libpq.h"
@@ -44,6 +45,7 @@ typedef struct
 
 
 static int64 sendDir(char *path, int basepathlen, bool sizeonly);
+static void sendTimeLineHistoryFiles(void);
 static void sendFile(char *readfilename, char *tarfilename,
 		 struct stat * statbuf);
 static void sendFileWithContent(const char *filename, const char *content);
@@ -286,6 +288,27 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir)
 break;
 		}
 
+		/*
+		 * Include all timeline history files.
+		 *
+		 * The timeline history files are usually not strictly required to
+		 * restore the backup, but if you take a backup from a standby server,
+		 * and the WAL segment containing the checkpoint record contains WAL
+		 * from an older timeline, recovery will complain on the older
+		 * timeline's ID if there is no timeline history file listing it. This
+		 * can happen if you take a backup right after promoting a standby to
+		 * become new master, and take the backup from a different, cascading
+		 * standby server.
+		 *
+		 * However, even when not strictly required, the timeline history
+		 * files are tiny, and provide a lot of forensic information about the
+		 * recovery history of the database, so it's best to always include
+		 * them all. (If asked to include WAL, that is. Otherwise you need a
+		 * WAL archive to restore anyway, and the timeline history files
+		 * should be present in the archive)
+		 */
+		sendTimeLineHistoryFiles();
+
 		/* Send CopyDone message for the last tar file */
 		pq_putemptymessage('c');
 	}
@@ -726,6 +749,58 @@ sendDir(char *path, int basepathlen, bool sizeonly)
 	return size;
 }
 
+/*
+ * Include all timeline history files from pg_xlog in the output tar stream.
+ */
+static void
+sendTimeLineHistoryFiles(void)
+{
+	DIR		   *dir;
+	struct dirent *de;
+	char		pathbuf[MAXPGPATH];
+	struct stat statbuf;
+
+	dir = AllocateDir("./pg_xlog");
+	while ((de = ReadDir(dir, "./pg_xlog")) != NULL)
+	{
+		CHECK_FOR_INTERRUPTS();
+
+		if (strlen(de->d_name) == 8 + strlen(".history") &&
+			strspn(de->d_name, "0123456789ABCDEF") == 8 &&
+			strcmp(de->d_name + 8, ".history") == 0)
+		{
+			/* It looks like a timeline history file. Include it. */
+			snprintf(pathbuf, MAXPGPATH, "./pg_xlog/%s", de->d_name);
+
+			if (lstat(pathbuf, &statbuf) != 0)
+			{
+				if (errno != ENOENT)
+					ereport(ERROR,
+							(errcode_for_file_access(),
+							 errmsg("could not stat file or directory \"%s\": %m",
+									pathbuf)));
+
+				/* If the file went away while scanning, it's no error. */
+				continue;
+			}
+
+			if (!S_ISREG(statbuf.st_mode))
+			{
+/*
+	

Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2012-12-21 Thread Amit kapila
On Friday, December 21, 2012 6:24 PM Heikki Linnakangas wrote:
On 17.12.2012 18:58, Magnus Hagander wrote:
 On Mon, Dec 17, 2012 at 5:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas hlinnakan...@vmware.com writes:
 I'm not happy with the fact that we just ignore the problem in a backup
 taken from a standby, silently giving the user a backup that won't start
 up. Why not include the timeline history file in the backup?

 +1.  I was not aware that we weren't doing that --- it seems pretty
 foolish, especially since as you say they're tiny.

 Yeah, +1. That should probably have been a part of the whole
 basebackup from slave patch, so it can probably be considered a
 back-patchable bugfix in itself, no?

Yes, this should be backpatched to 9.2. I came up with the attached.



 One solution to that would be to pay more attention to the timelines to
 include WAL from. basebackup.c could read the timeline history file, to
 see exactly where the timeline switches happened, and then construct the
 filename of each WAL segment using the correct timeline id. Another
 approach would be to do readdir() on pg_xlog, and include all WAL files,
 regardless of timeline IDs, that fall in the right XLogRecPtr range. The
 latter seems easier to backpatch.

I also think the approach you implemented is better.
One small point: shouldn't it check (walsender_shutdown_requested || 
walsender_ready_to_stop) during ReadDir of pg_xlog, similar to what is done 
around ReadDir() in sendDir?

With Regards,
Amit Kapila.




Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2012-12-18 Thread Simon Riggs
On 18 December 2012 00:53, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 17 December 2012 14:16, Heikki Linnakangas hlinnakan...@vmware.com 
 wrote:
 I also wonder if pg_basebackup should
 include *all* timeline history files in the backup, not just the latest one
 strictly required to restore.

 Why? the timeline history file includes the previous timelines already.

 The original intention was that the WAL archive might contain multiple
 timeline files corresponding to various experimental recovery attempts;
 furthermore, such files might be hand-annotated (that's why there's a
 comment provision).  So they would be at least as valuable from an
 archival standpoint as the WAL files proper.  I think we ought to just
 copy all of them, period.  Anything less is penny-wise and
 pound-foolish.

What I'm saying is that the new history file is created from the old
one, so the latest file includes all the history as a direct copy. The
only thing new is one line of information.

Copying all files grows at O(N^2) with redundancy and will eventually
become a space problem and a performance issue for smaller systems.
There should be some limit to keep this sane, for example, the last 32
history files, or the last 1000 lines of history. Some sane limit.
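The quadratic growth can be made concrete with a small calculation. This is only a sketch under a stated assumption: a single unbranched chain of timelines, where the history file for timeline k carries one line per ancestor (k - 1 lines) and timeline 1 has no history file. The function name total_history_lines is made up for the illustration.

```c
#include <stdint.h>

/*
 * Total number of lines across all history files for a linear chain of
 * n_timelines timelines: 1 + 2 + ... + (n - 1) = n * (n - 1) / 2.
 */
uint64_t
total_history_lines(uint64_t n_timelines)
{
	uint64_t	total = 0;

	for (uint64_t k = 2; k <= n_timelines; k++)
		total += k - 1;
	return total;
}
```

For 32 timelines that comes to 496 lines in total, and for 1000 timelines to 499500. The result counts lines only; how many bytes that is depends on the length of each line.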

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2012-12-18 Thread Fujii Masao
On Tue, Dec 18, 2012 at 8:09 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 18 December 2012 00:53, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 17 December 2012 14:16, Heikki Linnakangas hlinnakan...@vmware.com 
 wrote:
 I also wonder if pg_basebackup should
 include *all* timeline history files in the backup, not just the latest one
 strictly required to restore.

 Why? the timeline history file includes the previous timelines already.

 The original intention was that the WAL archive might contain multiple
 timeline files corresponding to various experimental recovery attempts;
 furthermore, such files might be hand-annotated (that's why there's a
 comment provision).  So they would be at least as valuable from an
 archival standpoint as the WAL files proper.  I think we ought to just
 copy all of them, period.  Anything less is penny-wise and
 pound-foolish.

 What I'm saying is that the new history file is created from the old
 one, so the latest file includes all the history as a direct copy. The
 only thing new is one line of information.

A timeline history file includes only the history of its ancestor timelines.
So the latest one might not include all the history.

Regards,

-- 
Fujii Masao




Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2012-12-18 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 On Tue, Dec 18, 2012 at 8:09 PM, Simon Riggs si...@2ndquadrant.com wrote:
 What I'm saying is that the new history file is created from the old
 one, so the latest file includes all the history as a direct copy. The
 only thing new is one line of information.

 A timeline history file includes only the history of its ancestor timelines.
 So the latest one might not include all the history.

Indeed.  And even if there are a thousand of them, so what?  It'd still
be less space than a single WAL segment file.

Better to keep the data than wish we had it later.

regards, tom lane




Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2012-12-17 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 I'm not happy with the fact that we just ignore the problem in a backup 
 taken from a standby, silently giving the user a backup that won't start 
 up. Why not include the timeline history file in the backup?

+1.  I was not aware that we weren't doing that --- it seems pretty
foolish, especially since as you say they're tiny.

regards, tom lane




Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2012-12-17 Thread Magnus Hagander
On Mon, Dec 17, 2012 at 5:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas hlinnakan...@vmware.com writes:
 I'm not happy with the fact that we just ignore the problem in a backup
 taken from a standby, silently giving the user a backup that won't start
 up. Why not include the timeline history file in the backup?

 +1.  I was not aware that we weren't doing that --- it seems pretty
 foolish, especially since as you say they're tiny.

Yeah, +1. That should probably have been a part of the whole
basebackup from slave patch, so it can probably be considered a
back-patchable bugfix in itself, no?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2012-12-17 Thread Simon Riggs
On 17 December 2012 14:16, Heikki Linnakangas hlinnakan...@vmware.com wrote:

 I'm not happy with the fact that we just ignore the problem in a backup
 taken from a standby, silently giving the user a backup that won't start up.

That's spooky. I just found a different issue with promotion during
backup on your other thread.

 Why not include the timeline history file in the backup? That seems like a
 good idea regardless of this issue.

Yeh

 I also wonder if pg_basebackup should
 include *all* timeline history files in the backup, not just the latest one
 strictly required to restore.

Why? the timeline history file includes the previous timelines already.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_basebackup from cascading standby after timeline switch

2012-12-17 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 17 December 2012 14:16, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 I also wonder if pg_basebackup should
 include *all* timeline history files in the backup, not just the latest one
 strictly required to restore.

 Why? the timeline history file includes the previous timelines already.

The original intention was that the WAL archive might contain multiple
timeline files corresponding to various experimental recovery attempts;
furthermore, such files might be hand-annotated (that's why there's a
comment provision).  So they would be at least as valuable from an
archival standpoint as the WAL files proper.  I think we ought to just
copy all of them, period.  Anything less is penny-wise and
pound-foolish.

regards, tom lane

