On 27.12.2012 12:06, Heikki Linnakangas wrote:
On 23.12.2012 15:33, Fujii Masao wrote:
On Fri, Dec 21, 2012 at 9:54 PM, Heikki Linnakangas
<hlinnakan...@vmware.com> wrote:
Yes, this should be backpatched to 9.2. I came up with the attached.

In this patch, if '-X stream' is specified in pg_basebackup, the timeline
history files are not backed up.

Good point.

Should we change the pg_basebackup background
process and walsender so that they also stream timeline history files,
for example by using the 'TIMELINE_HISTORY' replication command?
Or should basebackup.c send all timeline history files at the end of the
backup, even if '-X stream' is specified?

Perhaps. We should enhance pg_receivexlog to follow timeline switches,
anyway. I was thinking of leaving that as a todo item, but pg_basebackup
-X stream shares the code, so we should implement that now to get that
support into both.

In the problem you reported on the other thread
(http://archives.postgresql.org/message-id/50db5ea9.7010...@vmware.com),
you also need the timeline history files, but that one didn't use "-X"
at all. Even if we teach pg_basebackup to fetch the timeline history
files in "-X stream" mode, that still leaves the problem on that other
thread.

The simplest solution would be to always include all timeline history
files in the backup, even if -X is not used. Currently, however, pg_xlog
is backed up as an empty directory in that case, but that would no
longer be the case if we start including timeline history files there. I
wonder if that would confuse any existing backup scripts people are using.

This thread has spread out a bit, so here's a summary of the remaining issues and what I'm going to do about them:

9.2
---

If you take a backup with "pg_basebackup -X fetch", and the timeline switches while the backup is taken, you currently get an error like "requested WAL segment 00000001000000000000000C has already been removed". To fix, let's change the server-side support of "-X fetch" to include all WAL files between the backup start and end pointers, regardless of timelines. I'm thinking of doing this by scanning pg_xlog with readdir(), and sending over any files in that range. Another option would be to read and parse the timeline history file to figure out the exact filenames expected, but the readdir() approach seems simpler.
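
To make the readdir() idea a bit more concrete, here is a minimal standalone sketch of the per-filename test, assuming the 9.2 naming scheme of 24 uppercase hex digits (8 for the timeline, 8 for the log id, 8 for the segment). The name wal_name_in_range and the two macros are made up for illustration; the only point is that the comparison skips the first 8 characters, so the timeline part is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define WAL_NAME_LEN	24		/* 8 hex digits each: timeline, log id, segment */
#define TLI_PREFIX_LEN	8		/* leading timeline part, ignored when comparing */

/*
 * Hypothetical helper: does 'name' look like a WAL segment file name, and
 * does its log/seg part fall between those of 'first' and 'last' (both
 * inclusive)?  The timeline part of all three names is ignored.
 */
static bool
wal_name_in_range(const char *name, const char *first, const char *last)
{
	assert(name != NULL && first != NULL && last != NULL);
	assert(strlen(first) == WAL_NAME_LEN && strlen(last) == WAL_NAME_LEN);

	if (strlen(name) != WAL_NAME_LEN)
		return false;
	if (strspn(name, "0123456789ABCDEF") != WAL_NAME_LEN)
		return false;

	/* Fixed-width uppercase hex sorts the same way as the numbers it encodes */
	return strcmp(name + TLI_PREFIX_LEN, first + TLI_PREFIX_LEN) >= 0 &&
		strcmp(name + TLI_PREFIX_LEN, last + TLI_PREFIX_LEN) <= 0;
}
```

With a range of 00000001000000000000000A to 00000001000000000000000D, a file such as 00000002000000000000000C is accepted even though it belongs to timeline 2, which is exactly the behavior wanted here.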

You also need the timeline history files. With "-X fetch", I think we should always include them in the pg_xlog directory of the backup, along with the WAL files themselves.
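
Picking out the history files during the same directory scan is a similar name test. A rough sketch, assuming the usual naming of 8 uppercase hex digits followed by ".history"; the function name is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: does 'name' look like a timeline history file,
 * i.e. exactly 8 uppercase hex digits followed by ".history"?
 */
static bool
looks_like_history_file(const char *name)
{
	static const char suffix[] = ".history";

	assert(name != NULL);

	return strlen(name) == 8 + strlen(suffix) &&
		strspn(name, "0123456789ABCDEF") == 8 &&
		strcmp(name + 8, suffix) == 0;
}
```

The length check comes first, so name + 8 is never read past the end of a shorter string.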

"-X stream" has a similar problem. If timeline changes during backup, the replication will stop at the timeline switch, and the backup fails. There isn't much we can do about that, as you can't follow a timeline switch via streaming replication in 9.2. At best, we could try to detect the situation and give a better error message.

With plain pg_basebackup, without the -X option, you usually need a WAL archive to restore. The only exception is when you initialize a streaming standby with plain "pg_basebackup", intending to connect it to the master right after taking the backup, so that it can stream all the required WAL at that point. That scenario has a problem even if there was no timeline switch while the backup was taken (if there was, you're screwed the same as with "-X stream"), because of the issue mentioned in the first post in this thread: the beginning of the first WAL file contains the old timeline ID. Even though that part is not replayed, because the checkpoint is in the later part of the segment, recovery still complains if there is a timeline ID at the beginning of the file that we don't recognize as our ancestor. This could be fixed if we somehow got the timeline history files into the backup, but I'm worried that might break people's backup scripts. At the moment, the pg_xlog directory in the backup is empty when -X is not used; we even spell that out explicitly in the manual. Including timeline history files would change that. That might be an issue if you symlink pg_xlog to a different drive, for example. To make things worse, there is no timeline history file for timeline 1, so you would not notice the change when you test your backup scripts in simple cases.
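
To illustrate why the missing history file matters in that scenario, here is a simplified model of the ancestor check. It is only a sketch: the list of expected timelines is represented as a plain array, and tli_is_expected is a made-up name, not the actual recovery code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/*
 * Hypothetical check: is 'page_tli' (the timeline ID found at the start of
 * a WAL file) one of the timelines listed in 'expected'?
 */
static bool
tli_is_expected(TimeLineID page_tli, const TimeLineID *expected, size_t n)
{
	assert(expected != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
	{
		if (expected[i] == page_tli)
			return true;
	}
	return false;
}
```

If the history file for timeline 2 is available, the expected list is {2, 1} and a segment that begins on timeline 1 passes. Without the file, the list contains only timeline 2, and the same segment is rejected even though nothing from its old-timeline part would be replayed.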

In summary, in 9.2 I think we should fix "-X fetch" to tolerate a timeline switch, and include all the timeline history files. The behavior of other modes would not be changed, and they will fail if a timeline changes during or just before backup.

Master
------

In master, we can try harder for the "-X stream" case, because you can replicate a timeline switch, and fetch timeline history files via a replication connection. Let's teach pg_receivexlog, and "pg_basebackup -X stream", to use those facilities, so that even if the timeline changes while the backup is taken, all the history files and WAL files are correctly included in the backup. I'll start working on a patch to do that.

That leaves one case not covered: If you take a backup with plain "pg_basebackup" from a standby, without -X, and the first WAL segment contains a timeline switch (i.e. you take the backup right after a failover), and you try to recover from it without a WAL archive, it doesn't work. This is the original issue that started this thread, except that I used "-x" in my original test case. The problem here is that even though streaming replication will fetch the timeline history file when it connects, at the very beginning of recovery we expect that the timeline history file corresponding to the initial timeline is already available, either in pg_xlog or in the WAL archive. By the time streaming replication has connected and fetched the history file, we've already initialized expectedTLEs to contain just the latest TLI. To fix that, we should delay calling readTimeLineHistory() until streaming replication has connected and fetched the file.
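
The idea of delaying the initialization can be modelled roughly as follows. This is a toy sketch, not the patch: RecoveryState and adopt_history_if_unset are invented names, and the history is again just an array of timeline IDs. Whichever source first produces a usable history sets the expected list, and later candidates are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical stand-in for the startup process's view of the history */
typedef struct RecoveryState
{
	const TimeLineID *expected;		/* NULL until a history has been adopted */
	size_t		n_expected;
} RecoveryState;

/*
 * Adopt 'history' as the expected timeline list, but only if none has been
 * set yet.  Calling this again later leaves the first choice in place.
 */
static void
adopt_history_if_unset(RecoveryState *st, const TimeLineID *history, size_t n)
{
	assert(st != NULL && history != NULL && n > 0);

	if (st->expected == NULL)
	{
		st->expected = history;
		st->n_expected = n;
	}
}
```

The point of the ordering is that a history streamed from the master gets the chance to be adopted before a dummy one-entry history is generated locally.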

Barring objections, I'll commit the attached two patches. include-wal-files-from-all-timelines-in-base-backup-1.patch is for 9.2 and master, and it fixes the "pg_basebackup -X fetch" case. delay-reading-timeline-history-file.patch is for master, and it changes recovery so that if a timeline history file for the initial target timeline is fetched over streaming replication, expectedTLEs is initialized with the streamed file. That fixes the plain "pg_basebackup" without -X case on master.

What remains is to teach "pg_receivexlog" and "pg_basebackup -X stream" to cross timeline changes. I'll start working on a patch for that.

- Heikki
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c847913..060dc08 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2675,6 +2675,7 @@ XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source)
 	char		path[MAXPGPATH];
 	ListCell   *cell;
 	int			fd;
+	List	   *tles;
 
 	/*
 	 * Loop looking for a suitable timeline ID: we might need to read any of
@@ -2685,8 +2686,21 @@ XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source)
 	 * to go backwards; this prevents us from picking up the wrong file when a
 	 * parent timeline extends to higher segment numbers than the child we
 	 * want to read.
-	 */
-	foreach(cell, expectedTLEs)
+	 *
+	 * If we haven't read the timeline history file yet, we don't know which
+	 * TLIs to scan, so read it now. We don't save the list in expectedTLEs,
+	 * however, unless we actually find a valid segment. That way if there is
+	 * neither timeline history file nor WAL segment in the archive, and
+	 * streaming replication is set up, we'll read the timeline history file
+	 * streamed from the master when we start streaming, instead of recovering
+	 * with a dummy history generated here.
+	 */
+	if (expectedTLEs)
+		tles = expectedTLEs;
+	else
+		tles = readTimeLineHistory(recoveryTargetTLI);
+
+	foreach(cell, tles)
 	{
 		TimeLineID	tli = ((TimeLineHistoryEntry *) lfirst(cell))->tli;
 
@@ -2699,6 +2713,8 @@ XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source)
 			if (fd != -1)
 			{
 				elog(DEBUG1, "got WAL segment from archive");
+				if (!expectedTLEs)
+					expectedTLEs = tles;
 				return fd;
 			}
 		}
@@ -2707,7 +2723,11 @@ XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source)
 		{
 			fd = XLogFileRead(segno, emode, tli, XLOG_FROM_PG_XLOG, true);
 			if (fd != -1)
+			{
+				if (!expectedTLEs)
+					expectedTLEs = tles;
 				return fd;
+			}
 		}
 	}
 
@@ -5279,49 +5299,6 @@ StartupXLOG(void)
 	 */
 	readRecoveryCommandFile();
 
-	/* Now we can determine the list of expected TLIs */
-	expectedTLEs = readTimeLineHistory(recoveryTargetTLI);
-
-	/*
-	 * If the location of the checkpoint record is not on the expected
-	 * timeline in the history of the requested timeline, we cannot proceed:
-	 * the backup is not part of the history of the requested timeline.
-	 */
-	if (tliOfPointInHistory(ControlFile->checkPoint, expectedTLEs) !=
-			ControlFile->checkPointCopy.ThisTimeLineID)
-	{
-		XLogRecPtr switchpoint;
-
-		/*
-		 * tliSwitchPoint will throw an error if the checkpoint's timeline
-		 * is not in expectedTLEs at all.
-		 */
-		switchpoint = tliSwitchPoint(ControlFile->checkPointCopy.ThisTimeLineID, expectedTLEs);
-		ereport(FATAL,
-				(errmsg("requested timeline %u is not a child of this server's history",
-						recoveryTargetTLI),
-				 errdetail("Latest checkpoint is at %X/%X on timeline %u, but in the history of the requested timeline, the server forked off from that timeline at %X/%X",
-						   (uint32) (ControlFile->checkPoint >> 32),
-						   (uint32) ControlFile->checkPoint,
-						   ControlFile->checkPointCopy.ThisTimeLineID,
-						   (uint32) (switchpoint >> 32),
-						   (uint32) switchpoint)));
-	}
-
-	/*
-	 * The min recovery point should be part of the requested timeline's
-	 * history, too.
-	 */
-	if (!XLogRecPtrIsInvalid(ControlFile->minRecoveryPoint) &&
-		tliOfPointInHistory(ControlFile->minRecoveryPoint - 1, expectedTLEs) !=
-			ControlFile->minRecoveryPointTLI)
-		ereport(FATAL,
-				(errmsg("requested timeline %u does not contain minimum recovery point %X/%X on timeline %u",
-						recoveryTargetTLI,
-						(uint32) (ControlFile->minRecoveryPoint >> 32),
-						(uint32) ControlFile->minRecoveryPoint,
-						ControlFile->minRecoveryPointTLI)));
-
 	/*
 	 * Save archive_cleanup_command in shared memory so that other processes
 	 * can see it.
@@ -5443,6 +5420,47 @@ StartupXLOG(void)
 		wasShutdown = (record->xl_info == XLOG_CHECKPOINT_SHUTDOWN);
 	}
 
+	/*
+	 * If the location of the checkpoint record is not on the expected
+	 * timeline in the history of the requested timeline, we cannot proceed:
+	 * the backup is not part of the history of the requested timeline.
+	 */
+	Assert(expectedTLEs); /* was initialized by reading checkpoint record */
+	if (tliOfPointInHistory(checkPointLoc, expectedTLEs) !=
+			checkPoint.ThisTimeLineID)
+	{
+		XLogRecPtr switchpoint;
+
+		/*
+		 * tliSwitchPoint will throw an error if the checkpoint's timeline
+		 * is not in expectedTLEs at all.
+		 */
+		switchpoint = tliSwitchPoint(ControlFile->checkPointCopy.ThisTimeLineID, expectedTLEs);
+		ereport(FATAL,
+				(errmsg("requested timeline %u is not a child of this server's history",
+						recoveryTargetTLI),
+				 errdetail("Latest checkpoint is at %X/%X on timeline %u, but in the history of the requested timeline, the server forked off from that timeline at %X/%X",
+						   (uint32) (ControlFile->checkPoint >> 32),
+						   (uint32) ControlFile->checkPoint,
+						   ControlFile->checkPointCopy.ThisTimeLineID,
+						   (uint32) (switchpoint >> 32),
+						   (uint32) switchpoint)));
+	}
+
+	/*
+	 * The min recovery point should be part of the requested timeline's
+	 * history, too.
+	 */
+	if (!XLogRecPtrIsInvalid(ControlFile->minRecoveryPoint) &&
+		tliOfPointInHistory(ControlFile->minRecoveryPoint - 1, expectedTLEs) !=
+			ControlFile->minRecoveryPointTLI)
+		ereport(FATAL,
+				(errmsg("requested timeline %u does not contain minimum recovery point %X/%X on timeline %u",
+						recoveryTargetTLI,
+						(uint32) (ControlFile->minRecoveryPoint >> 32),
+						(uint32) ControlFile->minRecoveryPoint,
+						ControlFile->minRecoveryPointTLI)));
+
 	LastRec = RecPtr = checkPointLoc;
 
 	ereport(DEBUG1,
@@ -9569,13 +9587,24 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					 */
 					if (PrimaryConnInfo)
 					{
-						XLogRecPtr ptr = fetching_ckpt ? RedoStartLSN : RecPtr;
-						TimeLineID tli = tliOfPointInHistory(ptr, expectedTLEs);
+						XLogRecPtr ptr;
+						TimeLineID tli;
 
-						if (curFileTLI > 0 && tli < curFileTLI)
-							elog(ERROR, "according to history file, WAL location %X/%X belongs to timeline %u, but previous recovered WAL file came from timeline %u",
-								 (uint32) (ptr >> 32), (uint32) ptr,
-								 tli, curFileTLI);
+						if (fetching_ckpt)
+						{
+							ptr = RedoStartLSN;
+							tli = ControlFile->checkPointCopy.ThisTimeLineID;
+						}
+						else
+						{
+							ptr = RecPtr;
+							tli = tliOfPointInHistory(ptr, expectedTLEs);
+
+							if (curFileTLI > 0 && tli < curFileTLI)
+								elog(ERROR, "according to history file, WAL location %X/%X belongs to timeline %u, but previous recovered WAL file came from timeline %u",
+									 (uint32) (ptr >> 32), (uint32) ptr,
+									 tli, curFileTLI);
+						}
 						curFileTLI = tli;
 						RequestXLogStreaming(curFileTLI, ptr, PrimaryConnInfo);
 					}
@@ -9739,11 +9768,16 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 				{
 					/*
 					 * Great, streamed far enough.  Open the file if it's not
-					 * open already.  Use XLOG_FROM_STREAM so that source info
-					 * is set correctly and XLogReceiptTime isn't changed.
+					 * open already.  Also read the timeline history file if
+					 * we haven't initialized timeline history yet; it should
+					 * be streamed over and present in pg_xlog by now.  Use
+					 * XLOG_FROM_STREAM so that source info is set correctly
+					 * and XLogReceiptTime isn't changed.
 					 */
 					if (readFile < 0)
 					{
+						if (!expectedTLEs)
+							expectedTLEs = readTimeLineHistory(receiveTLI);
 						readFile = XLogFileRead(readSegNo, PANIC,
 												receiveTLI,
 												XLOG_FROM_STREAM, false);
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 326c313..99280d8 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -338,7 +338,7 @@ WalReceiverMain(void)
 		 * ensure that a unique timeline id is chosen in every case, but let's
 		 * avoid the confusion of timeline id collisions where we can.
 		 */
-		WalRcvFetchTimeLineHistoryFiles(startpointTLI + 1, primaryTLI);
+		WalRcvFetchTimeLineHistoryFiles(startpointTLI, primaryTLI);
 
 		/*
 		 * Start streaming.
@@ -627,7 +627,7 @@ WalRcvFetchTimeLineHistoryFiles(TimeLineID first, TimeLineID last)
 
 	for (tli = first; tli <= last; tli++)
 	{
-		if (!existsTimeLineHistory(tli))
+		if (tli != 1 && !existsTimeLineHistory(tli))
 		{
 			char	   *fname;
 			char	   *content;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6f352fd..30d877b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3456,19 +3456,36 @@ PreallocXlogFiles(XLogRecPtr endptr)
 }
 
 /*
- * Get the log/seg of the latest removed or recycled WAL segment.
- * Returns 0/0 if no WAL segments have been removed since startup.
+ * Throws an error if the given log segment has already been removed or
+ * recycled. The caller should only pass a segment that it knows to have
+ * existed while the server has been running, as this function always
+ * succeeds if no WAL segments have been removed since startup.
+ * 'tli' is only used in the error message.
  */
 void
-XLogGetLastRemoved(uint32 *log, uint32 *seg)
+CheckXLogRemoved(uint32 log, uint32 seg, TimeLineID tli)
 {
 	/* use volatile pointer to prevent code rearrangement */
 	volatile XLogCtlData *xlogctl = XLogCtl;
+	uint32		lastRemovedLog,
+				lastRemovedSeg;
 
 	SpinLockAcquire(&xlogctl->info_lck);
-	*log = xlogctl->lastRemovedLog;
-	*seg = xlogctl->lastRemovedSeg;
+	lastRemovedLog = xlogctl->lastRemovedLog;
+	lastRemovedSeg = xlogctl->lastRemovedSeg;
 	SpinLockRelease(&xlogctl->info_lck);
+
+	if (log < lastRemovedLog ||
+		(log == lastRemovedLog && seg <= lastRemovedSeg))
+	{
+		char		filename[MAXFNAMELEN];
+
+		XLogFileName(filename, tli, log, seg);
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("requested WAL segment %s has already been removed",
+						filename)));
+	}
 }
 
 /*
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index bc95215..3f46bfc 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -55,11 +55,10 @@ static void base_backup_cleanup(int code, Datum arg);
 static void perform_base_backup(basebackup_options *opt, DIR *tblspcdir);
 static void parse_basebackup_options(List *options, basebackup_options *opt);
 static void SendXlogRecPtrResult(XLogRecPtr ptr);
+static int compareWalFileNames(const void *a, const void *b);
 
 /*
  * Size of each block sent into the tar stream for larger files.
- *
- * XLogSegSize *MUST* be evenly dividable by this
  */
 #define TAR_SEND_SIZE 32768
 
@@ -219,70 +218,203 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir)
 	{
 		/*
 		 * We've left the last tar file "open", so we can now append the
-		 * required WAL files to it.
+		 * required WAL files to it. I'd rather not worry about timelines
+		 * here, so include WAL files belonging to any timeline, as long as
+		 * it's in the right WAL range, between 'startptr' and 'endptr'.
 		 */
+		char		pathbuf[MAXPGPATH];
 		uint32		logid,
 					logseg;
+		uint32		startlogid,
+					startlogseg;
 		uint32		endlogid,
 					endlogseg;
 		struct stat statbuf;
+		List	   *historyFileList = NIL;
+		List	   *walFileList = NIL;
+		char	  **walFiles;
+		int			nWalFiles;
+		char		firstoff[MAXFNAMELEN];
+		char		lastoff[MAXFNAMELEN];
+		DIR		   *dir;
+		struct dirent *de;
+		int			i;
+		ListCell   *lc;
+		TimeLineID	tli;
 
-		MemSet(&statbuf, 0, sizeof(statbuf));
-		statbuf.st_mode = S_IRUSR | S_IWUSR;
-#ifndef WIN32
-		statbuf.st_uid = geteuid();
-		statbuf.st_gid = getegid();
-#endif
-		statbuf.st_size = XLogSegSize;
-		statbuf.st_mtime = time(NULL);
-
-		XLByteToSeg(startptr, logid, logseg);
+		XLByteToSeg(startptr, startlogid, startlogseg);
+		XLogFileName(firstoff, ThisTimeLineID, startlogid, startlogseg);
 		XLByteToPrevSeg(endptr, endlogid, endlogseg);
-
-		while (true)
+		XLogFileName(lastoff, ThisTimeLineID, endlogid, endlogseg);
+		/* Read list of eligible WAL files into an array */
+		dir = AllocateDir("pg_xlog");
+		if (!dir)
+			ereport(ERROR,
+					(errmsg("could not open directory \"%s\": %m", "pg_xlog")));
+		while ((de = ReadDir(dir, "pg_xlog")) != NULL)
 		{
-			/* Send another xlog segment */
-			char		fn[MAXPGPATH];
-			int			i;
+			/* Does it look like a WAL segment, and is it in the range? */
+			if (strlen(de->d_name) == 24 &&
+				strspn(de->d_name, "0123456789ABCDEF") == 24 &&
+				strcmp(de->d_name + 8, firstoff + 8) >= 0 &&
+				strcmp(de->d_name + 8, lastoff + 8) <= 0)
+			{
+				walFileList = lappend(walFileList, pstrdup(de->d_name));
+			}
+			/* Does it look like a timeline history file? */
+			else if (strlen(de->d_name) == 8 + strlen(".history") &&
+					 strspn(de->d_name, "0123456789ABCDEF") == 8 &&
+					 strcmp(de->d_name + 8, ".history") == 0)
+			{
+				historyFileList = lappend(historyFileList, pstrdup(de->d_name));
+			}
+		}
+		FreeDir(dir);
 
-			XLogFilePath(fn, ThisTimeLineID, logid, logseg);
-			_tarWriteHeader(fn, NULL, &statbuf);
+		/*
+		 * Before we go any further, check that none of the WAL segments we
+		 * need were removed.
+		 */
+		CheckXLogRemoved(startlogid, startlogseg, ThisTimeLineID);
+
+		/*
+		 * Put the WAL filenames into an array, and sort. We send the files
+		 * in order from oldest to newest, to reduce the chance that a file
+		 * is recycled before we get a chance to send it over.
+		 */
+		nWalFiles = list_length(walFileList);
+		walFiles = palloc(nWalFiles * sizeof(char *));
+		i = 0;
+		foreach(lc, walFileList)
+		{
+			walFiles[i++] = lfirst(lc);
+		}
+		qsort(walFiles, nWalFiles, sizeof(char *), compareWalFileNames);
 
-			/* Send the actual WAL file contents, block-by-block */
-			for (i = 0; i < XLogSegSize / TAR_SEND_SIZE; i++)
+		/*
+		 * Sanity check: the first and last segment should include startptr
+		 * and endptr, with no gaps in between.
+		 */
+		XLogFromFileName(walFiles[0], &tli, &logid, &logseg);
+		if (logid != startlogid || logseg != startlogseg)
+		{
+			char startfname[MAXFNAMELEN];
+			XLogFileName(startfname, ThisTimeLineID, startlogid, startlogseg);
+			ereport(ERROR,
+					(errmsg("could not find WAL file %s", startfname)));
+		}
+		for (i = 0; i < nWalFiles; i++)
+		{
+			int		currlogid = logid,
+					currlogseg = logseg;
+			int		nextlogid = logid,
+					nextlogseg = logseg;
+			NextLogSeg(nextlogid, nextlogseg);
+
+			XLogFromFileName(walFiles[i], &tli, &logid, &logseg);
+			if (!((nextlogid == logid && nextlogseg == logseg) ||
+				  (currlogid == logid && currlogseg == logseg)))
 			{
-				char		buf[TAR_SEND_SIZE];
-				XLogRecPtr	ptr;
+				char nextfname[MAXFNAMELEN];
+				XLogFileName(nextfname, ThisTimeLineID, nextlogid, nextlogseg);
+				ereport(ERROR,
+						(errmsg("could not find WAL file %s", nextfname)));
+			}
+		}
+		if (logid != endlogid || logseg != endlogseg)
+		{
+			char endfname[MAXFNAMELEN];
+			XLogFileName(endfname, ThisTimeLineID, endlogid, endlogseg);
+			ereport(ERROR,
+					(errmsg("could not find WAL file %s", endfname)));
+		}
 
-				ptr.xlogid = logid;
-				ptr.xrecoff = logseg * XLogSegSize + TAR_SEND_SIZE * i;
+		/* Ok, we have everything we need. Send the WAL files. */
+		for (i = 0; i < nWalFiles; i++)
+		{
+			FILE	   *fp;
+			char		buf[TAR_SEND_SIZE];
+			size_t		cnt;
+			pgoff_t		len = 0;
+
+			snprintf(pathbuf, MAXPGPATH, "./pg_xlog/%s", walFiles[i]);
+			XLogFromFileName(walFiles[i], &tli, &logid, &logseg);
 
+			fp = AllocateFile(pathbuf, "rb");
+			if (fp == NULL)
+			{
 				/*
-				 * Some old compilers, e.g. gcc 2.95.3/x86, think that passing
-				 * a struct in the same function as a longjump might clobber a
-				 * variable.  bjm 2011-02-04
-				 * http://lists.apple.com/archives/xcode-users/2003/Dec//msg000
-				 * 51.html
+				 * Most likely reason for this is that the file was already
+				 * removed by a checkpoint, so check for that to get a better
+				 * error message.
 				 */
-				XLogRead(buf, ptr, TAR_SEND_SIZE);
-				if (pq_putmessage('d', buf, TAR_SEND_SIZE))
+				CheckXLogRemoved(logid, logseg, tli);
+
+				ereport(ERROR,
+						(errcode_for_file_access(),
+						 errmsg("could not open file \"%s\": %m", pathbuf)));
+			}
+
+			if (fstat(fileno(fp), &statbuf) != 0)
+				ereport(ERROR,
+						(errcode_for_file_access(),
+						 errmsg("could not stat file \"%s\": %m",
+								pathbuf)));
+			if (statbuf.st_size != XLogSegSize)
+			{
+				CheckXLogRemoved(logid, logseg, tli);
+				ereport(ERROR,
+						(errcode_for_file_access(),
+						 errmsg("unexpected WAL file size \"%s\"", walFiles[i])));
+			}
+
+			_tarWriteHeader(pathbuf + 1, NULL, &statbuf);
+
+			while ((cnt = fread(buf, 1, Min(sizeof(buf), XLogSegSize - len), fp)) > 0)
+			{
+				CheckXLogRemoved(logid, logseg, tli);
+				/* Send the chunk as a CopyData message */
+				if (pq_putmessage('d', buf, cnt))
 					ereport(ERROR,
 							(errmsg("base backup could not send data, aborting backup")));
+
+				len += cnt;
+				if (len == XLogSegSize)
+					break;
 			}
 
-			/*
-			 * Files are always fixed size, and always end on a 512 byte
-			 * boundary, so padding is never necessary.
-			 */
+			if (len != XLogSegSize)
+			{
+				CheckXLogRemoved(logid, logseg, tli);
+				ereport(ERROR,
+						(errcode_for_file_access(),
+						 errmsg("unexpected WAL file size \"%s\"", walFiles[i])));
+			}
 
+			/* XLogSegSize is a multiple of 512, so no need for padding */
+			FreeFile(fp);
+		}
 
-			/* Advance to the next WAL file */
-			NextLogSeg(logid, logseg);
+		/*
+		 * Send timeline history files too. Only the latest timeline history
+		 * file is required for recovery, and even that only if there happens
+		 * to be a timeline switch in the first WAL segment that contains the
+		 * checkpoint record, or if we're taking a base backup from a standby
+		 * server and the target timeline changes while the backup is taken. 
+		 * But they are small and highly useful for debugging purposes, so
+		 * better include them all, always.
+		 */
+		foreach(lc, historyFileList)
+		{
+			char *fname = lfirst(lc);
+			snprintf(pathbuf, MAXPGPATH, "./pg_xlog/%s", fname);
 
-			/* Have we reached our stop position yet? */
-			if (logid > endlogid ||
-				(logid == endlogid && logseg > endlogseg))
-				break;
+			if (lstat(pathbuf, &statbuf) != 0)
+				ereport(ERROR,
+						(errcode_for_file_access(),
+						 errmsg("could not stat file \"%s\": %m", pathbuf)));
+
+			sendFile(pathbuf, pathbuf + 1, &statbuf, false);
 		}
 
 		/* Send CopyDone message for the last tar file */
@@ -292,6 +424,19 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir)
 }
 
 /*
+ * qsort comparison function, to compare log/seg portion of WAL segment
+ * filenames, ignoring the timeline portion.
+ */
+static int
+compareWalFileNames(const void *a, const void *b)
+{
+	char *fna = *((char **) a);
+	char *fnb = *((char **) b);
+
+	return strcmp(fna + 8, fnb + 8);
+}
+
+/*
  * Parse the base backup options passed down by the parser
  */
 static void
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 6c27449..5c93146 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -977,8 +977,6 @@ XLogRead(char *buf, XLogRecPtr startptr, Size count)
 	char	   *p;
 	XLogRecPtr	recptr;
 	Size		nbytes;
-	uint32		lastRemovedLog;
-	uint32		lastRemovedSeg;
 	uint32		log;
 	uint32		seg;
 
@@ -1073,19 +1071,8 @@ retry:
 	 * read() succeeds in that case, but the data we tried to read might
 	 * already have been overwritten with new WAL records.
 	 */
-	XLogGetLastRemoved(&lastRemovedLog, &lastRemovedSeg);
 	XLByteToSeg(startptr, log, seg);
-	if (log < lastRemovedLog ||
-		(log == lastRemovedLog && seg <= lastRemovedSeg))
-	{
-		char		filename[MAXFNAMELEN];
-
-		XLogFileName(filename, ThisTimeLineID, log, seg);
-		ereport(ERROR,
-				(errcode_for_file_access(),
-				 errmsg("requested WAL segment %s has already been removed",
-						filename)));
-	}
+	CheckXLogRemoved(log, seg, ThisTimeLineID);
 
 	/*
 	 * During recovery, the currently-open WAL file might be replaced with the
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index ecd3f0f..c21e43a 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -275,7 +275,7 @@ extern int XLogFileInit(uint32 log, uint32 seg,
 extern int	XLogFileOpen(uint32 log, uint32 seg);
 
 
-extern void XLogGetLastRemoved(uint32 *log, uint32 *seg);
+extern void CheckXLogRemoved(uint32 log, uint32 seg, TimeLineID tli);
 extern void XLogSetAsyncXactLSN(XLogRecPtr record);
 
 extern Buffer RestoreBackupBlock(XLogRecPtr lsn, XLogRecord *record,
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)