On Wed, Oct 8, 2014 at 10:00 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> The additional process at promotion sounds like a good idea, I'll try to > get a patch done tomorrow. This would result as well in removing the > XLogArchiveForceDone stuff. Either way, not that I have been able to > reproduce the problem manually, things can be clearly solved. > Please find attached two patches aimed to fix this issue and to improve the situation: - 0001 prevents the apparition of those phantom WAL segment file by ensuring that when a node is in recovery it will remove it whatever its status in archive_status. This patch is the real fix, and should be applied down to 9.2. - 0002 is a patch implementing Heikki's idea of enforcing all the segment files present in pg_xlog to have their status to .done, marking them for removal. When looking at the code, I finally concluded that Fujii-san's point, about marking in all cases as .done segment files that have been fully streamed, actually makes more sense to not be backward. This patch would actually not be mandatory for back-patching, but it makes the process more robust IMO. I imagine that it would be nice to get those things fixed before the next minor release. Regards, -- Michael
From 18c2d47b1d5dd3c0439f990ee4da6b305d477ca4 Mon Sep 17 00:00:00 2001 From: Michael Paquier <mich...@otacoo.com> Date: Thu, 9 Oct 2014 15:04:26 +0900 Subject: [PATCH 1/2] Fix apparition of archive status files of .ready on streaming standbys Commit 1bd42cd has removed a check based on the recovery status of a node when removing old WAL segment files in pg_xlog, causing the apparition of .ready files that prevented the removal of some WAL segment files that remained stuck in the archive folder. Note that this does not prevent the abscence of some .done files as it may still be possible that some segments cannot be marked correctly as complete after their stream is done in the case for example of an abrupt disconnection between a standby and its root node, but it ensures that when a node is in recovery old WAL segment files are removed whatever their status in the folder archive_status. Per report from Jehan-Guillaume de Rorthais. --- src/backend/access/transam/xlog.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 5a4dbb9..39701a3 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -3771,7 +3771,7 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr endptr) strspn(xlde->d_name, "0123456789ABCDEF") == 24 && strcmp(xlde->d_name + 8, lastoff + 8) <= 0) { - if (XLogArchiveCheckDone(xlde->d_name)) + if (RecoveryInProgress() || XLogArchiveCheckDone(xlde->d_name)) { snprintf(path, MAXPGPATH, XLOGDIR "/%s", xlde->d_name); -- 2.1.2
From 493a9d05e9386403fc1e6d78df19dd8a59c53cae Mon Sep 17 00:00:00 2001 From: Michael Paquier <mich...@otacoo.com> Date: Thu, 9 Oct 2014 15:09:44 +0900 Subject: [PATCH 2/2] Enforce all WAL segment files to be marked as .done at node promotion This is a safety mechanism to ensure that there are no files that are not considered as .done as some segments may have been missed particularly in the case of partially written files or files being written when a disconnection occurred between a streaming standby and its root node. This makes the node reaching promotion having a state consistent with what is expected using the assumption that all the WAL segment files that are done being streamed should be always considered as archived by the node. --- src/backend/access/transam/xlog.c | 10 ++++++++ src/backend/access/transam/xlogarchive.c | 39 +++++++++++++++++++++++++++++++- src/include/access/xlog_internal.h | 1 + 3 files changed, 49 insertions(+), 1 deletion(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 39701a3..af5f548 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6862,6 +6862,16 @@ StartupXLOG(void) } /* + * Create a .done entry for each WAL file present in pg_xlog that has + * not been yet marked as such. In some cases where for example a streaming + * replica has had a connection to a remote node cut abruptly, such WAL + * files may have been only partially written or even not flagged correctly + * with .done. + */ + if (InRecovery) + XLogArchiveForceDoneAll(); + + /* * Kill WAL receiver, if it's still running, before we continue to write * the startup checkpoint record. It will trump over the checkpoint and * subsequent records if it's still alive when we start writing WAL. diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c index 047efa2..931106f 100644 --- a/src/backend/access/transam/xlogarchive.c +++ b/src/backend/access/transam/xlogarchive.c @@ -554,6 +554,40 @@ XLogArchiveNotifySeg(XLogSegNo segno) } /* + * XLogArchiveForceDoneAll + * + * Wrapper of XLogArchiveForceDone scanning all the XLOG segments files in + * XLOGDIR, switching them forcibly to <XLOG>.done. + */ +void +XLogArchiveForceDoneAll(void) +{ + DIR *xldir; + struct dirent *xlde; + + + xldir = AllocateDir(XLOGDIR); + if (xldir == NULL) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not open transaction log directory \"%s\": %m", + XLOGDIR))); + + /* + * Scan all the WAL segments present in the archives and switch them to + * .done. + */ + while ((xlde = ReadDir(xldir, XLOGDIR)) != NULL) + { + if (strlen(xlde->d_name) == 24 && + strspn(xlde->d_name, "0123456789ABCDEF") == 24) + XLogArchiveForceDone(xlde->d_name); + } + + FreeDir(xldir); +} + +/* * XLogArchiveForceDone * * Emit notification forcibly that an XLOG segment file has been successfully @@ -582,7 +616,6 @@ XLogArchiveForceDone(const char *xlog) (errcode_for_file_access(), errmsg("could not rename file \"%s\" to \"%s\": %m", archiveReady, archiveDone))); - return; } @@ -604,6 +637,10 @@ XLogArchiveForceDone(const char *xlog) archiveDone))); return; } + + ereport(DEBUG2, + (errmsg("archive status file of \"%s\" has been forcibly switched from \"%s\" to \"%s\"", + xlog, archiveReady, archiveDone))); } /* diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h index 27b9899..04a0de3 100644 --- a/src/include/access/xlog_internal.h +++ b/src/include/access/xlog_internal.h @@ -286,6 +286,7 @@ extern void ExecuteRecoveryCommand(char *command, char *commandName, extern void KeepFileRestoredFromArchive(char *path, char *xlogfname); extern void XLogArchiveNotify(const char *xlog); extern void XLogArchiveNotifySeg(XLogSegNo segno); +extern void XLogArchiveForceDoneAll(void); extern void XLogArchiveForceDone(const char *xlog); extern bool XLogArchiveCheckDone(const char *xlog); extern bool XLogArchiveIsBusy(const char *xlog); -- 2.1.2
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers