On Fri, Jan 22, 2016 at 9:32 PM, Michael Paquier
<[email protected]> wrote:
> On Fri, Jan 22, 2016 at 5:26 PM, Tomas Vondra
> <[email protected]> wrote:
>> On 01/22/2016 06:45 AM, Michael Paquier wrote:
>>> Here are some comments about your patch after a look at the code.
>>>
>>> Regarding the additions in fsync_fname() in xlog.c:
>>> 1) In InstallXLogFileSegment, rename() will be called only if
>>> HAVE_WORKING_LINK is not used, which happens only on Windows and
>>> cygwin. We could add it for consistency, but it should be within the
>>> #else/#endif block. It is not critical as of now.
>>> 2) The call in RemoveXlogFile is not necessary, the rename happening
>>> only on Windows.
>>
>> Hmmm, OK. Are we sure HAVE_WORKING_LINK is false only on Windows, or could
>> there be some other platforms? And are we sure the file systems on those
>> platforms are safe without the fsync call?
>> That is, while the report references ext4, there may be other file systems
>> with the same problem - ext4 was used mostly as it's the most widely used
>> Linux file system.
>
> From pg_config_manual.h:
> #if !defined(WIN32) && !defined(__CYGWIN__)
> #define HAVE_WORKING_LINK 1
> #endif
> If we want to be consistent with what Posix proposes, I am not against
> adding it.
I did some tests with NTFS using cygwin, and the rename() calls remain
even after powering off the VM. But I agree that adding an fsync() in
both cases would be fine.
>>> Thoughts?
>>
>> Thanks for the review and comments. I think the question is whether we only
>> want to do the additional fsync() only when it ultimately may lead to data
>> loss, or even in cases where it may cause operational issues (e.g. switching
>> back to recovery needlessly).
>> I'd vote for the latter, as I think it makes the database easier to operate
>> (less manual interventions) and the performance impact is 0 (as those fsyncs
>> are really rare).
>
> My first line of thoughts after looking at the patch is that I am not
> against adding those fsync calls on HEAD as there is roughly an
> advantage to not go back to recovery in most cases and ensure
> consistent names, but as they do not imply any data loss I would not
> encourage a back-patch. Adding them seems harmless at first sight I
> agree, but those are not actual bugs.
OK. It is true that PGDATA would be fsync'd in 4 code paths with your
patch which are not that much taken:
- Renaming tablespace map file and backup label file (three times)
- Renaming to recovery.done
So, what do you think about the patch attached? Moving the calls into
the critical sections is not really necessary except when installing a
new segment.
--
Michael
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index f6da673..4173a50 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -435,6 +435,12 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
(errcode_for_file_access(),
errmsg("could not rename file \"%s\" to \"%s\": %m",
tmppath, path)));
+
+ /*
+ * Make sure the rename is permanent by fsyncing the parent
+ * directory.
+ */
+ fsync_fname(XLOGDIR, true);
#endif
/* The history file can be archived immediately. */
@@ -525,6 +531,9 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
(errcode_for_file_access(),
errmsg("could not rename file \"%s\" to \"%s\": %m",
tmppath, path)));
+
+ /* Make sure the rename is permanent by fsyncing the directory. */
+ fsync_fname(XLOGDIR, true);
#endif
}
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a2846c4..b124f90 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3278,6 +3278,14 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
tmppath, path)));
return false;
}
+
+ /*
+ * Make sure the rename is permanent by fsyncing the parent
+ * directory.
+ */
+ START_CRIT_SECTION();
+ fsync_fname(XLOGDIR, true);
+ END_CRIT_SECTION();
#endif
if (use_lock)
@@ -3800,10 +3808,18 @@ RemoveXlogFile(const char *segname, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr)
path)));
return;
}
+
+ /*
+ * Make sure the rename is permanent by fsyncing the parent
+ * directory.
+ */
+ fsync_fname(XLOGDIR, true);
+
rc = unlink(newpath);
#else
rc = unlink(path);
#endif
+
if (rc != 0)
{
ereport(LOG,
@@ -5297,6 +5313,9 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog)
errmsg("could not rename file \"%s\" to \"%s\": %m",
RECOVERY_COMMAND_FILE, RECOVERY_COMMAND_DONE)));
+ /* Make sure the rename is permanent by fsyncing the data directory. */
+ fsync_fname(".", true);
+
ereport(LOG,
(errmsg("archive recovery complete")));
}
@@ -6150,6 +6169,12 @@ StartupXLOG(void)
TABLESPACE_MAP, BACKUP_LABEL_FILE),
errdetail("Could not rename file \"%s\" to \"%s\": %m.",
TABLESPACE_MAP, TABLESPACE_MAP_OLD)));
+
+ /*
+ * Make sure the rename is permanent by fsyncing the data
+ * directory.
+ */
+ fsync_fname(".", true);
}
/*
@@ -6525,6 +6550,13 @@ StartupXLOG(void)
TABLESPACE_MAP, TABLESPACE_MAP_OLD)));
}
+ /*
+ * Make sure the rename is permanent by fsyncing the parent
+ * directory.
+ */
+ if (haveBackupLabel || haveTblspcMap)
+ fsync_fname(".", true);
+
/* Check that the GUCs used to generate the WAL allow recovery */
CheckRequiredParameterValues();
@@ -7305,6 +7337,12 @@ StartupXLOG(void)
errmsg("could not rename file \"%s\" to \"%s\": %m",
origpath, partialpath)));
XLogArchiveNotify(partialfname);
+
+ /*
+ * Make sure the rename is permanent by fsyncing the parent
+ * directory.
+ */
+ fsync_fname(XLOGDIR, true);
}
}
}
@@ -10905,6 +10943,9 @@ CancelBackup(void)
BACKUP_LABEL_FILE, BACKUP_LABEL_OLD,
TABLESPACE_MAP, TABLESPACE_MAP_OLD)));
}
+
+ /* fsync the data directory to persist the renames */
+ fsync_fname(".", true);
}
/*
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 277c14a..8dda80b 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -477,6 +477,12 @@ KeepFileRestoredFromArchive(char *path, char *xlogfname)
path, xlogfpath)));
/*
+ * Make sure the renames are permanent by fsyncing the parent
+ * directory.
+ */
+ fsync_fname(XLOGDIR, true);
+
+ /*
* Create .done file forcibly to prevent the restored segment from being
* archived again later.
*/
@@ -586,6 +592,11 @@ XLogArchiveForceDone(const char *xlog)
errmsg("could not rename file \"%s\" to \"%s\": %m",
archiveReady, archiveDone)));
+ /*
+ * Make sure the rename is permanent by fsyncing the parent
+ * directory.
+ */
+ fsync_fname(XLOGDIR "/archive_status", true);
return;
}
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 397f802..7165c74 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -733,4 +733,10 @@ pgarch_archiveDone(char *xlog)
(errcode_for_file_access(),
errmsg("could not rename file \"%s\" to \"%s\": %m",
rlogready, rlogdone)));
+
+ /*
+ * Make sure the rename is permanent by fsyncing the parent
+ * directory.
+ */
+ fsync_fname(XLOGDIR "/archive_status", true);
}
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers