Re: Bug in slot.c and are replication slots ever used at Window?
On Fri, Aug 31, 2018 at 11:46:47AM -0700, Michael Paquier wrote: > The checks will not be included in the final fix, still they look useful > so I am planning to start a new thread on the matter as perhaps other > folks have more and/or better ideas. Pushed the fix down to 9.4, without the extra sanity checks. -- Michael signature.asc Description: PGP signature
Re: Bug in slot.c and are replication slots ever used at Window?
Hi Konstantin, On Thu, Aug 30, 2018 at 11:27:31AM -0700, Michael Paquier wrote: > It seems to me that you are right here, "path" points to > pg_replslot/$SLOTNAME/state which is a file so the fsync is incorrect. > I am not sure that we'd want to publish fsync_parent_path out of fd.c > though, so perhaps we could just save the slot path in a different > variable and use it? I have spent more time on this bug, and the code path you have pointed at is the only one having such an issue. Attached is a patch to fix the problem. It includes the sanity checks I have used to check all code paths calling fsync_fname() for both the frontend and the backend code. The checks will not be included in the final fix, still they look useful so I am planning to start a new thread on the matter as perhaps other folks have more and/or better ideas. -- Michael diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index 19978d9a9e..4f30904141 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -1352,6 +1352,7 @@ RestoreSlotFromDisk(const char *name) { ReplicationSlotOnDisk cp; int i; + char slotdir[MAXPGPATH]; char path[MAXPGPATH + 22]; int fd; bool restored = false; @@ -1361,13 +1362,14 @@ RestoreSlotFromDisk(const char *name) /* no need to lock here, no concurrent access allowed yet */ /* delete temp file if it exists */ - sprintf(path, "pg_replslot/%s/state.tmp", name); + sprintf(slotdir, "pg_replslot/%s", name); + sprintf(path, "%s/state.tmp", slotdir); if (unlink(path) < 0 && errno != ENOENT) ereport(PANIC, (errcode_for_file_access(), errmsg("could not remove file \"%s\": %m", path))); - sprintf(path, "pg_replslot/%s/state", name); + sprintf(path, "%s/state", slotdir); elog(DEBUG1, "restoring replication slot from \"%s\"", path); @@ -1402,7 +1404,7 @@ RestoreSlotFromDisk(const char *name) /* Also sync the parent directory */ START_CRIT_SECTION(); - fsync_fname(path, true); + fsync_fname(slotdir, true); END_CRIT_SECTION(); /* read part of statefile that's guaranteed to be version independent */ diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c index 8dd51f1767..49a5640c61 100644 --- a/src/backend/storage/file/fd.c +++ b/src/backend/storage/file/fd.c @@ -574,6 +574,14 @@ pg_flush_data(int fd, off_t offset, off_t nbytes) void fsync_fname(const char *fname, bool isdir) { +#ifdef USE_ASSERT_CHECKING + struct stat statbuf; + + stat(fname, ); + Assert((isdir && S_ISDIR(statbuf.st_mode)) || + (!isdir && !S_ISDIR(statbuf.st_mode))); +#endif + fsync_fname_ext(fname, isdir, false, ERROR); } diff --git a/src/common/file_utils.c b/src/common/file_utils.c index 48876061c3..36095b01af 100644 --- a/src/common/file_utils.c +++ b/src/common/file_utils.c @@ -266,6 +266,14 @@ fsync_fname(const char *fname, bool isdir, const char *progname) int flags; int returncode; +#ifdef USE_ASSERT_CHECKING + struct stat statbuf; + + stat(fname, ); + Assert((isdir && S_ISDIR(statbuf.st_mode)) || + (!isdir && !S_ISDIR(statbuf.st_mode))); +#endif + /* * Some OSs require directories to be opened read-only whereas other * systems don't allow us to fsync files opened read-only; so we need both signature.asc Description: PGP signature
Re: Bug in slot.c and are replication slots ever used at Window?
On Thu, Aug 30, 2018 at 11:00:43AM +0300, Konstantin Knizhnik wrote: > So if "isdir" is true (and it is true in this case), it sets O_RDONLY > flag. Then fsync_fname successfully opens slot file in readonly mode > and calls fsync() which at windows is substituted with _commit() which > in turn is wrapper for FlushFileBuffers. It seems to me that you are right here, "path" points to pg_replslot/$SLOTNAME/state which is a file so the fsync is incorrect. I am not sure that we'd want to publish fsync_parent_path out of fd.c though, so perhaps we could just save the slot path in a different variable and use it? -- Michael signature.asc Description: PGP signature
Bug in slot.c and are replication slots ever used at Window?
Hi hackers, I am really confused. If my conclusions are correct, then nobody ever tried to use replication slots at Windows! The function RestoreSlotFromDisk in slot.c contains the following code: static void RestoreSlotFromDisk(const char *name) { ReplicationSlotOnDisk cp; int i; char path[MAXPGPATH + 22]; int fd; bool restored = false; int readBytes; pg_crc32c checksum; /* no need to lock here, no concurrent access allowed yet */ /* delete temp file if it exists */ sprintf(path, "pg_replslot/%s/state.tmp", name); if (unlink(path) < 0 && errno != ENOENT) ereport(PANIC, (errcode_for_file_access(), errmsg("could not remove file \"%s\": %m", path))); sprintf(path, "pg_replslot/%s/state", name); elog(DEBUG1, "restoring replication slot from \"%s\"", path); fd = OpenTransientFile(path, O_RDWR | PG_BINARY); /* * We do not need to handle this as we are rename()ing the directory into * place only after we fsync()ed the state file. */ if (fd < 0) ereport(PANIC, (errcode_for_file_access(), errmsg("could not open file \"%s\": %m", path))); /* * Sync state file before we're reading from it. We might have crashed * while it wasn't synced yet and we shouldn't continue on that basis. */ pgstat_report_wait_start(WAIT_EVENT_REPLICATION_SLOT_RESTORE_SYNC); if (pg_fsync(fd) != 0) { int save_errno = errno; CloseTransientFile(fd); errno = save_errno; ereport(PANIC, (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", path))); } pgstat_report_wait_end(); /* Also sync the parent directory */ START_CRIT_SECTION(); fsync_fname(path, true); END_CRIT_SECTION(); ... Please notice that fsync_fname with comment "also sync parent directory" is called for path of the file! fsync_fname in turn does the following: /* * fsync_fname -- Try to fsync a file or directory * * Ignores errors trying to open unreadable files, or trying to fsync * directories on systems where that isn't allowed/required. Reports * other errors non-fatally. */ int fsync_fname(const char *fname, bool isdir, const char *progname) { int fd; int flags; int returncode; /* * Some OSs require directories to be opened read-only whereas other * systems don't allow us to fsync files opened read-only; so we need both * cases here. Using O_RDWR will cause us to fail to fsync files that are * not writable by our userid, but we assume that's OK. */ flags = PG_BINARY; if (!isdir) flags |= O_RDWR; else flags |= O_RDONLY; /* * Open the file, silently ignoring errors about unreadable files (or * unsupported operations, e.g. opening a directory under Windows), and * logging others. */ fd = open(fname, flags); if (fd < 0) { if (errno == EACCES || (isdir && errno == EISDIR)) return 0; fprintf(stderr, _("%s: could not open file \"%s\": %s\n"), progname, fname, strerror(errno)); return -1; } returncode = fsync(fd); So if "isdir" is true (and it is true in this case), it sets O_RDONLY flag. Then fsync_fname successfully opens slot file in readonly mode and calls fsync() which at windows is substituted with _commit() which in turn is wrapper for FlushFileBuffers. Finally FlushFileBuffers returns ERROR_ACCESS_DENINED which cause assertion failure in _commit: if ( !FlushFileBuffers((HANDLE)_get_osfhandle(filedes)) ) { retval = GetLastError(); } else { retval = 0; /* return success */ } /* map the OS return code to C errno value and return code */ if (retval == 0) goto good; _doserrno = retval; } errno = EBADF; retval = -1; _ASSERTE(("Invalid file descriptor. File possibly closed by a different thread",0)); I think that the problem happen only with debug version of postgres. Release version will just return error in this case which is silently ignored by RestoreSlotFromDisk function. I think that bug fix is trivial: we just need to use fsync_parent_path instead of fsync_fname in RestoreSlotFromDisk. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company