On Wed, Jul 27, 2022 at 6:06 PM Thomas Munro <[email protected]> wrote: > On Wed, Jul 27, 2022 at 5:01 PM Thomas Munro <[email protected]> wrote: > > Someone else still has the old file open, so we can't rename the new > > one to its name? On Windows that should have gone through pgrename() > > in dirmod.c, which would retry 100 times with a 100ms sleep between. > > Since every backend reads that file (I added an elog() and saw it 2289 > > times during make check), I guess you can run out of luck. > > > > /me thinks > > Maybe we just have to rearrange the locking slightly? Something like > the attached.
Erm, let me try that again, this time with the CloseTransientFile() also under the lock, so that we never have a file handle without a lock.
From 3f2b3e0e81f4dcf69a658df35248a0dbaf638eaf Mon Sep 17 00:00:00 2001 From: Thomas Munro <[email protected]> Date: Wed, 27 Jul 2022 18:01:12 +1200 Subject: [PATCH v2] Fix read_relmap_file() concurrency on Windows. Commit d8cd0c6c introduced a file rename that could fail on Windows, probably due to other backends having an open file handle to the old file of the same name. Re-arrange the locking slightly to prevent that, by making sure the open() and close() run while we hold the lock. Discussion: https://postgr.es/m/CA%2BhUKGLZtCTgp4NTWV-wGbR2Nyag71%3DEfYTKjDKnk%2BfkhuFMHw%40mail.gmail.com --- src/backend/utils/cache/relmapper.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c index 79e09181b6..7dca3be723 100644 --- a/src/backend/utils/cache/relmapper.c +++ b/src/backend/utils/cache/relmapper.c @@ -788,16 +788,6 @@ read_relmap_file(RelMapFile *map, char *dbpath, bool lock_held, int elevel) Assert(elevel >= ERROR); - /* Open the target file. */ - snprintf(mapfilename, sizeof(mapfilename), "%s/%s", dbpath, - RELMAPPER_FILENAME); - fd = OpenTransientFile(mapfilename, O_RDONLY | PG_BINARY); - if (fd < 0) - ereport(elevel, - (errcode_for_file_access(), - errmsg("could not open file \"%s\": %m", - mapfilename))); - /* * Grab the lock to prevent the file from being updated while we read it, * unless the caller is already holding the lock. If the file is updated @@ -808,6 +798,16 @@ read_relmap_file(RelMapFile *map, char *dbpath, bool lock_held, int elevel) if (!lock_held) LWLockAcquire(RelationMappingLock, LW_SHARED); + /* Open the target file. */ + snprintf(mapfilename, sizeof(mapfilename), "%s/%s", dbpath, + RELMAPPER_FILENAME); + fd = OpenTransientFile(mapfilename, O_RDONLY | PG_BINARY); + if (fd < 0) + ereport(elevel, + (errcode_for_file_access(), + errmsg("could not open file \"%s\": %m", + mapfilename))); + /* Now read the data. */ pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_READ); r = read(fd, map, sizeof(RelMapFile)); @@ -825,15 +825,15 @@ read_relmap_file(RelMapFile *map, char *dbpath, bool lock_held, int elevel) } pgstat_report_wait_end(); - if (!lock_held) - LWLockRelease(RelationMappingLock); - if (CloseTransientFile(fd) != 0) ereport(elevel, (errcode_for_file_access(), errmsg("could not close file \"%s\": %m", mapfilename))); + if (!lock_held) + LWLockRelease(RelationMappingLock); + /* check for correct magic number, etc */ if (map->magic != RELMAPPER_FILEMAGIC || map->num_mappings < 0 || -- 2.35.1
