On Wed, Jul 27, 2022 at 6:06 PM Thomas Munro <[email protected]> wrote:
> On Wed, Jul 27, 2022 at 5:01 PM Thomas Munro <[email protected]> wrote:
> > Someone else still has the old file open, so we can't rename the new
> > one to its name?  On Windows that should have gone through pgrename()
> > in dirmod.c, which would retry 100 times with a 100ms sleep between.
> > Since every backend reads that file (I added an elog() and saw it 2289
> > times during make check), I guess you can run out of luck.
> >
> > /me thinks
>
> Maybe we just have to rearrange the locking slightly?  Something like
> the attached.

Erm, let me try that again, this time with the CloseTransientFile()
also under the lock, so that we never have a file handle without a
lock.
From 3f2b3e0e81f4dcf69a658df35248a0dbaf638eaf Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Wed, 27 Jul 2022 18:01:12 +1200
Subject: [PATCH v2] Fix read_relmap_file() concurrency on Windows.

Commit d8cd0c6c introduced a file rename that could fail on Windows,
probably due to other backends having an open file handle to the old
file of the same name.  Re-arrange the locking slightly to prevent that,
by making sure the open() and close() run while we hold the lock.

Discussion: https://postgr.es/m/CA%2BhUKGLZtCTgp4NTWV-wGbR2Nyag71%3DEfYTKjDKnk%2BfkhuFMHw%40mail.gmail.com
---
 src/backend/utils/cache/relmapper.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 79e09181b6..7dca3be723 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -788,16 +788,6 @@ read_relmap_file(RelMapFile *map, char *dbpath, bool lock_held, int elevel)
 
 	Assert(elevel >= ERROR);
 
-	/* Open the target file. */
-	snprintf(mapfilename, sizeof(mapfilename), "%s/%s", dbpath,
-			 RELMAPPER_FILENAME);
-	fd = OpenTransientFile(mapfilename, O_RDONLY | PG_BINARY);
-	if (fd < 0)
-		ereport(elevel,
-				(errcode_for_file_access(),
-				 errmsg("could not open file \"%s\": %m",
-						mapfilename)));
-
 	/*
 	 * Grab the lock to prevent the file from being updated while we read it,
 	 * unless the caller is already holding the lock.  If the file is updated
@@ -808,6 +798,16 @@ read_relmap_file(RelMapFile *map, char *dbpath, bool lock_held, int elevel)
 	if (!lock_held)
 		LWLockAcquire(RelationMappingLock, LW_SHARED);
 
+	/* Open the target file. */
+	snprintf(mapfilename, sizeof(mapfilename), "%s/%s", dbpath,
+			 RELMAPPER_FILENAME);
+	fd = OpenTransientFile(mapfilename, O_RDONLY | PG_BINARY);
+	if (fd < 0)
+		ereport(elevel,
+				(errcode_for_file_access(),
+				 errmsg("could not open file \"%s\": %m",
+						mapfilename)));
+
 	/* Now read the data. */
 	pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_READ);
 	r = read(fd, map, sizeof(RelMapFile));
@@ -825,15 +825,15 @@ read_relmap_file(RelMapFile *map, char *dbpath, bool lock_held, int elevel)
 	}
 	pgstat_report_wait_end();
 
-	if (!lock_held)
-		LWLockRelease(RelationMappingLock);
-
 	if (CloseTransientFile(fd) != 0)
 		ereport(elevel,
 				(errcode_for_file_access(),
 				 errmsg("could not close file \"%s\": %m",
 						mapfilename)));
 
+	if (!lock_held)
+		LWLockRelease(RelationMappingLock);
+
 	/* check for correct magic number, etc */
 	if (map->magic != RELMAPPER_FILEMAGIC ||
 		map->num_mappings < 0 ||
-- 
2.35.1

Reply via email to