On Wed, Jul 27, 2022 at 5:01 PM Thomas Munro <[email protected]> wrote:
> On Wed, Jul 27, 2022 at 4:35 PM Michael Paquier <[email protected]> wrote:
> > On Tue, Jul 26, 2022 at 07:10:22PM +0000, Robert Haas wrote:
> > > Remove the restriction that the relmap must be 512 bytes.
>
> > The CI on Windows is blowing up here and there after something that
> > looks to come from this commit, as of this backtrace:
> > 00000000`007fe300 00000001`405c62dd     postgres!errfinish(
> > char * filename = 0x00000001`40bf1513 "fd.c",
> > int lineno = 0n756,
> > char * funcname = 0x00000001`40bf14e0 "durable_rename")+0x41b
> > [c:\cirrus\src\backend\utils\error\elog.c @ 683]
>
> And here's what the error looks like:
>
> 2022-07-26 19:38:04.321 GMT [8020][client backend]
> [pg_regress/vacuum][8/349:4527] PANIC:  could not rename file
> "global/pg_filenode.map.tmp" to "global/pg_filenode.map": Permission
> denied
>
> Someone else still has the old file open, so we can't rename the new
> one to its name?  On Windows that should have gone through pgrename()
> in dirmod.c, which would retry 100 times with a 100ms sleep between.
> Since every backend reads that file (I added an elog() and saw it 2289
> times during make check), I guess you can run out of luck.
>
> /me thinks

Maybe we just have to rearrange the locking slightly?  Something like
the attached.
From c4f52c8cabb35d7db6d24681227146a195688c39 Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Wed, 27 Jul 2022 18:01:12 +1200
Subject: [PATCH] Fix read_relmap_file() concurrency on Windows.

Commit d8cd0c6c introduced a file rename that could fail on Windows due
to other backends reading an old file of the same name.  Re-arrange the
locking slightly to prevent that.

Discussion: https://postgr.es/m/CA%2BhUKGLZtCTgp4NTWV-wGbR2Nyag71%3DEfYTKjDKnk%2BfkhuFMHw%40mail.gmail.com
---
 src/backend/utils/cache/relmapper.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c
index 79e09181b6..4562a0ad6f 100644
--- a/src/backend/utils/cache/relmapper.c
+++ b/src/backend/utils/cache/relmapper.c
@@ -788,16 +788,6 @@ read_relmap_file(RelMapFile *map, char *dbpath, bool lock_held, int elevel)
 
 	Assert(elevel >= ERROR);
 
-	/* Open the target file. */
-	snprintf(mapfilename, sizeof(mapfilename), "%s/%s", dbpath,
-			 RELMAPPER_FILENAME);
-	fd = OpenTransientFile(mapfilename, O_RDONLY | PG_BINARY);
-	if (fd < 0)
-		ereport(elevel,
-				(errcode_for_file_access(),
-				 errmsg("could not open file \"%s\": %m",
-						mapfilename)));
-
 	/*
 	 * Grab the lock to prevent the file from being updated while we read it,
 	 * unless the caller is already holding the lock.  If the file is updated
@@ -808,6 +798,16 @@ read_relmap_file(RelMapFile *map, char *dbpath, bool lock_held, int elevel)
 	if (!lock_held)
 		LWLockAcquire(RelationMappingLock, LW_SHARED);
 
+	/* Open the target file. */
+	snprintf(mapfilename, sizeof(mapfilename), "%s/%s", dbpath,
+			 RELMAPPER_FILENAME);
+	fd = OpenTransientFile(mapfilename, O_RDONLY | PG_BINARY);
+	if (fd < 0)
+		ereport(elevel,
+				(errcode_for_file_access(),
+				 errmsg("could not open file \"%s\": %m",
+						mapfilename)));
+
 	/* Now read the data. */
 	pgstat_report_wait_start(WAIT_EVENT_RELATION_MAP_READ);
 	r = read(fd, map, sizeof(RelMapFile));
-- 
2.35.1

Reply via email to