On Thu, May 26, 2022 at 2:35 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.mu...@gmail.com> writes:
> > On a more practical note, I don't have access to the BF database right
> > now.  Would you mind checking if "latch already owned" has occurred on
> > any other animals?
>
> Looking back 6 months, these are the only occurrences of that string
> in failed tests:
>
>  sysname | branch |      snapshot       |     stage      |                                 l
> ---------+--------+---------------------+----------------+-------------------------------------------------------------------
>  gharial | HEAD   | 2022-04-28 23:37:51 | Check          | 2022-04-28 18:36:26.981 MDT [22642:1] ERROR:  latch already owned
>  gharial | HEAD   | 2022-05-06 11:33:11 | IsolationCheck | 2022-05-06 10:10:52.727 MDT [7366:1] ERROR:  latch already owned
>  gharial | HEAD   | 2022-05-24 06:31:31 | IsolationCheck | 2022-05-24 02:44:51.850 MDT [13089:1] ERROR:  latch already owned
> (3 rows)

Thanks.  Hmm.  So far it's always a parallel worker.  The best idea I
have is to include the PID of the mystery owner in the error message
and see if that provides a clue next time.
diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 78c6a89271..07b8273a7d 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -402,6 +402,8 @@ InitSharedLatch(Latch *latch)
 void
 OwnLatch(Latch *latch)
 {
+	pid_t		previous_owner;
+
 	/* Sanity checks */
 	Assert(latch->is_shared);
 
@@ -410,8 +412,11 @@ OwnLatch(Latch *latch)
 	Assert(selfpipe_readfd >= 0 && selfpipe_owner_pid == MyProcPid);
 #endif
 
-	if (latch->owner_pid != 0)
-		elog(ERROR, "latch already owned");
+	previous_owner = latch->owner_pid;
+	if (previous_owner != 0)
+		elog(ERROR,
+			 "latch already owned by PID %lu",
+			 (unsigned long) previous_owner);
 
 	latch->owner_pid = MyProcPid;
 }
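
For anyone who wants to poke at the pattern outside the tree, here is a
minimal standalone sketch of the same idea: read the owner once into a
local, then use that one value for both the check and the message.
MockLatch and mock_own_latch are hypothetical names invented for this
illustration, not PostgreSQL APIs, and the error is written into a
caller-supplied buffer instead of going through elog().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for Latch; only the field this sketch needs. */
typedef struct MockLatch
{
	pid_t		owner_pid;		/* 0 means "not owned" */
} MockLatch;

/*
 * Try to take ownership of the latch on behalf of my_pid.
 *
 * Returns 0 on success.  If the latch is already owned, leaves it
 * untouched, writes a message naming the current owner into errbuf,
 * and returns that owner's PID.
 */
pid_t
mock_own_latch(MockLatch *latch, pid_t my_pid, char *errbuf, size_t errlen)
{
	pid_t		previous_owner;

	assert(latch != NULL);

	/* Read once, so the test and the message report the same value. */
	previous_owner = latch->owner_pid;
	if (previous_owner != 0)
	{
		snprintf(errbuf, errlen, "latch already owned by PID %lu",
				 (unsigned long) previous_owner);
		return previous_owner;
	}

	latch->owner_pid = my_pid;
	return 0;
}
```

Calling it twice on the same MockLatch with different PIDs should
succeed the first time and, the second time, report the first caller's
PID in the buffer.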
