Hi Amit,

I've been testing some undo worker workloads (more on that soon), but
here's a small thing: I managed to reach an LWLock self-deadlock in
the undo worker launcher:

diff --git a/src/backend/access/undo/undorequest.c b/src/backend/access/undo/undorequest.c
...
+bool
+UndoGetWork(bool allow_peek, bool remove_from_queue, UndoRequestInfo *urinfo,
...
+       /* Search the queues under lock as they can be modified concurrently. */
+       LWLockAcquire(RollbackRequestLock, LW_EXCLUSIVE);
...
+                               RollbackHTRemoveEntry(rh->full_xid, rh->start_urec_ptr);

^ but RollbackHTRemoveEntry() acquires the same lock itself, so the
launcher ends up waiting for a lock it already holds:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff5d110106 libsystem_kernel.dylib`semop + 10
    frame #1: 0x0000000104bbf24c postgres`PGSemaphoreLock(sema=0x000000010e216a08) at pg_sema.c:428:15
    frame #2: 0x0000000104c90186 postgres`LWLockAcquire(lock=0x000000010e218300, mode=LW_EXCLUSIVE) at lwlock.c:1246:4
    frame #3: 0x000000010487463d postgres`RollbackHTRemoveEntry(full_xid=(value = 89144), start_urec_ptr=20890721090967) at undorequest.c:1717:2
    frame #4: 0x0000000104873dbe postgres`UndoGetWork(allow_peek=false, remove_from_queue=false, urinfo=0x00007ffeeb4d3e30, in_other_db_out=0x0000000000000000) at undorequest.c:1388:5
    frame #5: 0x0000000104876211 postgres`UndoLauncherMain(main_arg=0) at undoworker.c:607:7
...
(lldb) print held_lwlocks[0]
(LWLockHandle) $0 = {
  lock = 0x000000010e218300
  mode = LW_EXCLUSIVE
}
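
For illustration only, here is a minimal standalone sketch of the same
pattern outside PostgreSQL. It is not the patch's code: it assumes a
POSIX error-checking mutex as a stand-in for RollbackRequestLock, and
every name in it (request_lock, remove_entry, remove_entry_locked,
get_work_buggy, get_work_fixed) is hypothetical. An error-checking
mutex reports the re-acquisition as EDEADLK, whereas the LWLock in the
backtrace above simply blocks. One possible way out, shown in
get_work_fixed(), is a variant of the removal routine that expects the
caller to hold the lock already.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for RollbackRequestLock. */
pthread_mutex_t request_lock;

/* Hypothetical stand-in for the number of rollback hash table entries. */
int entries = 1;

/* Create the lock as an error-checking mutex so that a second lock
 * attempt by the owning thread returns EDEADLK instead of hanging. */
void init_request_lock(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&request_lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
    (void) rc;
}

/* Removes one entry; the caller must already hold request_lock. */
void remove_entry_locked(void)
{
    entries--;
}

/* Removes one entry, taking the lock itself (the shape of the
 * reported callee). Returns 0 on success or the lock error code. */
int remove_entry(void)
{
    int rc = pthread_mutex_lock(&request_lock);

    if (rc != 0)
        return rc;
    remove_entry_locked();
    pthread_mutex_unlock(&request_lock);
    return 0;
}

/* Shape of the reported problem: call the self-locking routine while
 * the lock is already held. Returns EDEADLK with this mutex type. */
int get_work_buggy(void)
{
    int rc;

    pthread_mutex_lock(&request_lock);
    rc = remove_entry();
    pthread_mutex_unlock(&request_lock);
    return rc;
}

/* One possible fix: call the variant that assumes the lock is held. */
int get_work_fixed(void)
{
    pthread_mutex_lock(&request_lock);
    remove_entry_locked();
    pthread_mutex_unlock(&request_lock);
    return 0;
}
```

After calling init_request_lock(), get_work_buggy() fails with EDEADLK
and leaves the entry in place, while get_work_fixed() removes it.
Whether a "caller holds the lock" variant or releasing the lock before
the call suits undorequest.c better is a separate question.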

-- 
Thomas Munro
https://enterprisedb.com

