On Thu, Oct 31, 2024 at 09:20:52PM -0700, Noah Misch wrote:
> Here, one of the autovacuum workers had the guilty stack trace, appearing at
> the end of this message.  heap_inplace_update_and_unlock() calls
> CacheInvalidateHeapTupleInplace() while holding BUFFER_LOCK_EXCLUSIVE on a
> buffer of pg_class.  CacheInvalidateHeapTupleInplace() may call
> CatalogCacheInitializeCache(), which opens the cache's rel.  If there's not a
> valid relcache entry for the catcache's rel, we scan pg_class to make a valid
> relcache entry.  The ensuing hang makes sense.
>
> Tomorrow, I'll think more about fixes.  Two that might work:
>
> 1. Call CacheInvalidateHeapTupleInplace() before locking the buffer.  Each
>    time we need to re-find the tuple, discard the previous try's inplace
>    invals and redo CacheInvalidateHeapTupleInplace().  That's because
>    concurrent activity may have changed cache key fields like relname.
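In code form, the reported hang boils down to the sketch below.  The function
and variable names are mine for illustration; only LockBuffer() and the
BUFFER_LOCK_* constants are real APIs.  LWLocks aren't re-entrant and have no
deadlock detector, so the second acquisition sleeps forever:

#include "postgres.h"
#include "storage/bufmgr.h"

/*
 * Illustrative sketch only: the shape of the undetected self-deadlock.
 * "buf" stands for the pg_class buffer the inplace-update path has locked;
 * in the real report, the second acquisition happens several frames down,
 * inside CatalogCacheInitializeCache()'s pg_class scan.
 */
static void
self_deadlock_sketch(Buffer buf)
{
	LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);	/* held for the inplace update */

	/*
	 * CacheInvalidateHeapTupleInplace() -> CatalogCacheInitializeCache() ->
	 * relcache build -> pg_class scan reaches the same page:
	 */
	LockBuffer(buf, BUFFER_LOCK_SHARE);		/* not re-entrant: hangs forever */
}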
Attached.  With this, I got no hangs in 1.9h of your test procedure.  Without
the patch, I got fourteen hangs in the same amount of time.

> 2. Add some function that we call before locking the buffer.  Its charter is
>    to ensure PrepareToInvalidateCacheTuple() won't have to call
>    CatalogCacheInitializeCache().

The existing InitCatalogCachePhase2() could satisfy that charter.  I liked
this a little less than (1), for two reasons.  First, confirming that it
avoids deadlocks requires thinking through more of the catcache and relcache
procedures.  Second, InitCatalogCachePhase2() would init unnecessary caches.

> I think nothing resets catcache to the
> extent that CatalogCacheInitializeCache() must happen again, so this should
> suffice regardless of concurrent sinval traffic, debug_discard_caches, etc.

The comment at CatalogCacheFlushCatalog() confirms that.

> What else is worth considering?  Any preferences among those?

3. Let a process take LW_SHARED if it already holds LW_EXCLUSIVE.  That would
change outcomes well beyond inplace update, and it feels like a bandage for
doing things in a suboptimal order.  (1) or (2) would be an improvement even
if we did (3), since they would reduce I/O done while holding
BUFFER_LOCK_EXCLUSIVE.

This was a near miss to having a worst-in-years regression in a minor
release, so I'm proposing this sequence:

- Revert from non-master branches commits 8e7e672 (inplace180, "WAL-log
  inplace update before revealing it to other sessions.") and 243e9b4
  (inplace160, "For inplace update, send nontransactional invalidations.").

- Back-patch inplace230-index_update_stats-io-before-buflock to harden commit
  a07e03f (inplace110, "Fix data loss at inplace update after heap_update()").

- Push attached inplace240 to master.

- Make the commitfest entry a request for review of v17 inplace160+inplace240.
  After some amount of additional review and master bake time, the reverted
  patches would return to non-master branches.

If someone agrees or if nobody objects by 2024-11-02T15:00+0000, I'll make it
so.  That's not much time, but I want to minimize buildfarm members hanging
and maximize inplace230 bake time before the release wrap.

Less urgently, we likely should add defense in depth against adding
rarely-reached LWLock deadlocks.  Perhaps one or both of:

- Assert-fail if entering a conditional CatalogCacheInitializeCache() caller
  (e.g. SearchCatCacheInternal()) while holding a catalog buffer lock.

- elog(ERROR) instead of sleeping in LWLockAcquire() if the current process
  is the current lock holder (rough sketch in the P.S. below).

Thanks,
nm
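P.S. A rough sketch of that second defense, under the assumption that the
check would live in LWLockAcquire()'s wait path.  The helper name and message
are mine; only LWLockHeldByMe() and elog() are existing APIs:

#include "postgres.h"
#include "storage/lwlock.h"

/*
 * Hypothetical defense-in-depth check (not part of the attached patch):
 * error out instead of queueing to sleep when the requesting backend
 * already holds the lock.  A real version would sit inside LWLockAcquire()
 * just before it goes to sleep.
 */
static void
check_lwlock_self_deadlock(LWLock *lock)
{
	if (LWLockHeldByMe(lock))
		elog(ERROR, "self-deadlock: this backend already holds the requested LWLock");
}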
Author:     Noah Misch <n...@leadboat.com>
Commit:     Noah Misch <n...@leadboat.com>

    Fix inplace update buffer self-deadlock.

    A CacheInvalidateHeapTuple* callee might call
    CatalogCacheInitializeCache(), which needs a relcache entry.  Acquiring a
    valid relcache entry might scan pg_class.  Hence, to prevent undetected
    LWLock self-deadlock, CacheInvalidateHeapTuple* callers must not hold
    BUFFER_LOCK_EXCLUSIVE on buffers of pg_class.  Move the
    CacheInvalidateHeapTupleInplace() before the BUFFER_LOCK_EXCLUSIVE.
    Commit 243e9b40f1b2dd09d6e5bf91ebf6e822a2cd3704 introduced this
    regression.  No back-patch, since I've reverted the culprit from
    non-master branches.

    Reported by Alexander Lakhin.  Reviewed by FIXME.

    Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec41145...@gmail.com

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 1748eaf..9a31cdc 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6214,6 +6214,17 @@ heap_inplace_lock(Relation relation,
 
 	Assert(BufferIsValid(buffer));
 
+	/*
+	 * Construct shared cache inval if necessary.  Because we pass a tuple
+	 * version without our own inplace changes or inplace changes other
+	 * sessions complete while we wait for locks, inplace update mustn't
+	 * change catcache lookup keys.  But we aren't bothering with index
+	 * updates either, so that's true a fortiori.  After LockBuffer(), it
+	 * would be too late, because this might reach a
+	 * CatalogCacheInitializeCache() that locks "buffer".
+	 */
+	CacheInvalidateHeapTupleInplace(relation, oldtup_ptr, NULL);
+
 	LockTuple(relation, &oldtup.t_self, InplaceUpdateTupleLock);
 	LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
@@ -6309,6 +6320,7 @@ heap_inplace_lock(Relation relation,
 	if (!ret)
 	{
 		UnlockTuple(relation, &oldtup.t_self, InplaceUpdateTupleLock);
+		ForgetInplace_Inval();
 		InvalidateCatalogSnapshot();
 	}
 	return ret;
@@ -6345,14 +6357,6 @@ heap_inplace_update_and_unlock(Relation relation,
 	dst = (char *) htup + htup->t_hoff;
 	src = (char *) tuple->t_data + tuple->t_data->t_hoff;
 
-	/*
-	 * Construct shared cache inval if necessary.  Note that because we only
-	 * pass the new version of the tuple, this mustn't be used for any
-	 * operations that could change catcache lookup keys.  But we aren't
-	 * bothering with index updates either, so that's true a fortiori.
-	 */
-	CacheInvalidateHeapTupleInplace(relation, tuple, NULL);
-
 	/* Like RecordTransactionCommit(), log only if needed */
 	if (XLogStandbyInfoActive())
 		nmsgs = inplaceGetInvalidationMessages(&invalMessages,
@@ -6481,6 +6485,7 @@ heap_inplace_unlock(Relation relation,
 {
 	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
 	UnlockTuple(relation, &oldtup->t_self, InplaceUpdateTupleLock);
+	ForgetInplace_Inval();
 }
 
 #define FRM_NOOP				0x0001
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index 986850c..fc972ed 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -1202,6 +1202,18 @@ AtInplace_Inval(void)
 }
 
 /*
+ * ForgetInplace_Inval
+ *		Alternative to PreInplace_Inval()+AtInplace_Inval(): discard queued-up
+ *		invalidations.  This lets inplace update enumerate invalidations
+ *		optimistically, before locking the buffer.
+ */
+void
+ForgetInplace_Inval(void)
+{
+	inplaceInvalInfo = NULL;
+}
+
+/*
  * AtEOSubXact_Inval
  *		Process queued-up invalidation messages at end of subtransaction.
  *
diff --git a/src/include/utils/inval.h b/src/include/utils/inval.h
index 3390e7a..299cd75 100644
--- a/src/include/utils/inval.h
+++ b/src/include/utils/inval.h
@@ -30,6 +30,7 @@ extern void AtEOXact_Inval(bool isCommit);
 
 extern void PreInplace_Inval(void);
 extern void AtInplace_Inval(void);
+extern void ForgetInplace_Inval(void);
 
 extern void AtEOSubXact_Inval(bool isCommit);