Hi, On 2023-11-18 21:45:35 +0100, Tomas Vondra wrote: > On 11/18/23 19:12, Andres Freund wrote: > >> If we increase the locks from ShareUpdateExclusive to ShareRowExclusive, > >> we're making it conflict with RowExclusive. Which is just DML, and I > >> think we need to do that. > > > > From what I can tell it needs to to be an AccessExlusiveLock. Completely > > independent of logical decoding. The way the cache stays coherent is catalog > > modifications conflicting with anything that builds cache entries. We have a > > few cases where we do use lower level locks, but for those we have explicit > > analysis for why that's ok (see e.g. reloptions.c) or we block until nobody > > could have an old view of the catalog (various CONCURRENTLY) operations. > > > > Yeah, I got too focused on the issue I triggered, which seems to be > fixed by using SRE (still don't understand why ...). But you're probably > right there may be other cases where SRE would not be sufficient, I > certainly can't prove it'd be safe.
I think it makes sense here: SRE prevents the problematic "scheduling" in your test - with SRE no DML started before ALTER PUB ... ADD can commit after. I'm not sure there are any cases where using SRE instead of AE would cause problems for logical decoding, but it seems very hard to prove. I'd be very surprised if just using SRE would not lead to corrupted cache contents in some situations. The cases where a lower lock level is ok are ones where we just don't care that the cache is coherent in that moment. In a way, the logical decoding cache-invalidation situation is a lot more atomic than the "normal" situation. During normal operation locking is strictly required to prevent incoherent states when building a cache entry after a transaction committed, but before the sinval entries have been queued. But in the logical decoding case that window doesn't exist. Greetings, Andres Freund